A vector database implemented from scratch in Python. Stores float32 vectors with optional metadata, supports approximate nearest-neighbour search via HNSW, and survives crashes using a write-ahead log.

    ┌──────────────────────────────────────────┐
    │            FastAPI HTTP layer            │
    ├──────────────────────────────────────────┤
    │                  WarpDB                  │
    │ ┌───────────┐ ┌───────────┐ ┌──────────┐ │
    │ │   HNSW    │ │  Vector   │ │ Metadata │ │
    │ │   index   │ │   store   │ │  store   │ │
    │ │(in-memory)│ │(.f32 file)│ │ (SQLite) │ │
    │ └───────────┘ └───────────┘ └──────────┘ │
    │               ┌───────────┐              │
    │               │    WAL    │              │
    │               │(.bin file)│              │
    │               └───────────┘              │
    └──────────────────────────────────────────┘

| Component | Implementation |
|---|---|
| Vector storage | Flat binary .f32 file, memory-mapped for reads |
| Metadata storage | SQLite |
| ANN index | HNSW (Hierarchical Navigable Small World) |
| Crash recovery | Write-ahead log with UPSERT/COMMIT records |
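
To illustrate the crash-recovery row, here is a minimal sketch of how a write-ahead log with UPSERT/COMMIT records can be replayed after a restart. The record layout, the opcodes, and the function names (`append_upsert`, `append_commit`, `replay`) are assumptions made for this example and are not taken from WarpDB's actual on-disk format.

```python
import io
import struct

# Hypothetical record layout (not WarpDB's actual on-disk format):
#   UPSERT: 1-byte opcode, 8-byte vector id, 4-byte dimension, then dim float32 values
#   COMMIT: 1-byte opcode
OP_UPSERT = 1
OP_COMMIT = 2
_UPSERT_HEADER = struct.Struct("<QI")  # vector id, dimension


def append_upsert(log, vec_id, vector):
    """Append one UPSERT record to a binary file-like object."""
    log.write(bytes([OP_UPSERT]))
    log.write(_UPSERT_HEADER.pack(vec_id, len(vector)))
    log.write(struct.pack(f"<{len(vector)}f", *vector))


def append_commit(log):
    """Append a COMMIT record covering the UPSERTs written since the last COMMIT."""
    log.write(bytes([OP_COMMIT]))


def replay(log):
    """Return {id: vector} for committed UPSERTs only.

    UPSERTs after the last COMMIT, or a record cut short by a crash,
    are discarded.
    """
    committed, pending = {}, {}
    while True:
        op = log.read(1)
        if not op:
            break  # clean end of log
        if op[0] == OP_COMMIT:
            committed.update(pending)
            pending.clear()
        elif op[0] == OP_UPSERT:
            header = log.read(_UPSERT_HEADER.size)
            if len(header) < _UPSERT_HEADER.size:
                break  # truncated header
            vec_id, dim = _UPSERT_HEADER.unpack(header)
            payload = log.read(4 * dim)
            if len(payload) < 4 * dim:
                break  # truncated payload
            pending[vec_id] = list(struct.unpack(f"<{dim}f", payload))
        else:
            break  # unknown opcode: treat the rest of the log as corrupt
    return committed


log = io.BytesIO()
append_upsert(log, 1, [0.5, 0.25])
append_commit(log)
append_upsert(log, 2, [1.0, 2.0])  # written but never committed
log.seek(0)
recovered = replay(log)
print(recovered)
```

Only vector 1 is recovered, because the second UPSERT has no COMMIT after it. A production log would typically also flush and fsync at commit time and checksum each record; both are left out here to keep the sketch short.
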
Install dependencies and start the server:

    poetry install
    poetry run uvicorn warpdb.api.server:app --reload

The server starts on http://localhost:8000 with the vector dimension fixed at 768.

Run the test suite:

    poetry run pytest

With the server running, execute the benchmark suite in a separate terminal:

    python benchmarks/bench.py

This runs five workloads against the server:
| Workload | What it measures |
|---|---|
| Upsert throughput | Vectors/sec at varying concurrency levels |
| Search latency | p50/p95/p99 latency and queries/sec |
| Recall@10 | HNSW approximation accuracy vs brute-force |
| Compaction | Time to reclaim dead vectors from disk |
| Mixed 80/20 | Read/write contention over a timed window |
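
As a rough illustration of the Recall@10 row, the sketch below computes recall against an exact brute-force baseline. It is not the code in `benchmarks/bench.py`: the helper names, the use of Euclidean distance, and the simulated approximate result are all assumptions made for this example.

```python
import heapq
import math
import random


def brute_force_top_k(query, vectors, k):
    """Exact k nearest neighbours by Euclidean distance; returns vector ids."""
    return heapq.nsmallest(k, vectors, key=lambda vid: math.dist(query, vectors[vid]))


def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k ids that also appear in the approximate top-k."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


rng = random.Random(0)
vectors = {i: [rng.random() for _ in range(8)] for i in range(200)}
query = [rng.random() for _ in range(8)]

exact = brute_force_top_k(query, vectors, 10)

# Stand-in for an ANN result: nine true neighbours plus one id
# that is not in the exact top 10.
miss = next(i for i in vectors if i not in exact)
approx = exact[:9] + [miss]

print(recall_at_k(approx, exact, 10))  # 0.9
```

In a real run the approximate ids would come from the HNSW search, and the recall would be averaged over many queries.
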
Run `python benchmarks/bench.py --help` for all options.