Vector Databases
Semantic search and similarity - Pinecone, Qdrant, Weaviate, pgvector - the storage layer for embeddings and RAG
A vector database stores embeddings — high-dimensional numerical vectors that represent text, images, audio, or arbitrary content — and answers queries like "find the N items most similar to this one." It's the storage layer behind semantic search, RAG (Retrieval-Augmented Generation), recommendation systems, deduplication, and most modern AI applications.
This is distinct from keyword Search (Algolia / Meilisearch / Elasticsearch) — keyword search matches words, vector search matches meaning.
What a Vector Database Does
Document ──► Embedding model ──► Vector [0.21, -0.43, 0.88, ...] (e.g. 1536 dims)
"Espresso machine" └──► stored with metadata in vector DB
Query ──► Embedding model ──► Vector [0.19, -0.41, 0.85, ...]
"coffee maker" └──► find top-K nearest vectors
returns: ["Espresso machine", "French press", ...]

The key operation is approximate nearest neighbor (ANN) search: finding the closest vectors by cosine similarity, dot product, or Euclidean distance.
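To make the nearest-neighbor operation concrete, here is a minimal pure-Python sketch of exact (brute-force) top-K search, along with the three metrics named above. The three-dimensional vectors are made-up toy values rather than real embeddings, and `top_k` is a hypothetical helper, not any database's API. A real vector database would replace the linear scan with an ANN index.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query, items, k=2):
    """Exact top-k by cosine similarity: score every item, keep the best k."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in items.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Toy 3-dimensional "embeddings" (illustrative values only).
ITEMS = {
    "Espresso machine": [0.9, 0.1, 0.0],
    "French press": [0.8, 0.3, 0.1],
    "Garden hose": [0.0, 0.2, 0.9],
}

query = [0.9, 0.15, 0.0]  # stands in for the embedding of "coffee maker"
print(top_k(query, ITEMS, k=2))
```

The scan touches every stored vector, which is fine for thousands of items and is exactly the cost an ANN index is built to avoid at larger scale.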
Why Not a Regular Database
| Regular DB | Vector DB |
|---|---|
| WHERE category = 'electronics' (equality / range) | Find by similarity |
| B-tree / hash indexes | HNSW / IVF / DiskANN indexes (ANN) |
| Exact answers | Approximate (configurable trade-off) |
| Indexes of a few hundred MB count as large | GB-scale indexes are normal |
| Doesn't know what content means | Embedding captures meaning |
Postgres can do vector search via pgvector, and for many workloads that's the right answer. Dedicated vector DBs win at scale (billions of vectors) or when you need ANN-specific features.
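The "approximate, configurable trade-off" row in the table can be illustrated with a toy inverted-file (IVF) style index. This is a simplified sketch under stated assumptions: the centroids are hand-picked instead of trained with k-means, the data is two-dimensional, and `ToyIVF` and its `nprobe` argument are illustrative names rather than any library's API.

```python
import math

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class ToyIVF:
    """Bucket vectors by nearest centroid; search only the closest buckets."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def _ranked_buckets(self, vec):
        # Bucket ids ordered from nearest centroid to farthest.
        return sorted(range(len(self.centroids)),
                      key=lambda i: dist(vec, self.centroids[i]))

    def add(self, name, vec):
        self.buckets[self._ranked_buckets(vec)[0]].append((name, vec))

    def search(self, query, k=1, nprobe=1):
        # nprobe is the speed/recall knob: more buckets scanned means
        # slower queries but fewer missed neighbors.
        candidates = []
        for i in self._ranked_buckets(query)[:nprobe]:
            candidates.extend(self.buckets[i])
        candidates.sort(key=lambda item: dist(query, item[1]))
        return [name for name, _ in candidates[:k]]

index = ToyIVF(centroids=[[0.0, 0.0], [10.0, 0.0]])
index.add("a", [4.9, 0.0])  # lands in bucket 0
index.add("b", [1.0, 0.0])  # lands in bucket 0
index.add("c", [9.0, 0.0])  # lands in bucket 1

# The query's nearest centroid is bucket 1, but its true nearest vector
# ("a") sits just across the boundary in bucket 0.
query = [5.2, 0.0]
print(index.search(query, k=1, nprobe=1))  # scans bucket 1 only, misses "a"
print(index.search(query, k=1, nprobe=2))  # scans both buckets, finds "a"
```

With `nprobe=1` the search is faster but returns a near neighbor that is not the nearest one. Raising `nprobe` recovers the exact answer at the cost of scanning more data, which is the trade-off ANN indexes expose in one form or another.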
The Players
| Database | Type | Notes |
|---|---|---|
| pgvector | Postgres extension | Free; embedded in your existing DB; great for small to medium scale |
| Qdrant | Standalone (Rust) | Open-source; self-host or cloud; rich filtering; fast |
| Weaviate | Standalone (Go) | Open-source; modular; built-in embedding generators |
| Milvus | Standalone (Go / C++) | Open-source; designed for huge scale (billions) |
| Pinecone | SaaS only | Pioneer; managed; simple API; pricier at scale |
| Chroma | Standalone / embedded (Python) | Dev-friendly; popular for prototypes |
| LanceDB | Embedded (Rust) | Local-first; columnar; great for desktop apps and notebooks |
| Elasticsearch / OpenSearch | Search + vector | Adds vectors to existing search infrastructure |
| MongoDB Atlas Vector Search | Hosted | Bundled with MongoDB Atlas |
| Redis + RediSearch | In-memory | Vector search inside Redis |
| Turbopuffer / Aperture | Newer SaaS | Cheap at scale; columnar storage |
For new projects in 2026:
- Already on Postgres → pgvector. Lowest operational cost; good through ~10M vectors.
- Self-host, want feature-rich → Qdrant.
- Don't want to operate → Pinecone or Qdrant Cloud or Turbopuffer.
- Hyperscale (>100M vectors) → Milvus or specialized service.
Learning Path
1. Getting Started
Run Qdrant in Docker, generate embeddings, store and query - hello world RAG
2. Hybrid Search
Combining keyword and vector for better results; filters; reranking; chunking
3. Best Practices
Embedding models, dimensionality, indexes, metadata, observability, cost
What's an Embedding
A function (the embedding model) that maps content to a fixed-size vector. Properties that matter:
- Same model for documents and queries — vectors only mean the same thing within one model's vector space.
- Dimensionality — 384, 768, 1024, 1536, 3072 are common. Higher = more nuanced but bigger storage and compute.
- Semantic — similar meaning → similar vectors (cosine similarity ~1); different meaning → vectors far apart (~0 or even negative).
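The storage cost of higher dimensionality is simple arithmetic. The sketch below assumes vectors are stored as float32 (4 bytes per value) and counts only the raw vectors. Index structures, metadata, and replication add to this, and quantized storage would reduce it. `raw_vector_bytes` is a hypothetical helper name.

```python
def raw_vector_bytes(num_vectors, dims, bytes_per_value=4):
    """Raw storage for the vectors alone (float32 = 4 bytes per value)."""
    return num_vectors * dims * bytes_per_value

for dims in (384, 768, 1536, 3072):
    gb = raw_vector_bytes(1_000_000, dims) / 1e9
    print(f"{dims:>5} dims x 1M vectors = {gb:.2f} GB")
```

Under these assumptions, one million 1536-dimensional vectors take about 6.14 GB before any index overhead, and doubling the dimensionality doubles that figure.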
Popular embedding models in 2026:
| Model | Dims | Notes |
|---|---|---|
| OpenAI text-embedding-3-small | 1536 | Cheap; good quality |
| OpenAI text-embedding-3-large | 3072 | Better; pricier |
| Cohere embed-v3 | 1024 | Multi-language; well-regarded |
| Voyage AI voyage-3 | 1024-2048 | Quality-leading for code / docs |
| BAAI bge-large-en-v1.5 | 1024 | Open-source; self-host with sentence-transformers |
| nomic-embed-text-v1.5 | 64-768 (Matryoshka) | Open-source, configurable dims |
For self-host: bge-large or nomic-embed on a small GPU. For SaaS: OpenAI is the easy default; Voyage if quality matters.
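"Configurable dims" in the Matryoshka row means a prefix of the full vector is still a usable embedding. A minimal sketch of the usual shortening step, keeping the leading values and rescaling to unit length, looks like this. The input is a made-up toy vector, `truncate_and_normalize` is a hypothetical helper, and a given model's card may prescribe extra steps, so treat this as an illustration rather than any model's official procedure.

```python
import math

def truncate_and_normalize(vec, dims):
    """Keep the first `dims` values, then rescale to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

full = [3.0, 4.0, 1.0, 2.0]  # toy 4-dim vector, not real model output
short = truncate_and_normalize(full, 2)
print(short)
```

Truncation like this is only reasonable for models trained with a Matryoshka objective. For other models, dropping dimensions generally degrades quality, and documents and queries must still be shortened to the same size.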
Common Use Cases
| Use case | What you do |
|---|---|
| RAG (Retrieval-Augmented Generation) | Embed docs; for each query, retrieve top-K relevant docs; pass to LLM as context |
| Semantic search | "Find products like this" without keyword matching |
| Recommendation systems | Embed users + items; find users with similar tastes |
| Deduplication | Embed content; find near-duplicates with high cosine similarity |
| Image search | Embed images with CLIP-like models; search by image or text |
| Code search | Embed code; find similar functions across a codebase |
| Classification | Embed examples per class; classify by nearest centroid |
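As one worked example from the table, the classification row (nearest centroid) can be sketched in a few lines. The labels and two-dimensional vectors are invented for illustration. In practice the vectors would come from an embedding model, and `classify` is a hypothetical helper.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def classify(vec, centroids):
    """Return the label whose centroid is most similar to `vec`."""
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Toy labelled examples (illustrative values only).
EXAMPLES = {
    "billing": [[0.9, 0.1], [0.8, 0.2]],
    "technical": [[0.1, 0.9], [0.2, 0.7]],
}
CENTROIDS = {label: centroid(vecs) for label, vecs in EXAMPLES.items()}

print(classify([0.7, 0.3], CENTROIDS))
```

Because each class is reduced to a single centroid, adding a new class only requires embedding a handful of examples, with no model retraining.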
For most RAG and search applications, vector search alone underperforms a combined approach: keyword matching catches things vectors miss (proper nouns, exact terms), and vector search catches things keywords miss (synonyms, paraphrasing). The winning approach is hybrid; see Hybrid Search.
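One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The source does not prescribe a fusion method, so this is an illustrative choice. The document ids are invented, and `k=60` is a commonly used default, not a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of ids: each list adds 1 / (k + rank) to an id's score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical result lists, best match first.
keyword_hits = ["doc_sku_123", "doc_manual", "doc_faq"]
vector_hits = ["doc_manual", "doc_blog", "doc_sku_123"]

print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

RRF uses only rank positions, so it sidesteps the problem that keyword scores and vector similarities live on different scales. Documents that appear in both lists rise to the top.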