Steven's Knowledge

Vector Databases

Semantic search and similarity - Pinecone, Qdrant, Weaviate, pgvector - the storage layer for embeddings and RAG

A vector database stores embeddings (high-dimensional numerical vectors that represent text, images, audio, or arbitrary content) and answers queries like "find the N items most similar to this one." It is the storage layer behind semantic search, RAG (Retrieval-Augmented Generation), recommendation systems, deduplication, and many other AI applications.

This is distinct from keyword Search (Algolia / Meilisearch / Elasticsearch) — keyword search matches words, vector search matches meaning.

What a Vector Database Does

Document     ──► Embedding model ──► Vector [0.21, -0.43, 0.88, ...]  (e.g. 1536 dims)
"Espresso machine"                    └──► stored with metadata in vector DB

Query        ──► Embedding model ──► Vector [0.19, -0.41, 0.85, ...]
"coffee maker"                        └──► find top-K nearest vectors
                                           returns: ["Espresso machine", "French press", ...]

The key operation is approximate nearest neighbor (ANN) search — finding the closest vectors by cosine similarity, dot product, or Euclidean distance.
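To make the operation concrete, here is a minimal sketch of the exact (brute-force) version of that search, which is what ANN indexes approximate. The three-dimensional vectors and the "Garden hose" item are made up for illustration; real embeddings have hundreds or thousands of dimensions, and the function names are not from any particular library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, items, k):
    """Exact top-k by cosine similarity: scores every stored vector."""
    scored = [(cosine_similarity(query, vec), label) for label, vec in items]
    scored.sort(reverse=True)  # highest similarity first
    return [label for _, label in scored[:k]]

# Toy "embeddings"; the values are illustrative, not model output.
items = [
    ("Espresso machine", [0.21, -0.43, 0.88]),
    ("French press", [0.25, -0.30, 0.80]),
    ("Garden hose", [-0.70, 0.60, 0.10]),
]
query = [0.19, -0.41, 0.85]  # stands in for the embedding of "coffee maker"

print(top_k(query, items, k=2))
```

This scan touches every stored vector, so its cost grows linearly with the collection. ANN indexes exist to avoid that full scan at the price of occasionally missing a true neighbor.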

Why Not a Regular Database

| Regular DB | Vector DB |
| --- | --- |
| WHERE category = 'electronics' (equality / range) | Find by similarity |
| B-tree / hash indexes | HNSW / IVF / DiskANN indexes (ANN) |
| Exact answers | Approximate (configurable trade-off) |
| Indexes of hundreds of MB are large | GB-scale indexes are normal |
| Doesn't know what content means | Embedding captures meaning |
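The "approximate (configurable trade-off)" row can be illustrated with a rough sketch of the idea behind an IVF (inverted file) index: group vectors around centroids, then scan only the groups nearest the query. This is a simplified illustration, not how any listed product implements it. The centroids are hand-picked (real indexes typically learn them with k-means), and the names `build_ivf`, `ivf_search`, and `nprobe` are chosen for this example. HNSW and DiskANN are graph-based and work differently.

```python
def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def build_ivf(vectors, centroids):
    """Assign each vector index to the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for idx, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: dist2(vec, centroids[c]))
        lists[nearest].append(idx)
    return lists

def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: dist2(query, centroids[c]))
    candidates = [idx for c in order[:nprobe] for idx in lists[c]]
    candidates.sort(key=lambda idx: dist2(query, vectors[idx]))
    return candidates[:k]

# Toy 2-dimensional data; values are made up for illustration.
vectors = [[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.9], [0.45, 0.5], [0.55, 0.5]]
centroids = [[0.0, 0.0], [1.0, 1.0]]  # assumption: hand-picked, not learned
lists = build_ivf(vectors, centroids)

query = [0.48, 0.5]
fast = ivf_search(query, vectors, centroids, lists, k=2, nprobe=1)
exact = ivf_search(query, vectors, centroids, lists, k=2, nprobe=len(centroids))

print("nprobe=1:", fast)   # scans one list; misses a true neighbor in the other
print("nprobe=2:", exact)  # scans every list; same result as brute force
```

With `nprobe=1` the search skips half the data and misses vector 5, which sits just across the cluster boundary. Raising `nprobe` recovers it at the cost of scanning more vectors, and that is the kind of knob the table row refers to.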

Postgres can do vector search via pgvector, and for many workloads that's the right answer. Dedicated vector DBs win at scale (billions of vectors) or when you need ANN-specific features.

The Players

| Database | Type | Notes |
| --- | --- | --- |
| pgvector | Postgres extension | Free; embedded in your existing DB; great for small to medium scale |
| Qdrant | Standalone (Rust) | Open-source; self-host or cloud; rich filtering; fast |
| Weaviate | Standalone (Go) | Open-source; modular; built-in embedding generators |
| Milvus | Standalone (Go) | Open-source; designed for huge scale (billions) |
| Pinecone | SaaS only | Pioneer; managed; simple API; pricier at scale |
| Chroma | Standalone / embedded (Python) | Dev-friendly; popular for prototypes |
| LanceDB | Embedded (Rust) | Local-first; columnar; great for desktop / notebooks |
| Elasticsearch / OpenSearch | Search + vector | Adds vectors to existing search infrastructure |
| MongoDB Atlas Vector Search | Hosted | Bundled with MongoDB Atlas |
| Redis + RediSearch | In-memory | Vector search inside Redis |
| Turbopuffer / Aperture | Newer SaaS | Cheap at scale; columnar storage |

For new projects in 2026:

  • Already on Postgres: pgvector. Lowest operational cost; good through ~10M vectors.
  • Self-hosting and want rich features: Qdrant.
  • Don't want to operate it yourself: Pinecone, Qdrant Cloud, or Turbopuffer.
  • Hyperscale (>100M vectors): Milvus or a specialized service.

Learning Path

What's an Embedding

An embedding is the output of a function (the embedding model) that maps content to a fixed-size vector. Properties that matter:

  • Same model for documents and queries — vectors only mean the same thing within one model's vector space.
  • Dimensionality — 384, 768, 1024, 1536, 3072 are common. Higher = more nuanced but bigger storage and compute.
  • Semantic — similar meaning → similar vectors (cosine similarity ~1); different meaning → vectors far apart (~0 or even negative).
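The storage side of the dimensionality trade-off is simple arithmetic. The sketch below assumes vectors are stored as float32 (4 bytes per value) and counts only the raw vectors; index structures and metadata add overhead on top, and quantized formats would reduce the figure. The 10M collection size is just an example.

```python
def raw_vector_bytes(num_vectors, dims, bytes_per_value=4):
    """Raw storage for the vectors alone, assuming float32 by default."""
    return num_vectors * dims * bytes_per_value

# Compare the common dimensionalities for a 10M-vector collection.
for dims in (384, 768, 1536, 3072):
    gb = raw_vector_bytes(10_000_000, dims) / 1e9
    print(f"{dims:>4} dims x 10M vectors = {gb:.2f} GB")
```

Doubling the dimensionality doubles both the raw storage and the arithmetic per distance computation, which is why smaller models are attractive when their quality is sufficient.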

Popular embedding models in 2026:

| Model | Dims | Notes |
| --- | --- | --- |
| OpenAI text-embedding-3-small | 1536 | Cheap; good quality |
| OpenAI text-embedding-3-large | 3072 | Better; pricier |
| Cohere embed-v3 | 1024 | Multi-language; well-regarded |
| Voyage AI voyage-3 | 1024-2048 | Quality-leading for code / docs |
| BAAI bge-large-en-v1.5 | 1024 | Open-source; self-host with sentence-transformers |
| nomic-embed-text-v1.5 | 64-768 (Matryoshka) | Open-source, configurable dims |

For self-host: bge-large or nomic-embed on a small GPU. For SaaS: OpenAI is the easy default; Voyage if quality matters.
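"Configurable dims" for a Matryoshka-trained model generally means the leading components of the vector are usable on their own. A minimal sketch of the generic recipe, keeping the first components and rescaling to unit length, is below. It is an illustration under that assumption only: it does not apply to models that were not trained this way, and a given model's documentation may prescribe extra steps, so check the model card before relying on it. The function name and the toy vector are made up.

```python
import math

def truncate_and_normalize(vec, dims):
    """Keep the first `dims` components, then rescale to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

full = [3.0, 4.0, 12.0, 0.0]              # toy 4-dim vector, values made up
short = truncate_and_normalize(full, 2)   # keep 2 of the 4 dimensions
print(short)
```

Documents and queries must be truncated to the same length, for the same reason they must come from the same model.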

Common Use Cases

| Use case | What you do |
| --- | --- |
| RAG (Retrieval-Augmented Generation) | Embed docs; for each query, retrieve top-K relevant docs; pass to LLM as context |
| Semantic search | "Find products like this" without keyword matching |
| Recommendation systems | Embed users + items; find users with similar tastes |
| Deduplication | Embed content; find near-duplicates with high cosine similarity |
| Image search | Embed images with CLIP-like models; search by image or text |
| Code search | Embed code; find similar functions across a codebase |
| Classification | Embed examples per class; classify by nearest centroid |
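As one worked example from the table, nearest-centroid classification can be sketched in a few lines: average the example embeddings for each class, then assign a new vector to the class whose centroid is most similar. The class names and three-dimensional vectors are invented for illustration, and cosine similarity is one reasonable choice of measure rather than the only one.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mean_vector(vectors):
    """Component-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def classify(vec, centroids):
    """Return the label whose centroid is most similar to vec."""
    return max(centroids, key=lambda label: cosine_similarity(vec, centroids[label]))

# Toy example embeddings per class; values are made up.
examples = {
    "coffee": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "gardening": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
centroids = {label: mean_vector(vecs) for label, vecs in examples.items()}

print(classify([0.7, 0.3, 0.0], centroids))
```

Because only one centroid per class is stored, this needs no vector index at all for a small number of classes; the vector database becomes useful when the labelled examples themselves must be searched.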

For most RAG and search applications, vector search alone underperforms a combination of the two methods: keyword matching catches things vectors miss (proper nouns, exact terms), and vector search catches things keywords miss (synonyms, paraphrasing). The winning approach is hybrid; see Hybrid Search.
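One widely used way to merge a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below is an illustration of that technique, not necessarily the method the Hybrid Search page describes. The document ids are hypothetical, and `k=60` is a conventional default rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each doc scores the sum of 1 / (k + rank) over the lists it appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Hypothetical results for the same query from two retrievers.
keyword_hits = ["doc_sku_123", "doc_espresso", "doc_grinder"]
vector_hits = ["doc_espresso", "doc_french_press", "doc_sku_123"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

RRF uses only rank positions, so it sidesteps the problem that keyword scores and cosine similarities are on different scales. Documents that both retrievers return rise to the top.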
