Semantic search and similarity - Pinecone, Qdrant, Weaviate, pgvector - the storage layer for embeddings and RAG

Vector Databases

A vector database stores embeddings — high-dimensional numerical vectors that represent text, images, audio, or arbitrary content — and answers queries like "find the N items most similar to this one." It's the storage layer behind semantic search, RAG (Retrieval-Augmented Generation), recommendation systems, deduplication, and most modern AI applications.

This is distinct from keyword Search (Algolia / Meilisearch / Elasticsearch) — keyword search matches words, vector search matches meaning.

What a Vector Database Does

Document     ──► Embedding model ──► Vector [0.21, -0.43, 0.88, ...]  (e.g. 1536 dims)
"Espresso machine"                    └──► stored with metadata in vector DB

Query        ──► Embedding model ──► Vector [0.19, -0.41, 0.85, ...]
"coffee maker"                        └──► find top-K nearest vectors
                                           returns: ["Espresso machine", "French press", ...]

The key operation is approximate nearest neighbor (ANN) search — finding the closest vectors by cosine similarity, dot product, or Euclidean distance.
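
To make the three metrics concrete, here is a minimal sketch in plain Python that scores a query against a few stored vectors and returns the top-K by cosine similarity. It is an exact brute-force scan, not an ANN index, which trades a little accuracy for speed on large collections. The three-dimensional vectors and the names `STORE` and `top_k` are made up for illustration; real embeddings come from a model and have hundreds of dimensions.

```python
import math

def dot(a, b):
    """Dot product: larger means more similar (sensitive to vector length)."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b, in [-1, 1]; ignores vector length."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Straight-line distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical toy "embeddings", hand-written for this example.
STORE = {
    "Espresso machine": [0.9, 0.1, 0.0],
    "French press": [0.8, 0.3, 0.1],
    "Garden hose": [0.0, 0.2, 0.9],
}

def top_k(query, store, k=2):
    """Exact nearest-neighbor search: score every stored vector, keep the best k."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in store.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

query = [0.85, 0.15, 0.05]  # stand-in for the embedding of "coffee maker"
print(top_k(query, STORE))  # ['Espresso machine', 'French press']
```

A scan like this is linear in the number of stored vectors, which is why larger collections rely on ANN indexes instead.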

Why Not a Regular Database

Regular DB	Vector DB
`WHERE category = 'electronics'` (equality / range)	Find by similarity
B-tree / hash indexes	HNSW / IVF / DiskANN indexes (ANN)
Exact answers	Approximate (configurable trade-off)
Indexes of a few hundred MB count as large	GB-scale indexes are normal
Doesn't know what content means	Embedding captures meaning

Postgres can do vector search via pgvector, and for many workloads that's the right answer. Dedicated vector DBs win at scale (billions of vectors) or when you need ANN-specific features.
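
As a rough illustration of the pgvector route, the sketch below keeps the relevant SQL in Python strings and shows how a Python list becomes a pgvector text literal. The table name `items`, the 3-dimension column, and the helper `to_pgvector_literal` are hypothetical. Running the statements requires a Postgres instance with the extension installed and a database driver, neither of which is shown; the `%s` placeholder assumes a psycopg-style driver.

```python
# SQL sketch for pgvector; table and column names are hypothetical.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE items (
    id bigserial PRIMARY KEY,
    content text,
    embedding vector(3)  -- use your model's dimensionality, e.g. 1536
);
-- HNSW index for approximate search by cosine distance
CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops);
"""

# pgvector operators: <=> cosine distance, <-> Euclidean distance,
# <#> negative inner product. Smaller values sort first.
QUERY_SQL = """
SELECT content
FROM items
ORDER BY embedding <=> %s::vector
LIMIT 5;
"""

def to_pgvector_literal(vec):
    """Format a list of numbers as pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

print(to_pgvector_literal([0.19, -0.41, 0.85]))  # [0.19,-0.41,0.85]
```

Because the vectors live in an ordinary table, the similarity query can be combined with regular `WHERE` filters and joins in the same statement.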

The Players

Database	Type	Notes
pgvector	Postgres extension	Free; embedded in your existing DB; great for small to medium scale
Qdrant	Standalone (Rust)	Open-source; self-host or cloud; rich filtering; fast
Weaviate	Standalone (Go)	Open-source; modular; built-in embedding generators
Milvus	Standalone (Go / C++)	Open-source; designed for huge scale (billions)
Pinecone	SaaS only	Pioneer; managed; simple API; pricier at scale
Chroma	Standalone / embedded (Python)	Dev-friendly; popular for prototypes
LanceDB	Embedded (Rust)	Local-first; columnar; great for desktop / notebooks
Elasticsearch / OpenSearch	Search + vector	Adds vectors to existing search infrastructure
MongoDB Atlas Vector Search	Hosted	Bundled with MongoDB Atlas
Redis + RediSearch	In-memory	Vector search inside Redis
Turbopuffer / Aperture	Newer SaaS	Cheap at scale; columnar storage

For new projects in 2026:

Already on Postgres → pgvector. Lowest operational cost; good through ~10M vectors.
Self-host, want feature-rich → Qdrant.
Don't want to operate → Pinecone or Qdrant Cloud or Turbopuffer.
Hyperscale (>100M vectors) → Milvus or specialized service.

Learning Path

1. Getting Started

Run Qdrant in Docker, generate embeddings, store and query - hello world RAG

2. Hybrid Search

Combining keyword and vector for better results; filters; reranking; chunking

3. Best Practices

Embedding models, dimensionality, indexes, metadata, observability, cost

What's an Embedding

A function (the embedding model) that maps content to a fixed-size vector. Properties that matter:

Same model for documents and queries — vectors only mean the same thing within one model's vector space.
Dimensionality — 384, 768, 1024, 1536, 3072 are common. Higher = more nuanced but bigger storage and compute.
Semantic — similar meaning → similar vectors (cosine similarity ~1); different meaning → vectors far apart (~0 or even negative).
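
The storage side of the dimensionality trade-off is simple arithmetic. The sketch below estimates the raw size of the vectors alone, assuming 4-byte float32 components; index structures, metadata, and replication add more on top. The function name `raw_vector_bytes` is illustrative.

```python
def raw_vector_bytes(num_vectors, dims, bytes_per_component=4):
    """Raw size of the vectors only (float32 by default), excluding index overhead."""
    return num_vectors * dims * bytes_per_component

for dims in (384, 1536, 3072):
    gib = raw_vector_bytes(1_000_000, dims) / 2**30
    print(f"1M vectors x {dims} dims = {gib:.2f} GiB")
```

For one million vectors this works out to roughly 1.4, 5.7, and 11.4 GiB respectively, so doubling the dimensionality doubles the raw footprint.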

Popular embedding models in 2026:

Model	Dims	Notes
OpenAI `text-embedding-3-small`	1536	Cheap; good quality
OpenAI `text-embedding-3-large`	3072	Better; pricier
Cohere `embed-v3`	1024	Multi-language; well-regarded
Voyage AI `voyage-3`	1024-2048	Quality-leading for code / docs
BAAI `bge-large-en-v1.5`	1024	Open-source; self-host with `sentence-transformers`
`nomic-embed-text-v1.5`	64-768 (Matryoshka)	Open-source, configurable dims

For self-host: bge-large or nomic-embed on a small GPU. For SaaS: OpenAI is the easy default; Voyage if quality matters.
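
The configurable dimensions listed for Matryoshka-style models generally work by keeping a prefix of the full vector and re-normalizing it. The sketch below shows only that generic truncate-and-renormalize step on a made-up vector. Individual models may document extra steps, so check the model card before relying on it; `truncate_embedding` is a hypothetical helper name.

```python
import math

def truncate_embedding(vec, dims):
    """Keep the first `dims` components and rescale the result to unit length."""
    if not 0 < dims <= len(vec):
        raise ValueError("dims must be between 1 and len(vec)")
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]

full = [3.0, 4.0, 12.0, 0.5]         # made-up 4-dim vector
short = truncate_embedding(full, 2)  # keep 2 dims, rescale to length 1
print(short)
```

Documents and queries must be truncated to the same number of dimensions, for the same reason they must come from the same model.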

Common Use Cases

Use case	What you do
RAG (Retrieval-Augmented Generation)	Embed docs; for each query, retrieve top-K relevant docs; pass to LLM as context
Semantic search	"Find products like this" without keyword matching
Recommendation systems	Embed users + items; find users with similar tastes
Deduplication	Embed content; find near-duplicates with high cosine similarity
Image search	Embed images with CLIP-like models; search by image or text
Code search	Embed code; find similar functions across a codebase
Classification	Embed examples per class; classify by nearest centroid
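
The classification row can be sketched in a few lines: average the example embeddings for each class into a centroid, then assign a new item to the class whose centroid is most similar. The two-dimensional vectors and the labels are invented for illustration, and cosine similarity is one reasonable choice of metric rather than a requirement.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical labeled example embeddings, a few per class.
EXAMPLES = {
    "billing": [[0.9, 0.1], [0.8, 0.2]],
    "technical": [[0.1, 0.9], [0.2, 0.7]],
}
CENTROIDS = {label: centroid(vecs) for label, vecs in EXAMPLES.items()}

def classify(vec):
    """Return the label whose centroid is most similar to `vec`."""
    return max(CENTROIDS, key=lambda label: cosine(vec, CENTROIDS[label]))

print(classify([0.7, 0.3]))  # billing
```

Adding a class only requires embedding a few labeled examples and computing one more centroid; nothing is retrained.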

For most RAG and search applications, vector search alone underperforms: keyword matching catches things vectors miss (proper nouns, exact terms), and vector search catches things keywords miss (synonyms, paraphrasing). The winning approach is hybrid; see Hybrid Search.
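
Reciprocal rank fusion (RRF) is one common way to merge a keyword ranking with a vector ranking; it is used here as an illustrative choice, not as the only approach. In the sketch, the two ranked lists are invented, and `k=60` is a conventional smoothing constant rather than a tuned value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of ids: each appearance adds 1 / (k + rank), rank starting at 1."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

keyword_hits = ["doc_a", "doc_b", "doc_c"]  # hypothetical keyword-search ranking
vector_hits = ["doc_c", "doc_a", "doc_d"]   # hypothetical vector-search ranking
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
# ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear in both lists rise to the top, and because RRF uses only ranks, the keyword and vector scores never need to be put on a common scale.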
