Best Practices
Production search - indexing strategy, relevance tuning, security, multi-tenancy, observability, pitfalls
Patterns that apply whether you're on Algolia, Meilisearch, Typesense, or Elasticsearch. Tuning details differ; principles don't.
Index Design
The most important decision after picking an engine.
One Document Per Search Result
Build the index around the unit users search for. A product detail page → one document per product. An article → one document per article. Don't search a normalized join — pre-denormalize at index time.
// Product document for search
{
id: "prod-42",
name: "Espresso Maker XL",
brand: "Acme",
brand_id: "acme", // for filtering
category: "Kitchen > Coffee Machines", // hierarchical for facets
price: 199.99,
currency: "USD",
in_stock: true,
popularity: 832, // for custom ranking
description: "...",
tags: ["espresso", "coffee", "kitchen"],
thumbnail: "https://...",
url: "/products/prod-42",
}
Include what you need to filter, rank, and render, not the full database row.
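A minimal sketch of denormalizing at index time. The input row shapes (`product`, `brand`, `categoryPath`) and their column names such as `price_cents` and `stock_count` are assumptions for illustration; only the output fields come from the document above.

```javascript
// Hypothetical normalized rows in, one flat search document out.
function toSearchDocument(product, brand, categoryPath) {
  return {
    id: product.id,
    name: product.name,
    brand: brand.name,
    brand_id: brand.id, // kept for filtering
    category: categoryPath.join(" > "), // hierarchical string for facets
    price: product.price_cents / 100,
    currency: product.currency,
    in_stock: product.stock_count > 0,
    popularity: product.view_count ?? 0, // input for custom ranking
    tags: product.tags ?? [],
    thumbnail: product.thumbnail_url,
    url: `/products/${product.id}`,
  };
}

const doc = toSearchDocument(
  {
    id: "prod-42", name: "Espresso Maker XL", price_cents: 19999, currency: "USD",
    stock_count: 3, view_count: 832, tags: ["espresso"],
    thumbnail_url: "https://example.com/t.png",
  },
  { id: "acme", name: "Acme" },
  ["Kitchen", "Coffee Machines"]
);
```

Columns that are not mapped here (cost, supplier notes, audit fields) never reach the index, which is the point of building the document explicitly.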
Searchable vs Filterable vs Displayable
Most engines distinguish four roles for an attribute:
| Role | Used for | Example |
|---|---|---|
| Searchable | Full-text matching | name, description, tags |
| Filterable | Equality / range filters | category, price, in_stock |
| Sortable | Order results | price, popularity, created_at |
| Displayable | Render in UI | thumbnail, url |
Attributes can have multiple roles. Mark them appropriately — engines use this to optimize storage and queries.
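One way to keep roles explicit is to declare them per attribute and derive the engine settings from that map. The sketch below assumes Meilisearch-style setting names (`searchableAttributes`, `filterableAttributes`, `sortableAttributes`, `displayedAttributes`); other engines name these differently, and the `roles` map itself is illustrative.

```javascript
// Attribute -> roles map (illustrative; mirrors the table above).
const roles = {
  name: ["searchable", "displayable"],
  description: ["searchable"],
  tags: ["searchable"],
  category: ["filterable", "displayable"],
  price: ["filterable", "sortable", "displayable"],
  in_stock: ["filterable"],
  popularity: ["sortable"],
  thumbnail: ["displayable"],
  url: ["displayable"],
};

// Collect attribute names that carry a given role, in declaration order.
const withRole = (role) =>
  Object.keys(roles).filter((attr) => roles[attr].includes(role));

// Setting names follow Meilisearch's settings object (an assumption for this sketch).
const settings = {
  searchableAttributes: withRole("searchable"),
  filterableAttributes: withRole("filterable"),
  sortableAttributes: withRole("sortable"),
  displayedAttributes: withRole("displayable"),
};
```

Because the searchable list preserves declaration order, the same map can also express attribute priority.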
Searchable Attribute Order
Multiple searchable attributes are usually ranked: a match in name outweighs a match in description. Algolia and Meilisearch let you set the order:
searchableAttributes: [
"name", // highest weight
"brand",
"category",
"tags",
"description", // lowest weight
]
A bad order ruins relevance: if description outranks name, "espresso" matches anything that mentions espresso anywhere, not espresso machines first.
Relevance Tuning
Default relevance is good. Custom ranking signals are how you make it great.
Built-in Ranking Rules
Most engines rank by:
- Typos — fewer typos rank higher
- Proximity — query words close together rank higher
- Attribute — match in higher-weight attribute ranks higher
- Exactness — exact match outranks prefix match
- Custom signals (your turn)
Custom Ranking
Add signals that reflect business value:
- Popularity — items with more views/clicks rank higher
- Stock — in-stock items above out-of-stock
- Recency — newer items boosted
- Featured / promoted — manual boosts
// Meilisearch: rank tied results by these attributes
"rankingRules": [
"words", "typo", "proximity", "attribute", "exactness",
"popularity:desc", // custom: most popular wins ties
"in_stock:desc",
]
Synonyms
// Meilisearch-style synonyms map; Algolia stores synonyms as records with a type
{
"couch": ["sofa", "loveseat"],
"sneakers": ["trainers", "tennis shoes"],
}
Mine your search analytics for different terms that should return the same products: a query like "what's the difference between sneakers and trainers" tells you the two words mean the same thing in your catalog.
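In Meilisearch, a synonym entry is one-way unless the reverse entry is also defined, so a map like the one above only expands "couch" to "sofa" and not the other way round. A small sketch that expands groups of equivalent terms into a map that works in both directions; the helper name `mutualSynonyms` is hypothetical.

```javascript
// Expand groups of equivalent terms into a map where every term
// points at all the others, so the association holds in both directions.
function mutualSynonyms(groups) {
  const map = {};
  for (const group of groups) {
    for (const term of group) {
      const others = group.filter((t) => t !== term);
      // Merge with any entry from an earlier group and drop duplicates.
      map[term] = [...new Set([...(map[term] ?? []), ...others])];
    }
  }
  return map;
}

const synonyms = mutualSynonyms([
  ["couch", "sofa", "loveseat"],
  ["sneakers", "trainers", "tennis shoes"],
]);
```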
Click-Through Tuning (the dream goal)
Top-tier search:
- Log every query + click.
- Build a model: which queries / contexts make which results valuable.
- Re-rank: rank up things that get clicked, down things that don't.
Algolia bakes click-tuning into their product. With Meilisearch/Typesense you can roll your own using a custom ranking attribute that you periodically recompute from click data.
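A rough sketch of the roll-your-own route: aggregate click and impression events into a smoothed click-through rate and write it back as the `popularity` attribute. The event shape, the prior values, and the scaling to an integer are all assumptions, not a recommendation from any engine.

```javascript
// events: [{ docId, type: "impression" | "click" }] (hypothetical log shape)
function computePopularity(events, priorClicks = 1, priorImpressions = 20) {
  const stats = new Map();
  for (const { docId, type } of events) {
    const s = stats.get(docId) ?? { clicks: 0, impressions: 0 };
    if (type === "click") s.clicks += 1;
    if (type === "impression") s.impressions += 1;
    stats.set(docId, s);
  }
  // Smoothed CTR: the prior keeps rarely shown items from jumping
  // to the top on a single lucky click.
  const updates = [];
  for (const [docId, { clicks, impressions }] of stats) {
    const ctr = (clicks + priorClicks) / (impressions + priorImpressions);
    updates.push({ id: docId, popularity: Math.round(ctr * 1000) });
  }
  return updates; // partial documents to push as updates
}
```

Run it on a schedule (hourly or nightly) and send the result as partial document updates, so the ranking rule `popularity:desc` picks up the new values.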
Indexing Strategy
Batch on Initial Load, Stream on Updates
For the initial load of millions of records, send batches (around 5,000 documents per request, several requests in parallel). Once in steady state, push individual document updates as they happen (CDC stream, message queue, app-level hooks).
// Bulk indexing
const BATCH = 5000;
for (let i = 0; i < docs.length; i += BATCH) {
await index.addDocuments(docs.slice(i, i + BATCH));
}
// Streaming updates
await index.updateDocuments([changedDoc]); // single doc
Async indexing is the norm: addDocuments returns immediately with a task ID and the engine indexes in the background. Poll the task when a later step depends on the document being searchable; on non-critical paths you can skip the wait.
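The loop above sends batches one at a time. A sketch of the parallel variant with bounded concurrency follows; `index` is assumed to be any object with an async `addDocuments(batch)` method, and the default of four concurrent requests is an arbitrary starting point.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Send batches with at most `concurrency` requests in flight.
async function indexInBatches(index, docs, { batchSize = 5000, concurrency = 4 } = {}) {
  const batches = chunk(docs, batchSize);
  const tasks = [];
  let next = 0;
  // Each worker pulls the next unclaimed batch until none remain.
  async function worker() {
    while (next < batches.length) {
      const batch = batches[next++];
      tasks.push(await index.addDocuments(batch));
    }
  }
  await Promise.all(Array.from({ length: concurrency }, () => worker()));
  return tasks; // task handles to poll for completion
}
```

Bounding concurrency keeps the engine's task queue from growing faster than it can index.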
CDC From Your Database
The reliable pattern: change-data-capture from Postgres / MySQL → message queue → indexer → search engine.
Postgres ──► Debezium ──► Kafka ──► indexer service ──► Meilisearch
(logical replication or wal2json)
This guarantees the search index converges with the database, handles backfills, and survives transient failures.
At smaller scale, use the outbox pattern in your app: write to the DB and an outbox table in the same transaction; a worker reads the outbox and pushes to search.
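One piece of the outbox worker can be sketched as a pure function: collapse a batch of outbox rows so each document is sent once, with its latest operation. The row shape (`seq`, `docId`, `op`, `payload`) is hypothetical.

```javascript
// Outbox rows (hypothetical shape): { seq, docId, op: "upsert" | "delete", payload }
// Collapse a batch so each document appears once, with its latest operation.
function coalesceOutbox(rows) {
  const latest = new Map();
  for (const row of [...rows].sort((a, b) => a.seq - b.seq)) {
    latest.set(row.docId, row); // a later seq overwrites an earlier one
  }
  const upserts = [];
  const deletes = [];
  for (const row of latest.values()) {
    if (row.op === "delete") deletes.push(row.docId);
    else upserts.push(row.payload);
  }
  const lastSeq = rows.reduce((max, r) => Math.max(max, r.seq), 0);
  return { upserts, deletes, lastSeq };
}
```

The worker would send `upserts` and `deletes` to the engine and mark rows up to `lastSeq` as processed only after the engine has accepted them, so a crash in between causes a retry rather than a lost update.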
Re-indexing
Sometimes you have to wipe and reindex (schema change, ranking config that requires it). Strategies:
| Strategy | Notes |
|---|---|
| Atomic swap | Index into products_v2; switch app to point to it; delete products. Zero downtime. |
| In-place re-index | Engine handles; brief period of split state |
| Reindex while writes continue | Pause writes (small downtime) OR dual-write to old and new |
Atomic swap (also called "blue-green index") is the safe default.
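The swap sequence can be sketched as follows. `engine` and `config` are assumed interfaces rather than a real SDK; some engines also offer a native swap or move operation that can stand in for the config step.

```javascript
// Derive the next versioned name: "products" -> "products_v2" -> "products_v3".
function nextIndexName(current) {
  const match = current.match(/^(.*)_v(\d+)$/);
  return match ? `${match[1]}_v${Number(match[2]) + 1}` : `${current}_v2`;
}

// Assumed interfaces (hypothetical):
//   engine.createIndex(name), engine.addDocuments(name, docs), engine.deleteIndex(name)
//   config.getActiveIndex(), config.setActiveIndex(name)
async function reindexWithSwap(engine, config, loadAllDocs) {
  const oldName = await config.getActiveIndex();
  const newName = nextIndexName(oldName);
  await engine.createIndex(newName);
  await engine.addDocuments(newName, await loadAllDocs());
  await config.setActiveIndex(newName); // the swap: readers move in one step
  await engine.deleteIndex(oldName); // only after traffic has moved
  return newName;
}
```

In practice you would also wait for the new index to finish its indexing tasks before switching, and keep the old index around briefly in case you need to roll back.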
Security
Public Keys for Frontend
Never put admin / master keys in the browser. Issue search-only keys scoped to specific indexes:
// Meilisearch
{
description: "Public search key",
actions: ["search"],
indexes: ["products"],
expiresAt: null,
}
Per-user scoped keys with attribute filters are useful for multi-tenant setups:
// Algolia secured API key: signs the user's tenant filter into the key
// (server-side; `client` is an algoliasearch client instance, v4-style call)
const secureKey = client.generateSecuredApiKey(
searchKey,
{ filters: 'tenant_id:42' }
);
The key embeds the filter, so the user can search but only within their tenant's data. They can't tamper with the filter because it is HMAC-signed.
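To see why a signed filter resists tampering, here is a generic HMAC sketch using Node's `crypto` module. It illustrates the idea only; the token layout is invented for this example and is not Algolia's actual key format.

```javascript
const { createHmac, timingSafeEqual } = require("node:crypto");

// Sign the restrictions with the parent key; the token carries signature and restrictions.
function signScopedKey(parentKey, restrictions) {
  const payload = new URLSearchParams(restrictions).toString();
  const sig = createHmac("sha256", parentKey).update(payload).digest("hex");
  return Buffer.from(`${sig}.${payload}`).toString("base64");
}

// Recompute the signature and reject any token whose payload was altered.
function verifyScopedKey(parentKey, token) {
  const decoded = Buffer.from(token, "base64").toString();
  const dot = decoded.indexOf(".");
  const sig = decoded.slice(0, dot);
  const payload = decoded.slice(dot + 1);
  const expected = createHmac("sha256", parentKey).update(payload).digest("hex");
  const ok =
    sig.length === expected.length &&
    timingSafeEqual(Buffer.from(sig), Buffer.from(expected));
  return ok ? Object.fromEntries(new URLSearchParams(payload)) : null;
}
```

Changing `tenant_id` inside the token changes the payload, so the recomputed signature no longer matches and the request is rejected. Only a holder of the parent key can mint a valid token, which is why signing happens on the server.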
Don't Index Sensitive Fields
Search indexes are less locked down than your DB. Don't index:
- Passwords, hashes, tokens
- PII you don't need for search (SSN, full credit card)
- Internal fields users shouldn't see
If you must index user-identifying data (email, phone), at least make it filterable but not displayable.
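A simple guard in the indexer can catch sensitive fields before they are sent. The pattern list below is illustrative and would need extending for a real schema.

```javascript
// Field-name patterns that must never reach the index (illustrative list).
const FORBIDDEN = [/password/i, /token/i, /secret/i, /ssn/i, /card_number/i];

// Return the names of fields that match a forbidden pattern.
function findSensitiveFields(doc) {
  return Object.keys(doc).filter((key) => FORBIDDEN.some((re) => re.test(key)));
}

// Throw before indexing rather than discovering the leak afterwards.
function assertIndexable(doc) {
  const hits = findSensitiveFields(doc);
  if (hits.length > 0) {
    throw new Error(`Refusing to index sensitive fields: ${hits.join(", ")}`);
  }
  return doc;
}
```

A name-based check is only a backstop; building documents from an explicit allowlist of fields remains the primary defense.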
Rate Limiting
Add gateway-level rate limits on search endpoints (see API Gateway). Search is cheap per request, but it is also a bot magnet: a script can hammer it to map your catalog.
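Most gateways implement some form of token bucket. A minimal in-process sketch, assuming one bucket per client key and a clock that can be injected for testing:

```javascript
// Token bucket: `capacity` bounds bursts, `refillPerSec` bounds the sustained rate.
class TokenBucket {
  constructor(capacity, refillPerSec, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity;
    this.last = now;
  }

  // Returns true if the request may proceed; `now` is in milliseconds.
  tryTake(now = Date.now()) {
    const elapsedSec = Math.max(0, now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.last = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

A capacity of a few dozen with a refill of a few requests per second usually leaves search-as-you-type usable while making catalog scraping slow; the right numbers depend on your traffic.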
Multi-Tenancy
Two approaches:
One Index per Tenant
products-tenant-1
products-tenant-2
products-tenant-3
Pros: clean isolation; per-tenant ranking config; easy deletion. Cons: scales poorly past hundreds of tenants; cross-tenant queries are awkward.
One Index, Tenant ID Filter
products (single index)
- tenant_id is filterable
Search keys per tenant embed the filter tenant_id:N, so a user can only see their own tenant's data.
Pros: scales to millions of tenants; smaller infrastructure footprint. Cons: shared ranking; if you index sensitive data, a misconfigured key leaks it.
For most SaaS, "one index, filtered key" wins. For per-tenant customization, "one index per tenant."
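If searches are proxied through your own backend instead of using a signed key, the same rule applies: the tenant clause is set by the server and user filters are built from validated parts, never from a raw string. The sketch assumes Meilisearch-style filter expressions; `ALLOWED_FIELDS` and the clause shape are hypothetical.

```javascript
const ALLOWED_FIELDS = new Set(["category", "price", "in_stock"]);
const ALLOWED_OPS = new Set(["=", "!=", "<", "<=", ">", ">="]);

// Build one filter clause from validated parts.
function clause({ field, op, value }) {
  if (!ALLOWED_FIELDS.has(field)) throw new Error(`field not filterable: ${field}`);
  if (!ALLOWED_OPS.has(op)) throw new Error(`operator not allowed: ${op}`);
  if (!["string", "number", "boolean"].includes(typeof value)) {
    throw new TypeError("value must be a string, number, or boolean");
  }
  return `${field} ${op} ${JSON.stringify(value)}`; // strings come out quoted and escaped
}

// The tenant clause always comes from the server-side session, never from the request.
function tenantScopedFilter(tenantId, userClauses = []) {
  if (!Number.isInteger(tenantId)) throw new TypeError("tenantId must be an integer");
  return [`tenant_id = ${tenantId}`, ...userClauses.map(clause)].join(" AND ");
}
```

Because `tenant_id` is not in the allowlist, a caller cannot add a clause that widens or overrides the tenant restriction.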
Observability
| Signal | Why |
|---|---|
| Query latency p99 | User-facing; alert above SLO |
| Indexing lag | DB change → search index seconds-to-minutes; alert if growing |
| Zero-result rate | Queries returning nothing — opportunity for synonyms, query understanding |
| Top queries | What users actually look for; informs taxonomy and ranking |
| CTR per query | Which queries deliver value; tune low-CTR queries |
| Index size growth | Capacity planning |
Algolia, Meilisearch, and Typesense each expose metrics you can feed into Prometheus or offer managed dashboards (Algolia in its UI). Wire them into your observability stack; see Prometheus & Grafana.
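If you log queries yourself, two of the signals in the table can be computed directly from that log. A small sketch, assuming log entries shaped like `{ query, latencyMs, hits }` (a hypothetical format) and a nearest-rank percentile:

```javascript
// Nearest-rank percentile over a list of latencies (milliseconds).
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Summarize a window of query log entries.
function summarize(log) {
  const zero = log.filter((e) => e.hits === 0);
  return {
    p99LatencyMs: percentile(log.map((e) => e.latencyMs), 99),
    zeroResultRate: log.length ? zero.length / log.length : 0,
    zeroResultQueries: [...new Set(zero.map((e) => e.query))],
  };
}
```

The list of zero-result queries is the most actionable output: it is the raw material for new synonyms.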
Common Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Searching a normalized join | Slow, irrelevant | Denormalize at index time |
| Master key in browser | Hijacked account | Use search-only keys |
| All attributes searchable | Noisy results | Configure searchable attributes |
| No custom ranking | Results ignore business value | Use popularity / business-value signals |
| Re-indexing in place during traffic | Half-state visible | Atomic swap pattern |
| Putting raw HTML in indexable text | Tags in results | Strip at index time |
| Logging full search queries with PII | Privacy issue | Hash or strip personal queries |
| One huge "everything" index | Slow, complex tuning | One index per searchable entity type |
| Indexing every database column | Index larger than DB | Index only the fields you search, filter, sort, or display |
| No fallback for engine down | Site partially broken | Graceful degradation; cache top queries |
Search UX
Beyond the engine, the UI makes or breaks search:
- Search as you type (instant search): the baseline users now expect.
- Highlight matches in results.
- "Did you mean..." for low-result queries.
- Filters as facets: counts visible (Color: Red (43)).
- Empty state: suggest popular searches, recent searches, categories.
- Mobile-first — search is huge on mobile.
- Track which results users click; feed that back to ranking.
InstantSearch.js (Algolia's UI library, usable with Meilisearch and Typesense through their adapters) gives you most of this out of the box.
Checklist
Production search checklist
- Search-only API keys for the browser; admin keys backend-only
- Searchable / filterable / sortable / displayable roles configured per attribute
- Custom ranking attribute (popularity / stock / recency)
- Synonyms list maintained from search analytics
- Initial bulk index + streaming updates (CDC or outbox)
- Atomic-swap re-indexing for breaking schema changes
- Per-tenant scoped keys (HMAC-signed filters) for multi-tenant apps
- PII / secrets not indexed
- Rate limits on the search endpoint at the gateway
- Query latency, zero-result rate, top queries monitored
- InstantSearch.js or equivalent UI library
- Empty-state UX with suggestions
- Graceful degradation if search is down (cached top results)
- Backups (self-host) or trust the SaaS SLA (Algolia)
- Capacity planned for index growth