Production search - indexing strategy, relevance tuning, security, multi-tenancy, observability, pitfalls

Best Practices

Patterns that apply whether you're on Algolia, Meilisearch, Typesense, or Elasticsearch. Tuning details differ; principles don't.

Index Design

The most important decision after picking an engine.

One Document Per Search Result

Build the index around the unit users search for. A product detail page → one document per product. An article → one document per article. Don't search a normalized join — pre-denormalize at index time.

// Product document for search
{
  id: "prod-42",
  name: "Espresso Maker XL",
  brand: "Acme",
  brand_id: "acme",                        // for filtering
  category: "Kitchen > Coffee Machines",   // hierarchical for facets
  price: 199.99,
  currency: "USD",
  in_stock: true,
  popularity: 832,                          // for custom ranking
  description: "...",
  tags: ["espresso", "coffee", "kitchen"],
  thumbnail: "https://...",
  url: "/products/prod-42",
}

Include what you need to filter, rank, and render — not the full database row.
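As a rough illustration of denormalizing at index time, the sketch below flattens three hypothetical normalized records (a product row, a brand lookup, and a category lookup) into one search document. The table shapes, the `stock_qty` and `cost_price` fields, and the `toSearchDocument` helper are assumptions made for this example, not part of any engine's API.

```javascript
// Hypothetical normalized rows, as they might come from three tables.
const product = {
  id: "prod-42", name: "Espresso Maker XL", brand_id: "acme",
  category_id: "c7", price: 199.99, stock_qty: 12, cost_price: 80,
};
const brands = { acme: { name: "Acme" } };
const categories = { c7: { path: ["Kitchen", "Coffee Machines"] } };

// Flatten the join into one self-contained search document.
function toSearchDocument(product, brands, categories) {
  return {
    id: product.id,
    name: product.name,
    brand: brands[product.brand_id].name,
    brand_id: product.brand_id,
    category: categories[product.category_id].path.join(" > "),
    price: product.price,
    in_stock: product.stock_qty > 0,
    url: `/products/${product.id}`,
    // cost_price is deliberately left out: it is not needed to filter, rank, or render.
  };
}

const doc = toSearchDocument(product, brands, categories);
console.log(doc);
```

The search engine never sees the join; it only sees the finished document, so a query needs no lookups at search time.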

Searchable vs Filterable vs Displayable

Most engines distinguish four roles for an attribute:

Role	Used for	Example
Searchable	Full-text matching	`name`, `description`, `tags`
Filterable	Equality / range filters	`category`, `price`, `in_stock`
Sortable	Order results	`price`, `popularity`, `created_at`
Displayable	Render in UI	`thumbnail`, `url`

Attributes can have multiple roles. Mark them appropriately — engines use this to optimize storage and queries.
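One way to keep roles explicit is to declare them per attribute and derive the engine settings from that declaration. The sketch below assumes Meilisearch-style setting names (`searchableAttributes`, `filterableAttributes`, `sortableAttributes`, `displayedAttributes`); the `roles` map and the `toSettings` helper are hypothetical.

```javascript
// Hypothetical role map for the product document. Declaration order matters
// for searchable attributes, because engines typically treat it as priority.
const roles = {
  name:        ["searchable", "displayed"],
  category:    ["searchable", "filterable", "displayed"],
  tags:        ["searchable"],
  description: ["searchable"],
  price:       ["filterable", "sortable", "displayed"],
  in_stock:    ["filterable"],
  popularity:  ["sortable"],
  thumbnail:   ["displayed"],
  url:         ["displayed"],
};

// Collect attribute names per role, preserving declaration order.
function toSettings(roles) {
  const pick = (role) =>
    Object.keys(roles).filter((attr) => roles[attr].includes(role));
  return {
    searchableAttributes: pick("searchable"),
    filterableAttributes: pick("filterable"),
    sortableAttributes: pick("sortable"),
    displayedAttributes: pick("displayed"),
  };
}

const settings = toSettings(roles);
console.log(settings);
```

Other engines use different names for the same ideas, so treat the output keys as an example mapping rather than a universal schema.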

Searchable Attribute Order

Multiple searchable attributes are usually ranked: a match in name outweighs a match in description. Algolia and Meilisearch let you set the order:

searchableAttributes: [
  "name",          // highest weight
  "brand",
  "category",
  "tags",
  "description",   // lowest weight
]

A bad order ruins relevance: if description outranks name, a query for "espresso" ranks any product that merely mentions espresso in its description alongside or above actual espresso machines.

Relevance Tuning

Default relevance is good. Custom ranking signals are how you make it great.

Built-in Ranking Rules

Most engines rank by:

Typos — fewer typos rank higher
Proximity — query words close together rank higher
Attribute — match in higher-weight attribute ranks higher
Exactness — exact match outranks prefix match
Custom signals (your turn)

Custom Ranking

Add signals that reflect business value:

Popularity — items with more views/clicks rank higher
Stock — in-stock items above out-of-stock
Recency — newer items boosted
Featured / promoted — manual boosts

// Meilisearch: rank tied results by these attributes
"rankingRules": [
  "words", "typo", "proximity", "attribute", "exactness",
  "popularity:desc",          // custom: most popular wins ties
  "in_stock:desc",
]
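To make the tie-breaking idea concrete, here is a toy comparator, not any engine's actual algorithm. It assumes each hit carries a hypothetical `textRank` value (lower is better) standing in for the combined outcome of the built-in rules, and only falls through to the custom signals when that value ties.

```javascript
// Hypothetical hits; `textRank` stands in for the built-in rules' verdict.
const hits = [
  { id: "a", textRank: 1, popularity: 120,  in_stock: true },
  { id: "b", textRank: 1, popularity: 832,  in_stock: false },
  { id: "c", textRank: 2, popularity: 9999, in_stock: true },
  { id: "d", textRank: 1, popularity: 120,  in_stock: false },
];

function compareHits(x, y) {
  // Built-in rules decide first.
  if (x.textRank !== y.textRank) return x.textRank - y.textRank;
  // popularity:desc breaks ties between equally relevant hits.
  if (x.popularity !== y.popularity) return y.popularity - x.popularity;
  // in_stock:desc breaks whatever ties remain.
  return Number(y.in_stock) - Number(x.in_stock);
}

const ranked = [...hits].sort(compareHits).map((h) => h.id);
console.log(ranked);
```

Note that hit `c` stays last despite its very high popularity: custom signals placed after the built-in rules reorder equally relevant results, they do not override textual relevance.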

Synonyms

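// Meilisearch-style synonym map (Algolia expresses the same idea as synonym records).
// Entries are one-way: add the reverse mappings if terms should match mutually.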
{
  "couch": ["sofa", "loveseat"],
  "sneakers": ["trainers", "tennis shoes"],
}

Mine your search analytics for different terms that should return the same products: if users search for both "sneakers" and "trainers", both queries should surface the same items.
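Where the engine treats a key-to-list synonym map as one-directional (as Meilisearch does), mutual synonyms need an entry for every term. The sketch below expands groups of equivalent terms into such a map; the `groups` data and the `buildSynonymMap` helper are illustrative assumptions.

```javascript
// Groups of terms that analytics suggest should return the same products.
const groups = [
  ["couch", "sofa", "loveseat"],
  ["sneakers", "trainers", "tennis shoes"],
];

// Expand each group so every term maps to all the other terms in its group.
function buildSynonymMap(groups) {
  const map = {};
  for (const group of groups) {
    for (const term of group) {
      map[term] = group.filter((other) => other !== term);
    }
  }
  return map;
}

const synonyms = buildSynonymMap(groups);
console.log(synonyms);
```

Keeping the groups as the source of truth and generating the map avoids the common mistake of adding "couch → sofa" but forgetting "sofa → couch".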

Click-Through Tuning (the dream goal)

Top-tier search:

Log every query + click.
Build a model: which queries / contexts make which results valuable.
Re-rank: rank up things that get clicked, down things that don't.

Algolia bakes click-tuning into their product. With Meilisearch/Typesense you can roll your own using a custom ranking attribute that you periodically recompute from click data.
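A minimal sketch of the roll-your-own approach: compute a smoothed click-through score per document from aggregated logs, then push it as a custom ranking attribute. The `stats` rows, the `click_score` attribute name, and the prior values are assumptions chosen for illustration; a production system would likely also account for position bias.

```javascript
// Hypothetical click log rows, aggregated per document.
const stats = [
  { id: "prod-1", impressions: 1000, clicks: 80 },
  { id: "prod-2", impressions: 3, clicks: 3 },
  { id: "prod-3", impressions: 0, clicks: 0 },
];

// Smoothed CTR: pull low-traffic items toward a prior so that a 3-for-3 item
// does not outrank one with a long, solid history.
function clickScore({ impressions, clicks }, priorCtr = 0.05, priorWeight = 200) {
  const smoothed = (clicks + priorCtr * priorWeight) / (impressions + priorWeight);
  return Math.round(smoothed * 1000); // integer score, easy to sort on
}

// Partial documents to send through the engine's update call.
const updates = stats.map((s) => ({ id: s.id, click_score: clickScore(s) }));
console.log(updates);
```

Recomputing this on a schedule (nightly, for instance) and adding `click_score` as a descending custom ranking rule gives a basic feedback loop without any model training.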

Indexing Strategy

Batch on Initial Load, Stream on Updates

Initial indexing of millions of records: send batches (for example, 5,000 docs per request), optionally several in parallel. In steady state, push individual document updates as they happen (CDC stream, message queue, app-level hooks).

// Bulk indexing
const BATCH = 5000;
for (let i = 0; i < docs.length; i += BATCH) {
  await index.addDocuments(docs.slice(i, i + BATCH));
}

// Streaming updates
await index.updateDocuments([changedDoc]);   // single doc

Async indexing is the norm — addDocuments returns immediately with a task ID; the engine indexes in the background. Poll the task or just trust it for non-critical paths.
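The bulk loop can be extended to keep a few requests in flight at once. The sketch below is engine-agnostic: `sendBatch` is a hypothetical async function that wraps your engine's bulk call and resolves to whatever task identifier it returns. The default batch size and concurrency are placeholders to tune, not recommendations.

```javascript
// Split an array into fixed-size batches.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Send batches with at most `concurrency` requests in flight.
// `sendBatch` is a hypothetical wrapper around the engine's bulk endpoint.
async function indexInBatches(docs, sendBatch, { batchSize = 5000, concurrency = 4 } = {}) {
  const queue = chunk(docs, batchSize);
  const taskIds = [];
  async function worker() {
    while (queue.length > 0) {
      const batch = queue.shift();
      taskIds.push(await sendBatch(batch)); // completion order, not batch order
    }
  }
  await Promise.all(Array.from({ length: concurrency }, () => worker()));
  return taskIds;
}
```

With a Meilisearch-style client, `sendBatch` might simply be `(batch) => index.addDocuments(batch)`; the returned task IDs can then be polled for critical paths.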

CDC From Your Database

The reliable pattern: change-data-capture from Postgres / MySQL → message queue → indexer → search engine.

Postgres ──► Debezium ──► Kafka ──► indexer service ──► Meilisearch
            (logical replication or wal2json)

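Provided the indexer applies changes as idempotent upserts keyed by document ID and retries on failure, the search index converges with the database, and the same pipeline handles backfills and survives transient failures.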

For smaller scale, outbox pattern in your app: write to DB and an outbox table in the same transaction; a worker reads outbox and pushes to search.
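The worker side of the outbox pattern might look like the sketch below, which collapses a batch of outbox rows into the minimal set of upserts and deletes. The row shape (`seq`, `op`, `doc`) and the `collapseOutbox` helper are assumptions for this example; the actual push to the search engine and the step that marks rows as processed are left out.

```javascript
// Hypothetical outbox rows, in commit order. `op` is "upsert" or "delete".
const outbox = [
  { seq: 1, op: "upsert", doc: { id: "prod-1", price: 10 } },
  { seq: 2, op: "upsert", doc: { id: "prod-2", price: 20 } },
  { seq: 3, op: "upsert", doc: { id: "prod-1", price: 12 } },
  { seq: 4, op: "delete", doc: { id: "prod-2" } },
];

// Keep only the latest event per document ID, so replaying the same batch
// (after a crash, say) produces the same index state.
function collapseOutbox(events) {
  const latest = new Map();
  for (const event of events) latest.set(event.doc.id, event);
  const upserts = [];
  const deletes = [];
  for (const event of latest.values()) {
    if (event.op === "delete") deletes.push(event.doc.id);
    else upserts.push(event.doc);
  }
  const lastSeq = events.length ? events[events.length - 1].seq : null;
  return { upserts, deletes, lastSeq };
}

const plan = collapseOutbox(outbox);
console.log(plan);
```

The worker would send `plan.upserts` and `plan.deletes` to the engine and only then record `plan.lastSeq` as processed, so a failure in between causes a harmless replay rather than a lost update.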

Re-indexing

Sometimes you have to wipe and reindex (schema change, ranking config that requires it). Strategies:

Strategy	Notes
Atomic swap	Index into `products_v2`; switch app to point to it; delete `products`. Zero downtime.
In-place re-index	Engine handles; brief period of split state
Reindex while writes continue	Pause writes (small downtime) OR dual-write to old and new

Atomic swap (also called "blue-green index") is the safe default.
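The mechanics of the swap can be modeled with an alias that points at a physical index. Engines expose the flip differently (for example, collection aliases in Typesense or an index swap operation in Meilisearch), so the sketch below uses an in-memory stand-in instead of a real client; `engine`, `search`, and `reindexWithSwap` are all hypothetical.

```javascript
// In-memory stand-in for a search engine: physical indexes plus an alias table.
// This is a hypothetical model, not a real client API.
const engine = {
  indexes: new Map([["products_v1", [{ id: "old-1" }]]]),
  aliases: new Map([["products", "products_v1"]]),
};

// The app always searches through the alias, never a physical index name.
function search(engine, alias) {
  return engine.indexes.get(engine.aliases.get(alias));
}

function reindexWithSwap(engine, alias, newIndexName, docs) {
  engine.indexes.set(newIndexName, docs);        // 1. build the new index off to the side
  const oldIndexName = engine.aliases.get(alias);
  engine.aliases.set(alias, newIndexName);       // 2. flip the alias in a single step
  engine.indexes.delete(oldIndexName);           // 3. drop the old index afterwards
  return oldIndexName;
}

reindexWithSwap(engine, "products", "products_v2", [{ id: "new-1" }, { id: "new-2" }]);
console.log(search(engine, "products").map((d) => d.id));
```

Readers see either the complete old index or the complete new one, never a half-built state. In practice you may want to keep the old index for a while before step 3, so a bad reindex can be rolled back by flipping the alias again.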

Security

Public Keys for Frontend

Never put admin / master keys in the browser. Issue search-only keys scoped to specific indexes:

// Meilisearch
{
  description: "Public search key",
  actions: ["search"],
  indexes: ["products"],
  expiresAt: null,
}

Per-user scoped keys with embedded filters are useful for multi-tenant setups:

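// Algolia secured API key (JS client v4 style; generate it server-side only).
// Signs the user's tenant filter into the key.
// appId, adminKey, and searchKey are placeholders for your own credentials.
const client = algoliasearch(appId, adminKey);
const secureKey = client.generateSecuredApiKey(
  searchKey,                     // parent search-only key
  { filters: 'tenant_id:42' }    // restriction embedded in the derived key
);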

The key embeds the filter — the user can search, but only within their tenant's data. They can't tamper with the filter (HMAC-signed).

Don't Index Sensitive Fields

Search indexes are less locked down than your DB. Don't index:

Passwords, hashes, tokens
PII you don't need for search (SSN, full credit card)
Internal fields users shouldn't see

If you must index user-identifying data (email, phone), at least make it filterable but not displayable.
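An allowlist applied just before indexing is one simple way to enforce this. In the sketch below, `INDEXABLE_FIELDS`, `toIndexableDoc`, and the sample user row are hypothetical; the point is that new database columns stay out of the index until someone adds them deliberately.

```javascript
// Fields allowed into the search index; everything else is dropped.
const INDEXABLE_FIELDS = ["id", "display_name", "email", "team"];

function toIndexableDoc(row, allowed = INDEXABLE_FIELDS) {
  const doc = {};
  for (const field of allowed) {
    if (row[field] !== undefined) doc[field] = row[field];
  }
  return doc;
}

// Hypothetical user row straight from the database.
const row = {
  id: "u-7", display_name: "Sam", email: "sam@example.com", team: "ops",
  password_hash: "redacted", ssn: "redacted",
};

const safeDoc = toIndexableDoc(row);
console.log(safeDoc);
```

Here `email` is indexed so it can be filtered on; keeping it out of rendered results is then a matter of excluding it from the engine's displayed attributes.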

Rate Limiting

Add gateway-level rate limits on search endpoints (see API Gateway). Each search request is cheap, but search endpoints attract bots: a script can hammer them to enumerate your catalog.
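Most gateways implement some variant of a token bucket. The sketch below shows the idea in-process, purely for illustration; in production the limit would normally live in the gateway itself, keyed by API key or IP. The `TokenBucket` class and its parameters are assumptions of this example.

```javascript
// Token bucket: a client may burst up to `capacity` requests and regains
// `refillPerSec` tokens per second. The clock is injectable for testing.
class TokenBucket {
  constructor(capacity, refillPerSec, now = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity;
    this.last = now();
  }

  tryConsume() {
    const t = this.now();
    const elapsedSec = (t - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.last = t;
    if (this.tokens < 1) return false; // caller should answer with HTTP 429
    this.tokens -= 1;
    return true;
  }
}

// Example: allow bursts of 20 searches, refilling 5 per second.
const limiter = new TokenBucket(20, 5);
console.log(limiter.tryConsume());
```

A burst allowance matters for search-as-you-type, where one user legitimately fires several requests per second; the sustained refill rate is what stops a scraper.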

Multi-Tenancy

Two approaches:

One Index per Tenant

products-tenant-1
products-tenant-2
products-tenant-3

Pros: clean isolation; per-tenant ranking config; easy deletion. Cons: scales poorly past hundreds of tenants; cross-tenant queries are awkward.

One Index, Tenant ID Filter

products (single index)
   - tenant_id is filterable

Each tenant's search key embeds the filter tenant_id:N, so users can only see their own tenant's data.

Pros: scales to millions of tenants; smaller infrastructure footprint. Cons: shared ranking; if you index sensitive data, a misconfigured key leaks it.

For most SaaS, "one index, filtered key" wins. For per-tenant customization, "one index per tenant."
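If searches go through your own backend rather than a signed key, the tenant clause has to be added server-side and user-supplied filters must not be able to escape it. The sketch below assumes a Meilisearch-style filter expression syntax and builds the filter from structured conditions instead of concatenating a raw user string; the allowlists and the `buildTenantFilter` helper are hypothetical.

```javascript
const ALLOWED_FIELDS = new Set(["in_stock", "price", "brand_id"]);
const ALLOWED_OPS = new Set(["=", "!=", "<", "<=", ">", ">="]);

// Serialize a value conservatively; anything unusual is rejected outright.
function literal(value) {
  if (typeof value === "boolean") return String(value);
  if (typeof value === "number" && Number.isFinite(value)) return String(value);
  if (typeof value === "string" && /^[\w-]+$/.test(value)) return `"${value}"`;
  throw new Error("unsupported filter value");
}

// tenantId comes from the authenticated session, never from the request body.
function buildTenantFilter(tenantId, conditions = []) {
  if (!Number.isInteger(tenantId)) throw new Error("tenantId must be an integer");
  const clauses = [`tenant_id = ${tenantId}`];
  for (const { field, op, value } of conditions) {
    if (!ALLOWED_FIELDS.has(field) || !ALLOWED_OPS.has(op)) {
      throw new Error(`unsupported filter: ${field} ${op}`);
    }
    clauses.push(`${field} ${op} ${literal(value)}`);
  }
  return clauses.join(" AND ");
}

console.log(buildTenantFilter(42, [
  { field: "in_stock", op: "=", value: true },
  { field: "price", op: "<", value: 50 },
]));
```

Because every clause is joined with AND and no raw text is accepted, a request cannot smuggle in an `OR tenant_id = ...` condition.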

Observability

Signal	Why
Query latency p99	User-facing; alert above SLO
Indexing lag	Time from DB change to searchable document (normally seconds to minutes); alert if growing
Zero-result rate	Queries returning nothing — opportunity for synonyms, query understanding
Top queries	What users actually look for; informs taxonomy and ranking
CTR per query	Which queries deliver value; tune low-CTR queries
Index size growth	Capacity planning

Algolia, Meilisearch, and Typesense each export metrics you can feed into Prometheus or offer managed dashboards (Algolia in its UI). Wire them into your observability stack; see Prometheus & Grafana.
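Some of these signals are not emitted by the engine and have to be derived from your own query logs. The sketch below computes p99 latency and zero-result rate from a hypothetical log format (`q`, `latencyMs`, `hits`); the nearest-rank percentile is one common definition among several.

```javascript
// Hypothetical query log entries collected by your search proxy.
const queryLog = [
  { q: "espresso", latencyMs: 12, hits: 40 },
  { q: "expresso", latencyMs: 15, hits: 0 },
  { q: "grinder", latencyMs: 9, hits: 12 },
  { q: "milk frother", latencyMs: 120, hits: 3 },
];

// Nearest-rank percentile over a list of numbers.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

function summarize(log) {
  const zero = log.filter((e) => e.hits === 0);
  return {
    p99LatencyMs: percentile(log.map((e) => e.latencyMs), 99),
    zeroResultRate: zero.length / log.length,
    zeroResultQueries: zero.map((e) => e.q),
  };
}

const summary = summarize(queryLog);
console.log(summary);
```

The list of zero-result queries is often the most actionable output: here "expresso" is an obvious candidate for a synonym or typo-tolerance check.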

Common Pitfalls

Pitfall	Symptom	Fix
Searching a normalized join	Slow, irrelevant	Denormalize at index time
Master key in browser	Hijacked account	Use search-only keys
All attributes searchable	Noisy results	Configure searchable attributes
No custom ranking	Relevant but low-value items rank first	Use popularity / business-value signals
Re-indexing in place during traffic	Half-state visible	Atomic swap pattern
Putting raw HTML in indexable text	Tags in results	Strip at index time
Logging full search queries with PII	Privacy issue	Hash or strip personal queries
One huge "everything" index	Slow, complex tuning	One index per searchable entity type
Indexing the full database row	Index larger than DB	Index only searchable, filterable, sortable, and displayable fields
No fallback for engine down	Site partially broken	Graceful degradation; cache top queries

Search UX

Beyond the engine, the UI makes or breaks search:

Search as you type (instant search) — modern bar.
Highlight matches in results.
"Did you mean..." for low-result queries.
Filters as facets — counts visible (Color: Red (43)).
Empty state — suggest popular searches, recent searches, categories.
Mobile-first — search is huge on mobile.
Track which results users click; feed that back to ranking.

InstantSearch.js gives you most of this out of the box. It is Algolia's library; Meilisearch and Typesense provide adapters for it.
