Steven's Knowledge

Best Practices

Production feature flags - flag lifecycle, debt management, evaluation latency, fail-safe defaults, ops

Feature flags are easy to add and easy to forget. The patterns below stop them from becoming the next pile of technical debt.

Flag Lifecycle

Every release flag should have a scheduled death:

proposed → built behind flag → rolled out → 100% → flag removed from code → flag deleted from platform

Track it. Every flag in the platform should have:

  • Owner — a team or person who decides when to retire it.
  • Type — release / experiment / permission / ops.
  • Expected retirement date — even just "End of Q3."
  • Linked PR or ticket for the cleanup.

Without ownership and a date, flags accumulate forever.
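One way to keep that metadata honest is a small audit that reports which flags are missing which fields. The sketch below is a minimal illustration: the `FlagMetadata` shape and its field names (`retireBy`, `cleanupTicket`) are assumptions for this example, not any platform's schema, and it assumes the retirement date is stored as a parseable date string rather than free text.

```typescript
// Hypothetical metadata record; field names are illustrative only.
type FlagType = "release" | "experiment" | "permission" | "ops";

interface FlagMetadata {
  key: string;
  owner?: string;
  type?: FlagType;
  retireBy?: string; // assumed to be a parseable date, e.g. "2025-09-30"
  cleanupTicket?: string;
}

// Returns the names of the required fields a flag is missing.
function missingFields(flag: FlagMetadata): string[] {
  const missing: string[] = [];
  if (!flag.owner) missing.push("owner");
  if (!flag.type) missing.push("type");
  if (!flag.retireBy || Number.isNaN(Date.parse(flag.retireBy))) {
    missing.push("retireBy");
  }
  if (!flag.cleanupTicket) missing.push("cleanupTicket");
  return missing;
}

// Maps each incomplete flag key to the fields it still needs.
function auditFlags(flags: FlagMetadata[]): Map<string, string[]> {
  const report = new Map<string, string[]>();
  for (const flag of flags) {
    const missing = missingFields(flag);
    if (missing.length > 0) report.set(flag.key, missing);
  }
  return report;
}
```

Running something like this in CI against an export of the platform's flag list would surface unowned flags before they age.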

Manage Flag Debt

Flags are debt. Plan to repay them.

| Practice | Effect |
|---|---|
| Monthly flag-debt review | Walk through aging release flags; either retire or justify |
| Quarterly mass cleanup | Tag flags with "retire-next-cleanup", batch-delete |
| Stale flag alerts | Platforms flag definitions older than N days; review them |
| Code search before deleting | grep -r 'flags.isEnabled.*new-checkout' |
| Dual checks | After removing the code, leave the flag for 1 release in case of rollback |

A common rhythm: release flags > 60 days old are reviewed; > 90 days old need an explicit justification or get removed.
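The 60/90-day rhythm is easy to automate. The following is a rough sketch, assuming each flag's creation date is available; the `ReviewStatus` names are made up for this example.

```typescript
type ReviewStatus = "ok" | "review" | "justify-or-remove";

const DAY_MS = 24 * 60 * 60 * 1000;

// Classifies a release flag by age using the 60/90-day thresholds above.
function reviewStatus(createdAt: Date, now: Date): ReviewStatus {
  const ageDays = Math.floor((now.getTime() - createdAt.getTime()) / DAY_MS);
  if (ageDays > 90) return "justify-or-remove";
  if (ageDays > 60) return "review";
  return "ok";
}
```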

Fail-Safe Defaults

Always specify a default value in code that's correct if the flag system is unreachable:

// Bad — if the SDK fails to fetch, who knows what happens
if (await flags.isEnabled('show_new_ui')) { ... }

// Good — explicit default; safe if flag system is down
if (await flags.getBooleanValue('show_new_ui', false, ctx)) { ... }

// Better — name the default to make intent explicit
const useNewUI = await flags.getBooleanValue('show_new_ui', /* default */ false, ctx);

Pick the safe default — usually the current/conservative behavior. A flag system outage shouldn't change product behavior.
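Some clients do not accept a default argument the way `getBooleanValue` does above. In that case a thin wrapper can supply one. This is only a sketch: `FlagClient` and `safeBoolean` are hypothetical names, and real SDK interfaces will differ.

```typescript
// Hypothetical minimal client interface; real SDKs differ.
interface FlagClient {
  // May throw if the flag system is unreachable or the flag is unknown.
  getBoolean(key: string): boolean;
}

// Returns the flag value, or the caller's fallback if evaluation fails.
function safeBoolean(client: FlagClient, key: string, fallback: boolean): boolean {
  try {
    return client.getBoolean(key);
  } catch {
    return fallback;
  }
}
```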

Evaluation Latency

Every isEnabled call should be a local-cache lookup, not a network call:

| Pattern | Latency | Risk |
|---|---|---|
| Fetch all flags on startup, poll for updates (Unleash default; LaunchDarkly polling mode) | Microseconds | Slightly stale (up to refresh interval) |
| Streaming updates (LaunchDarkly server-side SDK default) | Microseconds | Needs a long-lived Server-Sent Events connection to the platform |
| Per-call API check | 5-50 ms per check | Bad: multiplies across every request |

If your SDK doesn't cache, you're using it wrong. Hot paths with hundreds of flag checks per request can fall apart otherwise.
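To make the fetch-and-poll pattern concrete, here is a minimal sketch of what a caching SDK does internally. `LocalFlagCache`, `startPolling`, and the `fetchAll` callback are hypothetical names for this illustration; in practice the SDK provides this for you.

```typescript
type FlagSnapshot = Record<string, boolean>;

class LocalFlagCache {
  private flags = new Map<string, boolean>();

  // Swap in a whole new snapshot so readers never see a half-updated set.
  applySnapshot(snapshot: FlagSnapshot): void {
    this.flags = new Map(Object.entries(snapshot));
  }

  // Synchronous in-memory lookup; nothing on the request path touches the network.
  isEnabled(key: string, fallback: boolean): boolean {
    return this.flags.get(key) ?? fallback;
  }
}

// Background refresh. fetchAll is a hypothetical function that calls the flag platform.
function startPolling(
  cache: LocalFlagCache,
  fetchAll: () => Promise<FlagSnapshot>,
  intervalMs: number,
): () => void {
  const refresh = async (): Promise<void> => {
    try {
      cache.applySnapshot(await fetchAll());
    } catch {
      // Keep serving the last known snapshot if a refresh fails.
    }
  };
  void refresh();
  const timer = setInterval(refresh, intervalMs);
  return () => clearInterval(timer); // call the returned function to stop polling
}
```

The staleness risk in the table corresponds to `intervalMs`: a flag change takes at most one interval to reach the cache.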

Edge Considerations

For edge functions (Cloudflare Workers, Lambda@Edge), flag SDK choice matters:

  • CDN-cacheable flag bundles (LaunchDarkly Relay Proxy, Unleash Edge) — fetched once per edge POP, served from cache.
  • Lightweight SDKs that don't depend on long-lived processes (the regular Node SDK holds a TCP connection — not great in a short-lived edge runtime).

Server-Side vs Client-Side, Revisited

A subtle pitfall: don't ship the full flag ruleset to the browser. It exposes:

  • All flag names (reveals roadmap).
  • All targeting rules (reveals customer segments).
  • Other users' percentages (information leakage).

Better:

  • Server-side evaluation, then send only the resolved booleans to the client ({newCheckout: true}).
  • Gateway pattern: an internal endpoint your frontend calls that returns {userId, evaluatedFlags} — never raw rules.
  • LaunchDarkly Relay Proxy / Unleash Edge with client SDK keys scoped to a subset of flags.
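The server-side evaluation option can be sketched as follows. The `FlagRule` shape, the `bucket` hash, and `resolveForClient` are assumptions made for illustration, not how any particular platform evaluates rules; the point is only that rules stay on the server and the response carries nothing but booleans.

```typescript
// Hypothetical rule shape, kept server-side only.
interface FlagRule {
  rolloutPercent: number; // 0-100
  allowUserIds?: string[]; // explicit targeting
}

// FNV-1a-style 32-bit hash for deterministic bucketing (not cryptographic).
function bucket(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0) % 100;
}

// Evaluates every rule for one user and returns only the resolved booleans.
function resolveForClient(
  rules: Record<string, FlagRule>,
  userId: string,
): Record<string, boolean> {
  const resolved: Record<string, boolean> = {};
  for (const [key, rule] of Object.entries(rules)) {
    const targeted = rule.allowUserIds?.includes(userId) ?? false;
    resolved[key] = targeted || bucket(`${key}:${userId}`) < rule.rolloutPercent;
  }
  return resolved;
}
```

Hashing `key:userId` rather than `userId` alone keeps a user's bucket independent across flags, so the same users are not always first into every rollout.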

Targeting and Privacy

Targeting context can contain PII (user IDs, emails, geographic regions). Treat it accordingly:

  • Don't log full targeting context — at minimum hash user IDs.
  • Don't send unhashed PII to a SaaS flag platform unless you've reviewed its data-residency terms.
  • Use targetingKey (an opaque ID), not email/name.
  • Audit access: who can change flag rules? Production rule changes should require review.
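The logging advice might look like the sketch below. `redactForLogging` and the context shape are hypothetical. The hash function is injected so the example stays dependency-free; in a real service it would likely be a keyed hash such as HMAC-SHA-256, which is an assumption here rather than something the text prescribes.

```typescript
// Hypothetical context shape: an opaque targetingKey plus arbitrary string attributes.
type TargetingContext = { targetingKey: string } & Record<string, string>;

type Hasher = (input: string) => string;

// Builds a log-safe view: hashed targetingKey plus explicitly allow-listed attributes.
// Everything not allow-listed (email, name, region, ...) is dropped.
function redactForLogging(
  ctx: TargetingContext,
  hash: Hasher,
  allowedKeys: string[] = [],
): Record<string, string> {
  const safe: Record<string, string> = { targetingKey: hash(ctx.targetingKey) };
  for (const key of allowedKeys) {
    if (key === "targetingKey") continue;
    const value = ctx[key];
    if (value !== undefined) safe[key] = value;
  }
  return safe;
}
```

An allow-list is used instead of a block-list so that a newly added attribute is excluded from logs until someone decides it is safe.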

Observability

A flag change is a deploy. Track it like one.

| Signal | Why |
|---|---|
| Flag evaluation rate | Spike = something started checking the flag; valley = something stopped |
| Per-flag % true | Lets you confirm the rollout is at the percentage you set |
| Per-flag latency | If evaluation is slow, hot paths suffer |
| Audit log of flag changes | "Who turned this on at 3am and why?" |
| Change annotations in dashboards | Overlay flag changes on your Grafana dashboards and correlate them with metrics |

Most platforms emit these via webhooks or APIs. Hook them into your existing observability — see Prometheus & Grafana.
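If the platform does not expose the per-flag % true signal directly, it can be approximated in-process. The `FlagMetrics` class below is a hypothetical sketch with in-memory counters only; a real setup would export these counts to the metrics system rather than hold them in a map.

```typescript
// In-memory evaluation counters, keyed by flag.
class FlagMetrics {
  private counts = new Map<string, { total: number; on: number }>();

  // Call once per evaluation with the resolved value.
  record(key: string, value: boolean): void {
    const entry = this.counts.get(key) ?? { total: 0, on: 0 };
    entry.total += 1;
    if (value) entry.on += 1;
    this.counts.set(key, entry);
  }

  // Percentage of evaluations that resolved to true, or undefined if never evaluated.
  percentTrue(key: string): number | undefined {
    const entry = this.counts.get(key);
    if (!entry || entry.total === 0) return undefined;
    return (entry.on / entry.total) * 100;
  }
}
```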

Testing With Flags

Two principles:

  1. Test both branches. Your CI should run the same test suite with the flag both on and off — at least for release flags about to ship.
  2. Don't mock the flag service in unit tests. Use the SDK's bootstrap/stubProvider API to set deterministic values in tests.

// In tests, set known values without hitting the network
// (import path assumes the OpenFeature server SDK)
import { OpenFeature, InMemoryProvider } from '@openfeature/server-sdk';

const provider = new InMemoryProvider({
  'new-checkout': { defaultVariant: 'on', disabled: false, variants: { on: true, off: false } },
});
await OpenFeature.setProviderAndWait(provider);
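For the first principle, a tiny helper can run the same assertions once per flag state. This is a framework-free sketch: `checkoutPath` and `forBothStates` are invented for illustration, and most test runners offer a parameterized-test feature that serves the same purpose.

```typescript
// Hypothetical function under test: picks a checkout path from a resolved flag value.
function checkoutPath(newCheckoutEnabled: boolean): string {
  return newCheckoutEnabled ? "/checkout/v2" : "/checkout";
}

// Runs the same check with the flag on and off, collecting any failures.
function forBothStates(check: (enabled: boolean) => void): string[] {
  const failures: string[] = [];
  for (const enabled of [true, false]) {
    try {
      check(enabled);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failures.push(`flag=${enabled}: ${message}`);
    }
  }
  return failures;
}

// Usage: the same assertion must hold on both branches.
const branchFailures = forBothStates((enabled) => {
  if (!checkoutPath(enabled).startsWith("/checkout")) {
    throw new Error("unexpected checkout path");
  }
});
```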

Coordination With Deploys

A flag is a config change; a deploy is a code change. They interact:

  • New flag + new code: the code expects the flag to exist. Create the flag first, defaulting to off (the safe value), then deploy the code that reads it.
  • Removing a flag: remove the code that reads it first (hard-coding the desired behavior), then delete the flag from the platform. Otherwise the SDK evaluates a flag that no longer exists and falls back to the in-code default, which may not be the behavior you want.
  • Same-PR rule: flag creation/removal in the same PR as the code change is the cleanest pattern, with the flag config done via IaC or scripted API calls.

Anti-Patterns

| Anti-pattern | Why it's bad |
|---|---|
| Permanent release flags | The whole point is to retire them |
| Flag-driven business logic | Flags are for behavior change, not "what plan a customer has"; entitlements need a real model |
| Flags that depend on response time | Network calls inside a flag eval kill latency |
| Nesting flags 3+ deep | Combinatorial explosion of test cases |
| Flags that gate infrastructure changes | Schema migrations, partitions, etc.; flags can't bridge those |
| "We'll clean it up later" | Later never comes. Set a date. |

Checklist

Production feature-flag checklist

  • Platform chosen (self-host or SaaS) with an OpenFeature-compatible SDK
  • Server-side evaluation by default; client only sees resolved booleans
  • Every flag has an owner, type, and expected retirement date
  • Fail-safe defaults set in code for every flag check
  • Local caching SDK in use; no per-call network round-trip
  • Bucketing on a stable targetingKey, not raw PII
  • Audit log of flag changes shipped off-platform
  • Flag changes annotated on observability dashboards
  • Monthly review of release flags; deletion ritual quarterly
  • CI tests run with flags both on and off for soon-to-ship features
  • Documented ops runbook listing kill switches and how to flip them
  • Production rule changes require review (config-as-code or platform approval)
