Best Practices
Production CDN - multi-CDN, security (WAF, bot mitigation), observability, cost control, pitfalls
The CDN sits in the critical path of every external request. A CDN outage is a site outage. These patterns keep that risk small and the bill smaller.
Hash Your Asset Filenames
The single most impactful pattern. Filenames change when content changes:
/assets/main.a4f9e21c.css # immutable; ship a year-long TTL
/assets/main.b3d72ed8.css # next deploy
Set Cache-Control: public, max-age=31536000, immutable. No purging, ever: the URL of the new version is different, so it's a new cache entry.
Webpack, Vite, Next.js, Rails Sprockets: all of them support this, most by default (Webpack needs [contenthash] in the output filename). The only thing you must purge is the HTML that references these assets.
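To make the idea concrete, here is a minimal sketch of content-hashed naming, assuming you are not relying on a bundler. The function name `hashed_name`, the 8-character digest length, and the choice of SHA-256 are illustrative assumptions, not what any particular bundler does internally.

```python
import hashlib
from pathlib import PurePosixPath

# Header value recommended above for hashed assets.
IMMUTABLE_CACHE_CONTROL = "public, max-age=31536000, immutable"


def hashed_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension.

    The same bytes always map to the same name, and any change to the
    bytes yields a new name, so the new version is a new cache entry.
    """
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))
```

Calling `hashed_name("/assets/main.css", css_bytes)` returns a path of the form `/assets/main.<hash>.css`; the HTML that references it is the only thing left to purge.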
Origin Protection
Your origin should not be reachable directly from the internet. Otherwise:
- Attackers bypass the CDN by hitting origin IPs, defeating WAF and DDoS protection.
- A misconfigured DNS record exposes you.
- Cost protection (origin egress fees) is undermined.
Lock it down:
| Strategy | How |
|---|---|
| CDN-specific firewall | Cloudflare Tunnel (cloudflared) — origin has no public IP at all |
| IP allowlist | Origin firewall accepts only the CDN's published IP ranges |
| Shared secret header | Origin requires X-CDN-Secret: ... that only the CDN sends |
| mTLS between CDN and origin | CDN presents a client cert origin trusts |
At least one of these is non-negotiable in production.
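As an illustration, the IP allowlist and shared-secret strategies can be sketched as an origin-side check. This is a sketch under assumptions: the function names are hypothetical, the ranges in the usage note are documentation placeholders rather than any CDN's real ranges, and combining both checks is a defense-in-depth choice, not a requirement stated above.

```python
import hmac
import ipaddress


def is_from_cdn(remote_ip: str, cdn_ranges: list[str]) -> bool:
    """True if the connecting address falls inside one of the CDN's ranges."""
    addr = ipaddress.ip_address(remote_ip)
    # Membership is simply False when address and network versions differ.
    return any(addr in ipaddress.ip_network(r) for r in cdn_ranges)


def has_valid_secret(headers: dict[str, str], expected: str) -> bool:
    """Constant-time comparison of the shared secret header.

    Assumes header names were already normalized to this exact casing.
    """
    supplied = headers.get("X-CDN-Secret", "")
    return hmac.compare_digest(supplied.encode(), expected.encode())


def allow_request(remote_ip: str, headers: dict[str, str],
                  cdn_ranges: list[str], secret: str) -> bool:
    """Accept only requests that pass both checks (assumed policy)."""
    return is_from_cdn(remote_ip, cdn_ranges) and has_valid_secret(headers, secret)
```

For example, `allow_request("203.0.113.7", {"X-CDN-Secret": "s3cret"}, ["203.0.113.0/24"], "s3cret")` is accepted, while the same request from an address outside the listed range is rejected. In practice the ranges would be loaded from the CDN's published list rather than hard-coded.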
WAF and Bot Mitigation
CDNs ship with WAF rule sets. Turn them on:
- OWASP Core Ruleset — SQL injection, XSS, common exploits.
- Provider managed rulesets — Cloudflare Managed Rules, Fastly Next-Gen WAF.
- Rate limiting per IP — defaults are too generous; tune per endpoint.
- Bot management — score requests; challenge or block suspicious ones.
Start in detect-only mode, review the dashboard for a week, then flip to enforce. Real users get caught by overzealous WAF rules; tune before enforcing.
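Per-endpoint rate limiting is normally configured in the CDN itself, but the logic is easy to reason about in code. The following is a minimal sliding-window sketch; the class name, the limits, and the 60-second window are all illustrative assumptions, and real CDN limiters may use different algorithms.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per (ip, path) in the last `window_s` seconds."""

    def __init__(self, limits: dict[str, int], default_limit: int,
                 window_s: float = 60.0):
        self.limits = limits              # per-endpoint overrides
        self.default_limit = default_limit
        self.window_s = window_s
        self._hits = defaultdict(deque)   # (ip, path) -> request timestamps

    def allow(self, ip: str, path: str, now: float) -> bool:
        limit = self.limits.get(path, self.default_limit)
        hits = self._hits[(ip, path)]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= limit:
            return False
        hits.append(now)
        return True
```

With `SlidingWindowLimiter({"/login": 2}, default_limit=100)`, a third `/login` request from the same IP inside one minute is rejected while other paths keep the looser default. Running this in a logging-only mode first mirrors the detect-then-enforce rollout described above.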
DDoS Considerations
All major CDNs absorb L3/L4 DDoS as a baseline service. For L7 (HTTP) DDoS:
- Cache as much as possible — a 200 GB/s attack on cached content is harmless.
- Rate limiting on /login-style endpoints. The expensive ones.
- Bot challenges for routes that shouldn't see bots.
- Anomaly detection — sudden 100× traffic to one path is suspicious.
Test your protections occasionally with a controlled load test from outside your network.
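The anomaly-detection idea (a sudden 100× jump on one path) can be sketched as a comparison between a baseline and the current window. The function name, the `min_requests` floor, and the input shapes are assumptions for illustration; production systems typically use the CDN's own analytics.

```python
def spiking_paths(baseline: dict[str, float], current: dict[str, int],
                  factor: float = 100.0, min_requests: int = 1000) -> list[str]:
    """Return paths whose current request count is >= factor x their baseline.

    baseline: typical requests per window for each path
    current:  requests seen in the latest window
    Paths below min_requests are ignored so tiny routes do not trigger alerts.
    """
    flagged = []
    for path, count in current.items():
        # Treat unseen paths as having a baseline of 1 to avoid dividing by zero.
        expected = max(baseline.get(path, 0.0), 1.0)
        if count >= min_requests and count / expected >= factor:
            flagged.append(path)
    return sorted(flagged)
```

Given a baseline of 20 requests per window on `/search` and a current count of 5000, `/search` is flagged; a path that merely doubles is not.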
Multi-CDN
When one CDN's outage is unacceptable:
| Pattern | Notes |
|---|---|
| DNS-level failover | Active/passive; DNS provider health-checks; slow failover (TTL-bound) |
| DNS-level weighted | Active/active; split traffic; managed via Route 53, NS1, Cloudflare Load Balancer |
| Header-driven | App layer picks CDN per request (cookie/header) — complex |
Multi-CDN doubles cost and operational complexity. Reserve for situations where seconds of downtime cost serious money. For most teams, one CDN with auto-failover (CDN to alternate origin) is enough.
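The DNS-level weighted pattern boils down to "split by weight among the CDNs that pass health checks". Here is a hedged sketch of that decision; the CDN names, weights, and the fail-open behavior when nothing is healthy are assumptions, and a managed DNS load balancer would normally do this for you.

```python
import random


def effective_weights(weights: dict[str, float], healthy: set[str]) -> dict[str, float]:
    """Normalize weights over healthy CDNs; fail open if none are healthy."""
    live = {name: w for name, w in weights.items() if name in healthy and w > 0}
    if not live:
        # Assumed policy: with no healthy target, keep serving from all of them.
        live = {name: w for name, w in weights.items() if w > 0}
    if not live:
        raise ValueError("at least one CDN needs a positive weight")
    total = sum(live.values())
    return {name: w / total for name, w in live.items()}


def pick_cdn(weights: dict[str, float], healthy: set[str],
             rng: random.Random) -> str:
    """Choose a CDN for one request according to the effective weights."""
    shares = effective_weights(weights, healthy)
    return rng.choices(list(shares), weights=list(shares.values()), k=1)[0]
```

With weights `{"cdn_a": 70, "cdn_b": 30}` and only `cdn_b` healthy, every request goes to `cdn_b`; when both recover, traffic returns to the 70/30 split.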
Observability
| Signal | What to watch |
|---|---|
| Hit ratio (per route, overall) | < 90% on public static content = misconfiguration |
| Origin bandwidth | A spike when traffic is flat = cache miss event |
| 5xx rate from origin (visible at CDN) | Origin in trouble; stale-if-error saves you |
| 5xx rate from CDN | The CDN itself is in trouble; consider failover |
| Latency p50/p99 per region | One bad POP hurts users there |
| Purge frequency | Spike = something is wrong with cache strategy |
CDNs export logs and metrics; pipe them into your observability stack. Cloudflare Logpush, Fastly real-time logs, CloudFront access logs all dump to S3 / Splunk / a SaaS — see ELK.
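Once logs land in your own stack, the per-route hit ratio from the table above is a small aggregation. This sketch assumes each log record has been parsed into a dict with `path` and `cache_status` keys and that a hit is reported as `HIT`; real field names and status values differ by provider.

```python
from collections import Counter


def route_of(path: str) -> str:
    """Group URLs by their first path segment, e.g. /assets/a.css -> /assets."""
    first = path.strip("/").split("/", 1)[0]
    return "/" + first


def hit_ratios(records: list[dict]) -> dict[str, float]:
    """Fraction of requests per route that were served from cache."""
    hits, totals = Counter(), Counter()
    for rec in records:
        route = route_of(rec["path"])
        totals[route] += 1
        if rec["cache_status"].upper() == "HIT":
            hits[route] += 1
    return {route: hits[route] / totals[route] for route in totals}


def low_hit_routes(ratios: dict[str, float], threshold: float = 0.90) -> list[str]:
    """Routes below the threshold, worst first."""
    return sorted((r for r, v in ratios.items() if v < threshold),
                  key=lambda r: ratios[r])
```

Feeding the result of `hit_ratios` into `low_hit_routes` gives a short list to alert on; routes that are intentionally uncached (APIs, authenticated pages) would need to be excluded first.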
Cost Control
CDN bills can surprise. Things to watch:
- Bandwidth out — usually the biggest line item. Higher cache hit rate = lower bill.
- Cache misses to expensive regions — Australia and South America egress to origin is pricey.
- Image transformations — per-transformation cost; cache the transforms.
- Edge functions — per-invocation cost; expensive on cache-miss-heavy traffic.
- Free tier limits — Cloudflare's "free" tier has tight limits for paid features (Workers, Cache Reserve).
Best practices:
| Knob | Effect |
|---|---|
| Aggressive s-maxage | Origin bandwidth way down |
| Compress everything (Brotli) | 30%+ bandwidth reduction |
| Image optimization | Often 50-80% size reduction |
| Cache Reserve (Cloudflare) / Origin Shield | Cache miss traffic to origin drops sharply |
| Block bots early | Don't pay to serve them |
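The link between hit rate and the bill is simple arithmetic, sketched below. All prices are hypothetical placeholders, and the model assumes the hit ratio is measured in bytes and that only CDN egress and origin egress are billed; real invoices have more line items.

```python
def monthly_cost(total_gb: float, hit_ratio: float,
                 cdn_price_per_gb: float, origin_price_per_gb: float) -> dict[str, float]:
    """Estimate monthly bandwidth cost for a given byte hit ratio.

    total_gb: data served to visitors by the CDN
    hit_ratio: share of those bytes served from cache (0.0 to 1.0)
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    origin_gb = total_gb * (1.0 - hit_ratio)   # bytes the origin must send
    cdn_cost = total_gb * cdn_price_per_gb
    origin_cost = origin_gb * origin_price_per_gb
    return {
        "origin_gb": origin_gb,
        "cdn_cost": cdn_cost,
        "origin_cost": origin_cost,
        "total": cdn_cost + origin_cost,
    }
```

Under these assumptions, moving from a 90% to a 95% hit ratio halves origin egress (from 10% to 5% of traffic), which is why the cache-oriented knobs in the table tend to pay for themselves.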
Common Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Set-Cookie on a cacheable response | All hits become DYNAMIC | Make response cookie-free or strip at edge |
| Vary: User-Agent | Hit rate near zero | Normalize on origin; vary on a small bucket |
| Caching auth-required URLs | Logged-in users see other users' content | Mark responses private; bypass cache when a session cookie is present |
| Purge-everything on every deploy | Cache constantly cold | Purge specific URLs/tags; hash filenames |
| Long TTL with no purge plan | Stale content for hours | Either purge on update or short TTL + stale-while-revalidate |
| Trusting CDN with secrets in URL | Logged in URLs (/admin/...) hit CDN | Different domain, or Cache-Control: private enforced |
| Stale or hand-copied CDN IPs in WAF allowlist | WAF blocks the CDN itself | Use the CDN's published IP-range JSON; refresh automatically |
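The fix for the Vary: User-Agent row, varying on a small bucket, can be sketched as follows. The bucket names, the substring heuristics, and the `X-Device-Class` header mentioned in the note are hypothetical; the heuristics are deliberately crude and only show the shape of the normalization.

```python
def device_bucket(user_agent: str) -> str:
    """Collapse an arbitrary User-Agent string into one of three buckets.

    Thousands of distinct User-Agent values become three cache variants.
    """
    ua = user_agent.lower()
    # Tablets first: iPad user agents may also contain "Mobile".
    if "ipad" in ua or ("android" in ua and "mobile" not in ua):
        return "tablet"
    if "mobi" in ua or "iphone" in ua:
        return "mobile"
    return "desktop"
```

The edge would set something like `X-Device-Class: mobile` from this value and the origin would respond with `Vary: X-Device-Class`, so the cache holds at most three copies per URL instead of one per browser build.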
Region-Specific Considerations
- China: most Western CDNs perform poorly behind the Great Firewall (GFW); you need a China-specific service (Cloudflare China, Alibaba CDN, ChinaCache), and serving from mainland POPs requires an ICP license.
- Russia: similar but with different sanctions/regulations to navigate.
- South America / Africa: fewer POPs; prefer CDNs with explicit presence there.
- Latency-sensitive APIs: measure real RUM data; the CDN that's fastest globally may not be fastest in your audience's region.
Testing CDN Changes
Three safety nets:
- Preview / staging environment with the same CDN config as production.
- Synthetic checks that hit a few representative URLs from multiple regions and alert on cache misses, slow loads, or content drift.
- Real user monitoring (RUM) — actual visitor latency tracked over time.
A common mistake: change cache rules on Friday, watch the bill on Monday.
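The evaluation step of a synthetic check (cache miss, slow load, content drift) can be sketched as a pure function over an already-fetched response. The function name, the 800 ms budget, and the default `x-cache` header are assumptions; the cache-status header name depends on the provider (for example `CF-Cache-Status` on Cloudflare), so it is a parameter here.

```python
import hashlib
from typing import Optional


def check_response(status: int, headers: dict[str, str], body: bytes,
                   elapsed_ms: float, *,
                   expected_sha256: Optional[str] = None,
                   max_ms: float = 800.0,
                   cache_header: str = "x-cache") -> list[str]:
    """Return a list of problems found in one synthetic probe (empty = healthy)."""
    problems = []
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    if status != 200:
        problems.append(f"unexpected status {status}")
    cache_status = lowered.get(cache_header.lower(), "")
    if "hit" not in cache_status.lower():
        problems.append(f"cache miss ({cache_status or 'no cache header'})")
    if elapsed_ms > max_ms:
        problems.append(f"slow response: {elapsed_ms:.0f} ms")
    if expected_sha256 and hashlib.sha256(body).hexdigest() != expected_sha256:
        problems.append("content drift: body hash changed")
    return problems
```

A scheduler would fetch each representative URL from several regions, pass the result through `check_response`, and alert when the returned list is non-empty. A single miss right after a deploy or purge is expected, so alerting on repeated misses is usually more useful.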
Checklist
Production CDN checklist
- Hashed filenames for all static assets; year-long immutable cache
- HTML cached with short s-maxage + stale-while-revalidate
- Origin not reachable from public internet (tunnel / IP allowlist / shared secret)
- WAF / managed ruleset enabled (start in detect mode, then enforce)
- Rate limiting on /login, /signup, password-reset endpoints
- Image optimization on; appropriate <img srcset> markup
- Surrogate keys / cache tags used for fine-grained purges
- Cookie/query-string normalization rules in place
- Cache hit ratio monitored; alert on sudden drops
- Origin bandwidth alerted on (spike = cache breakage)
- Logs streamed off-CDN to your observability stack
- DDoS protection enabled at L3/L4; L7 rate limits configured
- CDN failover path documented and tested
- Compression (Brotli + gzip fallback) enabled
- Negative caching for 404s on public paths