Patterns
The core API gateway patterns - JWT/OIDC, mTLS, rate limiting strategies, request transforms, BFF, schema enforcement
The handful of gateway patterns you'll reach for repeatedly. Each one is small in isolation; the value is all of these enforced consistently at one point.
Authentication
Three flavors, increasing in flexibility and operational cost:
API Keys
    plugins:
      - name: key-auth
        config:
          key_names: ["X-API-Key"]
          hide_credentials: true

- Simplest. Static per consumer.
- Good for: server-to-server, partner integrations, internal tools.
- Bad for: end users (no rotation per device, hard to scope).
JWT (JSON Web Tokens)
    plugins:
      - name: jwt
        config:
          key_claim_name: iss          # which claim identifies the issuer
          claims_to_verify: ["exp"]    # validate expiration
          uri_param_names: []          # don't accept tokens in URL
          cookie_names: ["session"]    # OK in cookies

The gateway verifies the signature against pre-registered public keys / JWKS URLs, checks exp, and forwards the token (or just the claims as headers) upstream.
- Good for: stateless auth, mobile/web clients, microservices.
- Watch out for: long-lived JWTs are dangerous (no revocation). Use short TTLs + a refresh-token flow.
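To make the two checks concrete (signature, then exp), here is a minimal sketch using only the Python standard library. It assumes an HS256 token with a shared secret purely so the example stays self-contained; gateways more commonly verify RS256/ES256 signatures against keys from a JWKS endpoint, which needs a crypto library. The function names `sign_hs256` and `verify_hs256` are illustrative, not part of any gateway API.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Demo helper: mint a token the way an issuer would (HS256 assumed)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signature = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, now=None) -> dict:
    """Return the claims if the signature and exp check out, else raise ValueError."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        # Never accept "none" or an algorithm the gateway did not expect.
        raise ValueError("unexpected alg")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload_b64))
    now = time.time() if now is None else now
    if "exp" not in claims or claims["exp"] <= now:
        raise ValueError("token expired or missing exp")
    return claims
```

Note that the signature is checked before the claims are parsed or trusted, and that a token without exp is rejected rather than treated as valid forever.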
OAuth 2.0 / OIDC
The gateway acts as an OAuth Resource Server, validating bearer tokens issued by an Identity Provider (Auth0, Keycloak, Okta, Google):
    plugins:
      - name: openid-connect   # Kong Enterprise; Envoy has ext_authz
        config:
          issuer: "https://idp.example.com/realms/main"
          client_id: "api-gateway"
          client_secret: "..."
          auth_methods: ["bearer", "session"]
          scopes_required: ["api:read"]

- Good for: end-user-facing APIs, single-sign-on, federated identity.
- Use introspection (call the IdP) for opaque tokens; JWKS (verify locally) for JWT-format tokens.
mTLS
For service-to-service or partner integrations where TLS client certs make sense:
    plugins:
      - name: mtls-auth
        config:
          ca_certificates: ["<uuid of CA cert>"]
          revocation_check_mode: SKIP

- Strong identity tied to a cert.
- Operational cost: cert distribution and rotation.
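For a sense of what "requiring a client cert" means at the TLS layer, the following sketch builds a server-side context with Python's `ssl` module. It is not how any particular gateway implements mTLS; the function name and the file-path parameters are hypothetical, and the paths are optional only so the sketch runs without real certificates.

```python
import ssl


def build_mtls_server_context(ca_file=None, cert_file=None, key_file=None):
    """Server-side TLS context that refuses clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # CERT_REQUIRED makes the handshake fail unless the client presents a cert
    # that chains to one of the trusted CAs loaded below.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)   # CA(s) allowed to issue client certs
    if cert_file:
        ctx.load_cert_chain(cert_file, key_file)    # the server's own identity
    return ctx
```

The operational cost mentioned above shows up in the two file arguments: every rotation of the CA bundle or the server certificate means redistributing those files and reloading the context.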
Rate Limiting
Beyond "5 per minute," real rate limiting involves several choices:
Dimensions
| Dimension | Example |
|---|---|
| Per consumer (API key, user ID) | alice gets 1000/min |
| Per IP | Mitigate scrapers from one source |
| Global | Protect a downstream that can't scale |
| Per endpoint | /login 5/min; /users 100/min |
| Tiered | Free 100/day, Pro 10000/day |
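Each dimension in the table amounts to a different counter key. The sketch below shows one way to derive those keys from a request; the `Request` fields, the key prefixes, and the choice to scope per-endpoint limits by IP are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    consumer: str   # API key owner or user ID
    ip: str
    path: str
    tier: str       # e.g. "free" or "pro"


# Hypothetical tier limits mirroring the table above (requests per day).
TIER_LIMITS_PER_DAY = {"free": 100, "pro": 10000}


def limit_keys(req: Request) -> dict:
    """Return one counter key per rate-limiting dimension."""
    return {
        "consumer": f"rl:consumer:{req.consumer}",
        "ip": f"rl:ip:{req.ip}",
        "global": "rl:global",
        # Assumption: per-endpoint limits are tracked per client IP.
        "endpoint": f"rl:endpoint:{req.path}:{req.ip}",
        "tier": f"rl:tier:{req.tier}:{req.consumer}",
    }
```

A request is allowed only if every counter it touches is under its limit, so dimensions compose by checking several keys per request.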
Algorithms
| Algorithm | Behavior |
|---|---|
| Fixed window | Bucket per N seconds; spiky at window boundaries |
| Sliding window | Smoothed; more accurate |
| Token bucket | Allows bursts up to a budget; refills steadily |
| Leaky bucket | Smooth output rate; queues bursts |
For user-facing APIs, use a token bucket with a sensible burst allowance. To protect a downstream, use a sliding window.
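The difference between the two recommended algorithms is easier to see in code. This is a single-process sketch with an injectable clock so the behavior can be tested deterministically; the class names are illustrative, and the sliding window is implemented as a timestamp log, which is the simplest accurate variant rather than the most memory-efficient.

```python
import time
from collections import deque


class TokenBucket:
    """Allows bursts up to `burst` tokens; refills at `rate_per_sec`."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.clock = clock
        self.tokens = float(burst)
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill for the elapsed time, capped at the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class SlidingWindowLog:
    """Allows at most `limit` requests in any trailing `window_sec` interval."""

    def __init__(self, limit, window_sec, clock=time.monotonic):
        self.limit = limit
        self.window = window_sec
        self.clock = clock
        self.hits = deque()

    def allow(self):
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.hits and self.hits[0] <= now - self.window:
            self.hits.popleft()
        if len(self.hits) < self.limit:
            self.hits.append(now)
            return True
        return False
```

The token bucket lets a quiet client spend its saved-up budget at once, which suits interactive traffic; the sliding window never admits more than the limit in any interval, which is what a fragile downstream needs.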
Distributed Rate Limiting
A single gateway instance is easy. Multiple gateway nodes need shared state, such as Redis (Kong's redis policy) or another distributed datastore. Check that the latency from gateway → Redis isn't worse than the calls being limited.
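The shared-state idea can be sketched as a fixed-window counter keyed by consumer and window number, where every gateway node increments the same key. The `InMemoryStore` class below is a hypothetical stand-in for a shared store such as Redis (its `incr_with_ttl` method mimics an atomic increment followed by an expiry); it is not a Redis client and is not safe across processes.

```python
import time


class InMemoryStore:
    """Stand-in for a shared store; a real deployment would use e.g. Redis."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.data = {}  # key -> (count, expires_at)

    def incr_with_ttl(self, key, ttl_sec):
        now = self.clock()
        count, expires_at = self.data.get(key, (0, 0.0))
        if expires_at <= now:
            # Key is new or expired: start a fresh counter.
            count, expires_at = 0, now + ttl_sec
        count += 1
        self.data[key] = (count, expires_at)
        return count


def allow(store, consumer, limit, window_sec):
    """Fixed-window check that any gateway node can run against the same store."""
    window_id = int(store.clock() // window_sec)
    key = f"rl:{consumer}:{window_id}"
    return store.incr_with_ttl(key, window_sec) <= limit
```

Because the counter lives in the store, it does not matter which node handles a given request. The price is one store round trip per request, which is why the gateway-to-store latency matters.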
Communicate Limits
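Always return the rate-limit headers (the RateLimit-* fields from the IETF draft draft-ietf-httpapi-ratelimit-headers, plus Retry-After from RFC 9110):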
    RateLimit-Limit: 1000
    RateLimit-Remaining: 783
    RateLimit-Reset: 38
    Retry-After: 38          # on 429s

Clients that respect them won't pound on you.
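As a small illustration of how these headers relate to the limiter's state, the sketch below derives the status code and headers from a limit, the number of requests used, and the seconds until reset. The function name and its parameters are assumptions, not a gateway API.

```python
def rate_limit_response(limit, used, reset_sec):
    """Return (status_code, headers) for a request counted against a limit."""
    headers = {
        "RateLimit-Limit": str(limit),
        "RateLimit-Remaining": str(max(limit - used, 0)),
        "RateLimit-Reset": str(reset_sec),
    }
    if used > limit:
        # Over the limit: tell the client exactly how long to back off.
        headers["Retry-After"] = str(reset_sec)
        return 429, headers
    return 200, headers
```

For example, `rate_limit_response(1000, 217, 38)` yields a 200 with the same three RateLimit values shown above.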
Request / Response Transforms
The gateway can rewrite headers, paths, query params:
    plugins:
      - name: request-transformer
        config:
          add:
            headers: ["X-Internal-Version:v2", "X-Forwarded-Tier:$(tier)"]
          remove:
            headers: ["Authorization"]       # don't leak the bearer to backend
          replace:
            uri: "/v2/$(uri_captures.rest)"  # rewrite /api/* → /v2/*

Sensible uses:
- Strip the Authorization header after auth and pass identity as X-User-ID: 42 instead.
- Inject an X-Request-Id for correlation.
- Rewrite legacy paths to new backend paths without breaking external URLs.
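The three sensible uses are all stateless functions of the incoming request, which is a good test of whether a transform belongs at the gateway. A minimal sketch, assuming the gateway has already authenticated the caller and knows the user ID; the function name and the `/api/` to `/v2/` mapping mirror the config above but are otherwise illustrative:

```python
import re
import uuid

LEGACY_PATH = re.compile(r"^/api/(?P<rest>.*)$")


def transform_request(path, headers, user_id):
    """Return (new_path, new_headers) to send upstream."""
    # Drop the bearer token so it never reaches the backend.
    out = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    # Pass the authenticated identity as a plain header instead.
    out["X-User-ID"] = str(user_id)
    # Keep a caller-supplied request ID; otherwise mint one for correlation.
    if not any(k.lower() == "x-request-id" for k in out):
        out["X-Request-Id"] = str(uuid.uuid4())
    match = LEGACY_PATH.match(path)
    new_path = f"/v2/{match.group('rest')}" if match else path
    return new_path, out
```

Nothing here looks at a database, another service, or a previous request, which is the property that keeps transforms cheap to run at the edge.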
Uses that go too far:
- Translating REST ↔ GraphQL.
- Aggregating data from multiple backends.
- Anything stateful.
When you need that, build a BFF (Backend-for-Frontend) service behind the gateway. Don't bake business logic into the gateway.
BFF (Backend for Frontend)
A small service per client type (web, iOS, Android, partner-API) that aggregates / shapes data from underlying services into what that specific client needs.
    [ Web ]     ───► [ web-bff ] ─┐
    [ iOS ]     ───► [ ios-bff ] ─┼─► users-svc, orders-svc, inventory-svc
    [ Android ] ───► [ and-bff ] ─┘

- BFFs go behind the gateway, not in front.
- The gateway terminates external auth; the BFF speaks internal protocols (mTLS, gRPC, ...).
- Each BFF can evolve at its client's pace without coordinating with the others.
The pattern is most valuable when clients diverge a lot (mobile bandwidth-sensitive, web feature-rich, partner contract-stable). For one web client and one mobile client that show similar data, a single shared API often beats two BFFs.
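A rough sketch of what "shaping data per client" looks like: two BFF handlers call the same underlying services and return different shapes. The service stubs and their fields are hypothetical stand-ins for users-svc and orders-svc.

```python
# Hypothetical stubs standing in for calls to users-svc and orders-svc.
def get_user(user_id):
    return {"id": user_id, "name": "Alice", "email": "alice@example.com",
            "avatar_url": "https://cdn.example.com/alice.png"}


def get_orders(user_id):
    return [
        {"id": 1, "total_cents": 1999, "items": ["book", "pen"], "status": "shipped"},
        {"id": 2, "total_cents": 4500, "items": ["lamp"], "status": "pending"},
    ]


def web_bff_profile(user_id):
    """Feature-rich shape for the web client: full user plus full orders."""
    return {"user": get_user(user_id), "orders": get_orders(user_id)}


def ios_bff_profile(user_id):
    """Bandwidth-sensitive shape for mobile: only what the screen shows."""
    user = get_user(user_id)
    orders = get_orders(user_id)
    return {
        "name": user["name"],
        "order_count": len(orders),
        "pending": [o["id"] for o in orders if o["status"] == "pending"],
    }
```

If the two handlers would end up returning nearly the same thing, that is the signal that a single shared API is the better choice.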
Schema Enforcement
Reject invalid requests at the edge before they hit your code:
    plugins:
      - name: request-validator
        config:
          version: kong
          body_schema: |
            [
              { "name": "email", "type": "string", "required": true, "format": "email" },
              { "name": "age", "type": "integer", "required": false }
            ]

For full OpenAPI-driven validation, point the plugin at a spec:
    plugins:
      - name: oas-validation
        config:
          api_spec: <openapi.yaml content>
          validate_request_body: true
          validate_request_headers: true
          validate_response: false

Benefits:
- Invalid input never reaches the backend (smaller blast radius).
- The spec becomes executable, not aspirational.
- Generated client SDKs can trust the contract.
Cost:
- Schema drift becomes a deploy blocker.
- Schema-evolution discipline now belongs to the platform team.
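To show what edge validation does with a schema like the email/age example, here is a minimal validator over the same field list. It is a sketch of the idea, not the plugin's implementation; the email regex is deliberately loose and is an assumption.

```python
import re

BODY_SCHEMA = [
    {"name": "email", "type": "string", "required": True, "format": "email"},
    {"name": "age", "type": "integer", "required": False},
]

TYPES = {"string": str, "integer": int}
# Loose check for illustration only: something@something.something
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_body(body, schema=BODY_SCHEMA):
    """Return a list of error strings; an empty list means the body is valid."""
    errors = []
    for field in schema:
        name = field["name"]
        if name not in body:
            if field.get("required"):
                errors.append(f"{name}: required")
            continue
        value = body[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not isinstance(value, TYPES[field["type"]]) or isinstance(value, bool):
            errors.append(f"{name}: expected {field['type']}")
            continue
        if field.get("format") == "email" and not EMAIL_RE.match(value):
            errors.append(f"{name}: invalid email")
    return errors
```

A gateway would answer with a 400 and the error list whenever the result is non-empty, so the backend only ever sees bodies that match the contract.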
Circuit Breaking and Health-Aware Routing
    plugins:
      - name: proxy-cache
        config:
          strategy: memory
          content_type: ["application/json"]
          cache_ttl: 30
      # Illustrative name; in Kong itself, passive health checks are configured
      # on the upstream (healthchecks.passive) rather than as a plugin.
      - name: passive-health-check   # tracks upstream health automatically
        config: { healthy: { successes: 3 }, unhealthy: { http_failures: 5 } }

When a backend is sick, the gateway routes around it, or serves a stale cache instead of returning 502s. Tune the failure thresholds carefully; aggressive circuit breaking can amplify outages.
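The thresholds in the config above (3 successes to recover, 5 failures to eject) can be read as a small state machine per upstream target. The sketch below assumes consecutive counting, where a success resets the failure streak and vice versa; real gateways may count differently, and the class and function names are hypothetical.

```python
class PassiveHealth:
    """Tracks one upstream target's health from observed response outcomes."""

    def __init__(self, healthy_successes=3, unhealthy_failures=5):
        self.healthy_successes = healthy_successes
        self.unhealthy_failures = unhealthy_failures
        self.healthy = True
        self.successes = 0
        self.failures = 0

    def record(self, ok):
        if ok:
            self.failures = 0
            self.successes += 1
            if not self.healthy and self.successes >= self.healthy_successes:
                self.healthy = True
        else:
            self.successes = 0
            self.failures += 1
            if self.healthy and self.failures >= self.unhealthy_failures:
                self.healthy = False


def pick_target(targets):
    """Return the first healthy target name, or None if every target is down."""
    for name, health in targets.items():
        if health.healthy:
            return name
    return None
```

When `pick_target` returns None, the gateway has to choose between failing fast and serving a cached response. Lowering `unhealthy_failures` ejects targets sooner, but it also makes it easier for a brief blip to take every target out of rotation at once.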
CORS
    plugins:
      - name: cors
        config:
          origins: ["https://app.example.com", "https://admin.example.com"]
          methods: ["GET", "POST", "PUT", "DELETE", "OPTIONS"]
          headers: ["Authorization", "Content-Type"]
          exposed_headers: ["X-Request-Id"]
          credentials: true
          max_age: 3600

Putting CORS at the gateway means backends don't sprinkle it across every handler. Set the right origins (* is rarely correct).
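The origin allowlist works by echoing back the request's Origin only when it matches. A minimal sketch of that decision, using the two origins from the config above; the function name is illustrative and preflight handling is left out.

```python
ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}


def cors_headers(origin, allowed=ALLOWED_ORIGINS):
    """Return CORS response headers for a request, or {} if the origin is not allowed."""
    if origin not in allowed:
        # No CORS headers: the browser will block the cross-origin read.
        return {}
    return {
        # Echo the specific origin; browsers reject "*" on credentialed requests.
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Expose-Headers": "X-Request-Id",
        # The response differs by Origin, so caches must key on it.
        "Vary": "Origin",
    }
```

The `Vary: Origin` header matters once a cache sits in front of the gateway: without it, a response cached for one origin could be served to another.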
What's Next
Patterns are the building blocks. Best Practices covers the operational side — HA, versioning, observability, what to monitor.