Best Practices
Production object storage - security, encryption, cost, observability, naming, pitfalls, IAM
Object storage is "boring infrastructure" until you wake up to a public-bucket leak, a five-figure egress bill, or a deleted bucket. These habits keep it boring.
Security: Don't Make a Public Bucket Accident
One of the most common cloud security incidents is a bucket left public. Defenses:
| Mechanism | What it does |
|---|---|
| Block Public Access (BPA) at account level | Rejects new public ACLs and policies, and overrides existing ones, for every bucket in the account |
| BPA at bucket level | Same, per-bucket |
| Resource policies (bucket policy) | Explicit allow/deny; deny wins |
| Object Ownership: Bucket Owner Enforced | Disables ACLs entirely (the cause of historical leaks) |
| Periodic scanning | AWS Macie / open-source s3-scanner / your own audit |
For new accounts: turn on BPA at the account level, set Object Ownership to "Bucket Owner Enforced" on every bucket, and never grant Allow to a wildcard ("*") Principal in bucket policies. Static-site hosting goes through a CDN with origin auth; the bucket itself stays private.
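
The wildcard-principal rule above can be checked mechanically. The following is a minimal sketch, assuming the bucket policy has already been loaded as a Python dict; the function names are hypothetical. It looks only at `Effect` and `Principal` and ignores conditions, so hits are candidates for review rather than confirmed leaks.

```python
from typing import Any


def _is_wildcard_principal(principal: Any) -> bool:
    """Return True if the Principal element matches everyone."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        # {"AWS": "*"} or {"AWS": ["*", ...]} is also a public grant.
        for value in principal.values():
            values = value if isinstance(value, list) else [value]
            if "*" in values:
                return True
    return False


def find_public_allow_statements(policy: dict) -> list[dict]:
    """List Allow statements whose Principal is a wildcard."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement object is valid
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow"
        and _is_wildcard_principal(s.get("Principal"))
    ]
```

A Deny statement with `"Principal": "*"` (such as a TLS-only rule) is intentionally not flagged, since it restricts access rather than granting it.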
IAM: Least Privilege
Give each role only the bucket, prefix, and actions it needs, not "S3 full access":

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:GetObject", "s3:PutObject"],
          "Resource": "arn:aws:s3:::user-uploads/*"
        },
        {
          "Effect": "Allow",
          "Action": "s3:ListBucket",
          "Resource": "arn:aws:s3:::user-uploads",
          "Condition": {
            "StringLike": { "s3:prefix": "uploads/${aws:userid}/*" }
          }
        }
      ]
    }

| Permission | What it allows / who gets it |
|---|---|
| `s3:ListBucket` | Lists object keys and metadata in the bucket; surprisingly powerful |
| `s3:GetObject` | Read objects |
| `s3:PutObject` | Write objects |
| `s3:DeleteObject` | Delete objects (consider a Deny without MFA in production) |
| `s3:PutBucketPolicy` | Change the bucket policy; admins only |
| `s3:DeleteBucket` | Drop the entire bucket; admins only, MFA required |
For CI/CD and runtime workloads, use OIDC to assume short-lived roles — never long-lived access keys (see Secrets Management).
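
As a sketch of how a per-role policy like the one in this section might be generated rather than hand-written, the hypothetical helper below scopes both the object actions and the listing condition to a single prefix. The bucket and prefix values are illustrative assumptions.

```python
def build_upload_policy(bucket: str, prefix: str) -> dict:
    """Build an IAM policy limited to one prefix of one bucket."""
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object reads and writes only under the given prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # ListBucket is a bucket-level action, so the prefix
                # restriction has to live in a condition.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}/*"}},
            },
        ],
    }
```

Calling `build_upload_policy("user-uploads", "uploads/team-a")` yields a document that can be serialized with `json.dumps` and attached to a role.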
Encryption
At Rest
Always on; the question is which key:
| Option | Notes |
|---|---|
| SSE-S3 (AES256) | AWS-managed key; free; default for many providers |
| SSE-KMS (aws:kms) | Customer-managed KMS key; auditable, rotatable; small cost |
| SSE-C | You supply the key per request; awkward; rarely worth it |
| Client-side encryption | You encrypt before upload; storage sees only ciphertext; max control, max complexity |
For most workloads: SSE-KMS with a customer-managed key. Auditable via CloudTrail; key access can be revoked.
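
A minimal sketch of a default-encryption configuration for SSE-KMS, built as a plain dict in the shape the S3 API expects. The helper name and the key ARN passed to it are assumptions; the boto3 call is shown only as a comment because it needs credentials and an installed SDK.

```python
def default_kms_encryption(kms_key_arn: str) -> dict:
    """Build a default-encryption configuration that uses SSE-KMS."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # S3 Bucket Keys reduce the number of KMS requests.
                "BucketKeyEnabled": True,
            }
        ]
    }


# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("s3").put_bucket_encryption(
#       Bucket="example-prod-uploads-us-east-1",
#       ServerSideEncryptionConfiguration=default_kms_encryption(key_arn),
#   )
```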
In Transit
Always HTTPS. Bucket policy can enforce it:

    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": ["arn:aws:s3:::mybucket", "arn:aws:s3:::mybucket/*"],
      "Condition": { "Bool": { "aws:SecureTransport": "false" } }
    }

Bucket Naming
A naming convention you don't have to think about:

    <org>-<env>-<purpose>-<region>
    example-prod-uploads-us-east-1
    example-staging-logs-eu-west-1
    example-prod-backups-ap-southeast-2

- Org prefix because S3 bucket names are globally unique.
- Env in name prevents staging/prod confusion at a glance.
- Purpose so logs aren't mixed with user data.
- Region suffix so multi-region setups are obvious.
Avoid dots in bucket names — they cause TLS issues with virtual-hosted URLs. Lowercase, hyphens only.
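
One way to keep the convention from drifting is to build names through a single helper. This is a sketch under the rules stated above (lowercase, hyphens only, no dots) plus the S3 length limit of 3 to 63 characters; the function name is hypothetical.

```python
import re

# 3-63 characters; lowercase letters, digits and hyphens; must start and
# end with a letter or digit. Dots are deliberately not allowed.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def bucket_name(org: str, env: str, purpose: str, region: str) -> str:
    """Join the four parts as <org>-<env>-<purpose>-<region> and validate."""
    name = "-".join([org, env, purpose, region]).lower()
    if not _BUCKET_RE.fullmatch(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    return name
```

For example, `bucket_name("example", "prod", "uploads", "us-east-1")` returns `example-prod-uploads-us-east-1`, while an org of `my.org` raises `ValueError`.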
Object Key Design
Keys aren't directories, but they look like them. Two principles:
Don't put hot prefixes at the front
Older S3 partitioned by key prefix, so sequential prefixes created hot spots. Since 2018 S3 scales request rates per prefix automatically and this matters less, but on R2 and other providers hot prefixes can still matter. Pattern:

    Bad:  2025-05-21/user-42/event-...   (everyone writes to today's prefix)
    OK:   user-42/2025-05-21/event-...   (writes spread across users)
    Best: <hash-prefix>/user-42/...      (truly even distribution)

For uploads from many users, putting user-id first naturally spreads load.
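
The "Best" layout can be sketched as follows, assuming the hash prefix is derived from the user id with SHA-256. The function name and the four-character prefix length are illustrative choices, not a provider requirement.

```python
import hashlib


def hashed_key(user_id: str, suffix: str, prefix_len: int = 4) -> str:
    """Prefix the key with a short hash of the user id to spread writes."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{user_id}/{suffix}"
```

Because the prefix depends only on the user id, all of one user's objects still share a prefix and can be listed together. The trade-off is that listing across users by date is no longer a single prefix scan.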
Make keys deterministic where it helps
For idempotent uploads, derive the key from content (sha256:abc...) — same input always produces the same key, accidental re-uploads are no-ops. For per-user uploads, prefix with user-id so per-user IAM policies are easy.
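
A content-derived key can be as small as the sketch below. The `blobs` namespace and the `sha256/` path segment are assumptions made for illustration.

```python
import hashlib


def content_key(data: bytes, namespace: str = "blobs") -> str:
    """Derive an object key from the SHA-256 of the content."""
    digest = hashlib.sha256(data).hexdigest()
    return f"{namespace}/sha256/{digest}"
```

Uploading the same bytes twice targets the same key, so a retry overwrites an identical object instead of creating a duplicate.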
Cost Control
Object storage looks cheap but bills surprise. Watch:
| Line item | What it means |
|---|---|
| Storage (per GB-month) | The big bucket of bytes |
| Requests (per 1000 reads/writes) | Many small files cost more than fewer big ones |
| Egress (per GB) | Often the biggest line; near-zero on R2, hefty on S3 |
| Inter-region transfer | Replication, cross-region access |
| Storage class transitions | Moving between tiers costs per object |
| Restore from Glacier | One-time retrieval fees plus storage |
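
To see how the first three line items combine, here is a rough estimator. The `Prices` fields are placeholders to be filled from your provider's current price list; no real rates are implied, and tiered pricing, transitions, and retrieval fees are left out.

```python
from dataclasses import dataclass


@dataclass
class Prices:
    """Placeholder unit prices; replace with your provider's rates."""
    storage_per_gb_month: float
    per_1000_requests: float
    egress_per_gb: float


def monthly_cost(stored_gb: float, requests: int, egress_gb: float,
                 p: Prices) -> float:
    """Sum storage, request, and egress charges for one month."""
    return (
        stored_gb * p.storage_per_gb_month
        + (requests / 1000) * p.per_1000_requests
        + egress_gb * p.egress_per_gb
    )
```

Running it with the same stored volume but a thousand times more requests shows why many small files cost more than a few large ones.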
Optimization checklist
- Lifecycle to IA/Glacier for cold data.
- Compress before upload if applicable (gzip text, webp images).
- CloudFront / CDN in front to avoid egress on cache hits.
- Use R2 if you have heavy distribution traffic.
- Abort incomplete multipart uploads (lifecycle rule).
- Delete old versions after a retention window.
- Right-size storage class: `STANDARD` for everything is rarely optimal.
For AWS, Storage Lens and Cost Explorer show the breakdown. R2 has its dashboard. Audit monthly.
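
Several checklist items (tier transitions, old-version cleanup, aborting incomplete multipart uploads) fit in one lifecycle rule. The sketch below builds the configuration as a dict in the S3 API shape; the rule ID and day counts are illustrative assumptions, and the boto3 call is shown only as a comment.

```python
def lifecycle_rules(ia_days: int = 30, glacier_days: int = 90,
                    noncurrent_days: int = 30, abort_days: int = 7) -> dict:
    """Build a lifecycle configuration for tiering and cleanup."""
    return {
        "Rules": [
            {
                "ID": "tiering-and-cleanup",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix = whole bucket
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                # Drop superseded versions after the retention window.
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": noncurrent_days
                },
                # Reclaim space held by uploads that never completed.
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": abort_days
                },
            }
        ]
    }


# With boto3 installed and credentials configured:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-prod-uploads-us-east-1",
#       LifecycleConfiguration=lifecycle_rules(),
#   )
```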
Observability
| Signal | Why |
|---|---|
| Request error rate (5xx) | Provider issue or a misconfigured policy |
| 4xx rate | Often misconfigured client; could be an attack |
| Latency | TTFB on GetObject; >100 ms suggests cold tier or distant region |
| Bytes downloaded / uploaded | Cost forecasting |
| Replication lag | If you've enabled cross-region replication |
| Lifecycle transitions | Confirms rules are running |
| Object count per prefix | Quick capacity check |
Most providers ship metrics to their monitoring service (CloudWatch, Cloudflare Analytics). Pipe interesting ones into Prometheus & Grafana for cross-cutting dashboards.
Server access logging (S3 server access logs, R2's analytics) goes to another bucket. Useful for security audits and debugging.
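
As a small illustration of the 4xx and 5xx signals, the sketch below computes per-class rates from a batch of HTTP status codes. It assumes the codes have already been parsed out of access logs or metrics; the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable


def status_rates(status_codes: Iterable[int]) -> dict[str, float]:
    """Return the share of requests in each status class (2xx, 4xx, ...)."""
    counts = Counter(f"{code // 100}xx" for code in status_codes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}
```

An alert could then compare `status_rates(codes).get("5xx", 0.0)` against a threshold chosen for the workload.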
Common Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Public bucket from a too-broad policy | Data leak in the news | Block Public Access; periodic scanning |
| Streaming uploads through your app | App OOMs under load | Presigned URLs |
| No lifecycle rules | Bill grows linearly forever | Set tier transitions + version cleanup |
| Many tiny files | Disproportionate request costs | Bundle small files; or use a real DB |
| `ListObjectsV2` without pagination | Truncated results | Use the paginator |
| Setting `Content-Type` based on extension only | `text/html` for `.html.txt` etc. | Sniff the content, or trust the client's value only after validation |
| Public bucket as CDN origin | Slow + public exposure | Private bucket + CDN with OAC |
| No backup beyond versioning | Account compromise = total loss | Cross-account / cross-provider mirror |
| Hardcoded long-lived access keys | One leak = full access forever | OIDC + short-lived credentials |
| Mixing storage providers without abstraction | Lock-in pain | Use S3-compatible SDK; one client across providers |
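
The pagination pitfall comes down to following continuation tokens until the response is no longer truncated. Here is a sketch of that loop, written against any callable that behaves like an S3 client's `list_objects_v2`; the `iter_keys` name is hypothetical. boto3 also ships a built-in paginator via `get_paginator("list_objects_v2")`, which is usually the simpler choice.

```python
from typing import Callable, Iterator


def iter_keys(list_page: Callable[..., dict], bucket: str,
              prefix: str = "") -> Iterator[str]:
    """Yield every key under a prefix, following continuation tokens."""
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = list_page(**kwargs)
        for obj in page.get("Contents", []):
            yield obj["Key"]
        if not page.get("IsTruncated"):
            break
        # Ask for the next page, starting where this one stopped.
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
```

With boto3 this could be called as `iter_keys(boto3.client("s3").list_objects_v2, "example-prod-uploads-us-east-1")`.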
Compliance and Object Lock
For regulated data (PCI, HIPAA, financial records):
- Object Lock in Compliance mode — objects can't be deleted before retention period, even by root. Use for audit trails and financial records.
- Object Lock in Governance mode — admins with the right permission can delete; default for "important but not legally locked."
- Bucket policies that deny delete without MFA.
- Audit logs (CloudTrail / equivalent) shipped off-account.
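
Two of the controls above can be expressed as configuration. The sketch below builds a default Object Lock retention rule and a bucket-policy statement that denies deletes when MFA is absent. Helper names and the `Sid` are assumptions. Note that the MFA statement also blocks automated principals that cannot present MFA, so it suits buckets where deletes are rare and manual.

```python
def object_lock_config(mode: str, days: int) -> dict:
    """Build a default-retention Object Lock configuration."""
    if mode not in ("COMPLIANCE", "GOVERNANCE"):
        raise ValueError(f"unknown Object Lock mode: {mode!r}")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Days": days}},
    }


def deny_delete_without_mfa(bucket: str) -> dict:
    """Build a bucket-policy statement denying deletes without MFA."""
    return {
        "Sid": "DenyDeleteWithoutMFA",
        "Effect": "Deny",
        "Principal": "*",
        "Action": ["s3:DeleteObject", "s3:DeleteObjectVersion"],
        "Resource": f"arn:aws:s3:::{bucket}/*",
        # IfExists also denies requests that carry no MFA context at all.
        "Condition": {
            "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
        },
    }


# With boto3 installed and credentials configured:
#   boto3.client("s3").put_object_lock_configuration(
#       Bucket="example-prod-backups-ap-southeast-2",
#       ObjectLockConfiguration=object_lock_config("COMPLIANCE", 365),
#   )
```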
Multi-Provider Strategy
For redundancy across providers:
- Primary bucket on R2 for cheap egress.
- Replicated to S3 for AWS-side integrations.
- Cold-copy to Backblaze quarterly for DR / cost.
Tools: rclone for ad-hoc sync, Cyberduck / s3-mirror / various pipelines for production. Storage cost roughly doubles, which is the price of some peace of mind.
Checklist
Production object storage checklist
- Block Public Access enabled at account + bucket level
- Object Ownership set to "Bucket Owner Enforced" (no ACLs)
- Bucket policies deny non-TLS access
- Encryption at rest with customer-managed KMS key
- OIDC short-lived credentials in CI/CD and workloads
- Lifecycle rules: incomplete multipart abort, old version cleanup, tier transitions
- Versioning on for buckets holding irreplaceable data
- Object Lock for compliance-critical data
- Cross-region or cross-account replication for DR-critical data
- Server access logs to a separate bucket; shipped to SIEM
- Periodic public-bucket scan (Macie / s3-scanner / etc)
- Per-team IAM roles; no `s3:*` on `Resource: *`
- Naming convention: `<org>-<env>-<purpose>-<region>`
- Direct browser uploads use presigned URLs/POST policies
- CDN in front of buckets serving public traffic
- Cost monitoring; alert on storage / egress anomalies
- Documented disaster-recovery procedure with tested restore