Time-Series Databases
InfluxDB, TimescaleDB, VictoriaMetrics, QuestDB - purpose-built storage for timestamps, metrics, sensors, and event streams
A time-series database (TSDB) is built around one assumption: most of your data is timestamped, append-mostly, and rarely updated. That assumption changes everything — storage format, compression, query language, retention. The result is a database that holds billions of points per server cheaply, and answers "how did this metric look over the last 24 hours" in milliseconds.
You probably already use TSDBs without thinking about it. Prometheus is one, and hosted services such as CloudWatch and Datadog store their metrics in one. But there's a category of general-purpose TSDBs you reach for when Prometheus's scope (operational metrics, short retention) isn't enough: IoT data, business analytics, financial ticks, application traces in raw form.
Why a TSDB and Not Postgres
You can store timestamps in Postgres. At small scale it's fine. At large scale, vanilla Postgres starts to struggle:
| Concern | Postgres (vanilla) | Purpose-built TSDB |
|---|---|---|
| Insert rate | ~10k/s single-row inserts on a big box | 100k–1M+/s per node |
| Compression | OK with TOAST | 10-50× via columnar + delta + Gorilla |
| Retention / downsampling | Manual scripts | First-class policies |
| Time-range queries | OK with btree, but the index gets big | Designed for it; milliseconds on large ranges |
| Aggregations over windows | Slow without tuning | Optimized hot path |
| Cardinality (many series) | Indexes blow up | Handled (with limits) |
The breaking point usually comes when you have millions of series × billions of points and queries that scan large time ranges. At that scale, a TSDB can be 10-100× cheaper and faster than Postgres on the same hardware.
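Much of the time-range advantage in the table above comes from partitioning data by time, so a query only opens the partitions that overlap its range. The sketch below is a toy illustration, not any particular database's storage engine; the `ChunkedSeries` class, the one-day `CHUNK_SECONDS` width, and the assumption that points arrive in timestamp order are all hypothetical choices made for the example.

```python
from bisect import bisect_left
from collections import defaultdict

CHUNK_SECONDS = 86_400  # hypothetical chunk width: one day


class ChunkedSeries:
    """Toy time-partitioned store: points are grouped into fixed-width time chunks."""

    def __init__(self):
        # chunk start timestamp -> list of (timestamp, value), sorted by timestamp
        self.chunks = defaultdict(list)

    def append(self, ts, value):
        # Assumes points arrive in timestamp order, so each chunk stays sorted.
        chunk_start = ts - ts % CHUNK_SECONDS
        self.chunks[chunk_start].append((ts, value))

    def range_query(self, start, end):
        """Return (points with start <= ts < end, number of chunks touched)."""
        out, touched = [], 0
        first_chunk = start - start % CHUNK_SECONDS
        for chunk_start in range(first_chunk, end, CHUNK_SECONDS):
            points = self.chunks.get(chunk_start)
            if not points:
                continue
            touched += 1
            # Binary search inside the chunk; (start,) sorts before (start, value).
            lo = bisect_left(points, (start,))
            hi = bisect_left(points, (end,))
            out.extend(points[lo:hi])
        return out, touched


# Ten days of hourly points, then a two-hour query inside day 5.
store = ChunkedSeries()
for hour in range(240):
    store.append(hour * 3600, float(hour))

points, touched = store.range_query(5 * 86_400, 5 * 86_400 + 7200)
print(len(points), touched)  # 2 1
```

The query reads one chunk out of ten regardless of how much history exists, which is why dropping old data can also be as cheap as deleting whole chunks.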
The Players
| Tool | Storage style | Best for |
|---|---|---|
| InfluxDB (v2/v3) | Custom; v3 is Apache Arrow/Parquet | General-purpose; IoT; metrics; Flux/SQL |
| TimescaleDB | Postgres extension (chunked btrees + columnar) | Want SQL + Postgres ecosystem + time-series perf |
| Prometheus (TSDB) | Custom; in-memory + blocks | Operational metrics, short retention |
| VictoriaMetrics | Prometheus-compatible | Long-term Prometheus storage; HA; cheap |
| QuestDB | Columnar + time-partitioned | Tick data; finance; ultra-low-latency queries |
| ClickHouse | Columnar OLAP | Time-series-shaped analytics; trillions of rows |
| Apache Druid | Distributed columnar OLAP | Real-time + batch analytics on event streams |
| AWS Timestream | Managed serverless | Don't want to run anything; AWS-native |
| Azure Data Explorer (Kusto) | Columnar | Telemetry + logs + ad-hoc analytics |
| TDengine | C-based; per-device tables | IoT at massive device count |
| M3DB (Uber) | Distributed | Open-source, was Uber's metrics platform |
| Mimir (Grafana) | Prometheus long-term | Cloud-native Prometheus at scale |
How to pick:
- Need SQL and Postgres ecosystem → TimescaleDB
- Operational metrics already on Prometheus → VictoriaMetrics or Mimir for long-term
- IoT / massive write rate → InfluxDB v3 or TDengine
- Finance / tick data → QuestDB or kdb+ (commercial)
- Trillions of rows analytics → ClickHouse (technically OLAP, but eats time-series)
- Don't want to run anything (AWS) → Timestream
For most teams starting out: TimescaleDB if you already use Postgres; InfluxDB otherwise.
What's Different About Time-Series
The shape of the data changes everything:
| Trait | Why it matters |
|---|---|
| Mostly appends, almost no updates | Storage can be log-structured / immutable |
| Timestamps are queried in ranges | Time-partitioning beats general indexing |
| Values are similar to neighbors | Delta encoding + Gorilla compression hit 10-50× |
| Old data is read less, eventually deleted | First-class retention + downsampling |
| Aggregates (avg, max, P99) over windows are the main query | Continuous queries pre-compute |
| Lots of series (metric × labels) | Cardinality is the silent killer |
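To make the "values are similar to neighbors" row concrete, here is a simplified sketch of delta-of-delta encoding for timestamps, the idea Gorilla-style compression applies to the time column. It is an assumption-laden illustration: it works on plain Python integers and skips the bit-packing a real encoder would do, and the function names are hypothetical.

```python
def delta_of_delta_encode(timestamps):
    """Encode sorted integer timestamps as [first, first_delta, dod, dod, ...]."""
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev_ts, prev_delta = timestamps[0], 0
    for ts in timestamps[1:]:
        delta = ts - prev_ts
        out.append(delta - prev_delta)  # change in the gap, not the gap itself
        prev_ts, prev_delta = ts, delta
    return out


def delta_of_delta_decode(encoded):
    """Invert delta_of_delta_encode."""
    if not encoded:
        return []
    out = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        out.append(out[-1] + delta)
    return out


# A sensor reporting every 10 seconds, with one sample arriving a second late.
raw = [1000, 1010, 1020, 1030, 1041]
encoded = delta_of_delta_encode(raw)
print(encoded)  # [1000, 10, 0, 0, 1]
assert delta_of_delta_decode(encoded) == raw
```

With regularly spaced samples, almost every encoded entry is zero, and a bit-level encoder can store a zero in a single bit. That is where much of the compression on the timestamp column comes from.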
Cardinality: The Silent Killer
A series is one unique combination of metric + tag values. Each is a distinct stream of points. If you have:
http_requests{service="checkout", region="us-east-1", status="200", user_id="u_123"}
then user_id is the killer: every user creates a new series. High-cardinality labels destroy TSDBs. Rules of thumb:
| Label | Cardinality (typical) | Safe? |
|---|---|---|
| service (5-50 services) | Low | Yes |
| region (3-10 regions) | Low | Yes |
| status (5 status classes) | Low | Yes |
| endpoint (50-500) | Medium | Usually |
| user_id (1M users) | High | No |
| request_id (every request) | Very High | No |
Cardinality budget: most TSDBs handle 1-10M active series per node well; above that, things start hurting. Prometheus and InfluxDB v1/v2 are the most sensitive; InfluxDB v3, VictoriaMetrics, and Mimir handle more.
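The arithmetic behind the budget is a simple product. The sketch below estimates the worst-case series count for one metric from the number of distinct values per label; the specific counts are illustrative picks from the ranges in the table above, and `series_count` is a hypothetical helper, not part of any TSDB's API.

```python
from math import prod


def series_count(label_cardinalities):
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_cardinalities.values())


# Illustrative values chosen from the ranges in the table above.
safe_labels = {"service": 20, "region": 5, "status": 5, "endpoint": 200}
with_user_id = {**safe_labels, "user_id": 1_000_000}

print(series_count(safe_labels))   # 100000
print(series_count(with_user_id))  # 100000000000
```

This is an upper bound, since not every combination occurs in practice. Even so, the four low-cardinality labels fit comfortably inside a 1-10M budget, while adding user_id overshoots it by several orders of magnitude.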
If you need per-user detail, record it in logs or traces, where a user ID is just another field on an event, rather than in high-cardinality time-series labels.
Retention and Downsampling
Time-series data ages predictably. Treat older data differently:
0-7 days: full resolution (1s)
7-30 days: 1min downsampled (60x cheaper)
30-90 days: 5min downsampled
90-365 days: 1h downsampled
> 1 year: archived or dropped
This is what continuous queries (1.x) or tasks (2.x) do in InfluxDB, continuous aggregates in TimescaleDB, and recording rules in Prometheus. The downsampled data lives next to the raw data; queries hit whichever resolution matches the time range.
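Mechanically, downsampling is just bucketing by time and aggregating each bucket. The sketch below shows the idea in plain Python under simple assumptions (integer second timestamps, mean as the aggregate); the `downsample` function is hypothetical and stands in for what the databases above do internally with their own features.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    # One output point per bucket, stamped with the bucket's start time.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Two minutes of 1-second samples collapse into two 1-minute points.
raw = [(ts, float(ts)) for ts in range(120)]
print(downsample(raw, 60))  # [(0, 29.5), (60, 89.5)]
```

A mean alone hides spikes, so rollups commonly keep min, max, and count alongside it. Percentiles such as P99 cannot be recovered from averaged data at all, so they have to be computed before the raw points are dropped.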
Without downsampling, year-old 1-second metrics cost the same to store as today's — that bill grows linearly forever.
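A quick calculation shows how much the schedule saves. The sketch below counts points retained per series over one year under the tiers listed above versus keeping everything at 1-second resolution; it counts points only and ignores compression, so treat the ratio as a rough guide.

```python
DAY = 86_400  # seconds per day

# (age_start_days, age_end_days, resolution_seconds), taken from the schedule above
TIERS = [
    (0, 7, 1),
    (7, 30, 60),
    (30, 90, 300),
    (90, 365, 3600),
]


def points_retained(tiers):
    """Points kept per series when each age band is stored at its own resolution."""
    return sum((end - start) * DAY // resolution for start, end, resolution in tiers)


tiered = points_retained(TIERS)
full_resolution = 365 * DAY

print(tiered)                     # 661800
print(full_resolution)            # 31536000
print(full_resolution // tiered)  # 47
```

Almost all of the retained points (604,800 of 661,800) sit in the first seven days, so the cost of a series stops growing once it ages past the raw-data window.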
Learning Path
1. Getting Started
Run TimescaleDB and InfluxDB locally; load sensor data; query the basics; explore in Grafana
2. Patterns
Schema design, continuous aggregates, retention, downsampling, joining time-series with relational data
3. Best Practices
Cardinality control, capacity planning, backup, high availability, query optimization, common pitfalls
Time-Series vs Adjacent Storage
| Storage | When |
|---|---|
| TSDB (Prometheus, Timescale, Influx) | Many small writes, range queries, ops/IoT/finance |
| OLAP (ClickHouse, Druid, BigQuery) | Analytical queries over large datasets; trillions of rows |
| Search (Elasticsearch) | Free-text + filters; logs with rich queries |
| Logs (Loki, OpenSearch) | Append-mostly text; lower query needs |
| Relational (Postgres) | Transactions, joins, OLTP |
| Vector (pgvector / Pinecone) | Semantic search; embeddings |
The lines blur. ClickHouse routinely handles "TSDB" workloads better than dedicated TSDBs above a certain scale. TimescaleDB is Postgres, so it handles both transactional and time-series workloads in one place. Apache Druid straddles OLAP and TSDB. Pick the database that fits your query patterns, not the marketing category.
One thing teams underestimate: a TSDB isn't a write-only black box. The point isn't just to store metrics — it's to query them in seconds, build dashboards on them, and alert on them. The right TSDB is the one whose query model fits how you'll actually use the data. Try the queries before committing to the storage.