Steven's Knowledge

Time-Series Databases

InfluxDB, TimescaleDB, VictoriaMetrics, QuestDB - purpose-built storage for timestamps, metrics, sensors, and event streams

A time-series database (TSDB) is built around one assumption: most of your data is timestamped, append-mostly, and rarely updated. That assumption changes everything — storage format, compression, query language, retention. The result is a database that holds billions of points per server cheaply, and answers "how did this metric look over the last 24 hours" in milliseconds.

You already use TSDBs without thinking about it. Prometheus is a TSDB. CloudWatch is one. Datadog is one. But there's a category of general-purpose TSDBs you reach for when Prometheus' scope (operational metrics, short retention) isn't enough: IoT data, business analytics, financial ticks, application traces in raw form.

Why a TSDB and Not Postgres

You can store timestamps in Postgres, and at small scale that works fine. At high ingest rates and series counts, vanilla Postgres struggles:

Concern | Postgres (vanilla) | Purpose-built TSDB
Insert rate | ~10k/s on a big box | 100k–1M+/s per node
Compression | OK with TOAST | 10-50× via columnar + delta + Gorilla
Retention / downsampling | Manual scripts | First-class policies
Time-range queries | OK with btree, but big | Designed for it; sub-ms on huge ranges
Aggregations over windows | Slow without tuning | Optimized hot path
Cardinality (many series) | Indexes blow up | Handled (with limits)

The breaking point is usually when you have millions of series × billions of points and queries scan large time ranges. At that scale, a TSDB is 10-100× cheaper and faster than Postgres on the same hardware.

The Players

Tool | Storage style | Best for
InfluxDB (v2/v3) | Custom; v3 is Apache Arrow/Parquet | General-purpose; IoT; metrics; Flux/SQL
TimescaleDB | Postgres extension (chunked btrees + columnar) | Want SQL + Postgres ecosystem + time-series perf
Prometheus (TSDB) | Custom; in-memory + blocks | Operational metrics, short retention
VictoriaMetrics | Prometheus-compatible | Long-term Prometheus storage; HA; cheap
QuestDB | Columnar + time-partitioned | Tick data; finance; ultra-low-latency queries
ClickHouse | Columnar OLAP | Time-series-shaped analytics; trillions of rows
Apache Druid | Distributed columnar OLAP | Real-time + batch analytics on event streams
AWS Timestream | Managed serverless | Don't want to run anything; AWS-native
Azure Data Explorer (Kusto) | Columnar | Telemetry + logs + ad-hoc analytics
TDengine | C-based; per-device tables | IoT at massive device count
M3DB (Uber) | Distributed | Open-source, was Uber's metrics platform
Mimir (Grafana) | Prometheus long-term | Cloud-native Prometheus at scale

How to pick:

  • Need SQL and Postgres ecosystem: TimescaleDB
  • Operational metrics already on Prometheus: VictoriaMetrics or Mimir for long-term
  • IoT / massive write rate: InfluxDB v3 or TDengine
  • Finance / tick data: QuestDB or kdb+ (commercial)
  • Trillions of rows analytics: ClickHouse (technically OLAP, but eats time-series)
  • Don't want to run anything (AWS): Timestream

For most teams starting out: TimescaleDB if you already use Postgres; InfluxDB otherwise.

What's Different About Time-Series

The shape of the data changes everything:

Trait | Why it matters
Mostly appends, almost no updates | Storage can be log-structured / immutable
Timestamps are queried in ranges | Time-partitioning beats general indexing
Values are similar to neighbors | Delta encoding + Gorilla compression hit 10-50×
Old data is read less, eventually deleted | First-class retention + downsampling
Aggregates (avg, max, P99) over windows are the main query | Continuous queries pre-compute
Lots of series (metric × labels) | Cardinality is the silent killer
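
To make the "values are similar to neighbors" row concrete, here is a minimal sketch of delta-of-delta encoding for timestamps, the idea behind the timestamp half of Gorilla-style compression. It is a simplification: real engines pack the small residuals into variable-length bit fields, which this sketch does not attempt. The function names are illustrative and not taken from any particular database.

```python
from typing import List


def delta_of_delta_encode(timestamps: List[int]) -> List[int]:
    """Return [first_timestamp, dod_1, dod_2, ...].

    Each dod is the change in the gap between consecutive timestamps.
    The first gap is compared against an assumed previous gap of 0.
    """
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    prev_delta = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        encoded.append(delta - prev_delta)
        prev_delta = delta
    return encoded


def delta_of_delta_decode(encoded: List[int]) -> List[int]:
    """Invert delta_of_delta_encode."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        timestamps.append(timestamps[-1] + delta)
    return timestamps


# A sensor reporting every 10 seconds, with one sample arriving a second late.
ts = [1700000000, 1700000010, 1700000020, 1700000030, 1700000041]
enc = delta_of_delta_encode(ts)
print(enc)  # [1700000000, 10, 0, 0, 1]
```

With regularly spaced samples, almost every encoded entry is 0 or a very small integer, which is why a bit-packed representation can be so much smaller than storing full 64-bit timestamps.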

Cardinality: The Silent Killer

A series is one unique combination of metric + tag values. Each is a distinct stream of points. If you have:

http_requests{service="checkout", region="us-east-1", status="200", user_id="u_123"}

Then user_id is the killer — every user creates a new series. High-cardinality labels destroy TSDBs. Rules of thumb:

Label | Cardinality (typical) | Safe?
service (5-50 services) | Low | Yes
region (3-10 regions) | Low | Yes
status (5 status classes) | Low | Yes
endpoint (50-500) | Medium | Usually
user_id (1M users) | High | No
request_id (every request) | Very High | No

Cardinality budget: most TSDBs handle 1-10M active series per node well; above that, things start hurting. Prometheus and InfluxDB v1/v2 are the most sensitive; InfluxDB v3, VictoriaMetrics, and Mimir handle more.
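
A quick way to check a label set against that budget is to multiply the per-label cardinalities. The sketch below does exactly that; the label counts are illustrative values picked from the ranges in the table above, and the product is a worst-case upper bound, since not every combination of label values occurs in practice.

```python
from math import prod
from typing import Dict

# Upper end of the "1-10M active series per node" rule of thumb.
SERIES_BUDGET = 10_000_000


def series_count(label_cardinalities: Dict[str, int]) -> int:
    """Worst-case number of series for one metric: the product of label cardinalities."""
    return prod(label_cardinalities.values())


# Hypothetical label counts, chosen from the ranges in the table.
safe = {"service": 20, "region": 5, "status": 5, "endpoint": 200}
with_user = {**safe, "user_id": 1_000_000}

for name, labels in (("safe", safe), ("with user_id", with_user)):
    count = series_count(labels)
    verdict = "within budget" if count <= SERIES_BUDGET else "over budget"
    print(f"{name}: {count:,} series ({verdict})")
```

Adding a single million-value label multiplies the series count by a million, which is how one innocent-looking tag takes a metric from comfortable to unworkable.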

If you need per-user metrics, use logs or traces, which record individual events without creating a series per user, rather than high-cardinality time-series.

Retention and Downsampling

Time-series data ages predictably. Treat older data differently:

0-7 days:    full resolution (1s)
7-30 days:   1min downsampled  (60x cheaper)
30-90 days:  5min downsampled
90-365 days: 1h downsampled
> 1 year:    archived or dropped

These rollups are called continuous queries in InfluxDB 1.x (tasks in 2.x), continuous aggregates in TimescaleDB, and recording rules in Prometheus. The downsampled data lives next to the raw data; queries hit whichever resolution matches the time range.
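
As a rough illustration of what such a rollup computes, the following sketch groups raw points into fixed-width time buckets and keeps the average and maximum per bucket. It is plain Python rather than any database's API, and the `downsample` helper is a hypothetical name used only for this example.

```python
from collections import defaultdict
from statistics import mean
from typing import Iterable, List, Tuple


def downsample(
    points: Iterable[Tuple[int, float]], bucket_seconds: int
) -> List[Tuple[int, float, float]]:
    """Roll (unix_ts, value) points up into (bucket_start, avg, max) rows."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return [
        (start, mean(values), max(values))
        for start, values in sorted(buckets.items())
    ]


raw = [(0, 1.0), (30, 3.0), (59, 5.0), (60, 10.0), (90, 20.0)]
rollup = downsample(raw, 60)
print(rollup)  # [(0, 3.0, 5.0), (60, 15.0, 20.0)]
```

Keeping the maximum alongside the average is a deliberate choice in this sketch: an average alone smooths away the spikes that alerts usually care about.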

Without downsampling, year-old 1-second metrics cost the same to store as today's — that bill grows linearly forever.
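
The arithmetic behind that claim can be sketched directly from the tiers listed above. This counts stored points per series over one year, assuming exactly one point per resolution interval and ignoring compression.

```python
SECONDS_PER_DAY = 86_400

# (start_day, end_day, resolution_seconds), taken from the tiers above.
TIERS = [
    (0, 7, 1),        # full resolution
    (7, 30, 60),      # 1min downsampled
    (30, 90, 300),    # 5min downsampled
    (90, 365, 3600),  # 1h downsampled
]


def points_per_series(tiers) -> int:
    """Points kept for one series, assuming one point per resolution interval."""
    return sum(
        (end - start) * SECONDS_PER_DAY // resolution
        for start, end, resolution in tiers
    )


tiered = points_per_series(TIERS)
raw_only = points_per_series([(0, 365, 1)])
print(tiered)    # 661800
print(raw_only)  # 31536000
print(round(raw_only / tiered, 1))  # 47.7
```

Under these assumptions, the tiered policy keeps roughly 48 times fewer points per series for a year of history, and the first seven days of raw data account for most of what remains.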

Time-Series vs Adjacent Storage

Storage | When
TSDB (Prometheus, Timescale, Influx) | Many small writes, range queries, ops/IoT/finance
OLAP (ClickHouse, Druid, BigQuery) | Analytical queries over large datasets; trillions of rows
Search (Elasticsearch) | Free-text + filters; logs with rich queries
Logs (Loki, OpenSearch) | Append-mostly text; lower query needs
Relational (Postgres) | Transactions, joins, OLTP
Vector (pgvector / Pinecone) | Semantic search; embeddings

The lines blur. ClickHouse routinely handles "TSDB" workloads better than dedicated TSDBs above a certain scale. TimescaleDB is Postgres, so it handles both transactional and time-series workloads in one place. Apache Druid straddles OLAP and TSDB. Pick the database that fits your query patterns, not the marketing category.

One thing teams underestimate: a TSDB isn't a write-only black box. The point isn't just to store metrics — it's to query them in seconds, build dashboards on them, and alert on them. The right TSDB is the one whose query model fits how you'll actually use the data. Try the queries before committing to the storage.
