InfluxDB, TimescaleDB, VictoriaMetrics, QuestDB - purpose-built storage for timestamps, metrics, sensors, and event streams

Time-Series Databases

A time-series database (TSDB) is built around one assumption: most of your data is timestamped, append-mostly, and rarely updated. That assumption changes everything — storage format, compression, query language, retention. The result is a database that holds billions of points per server cheaply, and answers "how did this metric look over the last 24 hours" in milliseconds.

You already use TSDBs without thinking about it. Prometheus is a TSDB. CloudWatch is one. Datadog is one. But there's a category of general-purpose TSDBs you reach for when Prometheus' scope (operational metrics, short retention) isn't enough: IoT data, business analytics, financial ticks, application traces in raw form.

Why a TSDB and Not Postgres

You can store timestamps in Postgres. At small scale it's fine. At large scale, vanilla Postgres starts to struggle:

Concern	Postgres (vanilla)	Purpose-built TSDB
Insert rate	~10k/s with row-at-a-time inserts; more with batching or COPY	100k–1M+/s per node
Compression	Limited; TOAST only compresses large values	10-50× via columnar + delta + Gorilla
Retention / downsampling	Manual scripts	First-class policies
Time-range queries	OK with btree, but the indexes grow large	Designed for it; time partitioning keeps large ranges fast
Aggregations over windows	Slow without tuning	Optimized hot path
Cardinality (many series)	Indexes blow up	Handled (with limits)

The breaking point usually comes when you have millions of series, billions of points, and queries that scan large time ranges. At that scale, a TSDB can be 10-100× cheaper and faster than vanilla Postgres on the same hardware.
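One reason time-range queries stay cheap in a TSDB is that storage is partitioned by time, so a query only touches the partitions that overlap its window. The sketch below illustrates the idea under the assumption of one partition per calendar day; the function name and the daily granularity are hypothetical and do not reflect how any particular database lays out its data.

```python
from datetime import date, datetime, timedelta


def partitions_for_range(start: datetime, end: datetime) -> list[date]:
    """Return the daily partitions overlapping the half-open range [start, end)."""
    if end <= start:
        return []
    days = []
    day = start.date()
    # `end` is exclusive, so a range ending exactly at midnight
    # does not touch the following day's partition.
    last = (end - timedelta(microseconds=1)).date()
    while day <= last:
        days.append(day)
        day += timedelta(days=1)
    return days


# Hypothetical store holding one partition per day of 2024 (a leap year).
all_partitions = [date(2024, 1, 1) + timedelta(days=i) for i in range(366)]

# A 24-hour query window only needs the partitions it overlaps.
needed = partitions_for_range(datetime(2024, 6, 1, 12), datetime(2024, 6, 2, 12))
print(f"scan {len(needed)} of {len(all_partitions)} partitions")
```

A general-purpose index has to be walked to find matching rows; with time partitioning, everything outside the window is skipped without being read at all.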

The Players

Tool	Storage style	Best for
InfluxDB (v2/v3)	Custom (TSM) in v2; Apache Arrow/Parquet in v3	General-purpose; IoT; metrics; Flux (v2) / SQL (v3)
TimescaleDB	Postgres extension (chunked btrees + columnar)	Want SQL + Postgres ecosystem + time-series perf
Prometheus (TSDB)	Custom; in-memory + blocks	Operational metrics, short retention
VictoriaMetrics	Prometheus-compatible	Long-term Prometheus storage; HA; cheap
QuestDB	Columnar + time-partitioned	Tick data; finance; ultra-low-latency queries
ClickHouse	Columnar OLAP	Time-series-shaped analytics; trillions of rows
Apache Druid	Distributed columnar OLAP	Real-time + batch analytics on event streams
Amazon Timestream	Managed serverless	Don't want to run anything; AWS-native
Azure Data Explorer (Kusto)	Columnar	Telemetry + logs + ad-hoc analytics
TDengine	C-based; per-device tables	IoT at massive device count
M3DB (Uber)	Distributed	Open-source, was Uber's metrics platform
Mimir (Grafana)	Prometheus long-term	Cloud-native Prometheus at scale

How to pick:

Need SQL and Postgres ecosystem → TimescaleDB
Operational metrics already on Prometheus → VictoriaMetrics or Mimir for long-term storage
IoT / massive write rate → InfluxDB v3 or TDengine
Finance / tick data → QuestDB or kdb+ (commercial)
Trillions of rows analytics → ClickHouse (technically OLAP, but eats time-series)
Don't want to run anything (AWS) → Timestream

For most teams starting out: TimescaleDB if you already use Postgres; InfluxDB otherwise.

What's Different About Time-Series

The shape of the data changes everything:

Trait	Why it matters
Mostly appends, almost no updates	Storage can be log-structured / immutable
Timestamps are queried in ranges	Time-partitioning beats general indexing
Values are similar to neighbors	Delta encoding + Gorilla compression hit 10-50×
Old data is read less, eventually deleted	First-class retention + downsampling
Aggregates (avg, max, P99) over windows are the main query	Continuous queries pre-compute
Lots of series (metric × labels)	Cardinality is the silent killer
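To make the "values are similar to neighbors" row concrete, here is a minimal sketch of delta-of-delta encoding, the idea Gorilla-style compression applies to timestamps. It only shows the arithmetic: real implementations pack the resulting small integers into variable-length bit fields and compress float values separately with XOR, neither of which is shown. The function names are illustrative.

```python
def delta_of_delta_encode(timestamps: list[int]) -> list[int]:
    """Encode as: first timestamp, then the change in delta at each step."""
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev = timestamps[0]
    prev_delta = 0
    for ts in timestamps[1:]:
        delta = ts - prev
        out.append(delta - prev_delta)  # zero whenever the interval is steady
        prev, prev_delta = ts, delta
    return out


def delta_of_delta_decode(encoded: list[int]) -> list[int]:
    """Invert delta_of_delta_encode."""
    if not encoded:
        return []
    out = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        out.append(out[-1] + delta)
    return out


# Points scraped every 10 seconds, with the last one arriving a second late.
ts = [1700000000, 1700000010, 1700000020, 1700000030, 1700000041]
encoded = delta_of_delta_encode(ts)
print(encoded)  # [1700000000, 10, 0, 0, 1]
```

After the first two entries, a regularly sampled series encodes to mostly zeros, which is why a bit-packed representation can spend very little space per timestamp.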

Cardinality: The Silent Killer

A series is one unique combination of metric + tag values. Each is a distinct stream of points. If you have:

http_requests{service="checkout", region="us-east-1", status="200", user_id="u_123"}

Then user_id is the killer — every user creates a new series. High-cardinality labels destroy TSDBs. Rules of thumb:

Label	Cardinality (typical)	Safe?
`service` (5-50 services)	Low	Yes
`region` (3-10 regions)	Low	Yes
`status` (5 status classes)	Low	Yes
`endpoint` (50-500)	Medium	Usually
`user_id` (1M users)	High	No
`request_id` (every request)	Very High	No

Cardinality budget: most TSDBs handle 1-10M active series per node well; above that, things start hurting. Prometheus and InfluxDB v1/v2 are the most sensitive; InfluxDB v3, VictoriaMetrics, and Mimir handle more.
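A quick way to check a label set against that budget is to multiply the number of distinct values per label. The sketch below does this for one metric, using label counts taken from the ranges in the table above. The result is a worst-case upper bound, since not every combination occurs in practice, and the 10M figure is simply the upper end of the rule of thumb, not a hard limit of any product.

```python
from math import prod

SERIES_BUDGET = 10_000_000  # upper end of the per-node rule of thumb


def estimate_series(label_cardinalities: dict[str, int]) -> int:
    """Worst-case series count for one metric: product of distinct values per label."""
    return prod(label_cardinalities.values())


safe_labels = {"service": 20, "region": 5, "status": 5, "endpoint": 200}
with_user_id = {**safe_labels, "user_id": 1_000_000}

for name, labels in [("safe", safe_labels), ("with user_id", with_user_id)]:
    total = estimate_series(labels)
    verdict = "within budget" if total <= SERIES_BUDGET else "over budget"
    print(f"{name}: {total:,} series ({verdict})")
```

The four low- and medium-cardinality labels multiply out to 100,000 series; adding `user_id` multiplies that by a million.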

If you need per-user metrics, put them in logs or traces, which store individual events without creating a new series for every label value, rather than in high-cardinality time-series.

Retention and Downsampling

Time-series data ages predictably. Treat older data differently:

0-7 days:    full resolution (1s)
7-30 days:   1min downsampled  (60x cheaper)
30-90 days:  5min downsampled
90-365 days: 1h downsampled
> 1 year:    archived or dropped

InfluxDB does this with continuous queries (v1) or tasks (v2), TimescaleDB with continuous aggregates, and Prometheus with recording rules. The downsampled data lives next to the raw data; queries read whichever resolution matches the time range.
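What those features compute can be sketched in a few lines: group raw points into fixed time buckets and keep a handful of aggregates per bucket. This is an in-memory illustration only, assuming points are `(unix_seconds, value)` tuples; the function name and output shape are hypothetical, and a real database runs the equivalent incrementally as data arrives.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds=60):
    """Roll (unix_ts, value) points up into fixed buckets, sorted by bucket start."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each point to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return [
        {"ts": start, "avg": mean(vals), "min": min(vals),
         "max": max(vals), "count": len(vals)}
        for start, vals in sorted(buckets.items())
    ]


# Two minutes of 1-second data collapse into two 1-minute rows.
raw = [(t, float(t % 60)) for t in range(120)]
rollup = downsample(raw)
print(len(raw), "->", len(rollup), "rows")
```

Keeping `min`, `max`, and `count` alongside the average preserves spikes that an average alone would hide, and the count allows averages to be combined correctly when rolling up again to a coarser resolution.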

Without downsampling, year-old 1-second metrics cost the same to store as today's — that bill grows linearly forever.
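The savings from the tier schedule above can be worked out directly. The sketch counts stored points per series for one year, assuming a single stored value per downsampled bucket; storing min, max, and average per bucket would multiply the downsampled tiers accordingly.

```python
SECONDS_PER_DAY = 86_400

# (days covered by the tier, seconds between stored points), from the schedule above
TIERS = [
    (7, 1),        # 0-7 days at 1s
    (23, 60),      # 7-30 days at 1min
    (60, 300),     # 30-90 days at 5min
    (275, 3600),   # 90-365 days at 1h
]


def points_per_series(tiers):
    """Total stored points for one series across all tiers."""
    return sum(days * SECONDS_PER_DAY // interval for days, interval in tiers)


tiered = points_per_series(TIERS)
full_resolution = points_per_series([(365, 1)])

print(f"tiered:          {tiered:,} points")           # 661,800
print(f"full resolution: {full_resolution:,} points")  # 31,536,000
print(f"ratio:           {full_resolution / tiered:.1f}x")
```

Under these assumptions the first week of raw data accounts for most of the tiered total, and a full year costs roughly 48 times fewer points than keeping everything at 1-second resolution.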

Learning Path

1. Getting Started

Run TimescaleDB and InfluxDB locally; load sensor data; query the basics; explore in Grafana

2. Patterns

Schema design, continuous aggregates, retention, downsampling, joining time-series with relational data

3. Best Practices

Cardinality control, capacity planning, backup, high availability, query optimization, common pitfalls

Time-Series vs Adjacent Storage

Storage	When
TSDB (Prometheus, Timescale, Influx)	Many small writes, range queries, ops/IoT/finance
OLAP (ClickHouse, Druid, BigQuery)	Analytical queries over large datasets; trillions of rows
Search (Elasticsearch)	Free-text + filters; logs with rich queries
Logs (Loki, OpenSearch)	Append-mostly text; lower query needs
Relational (Postgres)	Transactions, joins, OLTP
Vector (pgvector / Pinecone)	Semantic search; embeddings

The lines blur. ClickHouse routinely handles "TSDB" workloads better than dedicated TSDBs above a certain scale. TimescaleDB is Postgres, so it handles both transactional and time-series workloads in one place. Apache Druid straddles OLAP and TSDB. Pick the database that fits your query patterns, not the marketing category.

One thing teams underestimate: a TSDB isn't a write-only black box. The point isn't just to store metrics — it's to query them in seconds, build dashboards on them, and alert on them. The right TSDB is the one whose query model fits how you'll actually use the data. Try the queries before committing to the storage.
