Steven's Knowledge

Getting Started

Run TimescaleDB and InfluxDB locally, load sensor data, query the basics, explore in Grafana

Getting Started

This page runs two time-series databases side by side so you can feel the differences: TimescaleDB (SQL, Postgres extension) and InfluxDB (its own line protocol and query language). Both are excellent; the right choice depends on your team and ecosystem.

Setup

# docker-compose.yml
version: '3.8'
services:
  timescale:
    image: timescale/timescaledb:latest-pg16
    ports: ['5432:5432']
    environment:
      POSTGRES_PASSWORD: tsdbpw
    volumes: ['ts-data:/var/lib/postgresql/data']

  influx:
    image: influxdb:2.7
    ports: ['8086:8086']
    environment:
      DOCKER_INFLUXDB_INIT_MODE: setup
      DOCKER_INFLUXDB_INIT_USERNAME: admin
      DOCKER_INFLUXDB_INIT_PASSWORD: influxpw1
      DOCKER_INFLUXDB_INIT_ORG: demo
      DOCKER_INFLUXDB_INIT_BUCKET: sensors
      DOCKER_INFLUXDB_INIT_ADMIN_TOKEN: my-super-secret-token
    volumes: ['influx-data:/var/lib/influxdb2']

  grafana:
    image: grafana/grafana:latest
    ports: ['3000:3000']
    environment:
      GF_AUTH_ANONYMOUS_ENABLED: 'true'
      GF_AUTH_ANONYMOUS_ORG_ROLE: Editor
    volumes: ['grafana-data:/var/lib/grafana']

volumes:
  ts-data: {}
  influx-data: {}
  grafana-data: {}
docker-compose up -d
# Wait ~10s
docker-compose ps

You now have TimescaleDB on :5432, InfluxDB UI on :8086, and Grafana on :3000.

TimescaleDB: Hello, Hypertables

A hypertable is the core TimescaleDB abstraction: it looks like a Postgres table but is automatically partitioned (chunked) by time.

-- Connect
psql -h localhost -U postgres
-- Password: tsdbpw

CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Sensor schema
CREATE TABLE sensors (
  time        TIMESTAMPTZ NOT NULL,
  device_id   TEXT,
  temperature DOUBLE PRECISION,
  humidity    DOUBLE PRECISION
);

-- Turn it into a hypertable, partitioned by time (1-day chunks)
SELECT create_hypertable('sensors', 'time', chunk_time_interval => INTERVAL '1 day');

CREATE INDEX ON sensors (device_id, time DESC);

Generate some data:

INSERT INTO sensors (time, device_id, temperature, humidity)
SELECT
  now() - (random() * INTERVAL '7 days'),
  'device_' || ((random() * 10)::int),
  20 + random() * 10,
  40 + random() * 30
FROM generate_series(1, 1000000);

Query like ordinary SQL:

-- Hourly average temperature for the last 24h, per device
SELECT
  time_bucket('1 hour', time) AS bucket,
  device_id,
  AVG(temperature) AS avg_temp,
  MAX(temperature) AS max_temp
FROM sensors
WHERE time > NOW() - INTERVAL '24 hours'
GROUP BY bucket, device_id
ORDER BY bucket DESC, device_id;

time_bucket is TimescaleDB's date_trunc for time-series — flexible bucket sizes ('5 minutes', '15 minutes', '1 hour').

TimescaleDB: Continuous Aggregates

Pre-computed, incrementally maintained aggregates:

CREATE MATERIALIZED VIEW sensors_hourly
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  device_id,
  AVG(temperature) AS avg_temp,
  MAX(temperature) AS max_temp,
  COUNT(*) AS readings
FROM sensors
GROUP BY bucket, device_id;

-- Refresh policy
SELECT add_continuous_aggregate_policy('sensors_hourly',
  start_offset => INTERVAL '1 day',
  end_offset => INTERVAL '1 hour',
  schedule_interval => INTERVAL '10 minutes');

Querying sensors_hourly for last 30 days is now sub-second, even with billions of rows.

TimescaleDB: Retention

-- Drop chunks older than 30 days
SELECT add_retention_policy('sensors', INTERVAL '30 days');

The continuous aggregate still has data older than 30 days (because it's materialized separately) — typical pattern: drop raw, keep aggregates.

InfluxDB: Line Protocol

InfluxDB ingests via line protocol — a text format:

measurement,tag1=v1,tag2=v2 field1=10.5,field2=20.3 1716220800000000000

Push some data:

TOKEN="my-super-secret-token"
ORG="demo"
BUCKET="sensors"

for i in {1..1000}; do
  TS=$(date +%s%N)
  TEMP=$(echo "20 + $RANDOM % 10" | bc)
  HUM=$(echo "40 + $RANDOM % 30" | bc)
  DEV="device_$(($RANDOM % 10))"
  echo "sensors,device_id=$DEV temperature=$TEMP,humidity=$HUM $TS"
done | curl -s -XPOST \
  "http://localhost:8086/api/v2/write?org=$ORG&bucket=$BUCKET" \
  -H "Authorization: Token $TOKEN" \
  --data-binary @-

InfluxDB: Query with Flux (and SQL in v3)

Flux is InfluxDB's functional query language:

from(bucket: "sensors")
  |> range(start: -24h)
  |> filter(fn: (r) => r._measurement == "sensors" and r._field == "temperature")
  |> aggregateWindow(every: 1h, fn: mean, createEmpty: false)
  |> group(columns: ["device_id"])

InfluxDB v3 also supports SQL via DataFusion:

SELECT
  date_trunc('hour', time) AS bucket,
  device_id,
  AVG(temperature) AS avg_temp
FROM sensors
WHERE time > now() - INTERVAL '24 hours'
GROUP BY bucket, device_id
ORDER BY bucket DESC;

For most new projects, SQL is easier. If you adopt InfluxDB and your team isn't keen on Flux, go v3 + SQL.

Connect Grafana

Open http://localhost:3000. Add two data sources:

  1. PostgreSQL → Host timescale:5432, DB postgres, user postgres, password tsdbpw. Important: check "TimescaleDB" toggle.
  2. InfluxDB → URL http://influx:8086, query language Flux (or SQL for v3), org demo, token my-super-secret-token, default bucket sensors.

Build a dashboard:

  • Time-series panel, query SELECT time_bucket('1 hour', time) AS time, device_id, AVG(temperature) FROM sensors WHERE $__timeFilter(time) GROUP BY 1, device_id (TimescaleDB).
  • Stat panel with MAX(temperature) last 1h.

Grafana's $__timeFilter macro adapts to whatever range the user picks in the dashboard.

A Realistic Mini-Workload

Try a sustained write rate:

# 10k writes/sec into Timescale via psycopg2
cat <<'PY' | python3
import psycopg2, random, time
from datetime import datetime, timezone

conn = psycopg2.connect("dbname=postgres user=postgres password=tsdbpw host=localhost")
cur = conn.cursor()
batch = []
start = time.time()
for i in range(100_000):
    batch.append((
        datetime.now(timezone.utc),
        f"device_{random.randint(0,99)}",
        20 + random.random()*10,
        40 + random.random()*30,
    ))
    if len(batch) == 5000:
        cur.executemany(
            "INSERT INTO sensors (time, device_id, temperature, humidity) VALUES (%s,%s,%s,%s)",
            batch
        )
        conn.commit()
        batch.clear()
print("done in", time.time()-start, "s")
PY

Then watch the database size:

SELECT pg_size_pretty(hypertable_size('sensors'));
SELECT chunk_name, pg_size_pretty(total_bytes)
FROM chunks_detailed_size('sensors') ORDER BY range_start DESC LIMIT 10;

When to Use Which

You wantPick
SQL, joins with existing Postgres data, in-app metricsTimescaleDB
Operational metrics, scrape model, alertingPrometheus
IoT, very high write rate, custom protocolsInfluxDB v3 (or TDengine)
Long-term Prometheus storageVictoriaMetrics or Mimir
Trillions of rows, analyticsClickHouse
No infra to run, AWS-nativeTimestream

Cleanup

docker-compose down -v

What's Next

  • Patterns — schema, continuous aggregates, retention, downsampling, joining with relational data
  • Best Practices — cardinality, capacity, HA, query optimization, pitfalls

On this page