Instrumentation
Auto vs manual spans, attributes, error tracking, context propagation across HTTP and queues
Tracing is only as useful as what you instrument. This page is the playbook for getting useful traces — not just "we have traces."
Auto vs Manual
| Auto-instrumentation | Manual spans |
|---|---|
| Patches popular libraries (HTTP, DB, queues, frameworks) | Code you write explicitly |
| Free coverage; minimal setup | Business-domain spans |
| Generic attributes (http.method, db.statement) | Domain-specific attributes (user.tier, order.total) |
| Updates with library versions | You maintain |
Use both. Auto for the boring transport-level stuff, manual for the things that matter to your business.
What Auto-Instrumentation Covers
In Node:
| Library | Spans created |
|---|---|
| express / fastify / koa | One span per request |
| http / https | One span per outbound request |
| pg, mysql, mysql2, mongodb, redis, ioredis, cassandra | One span per query |
| kafkajs, amqplib, aws-sdk (SQS/SNS/Kinesis) | One span per produce/consume |
| grpc, aws-sdk (other services) | One span per RPC/API call |
| graphql | One span per resolver |
Similar coverage in Python (opentelemetry-instrumentation-*), Java (the agent auto-discovers), Go (per-library packages), etc.
Manual Spans for Business Logic
const { trace, SpanStatusCode } = require('@opentelemetry/api');
const tracer = trace.getTracer('orders');
async function processOrder(order) {
  return await tracer.startActiveSpan('process-order', async (span) => {
    try {
      span.setAttributes({
        'order.id': order.id,
        'order.total_cents': order.total,
        'order.item_count': order.items.length,
        'user.id': order.userId,
      });
      const validated = await validate(order); // gets its own span if instrumented
      const charged = await charge(order);
      const fulfilled = await fulfill(order);
      span.setAttribute('order.fulfilled', true);
      return { validated, charged, fulfilled };
    } catch (err) {
      span.recordException(err);
      span.setStatus({ code: SpanStatusCode.ERROR, message: err.message });
      throw err;
    } finally {
      span.end();
    }
  });
}
startActiveSpan makes this span the current context, so child spans created by code inside it link to it automatically.
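The try/catch/finally shape above repeats for every manual span, so it can be worth wrapping once. The following is a minimal sketch, not part of the OTel API: `withSpan` is a hypothetical helper name, and `STATUS_ERROR` hard-codes the numeric value of `SpanStatusCode.ERROR` so the snippet has no dependencies.

```javascript
// Numeric value of SpanStatusCode.ERROR in @opentelemetry/api.
const STATUS_ERROR = 2;

/**
 * Hypothetical helper: runs `fn` inside a new active span and always ends it.
 * `tracer` is anything exposing startActiveSpan(name, callback),
 * such as trace.getTracer('orders').
 */
async function withSpan(tracer, name, attributes, fn) {
  return tracer.startActiveSpan(name, async (span) => {
    try {
      span.setAttributes(attributes);
      return await fn(span);
    } catch (err) {
      // Record the exception and mark the span as failed, then rethrow.
      span.recordException(err);
      span.setStatus({ code: STATUS_ERROR, message: err.message });
      throw err;
    } finally {
      span.end();
    }
  });
}
```

With this in place, a call site shrinks to something like `await withSpan(tracer, 'process-order', { 'order.id': order.id }, () => charge(order))`.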
Attributes
Attributes are key/value pairs on a span. They're searchable, filterable, and aggregatable in the UI.
Semantic Conventions
OTel defines standard names. Use them where they apply:
| Convention | Examples |
|---|---|
| HTTP | http.method, http.status_code, http.url, http.target |
| Database | db.system, db.statement, db.name, db.operation |
| Messaging | messaging.system, messaging.destination.name, messaging.operation |
| Cloud | cloud.region, cloud.availability_zone, cloud.account.id |
| User | enduser.id, enduser.role |
| Custom | app.* prefix for your own |
Standard names mean your dashboards work across services and your tooling understands them.
What to Add
| Add | Don't add |
|---|---|
| User ID / account ID | Email / PII (unless hashed) |
| Resource IDs (order, product) | Full request body |
| Business tier (free / pro / enterprise) | Auth tokens / passwords |
| Feature-flag variants | High-cardinality IDs that won't help search |
| Outcomes ("error.type=PaymentDeclined") | Stack traces (use events / status) |
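One way to enforce the table above is to filter attributes before they reach the span. This is a sketch under assumptions: the key names in `BLOCKED_KEYS` and `HASHED_KEYS` are hypothetical examples, not OTel conventions, and `sanitizeAttributes` is an illustrative helper. Note that an unsalted hash is only a pseudonym, so treat it as a starting point.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical policy: these key names are examples, not an OTel standard.
const BLOCKED_KEYS = new Set(['password', 'auth.token', 'http.request.body']);
const HASHED_KEYS = new Set(['user.email']);

/** Returns a copy of `attrs` that is safer to pass to span.setAttributes(). */
function sanitizeAttributes(attrs) {
  const safe = {};
  for (const [key, value] of Object.entries(attrs)) {
    if (BLOCKED_KEYS.has(key)) continue; // never export secrets or bodies
    if (HASHED_KEYS.has(key)) {
      // Store a stable SHA-256 digest instead of the raw PII value.
      safe[`${key}.sha256`] = createHash('sha256').update(String(value)).digest('hex');
      continue;
    }
    safe[key] = value;
  }
  return safe;
}
```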
span.setAttribute('user.id', userId);
span.setAttribute('app.feature_flag.new_checkout', 'treatment');
span.setAttribute('app.payment.method', 'card');
span.setAttribute('app.cart.items.count', items.length);
Errors
Two ways to record errors:
// 1. Record the exception (preserves stack, type)
span.recordException(err);
// 2. Mark the span as errored
span.setStatus({
  code: SpanStatusCode.ERROR,
  message: err.message,
});
Do both. recordException adds a structured error event; setStatus sets the span's status. Together, your trace shows the error inline and the failed span is highlighted in the UI.
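For HTTP errors, most auto-instrumentations mark spans as errored automatically when the status code is 500 or higher. Be careful with 4xx: OTel's HTTP semantic conventions treat 4xx as an error on client spans but leave the status unset on server spans, so many auto-instrumentations do not flag inbound 4xx responses by default. Configure this to your taste.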
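If you decide to set HTTP span status yourself, the policy fits in a few lines. The sketch below is illustrative only: `spanStatusForHttp` and its options are hypothetical names, and the constants hard-code the numeric values of `SpanStatusCode.UNSET` and `SpanStatusCode.ERROR`.

```javascript
// Numeric values of SpanStatusCode in @opentelemetry/api.
const UNSET = 0;
const ERROR = 2;

/**
 * Hypothetical policy helper: decides the span status for an HTTP response.
 * `isClient` is true for outbound requests, false for inbound ones.
 * `treat4xxAsError` lets a server opt in to flagging 4xx responses.
 */
function spanStatusForHttp(statusCode, { isClient = false, treat4xxAsError = false } = {}) {
  if (statusCode >= 500) return { code: ERROR };
  if (statusCode >= 400 && (isClient || treat4xxAsError)) return { code: ERROR };
  return { code: UNSET }; // leave the status alone for everything else
}
```

The result can be passed straight to `span.setStatus(...)`.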
Events
Events are timestamped messages within a span:
span.addEvent('retry', {
  'retry.count': attempt,
  'retry.reason': 'connection-refused',
});
span.addEvent('cache.miss', { 'cache.key': cacheKey });
Events are like log lines, but tied to the span, which makes them easier to correlate than separate logs.
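Retries are a natural place for events, because each attempt becomes visible on the span's timeline. A minimal sketch, assuming a hypothetical helper named `retryWithEvents` and a `span` object that exposes `addEvent(name, attributes)`:

```javascript
/**
 * Hypothetical helper: retries `fn` up to `maxAttempts` times and adds a
 * 'retry' event to `span` after each failed attempt that will be retried.
 */
async function retryWithEvents(span, fn, maxAttempts = 3) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // out of attempts: let the caller record it
      span.addEvent('retry', {
        'retry.count': attempt,
        'retry.reason': err.code ?? err.message,
      });
    }
  }
}
```

The final failure is rethrown rather than logged as an event, so the caller can handle it with recordException and setStatus.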
Context Propagation Across Protocols
HTTP (automatic with OTel)
client.fetch(...) → injects traceparent header
server middleware → extracts, creates child span
Just use the auto-instrumented HTTP client. You don't write any code.
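You never need to build the header by hand, but knowing its shape helps when debugging a broken trace. The sketch below parses and formats a version-00 W3C traceparent value; `parseTraceparent` and `formatTraceparent` are illustrative names, not OTel API functions.

```javascript
// version "00", 32-hex trace ID, 16-hex parent span ID, 2-hex flags
const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

/** Parses a version-00 traceparent header; returns null if it is malformed. */
function parseTraceparent(header) {
  const match = TRACEPARENT.exec(header ?? '');
  if (!match) return null;
  const [, traceId, spanId, flags] = match;
  // All-zero IDs are invalid per the W3C Trace Context spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  // Bit 0 of the flags byte is the "sampled" flag.
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

/** Builds the header a client would send for the given span context. */
function formatTraceparent({ traceId, spanId, sampled }) {
  return `00-${traceId}-${spanId}-${sampled ? '01' : '00'}`;
}
```

If a downstream service starts a fresh trace instead of a child span, logging the incoming header and running it through a parser like this is a quick first check.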
Kafka
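const { trace, context, propagation, SpanKind } = require('@opentelemetry/api');
// Producer: inject the span's context into the message headers
const span = tracer.startSpan('publish-order-created', { kind: SpanKind.PRODUCER });
const ctx = trace.setSpan(context.active(), span);
const headers = {};
propagation.inject(ctx, headers);
try {
  await producer.send({ topic: 'orders', messages: [{ value, headers }] });
} finally {
  span.end();
}
// Consumer: extract the parent context and continue the trace
await consumer.run({
  eachMessage: async ({ message }) => {
    // kafkajs delivers header values as Buffers; the default getter expects strings
    const carrier = Object.fromEntries(
      Object.entries(message.headers ?? {}).map(([key, value]) => [key, value?.toString()])
    );
    const parentCtx = propagation.extract(context.active(), carrier);
    await tracer.startActiveSpan('process-order-created', { kind: SpanKind.CONSUMER }, parentCtx, async (span) => {
      try {
        // process...
      } finally {
        span.end();
      }
    });
  },
});
Kafka auto-instrumentation (when enabled) handles this for you in Node; in Go and other languages you often write it yourself.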
Background Jobs (Sidekiq / BullMQ / Celery)
Pattern: inject the trace context as job metadata at enqueue; extract and continue in the worker.
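const { SpanKind } = require('@opentelemetry/api');
// Enqueue
await queue.add('send-email', {
  to: user.email,
  body: '...',
  _trace: getCurrentTraceContext(), // your serialization
});
// Worker (Bull-style API; BullMQ uses a Worker class instead)
queue.process('send-email', async (job) => {
  const parentCtx = contextFromHeaders(job.data._trace); // your deserialization
  await tracer.startActiveSpan('send-email', { kind: SpanKind.CONSUMER }, parentCtx, async (span) => {
    try {
      // ...
    } finally {
      span.end();
    }
  });
});
Many job runners now have OTel integrations; check yours first.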
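The two serialization helpers in the snippet above are left to you. One possible shape, sketched here under assumptions, is to lean on the same `propagation.inject` / `propagation.extract` calls used for Kafka. `makeJobTracing` is a hypothetical factory name; it takes the OTel API object as an argument so the helpers stay easy to test.

```javascript
/**
 * Hypothetical factory: builds the two helpers the pattern needs.
 * Pass require('@opentelemetry/api'); only `propagation` and `context` are used.
 */
function makeJobTracing({ propagation, context }) {
  return {
    // Enqueue side: snapshot the active context into a plain, JSON-safe object.
    captureTraceContext() {
      const carrier = {};
      propagation.inject(context.active(), carrier);
      return carrier; // typically contains a traceparent entry
    },
    // Worker side: rebuild a Context from the stored object.
    restoreTraceContext(carrier) {
      return propagation.extract(context.active(), carrier ?? {});
    },
  };
}
```

`captureTraceContext()` would stand in for `getCurrentTraceContext()` at enqueue time, and `restoreTraceContext(job.data._trace)` for `contextFromHeaders(...)` in the worker. Because the carrier is a plain object of strings, it survives the JSON round trip most job queues apply to payloads.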
Database Queries
Database queries are auto-spanned. The valuable bit is correlating with DB-side logs. Add the trace ID to your SQL queries via comments:
const traceId = trace.getActiveSpan()?.spanContext().traceId;
await db.query(`/* trace_id=${traceId} */ SELECT * FROM users WHERE id = ?`, [id]);
Now your slow query log shows the trace ID, so you can jump from a slow query to its parent trace in your APM.
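When there is no active span, the interpolation above writes `trace_id=undefined` into the query. A small guard avoids that and also ensures only a well-formed ID ever lands inside the SQL comment. This is a sketch; `withTraceComment` is a hypothetical helper name.

```javascript
// OTel trace IDs are 32 lowercase hex characters.
const TRACE_ID = /^[0-9a-f]{32}$/;

/**
 * Prefixes `sql` with a trace_id comment when `traceId` is a valid
 * 32-character hex trace ID; otherwise returns `sql` unchanged.
 */
function withTraceComment(sql, traceId) {
  if (typeof traceId !== 'string' || !TRACE_ID.test(traceId)) return sql;
  return `/* trace_id=${traceId} */ ${sql}`;
}
```

Usage would look like `db.query(withTraceComment('SELECT * FROM users WHERE id = ?', traceId), [id])`.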
Trace IDs in Logs
Critically: emit the trace ID with every log line. Otherwise correlating logs to traces is painful.
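const { trace } = require('@opentelemetry/api');
const pino = require('pino')({
  // mixin runs on every log call and merges its result into the log line
  mixin: () => {
    const ctx = trace.getActiveSpan()?.spanContext();
    return ctx ? { trace_id: ctx.traceId, span_id: ctx.spanId } : {};
  },
});
pino.info({ user_id: 42 }, 'order placed');
Now every log line written inside an active span carries trace_id and span_id. Your log aggregator (ELK, Loki, Datadog) can link directly to your tracing UI.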
Baggage: Cross-Span Context
Baggage is OTel's way of propagating key/value pairs to every service a request touches. Entries travel with the context; they are not added to spans as attributes unless you copy them yourself:
const { context, propagation } = require('@opentelemetry/api');
// Set baggage
const baggage = propagation.getBaggage(context.active()) || propagation.createBaggage();
const newBaggage = baggage.setEntry('user.tier', { value: 'enterprise' });
const newCtx = propagation.setBaggage(context.active(), newBaggage);
context.with(newCtx, async () => {
  // baggage is automatically propagated across HTTP calls
  await downstreamCall();
});
// Read baggage anywhere
const tier = propagation.getBaggage(context.active())?.getEntry('user.tier')?.value;
Useful for context that should flow throughout the request: user tier, region, feature-flag bucket. Avoid baggage for high-cardinality or bulky data, because it is sent on every HTTP call in the trace.
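To see what that cost looks like, it helps to picture the header baggage turns into. The sketch below is a simplified approximation of the W3C `baggage` header (comma-separated key=value pairs with percent-encoded values, ignoring per-entry properties); both function names are hypothetical.

```javascript
/**
 * Serializes baggage entries roughly the way the W3C `baggage` header does.
 * Simplified: ignores per-entry metadata/properties.
 */
function toBaggageHeader(entries) {
  return Object.entries(entries)
    .map(([key, value]) => `${key}=${encodeURIComponent(String(value))}`)
    .join(',');
}

/** Rough count of bytes this baggage adds to each outgoing request. */
function baggageOverheadBytes(entries) {
  return Buffer.byteLength(`baggage: ${toBaggageHeader(entries)}`, 'utf8');
}
```

A couple of short entries cost a few dozen bytes per call; a serialized cart or a long list of IDs multiplies that across every hop in the trace.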
Sampling at the SDK
For volume control (we'll cover head/tail sampling in Best Practices):
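const { NodeSDK } = require('@opentelemetry/sdk-node');
const { ParentBasedSampler, TraceIdRatioBasedSampler } = require('@opentelemetry/sdk-trace-base');
const sdk = new NodeSDK({
  // Sample 10% of new traces; follow the caller's decision for the rest
  sampler: new ParentBasedSampler({ root: new TraceIdRatioBasedSampler(0.1) }),
  // ...
});
The decision happens at the head: the first service to see the trace decides, and ParentBasedSampler makes downstream services honor that decision. Errored traces should always be sampled, which a head sampler alone cannot guarantee; see Best Practices.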
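Ratio sampling is deterministic: the decision is derived from the trace ID rather than from a random draw per span, so the same trace always gets the same answer. The function below is a simplified illustration of that idea, not the SDK's exact algorithm; `shouldSample` is a hypothetical name.

```javascript
/**
 * Simplified illustration of trace-ID ratio sampling (not the SDK's exact
 * algorithm): map the first 8 hex digits of the trace ID onto [0, 1) and
 * sample when the value falls below `ratio`.
 */
function shouldSample(traceId, ratio) {
  const bucket = parseInt(traceId.slice(0, 8), 16) / 0x100000000;
  return bucket < ratio;
}
```

Because trace IDs are random, roughly `ratio` of all traces fall below the threshold, and every span in a given trace lands on the same side of it.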
What's Next
You can instrument anything and propagate context across protocols. Best Practices covers operating tracing at scale — sampling strategy, retention, costs, correlating with logs and metrics.