Kafka vs RabbitMQ
Side-by-side comparison of Apache Kafka and RabbitMQ - architecture, semantics, performance, and a selection guide
The two systems solve overlapping problems with fundamentally different architectures. This page compares them head-to-head; for hands-on detail see Kafka and RabbitMQ, and for a hello-world walkthrough see Getting Started.
Architecture in One Picture
KAFKA: append-only log, consumers track their own offset
Producer ──► Topic (partition 0) [m1, m2, m3, m4, m5, m6, ...]
                 ──► Consumer Group A (at offset 4)
                 └─► Consumer Group B (at offset 6)
         ──► Topic (partition 1) [m1, m2, m3, m4, ...]
                 ──► Consumer Group A (at offset 3)
                 └─► Consumer Group B (at offset 4)
RABBITMQ: exchange routes to queues, consumers pop messages off
Producer ──► Exchange ──── binding ──► Queue 1 ──► Consumer A
                       └── binding ──► Queue 2 ──► Consumer B
                       └── binding ──► Queue 3 ──► Consumer C
That picture is most of the difference. The rest follows.
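To make the two models concrete, here is a minimal in-memory sketch in Python. It is a toy, not client code for either system: `ToyLog` and `ToyQueue` are hypothetical names, and the queue removes a message on delivery, whereas a real RabbitMQ queue removes it once the consumer acks.

```python
from collections import deque


class ToyLog:
    """Append-only log: reads never remove data; each group keeps its own offset."""

    def __init__(self):
        self.records = []
        self.offsets = {}  # group name -> index of the next record to read

    def append(self, message):
        self.records.append(message)

    def poll(self, group, max_records=10):
        start = self.offsets.get(group, 0)
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch

    def seek(self, group, offset):
        # Replay: rewind a group's offset; the data is still in the log.
        self.offsets[group] = offset


class ToyQueue:
    """Queue: a delivered message is removed, so only one consumer sees it."""

    def __init__(self):
        self.messages = deque()

    def publish(self, message):
        self.messages.append(message)

    def get(self):
        return self.messages.popleft() if self.messages else None


log = ToyLog()
queue = ToyQueue()
for m in ["m1", "m2", "m3"]:
    log.append(m)
    queue.publish(m)

group_a = log.poll("A")   # group A reads everything
group_b = log.poll("B")   # group B independently reads the same records
log.seek("A", 0)
replayed = log.poll("A")  # group A re-reads from the start

worker_1 = queue.get()    # takes the first message off the queue
worker_2 = queue.get()    # a second consumer gets the next message instead
```

Both groups see all three records and group A can replay them, while the two queue workers each receive a different message.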
Core Comparison
| Aspect | Kafka | RabbitMQ |
|---|---|---|
| Model | Distributed log / event streaming | Message broker / task queue |
| Retention | Configurable (hours to forever) | Until consumed and acked |
| Replay | Built-in (reset offset) | Not natively (use Streams or DLX patterns) |
| Ordering | Per-partition, guaranteed | Per-queue; strict only with a single consumer |
| Routing | By key → partition (simple) | Direct, topic, fanout, headers (flexible) |
| Delivery model | Pull (consumer polls) | Push (broker delivers) |
| Consumer scaling | Add consumers up to partition count | Add consumers to a queue freely |
| Protocol | Custom binary | AMQP 0-9-1, plus STOMP / MQTT plugins |
| Throughput | 100K – 1M+ msg/s/node | 20K – 50K msg/s/queue |
| Latency (p50) | 2 – 5 ms | < 1 ms |
| Storage | Sequential disk (very efficient) | Memory + disk overflow |
| Cluster coord. | KRaft (built in) | Erlang cluster + quorum queues |
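The "by key → partition" routing and per-partition ordering rows can be illustrated with a small sketch. The hash here is a stand-in: Kafka's default partitioner uses murmur2, not CRC32, and the partition count and event data are made up for illustration.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash; Kafka's default partitioner uses murmur2 rather than CRC32.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "add_to_cart"),
    ("user-2", "logout"),
    ("user-1", "checkout"),
]

# Each partition is an ordered list; a key always maps to the same partition.
partitions = defaultdict(list)
for key, value in events:
    partitions[partition_for(key)].append((key, value))


def events_for(key):
    # Read back one key's events in the order its partition stored them.
    return [v for k, v in partitions[partition_for(key)] if k == key]
```

Because every event for a key lands in one partition, `events_for("user-1")` comes back in publish order. There is no ordering guarantee across different partitions.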
Semantic Differences That Matter
| Question | Kafka | RabbitMQ |
|---|---|---|
| "Can a second consumer read the same data?" | Yes — different consumer group | Need a fanout exchange + per-consumer queue |
| "Can I re-process last week's events?" | Yes — reset offset | No (unless you used RabbitMQ Streams) |
| "How do I route by content?" | App-level partition key | Topic/headers exchange does it natively |
| "How do I scale out a consumer?" | Add consumers to the group (up to partition count) | Add consumers to the queue (any number) |
| "What happens when a consumer is slow?" | Lag grows; data still there | Queue grows; messages still there |
| "Where do failed messages go?" | Stay in the log; app-level retry/skip | Native dead-letter exchange |
| "Can I prioritise messages?" | No (without app-level partitioning) | Native priority queues |
| "Do I get back-pressure for free?" | Yes (consumer pulls when ready) | Yes (consumer prefetch) |
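As an illustration of the content-routing row, the sketch below mimics how an AMQP topic exchange matches routing keys against binding patterns: words are separated by dots, `*` matches exactly one word, and `#` matches zero or more words. The queue names and patterns are hypothetical, and this is a simplified model, not broker code.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if routing_key matches an AMQP-style topic pattern."""

    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' absorbs zero or more words: try every possible split.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


# Hypothetical bindings: queue name -> binding pattern.
bindings = {
    "eu-orders": "orders.eu.*",
    "all-orders": "orders.#",
    "audit": "#",
}


def route(routing_key: str):
    # A message is copied to every queue whose binding pattern matches.
    return sorted(q for q, p in bindings.items() if topic_matches(p, routing_key))
```

With these bindings, `route("orders.eu.created")` reaches all three queues, while `route("payments.failed")` reaches only `audit`. In Kafka the equivalent filtering would live in the producer or consumer application.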
Performance: Honest Numbers
Per-node throughput (single broker, healthy hardware, typical message size):
| Metric | Kafka | RabbitMQ |
|---|---|---|
| Throughput | 100K – 1M+ msg/s/node | 20K – 50K msg/s/queue |
| Latency p50 | 2 – 5 ms | < 1 ms |
| Latency p99 | 10 – 50 ms | 5 – 20 ms |
| Max message size | 1 MB default (configurable) | Capped by max_message_size (128 MiB default in 3.x, 16 MiB since 4.0) |
| Consumer lag handling | Excellent (log replay) | Poor (messages are gone post-ack) |
For most teams, both are fast enough. The choice rarely comes down to raw throughput — it comes down to whether you need the log model.
Decision Guide
Choose Kafka when
- You build event-driven architectures with multiple independent consumers.
- You need replay — re-process events when a downstream service ships a new feature, or when you fix a bug.
- You ingest at high volumes (≥ 100K msg/s aggregate, or hundreds of GB/day).
- You feed analytics, data lake, or stream processing (Spark, Flink, ksqlDB).
- You want compacted topics for current-state snapshots (user profiles, balances).
- Your team is willing to invest in operating a distributed system (or pay for managed Confluent / MSK).
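The compacted-topic idea from the list above can be sketched as follows: keep only the most recent record per key, and treat a `None` value as a tombstone that deletes the key. This is a simplified model with made-up data; a real broker compacts in the background and keeps tombstones for a configurable period before dropping them.

```python
def compact(log):
    """Keep only the latest record per key; a None value (tombstone) deletes the key."""
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # later records overwrite earlier ones
    survivors = [(off, k, v) for k, (off, v) in latest.items() if v is not None]
    return sorted(survivors)  # original offsets are preserved, in log order


# Hypothetical balance updates keyed by user.
balance_log = [
    ("alice", 100),
    ("bob", 50),
    ("alice", 80),
    ("carol", 10),
    ("bob", None),   # tombstone: bob's record is deleted
    ("alice", 120),
]

snapshot = compact(balance_log)
```

The snapshot holds one record per live key, which is why a compacted topic can serve as a current-state table for things like profiles or balances.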
Choose RabbitMQ when
- You need task queues with workers competing for jobs.
- You need flexible routing — fanout, topic patterns, header matching.
- You need request/reply with AMQP correlation_id/reply_to.
- You need priority queues natively.
- Low latency matters more than raw throughput.
- Your scale is modest (tens of thousands of messages per second is the comfort zone).
- You want simpler operations — RabbitMQ clusters are smaller and less ceremonious than Kafka.
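The request/reply pattern mentioned in the list above can be sketched with plain in-memory queues. The queue names and helper functions are hypothetical, and the messages are dictionaries instead of real AMQP properties. The point is only how `reply_to` tells the server where to answer and `correlation_id` lets the client pair each answer with its request.

```python
import uuid
from collections import deque

# Hypothetical queue names; a real client would declare these on the broker.
queues = {"rpc.requests": deque(), "reply.client-1": deque()}


def send_request(body, reply_to):
    correlation_id = str(uuid.uuid4())
    queues["rpc.requests"].append(
        {"body": body, "reply_to": reply_to, "correlation_id": correlation_id}
    )
    return correlation_id


def serve_one(handler):
    # The server answers on the queue named by reply_to and echoes the id back.
    request = queues["rpc.requests"].popleft()
    queues[request["reply_to"]].append(
        {"body": handler(request["body"]), "correlation_id": request["correlation_id"]}
    )


def take_reply(reply_queue, correlation_id):
    # Find the reply that belongs to our request; leave other replies in place.
    for message in list(queues[reply_queue]):
        if message["correlation_id"] == correlation_id:
            queues[reply_queue].remove(message)
            return message["body"]
    return None


first = send_request(2, "reply.client-1")
second = send_request(5, "reply.client-1")
serve_one(lambda n: n * n)
serve_one(lambda n: n * n)

# Replies can be claimed in any order because each carries its correlation id.
second_answer = take_reply("reply.client-1", second)
first_answer = take_reply("reply.client-1", first)
```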
Use both when
- You have both event streaming and traditional task-queue workloads.
- You'd rather pick the right tool per workload than force one to do both.
This is more common than people think: Kafka for the data pipeline, RabbitMQ for task queues. They don't compete inside the same use case.
Common Mistakes
- Using Kafka as a task queue. Possible but awkward — Kafka's partition-based parallelism doesn't match work-queue semantics (any consumer takes any task).
- Using RabbitMQ as an event log. Possible (RabbitMQ Streams) but you're swimming against the grain — Kafka was built for this.
- Picking either based only on "we already have it." Operational simplicity is real, so default to what you already run, but switch when the use case truly requires the other.
- Sizing for throughput you'll never reach. A single Kafka broker handles hundreds of MB/s. Most "we need Kafka for throughput" claims overestimate by 10–100×.
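A quick back-of-the-envelope calculation helps with the sizing point above. The sketch converts a message rate into sustained bandwidth and daily volume; the 1 KB average message size and the example rates are assumptions for illustration, not measurements.

```python
def sustained_rate(msgs_per_sec: float, avg_bytes: int):
    """Return (MB per second, GB per day) for a steady message rate."""
    mb_per_sec = msgs_per_sec * avg_bytes / 1_000_000
    gb_per_day = mb_per_sec * 86_400 / 1_000  # 86,400 seconds in a day
    return mb_per_sec, gb_per_day


# Assumed 1 KB average message size in both cases.
modest = sustained_rate(2_000, 1_000)      # about 2 MB/s, roughly 173 GB/day
heavy = sustained_rate(100_000, 1_000)     # about 100 MB/s, roughly 8,640 GB/day
```

Under these assumptions, a workload of a couple of thousand messages per second already produces "hundreds of GB/day" while using only a few MB/s, well within what either system handles. Run the numbers for your own rate and message size before deciding that throughput is the deciding factor.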
Summary Table
| Criterion | Winner |
|---|---|
| Raw throughput | Kafka |
| Latency | RabbitMQ |
| Message replay | Kafka |
| Routing flexibility | RabbitMQ |
| Operational simplicity | RabbitMQ |
| Horizontal scaling | Kafka |
| Stream processing ecosystem | Kafka |
| Priority queues | RabbitMQ |
| Protocol breadth | RabbitMQ |
| Data-pipeline ecosystem | Kafka |
| Best default for "I just need a queue" | RabbitMQ |
| Best default for "I'm building an event-driven platform" | Kafka |