
The Queue


the problem

Service A calls Service B synchronously. If B is slow, A is slow. If B is down, A fails. If A produces work faster than B can consume it, requests pile up and both services degrade.

Synchronous communication creates tight coupling. Every service in the chain must be available and fast for the whole system to work. One slow service slows everything downstream.

A message queue breaks this coupling. A produces messages into the queue and moves on. B consumes messages from the queue at its own pace. If B is slow, messages accumulate in the queue instead of backing up into A. If B is down, messages wait in the queue until B recovers. A and B never need to be available at the same time.
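The decoupling can be sketched in-process with Python's `queue.Queue` standing in for a real broker (Kafka, RabbitMQ, SQS, and so on). The service names and message format here are illustrative, not from any real system:

```python
import queue
import threading
import time

# Stand-in for the broker: a bounded in-process queue.
# (A real system would use Kafka, RabbitMQ, SQS, etc.)
q = queue.Queue(maxsize=100)
processed = []

def service_a():
    # Producer: enqueues work and moves on; never waits for B.
    for i in range(5):
        q.put(f"request-{i}")

def service_b():
    # Consumer: drains at its own pace. If it is slow, messages
    # accumulate here in the queue instead of backing up into A.
    while True:
        msg = q.get()
        if msg is None:        # sentinel: shut down
            break
        time.sleep(0.01)       # simulate slow processing
        processed.append(msg)

t = threading.Thread(target=service_b)
t.start()
service_a()                    # returns immediately even though B is slow
q.put(None)
t.join()
```

The bounded `maxsize` is the backpressure knob: if the queue fills, the producer blocks rather than growing memory without limit, which is the in-process analogue of a broker's retention limits.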

point-to-point

The simplest pattern. One queue, one or more producers, one or more consumers. Each message is delivered to exactly one consumer. When multiple consumers are connected, messages are distributed among them (like a load balancer for work items).

Producer --> [Queue: M1, M2, M3] --> Consumer 1
                                 --> Consumer 2

M1 -> Consumer 1
M2 -> Consumer 2
M3 -> Consumer 1

Each message is processed once. If Consumer 1 fails to process M1, the message returns to the queue and is delivered to another consumer (or retried). After too many failures, the message goes to a dead-letter queue for investigation.

Point-to-point is ideal for work distribution: sending emails, processing payments, resizing images. The queue acts as a buffer between producers and workers. Add more workers to increase throughput.
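A minimal sketch of the pattern: two workers pulling from one queue, each message processed once, with a failing message redelivered and finally parked in a dead-letter queue. The retry limit, the "poison" message, and the worker names are all made up for illustration:

```python
import queue
import threading

MAX_ATTEMPTS = 3
work_q = queue.Queue()
dead_letter_q = queue.Queue()
results = []
lock = threading.Lock()

def worker(name):
    while True:
        item = work_q.get()
        if item is None:               # sentinel: shut down
            break
        msg, attempts = item
        try:
            if msg == "poison":        # hypothetical always-failing message
                raise ValueError("cannot process")
            with lock:
                results.append((name, msg))
        except ValueError:
            if attempts + 1 >= MAX_ATTEMPTS:
                dead_letter_q.put(msg)             # park for investigation
            else:
                work_q.put((msg, attempts + 1))    # redeliver
        finally:
            work_q.task_done()

workers = [threading.Thread(target=worker, args=(f"consumer-{i}",))
           for i in (1, 2)]
for t in workers:
    t.start()
for msg in ["M1", "M2", "M3", "poison"]:
    work_q.put((msg, 0))
work_q.join()                          # wait until every delivery is settled
for _ in workers:
    work_q.put(None)
for t in workers:
    t.join()
```

Adding a third worker thread is all it takes to raise throughput, which is the whole point of the pattern.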

pub-sub

Publish-subscribe inverts the model. Instead of one consumer getting each message, every subscriber gets every message. A producer publishes to a topic, and all subscribers receive a copy.

Producer --> [Topic: M1] --> Subscriber A (gets M1)
                         --> Subscriber B (gets M1)
                         --> Subscriber C (gets M1)

Pub-sub is ideal for event broadcasting: a user signs up, and the welcome email service, the analytics service, and the notification service all need to know. The producer publishes one event; each subscriber processes it independently.

The producer does not know (or care) how many subscribers exist. New subscribers can be added without modifying the producer. This is the decoupling that makes event-driven architectures work.
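The fan-out can be sketched with a toy in-process topic; the `Topic` class and the subscriber names are invented for this example, not a real library API:

```python
from collections import defaultdict

class Topic:
    """Minimal in-process pub-sub: every subscriber gets
    its own copy of every published message."""

    def __init__(self):
        self.subscribers = {}

    def subscribe(self, name, handler):
        # New subscribers attach without the producer changing.
        self.subscribers[name] = handler

    def publish(self, event):
        for handler in self.subscribers.values():
            handler(event)

received = defaultdict(list)
signup = Topic()
signup.subscribe("email",     lambda e: received["email"].append(e))
signup.subscribe("analytics", lambda e: received["analytics"].append(e))
signup.subscribe("notify",    lambda e: received["notify"].append(e))

# The producer publishes once and knows nothing about the subscribers.
signup.publish({"event": "user_signed_up", "user": "ada"})
```

Real brokers add durability on top of this shape: a subscriber that is offline still receives the message later, because the topic persists it rather than calling handlers directly.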

consumer groups

Consumer groups combine the best of point-to-point and pub-sub. A topic is divided into partitions. Each partition is consumed by exactly one member of the consumer group. Different consumer groups independently consume the same topic.

Topic with 3 partitions:
  P0 --> Consumer Group A: Consumer 1
  P1 --> Consumer Group A: Consumer 2
  P2 --> Consumer Group A: Consumer 3

  P0 --> Consumer Group B: Consumer 4
  P1 --> Consumer Group B: Consumer 4
  P2 --> Consumer Group B: Consumer 4

Within Group A, each partition goes to one consumer (parallel processing). Group B independently consumes all partitions (fan-out). This is the Kafka model: consumer groups enable both parallel consumption and independent subscriptions.

Partitions are the unit of parallelism. More partitions allow more consumers in a group. If you have 3 partitions, at most 3 consumers in a group can be active (a 4th consumer would be idle). Plan partition count based on expected peak parallelism.
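Both halves of the model (keyed partitioning and partition-to-consumer assignment) fit in a few lines. This is a simplified sketch of the convention, not Kafka's actual partitioner or rebalancing protocol:

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 3

def partition_for(key):
    # Messages with the same key always land in the same partition,
    # which preserves per-key ordering.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % NUM_PARTITIONS

def assign(partitions, consumers):
    # Round-robin assignment: each partition goes to exactly one
    # member of the group; surplus consumers get nothing.
    assignment = defaultdict(list)
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return dict(assignment)

group_a = assign(range(NUM_PARTITIONS), ["c1", "c2", "c3"])        # parallel
group_b = assign(range(NUM_PARTITIONS), ["c4"])                    # fan-out
group_c = assign(range(NUM_PARTITIONS), ["c1", "c2", "c3", "c4"])  # c4 idle
```

Note that `group_c` shows the idle-consumer case from above: with 3 partitions and 4 group members, the fourth consumer is assigned nothing.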

delivery guarantees

Three levels of delivery guarantee, each with different tradeoffs:

At-most-once: Fire and forget. The producer sends the message and does not wait for acknowledgment. Fast, but messages can be lost if the broker crashes before persisting.

At-least-once: The producer retries until the broker acknowledges. Messages are never lost, but duplicates are possible (the broker received the message but the ack was lost, so the producer retries). This is the most common guarantee.

Exactly-once: Each message is processed exactly once. The holy grail, but extremely expensive to implement. Requires idempotent producers, transactional consumers, and coordination between the queue and the processing logic. Kafka supports it within a single cluster. Across systems, true exactly-once is effectively impossible without idempotency on the consumer side.

In practice, design for at-least-once delivery and make consumers idempotent. An idempotent consumer produces the same result whether it processes a message once or twice. This is simpler and more robust than trying to guarantee exactly-once delivery.
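An idempotent consumer can be as simple as a dedup check on the message id. The payment handler and message shape here are invented for illustration; in production the seen-ids set would live in a database or Redis, and the dedup check would commit atomically with the side effect:

```python
processed_ids = set()        # in production: a DB table or Redis set
balance = {"acct-1": 0}

def handle_payment(msg):
    # Idempotent consumer: a redelivered duplicate is detected
    # by its message id and skipped, so at-least-once is safe.
    if msg["id"] in processed_ids:
        return               # duplicate: already applied
    balance[msg["acct"]] += msg["amount"]
    processed_ids.add(msg["id"])

payment = {"id": "msg-42", "acct": "acct-1", "amount": 100}
handle_payment(payment)
handle_payment(payment)      # broker redelivered after a lost ack
```

The second delivery is a no-op, so the balance is credited once no matter how many times the broker retries.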

where it shows up

  • Apache Kafka: Distributed log with partitioned topics, consumer groups, and strong ordering within partitions. The standard for high-throughput event streaming.
  • RabbitMQ: Traditional message broker with exchanges, queues, and routing. Supports AMQP protocol. Good for complex routing patterns and task queues.
  • Amazon SQS: Fully managed queue service. Standard queues (at-least-once, best-effort ordering) and FIFO queues (exactly-once, strict ordering). Scales automatically.
  • Apache Pulsar: Similar to Kafka but with built-in multi-tenancy, geo-replication, and tiered storage. Separates compute (brokers) from storage (BookKeeper).
  • Redis Streams: Lightweight, fast message stream built into Redis. Good for small-to-medium throughput where you already have Redis deployed.