The Consistency Spectrum

the problem

You write x = 5 to a distributed database. A millisecond later, you read x. What do you get?

On a single machine, the answer is obvious: 5. But in a distributed system with replicas, the write might have reached one replica but not another. Your read might hit a stale replica and return the old value. Or it might return 5. Or, depending on the system, it might return a value that was never written at all.

Consistency models are the contract between the system and the programmer. They define exactly which values a read is allowed to return, and when.

linearizability

The strongest practical model. Every operation appears to take effect at a single, instantaneous point between its start and end. Once a write completes, every subsequent read must reflect it: it returns that value or one written later, never an older one.

Think of it as a single global timeline where operations snap into place one at a time. Even though operations take real time (network round trips, disk writes), the system behaves as if each one happened at a single atomic instant.

Client A: |---W(x=1)---|
Client B:      |---R(x)---|  -> must return 1
                                (write completed before read started)

If a write finishes before a read starts, the read must see it. If they overlap, the read can return either the old or new value, but once any read returns the new value, all subsequent reads must also return it.

Linearizability is expensive. It typically requires coordination (consensus) on every operation. Spanner achieves it globally using GPS and atomic clocks (TrueTime). Most systems offer it only within a single partition or not at all.
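
To make the rule concrete, here is a minimal brute-force linearizability check for a single register, written in Python for this post. The Op type, the function name, and the two tiny histories are illustrative; production checkers such as Knossos or Porcupine use far smarter search.

from itertools import permutations
from typing import NamedTuple

class Op(NamedTuple):
    client: str
    kind: str     # "write" or "read"
    value: int    # value written, or value the read returned
    start: float  # invocation time
    end: float    # completion time

def is_linearizable(history: list[Op], initial: int = 0) -> bool:
    """Brute force: look for one total order that respects real time and
    register semantics. Exponential, so only suitable for tiny histories."""
    for order in permutations(history):
        # Real-time constraint: if a completed before b started,
        # a must appear before b in the chosen order.
        if any(a.end < b.start and order.index(a) > order.index(b)
               for a in order for b in order):
            continue
        # Register semantics: every read returns the latest write so far.
        current, legal = initial, True
        for op in order:
            if op.kind == "write":
                current = op.value
            elif op.value != current:
                legal = False
                break
        if legal:
            return True
    return False

# The timeline from the diagram above: A's write completes (t=1..2)
# before B's read starts (t=3..4), so the read must return 1.
good = [Op("A", "write", 1, 1.0, 2.0), Op("B", "read", 1, 3.0, 4.0)]
bad  = [Op("A", "write", 1, 1.0, 2.0), Op("B", "read", 0, 3.0, 4.0)]
print(is_linearizable(good))  # True
print(is_linearizable(bad))   # False: a stale read after the write completed

The search is exponential, which is why dedicated checkers exist at all; the takeaway is only that "linearizable" boils down to "some order consistent with real time explains every read".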

sequential consistency

Slightly weaker. Operations from each individual process appear in program order, but there is no guarantee about real-time ordering across processes.

Client A: W(x=1)  then  R(x)=1  (A's ops appear in program order)
Client B: W(x=2)  then  R(x)=2  (B's ops appear in program order)
Client C: R(x)=2  then  R(x)=1  -> valid if the global order puts B:W(2) before A:W(1)

Client C reading 2 then 1 looks wrong, but it is valid under sequential consistency. There exists a total ordering of all operations (B writes 2 and reads it, C reads 2, A writes 1 and reads it, C reads 1) that is consistent with each process's program order. C simply observes the writes in that global order.

But if C reads 1 then 2, it has committed to seeing A’s write before B’s. All of C’s future reads must be consistent with that ordering.
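
The "there exists a total ordering" claim can be checked mechanically. A small sketch, again brute force and purely illustrative (the tuple encoding and helper names are ours): drop the real-time constraint and only demand that some order preserves each client's program order.

from itertools import permutations

# The execution above, as (client, kind, value). Program order for each
# client is the order its operations appear in this list.
history = [
    ("A", "write", 1), ("A", "read", 1),
    ("B", "write", 2), ("B", "read", 2),
    ("C", "read", 2),  ("C", "read", 1),   # C sees 2, then 1
]

def respects_program_order(order, history):
    clients = {client for client, _, _ in history}
    return all([op for op in order if op[0] == c] ==
               [op for op in history if op[0] == c] for c in clients)

def reads_are_legal(order, initial=0):
    current = initial
    for _, kind, value in order:
        if kind == "write":
            current = value
        elif value != current:   # a read must return the latest write
            return False
    return True

def is_sequentially_consistent(history):
    return any(respects_program_order(order, history) and reads_are_legal(order)
               for order in permutations(history))

print(is_sequentially_consistent(history))  # True
# One witness order: B:W(2), B:R(2), C:R(2), A:W(1), A:R(1), C:R(1)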

eventual consistency

The weakest useful model. The only guarantee: if no new writes happen, all replicas will eventually converge to the same value. During propagation, reads can return any previously written value.

Write x=1 to Replica 1
Read from Replica 2:  x=0  (stale, but allowed)
Read from Replica 3:  x=0  (stale, but allowed)
... time passes, replication propagates ...
Read from Replica 2:  x=1  (caught up)
Read from Replica 3:  x=1  (caught up)

Eventual consistency is fast. Writes return immediately after hitting one replica. Reads are local. No coordination, no waiting for quorum. The price: your application must handle stale data.
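
A toy version of that trace in Python. Everything here (the class names, the last-writer-wins merge in anti_entropy) is a simplification for illustration, not any particular database's replication protocol.

import random

class Replica:
    """Holds a value plus the logical timestamp of the write that set it."""
    def __init__(self):
        self.value, self.version = 0, 0

class EventuallyConsistentStore:
    def __init__(self, n=3):
        self.replicas = [Replica() for _ in range(n)]
        self.clock = 0

    def write(self, value):
        # Acknowledged as soon as one replica has it; no coordination.
        self.clock += 1
        r = random.choice(self.replicas)
        r.value, r.version = value, self.clock

    def read(self, i):
        # Reads are local to one replica and may be stale.
        return self.replicas[i].value

    def anti_entropy(self):
        # Background convergence: last-writer-wins merge across replicas.
        latest = max(self.replicas, key=lambda rep: rep.version)
        for rep in self.replicas:
            rep.value, rep.version = latest.value, latest.version

store = EventuallyConsistentStore()
store.write(1)
print([store.read(i) for i in range(3)])  # e.g. [0, 1, 0]: stale reads allowed
store.anti_entropy()                      # replication catches up
print([store.read(i) for i in range(3)])  # [1, 1, 1]: converged

Notice what is missing: no quorum, no waiting on other replicas. That is where both the speed and the staleness come from.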

the spectrum

From strongest to weakest:

  1. Linearizable: Real-time ordering. Gold standard. Highest latency.
  2. Sequential: Program order per process. No real-time guarantees.
  3. Causal: Operations that are causally related maintain their order. Concurrent operations can be seen in any order (see the vector-clock sketch after this list).
  4. Eventual: Convergence only. No ordering guarantees during propagation.
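
Causal consistency needs a way to decide whether two operations are causally related or merely concurrent; vector clocks (or similar version vectors) are the usual mechanism. A system-agnostic sketch of the comparison, with function names of our own choosing:

def happens_before(a: dict, b: dict) -> bool:
    """a -> b iff every component of a is <= the same component of b, and a != b."""
    keys = a.keys() | b.keys()
    return all(a.get(k, 0) <= b.get(k, 0) for k in keys) and a != b

def concurrent(a: dict, b: dict) -> bool:
    return not happens_before(a, b) and not happens_before(b, a)

# B writes after seeing A's write: causally related, the order must be preserved.
va = {"A": 1}
vb = {"A": 1, "B": 1}
print(happens_before(va, vb))  # True

# Two writes issued independently on A and B: concurrent, any order is fine.
vc, vd = {"A": 2}, {"B": 2}
print(concurrent(vc, vd))      # True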

Each step down trades correctness for performance. Linearizability requires coordination on every operation. Eventual consistency requires none. Most real systems land somewhere in between, offering tunable consistency or different guarantees for different operations.
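
"Tunable" usually comes down to quorum arithmetic: with N replicas, if a write waits for W acknowledgements and a read contacts R replicas, then R + W > N guarantees the two quorums overlap, so the read sees the latest acknowledged write. A deliberately simplified sketch (the QuorumStore class is hypothetical; no failures, no read repair):

class Replica:
    def __init__(self):
        self.value, self.version = 0, 0

class QuorumStore:
    """n replicas; the caller chooses how many copies a write or read touches.
    R + W > n guarantees the read quorum overlaps the latest write quorum."""
    def __init__(self, n=3):
        self.replicas = [Replica() for _ in range(n)]
        self.clock = 0

    def write(self, value, w):
        self.clock += 1
        for rep in self.replicas[:w]:        # wait for w copies to acknowledge
            rep.value, rep.version = value, self.clock
        # the remaining replicas would be updated asynchronously

    def read(self, r):
        contacted = self.replicas[-r:]       # contact r replicas (here: the last r)
        return max(contacted, key=lambda rep: rep.version).value

store = QuorumStore(n=3)
store.write(42, w=2)
print(store.read(r=2))  # 42: R + W = 4 > 3, quorums overlap, newest version wins
print(store.read(r=1))  # 0 here: the one replica contacted missed the write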

where it shows up

  • Spanner: Linearizable reads and writes globally, using TrueTime (GPS + atomic clocks) to bound clock uncertainty. The gold standard for strong consistency at global scale.
  • PostgreSQL: Serializable isolation (close to linearizable) within a single node. Streaming replication to replicas is asynchronous by default (eventual for reads from replicas).
  • Cassandra: Tunable consistency. QUORUM reads and writes give you something close to linearizable. ONE gives you eventual consistency with lower latency.
  • DynamoDB: Eventually consistent reads by default. Strongly consistent reads available at 2x the cost (reads from the leader replica). A boto3 sketch follows this list.
  • Redis: Single-threaded, so linearizable on a single node. Replication is asynchronous, so reads from replicas are eventually consistent.
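
To show how the DynamoDB knob from the list surfaces in application code, here is a boto3 sketch; the table name and key are hypothetical, ConsistentRead is the actual request parameter.

import boto3

table = boto3.resource("dynamodb").Table("users")  # hypothetical table

# Default: eventually consistent, cheaper, may return a stale item.
stale_ok = table.get_item(Key={"user_id": "42"})

# Strongly consistent: reflects all prior successful writes, at 2x the
# read-capacity cost.
fresh = table.get_item(Key={"user_id": "42"}, ConsistentRead=True)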