The “Consistency, Not Accuracy” Principle

Preamble: Read my post “The CAP Theorem Distilled”

In my previous post, I started talking about the “Consistency, Not Accuracy” Principle (a.k.a. the CNA Principle).

Essentially, in order to scale your web site and keep it running amidst unpredictable network and system outages, you need a replicated, fault-tolerant data store that accepts reads and writes in multiple locations. One replica might be in California and another might be in Virginia. If California were to fall into the great Pacific, your web site should still work and your users should be none the wiser.


The CAP Theorem Distilled

Anyone who has studied Distributed Computing in a graduate school computer science course appreciates that one of the hardest problems to solve is that of keeping writes in multiple locations synchronized. Are the writes synchronous or asynchronous? If the latter, what is the replication delay in the average and worst cases? How are write-inconsistencies resolved in the face of parallel asynchronous writes? Can you ensure that the last write wins? How?

If the writes are synchronous, then how do you deal with the cost to availability, write throughput, and latency?
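To make the "last write wins" question concrete, here is a minimal Python sketch of two replicas that accept writes independently and reconcile them asynchronously. Everything in it is an assumption for illustration, not something from a particular data store: the `Write` and `Replica` classes, the `sync_from` method, and the choice to break timestamp ties by replica id are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    timestamp: float  # writer's clock reading (assumed; real clocks can be skewed)
    replica_id: str   # hypothetical tie-breaker when timestamps are equal


def last_write_wins(a: Write, b: Write) -> Write:
    # Pick the write with the larger (timestamp, replica_id) pair.
    return max(a, b, key=lambda w: (w.timestamp, w.replica_id))


class Replica:
    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.store: dict[str, Write] = {}

    def write(self, key: str, value: str, timestamp: float) -> None:
        # Accept a local write without contacting any other replica.
        self.apply(key, Write(value, timestamp, self.replica_id))

    def apply(self, key: str, incoming: Write) -> None:
        # Keep whichever of the current and incoming writes is "later".
        current = self.store.get(key)
        self.store[key] = (
            incoming if current is None else last_write_wins(current, incoming)
        )

    def sync_from(self, other: "Replica") -> None:
        # Asynchronous replication step: pull the other replica's writes.
        for key, incoming in other.store.items():
            self.apply(key, incoming)

    def read(self, key: str) -> str:
        return self.store[key].value
```

If a California replica and a Virginia replica each take a write to the same key and then call `sync_from` on each other, both end up holding the write with the larger timestamp. Until that sync happens, the two replicas return different values, which is the replication delay asked about above. The sketch also trusts the writers' clocks: if one clock runs ahead, its writes win even when they actually happened earlier.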

The CAP theorem states that a distributed system can guarantee at most 2 of ‘C’, ‘A’, and ‘P’ at the same time. ‘C’ refers to strong consistency, ‘A’ refers to availability, and ‘P’ refers to network partition tolerance.


About Me

A blog describing my work in building websites that millions of people visit. I'm a senior member of LinkedIn's Distributed Data Systems team. I previously held technical and leadership roles at Netflix, Etsy, eBay & Siebel Systems.