Brewer’s CAP theorem

… because everything in life is a choice

choices

In the last days I’m involved in some architectural decisions about how our data is / will be managed in a distributed architectural model. During a meeting I quoted the CAP theorem and some people didn’t know about it. So I decided to write here and share an overview of my understanding.

Backing to 2000 at Principles of Distributed Computing ACM Symposium, Eric Brewer said:

as applications become more web-based we should stop worrying about data consistency, because if we want high availability in these new distributed applications, then guaranteed consistency of data is something we cannot have, thus giving anyone with three servers and a keen eye for customer experience permission to start an internet scale business.

* here is his keynote speech; In 2002 the theorem was officially proved by MIT.

Basically, the theorem is about trade-off decisions which need to be made in highly scalable / distributed systems and that it is impossible for a web service to provide the following three requirements at the same time: Consistency, Availability and Partition tolerance.

Consistency

“Do or do not, there is no try” – Yoda.

Looking MIT paper they use the word “atomic” instead of consistent because, roughly speaking, consistent is the C in ACID as applied to the properties of database transactions and means that data will never be persisted if something goes wrong.

Availability

“The probability that the system is operating at a specified time t.” – Blanchard

The client should be able to retrieve the stored data in a distributed system no matter what happens under the hood. If the client makes a request then he must be able to get a response from the system whether a node (or N nodes) in the cluster are down.

Partition tolerance

“No set of failures less than total network failure is allowed to cause the system to respond incorrectly” – Gilbert & Lynch

Means that the data is network partitioned (aka cluster) and if the network stops delivering messages between two sets of partitions the system will continue to work correctly. Data must be available even when there is a “partition” breaks between nodes. Michael Stonebraker says: “If there is a network failure that splits the processing nodes into two groups that cannot talk to each other, then the goal would be to allow processing to continue in both subgroups”.

CAP in practice

Ok, show me the code! CAP theorem is at the base of developing distributed system, that means I need an extra effort to reproduce something worth. My idea here is not describe such a complex system at the code level. So, I’m gonna show a brilliant data store comparison (I found on web) and how they handle CAP’s trade off trying to elucidate the concept.

cap

Based on CAP model, let’s take some market examples:

Consistent + Available (CA) = problems with partitions and the typically solution is replication. Example: traditional RDBMSs like Postgres, MySQL, Oracle, etc;

Consistent + Partition-Tolerant (CP) = problems with availability considering the cost to keeping data consistent across nodes. Examples: MongoDBBigTableTerrastore, Hypertable, HBaseRedisMemcacheDB.

Available, Partition-Tolerant (AP) = problems with consistency perhaps can achieve “eventual consistency” through replication and verification. Examples: CassandraCouchDBRiak.

Eric Brewer recently wrote a topic of his theorem titled CAP Twelve Years Later: How the “Rules” Have Changed. The theorem still strong considering the approach and amplitude of distributed systems these days (cloud, database, data centers, …). The tradeoff between latency, consistency and availability are deep-seated.

Lastly, there are many discussions about how true is CAP (maybe next post), anyway I do believe it is a powerful tool in order to understand the challenges of distributed systems and the designed behavior of specific systems.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s