• The CAP Theorem. Source: Brewer 2000.
    image
  • Comparing FLP with CAP. Source: Dinh 2016.
    image
  • When partitioned, the service becomes inconsistent though still available. Source: Whittaker 2014.
    image

CAP Theorem

userimg
lokesh.rawat
799 DevCoins
userimg
arvindpdmn
4 DevCoins
2 authors have contributed to this article
Last updated by arvindpdmn
on 2019-06-10 02:44:24
Created by lokesh.rawat
on 2019-05-28 08:20:34
Improve this article. Show messages

Summary

 image
The CAP Theorem. Source: Brewer 2000.

A well-design cloud-based application often stores its data across multiple servers. For faster response, data is often stored closer to clients in that geography. Due to the distributed nature of this system, it's impossible to design a perfect system. The network may be unreliable or slow at times. Therefore, there are trade-offs to be made. CAP Theorem gives system designers a method to think through and evaluate the trade-offs at the design stage.

The three parts of the CAP Theorem are Consistency, Availability, and Partition Tolerance. The theorem states that it's impossible to guarantee all three in a distributed data store. We can meet any two of them but not all three.

Over the years, designers have misinterpreted the CAP Theorem. To reflect read-world scenarios, modifications to the theorem have been proposed.

Milestones

1985
image

Originally presented in 1983, researchers Fisher, Lynch and Paterson (FLP) show that distributed consensus is impossible in a fault-tolerant manner in an asynchronous system. Distributed consensus is related to the problem of atomic storage addressed by CAP Theorem.

1996

Before CAP Theorem is formalized, researchers have been working on similar ideas. One example is the paper titled Trading Consistency for Availability in Distributed Systems by two researchers at Cornell University.

2000

Eric Brewer presents the CAP Theorem at the 19th Annual ACM Symposium on Principles of Distributed Computing (PODC). It's early history can be traced to 1998 and first published in 1999. Brewer points out that distributed computing has unduly focused on computation and not on data.

2002

MIT researchers Seth Gilbert and Nancy Lynch offer a formal proof of the CAP Theorem in their paper titled Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services. In asynchronous systems, the impossibility result is strong. In partially synchronous systems, we can achieve a practical compromise between consistency and availability.

2010

Daniel Abadi proposes the PACELC Theorem as an alternative to the CAP Theorem. When data is replicated, there's a trade-off between latency and consistency. PACELC makes this explicit: during partitions (P), trade-off is AC; else, trade-off is LC. Default versions of Dynamo, Cassandra, and Riak are PA/EL systems. VoltDB/H-Store and Megastore are PC/EC. MongoDB is PA/EC.

Discussion

  • What's the definition of CAP Theorem?

    A formal definition of CAP Theorem is, "It is impossible in the asynchronous network model to implement a read/write data object that guarantees the following properties: availability, atomic consistency, in all fair executions (including those in which messages are lost)".

    A simplified definition states that,

    In a network subject to communication failures, it is impossible for any web service to implement an atomic read/write shared memory that guarantees a response to every request.

    The word "atomic" used above means that although it's a distributed system, requests are modelled as if they are executing on a single node. This gives us an easy model for consistency.

  • Could you explain the CAP Theorem?
     image
    When partitioned, the service becomes inconsistent though still available. Source: Whittaker 2014.

    The parts of the CAP Theorem can be understood as follows:

    • Consistency: When a request is made, the server returns the right response. What is "right" depends on the service. For example, reading a value from a database might mean that the most recent write to that value should be returned.
    • Availability: A request always receives a response from the server. No constraint is placed on how quickly the response must be received.
    • Partition Tolerance: The underlying network is not reliable and servers may get partitioned into non-communicating groups. Despite this, the service should continue to work as desired.

    As an example, consider two nodes G1 and G2 that have been partitioned. A client changes a value from v0 to v1 on G1. However, the same value is not updated on G2 due to the partition. Hence, when G2 is queried it returns the old value v0. Thus, the service is available but not consistent.

    Sometimes the terms safety (consistency) and liveness (availability) are used in the generalized sense. Safety means "nothing bad ever happens". Liveness means "eventually something good happens".

  • What's the implication of the CAP Theorem when designing distributed systems?

    When CAP Theorem was proposed, the understanding was that system designers had three options:

    • CA Systems: Sacrifice partition tolerance. Single-site or cluster databases using two-phase commit are examples.
    • CP Systems: Sacrifice availability. If there's a partition, for consistency we make the service unavailable: return a timeout error or lock operations.
    • AP Systems: Sacrifice consistency. If there's a partition, we continue accepting requests but reconcile them later (writes) or return stale values (reads).

    In practice, we deal with network partitions at least some of the time. The choice is really between consistency and availability. For databases, consistency can be achieved by enabling reads after completing writes on several nodes. Availability can be achieved by replicating data across nodes. In fact, permanent partitions are rare. So the choice is temporary.

    Designers don't have to give up one of the three to build a distributed system. In fact, it's possible to have all three under normal network conditions. There's trade-off only when the network is partitioned. It's also helpful to think probabilistically. We can design a CA system if probability of a partition is far less than that of other systemic failures.

  • Could you share real-world applications of the CAP Theorem?

    Databases that follow ACID (Atomicity, Consistency, Isolation, and Durability) give priority to consistency. However, NoSQL distributed databases prefer availability over consistency since availability is often part of commercial service guarantees. So caching and logging were used for eventual consistency. This leads to what we call BASE (Basically Available, Soft state, and Eventually consistent). As examples, Zookeeper prefers consistency while Amazon's Dynamo prefers availability.

    Maintaining consistency over a wide area network increases latency. Therefore, Yahoo's PNUTS system is inconsistent because it maintains remote copies asynchronously. A particular user's data is partitioned locally and accessed with low latency. Facebook's prefers to update a non-partitioned master copy. User has a more local but potentially stale copy, until it gets updated.

    A web browser can go offline if it loses connection to the server. The web app can fall back to on-client persistent storage. Hence, availability is preferred over consistency to sustain long partitions. Likewise, Akamai's web caching offers best effort consistency with high level of availability.

    In Google, primary partition usually resides within one datacenter, where both availability and consistency can be maintained. Outside this partition, service becomes unavailable.

  • What are some criticisms of the CAP Theorem?

    Although the Theorem doesn't specify an upper bound on response time for availability, in practice, there's exists a timeout. CAP Theorem ignores latency, which is an important consideration in practice. Timeouts are often implemented in services. During a partition, if we cancel a request, we maintain consistency but forfeit availability. In fact, latency can be seen as another word for availability.

    In NoSQL distributed databases, CAP Theorem has led to the belief that eventual consistency provides better availability than strong consistency. Some believe this is an outdated notion. It's better to factor in sensitivity to network delays.

    CAP Theorem suggests a binary decision. In reality, it's a continuum. There are different degrees of consistency implemented via "read your writes, monotonic reads and causal consistency".

References

  1. Abadi, Daniel. 2010. "Problems with CAP, and Yahoo’s little known NoSQL system." DBMS Musings, April 23. Accessed 2019-06-09.
  2. Abadi, Daniel J. 2012. "Consistency Tradeoffs in Modern Distributed Database System Design." IEEE Computer Society, pp. 37-42, February. Accessed 2019-05-28.
  3. Birman, Ken and Roy Friedman. 1996. "Trading Consistency for Availability in Distributed Systems." Technical Report TR96-1579, Cornell University, April. Accessed 2019-06-07.
  4. Brewer, Eric. 2000. "Towards Robust Distributed Systems." PODC Keynote, July 19. Accessed 2019-06-09.
  5. Brewer, Eric. 2012. "CAP Twelve Years Later: How the 'Rules' Have Changed." InfoQ, May 30. Accessed 2019-05-28.
  6. Brooker, Marc. 2014. "CAP and PACELC: Thinking More Clearly About Consistency." Blog, July 16. Accessed 2019-05-28.
  7. Dinh, Anh. 2016. "History of the Impossibles - CAP and FLP." July 15. Accessed 2019-05-28.
  8. Fischer, Michael J., Nancy A. Lynch, and Michael S. Paterson. 1985. "Impossibility of Distributed Consensus with One Faulty Process." Journal of the Association for Computing Machinery, vol. 32, no. 2, pp. 374-382, April. Accessed 2019-05-28.
  9. Gilbert, Seth and Nancy Lynch. 2002. "Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services." ACM SIGACT News, vol. 33, no. 2, pp. 51-59, June. Accessed 2019-06-09.
  10. Gilbert, Seth and Nancy A. Lynch. 2012. "Perspectives on the CAP Theorem." Computer, vol. 45, no. 2, pp. 30-36, February. Accessed 2019-05-28.
  11. Greiner, Robert. 2014. "CAP Theorem: Revisited." August 14. Accessed 2019-05-28.
  12. Kleppmann, Martin. 2015. "A Critique of the CAP Theorem." arXiv, September 18. Accessed 2019-05-28.
  13. Messinger, Lior. 2013. "Better explaining the CAP Theorem." DZone, February 17. Accessed 2019-06-09.
  14. Robinson, Henry. 2019. "The CAP FAQ." The Paper Trail. Accessed 2019-05-28.
  15. Whittaker, Michael. 2014. "An Illustrated Proof of the CAP Theorem." Blog on GitHub.io, August 16. Accessed 2019-05-28.

Milestones

1985
image

Originally presented in 1983, researchers Fisher, Lynch and Paterson (FLP) show that distributed consensus is impossible in a fault-tolerant manner in an asynchronous system. Distributed consensus is related to the problem of atomic storage addressed by CAP Theorem.

1996

Before CAP Theorem is formalized, researchers have been working on similar ideas. One example is the paper titled Trading Consistency for Availability in Distributed Systems by two researchers at Cornell University.

2000

Eric Brewer presents the CAP Theorem at the 19th Annual ACM Symposium on Principles of Distributed Computing (PODC). It's early history can be traced to 1998 and first published in 1999. Brewer points out that distributed computing has unduly focused on computation and not on data.

2002

MIT researchers Seth Gilbert and Nancy Lynch offer a formal proof of the CAP Theorem in their paper titled Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services. In asynchronous systems, the impossibility result is strong. In partially synchronous systems, we can achieve a practical compromise between consistency and availability.

2010

Daniel Abadi proposes the PACELC Theorem as an alternative to the CAP Theorem. When data is replicated, there's a trade-off between latency and consistency. PACELC makes this explicit: during partitions (P), trade-off is AC; else, trade-off is LC. Default versions of Dynamo, Cassandra, and Riak are PA/EL systems. VoltDB/H-Store and Megastore are PC/EC. MongoDB is PA/EC.

Tags

See Also

  • Distributed Computing
  • ACID Transactions
  • CRUD
  • Two-Phase Commit Protocol
  • Command Query Responsibility Segregation
  • Byzantine Generals Problem

Further Reading

  1. Brewer, Eric. 2012. "CAP Twelve Years Later: How the 'Rules' Have Changed." InfoQ, May 30. Accessed 2019-05-28.
  2. Gilbert, Seth and Nancy A. Lynch. 2012. "Perspectives on the CAP Theorem." Computer, vol. 45, no. 2, pp. 30-36, February. Accessed 2019-05-28.
  3. Abadi, Daniel J. 2012. "Consistency Tradeoffs in Modern Distributed Database System Design." IEEE Computer Society, pp. 37-42, February. Accessed 2019-05-28.
  4. Robinson, Henry. 2019. "The CAP FAQ." The Paper Trail. Accessed 2019-05-28.

Article Stats

Author-wise Stats for Article Edits

Author
No. of Edits
No. of Chats
DevCoins
2
0
799
1
0
4
1210
Words
0
Chats
3
Edits
0
Likes
147
Hits

Cite As

Devopedia. 2019. "CAP Theorem." Version 3, June 10. Accessed 2019-06-27. https://devopedia.org/cap-theorem