In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that any distributed data store can provide only two of the following three guarantees:
[Seth Gilbert and Nancy Lynch, "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services", ''ACM SIGACT News'', Volume 33, Issue 2 (2002), pg. 51–59.]
; Consistency
: Every read receives the most recent write or an error.
; Availability
: Every request receives a (non-error) response, without the guarantee that it contains the most recent write.
; Partition tolerance
: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.
When a network partition failure happens, it must be decided whether to do one of the following:
* cancel the operation and thus decrease the availability but ensure consistency
* proceed with the operation and thus provide availability but risk inconsistency.
Thus, if there is a network partition, one has to choose between consistency and availability. Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in ACID database transactions.
Eric Brewer argues that the often-used "two out of three" concept can be somewhat misleading, because system designers need only sacrifice consistency or availability in the presence of partitions, and in many systems partitions are rare.
Explanation
No distributed system is safe from network failures; thus, network partitioning generally has to be tolerated. In the presence of a partition, one is then left with two options: consistency or availability. When choosing consistency over availability, the system will return an error or a timeout if particular information cannot be guaranteed to be up to date due to network partitioning. When choosing availability over consistency, the system will always process the query and try to return the most recent available version of the information, even if it cannot guarantee it is up to date due to network partitioning.
In the absence of a partition, both availability and consistency can be satisfied.
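The two choices described above can be illustrated with a toy replica. This is a minimal sketch rather than a real data store: the names `Replica`, `Mode`, `Unavailable`, and the `partitioned` flag are hypothetical, and a real system would infer a partition from timeouts or failed quorum rounds rather than from a boolean.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    CP = "consistency over availability"
    AP = "availability over consistency"


class Unavailable(Exception):
    """Raised when a CP replica cannot guarantee an up-to-date answer."""


@dataclass
class Replica:
    mode: Mode
    value: str = "v1"          # most recent version known to this replica
    partitioned: bool = False  # assumed flag: True when peers are unreachable

    def read(self) -> str:
        if self.partitioned and self.mode is Mode.CP:
            # Freshness cannot be confirmed: return an error instead of stale data.
            raise Unavailable("cannot confirm latest write during partition")
        # AP, or no partition: answer with the most recent version known locally.
        return self.value

    def write(self, new_value: str) -> None:
        if self.partitioned and self.mode is Mode.CP:
            # Cancel the operation: consistency is kept, availability is reduced.
            raise Unavailable("cannot replicate write during partition")
        # AP, or no partition: accept the write. Under a partition,
        # replicas on the other side may now hold a different value.
        self.value = new_value
```

With `partitioned=False` both modes behave identically, which mirrors the observation that availability and consistency can both be satisfied when there is no partition. Only when `partitioned=True` do the two modes diverge.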
Database systems designed with traditional ACID guarantees in mind, such as RDBMS, choose consistency over availability, whereas systems designed around the BASE philosophy, common in the NoSQL movement for example, choose availability over consistency.
History
According to University of California, Berkeley computer scientist Eric Brewer, the theorem first appeared in autumn 1998. [Eric Brewer, "CAP twelve years later: How the 'rules' have changed", ''Computer'', Volume 45, Issue 2 (2012), pg. 23–29.] It was published as the CAP principle in 1999 [Armando Fox and Eric Brewer, "Harvest, Yield and Scalable Tolerant Systems", ''Proc. 7th Workshop Hot Topics in Operating Systems (HotOS 99)'', IEEE CS, 1999, pg. 174–178.] and presented as a conjecture by Brewer at the 2000 Symposium on Principles of Distributed Computing (PODC). [Eric Brewer, "Towards Robust Distributed Systems"] In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's conjecture, rendering it a theorem.
In 2012, Brewer clarified some of his positions, including why the often-used "two out of three" concept can be somewhat misleading: system designers only need to sacrifice consistency or availability in the presence of partitions, and partition management and recovery techniques exist. Brewer also noted the different definition of consistency used in the CAP theorem relative to the definition used in ACID.
A similar theorem stating the trade-off between consistency and availability in distributed systems was published by Birman and Friedman in 1996. Birman and Friedman's result restricted this lower bound to non-commuting operations.
The PACELC theorem, introduced in 2010, builds on CAP by stating that even in the absence of partitioning, there is another trade-off between latency and consistency. PACELC means: if a partition (P) happens, the trade-off is between availability (A) and consistency (C); else (E), the trade-off is between latency (L) and consistency (C).
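The PACELC reading can be written down as a small lookup. The following is an illustrative sketch only: `PacelcProfile`, its field names, and the `prioritizes` method are hypothetical, and the "PA/EL"-style label is simply a compact way of recording the two choices.

```python
from dataclasses import dataclass

_NAMES = {"A": "availability", "C": "consistency", "L": "latency"}


@dataclass(frozen=True)
class PacelcProfile:
    on_partition: str  # "A" (availability) or "C" (consistency)
    otherwise: str     # "L" (latency) or "C" (consistency)

    def __post_init__(self) -> None:
        # Reject letters that are not valid choices for each branch.
        if self.on_partition not in ("A", "C"):
            raise ValueError("on_partition must be 'A' or 'C'")
        if self.otherwise not in ("L", "C"):
            raise ValueError("otherwise must be 'L' or 'C'")

    def label(self) -> str:
        # For example "PA/EL": if Partition then Availability, Else Latency.
        return f"P{self.on_partition}/E{self.otherwise}"

    def prioritizes(self, partitioned: bool) -> str:
        # Pick the branch that applies to the current network state.
        choice = self.on_partition if partitioned else self.otherwise
        return _NAMES[choice]
```

For instance, `PacelcProfile("A", "L").prioritizes(partitioned=False)` returns `"latency"`: with no partition, such a system gives up some consistency in exchange for faster responses.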
Blockchain technology often sacrifices immediate consistency for availability and partition tolerance. By requiring a specific number of "confirmations", blockchain consensus algorithms are basically reduced to eventual consistency. [Bashir, Imran. (2018). ''Mastering blockchain''. Birmingham, England: Packt Publishing. p. 41.]
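The confirmation rule can be sketched as follows. This assumes the common counting convention in which the block containing a transaction counts as its first confirmation; the function names and the default threshold of 6 are illustrative assumptions, not values taken from the cited source.

```python
def confirmations(tx_height: int, tip_height: int) -> int:
    """Count blocks from the transaction's block up to the chain tip, inclusive."""
    if tip_height < tx_height:
        # The transaction's block is not part of this view of the chain.
        return 0
    return tip_height - tx_height + 1


def is_settled(tx_height: int, tip_height: int, required: int = 6) -> bool:
    """Treat a transaction as settled once it has `required` confirmations."""
    return confirmations(tx_height, tip_height) >= required
```

Before the threshold is reached, a client that accepts the transaction is relying on a value that other nodes may still replace, which is why the guarantee is eventual rather than immediate.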
See also
* Fallacies of distributed computing
* PACELC theorem
* Paxos (computer science)
* Raft (computer science)
* Zooko's triangle
References
External links
* CAP Twelve Years Later: How the "Rules" Have Changed, Brewer's 2012 article on conflict-free replicated data types (CRDT)
* Spanner, TrueTime and the CAP Theorem
* A Critique of the CAP Theorem
* Kleppmann's 2015 blog post corresponding with the publication of "A Critique of the CAP Theorem"