HOME

TheInfoList



OR:

In
database theory Database theory encapsulates a broad range of topics related to the study and research of the theoretical realm of databases and database management systems. Theoretical aspects of data management include, among other areas, the foundations of q ...
, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that any distributed data store can provide at most two of the following three guarantees: ;
Consistency In deductive logic, a consistent theory is one that does not lead to a logical contradiction. A theory T is consistent if there is no formula \varphi such that both \varphi and its negation \lnot\varphi are elements of the set of consequences ...
: Every read receives the most recent write or an error. Note that consistency as defined in the CAP theorem is quite different from the consistency guaranteed in
ACID An acid is a molecule or ion capable of either donating a proton (i.e. Hydron, hydrogen cation, H+), known as a Brønsted–Lowry acid–base theory, Brønsted–Lowry acid, or forming a covalent bond with an electron pair, known as a Lewis ...
database transactions. ; Availability: Every request received by a non-failing node in the system must result in a response. This is the definition of availability in CAP theorem as defined by Gilbert and Lynch. Note that availability as defined in CAP theorem is different from
high availability High availability (HA) is a characteristic of a system that aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period. There is now more dependence on these systems as a result of modernization ...
in software architecture. ; Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes. When a
network partition A network partition is a division of a computer network into relatively independent subnets, either by design, to optimize them separately, or due to the failure of network devices. Distributed software must be designed to be partition-tolerant, ...
failure happens, it must be decided whether to do one of the following: * cancel the operation and thus decrease the availability but ensure consistency * proceed with the operation and thus provide availability but risk inconsistency. Note this doesn't necessarily mean that system is highly available to its users. Thus, if there is a network partition, one has to choose between consistency or availability.


Explanation

No distributed system is safe from network failures, thus network partitioning generally has to be tolerated. In the presence of a partition, one is then left with two options: consistency or availability. When choosing consistency over availability, the system will return an error or a time out if particular information cannot be guaranteed to be up to date due to network partitioning. When choosing availability over consistency, the system will always process the query and try to return the most recent available version of the information, even if it cannot guarantee it is up to date due to network partitioning. In the absence of a network partition, both availability and consistency can be satisfied. Database systems designed with traditional
ACID An acid is a molecule or ion capable of either donating a proton (i.e. Hydron, hydrogen cation, H+), known as a Brønsted–Lowry acid–base theory, Brønsted–Lowry acid, or forming a covalent bond with an electron pair, known as a Lewis ...
guarantees in mind such as RDBMS choose
consistency In deductive logic, a consistent theory is one that does not lead to a logical contradiction. A theory T is consistent if there is no formula \varphi such that both \varphi and its negation \lnot\varphi are elements of the set of consequences ...
over availability, whereas systems designed around the BASE philosophy, common in the
NoSQL NoSQL (originally meaning "Not only SQL" or "non-relational") refers to a type of database design that stores and retrieves data differently from the traditional table-based structure of relational databases. Unlike relational databases, which ...
movement for example, choose availability over consistency. Some cloud services choose strong consistency but use worldwide private fiber networks and GPS clock synchronization to minimize the frequency of network partitions. Finally, consistent shared-nothing architectures may use techniques such as geographic sharding to maintain availability of data owned by the queried node, but without being available for arbitrary requests during a network partition.


History

According to computer scientist Eric Brewer of the
University of California, Berkeley The University of California, Berkeley (UC Berkeley, Berkeley, Cal, or California), is a Public university, public Land-grant university, land-grant research university in Berkeley, California, United States. Founded in 1868 and named after t ...
, the theorem first appeared in autumn 1998. It was published as the CAP principle in 1999 and presented as a
conjecture In mathematics, a conjecture is a conclusion or a proposition that is proffered on a tentative basis without proof. Some conjectures, such as the Riemann hypothesis or Fermat's conjecture (now a theorem, proven in 1995 by Andrew Wiles), ha ...
by Brewer at the 2000 Symposium on Principles of Distributed Computing (PODC). In 2002, Seth Gilbert and Nancy Lynch of MIT published a formal proof of Brewer's conjecture, rendering it a
theorem In mathematics and formal logic, a theorem is a statement (logic), statement that has been Mathematical proof, proven, or can be proven. The ''proof'' of a theorem is a logical argument that uses the inference rules of a deductive system to esta ...
. In 2012, Brewer clarified some of his positions, including why the often-used "two out of three" concept can be somewhat misleading because system designers only need to sacrifice consistency or availability in the presence of partitions; partition management and recovery techniques exist. Brewer also noted the different definition of consistency used in the CAP theorem relative to the definition used in
ACID An acid is a molecule or ion capable of either donating a proton (i.e. Hydron, hydrogen cation, H+), known as a Brønsted–Lowry acid–base theory, Brønsted–Lowry acid, or forming a covalent bond with an electron pair, known as a Lewis ...
. A similar theorem stating the trade-off between consistency and availability in distributed systems was published by Birman and Friedman in 1996. Birman and Friedman's result restricted this lower bound to non-commuting operations. The PACELC theorem, introduced in 2010, builds on CAP by stating that even in the absence of partitioning, there is another trade-off between latency and consistency. PACELC means, if partition (P) happens, the trade-off is between availability (A) and consistency (C); Else (E), the trade-off is between latency (L) and consistency (C). Some experts like Marc Brooker argue that the CAP theorem is particularly relevant in intermittently connected environments, such as those related to the
Internet of Things Internet of things (IoT) describes devices with sensors, processing ability, software and other technologies that connect and exchange data with other devices and systems over the Internet or other communication networks. The IoT encompasse ...
(IoT) and
mobile app A mobile application or app is a computer program or software application designed to run on a mobile device such as a smartphone, phone, tablet computer, tablet, or smartwatch, watch. Mobile applications often stand in contrast to desktop appli ...
lications. In these contexts, devices may become partitioned due to challenging physical conditions, such as
power outage A power outage, also called a blackout, a power failure, a power blackout, a power loss, a power cut, or a power out is the complete loss of the electrical power network supply to an end user. There are many causes of power failures in an el ...
s or when entering confined spaces like elevators. For distributed systems, such as cloud applications, it is more appropriate to use the PACELC theorem, which is more comprehensive and considers trade-offs such as latency and
consistency In deductive logic, a consistent theory is one that does not lead to a logical contradiction. A theory T is consistent if there is no formula \varphi such that both \varphi and its negation \lnot\varphi are elements of the set of consequences ...
even in the absence of network partitions.


See also

* Fallacies of distributed computing * Lambda architecture (solution) * PACELC theorem * Paxos (computer science) * Raft (computer science) * Zooko's triangle * Inconsistent triad * Trilemma


References

{{DEFAULTSORT:Cap Theorem Distributed computing Database management systems