HOME

TheInfoList



OR:

In
theoretical computer science Theoretical computer science (TCS) is a subset of general computer science and mathematics that focuses on mathematical aspects of computer science such as the theory of computation, lambda calculus, and type theory. It is difficult to circumsc ...
, the PACELC theorem is an extension to the CAP theorem. It states that in case of network partitioning (P) in a distributed computer system, one has to choose between availability (A) and consistency (C) (as per the CAP theorem), but else (E), even when the system is running normally in the absence of partitions, one has to choose between latency (L) and consistency (C).


Overview

PACELC builds on the CAP theorem. Both theorems describe how distributed databases have limitations and tradeoffs regarding consistency, availability, and partition tolerance. PACELC however goes further and states that an additional trade-off exists: between latency and consistency, even in absence of partitions, thus providing a more complete portrayal of the potential consistency trade-offs for distributed systems. A high availability requirement implies that the system must replicate data. As soon as a distributed system replicates data, a trade-off between consistency and latency arises. The PACELC theorem was first described by Daniel J. Abadi from
Yale University Yale University is a private research university in New Haven, Connecticut. Established in 1701 as the Collegiate School, it is the third-oldest institution of higher education in the United States and among the most prestigious in the wo ...
in 2010 in a blog post, which he later clarified in a paper in 2012. The purpose of PACELC is to address his thesis that "Ignoring the consistency/latency trade-off of replicated systems is a major oversight
n CAP The term N cap (N-cap, Ncap) describes an amino acid in a particular position within a protein or polypeptide.{{cite journal, last=Leader, first=DP, author2=Milner-White EJ , title=The structure of the ends of helices in globular proteins, journal= ...
as it is present at all times during system operation, whereas CAP is only relevant in the arguably rare case of a network partition." The PACELC theorem was proved formally in 2018 in a SIGACT News article.Wojciech Golab
"Proving PACELC"
''ACM SIGACT News'', Volume 49 Issue 1 (2018), pg. 73–81. .


Database PACELC ratings

Original database PACELC ratings are from. Subsequent updates contributed by wikipedia community. * The default versions o
Amazon's early (internal) Dynamo
Cassandra Cassandra or Kassandra (; Ancient Greek: Κασσάνδρα, , also , and sometimes referred to as Alexandra) in Greek mythology was a Trojan priestess dedicated to the god Apollo and fated by him to utter true prophecies but never to be believe ...
,
Riak Riak (pronounced "ree-ack" ) is a distributed NoSQL key-value data store based on Amazon's Dynamo paper, including its "tunable AP" approach, that is tunable consistency, to the tradeoffs imposed by the CAP Theorem. Riak offers high availability, ...
and
Cosmos DB Azure Cosmos DB is Microsoft's proprietary globally distributed, multi-model database service "for managing data at planet-scale" launched in May 2017. It is schema-agnostic, horizontally scalable, and generally classified as a NoSQL database. ...
are PA/EL systems: if a partition occurs, they give up consistency for availability, and under normal operation they give up consistency for lower latency. * Fully ACID systems such as
VoltDB Volt Active Data (formerly VoltDB) is an in-memory database designed by Michael Stonebraker, Sam Madden, and Daniel Abadi. It is an ACID-compliant RDBMS that uses a shared-nothing architecture, and is derived from work done by Stonebraker on O ...
/H-Store, Megastore,
MySQL Cluster MySQL Cluster is a technology providing shared-nothing clustering and auto-sharding for the MySQL database management system. It is designed to provide high availability and high throughput with low latency, while allowing for near linear sca ...
and
PostgreSQL PostgreSQL (, ), also known as Postgres, is a free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. It was originally named POSTGRES, referring to its origins as a successor to the In ...
are PC/EC: they refuse to give up consistency, and will pay the availability and latency costs to achieve it.
BigTable Bigtable is a fully managed wide-column and key-value NoSQL database service for large analytical and operational workloads as part of the Google Cloud portfolio. History Bigtable development began in 2004.. It is now used by a number of Googl ...
and related systems such as
HBase HBase is an open-source non-relational distributed database modeled after Google's Bigtable and written in Java. It is developed as part of Apache Software Foundation's Apache Hadoop project and runs on top of HDFS (Hadoop Distributed File Sys ...
are also PC/EC. * Amazon DynamoDB (launched January 2012) is quite different from th
early (Amazon internal) Dynamo
which was considered for the PACELC paper. DynamoDB follows a strong leader model, where every write is strictly serialized (and conditional writes carry no penalty) and supports read-after-write consistency. This guarantee does not apply to "Global Tables" across regions. The DynamoDB SDKs use eventually consistent reads by default (improved availability and throughput), but when a consistent read is requested the service will return either a current view to the item or an error. *Couchbase provides a range of consistency and availability options during a partition, and equally a range of latency and consistency options with no partition. Unlike most other databases, Couchbase doesn't have a single API set nor does it scale/replicate all data services homogeneously. For writes, Couchbase favors Consistency over Availability making it formally CP, but on read there is more user-controlled variability depending on index replication, desired consistency level and type of access (single document lookup vs range scan vs full-text search, etc.). On top of that, there is then further variability depending on cross-datacenter-replication (XDCR) which takes multiple CP clusters and connects them with asynchronous replication and Couchbase Lite which is an embedded database and creates a fully multi-master (with revision tracking) distributed topology. *
Cosmos DB Azure Cosmos DB is Microsoft's proprietary globally distributed, multi-model database service "for managing data at planet-scale" launched in May 2017. It is schema-agnostic, horizontally scalable, and generally classified as a NoSQL database. ...
supports five tunable consistency levels that allow for tradeoffs between C/A during P, and L/C during E.
Cosmos DB Azure Cosmos DB is Microsoft's proprietary globally distributed, multi-model database service "for managing data at planet-scale" launched in May 2017. It is schema-agnostic, horizontally scalable, and generally classified as a NoSQL database. ...
never violates the specified consistency level, so it's formally CP. *
MongoDB MongoDB is a source-available cross-platform document-oriented database program. Classified as a NoSQL database program, MongoDB uses JSON-like documents with optional schemas. MongoDB is developed by MongoDB Inc. and licensed under the Serve ...
can be classified as a PA/EC system. In the baseline case, the system guarantees reads and writes to be consistent. * PNUTS is a PC/EL system. * Hazelcast IMDG and indeed most in-memory data grids are an implementation of a PA/EC system; Hazelcast can be configured to be EL rather than EC. Concurrency primitives (Lock, AtomicReference, CountDownLatch, etc.) can be either PC/EC or PA/EC. * FaunaDB implements Calvin, a transaction protocol created by Dr. Daniel Abadi and author of PACELC theorem, and offers users adjustable controls for LC tradeoff. It is PC/EC for strictly serializable transactions, and EL for serializable reads.


See also

* CAP theorem *
Consistency model In computer science, a consistency model specifies a contract between the programmer and a system, wherein the system guarantees that if the programmer follows the rules for operations on memory, memory will be consistent and the results of readi ...
*
Fallacies of Distributed Computing The fallacies of distributed computing are a set of assertions made by L Peter Deutsch and others at Sun Microsystems describing false assumptions that programmers new to distributed applications invariably make. The fallacies The fallacies are ...
*
Paxos (computer science) Paxos is a family of protocols for solving consensus in a network of unreliable or fallible processors. Consensus is the process of agreeing on one result among a group of participants. This problem becomes difficult when the participants or their ...
*
Project management triangle The project management triangle (called also the ''triple constraint'', ''iron triangle'' and ''project triangle'') is a model of the constraints of project management. While its origins are unclear, it has been used since at least the 1950s. It ...
*
Raft (computer science) A raft is any flat structure for support or transportation over water. It is usually of basic design, characterized by the absence of a hull. Rafts are usually kept afloat by using any combination of buoyant materials such as wood, sealed barrel ...
*
Trilemma A trilemma is a difficult choice from three options, each of which is (or appears) unacceptable or unfavourable. There are two logically equivalent ways in which to express a trilemma: it can be expressed as a choice among three unfavourable option ...


Notes


References


External links


"Consistency Tradeoffs in Modern Distributed Database System Design", by Daniel J. Abadi, Yale University
Original paper that formalized PACELC

Original blog post that first described PACELC
"Proving PACELC", by Wojciech Golab, University of Waterloo
Formal proof of the PACELC theorem {{DEFAULTSORT:Pacelc Theorem Distributed computing Database theory