Distributed concurrency control
   HOME

TheInfoList



OR:

Distributed concurrency control is the
concurrency control In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while ...
of a system
distributed Distribution may refer to: Mathematics *Distribution (mathematics), generalized functions used to formulate solutions of partial differential equations *Probability distribution, the probability of a particular value or value range of a varia ...
over a
computer network A computer network is a set of computers sharing resources located on or provided by network nodes. The computers use common communication protocols over digital interconnections to communicate with each other. These interconnections are ...
( Bernstein et al. 1987, Weikum and Vossen 2001). In ''
database systems In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spa ...
'' and ''
transaction processing Transaction processing is information processing in computer science that is divided into individual, indivisible operations called ''transactions''. Each transaction must succeed or fail as a complete unit; it can never be only partially compl ...
'' (''transaction management'') distributed concurrency control refers primarily to the concurrency control of a
distributed database A distributed database is a database in which data is stored across different physical locations. It may be stored in multiple computers located in the same physical location (e.g. a data centre); or maybe dispersed over a network of interconnect ...
. It also refers to the concurrency control in a multidatabase (and other multi-transactional object) environment (e.g.,
federated database A federated database system (FDBS) is a type of meta-database management system (DBMS), which transparently maps multiple autonomous database systems into a single federated database. The constituent databases are interconnected via a computer netwo ...
, grid computing, and
cloud computing Cloud computing is the on-demand availability of computer system resources, especially data storage ( cloud storage) and computing power, without direct active management by the user. Large clouds often have functions distributed over mu ...
environments. A major goal for distributed concurrency control is distributed
serializability In concurrency control of databases, Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (1987)''Concurrency Control and Recovery in Database Systems''(free PDF download), Addison Wesley Publishing Company, Gerhard Weikum, Gottfried Vossen (20 ...
(or global serializability for multidatabase systems). Distributed concurrency control poses special challenges beyond centralized one, primarily due to communication and computer latency. It often requires special techniques, like
distributed lock manager Operating systems use lock managers to organise and serialise the access to resources. A distributed lock manager (DLM) runs in every machine in a cluster, with an identical copy of a cluster-wide lock database. In this way a DLM provides software ...
over fast
computer network A computer network is a set of computers sharing resources located on or provided by network nodes. The computers use common communication protocols over digital interconnections to communicate with each other. These interconnections are ...
s with low latency, like switched fabric (e.g.,
InfiniBand InfiniBand (IB) is a computer networking communications standard used in high-performance computing that features very high throughput and very low latency. It is used for data interconnect both among and within computers. InfiniBand is also use ...
).
Commitment ordering Commitment ordering (CO) is a class of interoperable '' serializability'' techniques in concurrency control of databases, transaction processing, and related applications. It allows optimistic (non-blocking) implementations. With the proliferation ...
(or commit ordering) is a general serializability technique that achieves distributed serializability (and global serializability in particular) effectively on a large scale, without concurrency control information distribution (e.g., local precedence relations, locks, timestamps, or tickets), and thus without performance penalties that are typical to other serializability techniques ( Raz 1992). The most common distributed concurrency control technique is ''strong strict two-phase locking'' ( SS2PL, also named ''rigorousness''), which is also a common centralized concurrency control technique. SS2PL provides both the ''serializability'', '' strictness'', and ''commitment ordering'' properties. Strictness, a special case of recoverability, is utilized for effective recovery from failure, and commitment ordering allows participating in a general solution for global serializability. For large-scale distribution and complex transactions, distributed locking's typical heavy performance penalty (due to delays, latency) can be saved by using the
atomic commitment Atomic may refer to: * Of or relating to the atom, the smallest particle of a chemical element that retains its chemical properties * Atomic physics, the study of the atom * Atomic Age, also known as the "Atomic Era" * Atomic scale, distances comp ...
protocol, which is needed in a distributed database for (distributed) transactions' atomicity (e.g.,
two-phase commit In transaction processing, databases, and computer networking, the two-phase commit protocol (2PC) is a type of atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that participate in a distributed ...
, or a simpler one in a reliable system), together with some local commitment ordering variant (e.g., local SS2PL) instead of distributed locking, to achieve global serializability in the entire system. All the commitment ordering theoretical results are applicable whenever atomic commitment is utilized over partitioned, distributed recoverable (transactional) data, including automatic ''distributed deadlock'' resolution. Such technique can be utilized also for a large-scale parallel database, where a single large database, residing on many nodes and using a distributed lock manager, is replaced with a (homogeneous) multidatabase, comprising many relatively small databases (loosely defined; any process that supports transactions over partitioned data and participates in atomic commitment complies), fitting each into a single node, and using commitment ordering (e.g., SS2PL, strict CO) together with some appropriate atomic commitment protocol (without using a distributed lock manager).


See also

*Global concurrency control


References

*Phil Bernstein, Philip A. Bernstein, Vassos Hadzilacos, Nathan Goodman (1987)
''Concurrency Control and Recovery in Database Systems''
Addison Wesley Publishing Company, 1987,
*Gerhard Weikum, Gottfried Vossen (2001)
''Transactional Information Systems''
Elsevier, {{ISBN, 1-55860-508-8
*Yoav Raz (1992)
"The Principle of Commitment Ordering, or Guaranteeing Serializability in a Heterogeneous Environment of Multiple Autonomous Resource Managers Using Atomic Commitment."
''Proceedings of the Eighteenth International Conference on Very Large Data Bases'' (VLDB), pp. 292-312, Vancouver, Canada, August 1992. (also DEC-TR 841, Digital Equipment Corporation, November 1990)
Data management Distributed computing problems Databases Concurrency control Transaction processing