HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, a timestamp-based concurrency control algorithm is a
non-lock concurrency control In Computer Science, in the field of databases, non-lock concurrency control is a concurrency control method used in relational databases without using locking. There are several non-lock concurrency control methods, which involve the use of ti ...
method. It is used in some
database In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases ...
s to safely handle transactions, using
timestamp A timestamp is a sequence of characters or encoded information identifying when a certain event occurred, usually giving date and time of day, sometimes accurate to a small fraction of a second. Timestamps do not have to be based on some absolut ...
s.


Operation


Assumptions

* Every timestamp value is unique and accurately represents an instant in time. * A higher-valued timestamp occurs later in time than a lower-valued timestamp.


Generating a timestamp

A number of different ways have been used to generate timestamp * Use the value of the system's clock at the start of a transaction as the timestamp. * Use a thread-safe shared counter that is incremented at the start of a transaction as the timestamp. * A combination of the above two methods.


Formal

Each transaction (T_i) is an ordered list of actions (A_). Before the transaction performs its first action (A_), it is marked with the current
timestamp A timestamp is a sequence of characters or encoded information identifying when a certain event occurred, usually giving date and time of day, sometimes accurate to a small fraction of a second. Timestamps do not have to be based on some absolut ...
, or any other strictly totally ordered sequence: TS(T_i) = NOW(). Every transaction is also given an initially empty set of transactions upon which it depends, DEP(T_i) = [], and an initially empty set of old objects which it updated, OLD(T_i) = []. Each Object (computer science), object (O_j) in the database is given two timestamp fields which are not used other than for concurrency control: RTS(O_j) is the time at which the value of object was last used by a transaction, WTS(O_j) is the time at which the value of the object was last updated by a transaction. For all T_i: :For each action A_: ::If A_ wishes to use the value of O_j: :::If WTS(O_j) > TS(T_i) then abort (a more recent thread has overwritten the value), :::Otherwise update the set of dependencies DEP(T_i).\mathrm(WTS(O_j)) and set RTS(O_j) = \max(RTS(O_j), TS(T_i)); ::If A_ wishes to update the value of O_j: :::If RTS(O_j) > TS(T_i) then abort (a more recent thread is already relying on the old value), :::If WTS(O_j) > TS(T_i) then skip (the Thomas Write Rule), :::Otherwise store the previous values, OLD(T_i).\mathrm(O_j, WTS(O_j)), set WTS(O_j) = TS(T_i), and update the value of O_j. :While there is a transaction in DEP(T_i) that has not ended: wait :If there is a transaction in DEP(T_i) that aborted then abort :Otherwise: commit. To abort: :For each (\mathrmO_j, \mathrmWTS(O_j)) in OLD(T_i) ::If WTS(O_j) equals TS(T_i) then restore O_j = \mathrmO_j and WTS(O_j) = \mathrmWTS(O_j)


Informal

Whenever a transaction begins, it receives a timestamp. This timestamp indicates the order in which the transaction must occur, relative to the other transactions. So, given two transactions that affect the same object, the operation of the transaction with the earlier timestamp must execute before the operation of the transaction with the later timestamp. However, if the operation of the wrong transaction is actually presented first, then it is aborted and the transaction must be restarted. Every object in the database has a read timestamp, which is updated whenever the object's data is read, and a write timestamp, which is updated whenever the object's data is changed. If a transaction wants to read an object, * but the transaction started ''before'' the object's write timestamp it means that something changed the object's data after the transaction started. In this case, the transaction is canceled and must be restarted. * and the transaction started ''after'' the object's write timestamp, it means that it is ''safe'' to read the object. In this case, if the transaction timestamp is after the object's read timestamp, the read timestamp is set to the transaction timestamp. If a transaction wants to write to an object, * but the transaction started ''before'' the object's read timestamp it means that something has had a look at the object, and we assume it took a copy of the object's data. So we can't write to the object as that would make any copied data invalid, so the transaction is aborted and must be restarted. * and the transaction started ''before'' the object's write timestamp it means that something has changed the object since we started our transaction. In this case we use the Thomas write rule and simply skip our write operation and continue as normal; the transaction does not have to be aborted or restarted * otherwise, the transaction writes to the object, and the object's write timestamp is set to the transaction's timestamp.


Recoverability

{{for, an explanation of the terms ''recoverable'' (''RC''), ''avoids cascading aborts'' (''ACA'') and ''strict'' (''ST''), Schedule (computer science) Note that timestamp ordering in its basic form does not produce recoverable histories. Consider for example the following history with transactions T_1 and T_2: :W_1(x)\;R_2(x)\;W_2(y)\;C_2\;R_1(z)\;C_1 This could be produced by a TO scheduler, but is not recoverable, as T_2 commits even though having read from an uncommitted transaction. To make sure that it produces recoverable histories, a scheduler can keep a list of other transactions each transaction has ''read from'', and not let a transaction commit before this list consisted of only committed transactions. To avoid cascading aborts, the scheduler could tag data written by uncommitted transactions as ''dirty'', and never let a read operation commence on such a data item before it was untagged. To get a strict history, the scheduler should not allow any operations on dirty items.


Implementation issues


Timestamp resolution

This is the minimum time elapsed between two adjacent timestamps. If the resolution of the timestamp is too large (coarse), the possibility of two or more timestamps being equal is increased and thus enabling some transactions to commit out of correct order. For example, assuming that we have a system that can create one hundred unique timestamps per second, and given two events that occur 2 milliseconds apart, they will probably be given the same timestamp even though they actually occurred at different times.


Timestamp locking

Even though this technique is a non-locking one, in as much as the Object is not locked from concurrent access for the duration of a transaction, the act of recording each timestamp against the Object requires an extremely short duration lock on the Object or its proxy.


See also

*
Multiversion concurrency control Multiversion concurrency control (MCC or MVCC), is a concurrency control method commonly used by database management systems to provide concurrent access to the database and in programming languages to implement transactional memory. Description ...
* Timestamping (computing) Concurrency control Concurrency control algorithms Transaction processing