Column (data store)

A column of a distributed data store is a NoSQL object of the lowest level in a keyspace. It is a tuple (a key–value pair) consisting of three elements:

* Unique name: Used to reference the column.
* Value: The content of the column. It can have different types, such as AsciiType, LongType, TimeUUIDType, and UTF8Type, among others.
* Timestamp: The system timestamp used to determine the valid content.
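
The three elements listed above can be sketched as a small record type. This is a minimal illustration only: the class name `Column` and its field names are assumptions made for the example, not the API of any particular data store.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Column:
    """Hypothetical model of a column: a name, a value, and a timestamp."""

    name: str       # unique name used to reference the column
    value: Any      # content; a real store would restrict this to a declared type
    timestamp: int  # write time, used to decide which copy is valid


# One column, using the values from the article's own example.
street = Column(name="street", value="1234 x street", timestamp=123456789)
print(street.name, street.value, street.timestamp)
```

The record is frozen here because an update is modeled as writing a new column with a newer timestamp rather than changing an existing one; that is a design choice of this sketch.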


Usage

A column stores a value together with a timestamp that distinguishes current content from stale copies. According to the CAP theorem, a distributed data store cannot guarantee consistency, availability, and partition tolerance all at once, and many such stores favor availability and partition tolerance over strict consistency. As a result, nodes can temporarily hold different values for the same column, and the data store or the application programmer uses the timestamp to find out which of the values stored on the backup nodes is up to date. Some data stores, like Riak, may use the more sophisticated vector clock instead of the timestamp to resolve stale information.
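
Both resolution strategies can be sketched in a few lines. The code below is an illustration under stated assumptions, not how any specific store implements them: `latest` picks the copy with the highest timestamp (often called "last write wins"), and `compare_clocks` compares two vector clocks represented as plain dictionaries that map a node name to a counter. All names are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Column(NamedTuple):
    name: str
    value: str
    timestamp: int


def latest(replicas: List[Column]) -> Column:
    """Timestamp resolution: return the copy with the highest timestamp."""
    return max(replicas, key=lambda column: column.timestamp)


def dominates(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if clock a has seen at least every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare_clocks(a: Dict[str, int], b: Dict[str, int]) -> str:
    """Classify two vector clocks; 'concurrent' means neither is stale."""
    a_ge_b, b_ge_a = dominates(a, b), dominates(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a is newer"
    if b_ge_a:
        return "b is newer"
    return "concurrent"


# Three copies of the same column read from different nodes (made-up values).
copies = [
    Column("city", "san francisco", 100),
    Column("city", "San Francisco", 250),
    Column("city", "SF", 180),
]
print(latest(copies).value)
print(compare_clocks({"n1": 2, "n2": 1}, {"n1": 1, "n2": 1}))
print(compare_clocks({"n1": 2}, {"n2": 1}))
```

The difference shows in the last call: a vector clock can report that two writes are concurrent and leave the conflict to the application, whereas a plain timestamp comparison always picks a winner.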


Differences from a relational database

In relational databases, a column is part of a relational table and is present in every row of the table. This is not the case in distributed data stores, where the concept of a table exists only loosely. A column can be part of a ColumnFamily that resembles at most a relational row, but it may appear in one row and not in the others. The number of columns may also differ from row to row, and updates to the data store model may change it further. Keeping up with these changes is therefore left to the application programmer.
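
The practical effect on application code can be sketched with nested dictionaries standing in for rows. This is a toy model with made-up row keys and values, not a client for a real data store; `read_column` and `column_names` are hypothetical helpers.

```python
# Each row key maps to its own set of columns; rows need not share columns.
rows = {
    "user:1": {"street": "1234 x street", "city": "san francisco", "zip": "94107"},
    "user:2": {"city": "oakland"},
    "user:3": {"city": "berkeley", "country": "us"},
}


def read_column(row_key, column_name, default=None):
    """Return a column value, or a default when the row lacks that column."""
    return rows.get(row_key, {}).get(column_name, default)


def column_names():
    """Collect every column name that appears in at least one row."""
    names = set()
    for columns in rows.values():
        names.update(columns)
    return sorted(names)


print(read_column("user:1", "zip"))
print(read_column("user:2", "zip", default="unknown"))
print(column_names())
```

Because no schema guarantees that a column exists, every read in this sketch has to supply a default or handle the missing case explicitly.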


Examples

Three definitions of columns in JSON-like notation are given below:

    {
      street: {name: "street", value: "1234 x street", timestamp: 123456789},
      city: {name: "city", value: "san francisco", timestamp: 123456789},
      zip: {name: "zip", value: "94107", timestamp: 123456789},
    }


See also

* Super column

