A distributed data store is a
computer network
A computer network is a collection of communicating computers and other devices, such as printers and smart phones. In order to communicate, the computers and devices must be connected by wired media like copper cables, optical fibers, or b ...
where information is stored on more than one
node
In general, a node is a localized swelling (a "knot") or a point of intersection (a vertex).
Node may refer to:
In mathematics
* Vertex (graph theory), a vertex in a mathematical graph
*Vertex (geometry), a point where two or more curves, lines ...
, often in a
replicated fashion. It is usually specifically used to refer to either a
distributed database
A distributed database is a database in which data is stored across different physical locations. It may be stored in multiple computers located in the same physical location (e.g. a data centre); or maybe dispersed over a computer network, netwo ...
where users store information on a ''number of nodes'', or a
computer network
A computer network is a collection of communicating computers and other devices, such as printers and smart phones. In order to communicate, the computers and devices must be connected by wired media like copper cables, optical fibers, or b ...
in which users store information on a ''number of peer network nodes''.
Distributed databases
Distributed database
A distributed database is a database in which data is stored across different physical locations. It may be stored in multiple computers located in the same physical location (e.g. a data centre); or maybe dispersed over a computer network, netwo ...
s are usually
non-relational databases that enable a quick access to data over a large number of nodes. Some distributed databases expose rich query abilities while others are limited to a
key-value store semantics. Examples of limited distributed databases are
Google
Google LLC (, ) is an American multinational corporation and technology company focusing on online advertising, search engine technology, cloud computing, computer software, quantum computing, e-commerce, consumer electronics, and artificial ...
's
Bigtable, which is much more than a
distributed file system
A clustered file system (CFS) is a file system which is shared by being simultaneously Mount (computing), mounted on multiple Server (computing), servers. There are several approaches to computer cluster, clustering, most of which do not emplo ...
or a
peer-to-peer network,
Amazon
Amazon most often refers to:
* Amazon River, in South America
* Amazon rainforest, a rainforest covering most of the Amazon basin
* Amazon (company), an American multinational technology company
* Amazons, a tribe of female warriors in Greek myth ...
's
Dynamo
and
Microsoft Azure Storage.
As the ability of arbitrary querying is not as important as the
availability, designers of distributed data stores have increased the latter at an expense of consistency. But the high-speed read/write access results in reduced consistency, as it is not possible to guarantee both
consistency
In deductive logic, a consistent theory is one that does not lead to a logical contradiction. A theory T is consistent if there is no formula \varphi such that both \varphi and its negation \lnot\varphi are elements of the set of consequences ...
and availability on a partitioned network, as stated by the
CAP theorem
In database theory, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer (scientist), Eric Brewer, states that any distributed data store can provide at most Inconsistent triad, two of the following three guarantees:
; ...
.
Peer network node data stores
In peer network data stores, the user can usually reciprocate and allow other users to use their computer as a storage node as well. Information may or may not be accessible to other users depending on the design of the network.
Most
peer-to-peer
Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network, forming a peer-to-peer network of Node ...
networks do not have distributed data stores in that the user's data is only available when their node is on the network. However, this distinction is somewhat blurred in a system such as
BitTorrent
BitTorrent is a Protocol (computing), communication protocol for peer-to-peer file sharing (P2P), which enables users to distribute data and electronic files over the Internet in a Decentralised system, decentralized manner. The protocol is d ...
, where it is possible for the originating node to go offline but the content to continue to be served. Still, this is only the case for individual files requested by the redistributors, as contrasted with networks such as
Hyphanet,
Winny,
Share and
Perfect Dark where any node may be storing any part of the files on the network.
Distributed data stores typically use an
error detection and correction technique.
Some distributed data stores (such as
Parchive over NNTP) use
forward error correction techniques to recover the original file when parts of that file are damaged or unavailable.
Others try again to download that file from a different mirror.
Examples
Distributed non-relational databases
Peer network node data stores
*
BitTorrent
BitTorrent is a Protocol (computing), communication protocol for peer-to-peer file sharing (P2P), which enables users to distribute data and electronic files over the Internet in a Decentralised system, decentralized manner. The protocol is d ...
*
Blockchain (database)
*
Chord project
*
Freenet
Hyphanet (until mid-2023: Freenet) is a peer-to-peer platform for censorship-resistant, Anonymity application, anonymous communication. It uses a decentralized distributed data store to keep and deliver information, and has a suite of free soft ...
*
GNUnet
*
IPFS
*
Mnet
*
Napster
*
NNTP (the distributed data storage protocol used for
Usenet
Usenet (), a portmanteau of User's Network, is a worldwide distributed discussion system available on computers. It was developed from the general-purpose UUCP, Unix-to-Unix Copy (UUCP) dial-up network architecture. Tom Truscott and Jim Elli ...
news)
* Unity, of the software
Perfect Dark
*
Share
*
Siacoin
* DeNet
*
Storage@home
*
Tahoe-LAFS
*
Winny
*
ZeroNet
See also
*
Cooperative storage cloud
*
Data store
*
Keyspace, the DDS
schema
Schema may refer to:
Science and technology
* SCHEMA (bioinformatics), an algorithm used in protein engineering
* Schema (genetic algorithms), a set of programs or bit strings that have some genotypic similarity
* Schema.org, a web markup vocab ...
*
Distributed hash table
A distributed hash table (DHT) is a Distributed computing, distributed system that provides a lookup service similar to a hash table. Key–value pairs are stored in a DHT, and any participating node (networking), node can efficiently retrieve the ...
*
Distributed cache
*
Cyber Resilience
References
{{Reflist
Data management
ja:分散ファイルシステム#分散データストア