Distributed Data Storage

A distributed data store is a computer network where information is stored on more than one node, often in a replicated fashion. The term usually refers specifically to either a distributed database where users store information on a ''number of nodes'', or a computer network in which users store information on a ''number of peer network nodes''.
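The idea of storing each item on more than one node can be sketched in a few lines. The example below is a minimal in-memory illustration, not a description of any particular product: the names `replica_nodes` and `ReplicatedStore` are hypothetical, and the choice of rendezvous (highest-random-weight) hashing to pick the nodes is an assumption made for the sketch.

```python
import hashlib


def replica_nodes(key: str, nodes: list[str], replicas: int = 2) -> list[str]:
    """Pick `replicas` distinct nodes for a key (rendezvous hashing)."""
    def score(node: str) -> int:
        # Every (node, key) pair gets a pseudo-random but repeatable score.
        digest = hashlib.sha256(f"{node}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)[:replicas]


class ReplicatedStore:
    """Toy store that keeps each value on several in-memory 'nodes'."""

    def __init__(self, nodes: list[str], replicas: int = 2):
        self.replicas = replicas
        self.storage = {name: {} for name in nodes}

    def put(self, key: str, value) -> None:
        # Write the value to every node chosen for this key.
        for name in replica_nodes(key, list(self.storage), self.replicas):
            self.storage[name][key] = value

    def get(self, key: str, offline=frozenset()):
        # Read from the first chosen node that is still reachable.
        for name in replica_nodes(key, list(self.storage), self.replicas):
            if name not in offline:
                return self.storage[name].get(key)
        raise LookupError(f"no reachable replica for {key!r}")
```

With two replicas per key, a read still succeeds when one of the two holding nodes is offline, which is the practical benefit of storing the information "in a replicated fashion".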


Distributed databases

Distributed databases are usually non-relational databases that enable quick access to data over a large number of nodes. Some distributed databases expose rich query abilities while others are limited to key-value store semantics. Examples of limited distributed databases are Google's Bigtable, which is much more than a distributed file system or a peer-to-peer network, Amazon's Dynamo and Microsoft Azure Storage. Because the ability to run arbitrary queries is less important than availability, designers of distributed data stores have increased availability at the expense of consistency. High-speed read/write access therefore comes with reduced consistency, as it is not possible to guarantee both consistency and availability on a partitioned network, as stated by the CAP theorem.
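The trade-off between consistency and availability can be illustrated with a toy quorum model. The sketch below is an assumption-laden simplification rather than the design of Bigtable, Dynamo or Azure Storage: the classes `Replica` and `QuorumStore` and the parameters `n`, `w` and `r` (number of replicas, write quorum, read quorum) are hypothetical names, and a network partition is simulated by flipping a `reachable` flag.

```python
class Replica:
    """One copy of a single value, tagged with a version number."""

    def __init__(self):
        self.version = 0
        self.value = None
        self.reachable = True  # False simulates being cut off by a partition


class QuorumStore:
    """Toy single-value store with configurable read/write quorums."""

    def __init__(self, n: int = 3, w: int = 2, r: int = 2):
        self.replicas = [Replica() for _ in range(n)]
        self.w, self.r = w, r

    def _live(self):
        return [rep for rep in self.replicas if rep.reachable]

    def write(self, value) -> None:
        live = self._live()
        if len(live) < self.w:
            # Refusing the write keeps the data consistent but unavailable.
            raise RuntimeError("write quorum unavailable")
        version = max(rep.version for rep in live) + 1
        for rep in live:
            rep.version, rep.value = version, value

    def read(self):
        live = self._live()
        if len(live) < self.r:
            raise RuntimeError("read quorum unavailable")
        # Ask r reachable replicas and return the newest value among them.
        newest = max(live[: self.r], key=lambda rep: rep.version)
        return newest.value
```

With `n=3, w=2, r=2`, a client that can reach only one replica is refused, so the store stays consistent but is not available. With `w=1, r=1`, the same client can still write, but a later read served by a replica on the other side of the partition returns the older value: the store is available but not consistent.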


Peer network node data stores

In peer network data stores, the user can usually reciprocate and allow other users to use their computer as a storage node as well. Information may or may not be accessible to other users depending on the design of the network. Most peer-to-peer networks do not have distributed data stores, in that the user's data is only available when their node is on the network. However, this distinction is somewhat blurred in a system such as BitTorrent, where it is possible for the originating node to go offline but the content to continue to be served. Still, this is only the case for individual files requested by the redistributors, in contrast to networks such as Freenet, Winny, Share and Perfect Dark, where any node may be storing any part of the files on the network.

Distributed data stores typically use an error detection and correction technique. Some distributed data stores (such as Parchive over NNTP) use forward error correction techniques to recover the original file when parts of that file are damaged or unavailable. Others try again to download that file from a different mirror.
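The principle behind forward error correction can be shown with the simplest possible erasure code, a single XOR parity block. This is only an illustrative sketch: the function names are hypothetical, all blocks are assumed to have equal length, and real systems such as Parchive use stronger codes that can repair more than one missing block.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks: list[bytes]) -> list[bytes]:
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stored: list) -> list[bytes]:
    """Rebuild the data blocks when at most one stored block is None."""
    missing = [i for i, block in enumerate(stored) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair only one missing block")
    stored = list(stored)
    if missing:
        # XOR of all surviving blocks equals the one that was lost.
        present = [block for block in stored if block is not None]
        stored[missing[0]] = xor_blocks(present)
    return stored[:-1]  # drop the parity block
```

For example, if `encode([b"abcd", b"efgh", b"ijkl"])` is spread over four nodes and the node holding the second block is unavailable, `recover` rebuilds that block from the other three without downloading anything again.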


Examples


Distributed non-relational databases


Peer network node data stores

* BitTorrent
* Blockchain (database)
* Chord project
* Freenet
* GNUnet
* IPFS
* Mnet
* Napster
* NNTP (the distributed data storage protocol used for Usenet news)
* Unity, of the software Perfect Dark
* Share
* Siacoin
* DeNet
* Storage@home
* STORJ
* Tahoe-LAFS
* Winny
* ZeroNet


See also

* Cooperative storage cloud
* Data store
* Distributed file system
* Keyspace, the DDS schema
* Peer-to-peer
* Distributed hash table
* Distributed cache
* Cyber Resilience

