A cooperative storage cloud is a decentralized model of networked
online storage where data is stored on multiple computers (
nodes), hosted by the participants cooperating in the cloud. For the cooperative scheme to be viable, the total storage contributed in aggregate must be at least equal to the amount of storage needed by end users. However, some nodes may contribute less storage and some may contribute more. There may be reward models to compensate the nodes contributing more.
Unlike a traditional
storage cloud, a cooperative does not directly employ dedicated servers for the actual storage of the data, thereby eliminating the need for a significant dedicated hardware investment. Each node in the cooperative runs specialized
software
Software is a set of computer programs and associated documentation and data. This is in contrast to hardware, from which the system is built and which actually performs the work.
At the lowest programming level, executable code consists o ...
which communicates with a centralized control and orchestration server, thereby allowing the node to both consume and contribute storage space to the cloud. The centralized control and orchestration server requires several orders of magnitude less resources (storage, computing power, and bandwidth) to operate, relative to the overall capacity of the cooperative.
Data security
Files hosted in the cloud are fragmented and
encrypted
In cryptography, encryption is the process of encoding information. This process converts the original representation of the information, known as plaintext, into an alternative form known as ciphertext. Ideally, only authorized parties can deci ...
before leaving the local machine. They are then distributed randomly using a
load balancing and
geo-distribution algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
to other nodes in the cooperative. Users can add an additional layer of
security
Security is protection from, or resilience against, potential harm (or other unwanted coercive change) caused by others, by restraining the freedom of others to act. Beneficiaries (technically referents) of security may be of persons and social ...
and reduce storage space by
compressing and encrypting files before they are copied to the cloud.
Data redundancy
In order to maintain
data integrity
Data integrity is the maintenance of, and the assurance of, data accuracy and consistency over its entire life-cycle and is a critical aspect to the design, implementation, and usage of any system that stores, processes, or retrieves data. The ter ...
and high availability across a relatively unreliable set of computers over a
wide area network
A wide area network (WAN) is a telecommunications network that extends over a large geographic area. Wide area networks are often established with leased telecommunication circuits.
Businesses, as well as schools and government entities, us ...
like the
Internet
The Internet (or internet) is the global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a '' network of networks'' that consists of private, pub ...
, the source node will add some level of
redundancy to each
data block.
[ This allows the system to recreate the entire block even if some nodes are temporarily unavailable (due to loss of network connectivity, the machine being powered off or a hardware failure). The most storage and ]bandwidth
Bandwidth commonly refers to:
* Bandwidth (signal processing) or ''analog bandwidth'', ''frequency bandwidth'', or ''radio bandwidth'', a measure of the width of a frequency range
* Bandwidth (computing), the rate of data transfer, bit rate or thr ...
efficient forms of redundancy use erasure coding
In coding theory, an erasure code is a forward error correction (FEC) code under the assumption of bit erasures (rather than bit errors), which transforms a message of ''k'' symbols into a longer message (code word) with ''n'' symbols such that the ...
techniques like Reed–Solomon. A simple, less CPU
A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, and ...
intensive but more expensive form of redundancy is duplicate copies.
Flexible contribution
Due to bandwidth or hardware constraints some nodes may not be able to contribute as much space as they consume in the cloud. On the other hand, nodes with large storage space and limited or no bandwidth constraints may contribute more than they consume, thereby the cooperative can stay in balance.
Examples
Examples include MIT's Chord,[ ]Filecoin
Filecoin (⨎) is an open-source, public cryptocurrency and digital payment system intended to be a blockchain-based cooperative digital storage and data retrieval method. It is made by Protocol Labs and shares some ideas from InterPlanetary F ...
, Siacoin, DeNet, Cubbit, StorJ and Wildland.
A partly centralized system was operated by Symform, Inc., a startup company
A startup or start-up is a company or project undertaken by an entrepreneur to seek, develop, and validate a scalable business model. While entrepreneurship refers to all new businesses, including self-employment and businesses that never intend t ...
based in Seattle
Seattle ( ) is a seaport city on the West Coast of the United States. It is the seat of King County, Washington. With a 2020 population of 737,015, it is the largest city in both the state of Washington and the Pacific Northwest region of ...
. Symform generated and kept the keys used to encrypt and decrypt, and since it also decided which server will host which parts of a file, users have to trust Symform not to share those with any other party or misuse the information. Symform discontinued its service on July 31, 2016.
See also
* Cloud computing
* Cyber Resilience
*Distributed data store
A distributed data store is a computer network where information is stored on more than one node, often in a replicated fashion. It is usually specifically used to refer to either a distributed database where users store information on a ''num ...
*Freenet
Freenet is a peer-to-peer platform for censorship-resistant, anonymous communication. It uses a decentralized distributed data store to keep and deliver information, and has a suite of free software for publishing and communicating on the Web ...
*InterPlanetary File System
The InterPlanetary File System (IPFS) is a protocol, hypermedia and file sharing peer-to-peer network for storing and sharing data in a distributed file system. IPFS uses content-addressing to uniquely identify each file in a global namespac ...
*Tahoe-LAFS
Tahoe-LAFS (Tahoe Least-Authority File Store) is a free and open, secure, decentralized, fault-tolerant, distributed data store and distributed file system. It can be used as an online backup system, or to serve as a file or Web host similar to ...
References
{{Reflist
Cloud storage
Cloud computing
Distributed data storage