Voldemort (distributed data store)

Voldemort is a distributed data store that was designed as a key-value store and used by LinkedIn for highly scalable storage. It is named after Lord Voldemort, the fictional villain of the ''Harry Potter'' novels.


Overview

Voldemort does not try to satisfy arbitrary relations or the ACID properties; rather, it is a big, distributed, persistent hash table. A 2012 study comparing systems for storing application performance management data reported that Voldemort, Apache Cassandra, and HBase all offered linear scalability in most cases, with Voldemort having the lowest latency and Cassandra the highest throughput. In the parlance of Eric Brewer's CAP theorem, Voldemort is an AP-type system: it favors availability and partition tolerance over strong consistency. LinkedIn, Voldemort's creator and primary corporate contributor, migrated all of its systems off Voldemort as of approximately August 2018, with no replacement sponsor.
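
To make the "persistent hash table" description concrete, the following is a minimal sketch of the kind of interface such a store exposes: single-key reads and writes, with no joins and no multi-key transactions. The class name `ToyKeyValueStore` and its methods are hypothetical illustrations, not Voldemort's client API, and the sketch keeps data in one in-memory dictionary rather than distributing or persisting it.

```python
from typing import Dict, Optional


class ToyKeyValueStore:
    """Hypothetical in-memory stand-in for a key-value store (not Voldemort's API)."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def put(self, key: bytes, value: bytes) -> None:
        # Each write touches exactly one key; there is no multi-key transaction.
        self._data[key] = value

    def get(self, key: bytes) -> Optional[bytes]:
        # Lookups are by exact key only; there are no joins or range queries.
        return self._data.get(key)

    def delete(self, key: bytes) -> bool:
        # Returns True if the key existed and was removed, False otherwise.
        return self._data.pop(key, None) is not None


store = ToyKeyValueStore()
store.put(b"member:42", b'{"name": "Alice"}')
```

Anything beyond a lookup by key, such as a secondary index or a join, would have to be built by the application on top of these three operations.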


Properties

Voldemort uses in-memory caching to eliminate a separate caching tier. Its storage layer can be emulated. Reads and writes scale horizontally. Data replication and placement are decided by the API, which accommodates a wide range of application-specific strategies (see "Serving Large-scale Batch Computed Data with Project Voldemort"). Voldemort supports pluggable placement strategies for distribution across data centers. Data is automatically replicated across servers, and it is partitioned, meaning that a single server contains only a portion of the total data. Each node is independent of the others, so there is no central point of failure. Pluggable serialization allows rich keys and values, including lists and tuples with named fields, as well as integration with common serialization frameworks such as Avro, Java Serialization, Protocol Buffers, and Thrift. Server failures are handled transparently. Data items are versioned, which helps maximize data integrity.
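
The partitioning and replication described above can be sketched as follows. This is an illustrative assumption about one common approach (hash the key to a partition, then walk a ring of partitions to pick distinct replica nodes), not a reproduction of Voldemort's routing code. The function names, the choice of MD5, and the three-node cluster layout are all hypothetical.

```python
import hashlib
from typing import Dict, List


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition id using a stable hash (MD5 chosen arbitrarily)."""
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def replica_nodes(
    key: bytes, partition_owner: Dict[int, str], replication_factor: int
) -> List[str]:
    """Walk the partition ring from the key's partition, collecting distinct nodes."""
    num_partitions = len(partition_owner)
    start = partition_for(key, num_partitions)
    nodes: List[str] = []
    for step in range(num_partitions):
        owner = partition_owner[(start + step) % num_partitions]
        if owner not in nodes:
            nodes.append(owner)
        if len(nodes) == replication_factor:
            break
    return nodes


# Hypothetical cluster: 8 partitions assigned round-robin to 3 nodes.
NODES = ["node-a", "node-b", "node-c"]
PARTITION_OWNER = {p: NODES[p % len(NODES)] for p in range(8)}

replicas = replica_nodes(b"member:42", PARTITION_OWNER, replication_factor=2)
```

Because each node owns only some partitions, it stores only part of the data, and because every key maps to more than one node, the loss of a single server leaves another copy reachable. A different placement strategy, for example one that spreads replicas across data centers, could be swapped in by changing how `replica_nodes` walks the ring.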


See also

* Distributed data store
* NoSQL
* Riak
* Redis


References


External links


Project Voldemort - A distributed database

Project Voldemort Real Time Discussions