Ordered Key-Value Store
   HOME

TheInfoList



OR:

An Ordered Key-Value Store (OKVS) is a type of data storage paradigm that can support
multi-model database In the field of database design, a multi-model database is a database management system designed to support multiple data models against a single, integrated backend. In contrast, most database management systems are organized around a single data ...
. An OKVS is an ordered mapping of bytes to bytes. An OKVS will keep the key-value pairs sorted by the key lexicographic order. OKVS systems provides different set of features and performance trade-offs. Most of them are shipped as a library without network interfaces, in order to be embedded in another process. Most OKVS support ACID guarantees. Some OKVS are
distributed database A distributed database is a database in which data is stored across different physical locations. It may be stored in multiple computers located in the same physical location (e.g. a data centre); or maybe dispersed over a computer network, netwo ...
s. Ordered Key-Value Store found their way into many modern database systems including
NewSQL NewSQL is a class of relational database management system, relational database management systems that seek to provide the scalability of NoSQL systems for online transaction processing (OLTP) workloads while maintaining the ACID guarantees of a t ...
database systems.


History

The origin of Ordered Key-Value Store stems from the work of
Ken Thompson Kenneth Lane Thompson (born February 4, 1943) is an American pioneer of computer science. Thompson worked at Bell Labs for most of his career where he designed and implemented the original Unix operating system. He also invented the B (programmi ...
on dbm in 1979. Later in 1991,
Berkeley DB Berkeley DB (BDB) is an embedded database software library for key/value data, historically significant in open-source software. Berkeley DB is written in C with API bindings for many other programming languages. BDB stores arbitrary key/data ...
was released that featured a
B-Tree In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing fo ...
backend that allowed the keys to stay sorted. Berkeley DB was said to be very fast and made its way into various commercial product. It was included in Python standard library until 2.7. In 2009, Tokyo Cabinet was released that was superseded by Kyoto Cabinet that support both transaction and ordered keys. In 2011, LMDB was created to replace Berkeley DB in
OpenLDAP OpenLDAP is a free, open-source implementation of the Lightweight Directory Access Protocol (LDAP) developed by the OpenLDAP Project. It is released under its own BSD-style license called the OpenLDAP Public License. LDAP is a platform-independ ...
. There is also Google's
LevelDB LevelDB is an open-source on-disk key-value store written by Google fellows Jeffrey Dean and Sanjay Ghemawat. Inspired by Bigtable, LevelDB source code is hosted on GitHub under the New BSD License and has been ported to a variety of Unix-base ...
that was forked by Facebook in 2012 as
RocksDB RocksDB is a high performance embedded database for key-value data. It is a fork of Google's LevelDB optimized to exploit multi-core processors (CPUs), and make efficient use of fast storage, such as solid-state drives (SSD), for input/outpu ...
. In 2014,
WiredTiger WiredTiger is a NoSQL, open source extensible platform for data management. It is released under version 2 or 3 of the GNU General Public License. WiredTiger uses multiversion concurrency control (MVCC) architecture. Details MongoDB acquired Wired ...
, successor of Berkeley DB was acquired by
MongoDB MongoDB is a source-available, cross-platform, document-oriented database program. Classified as a NoSQL database product, MongoDB uses JSON-like documents with optional database schema, schemas. Released in February 2009 by 10gen (now MongoDB ...
and is since 2019 the primary backend of MongoDB database. Other notable implementation of the OKVS paradigm are Sophia and SQLite3 LSM extension. Another notable use of OKVS paradigm is the multi-model database system called
ArangoDB ArangoDB is a graph database system developed by ArangoDB Inc. ArangoDB is a multi-model database system since it supports three data models (graphs, JSON documents, key/value) with one database core and a unified query language AQL (ArangoDB Qu ...
based on RocksDB. Some
NewSQL NewSQL is a class of relational database management system, relational database management systems that seek to provide the scalability of NoSQL systems for online transaction processing (OLTP) workloads while maintaining the ACID guarantees of a t ...
databases are supported by Ordered Key-Value Stores. JanusGraph, a property graph database, has both a Berkeley DB backend and FoundationDB backend.


Key concepts


Lexicographic encoding

There are algorithms that encode basic data types (boolean, string, number) and composition of those data types inside sorted containers (tuple, list, vector) that preserve their natural ordering. It is possible to work with an Ordered Key-Value Store without having to work directly with bytes. In FoundationDB, it is called the tuple layer.


Range query

Inside an OKVS, keys are ordered, and because of that it is possible to do range queries. A range query retrieves all keys between two specified keys, ensuring that the fetched keys are returned in a sorted order.


Subspaces


Key composition

One can construct key spaces to build higher level abstractions. The idea is to construct keys, that takes advantage of the ordered nature of the top level key space. When taking advantage of the ordered nature of the key space, one can query ranges of keys that have particular pattern.


Denormalization

Denormalization, as in, repeating the same piece of data in multiple subspace is common practice. It allows to create secondary representation, also called indices, that will allow to speed up queries.


Higher level abstractions

The following abstraction or databases were built on top Ordered Key-Value Stores: * Timeseries database, * Record Database, also known as Row store databases, they behave similarly to what is dubbed
RDBMS A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970. A Relational Database Management System (RDBMS) is a type of database management system that stores data in a structured forma ...
, * Tuple Stores, also known as Triple Store or Quad Store but also Generic Tuple Store, * Document database, that mimics MongoDB API, * Full-text search *Geographic Information Systems *Property Graph *Versioned Data *Vector space database for Approximate Nearest Neighbor All those abstraction can co-exist with the same OKVS database and when ACID is supported, the operations happens with the guarantees offered by the transaction system.


Feature matrix


See also

*
Key–value_database A key–value database, or key–value store, is a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a ''dictionary'' or ''hash table''. Dictionaries contain a ...
*
Wide-column store A wide-column store (or extensible record store) is a type of NoSQL database.Wide Column Stores
*
Multi-model database In the field of database design, a multi-model database is a database management system designed to support multiple data models against a single, integrated backend. In contrast, most database management systems are organized around a single data ...


References

{{Database models Types of databases Database theory Databases Associative arrays Data management Data analysis NoSQL Structured storage Transaction processing