Riak (pronounced "ree-ack") is a distributed NoSQL key-value data store based on Amazon's Dynamo paper, including its "tunable AP" approach (that is, tunable consistency) to the tradeoffs imposed by the CAP theorem. Riak offers high availability, fault tolerance, operational simplicity, and scalability.
It became an entirely open-source project in August 2017, with many of the licensed Enterprise Edition features being incorporated after its acquisition by Bet365.
Written in Erlang, Riak has fault-tolerant data replication and automatic data distribution across the cluster for performance and resilience.
Riak was originally developed by engineers employed by Basho Technologies and maintained by them until 2017, when the rights were sold to bet365 after Basho went into receivership. Riak was originally licensed using a freemium model.
It is now completely open-source, including all the enterprise features. Riak has a pluggable backend for its core storage, with the default storage backend being Bitcask.
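The "tunable" part of the Dynamo approach is commonly described with three numbers: N replicas per key, R replicas that must answer a read, and W replicas that must acknowledge a write. The sketch below only illustrates that arithmetic. The function names are hypothetical and the code is not taken from Riak; it shows the usual rule of thumb that read and write replica sets are guaranteed to intersect when R + W > N, assuming requests reach only the N primary replicas.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Return True when any R-replica read set must share a replica
    with any W-replica write set drawn from the same N replicas."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


def majority(n: int) -> int:
    """Smallest replica count that is more than half of n."""
    return n // 2 + 1


if __name__ == "__main__":
    n = 3  # replica count, matching the default n_val described in this article
    for r, w in [(1, 1), (majority(n), majority(n)), (1, n)]:
        print(f"N={n} R={r} W={w} overlap={quorums_overlap(n, r, w)}")
```

Lower R and W values favor availability and latency, while higher values make it more likely that a read observes the most recent write.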
Main features
;Fault-tolerant availability: Riak replicates key/value data across a cluster of nodes with a default n_val of three. In the case of node outages due to network partitions or hardware failures, data can still be written to a neighboring node beyond the initial three and read back later, thanks to Riak's "masterless" peer-to-peer architecture.
;Queries: Riak provides a RESTful API over HTTP, as well as a Protocol Buffers interface, for basic PUT, GET, POST, and DELETE functions. More complex queries are also possible, including secondary indexes, search (via Apache Solr), and MapReduce. MapReduce has native support for both JavaScript (using the SpiderMonkey runtime) and Erlang.
;Predictable latency: Riak distributes data across nodes with hashing and can provide a predictable latency profile, even in the case of multiple node failures.
;Storage options: Keys/values can be stored in memory, on disk, or both.
;Multi-datacenter replication: In multi-datacenter replication, one cluster acts as a "primary cluster." The primary cluster handles replication requests from one or more "secondary clusters" (generally located in other regions or countries). If the datacenter with the primary cluster goes down, a second cluster can take over as the primary cluster.
:There are two primary modes of operation: fullsync and realtime. In fullsync mode, a complete synchronization occurs between the primary and secondary cluster(s), by default every six hours. In realtime mode, replication to the secondary data center(s) is triggered by updates to the primary data center. All multi-datacenter replication occurs over multiple concurrent TCP connections to maximize performance and network utilization.
;Tunable consistency: Each bucket can be configured for either eventual or strong consistency.
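The hashing and replication behavior described in the feature list can be pictured with a small consistent-hashing sketch. It is a simplified illustration, not Riak's implementation: the ring size of 64, the `bucket/key` string encoding, the round-robin ownership table, and the function names are all assumptions made for the example. A key is hashed with SHA-1 onto a ring split into equal partitions, and replicas go to the next `n_val` partitions clockwise, skipping partitions whose owning node is unavailable so that a later neighbor takes over.

```python
import hashlib

RING_BITS = 160        # SHA-1 produces a 160-bit value
NUM_PARTITIONS = 64    # illustrative ring size (assumption for this sketch)
N_VAL = 3              # default replica count mentioned in the text


def partition_for(bucket: str, key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a bucket/key pair to one of num_partitions equal slices of the ring."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    # Scale the 160-bit ring position down to a partition index.
    return (position * num_partitions) >> RING_BITS


def preference_list(bucket, key, owners, down=frozenset(), n_val=N_VAL):
    """Pick n_val (partition, node) pairs, walking clockwise from the key's partition.

    owners[i] is the (hypothetical) node that owns partition i. Partitions whose
    owner is in `down` are skipped, so the walk continues to later neighbors.
    """
    start = partition_for(bucket, key, len(owners))
    chosen = []
    for step in range(len(owners)):
        index = (start + step) % len(owners)
        node = owners[index]
        if node in down:
            continue
        chosen.append((index, node))
        if len(chosen) == n_val:
            break
    return chosen


if __name__ == "__main__":
    # Five hypothetical nodes own the partitions in round-robin order.
    owners = [f"node{i % 5}" for i in range(NUM_PARTITIONS)]
    healthy = preference_list("users", "alice", owners)
    degraded = preference_list("users", "alice", owners, down={healthy[0][1]})
    print("all nodes up:    ", healthy)
    print("first owner down:", degraded)
```

Because every node can compute the same placement from the key alone, no master is needed to route a request, which is consistent with the "masterless" design described above.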
Licensing and support
Riak is available for free under the Apache License 2.0. In addition, Basho Technologies offered two options for its commercial software, Riak Enterprise and Riak Enterprise Plus. Riak Enterprise Plus added baseline and annual system health checks to ensure long-term platform stability and performance.
Language support
Riak has official drivers for Ruby, Java, Erlang, and Python. There are also numerous community-supported drivers for other programming languages.
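Where no driver is available, the HTTP interface can be called directly from any language with an HTTP client. The sketch below uses only the Python standard library to build (not send) PUT, GET, and DELETE requests for a single object. The `/buckets/<bucket>/keys/<key>` path layout and port 8098 reflect Riak's commonly documented HTTP defaults but should be checked against the documentation for the version in use; the host, bucket, key, and helper names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def object_url(bucket: str, key: str, host: str = "127.0.0.1", port: int = 8098) -> str:
    """Build the URL of one object; bucket and key are percent-encoded."""
    return (f"http://{host}:{port}"
            f"/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}")


def store_request(bucket: str, key: str, value) -> Request:
    """PUT request that stores `value` as a JSON document."""
    body = json.dumps(value).encode("utf-8")
    return Request(object_url(bucket, key), data=body, method="PUT",
                   headers={"Content-Type": "application/json"})


def fetch_request(bucket: str, key: str) -> Request:
    """GET request that reads the object back."""
    return Request(object_url(bucket, key), method="GET")


def delete_request(bucket: str, key: str) -> Request:
    """DELETE request that removes the object."""
    return Request(object_url(bucket, key), method="DELETE")


def send(request: Request, timeout: float = 5.0) -> bytes:
    """Send a prepared request to a running node and return the response body."""
    with urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    req = store_request("users", "alice", {"name": "Alice"})
    print(req.get_method(), req.full_url)
    # send(req) would perform the call; it needs a reachable Riak node.
```

Building the request separately from sending it keeps the example usable without a running cluster.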
History
Riak was originally written by Andy Gross and others at Basho Technologies to power a web Sales Force Automation application by former engineers and executives from Akamai. There was more interest in the datastore technology than the applications built on it, so the company decided to build a business around Riak itself, gaining adoption throughout the Fortune 100 and becoming a foundation for many of the world's fastest-growing Web-based, mobile and social networking applications, as well as cloud service providers.
Subsequent releases include:
*1.1, released February 21, 2012, added Riaknostic, enhanced error logging and reporting, improved resiliency for large clusters, and a new graphical operations and monitoring interface called Riak Control.
*1.4, released July 10, 2013, added counters, secondary indexing improvements, reduced object storage overhead, handoff progress reporting, and enhancements to MDC replication.
*2.0, released September 2, 2014, added new data types (sets, maps, registers, and flags) to simplify application development, along with strong consistency by bucket, full-text search integration with Apache Solr, security features, and reduced replicas for secondary sites.
*2.1, released April 16, 2015, added an optimization for many write-heavy workloads – “write once” buckets – buckets whose entries are intended to be written exactly once, and never updated or over-written.
*2.2, released November 17, 2016, added support for Debian 8 and Ubuntu 16.04, and Solr integration improvements.
*At this point, parent company Basho Technologies was put into receivership and no longer maintained Riak. The assets were purchased by bet365, which open-sourced all the code, including the previously closed-source portions, allowing the release of:
*2.2.5, released April 26, 2018, is the first community release. Added support for Multi-Datacentre Replication which was not part of open-source Riak before, added a grow-only set data type, improved data distribution over nodes and cleaned up production test issues.
*NHS Digital and Bet365 have continued to fund work to develop this community release, making significant changes, including bringing version 3.0 up to date with more modern Erlang OTP versions.
*Community development goes on beyond release 3.0.
Users
Notable users include AT&T, Comcast, GitHub, Best Buy, the UK National Health Service (NHS), The Weather Channel, and Riot Games.
See also
* Basho Technologies
* Apache Accumulo
* Oracle NoSQL Database
* NoSQL
* Structured storage
* Memcached
* Redis