In
computer science
Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, consistent hashing
is a special kind of
hashing
Hash, hashes, hash mark, or hashing may refer to:
Substances
* Hash (food), a coarse mixture of ingredients, often based on minced meat
* Hash (stew), a pork and onion-based gravy found in South Carolina
* Hash, a nickname for hashish, a cannab ...
technique such that when a
hash table
In computer science, a hash table is a data structure that implements an associative array, also called a dictionary or simply map; an associative array is an abstract data type that maps Unique key, keys to Value (computer science), values. ...
is resized, only
keys need to be remapped on average where
is the number of keys and
is the number of slots. In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped because the mapping between the keys and the slots is defined by a
modular operation.
Consistent hashing evenly distributes cache keys across
shards, even if some of the shards crash or become unavailable.
History
The term "consistent hashing" was introduced by
David Karger ''et al.'' at
MIT
The Massachusetts Institute of Technology (MIT) is a private research university in Cambridge, Massachusetts, United States. Established in 1861, MIT has played a significant role in the development of many areas of modern technology and sc ...
for use in
distributed caching, particularly for the
web
Web most often refers to:
* Spider web, a silken structure created by the animal
* World Wide Web or the Web, an Internet-based hypertext system
Web, WEB, or the Web may also refer to:
Computing
* WEB, a literate programming system created by ...
. This academic paper from 1997 in
Symposium on Theory of Computing
The Annual ACM Symposium on Theory of Computing (STOC) is an academic conference in the field of theoretical computer science. STOC has been organized annually since 1969, typically in May or June; the conference is sponsored by the Association for ...
introduced the term "consistent hashing" as a way of distributing requests among a changing population of web servers. Each slot is then represented by a server in a distributed system or cluster. The addition of a server and the removal of a server (during scalability or outage) requires only
items to be re-shuffled when the number of slots (i.e. servers) change. The authors mention
linear hashing Linear hashing (LH) is a dynamic data structure which implements a hash table and grows or shrinks one bucket at a time. It was invented by Witold Litwin in 1980.
It has been analyzed by Baeza-Yates and Soza-Pollman. It is the first in a number ...
and its ability to handle sequential server addition and removal, while consistent hashing allows servers to be added and removed in an arbitrary order.
The paper was later re-purposed to address technical challenge of keeping track of a file in
peer-to-peer networks
Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network, forming a peer-to-peer network of Node ...
such as a
distributed hash table
A distributed hash table (DHT) is a Distributed computing, distributed system that provides a lookup service similar to a hash table. Key–value pairs are stored in a DHT, and any participating node (networking), node can efficiently retrieve the ...
.
Teradata
Teradata Corporation is an American software company that provides cloud database and Analytics, analytics-related software, products, and services. The company was formed in 1979 in Brentwood, California, as a collaboration between researchers a ...
used this technique in their distributed database, released in 1986, although they did not use this term. Teradata still uses the concept of a
hash table
In computer science, a hash table is a data structure that implements an associative array, also called a dictionary or simply map; an associative array is an abstract data type that maps Unique key, keys to Value (computer science), values. ...
to fulfill exactly this purpose.
Akamai Technologies
Akamai Technologies, Inc. is an American company specialized in content delivery networkJ. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman, and B. Weihl. (CDN), cybersecurity, DDoS mitigation, and cloud services. It is headquartered in ...
was founded in 1998 by the scientists
Daniel Lewin
Daniel Mark Lewin (; May 14, 1970 – September 11, 2001) was an American-Israeli mathematician and entrepreneur who co-founded Akamai Technologies. A passenger on board American Airlines Flight 11, it is believed that Lewin was stabbed to death ...
and
F. Thomson Leighton (co-authors of the article coining "consistent hashing"). In Akamai's content delivery network, consistent hashing is used to balance the load within a cluster of servers, while a
stable marriage algorithm is used to balance load across clusters.
Consistent hashing has also been used to reduce the impact of partial system failures in large web applications to provide robust caching without incurring the system-wide fallout of a failure.
Consistent hashing is also the cornerstone of
distributed hash table
A distributed hash table (DHT) is a Distributed computing, distributed system that provides a lookup service similar to a hash table. Key–value pairs are stored in a DHT, and any participating node (networking), node can efficiently retrieve the ...
s (DHTs), which employ hash values to partition a keyspace across a distributed set of nodes, then construct an overlay network of connected nodes that provide efficient node retrieval by key.
Rendezvous hashing
Rendezvous or highest random weight (HRW) hashing is an algorithm that allows clients to achieve distributed agreement on a set of k options out of a possible set of n options. A typical application is when clients need to agree on which sites (or ...
, designed in 1996, is a simpler and more general technique . It achieves the goals of consistent hashing using the very different highest random weight (HRW) algorithm.
Basic technique

In the problem of
load balancing, for example, when a
BLOB has to be assigned to one of
servers on a
cluster
may refer to:
Science and technology Astronomy
* Cluster (spacecraft), constellation of four European Space Agency spacecraft
* Cluster II (spacecraft), a European Space Agency mission to study the magnetosphere
* Asteroid cluster, a small ...
, a standard hash function could be used in such a way that we calculate the hash value for that BLOB, assuming the resultant value of the hash is
, we perform
modular operation with the number of servers (
in this case) to determine the server in which we can place the BLOB:
; hence the BLOB will be placed in the server whose
is successor of
in this case. However, when a server is added or removed during outage or scaling (when
changes), all the BLOBs in every server should be reassigned and moved due to
rehashing, but this operation is expensive.
Consistent hashing was designed to avoid the problem of having to reassign every BLOB when a server is added or removed throughout the cluster. The central idea is to use a hash function that maps both the BLOB and servers to a unit circle, usually
radians. For example,
(where
is hash of a BLOB or server's identifier, like
IP address
An Internet Protocol address (IP address) is a numerical label such as that is assigned to a device connected to a computer network that uses the Internet Protocol for communication. IP addresses serve two main functions: network interface i ...
or
UUID
A Universally Unique Identifier (UUID) is a 128-bit nominal number, label used to uniquely identify objects in computer systems. The term Globally Unique Identifier (GUID) is also used, mostly in Microsoft systems.
When generated according to the ...
). Each BLOB is then assigned to the next server that appears on the circle in clockwise order. Usually,
binary search algorithm
In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the ...
or
linear search
In computer science, linear search or sequential search is a method for finding an element within a list. It sequentially checks each element of the list until a match is found or the whole list has been searched.
A linear search runs in linea ...
is used to find a "spot" or server to place that particular BLOB in
or
complexities respectively; and in every iteration, which happens in clockwise manner, an operation
(where
is the value of the server within the cluster) is performed to find the server to place the BLOB. This provides an even distribution of BLOBs to servers. But, more importantly, if a server fails and is removed from the circle, only the BLOBs that were mapped to the failed server need to be reassigned to the next server in clockwise order. Likewise, if a new server is added, it is added to the unit circle, and only the BLOBs mapped to that server need to be reassigned.
Importantly, when a server is added or removed, the vast majority of the BLOBs maintain their prior server assignments, and the addition of
server only causes
fraction of the BLOBs to relocate. Although the process of moving BLOBs across cache servers in the cluster depends on the context, commonly, the newly added cache server identifies its "predecessor" and moves all the BLOBs, whose mapping belongs to this server (i.e. whose hash value is less than that of the new server), from it. However, in the case of
web page caches, in most implementations there is no involvement of moving or copying, assuming the cached BLOB is small enough. When a request hits a newly added cache server, a
cache miss
In computing, a cache ( ) is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsew ...
happens and a request to the actual
web server
A web server is computer software and underlying Computer hardware, hardware that accepts requests via Hypertext Transfer Protocol, HTTP (the network protocol created to distribute web content) or its secure variant HTTPS. A user agent, co ...
is made and the BLOB is cached locally for future requests. The redundant BLOBs on the previously used cache servers would be removed as per the
cache eviction policies.
Implementation
Let
and
be the hash functions used for the BLOB and server's unique identifier respectively. In practice, a
binary search tree
In computer science, a binary search tree (BST), also called an ordered or sorted binary tree, is a Rooted tree, rooted binary tree data structure with the key of each internal node being greater than all the keys in the respective node's left ...
(BST) is used to dynamically maintain the
within a cluster or hashring, and to find the successor or minimum within the BST,
tree traversal
In computer science, tree traversal (also known as tree search and walking the tree) is a form of graph traversal and refers to the process of visiting (e.g. retrieving, updating, or deleting) each node in a Tree (data structure), tree data stru ...
is used.
;Inserting
into the cluster
:Let
be the hash value of a BLOB such that,
where
and
. To insert
, find the successor of
in the BST of
s. If
is larger than all of the
s, the BLOB is placed in the server with smallest
value.
;Deleting
from the cluster
:Find the successor of
in the BST, remove the BLOB from the returned
. If
has no successor, remove the BLOB from the smallest of the
s.
;Insert a server into cluster
:Let
be the hash value of a server's identifier such that,
where
and
. Move all the BLOBs, whose hash value is smaller than
, from the server whose
is successor of
. If
is largest of all the
s, move the relevant BLOBs from the smallest of the
s into
.
;Delete a server from cluster
:Find the successor of
in the BST, move the BLOBs from
into its successor server. If
doesn't have a successor, move the BLOBs into the smallest of the
s.
Variance reduction
To avoid
skewness
In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive, zero, negative, or undefined.
For a unimodal ...
of multiple nodes within the radian, which happen due to lack of
uniform distribution of the servers within the cluster, multiple labels are used. Those duplicate labels are called "virtual nodes" i.e. multiple labels which point to a single "real" label or server within the cluster. The amount of virtual nodes or duplicate labels used for a particular server within a cluster is called the "weight" of that particular server.
Practical extensions
A number of extensions to the basic technique are needed for effectively using consistent hashing for load balancing in practice. In the basic scheme above, if a server fails, all its BLOBs are reassigned to the next server in clockwise order, potentially doubling the load of that server. This may not be desirable. To ensure a more even redistribution of BLOBs on server failure, each server can be hashed to multiple locations on the unit circle. When a server fails, the BLOBs assigned to each of its replicas on the unit circle will get reassigned to a different server in clockwise order, thus redistributing the BLOBs more evenly. Another extension concerns a situation where a single BLOB gets "hot" and is accessed a large number of times and will have to be hosted in multiple servers. In this situation, the BLOB may be assigned to multiple contiguous servers by traversing the unit circle in clockwise order. A more complex practical consideration arises when two BLOBs are hashed near each other in the unit circle and both get "hot" at the same time. In this case, both BLOBs will use the same set of contiguous servers in the unit circle. This situation can be ameliorated by each BLOB choosing a different hash function for mapping servers to the unit circle.
Comparison with rendezvous hashing and other alternatives
Rendezvous hashing
Rendezvous or highest random weight (HRW) hashing is an algorithm that allows clients to achieve distributed agreement on a set of k options out of a possible set of n options. A typical application is when clients need to agree on which sites (or ...
, designed in 1996, is a simpler and more general technique, and permits fully distributed agreement on a set of
options out of a possible set of
options.
It can in fact be shown that consistent hashing is a special case of rendezvous hashing. Because of its simplicity and generality, rendezvous hashing is now being used in place of Consistent Hashing in many applications.
If key values will always increase
monotonic
In mathematics, a monotonic function (or monotone function) is a function between ordered sets that preserves or reverses the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of ord ...
ally, an alternative approach using a
hash table with monotonic keys may be more suitable than consistent hashing.
Complexity
The
is an average cost for redistribution of keys and the
complexity for consistent hashing comes from the fact that a
binary search
In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the m ...
among nodes angles is required to find the next node on the ring.
Examples
Known examples of consistent hashing use include:
*
Couchbase
Couchbase Server, originally known as Membase, is a source-available, distributed ( shared-nothing architecture) multi-model NoSQL document-oriented database software package optimized for interactive applications. These applications may serv ...
automated data partitioning
*
OpenStack's Object Storage Service Swift
* Partitioning component of Amazon's storage system
Dynamo
"Dynamo Electric Machine" (end view, partly section, )
A dynamo is an electrical generator that creates direct current using a commutator. Dynamos employed electromagnets for self-starting by using residual magnetic field left in the iron cores ...
* Data partitioning in
Apache Cassandra
Apache Cassandra is a free and open-source software, free and open-source database management system designed to handle large volumes of data across multiple Commodity computing, commodity servers. The system prioritizes availability and scalab ...
* Data partitioning in
ScyllaDB
ScyllaDB is a source-available distributed NoSQL wide-column data store. It was designed to be compatible with Apache Cassandra while achieving significantly higher throughputs and lower latencies. It supports the same protocols as Cassandra ( CQ ...
* Data partitioning in
Voldemort
Lord Voldemort ( , in the films) is a fictional character and the main antagonist in the ''Harry Potter'' series of novels by J. K. Rowling. He first appears in '' Harry Potter and the Philosopher's Stone'' (1997) and returns either in pe ...
*
Akka
Akka or AKKA may refer to:
Arts and entertainment
* Akka (film), ''Akka'' (film), a 1976 Indian Tamil film
* Akka (TV series), ''Akka'' (TV series), a 2014–2015 Indian Tamil soap opera
* Akka, a character in the children's novel ''The Wonderful ...
's consistent hashing router
*
Riak
Riak (pronounced "ree-ack" ) is a distributed NoSQL key-value data store that offers high availability, fault tolerance, operational simplicity, and scalability. Riak moved to an entirely open-source project in August 2017, with many of the ...
, a distributed key-value database
*
Gluster
Gluster Inc. (formerly known as Z RESEARCH) was a software company that provided an open source platform for scale-out public and private cloud storage. The company was privately funded and headquartered in Sunnyvale, California, with an engineer ...
, a network-attached storage file system
*
Akamai content delivery network
*
Discord
Discord is an instant messaging and Voice over IP, VoIP social platform which allows communication through Voice over IP, voice calls, Videotelephony, video calls, text messaging, and digital media, media. Communication can be private or take ...
chat application
* Load balancing
gRPC
gRPC (acronym for gRPC Remote Procedure Calls) is a cross-platform high-performance remote procedure call (RPC) framework. gRPC was initially created by Google, but is open source and is used in many organizations. Use cases range from microservi ...
requests to a distributed cache in SpiceDB
*
Chord algorithm
[
]
*
MinIO object storage system
References
Works cited
*
*
External links
Understanding Consistent hashingConsistent hashing by Michael Nielsen on June 3, 2009Consistent Hashing, Danny Lewin, and the Creation of Akamai*
Jump Consistent Hashing: A Fast, Minimal Memory, Consistent Hash Algorithm
Rendezvous Hashing: an alternative to Consistent Hashing* Implementations in various languages:
*
C*
*
C#*
Erlang*
Go*
*
PHP*
Ruby*
Python*
Python (again)*
Perl*
Perl6
{{DEFAULTSORT:Consistent hashing
Hashing