In
computing
Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, ...
, a hash table, also known as hash map, is a
data structure
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, ...
that implements an
associative array or dictionary. It is an
abstract data type that maps
keys
Key or The Key may refer to:
Common meanings
* Key (cryptography), a piece of information that controls the operation of a cryptography algorithm
* Key (lock), device used to control access to places or facilities restricted by a lock
* Key (m ...
to
values.
A hash table uses a
hash function to compute an ''index'', also called a ''hash code'', into an array of ''buckets'' or ''slots'', from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.
Ideally, the hash function will assign each key to a unique bucket, but most hash table designs employ an imperfect hash function, which might cause hash ''
collisions'' where the hash function generates the same index for more than one key. Such collisions are typically accommodated in some way.
In a well-dimensioned hash table, the average time complexity for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of
key–value pairs, at
amortized constant average cost per operation.
[ Charles E. Leiserson]
''Amortized Algorithms, Table Doubling, Potential Method''
Lecture 13, course MIT 6.046J/18.410J Introduction to Algorithms—Fall 2005
Hashing is an example of a
space-time tradeoff. If
memory
Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remember ...
is infinite, the entire key can be used directly as an index to locate its value with a single memory access. On the other hand, if infinite time is available, values can be stored without regard for their keys, and a
binary search or
linear search can be used to retrieve the element.
In many situations, hash tables turn out to be on average more efficient than
search trees or any other
table lookup structure. For this reason, they are widely used in many kinds of computer
software
Software is a set of computer programs and associated documentation and data. This is in contrast to hardware, from which the system is built and which actually performs the work.
At the lowest programming level, executable code consist ...
, particularly for
associative arrays,
database indexing,
caches, and
sets.
History
The idea of hashing arose independently in different places. In January 1953,
Hans Peter Luhn wrote an internal
IBM memorandum that used hashing with chaining. Open addressing was later proposed by A. D. Linh building on Luhn's paper.
Around the same time,
Gene Amdahl,
Elaine M. McGraw,
Nathaniel Rochester, and
Arthur Samuel of
IBM Research implemented hashing for the
IBM 701
The IBM 701 Electronic Data Processing Machine, known as the Defense Calculator while in development, was IBM’s first commercial scientific computer and its first series production mainframe computer, which was announced to the public on May ...
assembler. Open addressing with linear probing is credited to Amdahl, although
Ershov independently had the same idea.
The term "open addressing" was coined by
W. Wesley Peterson on his article which discusses the problem of search in large files.
The first
published work on hashing with chaining is credited to
Arnold Dumey, who discussed the idea of using remainder module a prime as a hash function. The word "hashing" was first published by an article by Robert Morris. A
theoretical analysis of linear probing was submitted originally by Konheim and Weiss.
Overview
An
associative array stores a
set
Set, The Set, SET or SETS may refer to:
Science, technology, and mathematics Mathematics
*Set (mathematics), a collection of elements
*Category of sets, the category whose objects and morphisms are sets and total functions, respectively
Electro ...
of (key, value) pairs and allows insertion, deletion, and lookup (search), with the constraint of
unique keys. In the hash table implementation of associative arrays, an array
of length
is partially filled with
elements, where
. A value
gets stored at an index location