In
mathematics
Mathematics is an area of knowledge that includes the topics of numbers, formulas and related structures, shapes and the spaces in which they are contained, and quantities and their changes. These topics are represented in modern mathematics ...
and
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, in the field of
coding theory
Coding theory is the study of the properties of codes and their respective fitness for specific applications. Codes are used for data compression, cryptography, error detection and correction, data transmission and data storage. Codes are stud ...
, the Hamming bound is a limit on the parameters of an arbitrary
block code
In coding theory, block codes are a large and important family of error-correcting codes that encode data in blocks.
There is a vast number of examples for block codes, many of which have a wide range of practical applications. The abstract definit ...
: it is also known as the sphere-packing bound or the volume bound from an interpretation in terms of
packing balls in the
Hamming metric
In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of ''substitutions'' required to ch ...
into the
space
Space is the boundless three-dimensional extent in which objects and events have relative position and direction. In classical physics, physical space is often conceived in three linear dimensions, although modern physicists usually consider ...
of all possible words. It gives an important limitation on the
efficiency
Efficiency is the often measurable ability to avoid wasting materials, energy, efforts, money, and time in doing something or in producing a desired result. In a more general sense, it is the ability to do things well, successfully, and without ...
with which any
error-correcting code
In computing, telecommunication, information theory, and coding theory, an error correction code, sometimes error correcting code, (ECC) is used for controlling errors in data over unreliable or noisy communication channels. The central idea is ...
can utilize the space in which its
code word
In communication, a code word is an element of a standardized code or protocol. Each code word is assembled in accordance with the specific rules of the code and assigned a unique meaning. Code words are typically used for reasons of reliability, ...
s are embedded. A code that attains the Hamming bound is said to be a perfect code.
Background on error-correcting codes
An original message and an encoded version are both composed in an alphabet of ''q'' letters. Each
code word
In communication, a code word is an element of a standardized code or protocol. Each code word is assembled in accordance with the specific rules of the code and assigned a unique meaning. Code words are typically used for reasons of reliability, ...
contains ''n'' letters. The original message (of length ''m'') is shorter than ''n'' letters. The message is converted into an ''n''-letter codeword by an encoding algorithm, transmitted over a noisy
channel
Channel, channels, channeling, etc., may refer to:
Geography
* Channel (geography), in physical geography, a landform consisting of the outline (banks) of the path of a narrow body of water.
Australia
* Channel Country, region of outback Austral ...
, and finally decoded by the receiver. The decoding process interprets a garbled codeword, referred to as simply a ''word'', as the valid codeword "nearest" the ''n''-letter received string.
Mathematically, there are exactly ''q''
''m'' possible messages of length ''m'', and each message can be regarded as a
vector
Vector most often refers to:
*Euclidean vector, a quantity with a magnitude and a direction
*Vector (epidemiology), an agent that carries and transmits an infectious pathogen into another living organism
Vector may also refer to:
Mathematic ...
of length ''m''. The encoding scheme converts an ''m''-dimensional vector into an ''n''-dimensional vector. Exactly ''q''
''m'' valid codewords are possible, but any one of ''q''
''n'' words can be received because the noisy channel might distort one or more of the ''n'' letters when a codeword is transmitted.
Statement of the bound
Preliminary definitions
An alphabet set
is a set of symbols with
elements. The set of strings of length
on the alphabet set
are denoted
. (There are
distinct strings in this set of strings.) A
-ary block code of length
is a subset of the strings of
, where the alphabet set
is any alphabet set having
elements. (The choice of alphabet set
makes no difference to the result, provided the alphabet is of size
; for example, any block code defined using the alphabet
can be converted to one using the alphabet
by a simple
substitution cipher
In cryptography, a substitution cipher is a method of encrypting in which units of plaintext are replaced with the ciphertext, in a defined manner, with the help of a key; the "units" may be single letters (the most common), pairs of letters, trip ...
and vice versa. E.G. for the block code of length ''2'', the codewords
and
could be converted to the codewords
and
and vice versa. More than one such substitution cipher is possible, but the differences are irrelevant to the bound and its proof.)
Defining the bound
Let
denote the maximum possible size of a
-ary block code
of length
and minimum
Hamming distance
In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of ''substitutions'' required to chan ...
between elements of the block code (necessarily positive for
).
Then, the Hamming bound is:
:
where
:
Proof
It follows from the definition of
that if at most
:
errors are made during transmission of a
codeword
In communication, a code word is an element of a standardized code or protocol. Each code word is assembled in accordance with the specific rules of the code and assigned a unique meaning. Code words are typically used for reasons of reliability, ...
then
minimum distance decoding
In coding theory, decoding is the process of translating received messages into codewords of a given code. There have been many common methods of mapping messages to codewords. These are often used to recover messages sent over a noisy channel, suc ...
will decode it correctly (i.e., it decodes the received word as the codeword that was sent). Thus the code is said to be capable of correcting
errors.
For each codeword
, consider a
ball
A ball is a round object (usually spherical, but can sometimes be ovoid) with several uses. It is used in ball games, where the play of the game follows the state of the ball as it is hit, kicked or thrown by players. Balls can also be used f ...
of fixed radius
around
. Every pair of these balls (Hamming spheres) are non-intersecting by the
-error-correcting property. Let
be the number of words in each ball (in other words, the volume of the ball). A word that is in such a ball can deviate in at most
components from those of the ball's
centre
Center or centre may refer to:
Mathematics
*Center (geometry), the middle of an object
* Center (algebra), used in various contexts
** Center (group theory)
** Center (ring theory)
* Graph center, the set of all vertices of minimum eccentricity ...
, which is a codeword. The number of such words is then obtained by
choosing up to
of the
components of a codeword to deviate to one of
possible other values (recall, the code is
-ary: it takes values in
). Thus,
:
is the (maximum) total number of codewords in
, and so, by the definition of
, the greatest number of balls with no two balls having a word in common. Taking the
union
Union commonly refers to:
* Trade union, an organization of workers
* Union (set theory), in mathematics, a fundamental operation on sets
Union may also refer to:
Arts and entertainment
Music
* Union (band), an American rock group
** ''Un ...
of the words in these balls centered at codewords, results in a set of words, each counted precisely once, that is a subset of
(where
words) and so:
:
Whence:
:
Covering radius and packing radius
:
For an
code ''C'' (a subset of
), the ''covering radius'' of ''C'' is the smallest value of ''r'' such that every element of
is contained in at least one ball of radius ''r'' centered at each codeword of ''C''. The ''packing radius'' of ''C'' is the largest value of ''s'' such that the set of balls of radius ''s'' centered at each codeword of ''C'' are mutually disjoint.
From the proof of the Hamming bound, it can be seen that for
, we have:
:: ''s'' ≤ ''t'' and ''t'' ≤ ''r''.
Therefore, ''s'' ≤ ''r'' and if equality holds then ''s'' = ''r'' = ''t''. The case of equality means that the Hamming bound is attained.
Perfect codes
Codes that attain the Hamming bound are called perfect codes. Examples include codes that have only one codeword, and codes that are the whole of
. Another example is given by the ''repeat codes'', where each symbol of the message is repeated an odd fixed number of times to obtain a codeword where ''q'' = 2. All of these examples are often called the ''trivial'' perfect codes.
In 1973, Tietäväinen proved that any non-trivial perfect code over a prime-power alphabet has the parameters of a
Hamming code
In computer science and telecommunication, Hamming codes are a family of linear error-correcting codes. Hamming codes can detect one-bit and two-bit errors, or correct one-bit errors without detection of uncorrected errors. By contrast, the sim ...
or a
Golay code.
A perfect code may be interpreted as one in which the balls of Hamming radius ''t'' centered on codewords exactly fill out the space (''t'' is the covering radius = packing radius). A quasi-perfect code is one in which the balls of Hamming radius ''t'' centered on codewords are disjoint and the balls of radius ''t''+1 cover the space, possibly with some overlaps. Another way to say this is that a code is ''quasi-perfect'' if its covering radius is one greater than its packing radius.
See also
*
Griesmer bound In the mathematics of coding theory, the Griesmer bound, named after James Hugo Griesmer, is a bound on the length of linear binary codes of dimension ''k'' and minimum distance ''d''.
There is also a very similar version for non-binary codes.
St ...
*
Singleton bound
In coding theory, the Singleton bound, named after Richard Collom Singleton, is a relatively crude upper bound on the size of an arbitrary block code C with block length n, size M and minimum distance d. It is also known as the Joshibound. proved b ...
*
Gilbert-Varshamov bound
*
Plotkin bound
In the mathematics of coding theory, the Plotkin bound, named after Morris Plotkin, is a limit (or bound) on the maximum possible number of codewords in binary codes of given length ''n'' and given minimum distance ''d''.
Statement of the bound ...
*
Johnson bound In applied mathematics, the Johnson bound (named after Selmer Martin Johnson) is a limit on the size of error-correcting codes, as used in coding theory for data transmission or communications.
Definition
Let C be a ''q''-ary code of length n, i ...
*
Rate-distortion theory
Notes
References
*
*
*
*
*
*
*
*
{{Packing problem
Coding theory