mathematics Mathematics is an area of knowledge that includes the topics of numbers, formulas and related structures, shapes and the spaces in which they are contained, and quantities and their changes. These topics are represented in modern mathematics ...

and

computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...

, in the field of

coding theory Coding theory is the study of the properties of codes and their respective fitness for specific applications. Codes are used for data compression, cryptography, error detection and correction, data transmission and data storage. Codes are studied ...

, the Hamming bound is a limit on the parameters of an arbitrary

block code In coding theory, block codes are a large and important family of error-correcting codes that encode data in blocks. There is a vast number of examples for block codes, many of which have a wide range of practical applications. The abstract defini ...

: it is also known as the sphere-packing bound or the volume bound from an interpretation in terms of packing balls in the

Hamming metric In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of ''substitutions'' required to chang ...

into the

space Space is the boundless three-dimensional extent in which objects and events have relative position and direction. In classical physics, physical space is often conceived in three linear dimensions, although modern physicists usually consi ...

of all possible words. It gives an important limitation on the

efficiency Efficiency is the often measurable ability to avoid wasting materials, energy, efforts, money, and time in doing something or in producing a desired result. In a more general sense, it is the ability to do things well, successfully, and without ...

with which any

error-correcting code In computing, telecommunication, information theory, and coding theory, an error correction code, sometimes error correcting code, (ECC) is used for controlling errors in data over unreliable or noisy communication channels. The central idea i ...

can utilize the space in which its

code word In communication, a code word is an element of a standardized code or protocol. Each code word is assembled in accordance with the specific rules of the code and assigned a unique meaning. Code words are typically used for reasons of reliability, ...

s are embedded. A code that attains the Hamming bound is said to be a perfect code.

Background on error-correcting codes

An original message and an encoded version are both composed in an alphabet of ''q'' letters. Each

contains ''n'' letters. The original message (of length ''m'') is shorter than ''n'' letters. The message is converted into an ''n''-letter codeword by an encoding algorithm, transmitted over a noisy channel, and finally decoded by the receiver. The decoding process interprets a garbled codeword, referred to as simply a ''word'', as the valid codeword "nearest" the ''n''-letter received string. Mathematically, there are exactly ''q''^''m'' possible messages of length ''m'', and each message can be regarded as a

vector Vector most often refers to: *Euclidean vector, a quantity with a magnitude and a direction *Vector (epidemiology), an agent that carries and transmits an infectious pathogen into another living organism Vector may also refer to: Mathematic ...

of length ''m''. The encoding scheme converts an ''m''-dimensional vector into an ''n''-dimensional vector. Exactly ''q''^''m'' valid codewords are possible, but any one of ''q''^''n'' words can be received because the noisy channel might distort one or more of the ''n'' letters when a codeword is transmitted.

Statement of the bound

Preliminary definitions

An alphabet set

\mathcal_q

is a set of symbols with

q

elements. The set of strings of length

n

on the alphabet set

\mathcal_q

are denoted

\mathcal_q^n

. (There are

q^n

distinct strings in this set of strings.) A

q

-ary block code of length

n

is a subset of the strings of

\mathcal_q^n

, where the alphabet set

\mathcal_q

is any alphabet set having

q

elements. (The choice of alphabet set

\mathcal_q

makes no difference to the result, provided the alphabet is of size

q

; for example, any block code defined using the alphabet

\mathcal_3=\

can be converted to one using the alphabet

\mathcal_3=\

by a simple

substitution cipher In cryptography, a substitution cipher is a method of encrypting in which units of plaintext are replaced with the ciphertext, in a defined manner, with the help of a key; the "units" may be single letters (the most common), pairs of letters, tri ...

and vice versa. E.G. for the block code of length ''2'', the codewords

\text\text

and

\text\text

could be converted to the codewords

\text\text

and

\text\text

and vice versa. More than one such substitution cipher is possible, but the differences are irrelevant to the bound and its proof.)

Defining the bound

Let

\ A_q(n,d)

denote the maximum possible size of a

q

-ary block code

\ C

of length

n

and minimum

Hamming distance In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of ''substitutions'' required to chan ...

d

between elements of the block code (necessarily positive for

q^n > 1

). Then, the Hamming bound is: :

\ A_q(n,d) \leq \frac

where :

t=\left\lfloor\frac\right\rfloor.

Proof

It follows from the definition of

d

that if at most :

t = \left\lfloor\frac(d-1)\right\rfloor

errors are made during transmission of a

codeword In communication, a code word is an element of a standardized code or protocol. Each code word is assembled in accordance with the specific rules of the code and assigned a unique meaning. Code words are typically used for reasons of reliability, ...

then minimum distance decoding will decode it correctly (i.e., it decodes the received word as the codeword that was sent). Thus the code is said to be capable of correcting

t

errors. For each codeword

c \in C

, consider a

ball A ball is a round object (usually spherical, but can sometimes be ovoid) with several uses. It is used in ball games, where the play of the game follows the state of the ball as it is hit, kicked or thrown by players. Balls can also be used f ...

of fixed radius

t

around

c

. Every pair of these balls (Hamming spheres) are non-intersecting by the

t

-error-correcting property. Let

m

be the number of words in each ball (in other words, the volume of the ball). A word that is in such a ball can deviate in at most

t

components from those of the ball's

centre Center or centre may refer to: Mathematics *Center (geometry), the middle of an object * Center (algebra), used in various contexts ** Center (group theory) ** Center (ring theory) * Graph center, the set of all vertices of minimum eccentricity ...

, which is a codeword. The number of such words is then obtained by choosing up to

t

of the

n

components of a codeword to deviate to one of

(q-1)

possible other values (recall, the code is

q

-ary: it takes values in

\mathcal_q^n

). Thus, :

m =
 \begin
  \sum_^t \binom(q-1)^k
 \end.

A_q (n,d)

is the (maximum) total number of codewords in

C

, and so, by the definition of

t

, the greatest number of balls with no two balls having a word in common. Taking the union of the words in these balls centered at codewords, results in a set of words, each counted precisely once, that is a subset of

\mathcal_q^n

(where

, \mathcal_q^n,  = q^n

words) and so: :

A_q(n,d) \times m = A_q(n,d) \times
 \begin
  \sum_^t \binom(q-1)^k
 \end
\leq q^n.

Whence: :

A_q(n,d) \leq \frac.

Covering radius and packing radius

: For an

A_q(n,d)

code ''C'' (a subset of

\mathcal_q^n

), the ''covering radius'' of ''C'' is the smallest value of ''r'' such that every element of

\mathcal_q^n

is contained in at least one ball of radius ''r'' centered at each codeword of ''C''. The ''packing radius'' of ''C'' is the largest value of ''s'' such that the set of balls of radius ''s'' centered at each codeword of ''C'' are mutually disjoint. From the proof of the Hamming bound, it can be seen that for

t\,=\,\left\lfloor\frac(d-1)\right\rfloor

, we have: :: ''s'' ≤ ''t'' and ''t'' ≤ ''r''. Therefore, ''s'' ≤ ''r'' and if equality holds then ''s'' = ''r'' = ''t''. The case of equality means that the Hamming bound is attained.

Perfect codes

Codes that attain the Hamming bound are called perfect codes. Examples include codes that have only one codeword, and codes that are the whole of

\scriptstyle\mathcal_q^n

. Another example is given by the ''repeat codes'', where each symbol of the message is repeated an odd fixed number of times to obtain a codeword where ''q'' = 2. All of these examples are often called the ''trivial'' perfect codes. In 1973, Tietäväinen proved that any non-trivial perfect code over a prime-power alphabet has the parameters of a

Hamming code In computer science and telecommunication, Hamming codes are a family of linear error-correcting codes. Hamming codes can detect one-bit and two-bit errors, or correct one-bit errors without detection of uncorrected errors. By contrast, the sim ...

or a Golay code. A perfect code may be interpreted as one in which the balls of Hamming radius ''t'' centered on codewords exactly fill out the space (''t'' is the covering radius = packing radius). A quasi-perfect code is one in which the balls of Hamming radius ''t'' centered on codewords are disjoint and the balls of radius ''t''+1 cover the space, possibly with some overlaps. Another way to say this is that a code is ''quasi-perfect'' if its covering radius is one greater than its packing radius.

Notes

References

* * * * * * * * {{Packing problem Coding theory