coding theory Coding theory is the study of the properties of codes and their respective fitness for specific applications. Codes are used for data compression, cryptography, error detection and correction, data transmission and data storage. Codes are stud ...

, Hamming(7,4) is a linear error-correcting code that encodes four

bit The bit is the most basic unit of information in computing and digital communications. The name is a portmanteau of binary digit. The bit represents a logical state with one of two possible values. These values are most commonly represente ...

s of data into seven bits by adding three

parity bit A parity bit, or check bit, is a bit added to a string of binary code. Parity bits are a simple form of error detecting code. Parity bits are generally applied to the smallest units of a communication protocol, typically 8-bit octets (bytes) ...

s. It is a member of a larger family of

Hamming code In computer science and telecommunication, Hamming codes are a family of linear error-correcting codes. Hamming codes can detect one-bit and two-bit errors, or correct one-bit errors without detection of uncorrected errors. By contrast, the sim ...

s, but the term ''Hamming code'' often refers to this specific code that

Richard W. Hamming Richard Wesley Hamming (February 11, 1915 – January 7, 1998) was an American mathematician whose work had many implications for computer engineering and telecommunications. His contributions include the Hamming code (which makes use of a Ham ...

introduced in 1950. At the time, Hamming worked at

Bell Telephone Laboratories Nokia Bell Labs, originally named Bell Telephone Laboratories (1925–1984), then AT&T Bell Laboratories (1984–1996) and Bell Labs Innovations (1996–2007), is an American industrial research and scientific development company owned by mul ...

and was frustrated with the error-prone

punched card A punched card (also punch card or punched-card) is a piece of stiff paper that holds digital data represented by the presence or absence of holes in predefined positions. Punched cards were once common in data processing applications or to di ...

reader, which is why he started working on error-correcting codes. The Hamming code adds three additional check bits to every four data bits of the message. Hamming's (7,4)

algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algorithms are used as specificat ...

can correct any single-bit error, or detect all single-bit and two-bit errors. In other words, the minimal

Hamming distance In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of ''substitutions'' required to chan ...

between any two correct codewords is 3, and received words can be correctly decoded if they are at a distance of at most one from the codeword that was transmitted by the sender. This means that for transmission medium situations where burst errors do not occur, Hamming's (7,4) code is effective (as the medium would have to be extremely noisy for two out of seven bits to be flipped). In

quantum information Quantum information is the information of the state of a quantum system. It is the basic entity of study in quantum information theory, and can be manipulated using quantum information processing techniques. Quantum information refers to both th ...

, the Hamming (7,4) is used as the base for the

Steane code The Steane code is a tool in quantum error correction introduced by Andrew Steane in 1996. It is a CSS code (Calderbank-Shor-Steane), using the classical binary ,4,3Hamming code to correct for qubit flip errors (X errors) and the dual of the Hamm ...

, a type of

CSS code In quantum error correction, CSS codes, named after their inventors, Robert Calderbank, Peter Shor and Andrew Steane, are a special type of stabilizer code constructed from classical codes with some special properties. An example of a CSS code ...

used for

quantum error correction Quantum error correction (QEC) is used in quantum computing to protect quantum information from errors due to decoherence and other quantum noise. Quantum error correction is theorised as essential to achieve fault tolerant quantum computing that ...

Goal

The goal of the Hamming codes is to create a set of

s that overlap so that a single-bit error in a data bit ''or'' a parity bit can be detected and corrected. While multiple overlaps can be created, the general method is presented in

Hamming codes In computer science and telecommunication, Hamming codes are a family of linear error-correcting codes. Hamming codes can detect one-bit and two-bit errors, or correct one-bit errors without detection of uncorrected errors. By contrast, the sim ...

. : This table describes which parity bits cover which transmitted bits in the encoded word. For example, ''p''₂ provides an even parity for bits 2, 3, 6, and 7. It also details which transmitted bit is covered by which parity bit by reading the column. For example, ''d''₁ is covered by ''p''₁ and ''p''₂ but not ''p''₃ This table will have a striking resemblance to the parity-check matrix (H) in the next section. Furthermore, if the parity columns in the above table were removed : then resemblance to rows 1, 2, and 4 of the code generator matrix (G) below will also be evident. So, by picking the parity bit coverage correctly, all errors with a Hamming distance of 1 can be detected and corrected, which is the point of using a Hamming code.

Hamming matrices

Hamming codes can be computed in

linear algebra Linear algebra is the branch of mathematics concerning linear equations such as: :a_1x_1+\cdots +a_nx_n=b, linear maps such as: :(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n, and their representations in vector spaces and through matrices. ...

terms through

matrices Matrix most commonly refers to: * ''The Matrix'' (franchise), an American media franchise ** ''The Matrix'', a 1999 science-fiction action film ** "The Matrix", a fictional setting, a virtual reality environment, within ''The Matrix'' (franchis ...

because Hamming codes are

linear code In coding theory, a linear code is an error-correcting code for which any linear combination of codewords is also a codeword. Linear codes are traditionally partitioned into block codes and convolutional codes, although turbo codes can be seen as ...

s. For the purposes of Hamming codes, two Hamming matrices can be defined: the code

generator matrix In coding theory, a generator matrix is a matrix whose rows form a basis for a linear code. The codewords are all of the linear combinations of the rows of this matrix, that is, the linear code is the row space of its generator matrix. Terminolog ...

G and the

parity-check matrix In coding theory, a parity-check matrix of a linear block code ''C'' is a matrix which describes the linear relations that the components of a codeword must satisfy. It can be used to decide whether a particular vector is a codeword and is also used ...

H: :

\mathbf := \begin
 1 & 1 & 0 & 1 \\
 1 & 0 & 1 & 1 \\
 1 & 0 & 0 & 0 \\
 0 & 1 & 1 & 1 \\
 0 & 1 & 0 & 0 \\
 0 & 0 & 1 & 0 \\
 0 & 0 & 0 & 1 \\
\end, \qquad \mathbf := \begin
 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
\end.

As mentioned above, rows 1, 2, and 4 of G should look familiar as they map the data bits to their parity bits: * ''p''₁ covers ''d''₁, ''d''₂, ''d''₄ * ''p''₂ covers ''d''₁, ''d''₃, ''d''₄ * ''p''₃ covers ''d''₂, ''d''₃, ''d''₄ The remaining rows (3, 5, 6, 7) map the data to their position in encoded form and there is only 1 in that row so it is an identical copy. In fact, these four rows are

linearly independent In the theory of vector spaces, a set of vectors is said to be if there is a nontrivial linear combination of the vectors that equals the zero vector. If no such linear combination exists, then the vectors are said to be . These concepts are ...

and form the

identity matrix In linear algebra, the identity matrix of size n is the n\times n square matrix with ones on the main diagonal and zeros elsewhere. Terminology and notation The identity matrix is often denoted by I_n, or simply by I if the size is immaterial o ...

(by design, not coincidence). Also as mentioned above, the three rows of H should be familiar. These rows are used to compute the syndrome vector at the receiving end and if the syndrome vector is the

null vector In mathematics, given a vector space ''X'' with an associated quadratic form ''q'', written , a null vector or isotropic vector is a non-zero element ''x'' of ''X'' for which . In the theory of real number, real bilinear forms, definite quadrat ...

(all zeros) then the received word is error-free; if non-zero then the value indicates which bit has been flipped. The four data bits — assembled as a vector p — is pre-multiplied by G (i.e., Gp) and taken

modulo In computing, the modulo operation returns the remainder or signed remainder of a division, after one number is divided by another (called the '' modulus'' of the operation). Given two positive numbers and , modulo (often abbreviated as ) is t ...

2 to yield the encoded value that is transmitted. The original 4 data bits are converted to seven bits (hence the name "Hamming(7,4)") with three parity bits added to ensure even parity using the above data bit coverages. The first table above shows the mapping between each data and parity bit into its final bit position (1 through 7) but this can also be presented in a

Venn diagram A Venn diagram is a widely used diagram style that shows the logical relation between set (mathematics), sets, popularized by John Venn (1834–1923) in the 1880s. The diagrams are used to teach elementary set theory, and to illustrate simple ...

. The first diagram in this article shows three circles (one for each parity bit) and encloses data bits that each parity bit covers. The second diagram (shown to the right) is identical but, instead, the bit positions are marked. For the remainder of this section, the following 4 bits (shown as a column vector) will be used as a running example: :

\mathbf = \begin d_1 \\ d_2 \\ d_3 \\ d_4 \\ \end = \begin 1 \\ 0 \\ 1 \\ 1 \end

Channel coding

Suppose we want to transmit this data (1011) over a noisy

communications channel A communication channel refers either to a physical transmission medium such as a wire, or to a logical connection over a multiplexed medium such as a radio channel in telecommunications and computer networking. A channel is used for informat ...

. Specifically, a

binary symmetric channel A binary symmetric channel (or BSCp) is a common communications channel model used in coding theory and information theory. In this model, a transmitter wishes to send a bit (a zero or a one), and the receiver will receive a bit. The bit will be "f ...

meaning that error corruption does not favor either zero or one (it is symmetric in causing errors). Furthermore, all source vectors are assumed to be equiprobable. We take the product of G and p, with entries modulo 2, to determine the transmitted codeword x: :

\mathbf = \mathbf \mathbf =
\begin
 1 & 1 & 0 & 1 \\
 1 & 0 & 1 & 1 \\
 1 & 0 & 0 & 0 \\
 0 & 1 & 1 & 1 \\
 0 & 1 & 0 & 0 \\
 0 & 0 & 1 & 0 \\
 0 & 0 & 0 & 1 \\
\end
\begin 1 \\ 0 \\ 1 \\ 1 \end =
\begin 2 \\ 3 \\ 1 \\ 2 \\ 0 \\ 1 \\ 1 \end =
\begin 0 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \\ 1 \end

This means that 0110011 would be transmitted instead of transmitting 1011. Programmers concerned about multiplication should observe that each row of the result is the least significant bit of the Population Count of set bits resulting from the row and column being

Bitwise AND In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral (considered as a bit string) at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operati ...

ed together rather than multiplied. In the adjacent diagram, the seven bits of the encoded word are inserted into their respective locations; from inspection it is clear that the parity of the red, green, and blue circles are even: * red circle has two 1's * green circle has two 1's * blue circle has four 1's What will be shown shortly is that if, during transmission, a bit is flipped then the parity of two or all three circles will be incorrect and the errored bit can be determined (even if one of the parity bits) by knowing that the parity of all three of these circles should be even.

Parity check

If no error occurs during transmission, then the received codeword r is identical to the transmitted codeword x: :

\mathbf = \mathbf

The receiver multiplies H and r to obtain the syndrome vector z, which indicates whether an error has occurred, and if so, for which codeword bit. Performing this multiplication (again, entries modulo 2): :

\mathbf = \mathbf\mathbf = 
\begin
 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
\end
\begin 0 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \\ 1 \end =
\begin 2 \\ 4 \\ 2 \end = 
\begin 0 \\ 0 \\ 0 \end

Since the syndrome z is the

, the receiver can conclude that no error has occurred. This conclusion is based on the observation that when the data vector is multiplied by G, a change of basis occurs into a vector subspace that is the

kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learnin ...

of H. As long as nothing happens during transmission, r will remain in the kernel of H and the multiplication will yield the null vector.

Error correction

Otherwise, suppose, we can write :

\mathbf  = \mathbf +\mathbf_i

modulo 2, where e_''i'' is the

i_

unit vector In mathematics, a unit vector in a normed vector space is a vector (often a spatial vector) of length 1. A unit vector is often denoted by a lowercase letter with a circumflex, or "hat", as in \hat (pronounced "v-hat"). The term ''direction vecto ...

, that is, a zero vector with a 1 in the

i^

, counting from 1. :

\mathbf_2 = \begin 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end

Thus the above expression signifies a single bit error in the

i^

place. Now, if we multiply this vector by H: :

\mathbf = \mathbf \left( \mathbf+\mathbf_i \right) = \mathbf + \mathbf_i

Since x is the transmitted data, it is without error, and as a result, the product of H and x is zero. Thus :

\mathbf + \mathbf_i = \mathbf + \mathbf_i = \mathbf_i

Now, the product of H with the

i^

standard basis vector picks out that column of H, we know the error occurs in the place where this column of H occurs. For example, suppose we have introduced a bit error on bit #5 :

\mathbf = \mathbf+\mathbf_5 = \begin 0 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \\ 1 \end + \begin 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 0 \\ 0 \end = \begin 0 \\ 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 1 \end

The diagram to the right shows the bit error (shown in blue text) and the bad parity created (shown in red text) in the red and green circles. The bit error can be detected by computing the parity of the red, green, and blue circles. If a bad parity is detected then the data bit that overlaps ''only'' the bad parity circles is the bit with the error. In the above example, the red and green circles have bad parity so the bit corresponding to the intersection of red and green but not blue indicates the errored bit. Now, :

\mathbf = \mathbf = \begin 1 & 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ \end 
\begin 0 \\ 1 \\ 1 \\ 0 \\ 1 \\ 1 \\ 1 \end = \begin 3 \\ 4 \\ 3 \end =  \begin 1 \\ 0 \\ 1 \end

which corresponds to the fifth column of H. Furthermore, the general algorithm used (''see Hamming code#General algorithm'') was intentional in its construction so that the syndrome of 101 corresponds to the binary value of 5, which indicates the fifth bit was corrupted. Thus, an error has been detected in bit 5, and can be corrected (simply flip or negate its value): :

\mathbf_ =  \begin 0 \\ 1 \\ 1 \\ 0 \\ \overline \\ 1 \\ 1 \end = \begin 0 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \\ 1 \end

This corrected received value indeed, now, matches the transmitted value x from above.

Decoding

Once the received vector has been determined to be error-free or corrected if an error occurred (assuming only zero or one bit errors are possible) then the received data needs to be decoded back into the original four bits. First, define a matrix R: :

\mathbf = \begin
 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
\end

Then the received value, p_r, is equal to Rr. Using the running example from above :

\mathbf = \begin
 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
\end
\begin 0 \\ 1 \\ 1 \\ 0 \\ 0 \\ 1 \\ 1  \end = \begin 1 \\ 0 \\ 1 \\ 1 \end

Multiple bit errors

It is not difficult to show that only single bit errors can be corrected using this scheme. Alternatively, Hamming codes can be used to detect single and double bit errors, by merely noting that the product of H is nonzero whenever errors have occurred. In the adjacent diagram, bits 4 and 5 were flipped. This yields only one circle (green) with an invalid parity but the errors are not recoverable. However, the Hamming (7,4) and similar Hamming codes cannot distinguish between single-bit errors and two-bit errors. That is, two-bit errors appear the same as one-bit errors. If error correction is performed on a two-bit error the result will be incorrect. Similarly, Hamming codes cannot detect or recover from an arbitrary three-bit error; Consider the diagram: if the bit in the green circle (colored red) were 1, the parity checking would return the null vector, indicating that there is no error in the codeword.

All codewords

Since the source is only 4 bits then there are only 16 possible transmitted words. Included is the eight-bit value if an extra parity bit is used (''see Hamming(7,4) code with an additional parity bit''). (The data bits are shown in blue; the parity bits are shown in red; and the extra parity bit shown in green.)

E₇ lattice

The Hamming(7,4) code is closely related to the E₇ lattice and, in fact, can be used to construct it, or more precisely, its dual lattice E₇^∗ (a similar construction for E₇ uses the

dual code In coding theory, the dual code of a linear code :C\subset\mathbb_q^n is the linear code defined by :C^\perp = \ where :\langle x, c \rangle = \sum_^n x_i c_i is a scalar product. In linear algebra terms, the dual code is the annihilator ...

,3,4sub>2). In particular, taking the set of all vectors ''x'' in Z^''7'' with ''x'' congruent (modulo 2) to a codeword of Hamming(7,4), and rescaling by 1/, gives the lattice E₇^∗ :

E_7^* = \tfrac\left\.

This is a particular instance of a more general relation between lattices and codes. For instance, the extended (8,4)-Hamming code, which arises from the addition of a parity bit, is also related to the E₈ lattice.

References

{{reflist

External links

A programming problem about the Hamming Code(7,4)
Coding theory Error detection and correction Computer arithmetic