In coding theory, expander codes form a class of error-correcting codes that are constructed from bipartite expander graphs. Along with Justesen codes, expander codes are of particular interest since they have a constant positive rate, a constant positive relative distance, and a constant alphabet size. In fact, the alphabet contains only two elements, so expander codes belong to the class of binary codes. Furthermore, expander codes can be both encoded and decoded in time proportional to the block length of the code.
Expander codes
In coding theory, an expander code is a $[n, n-m]_2$ linear block code whose parity check matrix is the adjacency matrix of a bipartite expander graph. These codes have good relative distance $2(1-\varepsilon)\gamma$ (where $\varepsilon$ and $\gamma$ are properties of the expander graph as defined later), rate $1 - \tfrac{m}{n}$, and decodability (algorithms of running time $O(n)$ exist).
Definition
Let $B$ be a $(c,d)$-biregular graph between a set of $n$ nodes $\{v_1, \ldots, v_n\}$, called ''variables'', and a set of $cn/d$ nodes $\{C_1, \ldots, C_{cn/d}\}$, called ''constraints''.
Let $b(i,j)$ be a function designed so that, for each constraint $C_i$, the variables neighboring $C_i$ are $v_{b(i,1)}, \ldots, v_{b(i,d)}$.
Let $\mathcal{S}$ be an error-correcting code of block length $d$. The ''expander code'' $\mathcal{C}(B, \mathcal{S})$ is the code of block length $n$ whose codewords are the words $(x_1, \ldots, x_n)$ such that, for $1 \leq i \leq cn/d$, $(x_{b(i,1)}, \ldots, x_{b(i,d)})$ is a codeword of $\mathcal{S}$.
It has been shown that nontrivial lossless expander graphs exist. Moreover, we can explicitly construct them.
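The definition above can be sketched as a membership test. The graph below is a hypothetical toy example (6 variables of degree 2, 4 constraints of degree 3) chosen only to be biregular, not to be a good expander, and the inner code is assumed to be the even-weight code of length 3. The names `is_expander_codeword`, `constraints` and `even_parity` are illustrative.

```python
from itertools import product

def is_expander_codeword(x, constraints, inner_code):
    """True if, for every constraint, the bits of x on the neighbouring
    variables form a codeword of the inner code."""
    return all(
        tuple(x[j] for j in neighbours) in inner_code
        for neighbours in constraints
    )

# Hypothetical (c, d) = (2, 3) biregular graph: constraints[i] lists the
# variables adjacent to constraint i, so cn/d = 2 * 6 / 3 = 4 constraints.
constraints = [(0, 1, 2), (3, 4, 5), (0, 3, 4), (1, 2, 5)]

# Assumed inner code of block length d = 3: all words of even weight.
even_parity = {w for w in product((0, 1), repeat=3) if sum(w) % 2 == 0}

# Enumerate the whole code (feasible only because n = 6 is tiny).
codewords = [x for x in product((0, 1), repeat=6)
             if is_expander_codeword(x, constraints, even_parity)]
```

With the even-weight inner code, each constraint is a single parity check, which matches the view of the code as the null space of the graph's adjacency matrix.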
Rate
The rate of $C$ is its dimension divided by its block length. In this case, the parity check matrix has size $m \times n$, and hence $C$ has rate at least $(n-m)/n = 1 - m/n$.
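The bound is an inequality because parity checks may be linearly dependent. A minimal sketch, assuming a hypothetical toy graph given as a list of constraint neighbourhoods, compares the lower bound with the exact rate obtained from the rank of the parity check matrix over GF(2). The helper names `gf2_rank` and `rate_bounds` are illustrative.

```python
def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    pivots = {}  # leading bit position -> row with that leading bit
    for row in rows:
        while row:
            lead = row.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = row
                break
            row ^= pivots[lead]  # cancel the leading bit and keep reducing
    return len(pivots)

def rate_bounds(constraints, n):
    """Return (lower bound 1 - m/n, exact rate (n - rank)/n)."""
    m = len(constraints)
    rows = [sum(1 << j for j in neighbours) for neighbours in constraints]
    return 1 - m / n, (n - gf2_rank(rows)) / n

# Hypothetical toy graph: 6 variables, 4 parity constraints.
constraints = [(0, 1, 2), (3, 4, 5), (0, 3, 4), (1, 2, 5)]
lower, exact = rate_bounds(constraints, n=6)
```

In this toy graph the four checks sum to zero, so only three are independent and the exact rate exceeds the bound.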
Distance
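Suppose $\varepsilon < \tfrac{1}{2}$. Then the distance of a $(n, m, d, \gamma, 1-\varepsilon)$ expander code $C$ is at least $2(1-\varepsilon)\gamma n$. Here the graph has a set $L$ of $n$ variable vertices, each of degree $d$, a set $R$ of $m$ constraint vertices, and the expansion property that every $S \subset L$ with $|S| \leq \gamma n$ has at least $d(1-\varepsilon)|S|$ neighbors in $R$.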
Proof
Note that we can consider every codeword $c$ in $C$ as a subset of vertices $S \subset L$, by saying that vertex $v_i \in S$ if and only if the $i$th index of the codeword is a 1. Then $c$ is a codeword iff every vertex $v \in R$ is adjacent to an even number of vertices in $S$. (In order to be a codeword, $cP = 0$, where $P$ is the parity check matrix. Each vertex in $R$ corresponds to a column of $P$. Matrix multiplication over $\text{GF}(2) = \{0,1\}$ then gives the desired result.) So, if a vertex $v \in R$ is adjacent to a single vertex in $S$, we know immediately that $c$ is not a codeword. Let $N(S)$ denote the neighbors in $R$ of $S$, and $U(S)$ denote those neighbors of $S$ which are unique, i.e., adjacent to a single vertex of $S$.
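The sets of neighbors and unique neighbors can be computed directly from the graph. The following is a small sketch on a hypothetical toy graph (not a real expander); `neighbours_and_unique` and `constraints` are illustrative names.

```python
def neighbours_and_unique(S, constraints):
    """Return (N(S), U(S)) as sets of constraint indices.

    constraints[r] lists the variables adjacent to right vertex r.
    """
    S = set(S)
    N, U = set(), set()
    for r, neighbours in enumerate(constraints):
        hits = sum(1 for j in neighbours if j in S)
        if hits >= 1:
            N.add(r)      # r is adjacent to at least one vertex of S
        if hits == 1:
            U.add(r)      # r is adjacent to exactly one vertex of S
    return N, U

# Hypothetical toy graph: 6 variables, 4 constraints.
constraints = [(0, 1, 2), (3, 4, 5), (0, 3, 4), (1, 2, 5)]
N1, U1 = neighbours_and_unique({0}, constraints)
N2, U2 = neighbours_and_unique({1, 2}, constraints)
```

Here `{0}` has two unique neighbors, so the word with a single 1 in position 0 is not a codeword, while `{1, 2}` has none and does correspond to a codeword of this toy code.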
Lemma 1
For every $S \subset L$ of size $|S| \leq \gamma n$, $d|S| \geq |N(S)| \geq |U(S)| \geq d(1-2\varepsilon)|S|$.
Proof
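Trivially, $|N(S)| \geq |U(S)|$, since $v \in U(S)$ implies $v \in N(S)$. $|N(S)| \leq d|S|$ follows since the degree of every vertex in $S$ is $d$. By the expansion property of the graph, there must be a set of $d(1-\varepsilon)|S|$ edges which go to distinct vertices. The remaining $d\varepsilon|S|$ edges make at most $d\varepsilon|S|$ neighbors not unique, so $|U(S)| \geq d(1-\varepsilon)|S| - d\varepsilon|S| = d(1-2\varepsilon)|S|$.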
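As a quick numerical check of the chain of inequalities, take the purely illustrative values $d = 8$, $\varepsilon = \tfrac{1}{8}$ and a set with $|S| = 10 \leq \gamma n$ (these numbers are assumptions, not taken from any particular construction):

```latex
% Illustrative values only: d = 8, \varepsilon = 1/8, |S| = 10 \le \gamma n
\begin{align*}
d|S| &= 8 \cdot 10 = 80, \\
d(1-\varepsilon)|S| &= 8 \cdot \tfrac{7}{8} \cdot 10 = 70, \\
d\varepsilon|S| &= 8 \cdot \tfrac{1}{8} \cdot 10 = 10, \\
d(1-2\varepsilon)|S| &= 70 - 10 = 60, \\
80 \ge |N(S)| &\ge |U(S)| \ge 60.
\end{align*}
```

So of the 80 edges leaving $S$, at least 70 land on distinct neighbors, and at most 10 of those neighbors can be spoiled by the remaining edges.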
Corollary
Every sufficiently small nonempty $S$ has a unique neighbor. This follows since $\varepsilon < \tfrac{1}{2}$.
Lemma 2
Every nonempty subset $T \subset L$ with $|T| < 2(1-\varepsilon)\gamma n$ has a unique neighbor.
Proof
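Lemma 1 proves the case $|T| \leq \gamma n$, so suppose $\gamma n < |T| < 2(1-\varepsilon)\gamma n$. Let $S \subset T$ such that $|S| = \gamma n$. By Lemma 1, we know that $|U(S)| \geq d(1-2\varepsilon)|S|$. Then a vertex $v \in U(S)$ is in $U(T)$ iff $v \notin N(T \setminus S)$, and we know that $|T \setminus S| < 2(1-\varepsilon)\gamma n - \gamma n = (1-2\varepsilon)\gamma n$, so by the first part of Lemma 1, we know $|N(T \setminus S)| \leq d|T \setminus S| < d(1-2\varepsilon)\gamma n$. Since $|T| < 2(1-\varepsilon)\gamma n$, $|U(T)| \geq |U(S) \setminus N(T \setminus S)| \geq |U(S)| - |N(T \setminus S)| > 0$, and hence $U(T)$ is not empty.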
Corollary
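Note that if a $T \subset L$ has at least 1 unique neighbor, i.e. $|U(T)| > 0$, then the word $c$ corresponding to $T$ cannot be a codeword, as it will not multiply to the all zeros vector by the parity check matrix. By the previous argument, every nonzero $c \in C$ satisfies $wt(c) \geq 2(1-\varepsilon)\gamma n$. Since $C$ is linear, its distance equals the minimum weight of a nonzero codeword, and we conclude that $C$ has distance at least $2(1-\varepsilon)\gamma n$.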
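The last step uses the fact that the distance of a linear code is the minimum weight of a nonzero codeword. A brute-force sketch of that quantity, assuming a hypothetical toy graph with parity constraints, is shown below. It enumerates all $2^n$ words, so it is only meant for very small examples.

```python
from itertools import product

def minimum_distance(constraints, n):
    """Smallest Hamming weight of a nonzero word satisfying every parity check."""
    best = None
    for x in product((0, 1), repeat=n):
        if not any(x):
            continue  # skip the all-zero codeword
        if all(sum(x[j] for j in nb) % 2 == 0 for nb in constraints):
            weight = sum(x)
            if best is None or weight < best:
                best = weight
    return best

# Hypothetical toy graph: 6 variables, 4 parity constraints.
constraints = [(0, 1, 2), (3, 4, 5), (0, 3, 4), (1, 2, 5)]
dmin = minimum_distance(constraints, n=6)
```

The toy graph is not a good expander, so its distance is small. The point is only that no set of variables with a unique neighbor can be the support of a codeword.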
Encoding
The encoding time for an expander code is upper bounded by that of a general linear code, $O(n^2)$ by matrix multiplication. A result due to Spielman shows that encoding is possible in $O(n)$ time.
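The quadratic bound corresponds to multiplying the message by a generator matrix over GF(2). The sketch below shows only this generic method, not Spielman's linear-time construction. The generator matrix `G` is a hand-picked basis for a hypothetical toy code and is an assumption made for illustration.

```python
def encode(message, generator):
    """Multiply a k-bit message by a k x n generator matrix over GF(2).

    Uses k * n bit operations, i.e. O(n^2) when k is proportional to n.
    """
    n = len(generator[0])
    return [
        sum(m * row[j] for m, row in zip(message, generator)) % 2
        for j in range(n)
    ]

# Hypothetical toy code: parity constraints and one possible basis of its
# codewords (each row of G satisfies every constraint).
constraints = [(0, 1, 2), (3, 4, 5), (0, 3, 4), (1, 2, 5)]
G = [
    (0, 1, 1, 0, 0, 0),
    (0, 0, 0, 1, 1, 0),
    (1, 1, 0, 1, 0, 1),
]

codeword = encode((1, 0, 1), G)
```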
Decoding
Decoding of expander codes is possible in $O(n)$ time when $\varepsilon < \tfrac{1}{4}$ using the following algorithm.
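Let $v_i$ be the vertex of $L$ that corresponds to the $i$th index in the codewords of $C$. Let $y \in \{0,1\}^n$ be a received word, and $V(y) = \{v_i \mid \text{the } i\text{th position of } y \text{ is a } 1\}$. Let $e(i)$ be $|\{v \in R \mid v_i \in N(v) \text{ and } |N(v) \cap V(y)| \text{ is even}\}|$, and $o(i)$ be $|\{v \in R \mid v_i \in N(v) \text{ and } |N(v) \cap V(y)| \text{ is odd}\}|$. That is, $e(i)$ and $o(i)$ count the satisfied and unsatisfied constraints adjacent to $v_i$. Then consider the greedy algorithm: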
----
Input: received word $y$.
initialize y' to y
while there is a v in R adjacent to an odd number of vertices in V(y')
    if there is an i such that o(i) > e(i)
        flip entry i in y'
    else
        fail
Output: fail, or modified codeword $y'$.
----
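The greedy algorithm above can be sketched directly in Python. This is a straightforward, non-optimized rendering that recomputes $o(i)$ and $e(i)$ from scratch in every round. The function name `flip_decode` and the toy graph are assumptions for illustration, and the toy graph is not a real expander, so it only exercises the control flow.

```python
def flip_decode(y, constraints):
    """Greedy bit-flipping decoder; returns the decoded word or None on failure.

    constraints[r] lists the variables adjacent to right vertex r.
    """
    y = list(y)
    n = len(y)
    # var_to_checks[i] lists the constraints adjacent to variable i.
    var_to_checks = [[] for _ in range(n)]
    for r, neighbours in enumerate(constraints):
        for i in neighbours:
            var_to_checks[i].append(r)

    def unsatisfied(r):
        return sum(y[i] for i in constraints[r]) % 2 == 1

    while any(unsatisfied(r) for r in range(len(constraints))):
        for i in range(n):
            o = sum(1 for r in var_to_checks[i] if unsatisfied(r))
            e = len(var_to_checks[i]) - o
            if o > e:
                y[i] ^= 1   # this flip lowers the number of unsatisfied checks
                break
        else:
            return None     # no variable with o(i) > e(i): fail
    return y

# Hypothetical toy graph: 6 variables, 4 parity constraints.
constraints = [(0, 1, 2), (3, 4, 5), (0, 3, 4), (1, 2, 5)]
fixed = flip_decode((1, 0, 0, 0, 0, 0), constraints)
stuck = flip_decode((1, 1, 0, 0, 0, 0), constraints)
```

Each flip strictly reduces the number of unsatisfied constraints, so the loop always terminates. Whether it terminates at the correct codeword depends on the expansion of the graph, as the proof below discusses.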
Proof
We show first the correctness of the algorithm, and then examine its running time.
Correctness
We must show that the algorithm terminates with the correct codeword when the received word is sufficiently close to the original codeword (the precise bound on the number of errors is given in Lemma 4). Let the set of corrupt variables be $S$, $s = |S|$, and the set of unsatisfied (adjacent to an odd number of vertices) vertices in $R$ be $c$. The following lemma will prove useful.
Lemma 3
If $0 < s < \gamma n$, then there is a $v_i$ with $o(i) > e(i)$.
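Proof
By Lemma 1, we know that $|U(S)| \geq d(1-2\varepsilon)s$. So an average vertex of $S$ has at least $d(1-2\varepsilon) > d/2$ unique neighbors (recall unique neighbors are unsatisfied and hence contribute to $o(i)$), since $\varepsilon < \tfrac{1}{4}$, and thus there is a vertex $v_i$ with $o(i) > e(i)$.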
So, if we have not yet reached a codeword, then there will always be some vertex to flip. Next, we show that the number of errors can never increase beyond $\gamma n$.
Lemma 4
If we start with $s < \gamma(1-2\varepsilon)n$, then we never reach $s = \gamma n$ at any point in the algorithm.
Proof
When we flip a vertex $v_i$, $o(i)$ and $e(i)$ are interchanged, and since we had $o(i) > e(i)$, this means the number of unsatisfied vertices on the right decreases by at least one after each flip. Since $s < \gamma(1-2\varepsilon)n$, the initial number of unsatisfied vertices is at most $ds < d\gamma(1-2\varepsilon)n$, by the graph's $d$-regularity. If we reached a string with $\gamma n$ errors, then by Lemma 1, there would be at least $d\gamma(1-2\varepsilon)n$ unique neighbors, which means there would be at least $d\gamma(1-2\varepsilon)n$ unsatisfied vertices, a contradiction.
Lemmas 3 and 4 show us that if we start with $s < \gamma(1-2\varepsilon)n$ (which is below half of the distance bound, $(1-\varepsilon)\gamma n$), then we will always find a vertex $v_i$ to flip. Each flip reduces the number of unsatisfied vertices in $R$ by at least 1, and hence the algorithm terminates in at most $m$ steps, and it terminates at some codeword, by Lemma 3. (Were it not at a codeword, there would be some vertex to flip.) Lemma 4 shows us that we can never be $\gamma n$ or farther away from the correct codeword. Since the code has distance at least $2(1-\varepsilon)\gamma n > \gamma n$ (since $\varepsilon < \tfrac{1}{2}$), the codeword it terminates on must be the correct codeword: any other codeword differs from the correct one in more than $\gamma n$ positions, so we could not have traveled far enough to reach it.
Complexity
We now show that the algorithm can achieve linear time decoding. Let $\tfrac{n}{m}$ be constant, and $r$ be the maximum degree of any vertex in $R$. Note that $r$ is also constant for known constructions.
# Pre-processing: It takes $O(mr)$ time to compute whether each vertex in $R$ has an odd or even number of neighbors.
# Pre-processing 2: We take $O(dn) = O(dmr)$ time to compute a list of vertices $v_i$ in $L$ which have $o(i) > e(i)$.
# Each Iteration: We simply remove the first list element. To update the list of odd / even vertices in $R$, we need only update $O(d)$ entries, inserting / removing as necessary. We then update $O(dr)$ entries in the list of vertices in $L$ with more odd than even neighbors, inserting / removing as necessary. Thus each iteration takes $O(dr)$ time.
# As argued above, the total number of iterations is at most $m$.
This gives a total runtime of $O(mdr)$ time, where $d$ and $r$ are constants.
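The bookkeeping described in the steps above can be sketched as follows. This is one possible realization, not necessarily the one intended by the original notes: instead of removing stale list entries eagerly, it keeps a stack of candidate variables and discards outdated entries lazily when they are popped, which preserves the $O(mdr)$ total bound in an amortized sense. The name `flip_decode_linear` and the toy graph are assumptions.

```python
def flip_decode_linear(y, constraints):
    """Bit-flipping decoder with incremental updates; None on failure."""
    y = list(y)
    n = len(y)
    var_to_checks = [[] for _ in range(n)]
    for r, neighbours in enumerate(constraints):
        for i in neighbours:
            var_to_checks[i].append(r)

    # Pre-processing 1: parity of every constraint, O(m r).
    odd = [sum(y[i] for i in nb) % 2 for nb in constraints]
    unsat = sum(odd)
    # Pre-processing 2: o(i) for every variable, O(d n).
    o = [sum(odd[r] for r in var_to_checks[i]) for i in range(n)]

    def flippable(i):
        return 2 * o[i] > len(var_to_checks[i])   # o(i) > e(i)

    candidates = [i for i in range(n) if flippable(i)]

    while unsat:
        # Lazily drop entries that are no longer flippable.
        while candidates and not flippable(candidates[-1]):
            candidates.pop()
        if not candidates:
            return None
        i = candidates.pop()
        y[i] ^= 1
        for r in var_to_checks[i]:          # O(d) constraints change parity
            odd[r] ^= 1
            delta = 1 if odd[r] else -1
            unsat += delta
            for j in constraints[r]:        # O(r) variables per constraint
                o[j] += delta
                if flippable(j):
                    candidates.append(j)
    return y

# Hypothetical toy graph: 6 variables, 4 parity constraints.
constraints = [(0, 1, 2), (3, 4, 5), (0, 3, 4), (1, 2, 5)]
decoded = flip_decode_linear((0, 0, 0, 0, 0, 1), constraints)
```

Each flip touches at most $dr$ counters and pushes at most $dr$ candidates, so the total work over at most $m$ flips stays within $O(mdr)$.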
See also
* Expander graph
* Low-density parity-check code
* Linear time encoding and decoding of error-correcting codes
* ABNNR and AEL codes
Notes
This article is based on Dr. Venkatesan Guruswami's course notes.
References
Guruswami, V. (September 2004). "Guest column: error-correcting codes and expander graphs". ACM SIGACT News. 35 (3): 25–41. doi:10.1145/1027914.1027924. S2CID 17550280. http://dl.acm.org/citation.cfm?id=1027924