Generalized Minimum Distance Decoding

In coding theory, generalized minimum-distance (GMD) decoding provides an efficient algorithm for decoding concatenated codes, based on using an errors-and-erasures decoder for the outer code.

A naive decoding algorithm for concatenated codes is not optimal, because it does not take into account the information that maximum likelihood decoding (MLD) gives. In other words, in the naive algorithm, inner received blocks are treated the same regardless of their Hamming distance from the decoded inner codeword. Intuitively, the outer decoder should place higher confidence in symbols whose inner encodings are close to the received word. In 1966, David Forney devised a better algorithm, called generalized minimum distance (GMD) decoding, which makes better use of this information. It measures the confidence of each received symbol and erases symbols whose confidence falls below a desired threshold. The GMD decoding algorithm was one of the first examples of a soft-decision decoder. We will present three versions of the GMD decoding algorithm: the first two are randomized algorithms, while the last one is deterministic.


Setup

* Hamming distance: Given two vectors u, v \in \Sigma^n, the Hamming distance between u and v, denoted by \Delta(u, v), is defined to be the number of positions in which u and v differ.
* Minimum distance: Let C \subseteq \Sigma^n be a code. The minimum distance of C is defined to be
::d = \min_{c_1 \ne c_2 \in C} \Delta(c_1, c_2).
* Code concatenation: Given m = (m_1, \ldots, m_K) \in [Q]^K, consider two codes, which we call the outer code and the inner code,
::C_\text{out} : [Q]^K \to [Q]^N, \qquad C_\text{in} : [q]^k \to [q]^n,
:with distances D and d respectively, where we identify [Q] with [q]^k (so Q = q^k). A concatenated code is obtained as
::C_\text{out} \circ C_\text{in}(m) = (C_\text{in}(C_\text{out}(m)_1), \ldots, C_\text{in}(C_\text{out}(m)_N)),
:where C_\text{out}(m) = (C_\text{out}(m)_1, \ldots, C_\text{out}(m)_N). Finally, we will take C_\text{out} to be a Reed–Solomon code, which has an errors-and-erasures decoder, and k = O(\log N), which in turn implies that MLD for the inner code can be done in time polynomial in N.
* Maximum likelihood decoding (MLD): MLD is a decoding method for error-correcting codes that outputs the codeword closest to the received word in Hamming distance. The MLD function, denoted by D_{MLD} : \Sigma^n \to C, is defined as follows: for every y \in \Sigma^n,
::D_{MLD}(y) = \arg\min_{c \in C} \Delta(c, y).
:(A code sketch of these definitions follows this list.)
* Probability distribution: A probability distribution \Pr on a sample space S is a mapping from events of S to real numbers such that \Pr[A] \ge 0 for any event A, \Pr[S] = 1, and \Pr[A \cup B] = \Pr[A] + \Pr[B] for any two mutually exclusive events A and B.
* Expected value: The expected value of a discrete random variable X is
::\mathbb{E}[X] = \sum_x x \cdot \Pr[X = x].
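
The definitions above translate directly into code. Below is a minimal Python sketch of the Hamming distance, minimum distance, and MLD functions; the toy repetition code and all helper names are illustrative choices, not part of the original text.

```python
def hamming_distance(u, v):
    """Delta(u, v): the number of positions in which u and v differ."""
    assert len(u) == len(v)
    return sum(a != b for a, b in zip(u, v))

def minimum_distance(C):
    """d = min of Delta(c1, c2) over distinct codewords c1 != c2 in C."""
    return min(hamming_distance(c1, c2)
               for c1 in C for c2 in C if c1 != c2)

def mld(C, y):
    """D_MLD(y): the codeword of C closest to y in Hamming distance."""
    return min(C, key=lambda c: hamming_distance(c, y))

# Toy inner code: 3-fold binary repetition, so d = 3 and MLD is majority vote.
C_in = [(0, 0, 0), (1, 1, 1)]
assert minimum_distance(C_in) == 3
assert mld(C_in, (1, 0, 1)) == (1, 1, 1)
```

Note that brute-force MLD enumerates all |C| codewords; with k = O(\log N) the inner code has q^k = N^{O(1)} codewords, which is why inner MLD stays polynomial in N.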


Randomized algorithm

Consider the received word \mathbf{y} = (y_1, \ldots, y_N) \in [q^n]^N, which was corrupted by a noisy channel. The following is the algorithm description for the general case. In this algorithm, we decode \mathbf{y} by declaring an erasure at every bad position and running the errors-and-erasures decoding algorithm for C_\text{out} on the resulting vector.

Randomized_Decoder
Given : \mathbf{y} = (y_1, \ldots, y_N) \in [q^n]^N.
# For every 1 \le i \le N, compute y_i' = MLD_{C_\text{in}}(y_i).
# Set \omega_i = \min(\Delta(C_\text{in}(y_i'), y_i), \tfrac{d}{2}).
# For every 1 \le i \le N, repeat: with probability \tfrac{2\omega_i}{d}, set y_i'' \leftarrow ?, otherwise set y_i'' = y_i'.
# Run the errors-and-erasures algorithm for C_\text{out} on \mathbf{y}'' = (y_1'', \ldots, y_N'').

(A minimal code sketch of this algorithm appears at the end of this section.)

Theorem 1. ''Let \mathbf{y} be a received word such that there exists a codeword \mathbf{c} = (c_1, \ldots, c_N) \in C_\text{out} \circ C_\text{in} \subseteq [q^n]^N such that \Delta(\mathbf{c}, \mathbf{y}) < \tfrac{Dd}{2}. Then the deterministic GMD algorithm outputs \mathbf{c}.''

Note that a naive decoding algorithm for concatenated codes can correct up to \tfrac{Dd}{4} errors.

:Lemma 1. ''Let the assumption in Theorem 1 hold. If \mathbf{y}'' has e' errors and s' erasures (when compared with \mathbf{c}) after Step 3, then \mathbb{E}[2e' + s'] < D.''

''Remark.'' If 2e' + s' < D, then the errors-and-erasures algorithm in Step 4 will output \mathbf{c}. The lemma above says that in expectation, this is indeed the case. Note that this is not enough to prove Theorem 1, but it can be crucial in developing future variations of the algorithm.

Proof of Lemma 1. For every 1 \le i \le N, define e_i = \Delta(y_i, c_i). This implies that

:\sum_{i=1}^N e_i < \frac{Dd}{2} \qquad\qquad (1)

Next, for every 1 \le i \le N, we define two indicator variables:

:X_i^? = 1 \Leftrightarrow y_i'' = ?, \qquad X_i^e = 1 \Leftrightarrow C_\text{in}(y_i'') \ne c_i \text{ and } y_i'' \ne ?.

We claim that we are done if we can show that for every 1 \le i \le N:

:\mathbb{E}\left[2X_i^e + X_i^?\right] \leqslant \frac{2e_i}{d} \qquad\qquad (2)

Clearly, by definition,

:e' = \sum_i X_i^e \quad \text{and} \quad s' = \sum_i X_i^?.

Further, by the linearity of expectation, we get

:\mathbb{E}[2e' + s'] \leqslant \frac{2}{d}\sum_i e_i < D.

To prove (2) we consider two cases: the i-th block is correctly decoded (Case 1), or the i-th block is incorrectly decoded (Case 2).

Case 1: (c_i = C_\text{in}(y_i')) Note that if y_i'' = ? then X_i^e = 0, and \Pr[y_i'' = ?] = \tfrac{2\omega_i}{d} implies \mathbb{E}[X_i^?] = \Pr[X_i^? = 1] = \tfrac{2\omega_i}{d} and \mathbb{E}[X_i^e] = \Pr[X_i^e = 1] = 0. Further, by definition we have

:\omega_i = \min\left(\Delta(C_\text{in}(y_i'), y_i), \tfrac{d}{2}\right) \leqslant \Delta(C_\text{in}(y_i'), y_i) = \Delta(c_i, y_i) = e_i,

so \mathbb{E}[2X_i^e + X_i^?] = \tfrac{2\omega_i}{d} \leqslant \tfrac{2e_i}{d}.

Case 2: (c_i \ne C_\text{in}(y_i')) In this case, \mathbb{E}[X_i^?] = \tfrac{2\omega_i}{d} and \mathbb{E}[X_i^e] = \Pr[X_i^e = 1] = 1 - \tfrac{2\omega_i}{d}. Since c_i \ne C_\text{in}(y_i'), we have e_i + \omega_i \geqslant d. This follows from another case analysis on whether \omega_i = \Delta(C_\text{in}(y_i'), y_i) < \tfrac{d}{2} or not. Finally, this implies

:\mathbb{E}[2X_i^e + X_i^?] = 2 - \tfrac{2\omega_i}{d} \leqslant \tfrac{2e_i}{d}.

In the following sections, we will finally show that the deterministic version of the algorithm above can do unique decoding of C_\text{out} \circ C_\text{in} up to half its design distance.
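
To make the four steps concrete, here is a minimal Python sketch of Randomized_Decoder. The inner code C_in, the distance d, and the outer errors-and-erasures decoder are passed in as stand-ins (hypothetical helpers); only the control flow of Steps 1–4 is the point.

```python
import random

def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))

def mld(C, y):
    return min(C, key=lambda c: hamming_distance(c, y))

def randomized_decoder(y_blocks, C_in, d, outer_errors_erasures_decoder):
    """y_blocks is the received word (y_1, ..., y_N); each y_i is an inner block."""
    y_pp = []
    for y_i in y_blocks:
        y_i_prime = mld(C_in, y_i)                           # step 1: inner MLD
        w_i = min(hamming_distance(y_i_prime, y_i), d / 2)   # step 2: confidence
        if random.random() < 2 * w_i / d:                    # step 3: erase w.p. 2*w_i/d
            y_pp.append(None)                                # None stands for '?'
        else:
            y_pp.append(y_i_prime)
    return outer_errors_erasures_decoder(y_pp)               # step 4: outer decoding
```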


Modified randomized algorithm

Note that, in Step 3 of the previous version of the GMD algorithm, we do not really need to use ''fresh'' randomness for each i. Now we come up with another randomized version of the GMD algorithm that uses the ''same'' randomness for every i, as described below (a code sketch follows at the end of this section).

Modified_Randomized_Decoder
Given : \mathbf{y} = (y_1, \ldots, y_N) \in [q^n]^N, pick \theta \in [0, 1] uniformly at random. Then for every 1 \le i \le N:
# Set y_i' = MLD_{C_\text{in}}(y_i).
# Compute \omega_i = \min(\Delta(C_\text{in}(y_i'), y_i), \tfrac{d}{2}).
# If \theta < \tfrac{2\omega_i}{d}, set y_i'' \leftarrow ?, otherwise set y_i'' = y_i'.
# Run the errors-and-erasures algorithm for C_\text{out} on \mathbf{y}'' = (y_1'', \ldots, y_N'').

For the proof of Lemma 1, we only use the randomness to show that

:\Pr[y_i'' = ?] = \frac{2\omega_i}{d}.

In this version of the GMD algorithm, we note that

:\Pr[y_i'' = ?] = \Pr\left[\theta \in \left[0, \tfrac{2\omega_i}{d}\right]\right] = \frac{2\omega_i}{d}.

The second equality above follows from the choice of \theta. The proof of Lemma 1 can therefore also be used to show \mathbb{E}[2e' + s'] < D for this second version of the GMD algorithm. In the next section, we will see how to get a deterministic version of the GMD algorithm by choosing \theta from a polynomially sized set, as opposed to the current infinite set [0, 1].
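
As a sketch of how little changes between the two versions, here is the same stand-in framework with the single shared \theta; again the helper names are illustrative assumptions, not the article's API.

```python
import random

def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))

def mld(C, y):
    return min(C, key=lambda c: hamming_distance(c, y))

def modified_randomized_decoder(y_blocks, C_in, d, outer_errors_erasures_decoder):
    theta = random.random()  # one theta in [0, 1), shared by every position i
    y_pp = []
    for y_i in y_blocks:
        y_i_prime = mld(C_in, y_i)                           # step 1
        w_i = min(hamming_distance(y_i_prime, y_i), d / 2)   # step 2
        # step 3: erase iff theta < 2*w_i/d, so Pr[y_i'' = ?] is still 2*w_i/d
        y_pp.append(None if theta < 2 * w_i / d else y_i_prime)
    return outer_errors_erasures_decoder(y_pp)               # step 4
```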


Deterministic algorithm

Let Q = \{0, 1\} \cup \left\{\tfrac{2\omega_1}{d}, \ldots, \tfrac{2\omega_N}{d}\right\}. Since for each i, \omega_i = \min(\Delta(C_\text{in}(y_i'), y_i), \tfrac{d}{2}), we have

:Q = \{0, 1\} \cup \{q_1, \ldots, q_m\}

where q_1 < \cdots < q_m for some m \le \left\lfloor \tfrac{d}{2} \right\rfloor. Note that for every \theta \in [q_i, q_{i+1}), Step 3 of the second version of the randomized algorithm outputs the same \mathbf{y}''. Thus, it suffices to consider all possible values of \theta \in Q. This gives the deterministic algorithm below (a code sketch follows at the end of this section).

Deterministic_Decoder
Given : \mathbf{y} = (y_1, \ldots, y_N) \in [q^n]^N, for every \theta \in Q, repeat the following.
# Compute y_i' = MLD_{C_\text{in}}(y_i) for 1 \le i \le N.
# Set \omega_i = \min(\Delta(C_\text{in}(y_i'), y_i), \tfrac{d}{2}) for every 1 \le i \le N.
# If \theta < \tfrac{2\omega_i}{d}, set y_i'' \leftarrow ?, otherwise set y_i'' = y_i'.
# Run the errors-and-erasures algorithm for C_\text{out} on \mathbf{y}'' = (y_1'', \ldots, y_N''). Let c_\theta be the codeword in C_\text{out} \circ C_\text{in} corresponding to the output of the algorithm, if any.
# Among all the c_\theta output in Step 4, output the one closest to \mathbf{y}.

Since every iteration of Steps 1–4 runs in polynomial time, the algorithm above also runs in polynomial time. Specifically, each call to an errors-and-erasures decoder for < \tfrac{dD}{2} errors takes O(d) time. Finally, the runtime of the algorithm above is O(NQn^{O(1)} + NT_\text{out}), where T_\text{out} is the running time of the outer errors-and-erasures decoder.
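
A minimal Python sketch of Deterministic_Decoder follows, under the same assumptions as the earlier sketches; additionally it assumes the stand-in outer decoder returns the full concatenated codeword as a flat symbol sequence, or None on failure.

```python
def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))

def mld(C, y):
    return min(C, key=lambda c: hamming_distance(c, y))

def deterministic_decoder(y_blocks, C_in, d, outer_errors_erasures_decoder):
    decoded = [mld(C_in, y_i) for y_i in y_blocks]                    # step 1
    w = [min(hamming_distance(c, y_i), d / 2)                         # step 2
         for c, y_i in zip(decoded, y_blocks)]
    thresholds = {0.0, 1.0} | {2 * w_i / d for w_i in w}              # the set Q
    candidates = []
    for theta in sorted(thresholds):
        y_pp = [None if theta < 2 * w_i / d else c                    # step 3
                for w_i, c in zip(w, decoded)]
        c_theta = outer_errors_erasures_decoder(y_pp)                 # step 4
        if c_theta is not None:
            candidates.append(c_theta)
    y_flat = [s for block in y_blocks for s in block]
    return min(candidates,                                            # step 5
               key=lambda c: hamming_distance(c, y_flat),
               default=None)
```

Each distinct threshold in Q corresponds to one interval of \theta values, which is exactly why trying only the |Q| thresholds reproduces every possible outcome of the randomized version.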


See also

* Concatenated codes
* Reed–Solomon error correction
* Welch–Berlekamp algorithm


References


* University at Buffalo Lecture Notes on Coding Theory – Atri Rudra
* MIT Lecture Notes on Essential Coding Theory – Madhu Sudan
* University of Washington – Venkatesan Guruswami
* G. David Forney. Generalized Minimum Distance Decoding. ''IEEE Transactions on Information Theory'', 12:125–131, 1966.