Tornado Codes
   HOME

TheInfoList



OR:

In
coding theory Coding theory is the study of the properties of codes and their respective fitness for specific applications. Codes are used for data compression, cryptography, error detection and correction, data transmission and data storage. Codes are stud ...
, Tornado codes are a class of
erasure code In coding theory, an erasure code is a forward error correction (FEC) code under the assumption of bit erasures (rather than bit errors), which transforms a message of ''k'' symbols into a longer message (code word) with ''n'' symbols such that the ...
s that support
error correction In information theory and coding theory with applications in computer science and telecommunication, error detection and correction (EDAC) or error control are techniques that enable reliable delivery of digital data over unreliable communica ...
. Tornado codes require a constant C more redundant blocks than the more data-efficient Reed–Solomon erasure codes, but are much faster to generate and can fix erasures faster. Software-based implementations of tornado codes are about 100 times faster on small lengths and about 10,000 times faster on larger lengths than Reed–Solomon erasure codes. Since the introduction of Tornado codes, many other similar erasure codes have emerged, most notably
Online codes In computer science, online codes are an example of rateless erasure codes. These codes can encode a message into a number of symbols such that knowledge of any fraction of them allows one to recover the original message (with high probability). ''R ...
,
LT codes In computer science, Luby transform codes (LT codes) are the first class of practical fountain codes that are near-optimal erasure correcting codes. They were invented by Michael Luby in 1998 and published in 2002. Like some other fountain codes ...
and
Raptor codes In computer science, Raptor codes (''rapid tornado''; see Tornado codes) are the first known class of fountain codes with linear time encoding and decoding. They were invented by Amin Shokrollahi in 2000/2001 and were first published in 2004 as an ...
. Tornado codes use a layered approach. All layers except the last use an
LDPC In information theory, a low-density parity-check (LDPC) code is a linear error correcting code, a method of transmitting a message over a noisy transmission channel. An LDPC code is constructed using a sparse Tanner graph (subclass of the bipa ...
error correction code, which is fast but has a chance of failure. The final layer uses a Reed–Solomon correction code, which is slower but is optimal in terms of failure recovery. Tornado codes dictates how many levels, how many recovery blocks in each level, and the distribution used to generate blocks for the non-final layers.


Overview

The input data is divided into blocks. Blocks are sequences of bits that are all the same size. Recovery data uses the same block size as the input data. The erasure of a block (input or recovery) is detected by some other means. (For example, a block from disk does not pass a CRC check or a network packet with a given sequence number never arrived.) The number of recovery blocks is given by the user. Then the number of levels is determined along with the number of blocks in each level. The number in each level is determined by a factor B which is less than one. If there are N input blocks, the first recovery level has B*N blocks, the second has B*B*N, the third has B*B*B*N, and so on. All levels of recovery except the final one use an LDPC, which works by xor (exclusive-or). Xor operates on binary values, 1s and 0s. A xor B is 1 if A and B have different values and 0 if A and B have the same values. If you are given result of (A xor B) and A, you can determine the value for B. (A xor B xor A = B) Similarly, if you are given result of (A xor B) and B, you can determine the value for A. This extends to multiple values, so given result of (A xor B xor C xor D) and any 3 of the values, the missing value can be recovered. So the recovery blocks in level one are just the xor of some set of input blocks. Similarly, the recovery blocks in level two are each the xor of some set of blocks in level one. The blocks used in the xor are chosen randomly, without repetition. However, the ''number'' of blocks xor'ed to make a recovery block is chosen from a very specific distribution for each level. Since xor is a fast operation and the recovery blocks are an xor of only a subset of the blocks in the input (or at a lower recovery level), the recovery blocks can be generated quickly. The final level is a Reed–Solomon code. Reed–Solomon codes are optimal in terms of recovering from failures, but slow to generate and recover. Since each level has fewer blocks than the one before, the Reed–Solomon code has a small number of recovery blocks to generate and to use in recovery. So, even though Reed–Solomon is slow, it only has a small amount of data to handle. During recovery, the Reed–Solomon code is recovered first. This is guaranteed to work if the number of missing blocks in the next-to-final level is less than the present blocks in the final level. Going lower, the LDPC (xor) recovery level can be used to recover the level beneath it ''with high probability'' if all the recovery blocks are present and the level beneath is missing at most C' fewer blocks than the recovery level. The algorithm for recovery is to find some recovery block that has only one of its generating set missing from the lower level. Then the xor of the recovery block with all of the blocks that are present is equal to the missing block.


Patent issues

Tornado codes were formerly patented inside the United States of America. Patents US6163870 A (filed Nov 6, 1997) and US 6081909 A (filed Nov 6, 1997) describe Tornado codes, and have expired as of November 6, 2017. Patents US6307487 B1 (filed Feb 5, 1999) and US6320520 B1 (filed Sep 17, 1999) also mention Tornado codes, and have expired as of February 5, 2019, and September 17, 2019, respectively.


Citations

Michael Luby Michael George Luby is a mathematician and computer scientist, CEO of BitRipple, Senior Research Scientist at the International Computer Science Institute (ICSI), former VP Technology at Qualcomm, co-founder and former Chief Technology Officer ...
created the Tornado codes.


External links

A readable description from CMU (PostScript

and another from Luby at the
International Computer Science Institute The International Computer Science Institute (ICSI) is an independent, non-profit research organization located in Berkeley, California, United States. Since its founding in 1988, ICSI has maintained an affiliation agreement with the University ...
(PostScript


See also

*
Raptor code In computer science, Raptor codes (''rapid tornado''; see Tornado codes) are the first known class of fountain codes with linear time encoding and decoding. They were invented by Amin Shokrollahi in 2000/2001 and were first published in 2004 as an ...
*
Erasure code In coding theory, an erasure code is a forward error correction (FEC) code under the assumption of bit erasures (rather than bit errors), which transforms a message of ''k'' symbols into a longer message (code word) with ''n'' symbols such that the ...


Notes


References

* * * {{cite journal , ref={{harvid, Luby, 1998 , author= M. Luby, M. Mitzenmacher, A. Shokrollahi , title=Analysis of Random Processes via And-Or Tree Evaluation , journal=Proceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms , pages=364–373 , year=1998 Coding theory Capacity-approaching codes