HOME

TheInfoList



OR:

Run-length encoding (RLE) is a form of lossless data compression in which ''runs'' of data (sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. This is most efficient on data that contains many such runs, for example, simple graphic images such as icons, line drawings, Conway's Game of Life, and animations. For files that do not have many runs, RLE could increase the file size. RLE may also be used to refer to an early graphics file format supported by CompuServe for compressing black and white images, but was widely supplanted by their later
Graphics Interchange Format The Graphics Interchange Format (GIF; or , see pronunciation) is a bitmap image format that was developed by a team at the online services provider CompuServe led by American computer scientist Steve Wilhite and released on 15 June 1987 ...
(GIF). RLE also refers to a little-used image format in
Windows 3.x Windows 3.x means either of, or all of the following versions of Microsoft Windows: * Windows 3.0 * Windows 3.1x Windows NT * Windows NT 3.x Windows NT 3.x may refer to either of, or all of the following versions of Microsoft Windows: * Windows ...
, with the extension rle, which is a run-length encoded bitmap, used to compress the Windows 3.x startup screen.


Example

Consider a screen containing plain black text on a solid white background. There will be many long runs of white
pixel In digital imaging, a pixel (abbreviated px), pel, or picture element is the smallest addressable element in a raster image, or the smallest point in an all points addressable display device. In most digital display devices, pixels are the ...
s in the blank space, and many short runs of black pixels within the text. A hypothetical scan line, with B representing a black pixel and W representing white, might read as follows: : WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW With a run-length encoding (RLE) data compression algorithm applied to the above hypothetical scan line, it can be rendered as follows: : 12W1B12W3B24W1B14W This can be interpreted as a sequence of twelve Ws, one B, twelve Ws, three Bs, etc., and represents the original 67 characters in only 18. While the actual format used for the storage of images is generally binary rather than
ASCII ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding standard for electronic communication. ASCII codes represent text in computers, telecommunications equipment, and other devices. Because ...
characters like this, the principle remains the same. Even binary data files can be compressed with this method; file format specifications often dictate repeated bytes in files as padding space. However, newer compression methods such as DEFLATE often use LZ77-based algorithms, a generalization of run-length encoding that can take advantage of runs of strings of characters (such as BWWBWWBWWBWW). Run-length encoding can be expressed in multiple ways to accommodate data properties as well as additional compression algorithms. For instance, one popular method encodes run lengths for runs of two or more characters only, using an "escape" symbol to identify runs, or using the character itself as the escape, so that any time a character appears twice it denotes a run. On the previous example, this would give the following: : WW12BWW12BB3WW24BWW14 This would be interpreted as a run of twelve Ws, a B, a run of twelve Ws, a run of three Bs, etc. In data where runs are less frequent, this can significantly improve the compression rate. One other matter is the application of additional compression algorithms. Even with the runs extracted, the frequencies of different characters may be large, allowing for further compression; however, if the run lengths are written in the file in the locations where the runs occurred, the presence of these numbers interrupts the normal flow and makes it harder to compress. To overcome this, some run-length encoders separate the data and escape symbols from the run lengths, so that the two can be handled independently. For the example data, this would result in two outputs, the string "WWBWWBBWWBWW" and the numbers (12,12,3,24,14).


History and applications

Run-length encoding (RLE) schemes were employed in the transmission of analog television signals as far back as 1967. In 1983, run-length encoding was
patent A patent is a type of intellectual property that gives its owner the legal right to exclude others from making, using, or selling an invention for a limited period of time in exchange for publishing an enabling disclosure of the invention."A ...
ed by
Hitachi () is a Japanese multinational conglomerate corporation headquartered in Chiyoda, Tokyo, Japan. It is the parent company of the Hitachi Group (''Hitachi Gurūpu'') and had formed part of the Nissan ''zaibatsu'' and later DKB Group and Fuyo G ...
. RLE is particularly well suited to
palette Palette may refer to: * Cosmetic palette, an archaeological form * Palette, another name for a color scheme * Palette (painting), a wooden board used for mixing colors for a painting ** Palette knife, an implement for painting * Palette (company) ...
-based bitmap images such as computer icons, and was a popular image compression method on early online services such as CompuServe before the advent of more sophisticated formats such as GIF. It does not work well on continuous-tone images such as photographs, although
JPEG JPEG ( ) is a commonly used method of lossy compression for digital images, particularly for those images produced by digital photography. The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and imag ...
uses it on the coefficients that remain after transforming and quantizing image blocks. Common formats for run-length encoded data include Truevision TGA, PackBits (by Apple, used in MacPaint),
PCX PCX, standing for ''PiCture eXchange'', was an image file format developed by the now-defunct ZSoft Corporation of Marietta, Georgia, United States. It was the native file format for PC Paintbrush and became one of the first widely accepted DOS ...
and ILBM. The
International Telecommunication Union The International Telecommunication Union is a specialized agency of the United Nations responsible for many matters related to information and communication technologies. It was established on 17 May 1865 as the International Telegraph Unio ...
also describes a standard to encode run-length-colour for fax machines, known as T.45. The standard, which is combined with other techniques into
Modified Huffman coding Modified Huffman coding is used in fax machines to encode black-on-white images (bitmaps). It combines the variable-length codes of Huffman coding with the coding of repetitive data in run-length encoding. The basic Huffman coding provides a way to ...
, is relatively efficient because most faxed documents are generally white space, with occasional interruptions of black.


See also

* Kolakoski sequence * Look-and-say sequence * Comparison of graphics file formats * Golomb coding * Burrows–Wheeler transform *
Recursive indexing {{no footnotes, date=June 2020 Recursive indexing is an algorithm used to represent large numberic values using members of a relatively small set. Recursive indexing writes the successive differences of the number after extracting the maximum valu ...
*
Run-length limited Run-length limited or RLL coding is a line coding technique that is used to send arbitrary data over a communications channel with bandwidth limits. RLL codes are defined by four main parameters: ''m'', ''n'', ''d'', ''k''. The first two, ''m'' ...
* Bitmap index * Forsyth–Edwards Notation, which uses run-length-encoding for empty spaces in chess positions. * DEFLATE


References


External links


Run-length encoding implemented in different programming languages
(on Rosetta Code)
Single Header Run-Length Encoding Library
smallest possible implementation (about 20 SLoC) in ANSI C. FOSS, compatible with Truevision TGA, supports 8, 16, 24 and 32 bit elements too. {{Compression formats Lossless compression algorithms