Algorithm BSTW
Algorithm BSTW is a data compression algorithm named after its designers, Bentley, Sleator, Tarjan, and Wei, who published it in 1986. BSTW is a dictionary-based algorithm that uses a move-to-front transform to keep recently seen dictionary entries at the front of the dictionary. Dictionary references are then encoded using any of a number of encoding methods, usually Elias delta coding or Elias gamma coding.

References
The algorithm was published in the following paper: "A Locally Adaptive Data Compression Scheme", Communications of the ACM, 1986, volume 29, number 4, pp. 320–330. A related idea was published in Ryabko, B. Ya., "Data compression by means of a book stack", Problems of Information Transmission, 1980, v. 16 (4), pp. 265–269. The original name of this code is "book stack". The history of the discovery of the book stack (or move-to-front) ...
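The dictionary/move-to-front part of the scheme can be illustrated with a minimal Python sketch. It assumes that a previously unseen word is signalled by an index one past the current dictionary size and is then transmitted literally; the stream of small integers it produces would subsequently be fed to an integer code such as Elias gamma or delta. This is an illustrative reading of the scheme, not the authors' exact formulation.

def bstw_encode(tokens):
    """Emit the 1-based move-to-front position of each token; unseen tokens
    are signalled by the position len(word_list) + 1 followed by the token."""
    word_list = []
    output = []
    for tok in tokens:
        if tok in word_list:
            i = word_list.index(tok)                  # 0-based position
            output.append(i + 1)                      # emit 1-based index
            word_list.insert(0, word_list.pop(i))     # move the word to the front
        else:
            output.append(len(word_list) + 1)         # "new word" escape index
            output.append(tok)                        # transmit the word itself
            word_list.insert(0, tok)
    return output

# Repeated words quickly collapse to small indices:
print(bstw_encode("the cat sat on the mat the cat".split()))

Because recently used words sit near the front of the list, frequently repeated words produce small indices, which Elias gamma or delta coding then represents in few bits.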

Data Compression
In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder. The process of reducing the size of a data file is often referred to as data compression. In the context of data transmission, it is called source coding: encoding is done at the source of the data before it is stored or transmitted. Source coding should not be confused with channel coding, which is used for error detection and correction, or with line coding, the means for mapping data onto a sig ...

Algorithm
In mathematics and computer science, an algorithm is a finite sequence of mathematically rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing calculations and data processing. More advanced algorithms can use conditionals to divert the code execution through various routes (referred to as automated decision-making) and deduce valid inferences (referred to as automated reasoning). In contrast, a heuristic is an approach to solving problems without well-defined correct or optimal results (David A. Grossman, Ophir Frieder, Information Retrieval: Algorithms and Heuristics, 2nd edition, 2004). For example, although social media recommender systems are commonly called "algorithms", they actually rely on heuristics as there is no truly "correct" recommendation. As an e ...

Daniel Sleator
Daniel Dominic Kaplan Sleator (born 10 December 1953) is a professor of computer science at Carnegie Mellon University, Pittsburgh, United States. In 1999, he won the ACM Paris Kanellakis Award (jointly with Robert Tarjan) for the splay tree data structure. He was one of the pioneers in amortized analysis of algorithms, early examples of which were the analyses of the move-to-front heuristic, and splay trees. He invented many data structures with Robert Tarjan, such as splay trees, link/cut trees, and skew heaps. The Sleator and Tarjan paper on the move-to-front heuristic first suggested the idea of comparing an online algorithm to an optimal offline algorithm, for which the term competitive analysis was later coined in a paper of Karlin, Manasse, Rudolph, and Sleator. Sleator also developed the theory of link grammars, and the Serioso music analyzer for analyzing meter and harmony in written music. Personal life Sleator was born to William Warner Sleator, Jr., a profe ...

Robert Tarjan
Robert Endre Tarjan (born April 30, 1948) is an American computer scientist and mathematician. He is the discoverer of several graph theory algorithms, including his strongly connected components algorithm, and co-inventor of both splay trees and Fibonacci heaps. Tarjan is currently the James S. McDonnell Distinguished University Professor of Computer Science at Princeton University. Personal life and education He was born in Pomona, California. His father, George Tarjan (1912–1991), raised in Hungary, was a child psychiatrist, specializing in mental retardation, and ran a state hospital. Robert Tarjan's younger brother James became a chess grandmaster. As a child, Robert Tarjan read a lot of science fiction, and wanted to be an astronomer. He became interested in mathematics after reading Martin Gardner's mathematical games column in Scientific American. He became seriously interested in math in the eighth grade, thanks to a "very stimulating" teacher. While he was in hig ...

Move-to-front Transform
The move-to-front (MTF) transform is an encoding of data (typically a stream of bytes) designed to improve the performance of entropy encoding techniques of compression. When efficiently implemented, it is fast enough that its benefits usually justify including it as an extra step in data compression algorithms. The algorithm was first published by Boris Ryabko under the name "book stack" in 1980; it was subsequently rediscovered by Bentley et al. in 1986, as attested in the explanatory note.

The transform
The main idea is that each symbol in the data is replaced by its index in the stack of “recently used symbols”. For example, long sequences of identical symbols are replaced by as many zeroes, whereas when a symbol that has not been used in a long time appears, it is replaced with a large number. Thus, at the end, the data is transformed into a sequence of integers; if the data exhibits a lot of local correlations, then these integers tend to be small. Let us gi ...
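A minimal sketch of the transform and its inverse over a byte alphabet, assuming the conventional initial symbol order 0–255 (the description above does not fix the initial ordering):

def mtf_encode(data: bytes) -> list[int]:
    """Replace each byte by its index in the stack of recently used symbols,
    then move that symbol to the front of the stack."""
    stack = list(range(256))                # initial symbol order: 0..255
    out = []
    for b in data:
        i = stack.index(b)
        out.append(i)
        stack.insert(0, stack.pop(i))       # move the symbol to the front
    return out

def mtf_decode(indices: list[int]) -> bytes:
    """Inverse transform: look each symbol up by index, then move it to the front."""
    stack = list(range(256))
    out = bytearray()
    for i in indices:
        b = stack.pop(i)
        out.append(b)
        stack.insert(0, b)
    return bytes(out)

data = b"aaaabbbbaaaa"
codes = mtf_encode(data)
print(codes)                                # runs of a repeated byte become runs of zeros
assert mtf_decode(codes) == data

Runs of a repeated symbol map to runs of zeros after its first occurrence, which is exactly the kind of skewed integer distribution that an entropy coder or an Elias code handles well.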

Elias Delta Coding
Elias δ code or Elias delta code is a universal code encoding the positive integers, developed by Peter Elias.

Encoding
To code a number X ≥ 1:
1. Let N = ⌊log₂ X⌋ be the highest power of 2 in X, so 2^N ≤ X < 2^(N+1).
2. Let L = ⌊log₂(N+1)⌋ be the highest power of 2 in N+1, so 2^L ≤ N+1 < 2^(L+1).
3. Write L zeros, followed by
4. the (L+1)-bit binary representation of N+1, followed by
5. all but the leading bit (i.e. the last N bits) of X.

An equivalent way to express the same process:
1. Separate X into the highest power of 2 it contains (2^N) and the remaining N binary digits.
2. Encode N+1 with Elias gamma coding.
3. Append the remaining N binary digits to this representation of N+1.

To represent a number X, Elias delta uses ⌊log₂ X⌋ + 2⌊log₂(⌊log₂ X⌋ + 1)⌋ + 1 bits.
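A minimal Python sketch of the encoder, following the steps above and returning the code word as a string of '0' and '1' characters rather than packed bits:

def elias_delta(x: int) -> str:
    """Elias delta code of a positive integer x, as a bit string."""
    assert x >= 1
    n = x.bit_length() - 1            # N = floor(log2 x)
    l = (n + 1).bit_length() - 1      # L = floor(log2 (N+1))
    return ("0" * l                   # L zeros
            + format(n + 1, "b")      # the (L+1)-bit binary form of N+1
            + format(x, "b")[1:])     # x without its leading bit: the last N bits

for x in (1, 2, 3, 4, 10, 17):
    print(x, elias_delta(x))          # 1 -> 1, 2 -> 0100, 3 -> 0101, 4 -> 01100, ...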

Elias Gamma Coding
Elias γ code or Elias gamma code is a universal code encoding positive integers, developed by Peter Elias. It is used most commonly when coding integers whose upper bound cannot be determined beforehand.

Encoding
To code a number x ≥ 1:
1. Let N = ⌊log₂ x⌋ be the highest power of 2 it contains, so 2^N ≤ x < 2^(N+1).
2. Write out N zero bits, then
3. Append the binary form of x, an (N+1)-bit binary number.

An equivalent way to express the same process:
1. Encode N in unary; that is, as N zeroes followed by a one.
2. Append the remaining N binary digits of x to this representation of N.

To represent a number x, Elias gamma (γ) uses 2⌊log₂ x⌋ + 1 bits.

Decoding
To decode an Elias gamma-coded integer:
1. Read and count 0s from the stream until you reach the first 1. Call this count of zeroes N.
2. Considering the one that was r ...
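A matching Python sketch of the gamma encoder and decoder, again using '0'/'1' strings for clarity:

def elias_gamma(x: int) -> str:
    """Elias gamma code of a positive integer x: N zero bits, then the
    (N+1)-bit binary form of x."""
    assert x >= 1
    n = x.bit_length() - 1               # N = floor(log2 x)
    return "0" * n + format(x, "b")

def elias_gamma_decode(bits: str) -> tuple[int, str]:
    """Decode one gamma code word from the front of a bit string;
    return the integer and the remaining bits."""
    n = 0
    while bits[n] == "0":                # count leading zeros
        n += 1
    value = int(bits[n:2 * n + 1], 2)    # the next N+1 bits, including the leading 1
    return value, bits[2 * n + 1:]

for x in (1, 2, 3, 4, 10):
    code = elias_gamma(x)
    print(x, code)                       # 1 -> 1, 2 -> 010, 3 -> 011, 4 -> 00100, ...
    assert elias_gamma_decode(code) == (x, "")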

Lossless Compression Algorithms
Lossless compression is a class of data compression that allows the original data to be perfectly reconstructed from the compressed data with no loss of information. Lossless compression is possible because most real-world data exhibits statistical redundancy. By contrast, lossy compression permits reconstruction only of an approximation of the original data, though usually with greatly improved compression rates (and therefore reduced media sizes). By operation of the pigeonhole principle, no lossless compression algorithm can shrink the size of all possible data: Some data will get longer by at least one symbol or bit. Compression algorithms are usually effective for human- and machine-readable documents and cannot shrink the size of random data that contain no redundancy. Different algorithms exist that are designed either with a specific type of input data in mind or with specific assumptions about what kinds of redundancy the uncompressed data are likely to contain. Lo ...
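The pigeonhole argument can be made concrete with a small count: there are 2^n distinct n-bit inputs but only 2^n − 1 bit strings of length strictly less than n, so no losslessly decodable (injective) mapping can shorten every input. A tiny Python check for n = 8:

n = 8
inputs = 2 ** n                             # distinct n-bit inputs: 256
shorter = sum(2 ** k for k in range(n))     # bit strings of length 0..n-1: 255
print(inputs, shorter, inputs > shorter)    # 256 255 True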