HOME
*



picture info

Wavelet Tree
The Wavelet Tree is a succinct data structure to store strings in compressed space. It generalizes the \mathbf_q and \mathbf_q operations defined on bitvectors to arbitrary alphabets. Originally introduced to represent compressed suffix arrays, it has found application in several contexts. The tree is defined by recursively partitioning the alphabet into pairs of subsets; the leaves correspond to individual symbols of the alphabet, and at each node a bitvector stores whether a symbol of the string belongs to one subset or the other. The name derives from an analogy with the wavelet transform for signals, which recursively decomposes a signal into low-frequency and high-frequency components. Properties Let \Sigma be a finite alphabet with \sigma=, \Sigma, . By using succinct dictionaries in the nodes, a string s \in \Sigma^* can be stored in , s, H_0(s) + o(, s, \log \sigma), where H_0(s) is the order-0 empirical entropy of s. If the tree is balanced, the operations \mathbf ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




Wavelet Tree
The Wavelet Tree is a succinct data structure to store strings in compressed space. It generalizes the \mathbf_q and \mathbf_q operations defined on bitvectors to arbitrary alphabets. Originally introduced to represent compressed suffix arrays, it has found application in several contexts. The tree is defined by recursively partitioning the alphabet into pairs of subsets; the leaves correspond to individual symbols of the alphabet, and at each node a bitvector stores whether a symbol of the string belongs to one subset or the other. The name derives from an analogy with the wavelet transform for signals, which recursively decomposes a signal into low-frequency and high-frequency components. Properties Let \Sigma be a finite alphabet with \sigma=, \Sigma, . By using succinct dictionaries in the nodes, a string s \in \Sigma^* can be stored in , s, H_0(s) + o(, s, \log \sigma), where H_0(s) is the order-0 empirical entropy of s. If the tree is balanced, the operations \mathbf ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Succinct Data Structure
In computer science, a succinct data structure is a data structure which uses an amount of space that is "close" to the information-theoretic lower bound, but (unlike other compressed representations) still allows for efficient query operations. The concept was originally introduced by Jacobson to encode bit vectors, (unlabeled) trees, and planar graphs. Unlike general lossless data compression algorithms, succinct data structures retain the ability to use them in-place, without decompressing them first. A related notion is that of a compressed data structure, in which the size of the data structure depends upon the particular data being represented. Suppose that Z is the information-theoretical optimal number of bits needed to store some data. A representation of this data is called: * ''implicit'' if it takes Z + O(1) bits of space, * ''succinct'' if it takes Z + o(Z) bits of space, and * ''compact'' if it takes O(Z) bits of space. For example, a data structure that uses 2Z bits ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Succinct Data Structure
In computer science, a succinct data structure is a data structure which uses an amount of space that is "close" to the information-theoretic lower bound, but (unlike other compressed representations) still allows for efficient query operations. The concept was originally introduced by Jacobson to encode bit vectors, (unlabeled) trees, and planar graphs. Unlike general lossless data compression algorithms, succinct data structures retain the ability to use them in-place, without decompressing them first. A related notion is that of a compressed data structure, in which the size of the data structure depends upon the particular data being represented. Suppose that Z is the information-theoretical optimal number of bits needed to store some data. A representation of this data is called: * ''implicit'' if it takes Z + O(1) bits of space, * ''succinct'' if it takes Z + o(Z) bits of space, and * ''compact'' if it takes O(Z) bits of space. For example, a data structure that uses 2Z bits ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


Compressed Suffix Array
In computer science, a compressed suffix arrayR. Grossi and J. S. Vitter, Compressed Suffix Arrays and Suffix Trees, with Applications to Text Indexing and String Matching, ''SIAM Journal on Computing,'' 35(2), 2005, 378-407. An earlier version appeared in ''Proceedings of the 32nd ACM Symposium on Theory of Computing,'' May 2000, 397-406.Paolo Ferragina and Giovanni Manzini (2000). "Opportunistic Data Structures with Applications". Proceedings of the 41st Annual Symposium on Foundations of Computer Science. p.390.R. Grossi, A. Gupta, and J. S. Vitter, High-Order Entropy-Compressed Text Indexes, ''Proceedings of the 14th Annual SIAM/ACM Symposium on Discrete Algorithms,'' January 2003, 841-850. is a compressed data structure for pattern matching. Compressed suffix arrays are a general class of data structure that improve on the suffix array. These data structures enable quick search for an arbitrary string with a comparatively small index. Given a text ''T'' of ''n'' characters fr ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Wavelet Transform
In mathematics, a wavelet series is a representation of a square-integrable (real number, real- or complex number, complex-valued) function (mathematics), function by a certain orthonormal series (mathematics), series generated by a wavelet. This article provides a formal, mathematical definition of an orthonormal wavelet and of the integral wavelet transform. Definition A function \psi \,\in\, L^2(\mathbb) is called an orthonormal wavelet if it can be used to define a Hilbert space#Orthonormal bases, Hilbert basis, that is a orthonormal basis, complete orthonormal system, for the Hilbert space L^2\left(\mathbb\right) of Square-integrable function, square integrable functions. The Hilbert basis is constructed as the family of functions \ by means of Dyadic transformation, dyadic translation (geometry), translations and dilation (operator theory), dilations of \psi\,, :\psi_(x) = 2^\frac \psi\left(2^jx - k\right)\, for integers j,\, k \,\in\, \mathbb. If under the standard ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Entropy (information Theory)
In information theory, the entropy of a random variable is the average level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes. Given a discrete random variable X, which takes values in the alphabet \mathcal and is distributed according to p: \mathcal\to , 1/math>: \Eta(X) := -\sum_ p(x) \log p(x) = \mathbb \log p(X), where \Sigma denotes the sum over the variable's possible values. The choice of base for \log, the logarithm, varies for different applications. Base 2 gives the unit of bits (or " shannons"), while base ''e'' gives "natural units" nat, and base 10 gives units of "dits", "bans", or " hartleys". An equivalent definition of entropy is the expected value of the self-information of a variable. The concept of information entropy was introduced by Claude Shannon in his 1948 paper "A Mathematical Theory of Communication",PDF archived froherePDF archived frohere and is also referred to as Shannon entropy. Shannon's theory defi ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  




FM-index
In computer science, an FM-index is a compressed full-text substring index based on the Burrows–Wheeler transform, with some similarities to the suffix array. It was created by Paolo Ferragina and Giovanni Manzini,Paolo Ferragina and Giovanni Manzini (2000)"Opportunistic Data Structures with Applications".Proceedings of the 41st Annual Symposium on Foundations of Computer Science. p.390. who describe it as an opportunistic data structure as it allows compression of the input text while still permitting fast substring queries. The name stands for Full-text index in Minute space. It can be used to efficiently find the number of occurrences of a pattern within the compressed text, as well as locate the position of each occurrence. The query time, as well as the required storage space, has a sublinear complexity with respect to the size of the input data. The original authors have devised improvements to their original approach and dubbed it "FM-Index version 2". A further improvemen ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Trie
In computer science, a trie, also called digital tree or prefix tree, is a type of ''k''-ary search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters. In order to access a key (to recover its value, change it, or remove it), the trie is traversed depth-first, following the links between nodes, which represent each character in the key. Unlike a binary search tree, nodes in the trie do not store their associated key. Instead, a node's position in the trie defines the key with which it is associated. This distributes the value of each key across the data structure, and means that not every node necessarily has an associated value. All the children of a node have a common prefix of the string associated with that parent node, and the root is associated with the empty string. This task of storing data accessible by its prefix can be ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]  


picture info

Trees (data Structures)
In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are usable as lumber or plants above a specified height. In wider definitions, the taller palms, tree ferns, bananas, and bamboos are also trees. Trees are not a taxonomic group but include a variety of plant species that have independently evolved a trunk and branches as a way to tower above other plants to compete for sunlight. The majority of tree species are angiosperms or hardwoods; of the rest, many are gymnosperms or softwoods. Trees tend to be long-lived, some reaching several thousand years old. Trees have been in existence for 370 million years. It is estimated that there are some three trillion mature trees in the world. A tree typically has many secondary branches supported clear of the ground by the trunk. This trunk typically co ...
[...More Info...]      
[...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]