HOME

TheInfoList



OR:

In
data compression In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compressio ...
and the theory of
formal language In logic, mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules. The alphabet of a formal language consists of sy ...
s, the smallest grammar problem is the problem of finding the smallest context-free grammar that generates a given string of characters (but no other string). The size of a grammar is defined by some authors as the number of symbols on the right side of the production rules. Others also add the number of rules to that.Florian Benz and Timo Kötzing, “An effective heuristic for the smallest grammar problem,” Proceedings of the fifteenth annual conference on Genetic and evolutionary computation conference - GECCO ’13, 2013. The (decision version of the) problem is
NP-complete In computational complexity theory, a problem is NP-complete when: # it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by trying ...
. The smallest context-free grammar that generates a given string is always a
straight-line grammar A straight-line grammar (sometimes abbreviated as SLG) is a formal grammar that generates exactly one string.Florian Benz and Timo Kötzing, “An effective heuristic for the smallest grammar problem,” Proceedings of the fifteenth annual conferen ...
without
useless rules In formal language theory, a context-free grammar (CFG) is a formal grammar whose Production (computer science), production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of Terminal and nonter ...
.


See also

*
Grammar-based code Grammar-based codes or Grammar-based compression are compression algorithms based on the idea of constructing a context-free grammar (CFG) for the string to be compressed. Examples include universal lossless data compression algorithms. To compre ...
*
Kolmogorov Complexity In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produ ...
*
Lossless data compression Lossless compression is a class of data compression that allows the original data to be perfectly reconstructed from the compressed data with no loss of information. Lossless compression is possible because most real-world data exhibits statistic ...


References

* Formal languages Data compression {{algorithm-stub