Context-free Grammar Generation Algorithms
A straight-line grammar (sometimes abbreviated as SLG) is a formal grammar that generates exactly one string.Florian Benz and Timo Kötzing, “An effective heuristic for the smallest grammar problem,” Proceedings of the fifteenth annual conference on Genetic and evolutionary computation conference - GECCO ’13, 2013. , p. 488 Consequently, it does not branch (every non-terminal has only one associated production rule) nor loop (if non-terminal ''A'' appears in a derivation of ''B'', then ''B'' does not appear in a derivation of ''A''). Areas of usefulness Straight-line grammars are widely used in the development of algorithms that execute directly on compressed structures (without prior decompression). SLGs are of interest in fields like Kolmogorov complexity, Lossless data compression, Structure discovery and Compressed data structures. The problem of finding a context-free grammar (equivalently: an SLG) of minimal size that generates a given string is called the small ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Formal Grammar
In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe the meaning of the strings or what can be done with them in whatever context—only their form. A formal grammar is defined as a set of production rules for such strings in a formal language. Formal language theory, the discipline that studies formal grammars and languages, is a branch of applied mathematics. Its applications are found in theoretical computer science, theoretical linguistics, formal semantics, mathematical logic, and other areas. A formal grammar is a set of rules for rewriting strings, along with a "start symbol" from which rewriting starts. Therefore, a grammar is usually thought of as a language generator. However, it can also sometimes be used as the basis for a "recognizer"—a function in computing that deter ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Kolmogorov Complexity
In algorithmic information theory (a subfield of computer science and mathematics), the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program (in a predetermined programming language) that produces the object as output. It is a measure of the computational resources needed to specify the object, and is also known as algorithmic complexity, Solomonoff–Kolmogorov–Chaitin complexity, program-size complexity, descriptive complexity, or algorithmic entropy. It is named after Andrey Kolmogorov, who first published on the subject in 1963 and is a generalization of classical information theory. The notion of Kolmogorov complexity can be used to state and prove impossibility results akin to Cantor's diagonal argument, Gödel's incompleteness theorem, and Turing's halting problem. In particular, no program ''P'' computing a lower bound for each text's Kolmogorov complexity can return a value essentially larger than ''P'''s own leng ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Lossless Data Compression
Lossless compression is a class of data compression that allows the original data to be perfectly reconstructed from the compressed data with no loss of information. Lossless compression is possible because most real-world data exhibits statistical redundancy. By contrast, lossy compression permits reconstruction only of an approximation of the original data, though usually with greatly improved compression rates (and therefore reduced media sizes). By operation of the pigeonhole principle, no lossless compression algorithm can efficiently compress all possible data. For this reason, many different algorithms exist that are designed either with a specific type of input data in mind or with specific assumptions about what kinds of redundancy the uncompressed data are likely to contain. Therefore, compression ratios tend to be stronger on human- and machine-readable documents and code in comparison to entropic binary data (random bytes). Lossless data compression is used in many ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Structure Discovery
A structure is an arrangement and organization of interrelated elements in a material object or system, or the object or system so organized. Material structures include man-made objects such as buildings and machines and natural objects such as biological organisms, minerals and chemicals. Abstract structures include data structures in computer science and musical form. Types of structure include a hierarchy (a cascade of one-to-many relationships), a network featuring many-to-many links, or a lattice featuring connections between components that are neighbors in space. Load-bearing Buildings, aircraft, skeletons, anthills, beaver dams, bridges and salt domes are all examples of load-bearing structures. The results of construction are divided into buildings and non-building structures, and make up the infrastructure of a human society. Built structures are broadly divided by their varying design approaches and standards, into categories including building structures, arch ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Compressed Data Structure
The term compressed data structure arises in the computer science subfields of algorithms, data structures, and theoretical computer science. It refers to a data structure whose operations are roughly as fast as those of a conventional data structure for the problem, but whose size can be substantially smaller. The size of the compressed data structure is typically highly dependent upon the entropy of the data being represented. Important examples of compressed data structures include the compressed suffix array and the FM-index,P. Ferragina and G. Manzini, Opportunistic Data Structures with Applications, ''Proceedings of the 41st IEEE Symposium on Foundations of Computer Science'', November 2000, 390-398. Journal version in Indexing Compressed Text, ''Journal of the ACM'', 52(4), 2005, 552-581. both of which can represent an arbitrary text of characters ''T'' for pattern matching. Given any input pattern ''P'', they support the operation of finding if and where ''P'' appears i ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Smallest Grammar Problem
In data compression and the theory of formal languages, the smallest grammar problem is the problem of finding the smallest context-free grammar that generates a given string of characters (but no other string). The size of a grammar is defined by some authors as the number of symbols on the right side of the production rules. Others also add the number of rules to that.Florian Benz and Timo Kötzing, “An effective heuristic for the smallest grammar problem,” Proceedings of the fifteenth annual conference on Genetic and evolutionary computation conference - GECCO ’13, 2013. The (decision version of the) problem is NP-complete. The smallest context-free grammar that generates a given string is always a straight-line grammar without useless rules. See also * Grammar-based code * Kolmogorov Complexity * Lossless data compression Lossless compression is a class of data compression that allows the original data to be perfectly reconstructed from the compressed data with no ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Tree (data Structure)
In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be connected to exactly one parent, except for the ''root'' node, which has no parent. These constraints mean there are no cycles or "loops" (no node can be its own ancestor), and also that each child can be treated like the root node of its own subtree, making recursion a useful technique for tree traversal. In contrast to linear data structures, many trees cannot be represented by relationships between neighboring nodes in a single straight line. Binary trees are a commonly used type, which constrain the number of children for each parent to exactly two. When the order of the children is specified, this data structure corresponds to an ordered tree in graph theory. A value or pointer to other data may be associated with every node in the tre ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Context-free Grammar
In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be empty). A formal grammar is "context-free" if its production rules can be applied regardless of the context of a nonterminal. No matter which symbols surround it, the single nonterminal on the left hand side can always be replaced by the right hand side. This is what distinguishes it from a context-sensitive grammar. A formal grammar is essentially a set of production rules that describe all possible strings in a given formal language. Production rules are simple replacements. For example, the first rule in the picture, :\langle\text\rangle \to \langle\text\rangle = \langle\text\rangle ; replaces \langle\text\rangle with \langle\text\rangle = \langle\text\rangle ;. There can be multiple replacement rules for a given nonterminal symbol. The ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Directed Graph
In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph that is made up of a set of vertices connected by directed edges, often called arcs. Definition In formal terms, a directed graph is an ordered pair where * ''V'' is a set whose elements are called '' vertices'', ''nodes'', or ''points''; * ''A'' is a set of ordered pairs of vertices, called ''arcs'', ''directed edges'' (sometimes simply ''edges'' with the corresponding set named ''E'' instead of ''A''), ''arrows'', or ''directed lines''. It differs from an ordinary or undirected graph, in that the latter is defined in terms of unordered pairs of vertices, which are usually called ''edges'', ''links'' or ''lines''. The aforementioned definition does not allow a directed graph to have multiple arrows with the same source and target nodes, but some authors consider a broader definition that allows directed graphs to have such multiple arcs (namely, they allow the arc set to be a m ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Directed Acyclic Graph
In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one vertex to another, such that following those directions will never form a closed loop. A directed graph is a DAG if and only if it can be topologically ordered, by arranging the vertices as a linear ordering that is consistent with all edge directions. DAGs have numerous scientific and computational applications, ranging from biology (evolution, family trees, epidemiology) to information science (citation networks) to computation (scheduling). Directed acyclic graphs are sometimes instead called acyclic directed graphs or acyclic digraphs. Definitions A graph is formed by vertices and by edges connecting pairs of vertices, where the vertices can be any kind of object that is connected in pairs by edges. In the case of a directed graph, ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Chomsky Normal Form
In formal language theory, a context-free grammar, ''G'', is said to be in Chomsky normal form (first described by Noam Chomsky) if all of its production rules are of the form: : ''A'' → ''BC'', or : ''A'' → ''a'', or : ''S'' → ε, where ''A'', ''B'', and ''C'' are nonterminal symbols, the letter ''a'' is a terminal symbol (a symbol that represents a constant value), ''S'' is the start symbol, and ε denotes the empty string. Also, neither ''B'' nor ''C'' may be the start symbol, and the third production rule can only appear if ε is in ''L''(''G''), the language produced by the context-free grammar ''G''. Every grammar in Chomsky normal form is context-free, and conversely, every context-free grammar can be transformed into an equivalent onethat is, one that produces the same language which is in Chomsky normal form and has a size no larger than the square of the original grammar's size. Converting a grammar to Chomsky normal form To convert a grammar to Ch ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |
|
Straight-line Program
In mathematics, more specifically in computational algebra, a straight-line program (SLP) for a finite group ''G'' = 〈''S''〉 is a finite sequence ''L'' of elements of ''G'' such that every element of ''L'' either belongs to ''S'', is the inverse of a preceding element, or the product of two preceding elements. An SLP ''L'' is said to ''compute'' a group element ''g'' ∈ ''G'' if ''g'' ∈ ''L'', where ''g'' is encoded by a word in ''S'' and its inverses. Intuitively, an SLP computing some ''g'' ∈ ''G'' is an ''efficient'' way of storing ''g'' as a group word over ''S''; observe that if ''g'' is constructed in ''i'' steps, the word length of ''g'' may be exponential in ''i'', but the length of the corresponding SLP is linear in ''i''. This has important applications in computational group theory, by using SLPs to efficiently encode group elements as words over a given generating set. Straight-line programs were introduced by Bab ... [...More Info...]       [...Related Items...]     OR:     [Wikipedia]   [Google]   [Baidu]   |