Generalized Context-free Grammar
   HOME

TheInfoList



OR:

Generalized context-free grammar (GCFG) is a grammar formalism that expands on
context-free grammars In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be empt ...
by adding potentially non-context-free composition functions to rewrite rules.
Head grammar Head grammar (HG) is a grammar formalism introduced in Carl Pollard (1984) Pollard, C. 1984. ''Generalized Phrase Structure Grammars, Head Grammars, and Natural Language''. Ph.D. thesis, Stanford University, CA. as an extension of the context-fr ...
(and its weak equivalents) is an instance of such a GCFG which is known to be especially adept at handling a wide variety of non-CF properties of natural language.


Description

A GCFG consists of two components: a set of composition functions that combine string tuples, and a set of rewrite rules. The composition functions all have the form f(\langle x_1, ..., x_m \rangle, \langle y_1, ..., y_n \rangle, ...) = \gamma, where \gamma is either a single string tuple, or some use of a (potentially different) composition function which reduces to a string tuple. Rewrite rules look like X \to f(Y, Z, ...), where Y, Z, ... are string tuples or non-terminal symbols. The rewrite semantics of GCFGs is fairly straightforward. An occurrence of a non-terminal symbol is rewritten using rewrite rules as in a context-free grammar, eventually yielding just compositions (composition functions applied to string tuples or other compositions). The composition functions are then applied, successively reducing the tuples to a single tuple.


Example

A simple translation of a context-free grammar into a GCFG can be performed in the following fashion. Given the grammar in (), which generates the palindrome language \, where w^R is the string reverse of w, we can define the composition function ''conc'' as in () and the rewrite rules as in (). The CF production of ' is : : : : : and the corresponding GCFG production is : S \to conc(\langle a \rangle, S, \langle a \rangle) : conc(\langle a \rangle, conc(\langle b \rangle, S, \langle b \rangle), \langle a \rangle) : conc(\langle a \rangle, conc(\langle b \rangle, conc(\langle b \rangle, S, \langle b \rangle), \langle b \rangle), \langle a \rangle) : conc(\langle a \rangle, conc(\langle b \rangle, conc(\langle b \rangle, conc(\langle \epsilon \rangle, \langle \epsilon \rangle, \langle \epsilon \rangle), \langle b \rangle), \langle b \rangle), \langle a \rangle) : conc(\langle a \rangle, conc(\langle b \rangle, conc(\langle b \rangle, \langle \epsilon \rangle, \langle b \rangle), \langle b \rangle), \langle a \rangle) : conc(\langle a \rangle, conc(\langle b \rangle, \langle bb \rangle, \langle b \rangle), \langle a \rangle) : conc(\langle a \rangle, \langle bbbb \rangle, \langle a \rangle) : \langle abbbba \rangle


Linear Context-free Rewriting Systems (LCFRSs)

Weir (1988) describes two properties of composition functions, linearity and regularity. A function defined as f(x_1, ..., x_n) = ... is linear if and only if each variable appears at most once on either side of the ''='', making f(x) = g(x, y) linear but not f(x) = g(x, x). A function defined as f(x_1, ..., x_n) = ... is regular if the left hand side and right hand side have exactly the same variables, making f(x, y) = g(y, x) regular but not f(x) = g(x, y) or f(x, y) = g(x). A grammar in which all composition functions are both linear and regular is called a Linear Context-free Rewriting System (LCFRS). LCFRS is a
proper subclass In set theory and its applications throughout mathematics, a class is a collection of sets (or sometimes other mathematical objects) that can be unambiguously defined by a property that all its members share. Classes act as a way to have set-lik ...
of the GCFGs, i.e. it has strictly less computational power than the GCFGs as a whole. On the other hand, LCFRSs are strictly more expressive than
linear-indexed grammar Indexed grammars are a generalization of context-free grammars in that nonterminals are equipped with lists of ''flags'', or ''index symbols''. The language produced by an indexed grammar is called an indexed language. Definition Modern definitio ...
s and their weakly equivalent variant
tree adjoining grammar Tree-adjoining grammar (TAG) is a grammar formalism defined by Aravind Joshi. Tree-adjoining grammars are somewhat similar to context-free grammars, but the elementary unit of rewriting is the tree rather than the symbol. Whereas context-free gr ...
s (TAGs).
Head grammar Head grammar (HG) is a grammar formalism introduced in Carl Pollard (1984) Pollard, C. 1984. ''Generalized Phrase Structure Grammars, Head Grammars, and Natural Language''. Ph.D. thesis, Stanford University, CA. as an extension of the context-fr ...
is another example of an LCFRS that is strictly less powerful than the class of LCFRSs as a whole. LCFRS are weakly equivalent to (set-local) ''multicomponent'' TAGs ( MCTAGs) and also with
multiple context-free grammar Multiple may refer to: Economics *Multiple finance, a method used to analyze stock prices *Multiples of the price-to-earnings ratio *Chain stores, are also referred to as 'Multiples' * Box office multiple, the ratio of a film's total gross to th ...
(MCFG

. and minimalist grammars (MGs). The languages generated by LCFRS (and their weakly equivalents) can be parsed in
polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
.


See also

*
Range concatenation grammar Range concatenation grammar (RCG) is a grammar formalism developed by Pierre Boullier in 1998 as an attempt to characterize a number of phenomena of natural language, such as Chinese numbers and German word order scrambling, which are outside the b ...


References

{{DEFAULTSORT:Generalized Context-Free Grammar Formal languages Grammar frameworks