HOME

TheInfoList



OR:

A production or production rule in computer science is a '' rewrite rule'' specifying a symbol substitution that can be recursively performed to generate new symbol sequences. A finite set of productions P is the main component in the specification of a
formal grammar In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe ...
(specifically a
generative grammar Generative grammar, or generativism , is a linguistic theory that regards linguistics as the study of a hypothesised innate grammatical structure. It is a biological or biologistic modification of earlier structuralist theories of linguistic ...
). The other components are a finite set N of
nonterminal symbol In computer science, terminal and nonterminal symbols are the lexical elements used in specifying the production rules constituting a formal grammar. ''Terminal symbols'' are the elementary symbols of the language defined by a formal grammar. ...
s, a finite set (known as an alphabet) \Sigma of
terminal symbol In computer science, terminal and nonterminal symbols are the lexical elements used in specifying the production rules constituting a formal grammar. ''Terminal symbols'' are the elementary symbols of the language defined by a formal grammar. ...
s that is disjoint from N and a distinguished symbol S \in N that is the ''start symbol''. In an
unrestricted grammar In automata theory, the class of unrestricted grammars (also called semi-Thue, type-0 or phrase structure grammars) is the most general class of grammars in the Chomsky hierarchy. No restrictions are made on the productions of an unrestricted gramma ...
, a production is of the form u \to v, where u and v are arbitrary strings of terminals and nonterminals, and u may not be the
empty string In formal language theory, the empty string, or empty word, is the unique string of length zero. Formal theory Formally, a string is a finite, ordered sequence of characters such as letters, digits or spaces. The empty string is the special cas ...
. If v is the empty string, this is denoted by the symbol \epsilon, or \lambda (rather than leave the right-hand side blank). So productions are members of the
cartesian product In mathematics, specifically set theory, the Cartesian product of two sets ''A'' and ''B'', denoted ''A''×''B'', is the set of all ordered pairs where ''a'' is in ''A'' and ''b'' is in ''B''. In terms of set-builder notation, that is : A\ti ...
:V^*NV^* \times V^* = (V^*\setminus\Sigma^*) \times V^*, where V := N \cup \Sigma is the ''vocabulary'', ^ is the
Kleene star In mathematical logic and computer science, the Kleene star (or Kleene operator or Kleene closure) is a unary operation, either on sets of strings or on sets of symbols or characters. In mathematics, it is more commonly known as the free monoid ...
operator, V^*NV^* indicates
concatenation In formal language theory and computer programming, string concatenation is the operation of joining character strings end-to-end. For example, the concatenation of "snow" and "ball" is "snowball". In certain formalisations of concatenat ...
, \cup denotes
set union In set theory, the union (denoted by ∪) of a collection of sets is the set of all elements in the collection. It is one of the fundamental operations through which sets can be combined and related to each other. A refers to a union of ze ...
, and \setminus denotes set minus or set difference. If we do not allow the start symbol to occur in v (the word on the right side), we have to replace V^* by (V \setminus \)^* on the right side of the cartesian product symbol.See Klaus Reinhardt
Prioritatszahlerautomaten und die Synchronisation von Halbspursprachen
; Fakultät Informatik der Universität Stuttgart; 1994 (German)
The other types of formal grammar in the
Chomsky hierarchy In formal language theory, computer science and linguistics, the Chomsky hierarchy (also referred to as the Chomsky–Schützenberger hierarchy) is a containment hierarchy of classes of formal grammars. This hierarchy of grammars was described by ...
impose additional restrictions on what constitutes a production. Notably in a context-free grammar, the left-hand side of a production must be a single nonterminal symbol. So productions are of the form: :N \to (N \cup \Sigma)^*


Grammar generation

To generate a string in the language, one begins with a string consisting of only a single ''start symbol'', and then successively applies the rules (any number of times, in any order) to rewrite this string. This stops when we obtain a string containing only terminals. The language consists of all the strings that can be generated in this manner. Any particular sequence of legal choices taken during this rewriting process yields one particular string in the language. If there are multiple different ways of generating this single string, then the grammar is said to be ambiguous. For example, assume the alphabet consists of a and b, with the start symbol S, and we have the following rules: : 1. S \rightarrow aSb : 2. S \rightarrow ba then we start with S, and can choose a rule to apply to it. If we choose rule 1, we replace S with aSb and obtain the string aSb. If we choose rule 1 again, we replace S with aSb and obtain the string aaSbb. This process is repeated until we only have symbols from the alphabet (i.e., a and b). If we now choose rule 2, we replace S with ba and obtain the string aababb, and are done. We can write this series of choices more briefly, using symbols: S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aababb. The language of the grammar is the set of all the strings that can be generated using this process: \{ba, abab, aababb, aaababbb, \dotsc\}.


See also

*
Formal grammar In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe ...
*
Finite automata A finite-state machine (FSM) or finite-state automaton (FSA, plural: ''automata''), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number o ...
*
Generative grammar Generative grammar, or generativism , is a linguistic theory that regards linguistics as the study of a hypothesised innate grammatical structure. It is a biological or biologistic modification of earlier structuralist theories of linguistic ...
*
L-system An L-system or Lindenmayer system is a parallel rewriting system and a type of formal grammar. An L-system consists of an alphabet of symbols that can be used to make strings, a collection of production rules that expand each symbol into som ...
* Rewrite rule * Backus–Naur form (A compact form for writing the productions of a context-free grammar.) * Phrase structure rule *
Post canonical system A Post canonical system, also known as a Post production system, as created by Emil Post, is a string-manipulation system that starts with finitely-many strings and repeatedly transforms them by applying a finite set j of specified rules of a cert ...
(Emil Post's production systems- a model of computation.)


References

Grammar Natural language processing Formal languages