In
formal language theory
In logic, mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules.
The alphabet of a formal language consists of symb ...
, a
context-sensitive grammar is in Kuroda normal form if all production rules are of the form:
:''AB'' → ''CD'' or
:''A'' → ''BC'' or
:''A'' → ''B'' or
:''A'' → ''a''
where A, B, C and D are
nonterminal
In computer science, terminal and nonterminal symbols are the lexical elements used in specifying the production rules constituting a formal grammar. ''Terminal symbols'' are the elementary symbols of the language defined by a formal grammar. ...
symbols and ''a'' is a
terminal symbol
In computer science, terminal and nonterminal symbols are the lexical elements used in specifying the production rules constituting a formal grammar. ''Terminal symbols'' are the elementary symbols of the language defined by a formal grammar. ...
.
Some sources omit the ''A'' → ''B'' pattern.
It is named after
Sige-Yuki Kuroda
, aka S.-Y. Kuroda, was Professor Emeritus
and Research Professor of Linguistics at the University of California, San Diego.
Although a pioneer in the application of Chomskyan generative syntax to
the Japanese language, he is known for the broad ...
, who originally called it a linear bounded grammar—a terminology that was also used by a few other authors thereafter.
Every grammar in Kuroda normal form is
noncontracting, and therefore, generates a
context-sensitive language
In formal language theory, a context-sensitive language is a language that can be defined by a context-sensitive grammar (and equivalently by a noncontracting grammar). Context-sensitive is one of the four types of grammars in the Chomsky hierarc ...
. Conversely, every context-sensitive language which does not generate the
empty string
In formal language theory, the empty string, or empty word, is the unique string of length zero.
Formal theory
Formally, a string is a finite, ordered sequence of characters such as letters, digits or spaces. The empty string is the special cas ...
can be generated by a grammar in Kuroda normal form.
A straightforward technique attributed to György Révész transforms a grammar in Kuroda's form to Chomsky's CSG: ''AB'' → ''CD'' is replaced by four context-sensitive rules ''AB'' → ''AZ'', ''AZ'' → ''WZ'', ''WZ'' → ''WD'' and ''WD'' → ''CD''. This technique also proves that every noncontracting grammar is context-sensitive.
There is a similar normal form for
unrestricted grammar In automata theory, the class of unrestricted grammars (also called semi-Thue, type-0 or phrase structure grammars) is the most general class of grammars in the Chomsky hierarchy. No restrictions are made on the productions of an unrestricted gramma ...
s as well, which at least some authors call "Kuroda normal form" too:
:''AB'' → ''CD'' or
:''A'' → ''BC'' or
:''A'' → ''a'' or
:''A'' → ''ε''
where ε is the empty string. Every unrestricted grammar is
weakly equivalent to one using only productions of this form.
If the rule AB → CD is eliminated from the above, then one obtains context-free languages.
The Penttonen normal form (for unrestricted grammars) is a special case where first rule above is ''AB'' → ''AD''.
Similarly, for context-sensitive grammars, the Penttonen normal form, also called the one-sided normal form (following Penttonen's own terminology) is:
:''AB'' → ''AD'' or
:''A'' → ''BC'' or
:''A'' → ''a''
For every context-sensitive grammar, there exists a weakly equivalent one-sided normal form.
See also
*
Backus–Naur form
In computer science, Backus–Naur form () or Backus normal form (BNF) is a metasyntax notation for context-free grammars, often used to describe the syntax of languages used in computing, such as computer programming languages, document formats ...
*
Chomsky normal form
In formal language theory, a context-free grammar, ''G'', is said to be in Chomsky normal form (first described by Noam Chomsky) if all of its production rules are of the form:
: ''A'' → ''BC'', or
: ''A'' → ''a'', or
: ''S'' → ...
*
Greibach normal form In formal language theory, a context-free grammar is in Greibach normal form (GNF) if the right-hand sides of all production rules start with a terminal symbol, optionally followed by some variables. A non-strict form allows one exception to this f ...
References
Further reading
*
* G. Révész, "Comment on the paper 'Error detection in formal languages,'" Journal of Computer and System Sciences, vol. 8, no. 2, pp. 238–242, Apr. 1974. (Révész' trick)
*
{{Formal languages and grammars, state=collapsed
Formal languages