Context-sensitive Language

In formal language theory, a context-sensitive language is a language that can be defined by a context-sensitive grammar (and equivalently by a noncontracting grammar). Context-sensitive grammars are one of the four types of grammars in the Chomsky hierarchy.
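Because the rules of a noncontracting grammar never shorten a sentential form, membership can be decided by searching only the sentential forms that are no longer than the input. The following is a minimal sketch of that idea, not an efficient parser: the names `RULES` and `derives` are illustrative, and the grammar shown is a standard textbook grammar for the language of strings a^n b^n c^n (n ≥ 1), assumed here for the example.

```python
from collections import deque

# Assumed example grammar (noncontracting) for {a^n b^n c^n : n >= 1}.
# Uppercase letters are nonterminals; "S" is the start symbol.
RULES = [
    ("S", "aSBC"),
    ("S", "aBC"),
    ("CB", "BC"),
    ("aB", "ab"),
    ("bB", "bb"),
    ("bC", "bc"),
    ("cC", "cc"),
]


def derives(rules, start, word):
    """Return True if `word` can be derived from `start` using `rules`.

    Every rule must satisfy len(lhs) <= len(rhs); this is what makes it
    safe to discard any sentential form longer than `word`.
    """
    if not word:
        return False  # noncontracting rules cannot produce the empty string
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == word:
            return True
        for lhs, rhs in rules:
            # Try the rule at every position where its left side occurs.
            pos = form.find(lhs)
            while pos != -1:
                nxt = form[:pos] + rhs + form[pos + len(lhs):]
                if len(nxt) <= len(word) and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                pos = form.find(lhs, pos + 1)
    return False


if __name__ == "__main__":
    for w in ["abc", "aabbcc", "aabbc", "abcabc"]:
        print(w, derives(RULES, "S", w))
```

The search always terminates because only finitely many strings of bounded length exist, but the number of forms visited can grow exponentially with the input length.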


Computational properties

Computationally, the context-sensitive languages are exactly the languages accepted by a linear bounded nondeterministic Turing machine, also called a linear bounded automaton. That is a non-deterministic Turing machine with a tape of only kn cells, where n is the size of the input and k is a constant associated with the machine. This means that every formal language that can be decided by such a machine is a context-sensitive language, and every context-sensitive language can be decided by such a machine.

This set of languages is also known as NLINSPACE or NSPACE(O(n)), because they can be accepted using linear space on a non-deterministic Turing machine. The class LINSPACE (or DSPACE(O(n))) is defined the same way, except using a deterministic Turing machine. Clearly LINSPACE is a subset of NLINSPACE, but it is not known whether LINSPACE = NLINSPACE.
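The relationships stated above can be summarized in symbols. In this sketch, CSL is shorthand (an assumption of this summary, not notation used elsewhere in the article) for the class of context-sensitive languages:

```latex
\mathrm{CSL} = \mathrm{NSPACE}(O(n)), \qquad
\mathrm{DSPACE}(O(n)) \subseteq \mathrm{NSPACE}(O(n)), \qquad
\mathrm{DSPACE}(O(n)) \overset{?}{=} \mathrm{NSPACE}(O(n))
```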


Examples

One of the simplest context-sensitive but not context-free languages is L = \{ a^n b^n c^n : n \geq 1 \}: the language of all strings consisting of n occurrences of the symbol "a", then n "b"s, then n "c"s (abc, aabbcc, aaabbbccc, etc.). A superset of this language, called the Bach language, is defined as the set of all strings where "a", "b" and "c" (or any other set of three symbols) occur equally often (aabccb, baabcaccb, etc.) and is also context-sensitive.

L can be shown to be a context-sensitive language by constructing a linear bounded automaton which accepts L. The language can easily be shown to be neither regular nor context-free by applying the respective pumping lemmas for each of the language classes to L.

Similarly:

* L_{Cross} = \{ a^m b^n c^m d^n : m \geq 1, n \geq 1 \} is another context-sensitive language; the corresponding context-sensitive grammar can be easily constructed starting with two context-free grammars generating sentential forms in the formats a^mC^m and B^nd^n and then supplementing them with a permutation production like CB \rightarrow BC, a new starting symbol and standard syntactic sugar.
* L_{MUL3} = \{ a^m b^n c^{mn} : m \geq 1, n \geq 1 \} is another context-sensitive language (the "3" in the name of this language is intended to mean a ternary alphabet); that is, the "product" operation defines a context-sensitive language (but the "sum", a^m b^n c^{m+n}, defines only a context-free language, as the grammar S \rightarrow aSc \mid R and R \rightarrow bRc \mid bc shows). Because of the commutative property of the product, the most intuitive grammar for L_{MUL3} is ambiguous. This problem can be avoided by considering a somewhat more restrictive definition of the language, e.g. L_{ORDMUL3} = \{ a^m b^n c^{mn} : 1 < m < n \}. This can be specialized to L_{MUL1} = \{ a^{mn} : m > 1, n > 1 \} and, from this, to L_{m^2} = \{ a^{m^2} : m > 1 \}, L_{m^3} = \{ a^{m^3} : m > 1 \}, etc.
* L_{REP} = \{ w^{|w|} : w \in \Sigma^* \} is a context-sensitive language. The corresponding context-sensitive grammar can be obtained as a generalization of the context-sensitive grammars for L_{Square} = \{ w^2 : w \in \Sigma^* \}, L_{Cube} = \{ w^3 : w \in \Sigma^* \}, etc.
* L_{EXP} = \{ a^{2^n} : n \geq 1 \} is a context-sensitive language.
* L_{PRIMES2} = \{ w : |w| \text{ is prime} \} is a context-sensitive language (the "2" in the name of this language is intended to mean a binary alphabet). This was proved by Hartmanis using pumping lemmas for regular and context-free languages over a binary alphabet and, after that, sketching a linear bounded multitape automaton accepting L_{PRIMES2}.
* L_{PRIMES1} = \{ a^p : p \text{ is prime} \} is a context-sensitive language (the "1" in the name of this language is intended to mean a unary alphabet). This was credited by A. Salomaa to Matti Soittola by means of a linear bounded automaton over a unary alphabet (pages 213-214, exercise 6.8) and also to Martti Penttonen by means of a context-sensitive grammar also over a unary alphabet (see: Formal Languages by A. Salomaa, page 14, Example 2.5).

An example of a recursive language that is not context-sensitive is any recursive language whose decision is an EXPSPACE-hard problem, say, the set of pairs of equivalent regular expressions with exponentiation.
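The linear bounded automaton argument for the first example, the language of strings a^n b^n c^n with n ≥ 1, can be illustrated with a small marking procedure. This is a sketch under stated assumptions rather than a formal automaton: the function name `accepts_anbncn` and the marker symbol "X" are illustrative, and the procedure is deterministic (a special case of a linear bounded automaton). The point is that the only working storage is the list `tape`, which has exactly one cell per input symbol.

```python
def accepts_anbncn(word):
    """In-place marking recognizer for {a^n b^n c^n : n >= 1}."""
    tape = list(word)  # working storage: one cell per input symbol

    # Phase 1: check the shape a+ b+ c+ with a single left-to-right scan.
    i = 0
    for symbol in "abc":
        start = i
        while i < len(tape) and tape[i] == symbol:
            i += 1
        if i == start:
            return False  # this block of symbols is missing
    if i != len(tape):
        return False  # trailing symbols after the block of c's

    # Phase 2: repeatedly cross off one 'a', one 'b' and one 'c'.
    while "a" in tape:
        for symbol in "abc":
            if symbol not in tape:
                return False  # ran out of 'b' or 'c' before 'a'
            tape[tape.index(symbol)] = "X"

    # Accept only if nothing is left uncrossed (no surplus 'b' or 'c').
    return all(cell == "X" for cell in tape)


if __name__ == "__main__":
    for w in ["abc", "aabbcc", "aabbc", "abbc", "abcabc"]:
        print(w, accepts_anbncn(w))
```

A real automaton would track its position with the tape head and finite control instead of Python index variables; the sketch only mirrors the space bound.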


Properties of context-sensitive languages

* The union, intersection, and concatenation of two context-sensitive languages are context-sensitive; the Kleene plus of a context-sensitive language is also context-sensitive.
* The complement of a context-sensitive language is itself context-sensitive, a result known as the Immerman–Szelepcsényi theorem.
* Membership of a string in a language defined by an arbitrary context-sensitive grammar, or by an arbitrary deterministic context-sensitive grammar, is a PSPACE-complete problem.


See also

* Linear bounded automaton
* List of parser generators for context-sensitive languages
* Chomsky hierarchy
* Indexed languages, a strict subset of the context-sensitive languages
* Weir hierarchy


References

* Sipser, M. (1996), Introduction to the Theory of Computation, PWS Publishing Co.