Ogden's Lemma
   HOME

TheInfoList



OR:

In the theory of
formal language In logic, mathematics, computer science, and linguistics, a formal language is a set of strings whose symbols are taken from a set called "alphabet". The alphabet of a formal language consists of symbols that concatenate into strings (also c ...
s, Ogden's lemma (named after William F. Ogden) is a generalization of the pumping lemma for context-free languages. Despite Ogden's lemma being a strengthening of the pumping lemma, it is insufficient to fully characterize the class of context-free languages. This is in contrast to the
Myhill–Nerode theorem In the theory of formal languages, the Myhill–Nerode theorem provides a necessary and sufficient condition for a language to be regular. The theorem is named for John Myhill and Anil Nerode, who proved it at the University of Chicago in 1957 . ...
, which unlike the
pumping lemma for regular languages In the theory of formal languages, the pumping lemma for regular languages is a Lemma (mathematics), lemma that describes an essential property of all regular languages. Informally, it says that all sufficiently long string (computer science), st ...
is a necessary and sufficient condition for regularity.


Statement

We will use underlines to indicate "marked" positions.


Special cases

Ogden's lemma is often stated in the following form, which can be obtained by "forgetting about" the grammar, and concentrating on the language itself: If a language is context-free, then there exists some number p\geq 1 (where may or may not be a pumping length) such that for any string of length at least in and every way of "marking" or more of the positions in , can be written as :s = uvwxy with strings and , such that # has at least one marked position, # has at most marked positions, and #uv^n wx^n y \in L for all n \geq 0. In the special case where every position is marked, Ogden's lemma is equivalent to the pumping lemma for context-free languages. Ogden's lemma can be used to show that certain languages are not context-free in cases where the pumping lemma is not sufficient. An example is the language \.


Example applications


Non-context-freeness

The special case of Ogden's lemma is often sufficient to prove some languages are not context-free. For example, \ is a standard example of non-context-free language, Similarly, one can prove the "copy twice" language L=\ is not context-free, by using Ogden's lemma on a^\underlinea^b^. And the given example last section \ is not context-free by using Ogden's lemma on ab^ \underlined^.


Inherent ambiguity

Ogden's lemma can be used to prove the inherent ambiguity of some languages, which is implied by the title of Ogden's paper. Example: Let L_0 = \, L_1 = \. The language L = L_0 \cup L_1 is inherently ambiguous. (Example from page 3 of Ogden's paper.) Similarly, L^* is inherently ambiguous, and for any CFG of the language, letting p be the constant for Ogden's lemma, we find that (a^b^c^)^n has at least 2^n different parses. Thus L^* has an unbounded degree of inherent ambiguity.


Undecidability

The proof can be extended to show that deciding whether a CFG is inherently ambiguous is undecidable, by reduction to the
Post correspondence problem The Post correspondence problem is an undecidable decision problem that was introduced by Emil Post in 1946. Because it is simpler than the halting problem In computability theory (computer science), computability theory, the halting problem ...
. It can also show that deciding whether a CFG has an unbounded degree of inherent ambiguity is undecidable. (page 4 of Ogden's paper)


Generalized condition

Bader and Moura have generalized the lemma to allow marking some positions that are ''not'' to be included in . Their dependence of the parameters was later improved by Dömösi and Kudlek. If we denote the number of such ''excluded'' positions by , then the number of ''marked'' positions of which we want to include some in must satisfy d\geq p(e+1), where is some constant that depends only on the language. The statement becomes that every can be written as :s = uvwxy with strings and , such that # has at least one marked position and no excluded position, # has at most p^ marked positions, and #uv^n wx^n y \in L for all n \geq 0. Moreover, either each of has a marked position, or each of w,x,y has a marked position.


References

{{Reflist Formal languages Lemmas