Language equations are mathematical statements that resemble
numerical equations, but the variables assume values of
formal languages
In logic, mathematics, computer science, and linguistics, a formal language consists of string (computer science), words whose symbol (formal), letters are taken from an alphabet (formal languages), alphabet and are well-formedness, well-formed ...
rather than numbers. Instead of arithmetic operations in numerical equations, the variables are joined by language operations. Among the most common operations on two languages ''A'' and ''B'' are the
set union
In set theory, the union (denoted by ∪) of a collection of sets is the set of all elements in the collection. It is one of the fundamental operations through which sets can be combined and related to each other.
A refers to a union of ze ...
''A'' ∪ ''B'', the
set intersection
In set theory, the intersection of two sets A and B, denoted by A \cap B, is the set containing all elements of A that also belong to B or equivalently, all elements of B that also belong to A.
Notation and terminology
Intersection is writt ...
''A'' ∩ ''B'', and the
concatenation
In formal language theory and computer programming, string concatenation is the operation of joining character strings end-to-end. For example, the concatenation of "snow" and "ball" is "snowball". In certain formalisations of concatenat ...
''A''⋅''B''. Finally, as an operation taking a single
operand
In mathematics, an operand is the object of a mathematical operation, i.e., it is the object or quantity that is operated on.
Example
The following arithmetic expression shows an example of operators and operands:
:3 + 6 = 9
In the above examp ...
, the set ''A''
* denotes the
Kleene star
In mathematical logic and computer science, the Kleene star (or Kleene operator or Kleene closure) is a unary operation, either on sets of strings or on sets of symbols or characters. In mathematics,
it is more commonly known as the free monoid ...
of the language ''A''. Therefore language equations can be used to represent
formal grammar
In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe ...
s, since the languages generated by the grammar must be the solution of a system of language equations.
Language equations and context-free grammars
Ginsburg and
Rice
Rice is the seed of the grass species ''Oryza sativa'' (Asian rice) or less commonly ''Oryza glaberrima
''Oryza glaberrima'', commonly known as African rice, is one of the two domesticated rice species. It was first domesticated and grown i ...
gave an alternative definition of
context-free grammar
In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form
:A\ \to\ \alpha
with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be empt ...
s by language equations. To every context-free grammar
, is associated a system of equations in variables
. Each variable
is an unknown language over
and is defined by the equation
where
, ...,
are all productions for
. Ginsburg and Rice used a
fixed-point iteration
In numerical analysis, fixed-point iteration is a method of computing fixed points of a function.
More specifically, given a function f defined on the real numbers with real values and given a point x_0 in the domain of f, the fixed-point itera ...
argument to show that a solution always exists, and proved that i.e. any other solution must be a of this one.
For example, the grammar
corresponds to the equation system
which has as solution every superset of
.
Language equations with added intersection analogously correspond to
conjunctive grammars.
Language equations and finite automata
Brzozowski and Leiss
studied ''left language equations'' where every concatenation is with a singleton constant language on the left, e.g.
with variable
, but not
nor
. Each equation is of the form
with one variable on the right-hand side. Every
nondeterministic finite automaton
In automata theory, a finite-state machine is called a deterministic finite automaton (DFA), if
* each of its transitions is ''uniquely'' determined by its source state and input symbol, and
* reading an input symbol is required for each state ...
has such corresponding equation using left-concatenation and union, see Fig. 1. If intersection operation is allowed, equations correspond to
alternating finite automata.
Baader
Baader is a surname of German origin.
People with the surname Baader
* Andreas Baader (1943–1977), militant of the Red Army Faction (Rote Armee Fraktion), also known as the ''Baader Meinhoff Gang''
* Caspar Baader (born 1953), Swiss politicia ...
and Narendran
studied equations
using left-concatenation and union and proved that their satisfiability problem is
EXPTIME-complete
In computational complexity theory, the complexity class EXPTIME (sometimes called EXP or DEXPTIME) is the set of all decision problems that are solvable by a deterministic Turing machine in exponential time, i.e., in O(2''p''(''n'')) time, w ...
.
Conway's problem
Conway proposed the following problem: given a constant finite language
, is the greatest solution of the equation
always regular? This problem was studied by
Karhumäki and Petre
who gave an affirmative answer in a special case. A strongly negative answer to Conway's problem was given by
Kunc
KUNC (91.5 FM) is a radio station broadcasting a News/Talk public radio format. Licensed to Greeley, Colorado, United States, it serves Northern Colorado, including Fort Collins and Greeley. The station is owned and operated by Community Radio ...
who constructed a finite language
such that the greatest solution of this equation is not recursively enumerable.
Kunc
also proved that the greatest solution of inequality
is always regular.
Language equations with Boolean operations
Language equations with concatenation and Boolean operations were first studied by
Parikh
Parikh is a name found among Hindus of the Bania caste and also Jains. In means ''assayer'' in the Gujarati language and has its roots in the Sanskrit word for ''examiner''. Both the Oswal and Porwal communities of India have clans called ''Parek ...
,
Chandra
Chandra ( sa, चन्द्र, Candra, shining' or 'moon), also known as Soma ( sa, सोम), is the Hindu god of the Moon, and is associated with the night, plants and vegetation. He is one of the Navagraha (nine planets of Hinduism) and ...
,
Halpern Halpern is a variation of the Jewish surname Heilprin and may refer to:
* Baruch Halpern, Jewish studies
* Benjamin Halpern, American marine biologist and ecologist
* Carolyn Halpern, American psychologist
* Charles Halpern, lawyer
* Charna Hal ...
and
Meyer Meyer may refer to:
People
*Meyer (surname), listing people so named
* Meyer (name), a list of people and fictional characters with the name
Companies
* Meyer Burger, a Swiss mechanical engineering company
* Meyer Corporation
* Meyer Sound Labo ...
who proved that the satisfiability problem for a given equation is undecidable, and that if a system of language equations has a unique solution, then that solution is recursive. Later, Okhotin
proved that the unsatisfiability problem is
RE-complete and that every recursive language is a unique solution of some equation.
Language equations over a unary alphabet
For a one-letter alphabet, Leiss
discovered the first language equation with a nonregular solution, using complementation and concatenation operations. Later, Jeż
showed that non-regular unary languages can be defined by language equations with union, intersection and concatenation, equivalent to
conjunctive grammar Conjunctive grammars are a class of formal grammars
studied in formal language theory.
They extend the basic type of grammars,
the context-free grammars,
with a conjunction operation.
Besides explicit conjunction,
conjunctive grammars allow implicit ...
s. By this method Jeż and Okhotin
proved that every recursive unary language is a unique solution of some equation.
See also
*
Boolean grammar Boolean grammars, introduced by , are a class of formal grammars studied in formal language theory. They extend the basic type of grammars, the context-free grammars, with conjunction and negation operations. Besides these explicit operations, Bool ...
*
Arden's rule
In theoretical computer science, Arden's rule, also known as Arden's lemma, is a mathematical statement about a certain form of language equations.
Background
A (formal) language is simply a set of strings. Such sets can be specified by means of ...
*
Set constraint
In mathematics and theoretical computer science, a set constraint is an equation or an inequation between sets of terms.
Similar to systems of ( in) equations between numbers, methods are studied for solving systems of set constraints.
Different ...
References
External links
Workshop on Theory and Applications of Language Equations (TALE 2007)
Formal languages
Equations
{{comp-sci-theory-stub