computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...

, more specifically in

automata An automaton (; : automata or automatons) is a relatively self-operating machine, or control mechanism designed to automatically follow a sequence of operations, or respond to predetermined instructions. Some automata, such as bellstrikers i ...

and

formal language In logic, mathematics, computer science, and linguistics, a formal language is a set of strings whose symbols are taken from a set called "alphabet". The alphabet of a formal language consists of symbols that concatenate into strings (also c ...

theory, nested words are a concept proposed by Alur and Madhusudan as a joint generalization of

words A word is a basic element of language that carries meaning, can be used on its own, and is uninterruptible. Despite the fact that language speakers often have an intuitive grasp of what a word is, there is no consensus among linguists on its ...

, as traditionally used for modelling

linearly ordered In mathematics, a total order or linear order is a partial order in which any two elements are comparable. That is, a total order is a binary relation \leq on some set X, which satisfies the following for all a, b and c in X: # a \leq a ( re ...

structures, and of ordered unranked

trees In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, e.g., including only woody plants with secondary growth, only p ...

, as traditionally used for modelling hierarchical structures. Finite-state acceptors for nested words, so-called nested word automata, then give a more expressive generalization of

finite automata A finite-state machine (FSM) or finite-state automaton (FSA, plural: ''automata''), finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number ...

on words. The linear encodings of languages accepted by finite nested word automata gives the class of visibly pushdown languages. The latter language class lies properly between the

regular language In theoretical computer science and formal language theory, a regular language (also called a rational language) is a formal language that can be defined by a regular expression, in the strict sense in theoretical computer science (as opposed to ...

s and the

deterministic context-free language In formal language theory, deterministic context-free languages (DCFL) are a proper subset of context-free languages. They are context-free languages that can be accepted by a deterministic pushdown automaton. DCFLs are always unambiguous, meanin ...

s. Since their introduction in 2004, these concepts have triggered much research in that area.

Formal definition

To define ''nested words'', first define ''matching relations''. For a

nonnegative integer In mathematics, the natural numbers are the numbers 0, 1, 2, 3, and so on, possibly excluding 0. Some start counting with 0, defining the natural numbers as the non-negative integers , while others start with 1, defining them as the positiv ...

\ell

, the notation

ell An ell (from Proto-Germanic *''alinō'', cognate with Latin ''ulna'') is a northwestern European unit of measurement, originally understood as a cubit (the combined length of the forearm and extended hand). The word literally means "arm", an ...

/math> denotes the set

\

, with the special case

\emptyset

. A ''matching relation'' ↝ of length

\ell\ge 0

is a subset of

\\times\

such that: # all nesting edges are forward, that is, if then ; # nesting edges never have a finite position in common, that is, for , there is at most one position ''h'' such that , and there is at most one position ''j'' such that ''i'' ↝ ''j''; and # nesting edges never cross, that is, there are no such that both and . A position ''i'' is referred to as * a ''call position'', if ''i'' ↝ ''j'' for some ''j'', * a ''pending call'' if ''i'' ↝ ∞, * a ''return position'', if ''h'' ↝ ''i'' for some ''h'', * a ''pending return'' if −∞ ↝ ''i'', and * an ''internal position'' in all remaining cases. A ''nested word'' of length

\ell

over an

alphabet An alphabet is a standard set of letter (alphabet), letters written to represent particular sounds in a spoken language. Specifically, letters largely correspond to phonemes as the smallest sound segments that can distinguish one word from a ...

Σ is a pair (''w'',↝), where ''w'' is a word, or

string String or strings may refer to: *String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects Arts, entertainment, and media Films * ''Strings'' (1991 film), a Canadian anim ...

, of length

\ell

over Σ and ↝ is a matching relation of length

\ell

Encoding nested words into ordinary words

Nested words over the alphabet

\Sigma=\

can be encoded into "ordinary" words over the ''tagged alphabet''

\hat

, in which each symbol ''a'' from Σ has three tagged counterparts: the symbol ⟨a for encoding a call position in a nested word labelled with ''a'', the symbol a⟩ for encoding a return position labelled with ''a'', and finally the symbol a itself for representing an internal position labelled with ''a''. More precisely, let ''φ'' be the function mapping nested words over Σ to words over

\hat

such that each nested word (

w_1w_2\cdots w_\ell

,↝) is mapped to the word

x_1x_2...x_\ell

, where the letter

x_i

equals ⟨a, a, and a⟩, if

w_i=a

and ''i'' is a (possibly pending) call position, an internal position, and a (possibly pending) return position, respectively.

Example

For illustration, let be the nested word over a ternary alphabet with and matching relation . Then its encoding as word reads as .

Automata

Nested word automaton

A ''nested word automaton'' has a finite number of states, and operates in almost the same way as a

deterministic finite automaton In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as deterministic finite acceptor (DFA), deterministic finite-state machine (DFSM), or deterministic finite-state auto ...

on classical strings: a classical finite automaton reads the input word

w = w_1\cdots w_\ell

from left to right, and the state of the automaton after reading the ''j''th letter

w_j

depends on the state in which the automaton was before reading

w_j

. In a nested word automaton, the position

j

in the nested word (w,↝) might be a return position; if so, the state after reading

w_j

will not only depend on the ''linear state'' in which the automaton was before reading

w_j

, but also on a ''hierarchical state'' propagated by the automaton at the time it was in the corresponding call position. In analogy to

s of words, a set ''L'' of nested words is called ''regular'' if it is accepted by some (finite-state) nested word automaton.

Visibly pushdown automaton

Nested word automata are an automaton model accepting nested words. There is an equivalent automaton model operating on (ordinary) words. Namely, the notion of a deterministic visibly pushdown automaton is a restriction of the notion of a

deterministic pushdown automaton In automata theory, a deterministic pushdown automaton (DPDA or DPA) is a variation of the pushdown automaton. The class of deterministic pushdown automata accepts the deterministic context-free languages, a proper subset of context-free languages ...

. Following Alur and Madhusudan, a deterministic visibly pushdown automaton is formally defined as a 6-tuple

M=(Q, \hat, \Gamma, \delta, q_0, F)

where *

Q

is a finite set of ''states'', *

\hat

is the ''input alphabet'', which – in contrast to that of ordinary pushdown automata – is partitioned into three sets

\Sigma_\text

\Sigma_\text

, and

\Sigma_\text

. The alphabet

\Sigma_\text

denotes the set of ''call symbols'',

\Sigma_\text

contains the ''return symbols'', and the set

\Sigma_\text

contains the ''internal symbols'', *

\Gamma

is a finite set which is called the ''stack alphabet'', containing a special symbol

\bot\in\Gamma

denoting the empty stack, *

\delta = \delta_\text \cup \delta_\text \cup \delta_\text

is the ''transition function'', which is partitioned into three parts corresponding to call transitions, return transitions, and internal transitions, namely **

\delta_\text\colon Q \times \Sigma_\text \to Q \times \Gamma

, the ''call transition function'' **

\delta_\text\colon Q \times \Sigma_\text \times \Gamma \to Q

, the ''return transition function'' **

\delta_\text:Q \times \Sigma_\text \to Q

, the ''internal transition function'', *

q_0\in\, Q

is the ''initial state'', and *

F \subseteq Q

is the set of ''accepting states''. The notion of ''computation'' of a visibly pushdown automaton is a restriction of the one used for

pushdown automata In the theory of computation, a branch of theoretical computer science, a pushdown automaton (PDA) is a type of automaton that employs a stack. Pushdown automata are used in theories about what can be computed by machines. They are more capab ...

. Visibly pushdown automata only add a symbol to the stack when reading a call symbol

a_\text\in \Sigma_\text

, they only remove the top element from the stack when reading a return symbol

a_\text\in\Sigma_\text

and they do not alter the stack when reading an internal event

a_\text\in\Sigma_\text

. A computation ending in an accepting state is an ''accepting computation''. As a result, a visibly pushdown automaton cannot push to and pop from the stack with the same input symbol. Thus the language

L=\

cannot be accepted by a visibly pushdown automaton for any partition of

\Sigma

, however there are pushdown automata accepting this language. If a

language Language is a structured system of communication that consists of grammar and vocabulary. It is the primary means by which humans convey meaning, both in spoken and signed language, signed forms, and may also be conveyed through writing syste ...

L

over a tagged alphabet

\hat

is accepted by a deterministic visibly pushdown automaton, then

L

is called a ''visibly pushdown language''.

Nondeterministic visibly pushdown automata

Nondeterministic visibly pushdown automata are as expressive as deterministic ones. Hence one can transform a nondeterministic visibly pushdown automaton into a deterministic one, but if the nondeterministic automaton had

s

states, the deterministic one may have up to

2^

states.

Decision problems

Let

, A,

be the size of the description of an automaton

A

, then it is possible to check if a word ''n'' is accepted by the automaton in time

O(, A, ^3\ell)

. In particular, the emptiness problem is solvable in time

O(, A, ^3)

. If

A

is fixed, it is decidable in time

O(\ell)

and space

O(d)

where

d

is the depth of ''n'' in a streaming seeing. It is also decidable with space

O(\log(\ell))

and time

O(\ell^2 \log (\ell))

, and by a uniform Boolean circuit of depth

O(\log \ell)

. For two nondeterministic automata ''A'' and ''B'', deciding whether the set of words accepted by ''A'' is a subset of the word accepted by ''B'' is

EXPTIME In computational complexity theory, the complexity class EXPTIME (sometimes called EXP or DEXPTIME) is the set of all decision problems that are solvable by a deterministic Turing machine in exponential time, i.e., in O(2''p''(''n'')) time, w ...

-complete. It is also EXPTIME-complete to figure out if there is a word that is not accepted.

Languages

As the definition of visibly pushdown automata shows, deterministic visibly pushdown automata can be seen as a special case of deterministic pushdown automata; thus the set VPL of visibly pushdown languages over

\,\hat

forms a subset of the set DCFL of

s over the set of symbols in

\,\hat

. In particular, the function that removes the matching relation from nested words transforms regular languages over nested words into context-free languages.

Closure properties

The set of visibly pushdown languages is closed under the following operations: *set operations: **union **intersection **complement, :thus giving rise to a

Boolean algebra In mathematics and mathematical logic, Boolean algebra is a branch of algebra. It differs from elementary algebra in two ways. First, the values of the variable (mathematics), variables are the truth values ''true'' and ''false'', usually denot ...

. *

Kleene star In mathematical logic and theoretical computer science, the Kleene star (or Kleene operator or Kleene closure) is a unary operation on a Set (mathematics), set to generate a set of all finite-length strings that are composed of zero or more repe ...

concatenation In formal language theory and computer programming, string concatenation is the operation of joining character strings end-to-end. For example, the concatenation of "snow" and "ball" is "snowball". In certain formalizations of concatenati ...

* reversal *

String homomorphism In computer science, in the area of formal language theory, frequent use is made of a variety of string functions; however, the notation used is different from that used for computer programming, and some commonly used functions in the theoretical ...

For the intersection operation, one can construct a VPA ''M'' simulating two given VPAs

M_1

and

M_2

by a simple product construction : For

i=1,2

, assume

M_i

is given as

(Q_i,\  \hat,\  \Gamma_i,\  \delta_i, \ s_,\ Z_i, \ F_i)

. Then for the automaton ''M'', the set of states is

\, Q_1\times Q_2

, the initial state is

\left(s_, s_2\right)

, the set of final states is

F_1 \times F_2

, the stack alphabet is given by

\,\Gamma_1\times\Gamma_2

, and the initial stack symbol is

(Z_1,Z_2)

. If

M

is in state

(p_1,p_2)

on reading a ''call symbol''

\left\langle a\right.

, then

M

pushes the stack symbol

(\gamma_1,\gamma_2)

and goes to state

(q_1,q_2)

, where

\gamma_i

is the stack symbol pushed by

M_i

when transitioning from state

p_i

q_i

on reading input

\left\langle a\right.

. If

M

is in state

(p_1,p_2)

on reading an ''internal symbol''

a

, then

M

goes to state

(q_1,q_2)

, whenever

M_i

transitions from state

p_i

q_i

on reading ''a''. If

M

is in state

(p_1,p_2)

on reading a ''return symbol''

\left. a\right\rangle

, then

M

pops the symbol

(\gamma_1,\gamma_2)

from the stack and goes to state

(q_1,q_2)

, where

\gamma_i

is the stack symbol popped by

M_i

when transitioning from state

p_i

q_i

on reading

\left. a\right\rangle

. Correctness of the above construction crucially relies on the fact that the push and pop actions of the simulated machines

M_1

and

M_2

are synchronized along the input symbols read. In fact, a similar simulation is no longer possible for deterministic pushdown automata, as the larger class of deterministic context-free languages is no longer closed under intersection. In contrast to the construction for concatenation shown above, the complementation construction for visibly pushdown automata parallels the standard construction. for deterministic pushdown automata. Moreover, like the class of context free languages the class of visibly pushdown languages is closed under prefix closure and reversal, hence also suffix closure.

Relation to other language classes

point out that the visibly pushdown languages are more general than the parenthesis languages suggested in . As shown by , the visibly pushdown languages in turn are strictly contained in the class of languages described by

operator-precedence grammar An operator precedence grammar is a kind of grammar for formal languages. Technically, an operator precedence grammar is a context-free grammar that has the property (among others) that no production has either an empty right-hand side or two adjac ...

s, which were introduced by and enjoy the same closure properties and characteristics (see for ω languages and logic and automata-based characterizations). In comparison to conjunctive grammars, a generalization of context-free grammars, shows that the linear conjunctive languages form a superclass of the visibly pushdown languages. The table at the end of this article puts the family of visibly pushdown languages in relation to other language families in the

Chomsky hierarchy The Chomsky hierarchy in the fields of formal language theory, computer science, and linguistics, is a containment hierarchy of classes of formal grammars. A formal grammar describes how to form strings from a formal language's alphabet that are v ...

. Rajeev Alur and Parthasarathy Madhusudan Sect.4, Theorem 5, Sect.7 related a subclass of regular binary tree languages to visibly pushdown languages.

Other models of description

Visibly pushdown grammars

Visibly pushdown languages are exactly the languages that can be described by ''visibly pushdown grammars''. Visibly pushdown grammars can be defined as a restriction of

context-free grammars In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules can be applied to a nonterminal symbol regardless of its context. In particular, in a context-free grammar, each production rule is of the for ...

. A visibly pushdown grammar ''G'' is defined by the 4-

tuple In mathematics, a tuple is a finite sequence or ''ordered list'' of numbers or, more generally, mathematical objects, which are called the ''elements'' of the tuple. An -tuple is a tuple of elements, where is a non-negative integer. There is o ...

G = (V=V^0\cup V^1\,, \Sigma\,, R\,, S\,)

where *

V^0\,

and

V^1\,

are disjoint finite sets; each element

v\in V

is called ''a non-terminal character'' or a ''variable''. Each variable represents a different type of phrase or clause in the sentence. Each variable defines a sub-language of the language defined by

G\,

, and the sub-languages of

V^0\,

are the one without pending calls or pending returns. *

\Sigma\,

is a finite set of ''terminal''s, disjoint from

V\,

, which make up the actual content of the sentence. The set of terminals is the alphabet of the language defined by the grammar

G\,

. *

R\,

is a finite relation from

V\,

(V\cup\Sigma)^

such that

\exist\, w\in (V\cup\Sigma)^: (S,w)\in R

. The members of

R\,

are called the ''(rewrite) rule''s or ''production''s of the grammar. There are three kinds of rewrite rules. For

X,Y\in V ,Z\in V^0

a\in \hat\Sigma

and

b\in \hat\Sigma

X\to \epsilon

X\to aY

and if

X\in V^0

then

Y\in V^0

and

a\in \Sigma

X\to \langle aZb\rangle Y

and if

X\in V^0

then

Y\in V^0

S\in V\,

is the ''start variable'' (or ''start symbol''), used to represent the whole sentence (or program). Here, the asterisk represents the

operation and

\epsilon

is the empty word.

Uniform Boolean circuits

The problem whether a word of length

\ell

is accepted by a given nested word automaton can be solved by uniform

Boolean circuit In computational complexity theory and circuit complexity, a Boolean circuit is a mathematical model for combinational digital logic circuits. A formal language can be decided by a family of Boolean circuits, one circuit for each possible inpu ...

s of depth

\Omicron(\log\ell)

Logical description

Regular languages over nested words are exactly the set of languages described by monadic

second-order logic In logic and mathematics, second-order logic is an extension of first-order logic, which itself is an extension of propositional logic. Second-order logic is in turn extended by higher-order logic and type theory. First-order logic quantifies on ...

with two unary predicates ''call'' and ''return'', linear successor and the matching relation ↝.

Notes

References

* * * * * *Okhotin, Alexander
Comparing linear conjunctive languages to subfamilies of the context-free languages
37th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2011). *

External links

Visibly pushdown automata – Automata on nested words

class VPL
at the Complexity Zoo {{Formal languages and grammars Words Formal languages Automata (computation)

Formal definition

Encoding nested words into ordinary words

Example

Automata

Nested word automaton

Visibly pushdown automaton

Nondeterministic visibly pushdown automata

Decision problems

Languages

Closure properties

Relation to other language classes

Other models of description

Visibly pushdown grammars

Uniform Boolean circuits

Logical description

See also

Notes

References

External links