Van Wijngaarden grammar
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, a Van Wijngaarden grammar (also vW-grammar or W-grammar) is a
two-level grammar A two-level grammar is a formal grammar that is used to generate another formal grammar, such as one with an infinite rule set. This is how a Van Wijngaarden grammar was used to specify Algol 68. A context-free grammar that defines the rules for a ...
which provides a technique to define potentially infinite
context-free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be em ...
s in a finite number of rules. The formalism was invented by Adriaan van Wijngaarden. to define rigorously some
syntactic In linguistics, syntax () is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituency) ...
restrictions which previously had to be formulated in
natural language In neuropsychology, linguistics, and philosophy of language, a natural language or ordinary language is any language that has evolved naturally in humans through use and repetition without conscious planning or premeditation. Natural languages ...
, despite their essentially syntactical content. Typical applications are the treatment of
gender Gender is the range of characteristics pertaining to femininity and masculinity and differentiating between them. Depending on the context, this may include sex-based social structures (i.e. gender roles) and gender identity. Most culture ...
and
number A number is a mathematical object used to count, measure, and label. The original examples are the natural numbers 1, 2, 3, 4, and so forth. Numbers can be represented in language with number words. More universally, individual number ...
in
natural language In neuropsychology, linguistics, and philosophy of language, a natural language or ordinary language is any language that has evolved naturally in humans through use and repetition without conscious planning or premeditation. Natural languages ...
syntax and the well-definedness of identifiers in
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s. For example, "there is one person" and "there are two people" are both grammatically correct but "there are one person" is incorrect for context-sensitive reasons that a W-grammar could represent. The technique was used and developed in the definition of the
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
ALGOL 68 ALGOL 68 (short for ''Algorithmic Language 1968'') is an imperative programming language that was conceived as a successor to the ALGOL 60 programming language, designed with the goal of a much wider scope of application and more rigorously d ...
. It is an example of the larger class of
affix grammar An affix grammar is a kind of formal grammar; it is used to describe the syntax of languages, mainly computer languages, using an approach based on how natural language is typically described.Koster, Cornelis HA.Affix grammars for natural languages ...
s. These grammars continue to be studied.


Overview

A W-grammar consists of a finite set of
meta Meta (from the Greek μετά, '' meta'', meaning "after" or "beyond") is a prefix meaning "more comprehensive" or "transcending". In modern nomenclature, ''meta''- can also serve as a prefix meaning self-referential, as a field of study or end ...
-rules, which are used to derive (possibly infinitely many) production rules from a finite set of
hyper Hyper may refer to: Arts and entertainment * ''Hyper'' (2016 film), 2016 Indian Telugu film * ''Hyper'' (2018 film), 2018 Indian Kannada film * ''Hyper'' (magazine), an Australian video game magazine *Hyper (TV channel), a Filipino sports channe ...
-rules. Meta-rules are restricted to those defined by a
context-free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be em ...
. Hyper-rules restrict the admissible contexts at the upper level. Essentially, the ''consistent substitution'' used in the derivation process is equivalent to unification, as in
Prolog Prolog is a logic programming language associated with artificial intelligence and computational linguistics. Prolog has its roots in first-order logic, a formal logic, and unlike many other programming languages, Prolog is intended primarily ...
, as was noted by
Alain Colmerauer Alain Colmerauer (24 January 1941 – 12 May 2017) was a French computer scientist. He was a professor at Aix-Marseille University, and the creator of the logic programming language Prolog. Early life Alain Colmerauer was born on 24 January 1941 ...
. For example, the
assignment Assignment, assign or The Assignment may refer to: * Homework * Sex assignment * The process of sending National Basketball Association players to its development league; see Computing * Assignment (computer science), a type of modification to ...
x:=1 is only valid if the variable x can contain an integer. Therefore, the context-free syntax variable := value is incomplete. In a two-level grammar, this might be specified in a context-sensitive manner as REF TYPE variable := TYPE value. Then ref integer variable := integer value could be a production rule but ref Boolean variable := integer value is not a possible production rule. This also means that assigning with incompatible types becomes a syntax error which can be caught at compile-time. Similarly,
STYLE begin token, new LAYER1 preludes, 
       parallel token, new LAYER1 tasks PACK, 
       STYLE end token
allows begin ... end and but not begin ... }.


ALGOL 68 examples

Prior to
ALGOL 68 ALGOL 68 (short for ''Algorithmic Language 1968'') is an imperative programming language that was conceived as a successor to the ALGOL 60 programming language, designed with the goal of a much wider scope of application and more rigorously d ...
the language
ALGOL 60 ALGOL 60 (short for ''Algorithmic Language 1960'') is a member of the ALGOL family of computer programming languages. It followed on from ALGOL 58 which had introduced code blocks and the begin and end pairs for delimiting them, representing a ...
was formalised using the context-free
Backus–Naur form In computer science, Backus–Naur form () or Backus normal form (BNF) is a metasyntax notation for context-free grammars, often used to describe the syntax of languages used in computing, such as computer programming languages, document format ...
. The appearance of new context-sensitive two-level grammar presented a challenge to some readers of the 1968
ALGOL 68 ALGOL 68 (short for ''Algorithmic Language 1968'') is an imperative programming language that was conceived as a successor to the ALGOL 60 programming language, designed with the goal of a much wider scope of application and more rigorously d ...
"Final Report". Subsequently, the final report was revised by Wijngaarden and his colleagues and published as the 1973 ALGOL 68 "Revised Report". The grammar for ALGOL 68 is officially defined in the two-level Van Wijngaarden grammar, but a subset has been done in the one-level
Backus–Naur form In computer science, Backus–Naur form () or Backus normal form (BNF) is a metasyntax notation for context-free grammars, often used to describe the syntax of languages used in computing, such as computer programming languages, document format ...
, compare: * Van Wijngaarden grammar; * Backus–Naur form/
Yacc Yacc (Yet Another Compiler-Compiler) is a computer program for the Unix operating system developed by Stephen C. Johnson. It is a Look Ahead Left-to-Right Rightmost Derivation (LALR) parser generator, generating a LALR parser (the part of a co ...


ALGOL 68 as in the 1968 Final Report §2.1

a) program : open symbol, standard prelude, library prelude option, particular program, exit, library postlude option, standard postlude, close symbol. b) standard prelude : declaration prelude sequence. c) library prelude : declaration prelude sequence. d) particular program : label sequence option, strong CLOSED void clause. e) exit : go on symbol, letter e letter x letter i letter t, label symbol. f) library postlude : statement interlude. g) standard postlude : strong void clause train


ALGOL 68 as in the 1973 Revised Report §2.2.1, §10.1.1

program : strong void new closed clause A) EXTERNAL :: standard ; library ; system ; particular. B) STOP :: label letter s letter t letter o letter p. a) program text : STYLE begin token, new LAYER1 preludes, parallel token, new LAYER1 tasks PACK, STYLE end token. b) NEST1 preludes : NEST1 standard prelude with DECS1, NEST1 library prelude with DECSETY2, NEST1 system prelude with DECSETY3, where (NEST1) is (new EMPTY new DECS1 DECSETY2 DECSETY3). c) NEST1 EXTERNAL prelude with DECSETY1 : strong void NEST1 series with DECSETY1, go on token ; where (DECSETY1) is (EMPTY), EMPTY. d) NEST1 tasks : NEST1 system task list, and also token, NEST1 user task PACK list. e) NEST1 system task : strong void NEST1 unit. f) NEST1 user task : NEST2 particular prelude with DECS, NEST2 particular program PACK, go on token, NEST2 particular postlude, where (NEST2) is (NEST1 new DECS STOP). g) NEST2 particular program : NEST2 new LABSETY3 joined label definition of LABSETY3, strong void NEST2 new LABSETY3 ENCLOSED clause. h) NEST joined label definition of LABSETY : where (LABSETY) is (EMPTY), EMPTY ; where (LABSETY) is (LAB1 LABSETY1), NEST label definition of LAB1, NEST joined label definition of$ LABSETY1. i) NEST2 particular postlude : strong void NEST2 series with STOP.


Implementations

''yo-yo''. parser for van Wijngaarden grammars with example grammars for ''expressions'', ''eva'', ''sal'' and
Pascal Pascal, Pascal's or PASCAL may refer to: People and fictional characters * Pascal (given name), including a list of people with the name * Pascal (surname), including a list of people and fictional characters with the name ** Blaise Pascal, Frenc ...
(the actual ISO 7185 standard for Pascal uses
extended Backus–Naur form In computer science, extended Backus–Naur form (EBNF) is a family of metasyntax notations, any of which can be used to express a context-free grammar. EBNF is used to make a formal description of a formal language such as a computer programm ...
).


History

W-grammars are based on the idea of providing the nonterminal symbols of context-free grammars with ''attributes'' (or ''affixes'') that pass information between the nodes of the
parse tree A parse tree or parsing tree or derivation tree or concrete syntax tree is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term ''parse tree'' itself is used primarily in co ...
, used to constrain the syntax and to specify the semantics. This idea was well known at the time; e.g.
Donald Knuth Donald Ervin Knuth ( ; born January 10, 1938) is an American computer scientist, mathematician, and professor emeritus at Stanford University. He is the 1974 recipient of the ACM Turing Award, informally considered the Nobel Prize of computer sc ...
visited the ALGOL 68 design committee while developing his own version of it,
attribute grammar An attribute grammar is a formal way to supplement a formal grammar with semantic information processing. Semantic information is stored in attributes associated with terminal and nonterminal symbols of the grammar. The values of attributes are resu ...
s. Quite peculiar to W-grammars was their strict treatment of attributes as strings, defined by a context-free grammar, on which concatenation is the only possible operation; in attribute grammars, attributes can be of any data type, and any kind of operation can be applied to them. After their introduction in the Algol 68 report, W-grammars were widely considered as too powerful and unconstrained to be practical. This was partly a consequence of the way in which they had been applied; the revised Algol 68 report contained a much more readable grammar, without modifying the W-grammar formalism itself. Meanwhile, it became clear that W-grammars, when used in their full generality, are indeed too powerful for such practical purposes as serving as the input for a
parser generator In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parser, interpreter, or compiler from some form of formal description of a programming language and machine. The most common type of compiler- ...
. They describe precisely all
recursively enumerable language In mathematics, logic and computer science, a formal language is called recursively enumerable (also recognizable, partially decidable, semidecidable, Turing-acceptable or Turing-recognizable) if it is a recursively enumerable subset in the set o ...
s, which makes parsing impossible in general: it is an
undecidable problem In computability theory and computational complexity theory, an undecidable problem is a decision problem for which it is proved to be impossible to construct an algorithm that always leads to a correct yes-or-no answer. The halting problem is an ...
to decide whether a given string can be generated by a given W-grammar. Their use must be seriously constrained when used for automatic parsing or translation. Restricted and modified variants of W-grammars were developed to address this, e.g. * Extended Affix Grammars (EAGs), applied to describe the grammars of
natural language In neuropsychology, linguistics, and philosophy of language, a natural language or ordinary language is any language that has evolved naturally in humans through use and repetition without conscious planning or premeditation. Natural languages ...
such as English and Spanish) *
Q-systems Q-systems are a method of directed graph transformations according to given grammar rules, developed at the Université de Montréal by Alain Colmerauer in 1967–70 for use in natural language processing. The Université de Montréal's machine t ...
, also applied to natural language processing * the CDL series of languages, applied as
compiler construction In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs that ...
languages for
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
s


Applications outside of ALGOL 68

Anthony Fisher has written a parser for a large class of W-grammars.
Dick Grune Dick Grune is a Dutch computer scientist and university lecturer best known for inventing and developing the first version of the Concurrent Versions System (CVS). Grune was involved in the construction of Algol 68 compiler In computing, a co ...
created a C program that would generate all possible productions of a 2-level grammar. The applications of EAGs mentioned above can effectively be regarded as applications of W-grammars, since EAGs are so close to W-grammars. W-grammars have also been proposed for the description of complex human actions in
ergonomics Human factors and ergonomics (commonly referred to as human factors) is the application of psychological and physiological principles to the engineering and design of products, processes, and systems. Four primary goals of human factors learnin ...
. * A W-Grammar Description for Ada . – "W-grammar description of
Ada Ada may refer to: Places Africa * Ada Foah, a town in Ghana * Ada (Ghana parliament constituency) * Ada, Osun, a town in Nigeria Asia * Ada, Urmia, a village in West Azerbaijan Province, Iran * Ada, Karaman, a village in Karaman Province, T ...
is still useful to computer scientists who need more than a simple understanding of the syntax and rudimentary description of the semantics"


See also

*
Affix grammar An affix grammar is a kind of formal grammar; it is used to describe the syntax of languages, mainly computer languages, using an approach based on how natural language is typically described.Koster, Cornelis HA.Affix grammars for natural languages ...
* Extended Affix Grammar *
Attribute grammar An attribute grammar is a formal way to supplement a formal grammar with semantic information processing. Semantic information is stored in attributes associated with terminal and nonterminal symbols of the grammar. The values of attributes are resu ...


References


External links

* . * {{Citation , url = http://www.cwi.nl/~steven/vw.html , title = VW , type = tutorial introduction , author-link = Steven Pemberton , first = Steven , last = Pemberton , publisher = CWI , place = NL. Formal languages Parsing Compiler construction Dutch inventions