Dangling Else
   HOME

TheInfoList



OR:

The dangling else is a problem in programming of parser generators in which an optional else clause in an if–then(–else) statement results in nested conditionals being ambiguous. Formally, the reference
context-free grammar In formal language theory, a context-free grammar (CFG) is a formal grammar whose production rules are of the form :A\ \to\ \alpha with A a ''single'' nonterminal symbol, and \alpha a string of terminals and/or nonterminals (\alpha can be empt ...
of the language is
ambiguous Ambiguity is the type of meaning (linguistics), meaning in which a phrase, statement or resolution is not explicitly defined, making several interpretations wikt:plausible#Adjective, plausible. A common aspect of ambiguity is uncertainty. It ...
, meaning there is more than one correct
parse tree A parse tree or parsing tree or derivation tree or concrete syntax tree is an ordered, rooted tree that represents the syntactic structure of a string according to some context-free grammar. The term ''parse tree'' itself is used primarily in co ...
. In many programming languages one may write conditionally executed code in two forms: the if-then form, and the if-then-else form – the else clause is optional: if a then s if b then s1 else s2 This gives rise to an ambiguity in interpretation when there are nested statements, specifically whenever an if-then form appears as s1 in an if-then-else form: if a then if b then s else s2 In this example, s is unambiguously executed when a is true and b is true, but one may interpret s2 as being executed when a is false (thus attaching the else to the first if) or when a is true and b is false (thus attaching the else to the second if). In other words, one may see the previous statement as either of the following expressions: if a then (if b then s) else s2 if a then (if b then s else s2) The dangling else problem dates to
ALGOL 60 ALGOL 60 (short for ''Algorithmic Language 1960'') is a member of the ALGOL family of computer programming languages. It followed on from ALGOL 58 which had introduced code blocks and the begin and end pairs for delimiting them, representing a k ...
, and has been resolved in various ways in subsequent languages. In LR parsers, the dangling else is the archetypal example of a
shift-reduce conflict In computer science, LR parsers are a type of bottom-up parsing, bottom-up parser that analyse deterministic context-free languages in linear time. There are several variants of LR parsers: SLR parser, SLR parsers, LALR parser, LALR parsers, Canoni ...
.


Avoiding ambiguity while keeping the syntax

This is a problem that often comes up in compiler construction, especially scannerless parsing. The convention when dealing with the dangling else is to attach the else to the nearby if statement, allowing for unambiguous context-free grammars, in particular. Programming languages like Pascal, C and Java follow this convention, so there is no ambiguity in the semantics of the ''language'', though the use of a parser generator may lead to ambiguous ''grammars''. In these cases alternative grouping is accomplished by explicit blocks, such as begin...end in Pascal and in C. Depending on the compiler construction approach, one may take different corrective actions to avoid ambiguity: *If the parser is produced by an SLR, LR(1) or LALR LR parser generator, the programmer will often rely on the generated parser feature of preferring shift over reduce whenever there is a conflict. Alternatively, the grammar can be rewritten to remove the conflict, at the expense of an increase in grammar size (see
below Below may refer to: *Earth *Ground (disambiguation) *Soil *Floor *Bottom (disambiguation) Bottom may refer to: Anatomy and sex * Bottom (BDSM), the partner in a BDSM who takes the passive, receiving, or obedient role, to that of the top or ...
). *If the parser is hand written, the programmer may use a ''non-ambiguous'' context-free grammar. Alternatively, one may rely on a non-context-free grammar or a
parsing expression grammar In computer science, a parsing expression grammar (PEG) is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in ...
.


Avoiding ambiguity by changing the syntax

The problem can also be solved by making explicit the link between an else and its if, within the syntax. This usually helps avoid human errors. Possible solutions are: *Having an "end if" symbol delimiting the end of the if construct. Examples of such languages are ALGOL 68, Ada, Eiffel, PL/SQL, Visual Basic,
Modula-2 Modula-2 is a structured, procedural programming language developed between 1977 and 1985/8 by Niklaus Wirth at ETH Zurich. It was created as the language for the operating system and application software of the Lilith personal workstation. It w ...
, and AppleScript. *Disallowing the statement following a "then" to be an "if" itself (it may however be a pair of statement brackets containing only an if-then-clause). This approach is followed by
ALGOL 60 ALGOL 60 (short for ''Algorithmic Language 1960'') is a member of the ALGOL family of computer programming languages. It followed on from ALGOL 58 which had introduced code blocks and the begin and end pairs for delimiting them, representing a k ...
. *Requiring braces (parenthesize) when an "else" follows an "if". *Requiring every "if" to be paired with an "else". To avoid a similar problem concerning semantics rather than syntax,
Racket Racket may refer to: * Racket (crime), a systematised element of organized crime ** Protection racket, a scheme whereby a group provides protection to businesses or other groups through violence outside the sanction of the law * Racket (sports equ ...
deviates from Scheme by considering an if without a fallback clause to be an error, effectively distinguishing conditional ''expressions'' (i.e if) from conditional ''statements'' (i.e when and unless, which do not have fallback clauses). *Using different keywords for the one-alternative and two-alternative "if" statements. S-algol uses if e do s for the one-alternative case and if e1 then e2 else e3 for the general case. * Requiring braces unconditionally, like Swift. This is effectively true in Python as its indentation rules delimit every block, not just those in "if" statements.


Examples

Concrete examples follow.


C

In C, the grammar reads, in part:
 statement = ...
    ,  selection-statement

 selection-statement = ...
    ,  IF ( expression ) statement
    ,  IF ( expression ) statement ELSE statement
Thus, without further rules, the statement if (a) if (b) s; else s2; could ambiguously be parsed as if it were either: if (a) or: if (a) else s2; In practice in C the first tree is chosen, by associating the else with the nearest if.


Avoiding the conflict in LR parsers

The above example could be rewritten in the following way to remove the ambiguity :
statement: open_statement
         ,  closed_statement
         ;

open_statement: IF '(' expression ')' statement
              ,  IF '(' expression ')' closed_statement ELSE open_statement
              ;

closed_statement: non_if_statement
                ,  IF '(' expression ')' closed_statement ELSE closed_statement
                ;

non_if_statement: ...
                ;
Any other statement-related grammar rules may also have to be duplicated in this way if they may directly or indirectly end with a statement or selection-statement non-terminal. However, we give grammar that includes both of if and while statements.
statement: open_statement
         ,  closed_statement
         ;

open_statement: IF '(' expression ')' statement
              ,  IF '(' expression ')' closed_statement ELSE open_statement
              ,  WHILE '(' expression ')' open_statement
              ;

closed_statement: simple_statement
                ,  IF '(' expression ')' closed_statement ELSE closed_statement
                ,  WHILE '(' expression ')' closed_statement
                ;

simple_statement: ...
                ;
Finally, we give the grammar that forbids ambiguous IF statements.
statement: open_statement
         ,  closed_statement
         ;

open_statement: IF '(' expression ')' simple_statement
              ,  IF '(' expression ')' open_statement
              ,  IF '(' expression ')' closed_statement ELSE open_statement
              ,  WHILE '(' expression ')' open_statement
              ;

closed_statement: simple_statement
                ,  IF '(' expression ')' closed_statement ELSE closed_statement
                ,  WHILE '(' expression ')' closed_statement
                ;

simple_statement: ...
                ;
With this grammar parsing of "if (a) if (b) c else d" fails:
statement
open_statement
IF '(' expression ')' closed_statement ELSE open_statement
'if' '(' 'a' ')' closed_statement 'else' 'd'
and then the parsing fails trying to match closed_statement to "if (b) c". An attempt with closed_statement fails in the same way.


See also

* The lexer hack * Most vexing parse


References

{{DEFAULTSORT:Dangling Else Parsing Computer programming Ambiguity Conditional constructs