The grammar for integer
s considered above is
unambiguous, in the sense that there’s only one abstract syntax tree we
can get from any given input. The same is true of NUMBERS
in bc
. The bc
grammar as a whole is ambiguous
though. One of the possible expansions is expression
+
expression
. There are therefore at least two
ways to generate the expression 1 + 2 + 3
. One is
expression
expression + expression
NUMBER + expression
integer + expression
digit + expression
1 + expression
1 + expression + expression
1 + NUMBER + expression
1 + digit + expression
1 + 2 + expression
1 + 2 + NUMBER
1 + 2 + digit
1 + 2 + 3
and another is
expression
expression + expression
expression + NUMBER
expression + digit
expression + 3
expression + expression + 3
NUMBER + expression + 3
digit + expression + 3
1 + expression + 3
1 + NUMBER + 3
1 + digit + 3
1 + 2 + 3
These differ not just in the order in which we expanded symbols but
in how the expression 1 + 2 + 3
is broken up into phrases.
In the first one 1
and 2 + 3
are expressions
joined by a +
. In the second 1 + 2
and
3
are expressions joined by a +
.
Ambiguous grammars are allowed. In fact there are context free
languages for which no unambiguous grammar exists. It’s also possible,
and indeed common, for an ambiguous grammar and an unambiguous grammar
to define the same language. It’s often easier to write an ambiguous
grammar for a language and often easier to analyse an unambiguous one.
In fact the specification for bc
doesn’t require that this
particular grammar be used, merely that whatever grammar is used should
recognise the same language as this one generates. There are unambiguous
grammars for this language and an implementation which used one would
still be compliant.