Almost all of the grammars we’ve considered so far have been
unambiguous, in the sense that there’s only one abstract syntax tree we
can get from any given input. One exception was the fragmentary grammar
for English. Grammars for natural languages are almost always ambiguous.
The other exception was the language for expressions
which
we extracted from the grammar of the bc
utility.
expression
expression + expression
number + expression
integer + expression
digit + expression
1 + expression
1 + expression + expression
1 + number + expression
1 + digit + expression
1 + 2 + expression
1 + 2 + number
1 + 2 + digit
1 + 2 + 3
and another is
expression
expression + expression
expression + number
expression + digit
expression + 3
expression + expression + 3
number + expression + 3
digit + expression + 3
1 + expression + 3
1 + number + 3
1 + digit + 3
1 + 2 + 3
These differ not just in the order in which we expanded symbols but
in how the expression 1 + 2 + 3
is broken up into phrases.
In the first one 1
and 2 + 3
are expressions
joined by a +
. In the second 1 + 2
and
3
are expressions joined by a +
.
You can see that these are in fact distinct parsings by looking at the corresponding parse trees.
Strictly speaking these aren’t full parse trees, since they don’t show all the symbols which are expanded when generating the input. The full version, with nodes labelled by symbols, is given just for the first of these.
Ambiguous grammars are allowed. In fact there are context free
languages for which no unambiguous grammar exists. It’s also possible,
and indeed common, for an ambiguous grammar and an unambiguous grammar
to define the same language. It’s often easier to write an ambiguous
grammar for a language and often easier to analyse an unambiguous one.
In fact the specification for bc
doesn’t require the
particular grammar given in the specification, merely that whatever
grammar is used should recognise the same language as this one
generates. There are unambiguous grammars for this language and an
implementation which used one would still be compliant.