You shouldn’t assume that a familiar name means what you think it does. In our example
number : integer | "." integer | integer "." | integer "." integer
integer : digit | integer digit
digit : "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7"
| "8" | "9" | "A" | "B" | "C" | "D" | "E" | "F"
ACAB is a number while -7 and
5,011,400 are not. There are reasons for this.
A through F are classed as digits to allow for
hexadecimal representations of numbers. Disallowing the commas which
traditionally separate groups of three digits is a design decision. It
simplifies processing and avoids the awkward fact that most of the
non-English-speaking world uses dots instead of commas, while India uses
commas but places them differently.
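To make these claims easy to check, here is a minimal sketch of a recogniser for the number rule, written in Python rather than taken from bc itself. The names `NUMBER_RE` and `is_number` are my own, and the regular expression is just one way of transcribing the four alternatives of the rule.

```python
import re

# number : integer | "." integer | integer "." | integer "." integer
# An integer is one or more digits, and a digit is 0-9 or A-F (upper case only).
DIGITS = "[0-9A-F]+"
NUMBER_RE = re.compile(rf"(?:{DIGITS}\.?|\.{DIGITS}|{DIGITS}\.{DIGITS})")


def is_number(text: str) -> bool:
    """Return True if the whole of text can be derived from the number rule."""
    return NUMBER_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["ACAB", "-7", "5,011,400", ".B", "5.CC", "7."]:
        print(candidate, is_number(candidate))
```

Under this sketch ACAB, .B, 5.CC and 7. are accepted, while -7 and 5,011,400 are rejected because the rule has no way of producing a minus sign or a comma.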
The minus sign isn’t needed in the number rule because bc is a calculator,
and a part of its grammar that I omitted earlier is:
expression : number | "(" expression ")" | "-" expression
| expression "+" expression | expression "-" expression
| expression MUL_OP expression | expression "^" expression
This is another recursive rule. MUL_OP is in upper case
so we recognise that it must be a terminal symbol. As you might guess
from the name it includes the token * for the
multiplication operator, but it also includes a couple of other tokens. From
this rule we see that a number is an
expression, and that - followed by any
expression is an expression, so -7
isn’t a number but it is an expression. You
might have noticed that a - appears in two different
possible expansions of expression. In addition to the
expansion "-" expression there’s also
expression "-" expression, which would allow, for example,
27-9.
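The same kind of check can be sketched for the expression rule. The rule as written is left-recursive, so this sketch does not follow it literally: it accepts the same strings by reading an operand (any number of leading minus signs, then a number or a parenthesised expression) followed by zero or more operator-operand pairs. The function names are hypothetical, and I am assuming that MUL_OP stands for *, / and %, as it does in the POSIX bc grammar.

```python
import re

# One token: a number, an operator or a parenthesis, with optional leading spaces.
TOKEN_RE = re.compile(r"\s*(\.[0-9A-F]+|[0-9A-F]+(?:\.[0-9A-F]*)?|[-+*/%^()])")
# Assumption: MUL_OP covers "*", "/" and "%".
BINARY_OPS = {"+", "-", "*", "/", "%", "^"}


def tokenize(text):
    """Split text into tokens, raising ValueError on anything unexpected."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_expression(text):
    """Return True if text can be derived from the expression rule."""
    try:
        tokens = tokenize(text)
    except ValueError:
        return False
    pos = 0

    def operand():
        nonlocal pos
        # "-" expression: skip any number of leading minus signs.
        while pos < len(tokens) and tokens[pos] == "-":
            pos += 1
        if pos >= len(tokens):
            return False
        tok = tokens[pos]
        if tok == "(":
            pos += 1
            if not expression():
                return False
            if pos >= len(tokens) or tokens[pos] != ")":
                return False
            pos += 1
            return True
        if tok[0] in "0123456789ABCDEF.":  # a number token
            pos += 1
            return True
        return False

    def expression():
        nonlocal pos
        if not operand():
            return False
        # expression OP expression, repeated as often as needed.
        while pos < len(tokens) and tokens[pos] in BINARY_OPS:
            pos += 1
            if not operand():
                return False
        return True

    return expression() and pos == len(tokens)


if __name__ == "__main__":
    for candidate in ["-7", "27-9", "(ACAB)", "7-", "5,011,400"]:
        print(candidate, is_expression(candidate))
```

This only answers yes or no. It says nothing about precedence or about which of the two expansions a given - came from, which is a separate question from whether the string belongs to the language.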
In any case, let’s try the generative grammar approach and generate
some numbers. We’ll start from the rule for
number and pick possibilities at random each time we have
to expand a nonterminal or choose a token for a terminal. Each line will
be the result of doing this to the previous line.
number
integer
digit
7
So 7 is a number. Let’s try again.
number
. integer
. digit
. B
So .B is also a number. Another two
attempts:
number
integer
digit
5

number
integer . integer
digit . integer
digit . integer digit
digit . digit digit
5 . digit digit
5 . C digit
5 . C C
So .B, 5 and 5.CC are
numbers. Note that the spaces between symbols above, and in
the specification, are just there to improve readability and are not part
of the string we’re generating.
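The random derivations above can also be produced mechanically. The following is a small sketch, assuming the three rules are stored in a Python dictionary. The names `GRAMMAR` and `derive` are hypothetical, and choosing every alternative with equal probability is my own simplification.

```python
import random

# Each nonterminal maps to its list of possible expansions.
# Any symbol that is not a key of GRAMMAR is treated as a terminal.
GRAMMAR = {
    "number": [["integer"], [".", "integer"], ["integer", "."],
               ["integer", ".", "integer"]],
    "integer": [["digit"], ["integer", "digit"]],
    "digit": [[d] for d in "0123456789ABCDEF"],
}


def derive(start="number", rng=random):
    """Return the list of lines of one random derivation, ending in terminals only."""
    form = [start]
    steps = [form[:]]
    while True:
        # Positions of the nonterminals that are still waiting to be expanded.
        pending = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        if not pending:
            return steps
        i = rng.choice(pending)
        # Replace that one nonterminal with a randomly chosen expansion.
        form[i:i + 1] = rng.choice(GRAMMAR[form[i]])
        steps.append(form[:])


if __name__ == "__main__":
    steps = derive()
    for step in steps:
        print(" ".join(step))
    # The generated string itself has no spaces in it.
    print("".join(steps[-1]))
```

Each run prints a different derivation. Because integer expands to a single digit half of the time, the process finishes with probability 1, although long numbers do occasionally turn up.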