A subexample

The full language for bc is rather complicated, so let’s concentrate on just the last part of the grammar for now:

NUMBER  : integer
        | '.' integer
        | integer '.'
        | integer '.' integer
        ;


integer : digit
        | integer digit
        ;


digit   : 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
        | 8 | 9 | A | B | C | D | E | F
        ;

You shouldn’t assume that a name means what you think just because it is familiar. According to this grammar ACAB is a NUMBER, while -7 and 5,011,400 are not. There are reasons for this. A through F are classed as digits to allow for hexadecimal representations of numbers. Disallowing the commas which traditionally separate groups of three digits is a design decision. It simplifies processing and avoids the awkward fact that much of the non-English-speaking world uses dots instead of commas, while India uses commas but places them differently. The minus sign isn’t needed because bc is a calculator, and a further part of its grammar is a rule for expression which includes '-' expression. A NUMBER is an expression, and '-' followed by any expression is an expression, so -7 isn’t a NUMBER but it is an expression.
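To make the claims above easy to check, here is a minimal sketch of a recognizer for the NUMBER rule, written in Python. It is not how bc itself is implemented. The names DIGIT, NUMBER_RE and is_number are made up for this illustration, and the regular expression is just a hand translation of the three rules above, assuming the digits are exactly 0-9 and uppercase A-F.

```python
import re

# digit: 0-9 and A-F (uppercase only, as in the grammar above)
DIGIT = "[0-9A-F]"

# NUMBER : integer | integer '.' | integer '.' integer   (first alternative)
#        | '.' integer                                    (second alternative)
NUMBER_RE = re.compile(rf"(?:{DIGIT}+(?:\.{DIGIT}*)?|\.{DIGIT}+)")


def is_number(text: str) -> bool:
    """Return True if the whole string matches the NUMBER rule."""
    return NUMBER_RE.fullmatch(text) is not None


# Try the examples discussed in the text, plus a lone dot.
for candidate in ["ACAB", "-7", "5,011,400", ".B", "5.", "5.CC", "."]:
    print(candidate, is_number(candidate))
```

Under these assumptions ACAB is accepted, while -7 and 5,011,400 are rejected. A lone dot is rejected too, because every alternative for NUMBER needs at least one integer.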

In any case, let’s try the generative grammar approach and generate some NUMBERs. We’ll start from the rule for NUMBER and pick possibilities at random each time we have to expand a nonterminal or choose a token for a terminal. Each line will be the result of doing this to the previous line.

NUMBER
integer
digit
7

So 7 is a NUMBER. Let’s try again.

NUMBER
. integer
. digit
. B

So .B is also a NUMBER. Another two attempts:

NUMBER
integer
digit
5

NUMBER
integer . integer
digit . integer
digit . integer digit
digit . digit digit
5 . digit digit
5 . C digit
5 . C C

So 7, .B, 5 and 5.CC are all NUMBERs. Note that the spaces between symbols above, and in the specification, are just there to improve readability; they are not part of the string we’re generating.
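The hand derivations above can be automated. What follows is a rough sketch in Python, assuming the grammar is stored as a dictionary from each nonterminal to its list of alternatives. The names GRAMMAR and derive are invented for this example and are not part of bc or its specification. At each step it picks one nonterminal at random and replaces it with a randomly chosen alternative, which is the same procedure used by hand above.

```python
import random

# Each nonterminal maps to its alternatives; each alternative is a list of symbols.
# Any symbol that is not a key of GRAMMAR is treated as a terminal.
GRAMMAR = {
    "NUMBER": [
        ["integer"],
        [".", "integer"],
        ["integer", "."],
        ["integer", ".", "integer"],
    ],
    "integer": [["digit"], ["integer", "digit"]],
    "digit": [[d] for d in "0123456789ABCDEF"],
}


def derive(start="NUMBER", rng=random):
    """Return the list of sentential forms from start to a finished string."""
    form = [start]
    steps = [form]
    while any(sym in GRAMMAR for sym in form):
        # Pick a random nonterminal position and a random alternative for it.
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        i = rng.choice(positions)
        form = form[:i] + rng.choice(GRAMMAR[form[i]]) + form[i + 1:]
        steps.append(form)
    return steps


steps = derive()
for form in steps:
    print(" ".join(form))          # spaces are only for readability
print("NUMBER:", "".join(steps[-1]))  # the generated string has no spaces
```

Each run prints a different derivation, one line per step, followed by the generated NUMBER with the spaces removed. Because integer expands to a single digit half the time, long results are possible but become steadily less likely.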