Example, continued

You shouldn’t assume that a name means what you think just because it is familiar. In our example

number : integer | "." integer | integer "." | integer "." integer
integer : digit | integer digit
digit   : "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7"
        | "8" | "9" | "A" | "B" | "C" | "D" | "E" | "F"

ACAB is a number while -7 and 5,011,400 are not. There are reasons for this. A through F are classed as digits to allow for hexadecimal representations of numbers. Disallowing the commas which traditionally separate groups of three digits is a design decision. It simplifies processing and avoids the awkward fact that grouping conventions differ around the world: many countries use dots where English speakers use commas, while India uses commas but places them differently.
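To check these claims mechanically, the rules for number, integer and digit can be transcribed into a regular expression. This is only a sketch of the grammar as written above, not how bc itself is implemented, and the names INTEGER, NUMBER and is_number are invented for illustration.

```python
import re

# integer : digit | integer digit   (one or more digits, where A-F count as digits)
INTEGER = r"[0-9A-F]+"

# number : integer | "." integer | integer "." | integer "." integer
NUMBER = re.compile(rf"(?:{INTEGER}\.?(?:{INTEGER})?|\.{INTEGER})")


def is_number(text: str) -> bool:
    """Return True if the whole string can be derived from the number rule."""
    return NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["ACAB", "-7", "5,011,400", ".B", "5.", "."]:
        print(candidate, is_number(candidate))
```

Note that a lone "." is rejected, because every alternative in the number rule contains at least one integer.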

The minus sign isn’t needed in number because bc is a calculator, and a part of its grammar that I omitted earlier is

expression : number | "(" expression ")" | "-" expression
           | expression "+" expression | expression "-" expression
           | expression MUL_OP expression | expression "^" expression

This is another recursive rule. MUL_OP is in upper case so we recognise that it must be a terminal symbol. As you might guess from the name, it includes the token * for the multiplication operator, but it also includes a couple of other tokens. From this rule we see that a number is an expression, and that - followed by any expression is an expression, so -7 isn’t a number but it is an expression. You might have noticed that a - appears in two different possible expansions of expression. In addition to the expansion "-" expression there’s also expression "-" expression, which would allow, for example, 27-9.
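The difference between a number and an expression can be tried out with a small recogniser. The sketch below rests on several assumptions: MUL_OP is taken to be the tokens *, / and %; the left-recursive rule is rewritten as "operand, then any number of operator-operand pairs" so that a simple recursive-descent recogniser can handle it; and operator precedence is ignored, since we only ask whether a string is an expression at all. The function names are made up for illustration.

```python
import re

# A number as defined by the number rule, written as a regular expression.
NUMBER = r"[0-9A-F]+\.?[0-9A-F]*|\.[0-9A-F]+"
TOKEN = re.compile(NUMBER + r"|[-+^()*/%]")

# "*", "/" and "%" stand in for MUL_OP (an assumption made for this sketch).
BINARY_OPS = {"+", "-", "^", "*", "/", "%"}


def tokenize(text):
    """Split text into tokens, or return None if a character is not allowed."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            return None
        tokens.append(match.group())
        pos = match.end()
    return tokens


def is_expression(text: str) -> bool:
    """Return True if text can be derived from the expression rule."""
    tokens = tokenize(text)
    if not tokens:
        return False
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def operand() -> bool:
        nonlocal pos
        tok = peek()
        if tok is None:
            return False
        pos += 1
        if tok == "-":  # "-" expression
            return operand()
        if tok == "(":  # "(" expression ")"
            if not expression() or peek() != ")":
                return False
            pos += 1
            return True
        return re.fullmatch(NUMBER, tok) is not None  # number

    def expression() -> bool:
        nonlocal pos
        if not operand():
            return False
        while peek() in BINARY_OPS:  # expression OP expression
            pos += 1
            if not operand():
                return False
        return True

    return expression() and pos == len(tokens)


if __name__ == "__main__":
    for candidate in ["-7", "27-9", "ACAB", "5,011,400", "7-"]:
        print(candidate, is_expression(candidate))
```

With this recogniser -7 and 27-9 are accepted as expressions, while 5,011,400 is still rejected because the comma is not a token anywhere in the grammar.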

In any case, let’s try the generative grammar approach and generate some numbers. We’ll start from the rule for number and pick possibilities at random each time we have to expand a nonterminal or choose a token for a terminal. Each line will be the result of doing this to the previous line.

number
integer
digit
7

So 7 is a number. Let’s try again.

number
. integer
. digit
. B

So .B is also a number. Another two attempts:

number
integer
digit
5

number
integer . integer
digit . integer
digit . integer digit
digit . digit digit
5 . digit digit
5 . C digit
5 . C C

So .B, 5 and 5.CC are numbers. Note that the spaces between symbols, both above and in the specification, are just there to improve readability and are not part of the string we’re generating.
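The hand derivations above can be automated. The following is a minimal sketch, assuming the three rules for number, integer and digit exactly as given earlier; the GRAMMAR table and the derive function are names invented for this illustration. At each step it picks one remaining nonterminal at random and replaces it with a randomly chosen alternative, recording every intermediate line.

```python
import random

# Each nonterminal maps to its list of alternatives; anything that is not
# a key of this table is treated as a terminal.
GRAMMAR = {
    "number": [["integer"], [".", "integer"], ["integer", "."],
               ["integer", ".", "integer"]],
    "integer": [["digit"], ["integer", "digit"]],
    "digit": [[c] for c in "0123456789ABCDEF"],
}


def derive(start="number", rng=random):
    """Return every line of a random derivation, from start to a terminal string."""
    form = [start]
    steps = [form]
    while any(symbol in GRAMMAR for symbol in form):
        # Choose one of the nonterminals still present, then one of its alternatives.
        index = rng.choice([i for i, s in enumerate(form) if s in GRAMMAR])
        expansion = rng.choice(GRAMMAR[form[index]])
        form = form[:index] + expansion + form[index + 1:]
        steps.append(form)
    return steps


if __name__ == "__main__":
    for line in derive():
        print(" ".join(line))   # spaces are for readability only
    # The generated string itself has no spaces: "".join(line)
```

Because integer expands to a single digit half of the time, derivations usually stay short, although arbitrarily long numbers are possible in principle.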