More numerical examples

If you only want to allow decimal integers and you want to allow them to be negative you could use the following grammar:

integer : '0'
        | pos_integer
        | '-' pos_integer
        ;

pos_integer : pos_digit
            | pos_integer digit
            ;

digit : '0' | pos_digit
            ;

pos_digit : '1' | '2' | '3' | '4' | '5'
          | '6' | '7' | '8' | '9'
          ;

We’ve eliminated the digits needed for hexadecimal and allowed our integers to be negative. There are some further changes. 007 is a perfectly good integer according to bc’s grammar but is now disallowed. The only integer allowed to begin with a 0 is now 0 itself. We don’t allow -0 either. In this grammar there is one any only one way to represent each integer as an integer.

It’s possible to encode some basic arithmetic in a grammar. Consider, for example, the following grammar.

even_integer : '0'
             | pos_even_integer
             | '-' pos_even_integer
             ;

pos_even_integer : pos_even_digit
                 | pos_integer even_digit
                 ;

pos_integer : pos_digit
            | pos_integer digit
            ;

even_digit : 0 | pos_even_digit
           ;

pos_digit : pos_even_digit | pos_odd_digit
          ;

pos_even_digit : '2' | '4' | '6' | '8'
               ;

odd_digit : '1' | '3' | '5' | '7' | '9'
          ;

The even_integers described by this grammar are precisely the even integers. This relies on the fact the an integer is even if and only if its last digit is even.

You can check divisibility by three as well.

multiple_of_3 : '0'
              | pos_integer_0_mod_3
              | '-' pos_integer_0_mod_3
              ;

pos_integer_0_mod_3 : pos_digit_0_mod_3
                    | pos_integer_0_mod_3 digit_0_mod_3
                    | pos_integer_1_mod_3 digit_2_mod_3
                    | pos_integer_2_mod_3 digit_1_mod_3
          ;

pos_integer_1_mod_3 : digit_1_mod_3
                    | pos_integer_0_mod_3 digit_1_mod_3
                    | pos_integer_1_mod_3 digit_0_mod_3
                    | pos_integer_2_mod_3 digit_2_mod_3
                    ;

pos_integer_2_mod_3 : digit_1_mod_3
                    | pos_integer_0_mod_3 digit_2_mod_3
                    | pos_integer_1_mod_3 digit_1_mod_3
                    | pos_integer_2_mod_3 digit_0_mod_3
                    ;

digit_0_mod_3 : '0' | pod_digit_0_mod_3
              ;

pos_digit_0_mod_3 : | '3' | '6' | '9'
                  ;

digit_1_mod_3 : '1' | '4' | '7'
              ;

digit_2_mod_3 : '2' | '5' | '8'
              ;

The multiple_of_3’s are just the multiples of three.

How far can we go in this direction? Can we express divisibility by any integer purely in grammatical terms? As it turns out, yes. Since we can express divisibility can we write down a grammar for prime numbers? In this case the answer is more complicated. We can’t construct such a grammar using only rules of the type considered above but we can if we allow more complicated rules, which replace a list of symbols with another list of symbols rather than just replacing a single symbol with a list of symbols.