A grammar example (bc)

bc is an arbitrary precision calculator. It’s part of the POSIX specification for Unix operating systems. That specification not only requires a bc program to be present but also gives a minimal grammar which it must recognise, which makes it useful as an example. I’ll start by giving the grammar, then introduce some terminology useful for talking about it, then point out some features. Later I’ll describe in more detail the grammar of the language in which the grammar is expressed.

The grammar of the bc calculator, as given in the specification, is as follows. Understanding it in detail is not necessary unless for some reason you want to write a POSIX-compliant implementation of the bc utility.

%token    EOF NEWLINE STRING LETTER NUMBER


%token    MUL_OP
/*        '*', '/', '%'                           */


%token    ASSIGN_OP
/*        '=', '+=', '-=', '*=', '/=', '%=', '^=' */


%token    REL_OP
/*        '==', '<=', '>=', '!=', '<', '>'        */


%token    INCR_DECR
/*        '++', '--'                              */


%token    Define    Break    Quit    Length
/*        'define', 'break', 'quit', 'length'     */


%token    Return    For    If    While    Sqrt
/*        'return', 'for', 'if', 'while', 'sqrt'  */


%token    Scale    Ibase    Obase    Auto
/*        'scale', 'ibase', 'obase', 'auto'       */


%start    program


%%


program              : EOF
                     | input_item program
                     ;


input_item           : semicolon_list NEWLINE
                     | function
                     ;


semicolon_list       : /* empty */
                     | statement
                     | semicolon_list ';' statement
                     | semicolon_list ';'
                     ;


statement_list       : /* empty */
                     | statement
                     | statement_list NEWLINE
                     | statement_list NEWLINE statement
                     | statement_list ';'
                     | statement_list ';' statement
                     ;


statement            : expression
                     | STRING
                     | Break
                     | Quit
                     | Return
                     | Return '(' return_expression ')'
                     | For '(' expression ';'
                           relational_expression ';'
                           expression ')' statement
                     | If '(' relational_expression ')' statement
                     | While '(' relational_expression ')' statement
                     | '{' statement_list '}'
                     ;


function             : Define LETTER '(' opt_parameter_list ')'
                           '{' NEWLINE opt_auto_define_list
                           statement_list '}'
                     ;


opt_parameter_list   : /* empty */
                     | parameter_list
                     ;


parameter_list       : LETTER
                     | define_list ',' LETTER
                     ;


opt_auto_define_list : /* empty */
                     | Auto define_list NEWLINE
                     | Auto define_list ';'
                     ;


define_list          : LETTER
                     | LETTER '[' ']'
                     | define_list ',' LETTER
                     | define_list ',' LETTER '[' ']'
                     ;


opt_argument_list    : /* empty */
                     | argument_list
                     ;


argument_list        : expression
                     | LETTER '[' ']' ',' argument_list
                     ;


relational_expression : expression
                     | expression REL_OP expression
                     ;


return_expression    : /* empty */
                     | expression
                     ;


expression           : named_expression
                     | NUMBER
                     | '(' expression ')'
                     | LETTER '(' opt_argument_list ')'
                     | '-' expression
                     | expression '+' expression
                     | expression '-' expression
                     | expression MUL_OP expression
                     | expression '^' expression
                     | INCR_DECR named_expression
                     | named_expression INCR_DECR
                     | named_expression ASSIGN_OP expression
                     | Length '(' expression ')'
                     | Sqrt '(' expression ')'
                     | Scale '(' expression ')'
                     ;


named_expression     : LETTER
                     | LETTER '[' expression ']'
                     | Scale
                     | Ibase
                     | Obase
                     ;
NUMBER  : integer
        | '.' integer
        | integer '.'
        | integer '.' integer
        ;


integer : digit
        | integer digit
        ;


digit   : 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
        | 8 | 9 | A | B | C | D | E | F
        ;

This isn’t actually a complete grammar. There are some further rules which are given afterwards in ordinary text, but those are mostly boring bits like the fact that NEWLINE is the newline character.