An important category of languages is the regular languages. These can be characterised in a variety of ways, via regular grammars, finite state automata, regular expressions, or syntactic monoids. It’s important to understand all of these points of view because often a problem which is hard to solve using one description is easy in another.
To simplify things all of our examples in this chapter will be
languages where the tokens are single characters and we’ll write lists
of tokens as strings. When writing grammars each terminal symbol will
have only a single token, i.e. character, and will be denoted by that
character. The characters :
, "
and
|
which have a special meaning in our language for
describing languages, will not be tokens in any of our example
languages, nor will any whitespace characters. Non-terminal symbols will
always be denoted by a string more than one character long. These aren’t
limitations imposed by the theory, just ways of making the examples
easier for you to read.
List of characters are strings. Because all tokens in our examples are characters all lists of tokens are strings. Strings are more familiar than lists of tokens so I’ll often refer to them as strings when doing examples. I may occasionally make the mistake of referring to strings rather than lists of tokens in the general theory as well.