Regular languages

We’ve now seen how to go from a left or right regular grammar to a finite state automaton, from a finite state automaton to a deterministic finite state automaton, from there to a generalised finite state automaton, from there to a regular expression, and finally from a regular expression to a left or right regular grammar. At each step the language is unchanged. We can therefore conclude that the following sets of languages for a given set of tokens are all the same:

the set of languages with a left regular grammar
the set of languages with a right regular grammar
the set of languages recognised by a finite state automaton
the set of languages recognised by a deterministic finite state automaton
the set of languages recognised by a generalised finite state automaton
the set of languages described by a regular expression

The last of these was our definition of a regular language, but we could really have taken any of them as our definition.

This equivalence now allows us to answer many questions which were left unanswered in the sections from individual points of view.

I stated earlier, for example, that every language generated by a left regular grammar can also be generated by a right regular grammar and vice versa. We know this is true. In theory the proof is even constructive. We could take a left regular grammar, construct the corresponding finite state automaton, use the power set construction to construct a deterministic finite state automaton, convert it to a generalised finite state automaton with a single initial and single accepting state, kill all of its intermediate states one by one, take the resulting regular expression, and then use it to find a right regular grammar. All the steps in this process can in principle be carried out in a purely mechanical way. You shouldn’t ever do this, of course. The resulting grammar would be horrible.

We also considered closure of these sets of languages under various set operations. It was easy, for example, to see that the union of languages with a regular grammar has a regular grammar. We can now see that that’s true of languages described by any of the three types of finite automata or by regular expressions. This would be easy to prove directly for regular expressions but is quite tricky to prove for deterministic finite state automata. On the other hand it was fairly straightforward to prove that the intersection of languages defined by such finite state automata is also defined by such an automaton but this is far from obvious for languages defined by regular grammars or regular expressions. It must be true though, since these are all different ways of describing the same set of languages.

Similarly, reversal is an easy process to describe in terms of regular expressions but it’s far from clear how to take, for example, a left regular grammar and construct a left regular grammar for the reversed language, or to take a deterministic finite state automaton for a language and construct a deterministic finite state automaton for the reversed language. The equivalence of all of these descriptions of regular languages shows that it must be possible though.

In general if you want to prove a fact about regular languages you should look for the description in terms of which this fact is easiest to prove. You can even mix them. If regular languages appear in both the hypotheses and conclusion of a theorem you want to prove you might find it convenient to use one characterisation for the hypotheses and another for the conclusion.