The theory of finite automata is the mathematical theory of a simple class of algorithms that are important in computer science. Algorithms are recipes that tell us how to solve problems; the rules we learn in school for adding, subtracting, multiplying and dividing numbers are good examples of algorithms. Although algorithms have always been important in mathematics, mathematicians did not spell out precisely what they were until the early decades of the twentieth century, when they started to ask questions about the nature of mathematical proof. Perhaps motivated in part by the mechanical devices that were used to help in arithmetical calculations, mathematicians wanted to know whether there were mechanical ways of proving theorems.


By mechanical, they did not intend to build real machines that could prove theorems; rather they wanted to know whether it was possible in principle to prove theorems mechanically. In mathematical terms, they wanted to know whether there was an algorithm for proving the theorems of mathematics. The answer to this question obviously depended on what was meant by the word 'algorithm.' Numerous definitions were suggested by different mathematicians from various perspectives and, remarkably, they all proved to be equivalent. By the end of the 1930s, mathematicians had a precise definition of what an algorithm was, and using this definition they were able to show that there were problems that could not be solved by algorithmic means. It was perhaps no accident that, as mathematicians were laying the foundations of the theory of algorithms, engineers were constructing machines that could implement algorithms as programs. Algorithms and programs are just two sides of the same coin.


Mathematicians were initially interested in the dividing line between what was and what was not algorithmic. But simply knowing that a problem could be solved by means of an algorithm was not the end of the story. Certainly, this implied you could write a program to solve the problem, but did not imply your program would run quickly or efficiently. For this reason, two approaches to classifying algorithms were developed. The first classified them according to their running times and is known as complexity theory, whereas the second classified them according to the types of memory used in implementing them and is known as language theory. In language theory, the simplest algorithms are those which can be implemented by finite automata, the subject of this book.


Finite automata were first studied in the 1950s by Stephen Kleene, and found a number of important applications in computer science: for example, in the design of computer circuits, and in the lexical analyzers of compilers. In the 1960s and 1970s, mathematicians such as Samuel Eilenberg, Marcel-Paul Schützenberger, and John Rhodes pioneered the mathematics of finite automata. More recently, other mathematicians have come to appreciate the usefulness of automata in such areas as combinatorial group theory and symbolic dynamics.

This book is intended to be an introduction to the mathematical theory of finite automata, assuming as background only a first course in discrete mathematics. The Appendix outlines the prerequisites.


Structure of the book

The book is notionally divided into two parts: Chapters 1 to 6 form the first part, and Chapters 7 to 12 the second; Chapter 7 is the bridge that links them.

PART I. This centres on Kleene's Theorem, the first major result proved about finite automata. I describe two different ways of proving this theorem with Chapter 1 common to both:


• The quickest route to proving Kleene's Theorem is the following: Sections 3.1-3.3 for constructions involving non-deterministic automata, Section 5.1 for the definition of regular expressions, and then Theorem 5.2.1 of Section 5.2, which is the proof of Kleene's Theorem itself. Chapter 2, omitting Section 2.5, can be regarded as a collection of examples.

• The route that emphasises algorithms more than proofs is the following: Sections 2.1-2.3 for practice in designing automata, Sections 3.1 and 3.2 for the accessible subset construction, Section 4.1 and Theorem 4.2.1 of Section 4.2 for ε-automata, Section 5.1 for regular expressions, and Section 5.3 for an algorithmic proof of Kleene's Theorem.

Section 2.6 on the Pumping Lemma can be read at any point after Chapter 1. Section 5.4 describes an algebraic technique for converting automata into regular expressions based on solving equations. Chapter 6, on local languages, describes a different way of converting regular expressions into automata from that used in any of the above proofs of Kleene's Theorem.

PART II. This centres on the algebraic theory of recognisable languages, the main goals being Schützenberger's characterisation of star-free languages, and the Variety Theorem of Eilenberg and Schützenberger. Chapters 7 and 8 have a strong algorithmic flavour, but Chapters 9-12 are increasingly mathematical.

Chapter 7 describes how to find the smallest automaton recognising a given language. Two different techniques are given depending on whether the language is described by an automaton or by a regular expression. If the former, then Section 7.2 describes the algorithm for converting the automaton into a minimal automaton. The theory of minimal automata is developed in Sections 7.3 and 7.4. If you just want the algorithm for minimising an automaton then Sections 7.1 and 7.2 are all you need. If the language is described by a regular expression, then there is a beautiful technique, called the Method of Quotients, which will construct the minimal automaton directly from the expression. Unfortunately, there are 'issues' connected with this method, but I have relegated a discussion of these to the end of Section 7.5.

The minimal automaton is obviously important from a practical point of view, but it is also the starting point of an algebraic technique for studying recognisable languages. This is introduced in Chapter 8. Here the transition monoid of an automaton is defined and an algorithm described for computing it. At the conclusion of this chapter, semigroups and monoids are introduced.
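To give a rough feel for what computing a transition monoid involves, here is a minimal sketch in Python. It is not necessarily the algorithm described in Chapter 8; it simply closes the transformations induced by the input letters under composition, using a breadth-first search. The function name `transition_monoid`, the encoding of states as the integers 0 to n-1, and the dictionary `delta` are all assumptions made for this illustration, and the automaton is assumed to be complete and deterministic.

```python
from collections import deque


def transition_monoid(num_states, alphabet, delta):
    """Return the transition monoid of a complete deterministic automaton.

    States are assumed to be 0, 1, ..., num_states - 1, and
    delta[(state, letter)] gives the next state.  Each element of the
    monoid is a tuple f, where f[q] is the state reached from q.  The
    result maps each element to one shortest word that induces it.
    """
    states = range(num_states)
    identity = tuple(states)  # the transformation induced by the empty word
    letter_maps = {
        a: tuple(delta[(q, a)] for q in states) for a in alphabet
    }

    found = {identity: ""}
    queue = deque([identity])
    while queue:
        f = queue.popleft()
        for a in alphabet:
            g = letter_maps[a]
            # Read the word for f first, then the letter a.
            h = tuple(g[f[q]] for q in states)
            if h not in found:
                found[h] = found[f] + a
                queue.append(h)
    return found


# A two-state automaton over {a, b} that tracks the parity of the number of a's.
parity_delta = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}
parity_monoid = transition_monoid(2, "ab", parity_delta)
```

For this example the search finds exactly two transformations: the identity, induced by the empty word, and the map that swaps the two states, induced by the word a. The search always terminates because there are only finitely many functions from a finite set of states to itself.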

The point of Chapter 8 will only become clear in Chapter 9, when we prove the algebraic counterpart of Kleene's Theorem: a language is recognisable if and only if its syntactic monoid is finite. The syntactic monoid of a recognisable language is isomorphic to the transition monoid of the minimal automaton of the language.

The main result of Chapter 9 tells us that finite monoids may be useful in studying recognisable languages. Because of this, Chapter 10 develops the theory of finite monoids we shall need. We also show how results about recognisable languages, which we previously proved using automata, can also be proved using monoids.

The real justification for studying the syntactic monoid of a recognisable language comes in Chapter 11, where we show that an important class of recognisable languages are characterised by the algebraic properties of their syntactic monoids. Specifically, we prove Schützenberger's Theorem: a language is star-free if and only if its syntactic monoid is aperiodic.
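The aperiodicity condition is easy to test mechanically once the elements of a finite monoid are in hand. The following Python sketch uses the standard definition: a finite monoid is aperiodic if every element x satisfies x^n = x^(n+1) for some n. The names `compose` and `is_aperiodic` are hypothetical, and the elements are assumed to be given as tuples describing transformations of the states 0 to n-1 of a minimal automaton, so that they form (a copy of) the syntactic monoid.

```python
def compose(f, g):
    """Return the transformation 'apply f first, then g'."""
    return tuple(g[q] for q in f)


def is_aperiodic(elements):
    """Check that every element x satisfies x^n == x^(n+1) for some n.

    `elements` is assumed to be an iterable of tuples, each describing a
    transformation of the states 0, 1, ..., n - 1.
    """
    for x in elements:
        power = x
        seen = set()
        while power not in seen:
            seen.add(power)
            next_power = compose(power, x)
            if next_power == power:
                break  # the powers of x have stabilised
            power = next_power
        else:
            # The powers of x entered a cycle of length greater than one.
            return False
    return True


# Words over {a, b} with an even number of a's: identity and the swap.
even_as = [(0, 1), (1, 0)]
# Words over {a, b} containing at least one a: identity and a constant map.
contains_a = [(0, 1), (1, 1)]
```

Under these assumptions, `is_aperiodic(even_as)` is false, because the swap has order two, whereas `is_aperiodic(contains_a)` is true. This agrees with the theorem: the second language is star-free, while the first is not.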

Schützenberger's Theorem opened up a whole new area of research: classifying recognisable languages by means of the algebraic properties of their syntactic monoids. In Chapter 12, we prove the Variety Theorem of Eilenberg and Schützenberger, which provides the template for proving results of this type.


