Taxonomies and Toolkits of Regular Language Algorithms by Bruce William Watson.
From 1.1 Problem Statement:
A number of fundamental computing science problems have been extensively studied since the 1950s and the 1960s. As these problems were studied, numerous solutions (in the form of algorithms) were developed over the years. Although new algorithms still appear from time to time, each of these fields can be considered mature. In the solutions to many of the well-studied computing science problems, we can identify three deficiencies:
- Algorithms solving the same problem are difficult to compare to one another. This is usually due to the use of different programming languages, styles of presentation, or simply the addition of unnecessary details.
- Collections of implementations of algorithms solving a problem are difficult, if not impossible, to find. Some of the algorithms are presented in a relatively obsolete manner, either using old notations or programming languages for which no compilers exist, making it difficult to either implement the algorithm or find an existing implementation.
- Little is known about the comparative practical running time performance of the algorithms. The lack of existing implementations in one and the same framework, especially of the older algorithms, makes it difficult to determine the running time characteristics of the algorithms. A software engineer selecting one of the algorithms will usually do so on the basis of the algorithm’s theoretical running time, or simply by guessing.
In this dissertation, a solution to each of the three deficiencies is presented for each of the following three fundamental computing science problems:
- Keyword pattern matching in strings. Given a finite non-empty set of keywords (the patterns) and an input string, find the set of all occurrences of a keyword as a substring of the input string.
- Finite automata (FA) construction. Given a regular expression, construct a finite automaton which accepts the language denoted by the regular expression.
- Deterministic finite automata (DFA) minimization. Given a DFA, construct the unique minimal DFA accepting the same language.
We do not necessarily consider all the known algorithms solving the problems. For example, we restrict ourselves to batch-style algorithms1, as opposed to incremental algorithms2.
Requires updating given its age, 1995, but a work merits mention.
I first saw this in a tweet by silentbicycle.srec.