From the introduction:
flex is a tool for generating scanners. A scanner is a program which recognizes lexical patterns in text. The flex program reads the given input files, or its standard input if no file names are given, for a description of a scanner to generate. The description is in the form of pairs of regular expressions and C code, called rules. flex generates as output a C source file, lex.yy.c by default, which defines a routine yylex(). This file can be compiled and linked with the flex runtime library to produce an executable. When the executable is run, it analyzes its input for occurrences of the regular expressions. Whenever it finds one, it executes the corresponding C code.
For when you have serious scanning tasks.
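To make "recognizes lexical patterns" a bit more concrete, here is a rough hand-written sketch in plain C of the kind of work a scanner does. It is not flex output and not how flex implements things internally: the names (TokenKind, Token, next_token, print_tokens) and the three token classes are my own assumptions for illustration. In a flex description, each branch below would instead be a regular expression paired with a snippet of C code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Token kinds for this sketch (hypothetical names, not part of flex). */
typedef enum { TOK_EOF, TOK_NUMBER, TOK_IDENT, TOK_OP } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* points into the input buffer, not copied */
    size_t length;      /* number of characters in the token */
} Token;

/* Return the next token at *cursor and advance *cursor past it. */
Token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    /* Skip whitespace between tokens. */
    while (isspace((unsigned char)*p)) p++;

    Token tok = { TOK_EOF, p, 0 };
    if (*p == '\0') {
        *cursor = p;
        return tok;
    }

    if (isdigit((unsigned char)*p)) {
        /* Roughly the pattern [0-9]+ */
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p)) p++;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        /* Roughly the pattern [A-Za-z_][A-Za-z0-9_]* */
        tok.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_') p++;
    } else {
        /* Any other single character is treated as an operator. */
        tok.kind = TOK_OP;
        p++;
    }

    tok.length = (size_t)(p - tok.start);
    *cursor = p;
    return tok;
}

/* Print every token in src, one per line, and return how many were found. */
size_t print_tokens(const char *src)
{
    static const char *names[] = { "EOF", "NUMBER", "IDENT", "OP" };
    size_t count = 0;
    for (Token t = next_token(&src); t.kind != TOK_EOF; t = next_token(&src)) {
        printf("%-6s %.*s\n", names[t.kind], (int)t.length, t.start);
        count++;
    }
    return count;
}
```

Calling print_tokens("x1 = 42 + y") walks the string and reports five tokens. With flex you would write only the patterns and the actions, and the generated yylex() would take care of the matching loop.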
This is actually the first step in building a parser. The lexer produces a stream of tokens, which is then matched against the grammar by the parser.
In Ontopia we use antlr and JFlex for lexing, and antlr for parsing. Using JFlex for lexing is vastly better than using antlr.
Here’s the CTM lexer: http://code.google.com/p/ontopia/source/browse/trunk/ontopia-engine/src/main/jflex/net/ontopia/topicmaps/utils/ctm/ctm.flex
Comment by larsga@garshol.priv.no — December 7, 2011 @ 3:28 am
Thanks!
I was thinking of flex as separate from building a compiler. On which see “flex & bison” by John Levine (O’Reilly 2009). In one chapter he illustrates how to build a SQL parser.
Comment by Patrick Durusau — December 7, 2011 @ 6:00 am
But isn’t flex limited with reference to natural languages?
Comment by CapnKirk — December 7, 2011 @ 7:44 am
Err, not sure what you mean by “limited”. In what way?
Comment by Patrick Durusau — December 7, 2011 @ 9:36 am