The examples discussed below dig into the peculiarities of using
i(parser)- and i(scanner) generators that generate bf(C++) sources. Once the
input for a program exceeds a certain level of complexity, it becomes
attractive to let scanner- and parser generators produce the code that
does the actual input recognition.
The examples in this and subsequent sections assume that the reader knows how
to use the
i(scanner generator) ti(flex) and the i(parser generator) ti(bison). Both
tt(bison) and tt(flex) are well documented elsewhere. The
predecessors of tt(bison) and tt(flex), called ti(yacc) and ti(lex), are
described in several books, e.g., in
hi(http://www.oreilly.com/catalog/lex)
O'Reilly's book url(`lex & yacc')(http://www.oreilly.com/catalog/lex).
Scanner- and parser generators are also available as free software. Both
tt(bison) and tt(flex) are usually part of software distributions or they can
be obtained from
hi(ftp://prep.ai.mit.edu/pub/non-gnu)
tlurl(ftp://prep.ai.mit.edu/pub/non-gnu). tt(Flex) creates a bf(C++) class
when ti(%option c++) is specified.
For parser generators the program ti(bison) is available. In the early 1990s
em(Alain Coetmeur) (url(coetmeur@icdc.fr)(mailto:coetmeur@icdc.fr)) created a
bf(C++) variant (ti(bison++)) creating a parser class. Although the
tt(bison++) program produces code that can be used in bf(C++) programs, it also
shows many characteristics that are more suggestive of a bf(C) context than a
bf(C++) context. In January 2005 I rewrote parts of Alain's tt(bison++)
program, resulting in the original version of the program
hi(bisonc++)
bf(bisonc++). Then, in May 2005, a complete rewrite of the tt(bisonc++)
parser generator was finished (version number 0.98). Current versions of
tt(bisonc++) can be downloaded from
tlurl(http://bisoncpp.sourceforge.net/), where it is available as a source
archive and as a binary (i386) url(Debian)(http://www.debian.org) package
(including tt(bisonc++)'s documentation).
tt(Bisonc++) creates a cleaner parser class than tt(bison++). In particular,
it derives the parser class from a base class containing the parser's token-
and type-definitions as well as all member functions that should not be
(re)defined by the programmer. As a result of this approach, the generated
parser class is very small, declaring only members that are actually defined
by the programmer (as well as some other members, generated by tt(bisonc++)
itself, implementing the parser's ti(parse()) member). One member that is
em(not) implemented by default is tt(lex), producing the next lexical
token. When the directive tt(%scanner) (see section ref(BISONDEF)) is used,
tt(bisonc++) produces a standard implementation for this member; otherwise it
must be implemented by the programmer.
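
The class layout described above can be pictured with a small hand-written
sketch. It is em(not) output produced by tt(bisonc++): the names
tt(ParserBase), tt(Parser), tt(Scanner), tt(d_scanner) and tt(tokenCount) are
assumptions made for this illustration, and tt(parse) merely counts tokens
instead of recognizing a grammar. The sketch only shows a small derived parser
class whose private tt(lex) member forwards its request to a scanner object:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a generated scanner: one character per token.
class Scanner
{
    std::string d_input;
    std::size_t d_pos = 0;

  public:
    explicit Scanner(std::string input) : d_input(std::move(input)) {}

    // Returns the next character as the token value, or 0 at end of input.
    int lex()
    {
        assert(d_pos <= d_input.size());
        return d_pos < d_input.size()
                   ? static_cast<unsigned char>(d_input[d_pos++])
                   : 0;
    }
};

// Hypothetical stand-in for the base class: it holds the token definitions
// and the data that the programmer is not supposed to redefine.
class ParserBase
{
  public:
    enum Tokens { END = 0 };

  protected:
    ParserBase() = default;
    ~ParserBase() = default;

    int d_nTokens = 0;
};

// The derived class stays small: it composes a scanner and declares only
// parse() and lex().
class Parser : public ParserBase
{
    Scanner d_scanner;

  public:
    explicit Parser(std::string input) : d_scanner(std::move(input)) {}

    // Placeholder for the real parsing function: counts tokens, returns 0.
    int parse()
    {
        while (lex() != END)
            ++d_nTokens;
        return 0;
    }

    int tokenCount() const { return d_nTokens; }

  private:
    // Forwards the request for the next token to the composed scanner.
    int lex() { return d_scanner.lex(); }
};
```

In this sketch a tt(Parser) object constructed from the text tt(a+b) sees
three tokens: after calling its tt(parse) member tt(tokenCount) returns 3.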
In early 2012 the program
hi(flexc++)
bf(flexc++) tlurl(http://flexcpp.org/) reached its initial release. Like
tt(bisonc++) it is part of the url(Debian Linux
distribution)(http://www.debian.org).
Jean-Paul van Oosten (email(j.p.van.oosten@rug.nl)) and Richard Berendsen
(email(richardberendsen@xs4all.nl)) started the tt(flexc++) project in 2008,
and the final program was completed by Jean-Paul and me between 2010 and 2012.
These sections of the annotations() focus on tt(bisonc++) as our
emi(parser generator) and tt(flexc++) as our lexical scanner
generator. Previous releases of the annotations() used tt(flex) as the
scanner generator.
Using tt(flexc++) and tt(bisonc++), tt(class)-based scanners and parsers are
generated. The advantage of this approach is that the interfaces to the scanner
and the parser tend to be cleaner than interfaces that do not use
tt(class)es. Furthermore, classes allow us to get rid of most, if not all, global
variables, making it easy to use multiple parsers in one program.
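
Why the absence of global variables matters can be illustrated by a minimal
sketch. The class tt(WordScanner) and the function tt(interleave) are
hypothetical and were written by hand for this illustration; they are unrelated
to code generated by tt(flexc++). As each object keeps its own state, two
scanners can be advanced alternately without disturbing each other:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical word scanner: all state lives in the object, none in globals.
class WordScanner
{
    std::istringstream d_in;
    std::string d_word;
    std::size_t d_count = 0;

  public:
    explicit WordScanner(std::string const &text) : d_in(text) {}

    // Reads the next blank-delimited word; returns false at end of input.
    bool next()
    {
        if (!(d_in >> d_word))
            return false;
        ++d_count;
        return true;
    }

    // The most recently read word; only meaningful after a successful next().
    std::string const &word() const
    {
        assert(d_count > 0);
        return d_word;
    }

    std::size_t count() const { return d_count; }
};

// Alternately takes one word from each text, using two independent scanners.
std::string interleave(std::string const &left, std::string const &right)
{
    WordScanner a(left);
    WordScanner b(right);
    std::string out;

    bool moreA = a.next();
    bool moreB = b.next();
    while (moreA || moreB)
    {
        if (moreA)
        {
            out += a.word() + ' ';
            moreA = a.next();
        }
        if (moreB)
        {
            out += b.word() + ' ';
            moreB = b.next();
        }
    }
    if (!out.empty())
        out.pop_back();             // drop the trailing blank
    return out;
}
```

Interleaving the texts tt(a b c) and tt(1 2) this way produces
tt(a 1 b 2 c). With a scanner keeping its position and current word in global
variables, the second scanner would overwrite the state of the first.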
Below, two example programs are developed. The first example only uses
tt(flexc++). The generated scanner monitors the production of a file from
several parts. That example focuses on the lexical scanner and on switching
files while churning through the information. The second example uses both
tt(flexc++) and tt(bisonc++) to generate a scanner and a parser transforming
standard arithmetic expressions to their postfix notation, which is commonly
used in code generated by compilers and in tt(HP)-calculators. In the second
example the emphasis is mainly on tt(bisonc++) and on composing a scanner
object inside a generated parser.
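
To show what the transformation to postfix notation amounts to, here is a
minimal hand-written sketch. It is em(not) the tt(bisonc++) example developed
later and it does not use a generated scanner or parser: it is a small
recursive-descent converter. The names tt(PostfixConverter) and tt(toPostfix)
as well as the grammar (unsigned integers, the four binary operators, and
parentheses) are assumptions made for this illustration:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hand-written recursive-descent converter (not generated by bisonc++).
// Grammar assumed for this sketch:
//   expr   : term   { ('+' | '-') term }
//   term   : factor { ('*' | '/') factor }
//   factor : NUMBER | '(' expr ')'
class PostfixConverter
{
    std::string const d_in;
    std::size_t d_pos = 0;
    std::string d_out;

  public:
    explicit PostfixConverter(std::string const &input) : d_in(input) {}

    // Converts the whole input; throws std::invalid_argument on syntax errors.
    std::string convert()
    {
        d_pos = 0;
        d_out.clear();
        expr();
        if (peek() != '\0')
            throw std::invalid_argument("unexpected trailing input");
        return d_out;
    }

  private:
    static bool isDigit(char ch)
    {
        return std::isdigit(static_cast<unsigned char>(ch)) != 0;
    }

    // Skips blanks and returns the next character, or '\0' at end of input.
    char peek()
    {
        while (d_pos < d_in.size()
               && std::isspace(static_cast<unsigned char>(d_in[d_pos])))
            ++d_pos;
        return d_pos < d_in.size() ? d_in[d_pos] : '\0';
    }

    // Appends one operand or operator, separated by a single blank.
    void emit(std::string const &token)
    {
        assert(!token.empty());
        if (!d_out.empty())
            d_out += ' ';
        d_out += token;
    }

    void expr()
    {
        term();
        for (char op = peek(); op == '+' || op == '-'; op = peek())
        {
            ++d_pos;                    // consume the operator
            term();
            emit(std::string(1, op));   // operator follows both operands
        }
    }

    void term()
    {
        factor();
        for (char op = peek(); op == '*' || op == '/'; op = peek())
        {
            ++d_pos;
            factor();
            emit(std::string(1, op));
        }
    }

    void factor()
    {
        char ch = peek();
        if (ch == '(')
        {
            ++d_pos;
            expr();
            if (peek() != ')')
                throw std::invalid_argument("missing ')'");
            ++d_pos;
            return;
        }
        if (!isDigit(ch))
            throw std::invalid_argument("number expected");

        std::string number;
        while (d_pos < d_in.size() && isDigit(d_in[d_pos]))
            number += d_in[d_pos++];
        emit(number);
    }
};

std::string toPostfix(std::string const &infix)
{
    return PostfixConverter(infix).convert();
}
```

Given the expression tt(3 + 4 * (2 - 1)) this sketch returns
tt(3 4 2 1 - * +): operands keep their order, and each operator is written
after the two operands it applies to, so parentheses are no longer needed.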