cppannotations/yo/concrete/usingbison.yo
Frank B. Brokken 4e6881a18e WIP
git-svn-id: https://cppannotations.svn.sourceforge.net/svnroot/cppannotations/trunk@429 f6dd340e-d3f9-0310-b409-bdd246841980
2010-02-27 21:16:23 +00:00


Once an i(input language) exceeds a certain level of complexity, a emi(parser)
is often used to keep that complexity manageable. In this case, a
emi(parser generator) can be used to generate the code verifying the input's
grammatical correctness. The lexical scanner (preferably composed into the
parser) provides chunks of the input, called hi(token)em(tokens). The parser
then processes the series of tokens generated by the lexical scanner.

The starting point when developing programs that use both parsers and scanners is
the i(grammar). The grammar defines a em(set of tokens) that can be returned
by the lexical scanner (called the emi(scanner) below).
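To make the notion of a token concrete, here is a minimal hand-written sketch
of a scanner splitting an expression into tokens. It is not the code that
tt(flex) generates: the names tt(Kind), tt(Token) and tt(scan) are assumptions
made for this illustration only, and numbers are recognized in a deliberately
simplified way.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds, for illustration only: a generated parser
// defines its own token values.
enum class Kind { INT, REAL, OP };

struct Token
{
    Kind kind;
    std::string text;           // the matched input characters
};

// Returns true if ch is a decimal digit (safe for any char value).
inline bool isDigit(char ch)
{
    return std::isdigit(static_cast<unsigned char>(ch)) != 0;
}

// Splits input into tokens, skipping white space. A number containing a
// dot is a REAL, any other number is an INT, and every remaining
// character is returned as a one-character OP token.
// Simplification: malformed numbers like 1.2.3 are not rejected.
std::vector<Token> scan(std::string const &input)
{
    std::vector<Token> tokens;
    std::size_t idx = 0;

    while (idx < input.size())
    {
        char ch = input[idx];

        if (std::isspace(static_cast<unsigned char>(ch)))
        {
            ++idx;
            continue;
        }

        if (isDigit(ch))
        {
            std::size_t begin = idx;
            bool real = false;

            while (idx < input.size() &&
                   (isDigit(input[idx]) || input[idx] == '.'))
            {
                real = real || input[idx] == '.';
                ++idx;
            }
            assert(idx > begin);    // at least one digit was consumed

            tokens.push_back(
                {real ? Kind::REAL : Kind::INT,
                 input.substr(begin, idx - begin)});
            continue;
        }

        tokens.push_back({Kind::OP, std::string(1, ch)});
        ++idx;
    }
    return tokens;
}
```

Calling tt(scan("3 + 4.5")) produces three tokens: an tt(INT), an tt(OP) and a
tt(REAL). A parser only needs to inspect this series of tokens, not the
individual input characters.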
Finally, auxiliary code is provided to `fill in the blanks': the i(actions)
performed by the parser and by the scanner are not normally specified
literally in the grammar rules or lexical regular expressions, but
should be implemented in em(member functions), called from the parser's
rules or associated with the scanner's regular expressions.

In the previous section we've seen an example of a bf(C++) class generated by
ti(flex). In the current section we concentrate on the parser. The parser can
be generated from a grammar specification file, processed by the program
ti(bisonc++). The grammar specification file required by tt(bisonc++) is
similar to the file processed by ti(bison) (or by tt(bison)'s successor (and
tt(bisonc++)'s predecessor) ti(bison++), written in the early nineties by the
Frenchman
hi(Coetmeur, A.) em(Alain Coetmeur)).

In this section we develop a program that converts
em(infix expressions), where binary operators are written between their
operands, to em(postfix expressions), where operators are written after their
operands. Also, the unary operator tt(-) is converted from its prefix notation
to a postfix form. The unary tt(+) operator is ignored as it requires no
further actions. In essence our little calculator is a micro compiler,
transforming numeric expressions into assembly-like instructions.
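The conversion itself can be sketched without a parser generator. The class
below is a small hand-written recursive descent converter, not the parser that
tt(bisonc++) generates. The class name tt(Postfix), the output word tt(neg)
for the postfix form of the unary minus, and the chosen priorities (unary
operators bind tighter than tt(*), which binds tighter than tt(+)) are
assumptions made for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical converter, for illustration only.
// Grammar:  expr:   term   { '+' term }
//           term:   factor { '*' factor }
//           factor: '-' factor | '+' factor | '(' expr ')' | number
class Postfix
{
    std::string d_in;
    std::size_t d_pos = 0;
    std::string d_out;

    public:
        explicit Postfix(std::string const &in) : d_in(in) {}

        // Returns the postfix form, items separated by single blanks.
        std::string convert()
        {
            expr();
            if (peek() != '\0')
                throw std::runtime_error("unexpected input");
            return d_out;
        }

    private:
        static bool isDigit(char ch)
        {
            return std::isdigit(static_cast<unsigned char>(ch)) != 0;
        }
        // Skips white space, returns the next character ('\0' at the end).
        char peek()
        {
            while (d_pos < d_in.size() &&
                   std::isspace(static_cast<unsigned char>(d_in[d_pos])))
                ++d_pos;
            return d_pos < d_in.size() ? d_in[d_pos] : '\0';
        }
        void emit(std::string const &item)
        {
            if (!d_out.empty())
                d_out += ' ';
            d_out += item;
        }
        void expr()
        {
            term();
            while (peek() == '+')
            {
                ++d_pos;
                term();
                emit("+");          // operator follows both operands
            }
        }
        void term()
        {
            factor();
            while (peek() == '*')
            {
                ++d_pos;
                factor();
                emit("*");
            }
        }
        void factor()
        {
            char ch = peek();
            if (ch == '-')          // unary minus: operand first, then neg
            {
                ++d_pos;
                factor();
                emit("neg");
            }
            else if (ch == '+')     // unary plus: nothing to emit
            {
                ++d_pos;
                factor();
            }
            else if (ch == '(')
            {
                ++d_pos;
                expr();
                if (peek() != ')')
                    throw std::runtime_error("missing ')'");
                ++d_pos;
            }
            else if (isDigit(ch))
            {
                assert(d_pos < d_in.size());    // peek() left us at a digit
                std::size_t begin = d_pos;
                while (d_pos < d_in.size() &&
                       (isDigit(d_in[d_pos]) || d_in[d_pos] == '.'))
                    ++d_pos;
                emit(d_in.substr(begin, d_pos - begin));
            }
            else
                throw std::runtime_error("operand expected");
        }
};
```

With these assumptions tt(Postfix("3 + 4 * 5").convert()) returns
tt(3 4 5 * +): each operator is emitted only after both of its operands have
been processed, which is exactly what a parser's actions accomplish when they
are executed once a grammar rule has been recognized.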
Our calculator will recognize a very basic set of operators:
multiplication, addition, parentheses, and the unary minus. We'll
distinguish real numbers from integers, to illustrate a subtlety in
bison-like grammar specifications. That's all. The purpose of this section is,
after all, to illustrate the construction of a bf(C++) program that uses both
a parser and a lexical scanner, rather than to construct a full-fledged
calculator.

In the coming sections we'll develop the grammar specification for
tt(bisonc++). Then, the regular expressions for the scanner are
specified. Following that, the final program is constructed.