mirror of
https://gitlab.com/fbb-git/cppannotations
The tt(class Scanner), derived as usual from the class ti(yyFlexLexer), is
generated by bf(flex)(1)hi(flex). The derived class has access to data
controlled by the lexical scanner. In particular, it has access to the
following data members:
hi(flex: protected data members)
itemization(
itht(flex: yytext)(char *yytext), containing the hi(matched text) text
matched by a i(regular expression). Clients may access this information using
the scanner's ti(YYText()) member;
itht(flex: yyleng)(int yyleng), the hi(matched text length) length of the
text in tt(yytext). Clients may access this value using the scanner's
ti(YYLeng()) member;
itht(flex: yylineno)(int yylineno), the current i(line number). This
variable is only maintained if tt(%option yylineno)
hi(flex: %option yylineno) is specified. Clients may access this value using
the scanner's ti(lineno()) member.
)
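The relation between the protected data members and their public accessors
can be illustrated without running bf(flex). The following is a minimal
sketch only: tt(MockLexer) is a hypothetical stand-in for tt(yyFlexLexer)
that models just the three members listed above, and tt(DemoScanner) and its
tt(match()) member are invented for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for yyFlexLexer: only the three protected data
// members discussed above, and their public accessors, are modeled.
class MockLexer
{
    protected:
        char *yytext = nullptr;     // matched text
        int yyleng = 0;             // length of the matched text
        int yylineno = 1;           // current line number

    public:
        char const *YYText() const { return yytext; }
        int YYLeng() const { return yyleng; }
        int lineno() const { return yylineno; }
};

// Hypothetical derived class, playing the role of Scanner.
class DemoScanner: public MockLexer
{
    std::string d_buffer;           // owns the text yytext points to

    public:
        DemoScanner()
        {
            yytext = &d_buffer[0];  // never leave yytext a null pointer
        }

        // Pretend that a rule has just matched `text'.
        void match(std::string const &text)
        {
            d_buffer = text;
            yytext = &d_buffer[0];
            yyleng = static_cast<int>(d_buffer.length());
            for (char ch: d_buffer)     // count the newlines, as
                if (ch == '\n')         // %option yylineno would
                    ++yylineno;
        }

        // The derived class may use the protected members directly.
        std::string matched() const
        {
            return std::string(yytext, static_cast<std::size_t>(yyleng));
        }
};
```

Code outside the class hierarchy can only call tt(YYText()), tt(YYLeng())
and tt(lineno()); members of the derived class can also read tt(yytext),
tt(yyleng) and tt(yylineno) themselves, as tt(matched()) does.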
Other members are available as well, but are used less often. Details can
be found in ti(FlexLexer.h).

Objects of the tt(class Scanner) perform two tasks:
itemization(
it() They push file information about the current file to a i(file stack);
it() They pop the last-pushed information from the stack once endOfFile()
is detected in a file.
)
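Reduced to their essence, the two tasks are a push and a pop on a stack. The
sketch below tracks file names only; the class tt(FileStack) and its members
are hypothetical and merely illustrate the idea. They are not part of
tt(Scanner).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: a stack of file names, nothing more.
class FileStack
{
    std::vector<std::string> d_names;

    public:
        // Task 1: called when an #include directive names a new file.
        void push(std::string const &name)
        {
            d_names.push_back(name);
        }

        // Task 2: called at the end of a file. Returns false if nothing
        // was pushed, true after dropping the last-pushed name.
        bool pop()
        {
            if (d_names.empty())
                return false;

            d_names.pop_back();
            return true;
        }

        std::size_t depth() const
        {
            return d_names.size();
        }

        // Precondition: depth() != 0
        std::string const &current() const
        {
            return d_names.back();
        }
};
```

A tt(false) return value from tt(pop()) tells the caller that there is no
file left to return to, so scanning can stop.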
Several member functions are used to accomplish these tasks. As they are
auxiliary to the scanner, they are i(private) members. In practice, such
private members are developed once the need for them arises. Note that, apart
from the private member functions, several private data members are defined
as well. Let's have a closer look at the implementation of the class
tt(Scanner):
itemization(
it() First, we have a look at the class's initial section, showing the
conditional inclusion of tt(FlexLexer.h), its tt(class) opening, and its
private data. At the top of the class interface the private struct
tt(FileInfo) is defined. tt(FileInfo) is used to store the names of, and
pointers to, open files. The struct has two constructors: one merely accepting
a filename, the other also expecting a tt(bool) argument indicating that the
file is already open and should not be handled by tt(FileInfo). The latter
constructor is used only once: as the initial stream is an already open file
there is no need to open it again, so tt(Scanner)'s constructor uses
this constructor to store the name of the initial file only. tt(Scanner)'s
public section starts off by defining the tt(enum Error), which provides
symbolic constants for the errors that may be detected:
verbinsert(HEAD)(concrete/lexer/scanner/scanner.h)
it() As they are objects, the class's data members are initialized
automatically by tt(Scanner)'s i(constructor). It activates the initial input
(and output) file and pushes the name of the initial input file, using the
second tt(FileInfo) constructor. Here is its implementation:
verbinclude(concrete/lexer/scanner/scanner.cc)
it() The scanning process proceeds as follows:
once the scanner extracts a filename from an tt(#include) directive, a
switch to another file is performed by tt(pushSource()). If the filename
could not be extracted, the scanner throws an tt(invalidInclude) i(exception)
value. The tt(pushSource()) member and the matching function tt(popSource())
handle file switching. Switching to another file proceeds as follows:
itemization(
it() First, the current depth of the tt(include)-nesting is inspected.
If tt(s_maxDepth) is reached, the stack is considered full, and the scanner
throws a tt(nestingTooDeep) exception.
it() Next, tt(throwOnCircularInclusion()) is called to avoid circular
inclusions when switching to new files. This function throws an exception if a
filename is included twice, which it detects using a simple literal name
check. Here is its implementation:
verbinclude(concrete/lexer/scanner/throwoncircular.cc)
it() Then the new filename is added to the tt(FileInfo) vector, at the
same time creating a new tt(ifstream) object. If this fails, the scanner
throws a tt(cantRead) exception.
it() Finally, a new ti(yy_buffer_state) is created for the newly
opened stream, and the lexical scanner is instructed to switch to that stream
using tt(yyFlexLexer)'s member function ti(yy_switch_to_buffer()).
)
Here is tt(pushSource())'s implementation:
verbinclude(concrete/lexer/scanner/pushsource.cc)
it() The class tt(yyFlexLexer) provides a series of member functions that
can be used to switch files. The file-switching capability of a
tt(yyFlexLexer) object is founded on the tt(struct yy_buffer_state),
containing the state of the emi(scan-buffer) of the currently read file. This
buffer is pushed on the tt(d_state) stack when an tt(#include) is
encountered. Then tt(yy_buffer_state)'s contents are replaced by the buffer
created for the file to be processed next. Note that in the tt(flex)
specification file the function tt(pushSource()) is called as
centt(pushSource(YY_CURRENT_BUFFER, YY_BUF_SIZE);)
ti(YY_CURRENT_BUFFER) and ti(YY_BUF_SIZE) are macros that are em(only)
available in the rules section of the lexer specification file, so they must
be passed as arguments to tt(pushSource()). Currently it is em(not) possible
to use these macros in the tt(Scanner) class's member functions directly.
it() Note that ti(yylineno) is not updated when a i(file switch) is
performed. If line numbers are to be monitored, then the current value of
tt(yylineno) should be pushed on a stack, and tt(yylineno) should be reset by
tt(pushSource()), whereas tt(popSource()) should reinstate a former value of
tt(yylineno) by popping a previously pushed value from the
stack. tt(Scanner)'s current implementation maintains a simple i(stack) of
ti(yy_buffer_state) pointers. Changing that into a stack of
tt(pair<yy_buffer_state *, size_t>) elements would allow us to save (and
restore) line numbers as well. This modification is left as an i(exercise) to
the reader.
it() The member function tt(popSource()) is called to pop the previously
pushed buffer from the stack, allowing the scanner to continue its scan
immediately beyond the just-processed tt(#include) directive. The member
tt(popSource()) first inspects the size of the tt(d_state) stack: if empty,
tt(false) is returned and the function terminates. If not empty, then the
current buffer is deleted, to be replaced by the state waiting on top of the
stack. The file switch is performed by the tt(yyFlexLexer) members
ti(yy_delete_buffer()) and tt(yy_switch_to_buffer()). Note that
tt(yy_delete_buffer()) does em(not) close the tt(ifstream) and does em(not)
delete the memory allocated for this stream by tt(pushSource()). Therefore
tt(delete) is called for the tt(ifstream) pointer stored at the back of
tt(d_fileInfo) to take care of both. Following this the last tt(FileInfo)
entry is removed from tt(d_fileInfo). Finally the function returns tt(true):
verbinclude(concrete/lexer/scanner/popsource.cc)
it() Two service members are offered: tt(stackTrace()) dumps the names of
the currently pushed files to the standard error stream. It may be called by
exception catchers. Here is its implementation:
verbinclude(concrete/lexer/scanner/stacktrace.cc)
it() tt(lastFile()) returns the name of the currently processed file. It
may be implemented inline:
verbinsert(LAST)(concrete/lexer/scanner/scanner.h)
it() The lexical scanner itself is defined in tt(Scanner::yylex()).
Therefore, tt(int yylex()) must be declared by the class tt(Scanner), as it
overrides tt(FlexLexer)'s virtual member tt(yylex()).
)
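The role of tt(FileInfo)'s two constructors can be sketched as follows. This
is an illustration under assumptions, not the struct from tt(scanner.h): the
member names tt(d_name) and tt(d_in) are invented, and ownership of the
tt(ifstream) is left to the caller, mirroring the description of
tt(popSource()) calling tt(delete).

```cpp
#include <fstream>
#include <string>

// Sketch of a FileInfo-like struct; the member names are assumptions.
struct FileInfo
{
    std::string d_name;
    std::ifstream *d_in;    // null if the stream is handled elsewhere

    // First constructor: opens the named file. The caller inspects *d_in
    // to find out whether opening succeeded, and deletes d_in once the
    // file has been processed.
    explicit FileInfo(std::string const &name)
    :
        d_name(name),
        d_in(new std::ifstream(name))
    {}

    // Second constructor: the file is already open, only its name is
    // stored. The bool argument merely selects this constructor.
    FileInfo(std::string const &name, bool)
    :
        d_name(name),
        d_in(nullptr)
    {}
};
```

With this layout, a failing tt(ifstream) is the situation in which the
scanner described above throws tt(cantRead).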
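The checks performed when switching files, and the line-number bookkeeping
left as an exercise above, can be tried out without bf(flex). The class
tt(IncludeStack) below is a hypothetical sketch: it stores file names and
line numbers instead of tt(yy_buffer_state) pointers, the limit of 10 for
tt(s_maxDepth) is assumed, and so is the enumeration value
tt(circularInclusion) (only tt(nestingTooDeep) is named in the text). As the
initial file is stored as well, tt(pop()) returns tt(false) when only that
file remains.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class IncludeStack
{
    public:
        // circularInclusion is an assumed name
        enum Error { nestingTooDeep, circularInclusion };

    private:
        static constexpr std::size_t s_maxDepth = 10;   // assumed limit

        // each file's name, paired with the line number to reinstate in
        // the including file once this file has been processed
        std::vector<std::pair<std::string, std::size_t>> d_files;
        std::size_t d_lineno = 1;

    public:
        explicit IncludeStack(std::string const &initial)
        {
            d_files.emplace_back(initial, 0);   // line number unused
        }

        void nextLine() { ++d_lineno; }
        std::size_t lineno() const { return d_lineno; }
        std::string const &lastFile() const { return d_files.back().first; }

        void push(std::string const &name)
        {
            if (d_files.size() == s_maxDepth)   // the stack is full
                throw nestingTooDeep;

            for (auto const &entry: d_files)    // literal name check
                if (entry.first == name)
                    throw circularInclusion;

            d_files.emplace_back(name, d_lineno);   // save the line number
            d_lineno = 1;                           // and reset it
        }

        bool pop()
        {
            if (d_files.size() == 1)    // only the initial file remains
                return false;

            d_lineno = d_files.back().second;   // reinstate the line number
            d_files.pop_back();
            return true;
        }
};
```

In a real scanner the pair's first element would be the
tt(yy_buffer_state) pointer, as suggested in the text, and tt(yylineno) would
take the place of tt(d_lineno).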