The tt(class Scanner), derived as usual from the class ti(yyFlexLexer), is generated by bf(flex)(1)hi(flex). The derived class has access to data controlled by the lexical scanner. In particular, it has access to the following data members:
hi(flex: protected data members)
    itemization(
    itht(flex: yytext)(char *yytext), containing the hi(matched text) text matched by a i(regular expression). Clients may access this information using the scanner's ti(YYText()) member;
    itht(flex: yyleng)(int yyleng), the hi(matched text length) length of the text in tt(yytext). Clients may access this value using the scanner's ti(YYLeng()) member;
    itht(flex: yylineno)(int yylineno), the current i(line number). This variable is only maintained if tt(%option yylineno) hi(flex: %option yylineno) is specified. Clients may access this value using the scanner's ti(lineno()) member.
    )
Other members are available as well, but are used less often. Details can be found in ti(FlexLexer.h).
Objects of the tt(class Scanner) perform two tasks:
    itemization(
    it() They push file information about the current file to a i(file stack);
    it() They pop the last-pushed information from the stack once endOfFile() is detected in a file.
    )
Several member functions are used to accomplish these tasks. As they are auxiliary to the scanner, they are i(private) members. In practice, such private members are developed once the need for them arises. Note that, apart from the private member functions, several private data members are defined as well. Let's have a closer look at the implementation of the class tt(Scanner):
    itemization(
    it() First, we have a look at the class's initial section, showing the conditional inclusion of tt(FlexLexer.h), its tt(class) opening, and its private data. At the top of the class interface the private struct tt(FileInfo) is defined. tt(FileInfo) stores the names of open files and the pointers to their streams.
The struct has two constructors: one merely accepting a filename, the other also expecting a tt(bool) argument indicating that the file is already open and should not be handled by tt(FileInfo). The latter constructor is used only once: as the initial stream is an already open file there is no need to open it again, and so tt(Scanner)'s constructor uses it to store only the name of the initial file. tt(Scanner)'s public section starts off with the tt(enum Error), which defines symbolic constants for the errors that may be detected:
        verbinsert(HEAD)(concrete/lexer/scanner/scanner.h)
    it() As they are objects, the class's data members are initialized automatically by tt(Scanner)'s i(constructor). It activates the initial input (and output) file and pushes the name of the initial input file, using the second tt(FileInfo) constructor. Here is its implementation:
        verbinclude(concrete/lexer/scanner/scanner.cc)
    it() The scanning process proceeds as follows: once the scanner extracts a filename from an tt(#include) directive, a switch to another file is performed by tt(pushSource()). If the filename could not be extracted, the scanner throws a tt(invalidInclude) i(exception) value. The tt(pushSource()) member and the matching function tt(popSource()) handle file switching. Switching to another file proceeds as follows:
        itemization(
        it() First, the current depth of the tt(include)-nesting is inspected. If tt(s_maxDepth) is reached, the stack is considered full, and the scanner throws a tt(nestingTooDeep) exception.
        it() Next, tt(throwOnCircularInclusion()) is called to avoid circular inclusions when switching to new files. This function throws an exception if the filename to include is already being processed, which it determines by a simple literal comparison of names. Here is its implementation:
            verbinclude(concrete/lexer/scanner/throwoncircular.cc)
        it() Then the new filename is added to the tt(FileInfo) vector, at the same time creating a new tt(ifstream) object.
If this fails, the scanner throws a tt(cantRead) exception.
        it() Finally, a new ti(yy_buffer_state) is created for the newly opened stream, and the lexical scanner is instructed to switch to that stream using tt(yyFlexLexer)'s member function ti(yy_switch_to_buffer()).
        )
    Here is tt(pushSource())'s implementation:
        verbinclude(concrete/lexer/scanner/pushsource.cc)
    it() The class tt(yyFlexLexer) provides a series of member functions that can be used to switch files. The file-switching capability of a tt(yyFlexLexer) object is based on the tt(struct yy_buffer_state), containing the state of the emi(scan-buffer) of the currently read file. This buffer is pushed on the tt(d_state) stack when an tt(#include) is encountered. Then tt(yy_buffer_state)'s contents are replaced by the buffer created for the file to be processed next. Note that in the tt(flex) specification file the function tt(pushSource()) is called as
        centt(pushSource(YY_CURRENT_BUFFER, YY_BUF_SIZE);)
    ti(YY_CURRENT_BUFFER) and ti(YY_BUF_SIZE) are macros that are em(only) available in the rules section of the lexer specification file, so they must be passed as arguments to tt(pushSource()). Currently it is em(not) possible to use these macros in the tt(Scanner) class's member functions directly.
    it() Note that ti(yylineno) is not updated when a i(file switch) is performed. If line numbers are to be monitored, then the current value of tt(yylineno) should be pushed on a stack and reset by tt(pushSource()), whereas tt(popSource()) should reinstate the former value of tt(yylineno) by popping the previously pushed value from the stack. tt(Scanner)'s current implementation maintains a simple i(stack) of ti(yy_buffer_state) pointers. Changing that into a stack of tt(pair) elements would allow us to save (and restore) line numbers as well. This modification is left as an i(exercise) to the reader.
    it() The member function tt(popSource()) is called to pop the previously pushed buffer from the stack, allowing the scanner to continue its scan immediately beyond the tt(#include) directive that was just processed. The member tt(popSource()) first inspects the size of the tt(d_state) stack: if empty, tt(false) is returned and the function terminates. If not empty, then the current buffer is deleted, to be replaced by the state waiting on top of the stack. The file switch is performed by the tt(yyFlexLexer) members ti(yy_delete_buffer()) and tt(yy_switch_to_buffer()). Note that tt(yy_delete_buffer()) neither closes the tt(ifstream) nor deletes the memory allocated for this stream by tt(pushSource()). Therefore tt(delete) is called for the tt(ifstream) pointer stored at the back of tt(d_fileInfo) to take care of both. Following this the last tt(FileInfo) entry is removed from tt(d_fileInfo). Finally the function returns tt(true):
        verbinclude(concrete/lexer/scanner/popsource.cc)
    it() Two service members are offered: tt(stackTrace()) dumps the names of the currently pushed files to the standard error stream. It may be called by exception catchers. Here is its implementation:
        verbinclude(concrete/lexer/scanner/stacktrace.cc)
    it() tt(lastFile()) returns the name of the currently processed file. It may be implemented inline:
        verbinsert(LAST)(concrete/lexer/scanner/scanner.h)
    it() The lexical scanner itself is defined in tt(Scanner::yylex()). Therefore, tt(int yylex()) must be declared by the class tt(Scanner), as it overrides tt(FlexLexer)'s virtual member tt(yylex()).
    )
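The bookkeeping described in this section (the depth check, the literal name check against circular inclusion, and the suggested stack of tt(pair) elements for saving and restoring line numbers) can be illustrated in isolation. The following is merely a sketch, not tt(Scanner)'s actual implementation: the class tt(SourceStack) and its members tt(push()), tt(pop()) and tt(newline()) are hypothetical names, no tt(flex) buffers or streams are involved, and tt(std::runtime_error) is thrown where tt(Scanner) would throw one of its tt(enum Error) values.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Scanner's file stack: only names, nesting depth
// and line numbers are tracked; no yy_buffer_state or ifstream is handled.
class SourceStack
{
    // a suspended file's name and the line number at which it was left
    using Entry = std::pair<std::string, int>;

    std::vector<Entry> d_suspended;     // files waiting to be resumed
    std::string d_current;              // the file currently being read
    int d_lineno = 1;                   // plays the role of yylineno
    std::size_t d_maxDepth;             // plays the role of s_maxDepth

    public:
        SourceStack(std::string const &initial, std::size_t maxDepth)
        :
            d_current(initial),
            d_maxDepth(maxDepth)
        {}

        void newline()                      { ++d_lineno; }
        int lineno() const                  { return d_lineno; }
        std::string const &lastFile() const { return d_current; }

        void push(std::string const &name)
        {
            // the stack is considered full once maxDepth files are suspended
            if (d_suspended.size() == d_maxDepth)
                throw std::runtime_error("nesting too deep");

            // simple literal name check against circular inclusion
            if (name == d_current)
                throw std::runtime_error("circular inclusion: " + name);
            for (Entry const &entry: d_suspended)
            {
                if (entry.first == name)
                    throw std::runtime_error("circular inclusion: " + name);
            }

            d_suspended.emplace_back(d_current, d_lineno);  // save the state
            d_current = name;
            d_lineno = 1;                   // the new file starts at line 1
        }

        bool pop()
        {
            if (d_suspended.empty())        // nothing to return to
                return false;

            d_current = d_suspended.back().first;   // reinstate the name
            d_lineno = d_suspended.back().second;   // and its line number
            d_suspended.pop_back();

            assert(d_lineno >= 1);          // line numbers never drop below 1
            return true;
        }
};
```

In this sketch tt(push()) saves the line number before resetting it, and tt(pop()) reinstates it, so a scan resumes counting where the suspended file left off. In a real tt(Scanner) the tt(pair) would hold a tt(yy_buffer_state) pointer rather than a name.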