Chapter 2: Some simple examples

2.1: A simple lexer file and main function

The following lexer file detects identifiers:

%%
[_a-zA-Z][_a-zA-Z0-9]* return 1;

The main() function below defines a local Scanner object and calls lex() until it returns 0, which happens when the end of the input stream is reached. By default, input is read from std::cin.

#include <iostream>
#include "scanner.h"

using namespace std;

int main()
{
	Scanner scanner;
	while (scanner.lex())
		cout << "[Identifier: " << scanner.match() << "]";

	return 0;
}

Each identifier on the input stream is replaced by itself plus some surrounding text. By default, flexc++ echoes all characters it cannot match to cout. If you do not want this, add a second pattern that matches everything else:

%%
[_a-zA-Z][_a-zA-Z0-9]*		return 1;
.|\n						// ignore

The second pattern causes flexc++ to ignore every character that is not part of an identifier. The first pattern still matches all identifiers, even those that consist of only one letter: when both patterns match the same amount of text, the pattern listed first is used. Everything else is ignored. The second pattern has no associated action, and that is precisely what happens in lex(): nothing. The stream is simply scanned for more characters.
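The difference between the two lexer files above can be mimicked by hand. The function below is only a sketch: annotate() and its echoUnmatched parameter are hypothetical names, it is not code generated by flexc++, and it assumes the main() loop shown earlier prints each identifier. With echoUnmatched set to true it imitates the default echoing of unmatched characters; with false it imitates the added `.|\n` pattern.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, NOT part of flexc++: builds the text that the
// main() loop above would write for the given input.
std::string annotate(std::string const &input, bool echoUnmatched)
{
    std::string out;
    std::size_t pos = 0;

    while (pos < input.size())
    {
        unsigned char ch = static_cast<unsigned char>(input[pos]);

        if (ch == '_' || std::isalpha(ch))      // [_a-zA-Z]
        {
            std::size_t end = pos + 1;          // extend with [_a-zA-Z0-9]*
            while (end < input.size() &&
                   (input[end] == '_' ||
                    std::isalnum(static_cast<unsigned char>(input[end]))))
                ++end;

            out += "[Identifier: " + input.substr(pos, end - pos) + "]";
            pos = end;
        }
        else
        {
            if (echoUnmatched)                  // default behavior: echo
                out += input[pos];
            ++pos;                              // otherwise: ignore
        }
    }
    return out;
}
```

For the input "a+b1" followed by a newline, the echoing variant yields "[Identifier: a]+[Identifier: b1]" plus the newline, whereas the ignoring variant yields only "[Identifier: a][Identifier: b1]".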

It is also possible to let the generated lexer do all the work. The simple lexer below will print out all identifiers itself.

%%
[_a-zA-Z][_a-zA-Z0-9]*		{
	std::cout << "[Identifier: " << match() << "]\n";
}

.|\n						// ignore

Note how a compound statement may be used instead of a single statement at the end of the line. The opening brace must appear on the same line as the pattern, however. Also note that members of Scanner can be used inside an action: match() returns the text that was last matched. Below is a main() function that can be used with the generated scanner.

#include "scanner.h"

int main()
{
	Scanner scanner;
	scanner.lex();

	return 0;
}

Note how simple the main function is. Scanner::lex() does not return until the entire input stream has been processed, because none of the patterns has an associated action with a return statement.
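Why a return statement in an action decides when lex() hands control back to its caller can be sketched with a hand-written stand-in. MiniScanner below is hypothetical and is not what flexc++ generates; its returnTokens flag and the printAll() helper are assumptions made purely for illustration. With the flag set, the identifier action behaves like `return 1;`; without it, the action prints and scanning continues inside the same lex() call.

```cpp
#include <cctype>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hand-written stand-in for a generated scanner; NOT flexc++ output.
class MiniScanner
{
    std::istream &d_in;
    std::ostream &d_out;
    std::string d_matched;
    bool d_returnTokens;    // true: identifier action returns 1; false: it prints

    static bool isIdStart(int ch) { return ch == '_' || std::isalpha(ch); }
    static bool isIdChar(int ch)  { return ch == '_' || std::isalnum(ch); }

public:
    MiniScanner(std::istream &in, std::ostream &out, bool returnTokens)
    : d_in(in), d_out(out), d_returnTokens(returnTokens)
    {}

    std::string const &match() const { return d_matched; }

    int lex()
    {
        int ch;
        while ((ch = d_in.get()) != std::istream::traits_type::eof())
        {
            if (!isIdStart(ch))
                continue;           // like `.|\n` without action: keep scanning

            d_matched.assign(1, static_cast<char>(ch));
            while (isIdChar(d_in.peek()))
                d_matched += static_cast<char>(d_in.get());

            if (d_returnTokens)
                return 1;           // action with return: back to the caller

            d_out << "[Identifier: " << d_matched << "]\n";  // no return: go on
        }
        return 0;                   // end of input
    }
};

// Usage sketch: in print mode a single lex() call consumes all input.
std::string printAll(std::string const &text)
{
    std::istringstream in(text);
    std::ostringstream out;
    MiniScanner scanner(in, out, false);
    scanner.lex();
    return out.str();
}
```

In print mode a single call processes everything, as in the main() function above. In return mode the caller has to loop until lex() returns 0, as in the first example of this section.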