flexc++

flexc++_0.5.1.tar.gz

2008-2010


flexc++(1)

flexc++_0.5.1.tar.gz flexc++ scanner generator

NAME

flexc++ - Generate a C++ scanner class and scanning function

SYNOPSIS

flexc++ [OPTIONS] [FILENAME]

DESCRIPTION

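Flexc++ generates a C++ scanner class and scanning function that perform pattern matching on text. The patterns and their associated actions are read from an input file whose format is described below.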

FORMAT OF THE INPUT FILE

The flexc++ input file consists of two (not three!) sections, ending at lines containing %%.


options and definitions
%%
rules
%%
    
The final section delimiter (%%) is optional.

Note in particular that flexc++ no longer supports (nor needs) a %header{ ... %} section.
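The two-section layout can be illustrated with a small C++ sketch that splits the text of an input file at its %% lines. This is not how flexc++ itself reads its input; the names SpecSections and splitSpec are hypothetical, and the sketch assumes that a delimiter line consists of exactly %% with no surrounding whitespace.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper type: the two sections of an input file.
struct SpecSections
{
    std::string definitions;    // options and definitions (before the first %%)
    std::string rules;          // rules (after the first %%)
};

// Splits 'text' at lines consisting of exactly "%%" (an assumption of this
// sketch). Anything following a second %% line is ignored.
SpecSections splitSpec(std::string const &text)
{
    SpecSections result;
    std::istringstream in(text);
    std::string line;
    int delimiters = 0;                 // number of %% lines seen so far

    while (std::getline(in, line))
    {
        if (line == "%%")
        {
            if (++delimiters == 2)      // optional final delimiter: stop
                break;
            continue;
        }
        (delimiters == 0 ? result.definitions : result.rules) += line + '\n';
    }

    assert(delimiters <= 2);            // the loop stops at the second %%
    return result;
}
```

Because the final delimiter is optional, the sketch returns the same rules section whether or not the text ends with a %% line.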

GENERATED FILES

Flexc++ may generate the following files:

OPTIONS

If available, single letter options are listed between parentheses following their associated long-option variants. Single letter options require arguments if their associated long options require arguments as well.

Options to set filenames

Skeleton options

Options to force overwriting

Output options

Miscellaneous options

Not (yet) implemented options

DIRECTIVES

The following directives can be used in the initial section of the grammar specification file. When command-line options for directives exist, they overrule the corresponding directives given in the grammar specification file.

Multiple options may be specified on the same line, like %option classheader="scanner.h" classname="Scanner". Each line with options begins with the %option directive. Examples show defaults unless indicating otherwise.

Not (yet) implemented

PUBLIC MEMBERS AND -TYPES

The following public members can be used by users of the scanner classes generated by flexc++. The Scanner:: prefixes are silently implied:

The following public types are available:

PROTECTED MEMBER FUNCTIONS

The following members can be used in actions and other member functions:

REGULAR EXPRESSIONS

The patterns in the input (see Rules Section of the flexc++ manual) are written using an extended set of regular expressions. They are summarized below. Characters x, y and z represent single characters, characters s and r represent regular expressions.

x
match the character x;

.
any character (byte) except newline;

[xyz]
a character class; in this case, the pattern matches either an x, a y, or a z;

[abj-oZ]
a character class containing a range; matches an a, a b, any letter from j through o, or a Z;

[^A-Z]
a negated character class, i.e., any character but those in the range following the caret. In this case, any character except an uppercase letter;

[^A-Z\n]
any character except an uppercase letter or a newline;

r*
zero or more rs (note that r represents a regular expression);

r+
one or more rs;

r?
zero or one rs (i.e., an optional r);

{name}
the expansion of the name definition provided in the definition section of the lexer specification file;

"[xyz]\"foo+"
the literal string: [xyz]"foo+;

\X
if X is a, b, f, n, r, t, or v, then the ANSI-C interpretation of \X. Otherwise, a literal X (used to escape regular expression operators like *, ? and |);

\0
a NUL character (ASCII code 0);

\123
the character with octal value 123;

\x2a
the character with hexadecimal value 2a;

(r)
matches r. Parentheses are used to override precedence (see below);

rs
concatenation: the regular expression r followed by the regular expression s;

r|s
either r or s;

r/s
trailing context: an r but only when followed by s. The text matched by s is included when determining whether this rule is the longest match, but is then returned to the input before executing its associated action. So the action only sees the text matched by r. Some combinations of r/s are incorrectly matched by flexc++ (cf. the flexc++ manual for details and examples);

^r
an r, but only at the beginning of a line (i.e., when just starting to scan, or right after a newline has been scanned);

r$
an r, but only at the end of a line (i.e., just before a newline). Equivalent to r/\n. Note that flexc++'s notion of `newline' is equal to your C++ compiler's interpretation of a \n character. E.g., on some DOS systems you must either filter the carriage returns (\r) from the input, or explicitly use r/\r\n instead of r$;

<s>r
an r, but only in start condition (mini scanner) s;

<s1,s2,s3>r
an r, but only in start condition (mini scanner) s1, s2 or s3;

<<EOF>>
an end-of-file condition;

<s1,s2><<EOF>>
an end-of-file when in start condition s1 or s2.

Note that within a character class specification all regular expression operators lose their special meaning except for the escape character (\) and the character class operators (-, ], and, at the beginning of the character class specification, ^).
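The trailing context and end-of-line patterns described above can be approximated outside flexc++ with a lookahead in std::regex. The sketch below is an illustration only: it uses ECMAScript regular expression syntax rather than flexc++ syntax, it does not reproduce flexc++'s longest-match rule, and the function names matchWithTrailingContext and stripCarriageReturns are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <regex>
#include <string>

// Returns the text matched by r at the start of 'input', but only if it is
// immediately followed by text matching s; returns an empty string otherwise.
// r and s use std::regex (ECMAScript) syntax, not flexc++ syntax.
std::string matchWithTrailingContext(std::string const &input,
                                     std::string const &r,
                                     std::string const &s)
{
    assert(!r.empty() && !s.empty());

    // (?=...) is a lookahead: s must match, but it is not consumed, so the
    // caller only sees the text matched by r.
    std::regex const pattern("(" + r + ")(?=" + s + ")");
    std::smatch match;

    if (std::regex_search(input, match, pattern,
                          std::regex_constants::match_continuous))
        return match[1].str();

    return "";
}

// Removes every carriage return, so "\r\n" line endings become "\n".
std::string stripCarriageReturns(std::string text)
{
    text.erase(std::remove(text.begin(), text.end(), '\r'), text.end());
    return text;
}
```

With these helpers, matching "end" with trailing context "\\n" (the analogue of r$) fails on the input "end\r\n". It succeeds either when the trailing context is written as "\\r\\n" (the analogue of r/\r\n) or when the carriage returns are filtered from the input first.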

PROTECTED ENUMS AND -TYPES

To do

PRIVATE MEMBER FUNCTIONS

To do

PROTECTED DATA MEMBERS

To do

TYPES AND VARIABLES IN THE ANONYMOUS NAMESPACE

To do

DIFFERENCES WITH FLEX(++)

In general, YY-symbols are no longer used. Specific changes are listed below:

OBSOLETE SYMBOLS

All DECLARATIONS and DEFINE symbols not listed above but defined in flex++ are obsolete with flexc++. In particular, there is no %header{ ... %} section anymore. Also, all DEFINE symbols related to member functions are now obsolete. There is no need for these symbols anymore as they can simply be declared in the class header file and defined elsewhere.

CODE BLOCKS

Flexc++ does not support code blocks, except for multi-line actions. Code previously placed in code blocks can now be placed in methods.
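As a rough illustration of moving such code into methods, the sketch below declares a helper as a member of a class and defines it separately, as would be done in a class header file and a source file. The class DemoScanner and its members are hypothetical and merely stand in for a scanner class; they are not part of the code generated by flexc++.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical class standing in for a scanner class.
// This part would live in the class header file.
class DemoScanner
{
    std::size_t d_nStrings = 0;     // state that a code block might otherwise hold

    public:
        // Helper that an action could call with the matched text.
        std::string unquote(std::string const &matched);
        std::size_t nStrings() const;
};

// These definitions would live elsewhere, e.g., in a separate source file.
std::string DemoScanner::unquote(std::string const &matched)
{
    assert(matched.size() >= 2);    // expects the surrounding quote characters
    ++d_nStrings;
    return matched.substr(1, matched.size() - 2);
}

std::size_t DemoScanner::nStrings() const
{
    return d_nStrings;
}
```

Calling unquote on the text "hi" including its double quotes returns hi without them, and nStrings then reports that one string has been handled.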

USER CODE

As with code blocks (see CODE BLOCKS above), flexc++ does not support a final section of the input file for user code.

EXAMPLE

To do

FILES

To do

SEE ALSO

flexc++(1),
http://www.flexcpp.org

BUGS

This is an experimental version of Flexc++, very much under development. There are still many open tickets and the reader is kindly requested to consult the list of open tickets at http://code.flexcpp.org/projects/flexcpp/ prior to filing a bug.

However, any bug filed against flexc++ is greatly appreciated and will receive the authors' undivided attention.

ABOUT FLEXC++

Flexc++ was based on flex++(1), derived from flex(1).

Flexc++ is a complete rewrite of a lexical scanner generator, closely following the theory of deterministic and non-deterministic automata as described in Aho, Sethi and Ullman's (1986) book Compilers (i.e., the Dragon book).

However, flex(1) variables, declarations that are obsolete and (of course!) C-like macros were removed from flexc++ and replaced by (member) functions. In particular, all primitive forms of name protection as used by flex++ were replaced by state-of-the-art name protection techniques, like class-embedding and using namespaces.

AUTHOR

Frank B. Brokken (f.b.brokken@rug.nl),
Jean-Paul van Oosten (jpoosten@ai.rug.nl),
Richard Berendsen (richardberendsen@xs4all.nl)