|Published (Last):||22 September 2013|
JFlex is a lexical analyser generator for Java, written in Java. A lexical analyser generator takes as input a specification with a set of regular expressions and corresponding actions. It generates a program (a lexer) that reads input, matches the input against the regular expressions in the spec file, and runs the corresponding action if a regular expression matched.
Lexers usually are the first front-end step in compilers, matching keywords, comments, operators, etc., and generating an input token stream for parsers. They can also be used for many other purposes. This manual gives a brief but complete description of the tool JFlex.
It assumes that you are familiar with the topic of lexical analysis in parsing. The references Aho, Sethi, and Ullman and Appel provide a good introduction. The next section of this manual describes installation procedures for JFlex.
Working with JFlex – an example runs through an example specification and explains how it works. The section on Lexical specifications presents all JFlex options and the complete specification syntax; Encodings, Platforms, and Unicode provides information about Unicode and scanning text vs.
A few words on performance gives tips on how to write fast scanners. The section on porting scanners shows how to port scanners from JLex, and from the lex and flex tools for C. The example is for site-wide installation. You need to be root for that. User installation works exactly the same way: just choose a directory where you have write permission. You can verify the integrity of the downloaded file with the SHA1 checksum available on the JFlex download page.
If you put the checksum file in the same directory as the archive, and run:. The input files and options are in both cases optional. This is mainly for JFlex maintenance and special low level customisations. Use only when you know what you are doing!
JFlex comes with a skeleton file in the src directory that reflects exactly the internal, pre-compiled skeleton and can be used with the -skel option. This feature is still in alpha status, and not fully implemented yet.
The plugin reads JFlex grammar definition files. The name and package of the generated Java source code are the ones defined in the grammar.
More information in the POM reference guide on plugins. JFlex can easily be integrated with the Ant build tool. Unless the target directory is specified with the destdir option, the generated class will be saved to the same directory where the grammar file resides. Like javac, the JFlex task creates subdirectories in destdir according to the generated class package. If not set, the files are written to the directory containing the grammar file.
To demonstrate what a lexical specification with JFlex looks like, this section presents a part of the specification for the Java language. The example does not describe the whole lexical structure of Java programs, but only a small and simplified part of it (some keywords, some operators, comments and only two kinds of literals).
The examples directory also contains a complete JFlex specification of the lexical structure of Java programs together with the CUP parser specification for Java by C. From this specification JFlex generates a. The class will have a constructor taking a java.io.Reader from which the input is read. Next to package and import statements there is usually not much to do here.
If the code ends with a javadoc class comment, the generated class will get this comment, if not, JFlex will generate one automatically. The second section options and declarations is more interesting. It consists of a set of options, code that is included inside the generated scanner class, lexical states and macro declarations. In our example the following options are used:. The Unicode version may be specified, e. If no version is specified, the most recent supported Unicode version will be used – in JFlex 1.
See also Encodings for more information on character sets, encodings, and scanning text vs. Here you can declare member variables and functions that are used inside scanner actions. The specification continues with macro declarations. Macros are abbreviations for regular expressions, used to make lexical specifications easier to read and understand. This regular expression may itself contain macro usages. Although this allows a grammar-like specification style, macros are still just abbreviations and not non-terminals — they cannot be recursive.
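Because macros are plain abbreviations, expanding them amounts to textual substitution, and a cycle shows up as a macro that is reached again while it is still being expanded. The following is a minimal sketch of that idea, not JFlex's own implementation. The class name `MacroExpansionDemo`, the `expand` helper and the macro definitions are hypothetical, and the definitions use `java.util.regex` syntax so that the result can be tried out directly.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of macro expansion with cycle detection; not JFlex code. */
final class MacroExpansionDemo {

    // A macro usage is written {Name}; quantifiers such as {2} do not match this pattern.
    private static final Pattern USAGE =
        Pattern.compile("\\{([A-Za-z_][A-Za-z_0-9]*)\\}");

    /** Expands the macro called name into a regular expression without macro usages. */
    static String expand(String name, Map<String, String> macros) {
        return expand(name, macros, new ArrayDeque<>());
    }

    private static String expand(String name, Map<String, String> macros,
                                 Deque<String> inProgress) {
        String definition = macros.get(name);
        if (definition == null) {
            throw new IllegalArgumentException("Undefined macro: " + name);
        }
        // Reaching a macro that is still being expanded means the definitions are cyclic.
        if (inProgress.contains(name)) {
            throw new IllegalStateException("Cycle in macro definition: " + name);
        }
        inProgress.push(name);
        StringBuilder out = new StringBuilder();
        Matcher m = USAGE.matcher(definition);
        int last = 0;
        while (m.find()) {
            out.append(definition, last, m.start());
            // Parenthesise the expansion so that operators around the usage still bind correctly.
            out.append('(').append(expand(m.group(1), macros, inProgress)).append(')');
            last = m.end();
        }
        out.append(definition, last, definition.length());
        inProgress.pop();
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> macros = new HashMap<>();
        macros.put("LineTerminator", "\\r|\\n|\\r\\n");
        macros.put("InputCharacter", "[^\\r\\n]");
        macros.put("EndOfLineComment", "//{InputCharacter}*{LineTerminator}?");
        System.out.println(expand("EndOfLineComment", macros));
    }
}
```

Wrapping each expansion in parentheses is a design choice of this sketch: it keeps an alternation inside a macro from leaking into the surrounding expression.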
Cycles in macro definitions are detected and reported at generation time by JFlex. This is not the only, but one of the simpler expressions matching non-nesting Java comments. See the macros DocumentationComment and CommentContent for an alternative. Identifier matches each string that starts with a character of class jletter followed by zero or more characters of class jletterdigit. The last part of the second section in our lexical specification is a lexical state declaration: The lexical rules section of a JFlex specification contains regular expressions and actions (Java code) that are executed when the scanner matches the associated regular expression.
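The Identifier macro described above can be mirrored in plain Java. This sketch assumes that the classes jletter and jletterdigit correspond to `Character.isJavaIdentifierStart` and `Character.isJavaIdentifierPart`; the class and method names are hypothetical and only illustrate the shape of the rule.

```java
/** Hypothetical hand-written equivalent of the Identifier macro; not generated by JFlex. */
final class IdentifierDemo {

    /** True if s is one jletter character followed by zero or more jletterdigit characters. */
    static boolean isIdentifier(String s) {
        // Assumption: jletter corresponds to Character.isJavaIdentifierStart.
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            // Assumption: jletterdigit corresponds to Character.isJavaIdentifierPart.
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("count1")); // prints "true"
        System.out.println(isIdentifier("1count")); // prints "false"
    }
}
```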
As the scanner reads its input, it keeps track of all regular expressions and activates the action of the expression that has the longest match. If two regular expressions both have the longest match for a certain input, the scanner chooses the action of the expression that appears first in the specification. In that way, we get for input break the keyword break and not an Identifier break. In addition to regular expression matches, one can use lexical states to refine a specification.
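The longest-match rule and the first-rule tie-break can be illustrated with a small hand-written scanner. This is only a sketch: it tries each rule in turn with `java.util.regex`, which is not how a generated scanner works internally, and the rule names and patterns are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical demo of longest-match scanning; not code generated by JFlex. */
final class LongestMatchDemo {

    // Rule names and patterns, in specification order: the keyword comes before the identifier.
    private static final String[] NAMES = {"BREAK", "IDENTIFIER", "WHITESPACE"};
    private static final Pattern[] PATTERNS = {
        Pattern.compile("break"),
        Pattern.compile("[A-Za-z_][A-Za-z_0-9]*"),
        Pattern.compile("\\s+")
    };

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int best = -1;
            int bestEnd = pos;
            for (int i = 0; i < PATTERNS.length; i++) {
                Matcher m = PATTERNS[i].matcher(input);
                m.region(pos, input.length());
                // lookingAt() anchors the match at the start of the region.
                // The comparison is strict, so on a tie the earlier rule is kept.
                if (m.lookingAt() && m.end() > bestEnd) {
                    best = i;
                    bestEnd = m.end();
                }
            }
            if (best < 0) {
                throw new IllegalArgumentException("Illegal character at offset " + pos);
            }
            if (!NAMES[best].equals("WHITESPACE")) {
                tokens.add(NAMES[best] + "(" + input.substring(pos, bestEnd) + ")");
            }
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints "[BREAK(break), IDENTIFIER(breakfast)]"
        System.out.println(scan("break breakfast"));
    }
}
```

For the input break both rules match five characters, so the earlier keyword rule wins; for breakfast the identifier rule has the longer match and is chosen regardless of its position.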
A lexical state acts like a start condition. A start condition of a regular expression can contain more than one lexical state. It is then matched when the lexer is in any of these lexical states. If a regular expression has no start conditions it is matched in all lexical states. When the string abstract is matched, the scanner function returns the CUP symbol sym. If an action does not return a value, the scanning process is resumed immediately after executing the action. Because we do not yet return a value to the parser, our scanner proceeds immediately.
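The effect of lexical states can be sketched by hand: which rules apply depends on the current state, and an action that returns nothing simply lets scanning continue. The sketch below assumes two states named YYINITIAL and STRING and a `StringBuilder` that collects a string literal; the class, the token format and the error handling are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hand-written scanner with two lexical states; not generated by JFlex. */
final class StateDemo {

    enum State { YYINITIAL, STRING }

    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder string = new StringBuilder(); // content of the literal parsed so far
        State state = State.YYINITIAL;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (state) {
                case YYINITIAL:
                    if (c == '"') {
                        // Opening quote: switch state, return nothing, keep scanning.
                        string.setLength(0);
                        state = State.STRING;
                    } else if (!Character.isWhitespace(c)) {
                        // Error fallback: any character no other rule accepts.
                        throw new IllegalArgumentException("Illegal character <" + c + ">");
                    }
                    break;
                case STRING:
                    if (c == '"') {
                        // Closing quote: only now is a token handed back.
                        tokens.add("STRING_LITERAL(" + string + ")");
                        state = State.YYINITIAL;
                    } else {
                        string.append(c); // append the matched text and continue
                    }
                    break;
            }
        }
        if (state == State.STRING) {
            throw new IllegalArgumentException("Unterminated string literal");
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints "[STRING_LITERAL(ab), STRING_LITERAL(c d)]"
        System.out.println(scan("\"ab\" \"c d\""));
    }
}
```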
The matched region of the input is referred to by yytext and appended to the content of the string literal parsed so far. The last lexical rule in the example specification is used as an error fallback.
It matches any character in any state that has not been matched by another rule. If you have written your specification file (or chosen one from the examples directory), save it, say under the name java-lang.
JFlex should then show progress messages about generating the scanner and write the generated code to the directory of your specification file. If you use CUP, generate your parser classes first. The first part contains user code that is copied verbatim to the beginning of the generated source file before the scanner class declaration. As shown in the example spec, this is the place to put package declarations and import statements.
It is possible, but not considered good Java style to put helper classes, such as token classes, into this section; they are usually better declared in their own. The second part of the lexical specification contains options and directives to customise the generated lexer, declarations of lexical states and macro definitions. Directives that have one or more parameters are described as follows.
Tells JFlex to give the generated class the name classname and to write the generated code to a file classname. Makes the generated lexer class implement the specified interfaces. Makes the generated class a subclass of the class classname. Makes all generated methods and fields of the class private. All occurrences of public (one space character before and after public) in the skeleton file are replaced by private, even if a user-specified skeleton is used.
Access to the generated class is expected to be mediated by user class code (see next switch). Here you can define your own member variables and functions in the generated scanner. If more than one initialiser option is present, the code is concatenated in order of appearance in the specification.
Causes the specified exceptions to be declared in the throws clause of the constructor. Adds the specified argument to the constructors of the generated scanner. If more than one such directive is present, the arguments are added in order of occurrence in the specification.
JFlex will warn in this case and generate an additional default constructor without these parameters and without user init code which might potentially refer to the parameters.
Causes the generated scanner to throw an instance of the specified exception in case of an internal error default is java. Note that this exception is only for internal scanner errors.
With usual specifications it should never occur i. Set the initial size of the scan buffer to the specified value (decimal, in bytes). The default value is This section shows how the scanning method can be customised.
You can redefine the name and return type of the method and it is possible to declare exceptions that may be thrown in one of the actions of the specification. If no return type is specified, the scanning method will be declared as returning values of class Yytoken. Causes the scanning method to get the specified name. It is of course possible to provide a dummy implementation of that method in the class code section if you still want to override the function name.
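How calling code might drive a scanning method can be sketched with a hand-written stand-in. Everything here is an assumption made for illustration: the method name `yylex`, the convention of returning null at end of input, the fields of `Yytoken` and the `WordScanner` class itself, which is not what JFlex generates.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical user-provided token class, used as the scanning method's return type. */
final class Yytoken {
    final String text;
    Yytoken(String text) { this.text = text; }
}

/** Hand-written stand-in for a scanner class; generated code looks different. */
final class WordScanner {
    private final Reader in;

    WordScanner(Reader in) { this.in = in; }

    /** Returns the next whitespace-separated word, or null at end of input (assumed convention). */
    Yytoken yylex() throws IOException {
        StringBuilder word = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (word.length() > 0) break; // end of the current word
            } else {
                word.append((char) c);
            }
        }
        return word.length() == 0 ? null : new Yytoken(word.toString());
    }
}

final class ScanLoopDemo {
    /** Calls the scanning method until it signals end of input. */
    static List<String> readAll(String input) {
        WordScanner scanner = new WordScanner(new StringReader(input));
        List<String> words = new ArrayList<>();
        try {
            for (Yytoken t = scanner.yylex(); t != null; t = scanner.yylex()) {
                words.add(t.text);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(readAll("a bc  d")); // prints "[a, bc, d]"
    }
}
```

If the scanning method is given another name or return type, only the call in the loop and the type of the loop variable would change.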