Organization of the Parser
The compiler compiler used for the syntax checker
is SableCC, developed at McGill University, Canada. SableCC
generates Java code for the parser. It makes a clean separation
between the grammar and the code that does semantic analysis. Conventional compiler
compilers use translation schemes for semantic analysis, whereas SableCC generates
a strictly typed Abstract Syntax Tree (AST) on the basis of the
lexical and syntactic analysis alone. Semantic analysis is then done by traversing
the AST. This section gives a brief overview of the output that SableCC produces
when it is run on a grammar file, including the structure of the AST and
the tree traversal classes it generates.
SableCC generates a strictly typed Abstract Syntax
Tree from the grammar present in the input file. It generates an abstract
class representing each non-terminal in the grammar, and this abstract
class is extended by one or more concrete classes, each representing an
alternative rule in the production for that particular non-terminal. Each
such concrete class has references to objects for classes that represent
the non-terminals and terminals occurring in that particular grammar alternative,
as well as accessor functions for obtaining these references.
For example, let the following rules be part of some
dummy grammar (written in the SableCC grammar specification format):
expression =
{additive} additive_expression |
{multiplicative} multiplicative_expression;
additive_expression =
{integer} [first]:integer t_plus [second]:integer |
{real} [first]:real t_plus [second]:real;
multiplicative_expression =
[first]:real t_mult [second]:real;
Here, t_plus, t_mult, integer and real are tokens.
For these rules SableCC will generate the token classes TTPlus, TTMult,
TInteger and TReal (the first letter is capitalized, underscores are omitted
and the letter immediately following each underscore is capitalized). The abstract
classes generated to represent the non-terminals will be PExpression, PAdditiveExpression
and PMultiplicativeExpression. PExpression will be extended by the concrete
classes AAdditiveExpression and AMultiplicativeExpression, PAdditiveExpression
by AIntegerAdditiveExpression and ARealAdditiveExpression, and PMultiplicativeExpression
by AMultiplicativeExpression (since the only alternative does not require
a name). The concrete classes will have accessor methods to get the objects
representing the entities on the right-hand side of the grammar rule. For
example, AIntegerAdditiveExpression will have methods called getFirstInteger(),
getTPlus() and getSecondInteger().
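The class layout described above can be pictured with a small hand-written sketch. The classes below are stand-ins written for illustration only, not the code SableCC emits; the base classes Node and Token, the getText() method, the getAdditiveExpression() accessor and the AstSketch helper are assumptions of this sketch, while the accessor names of AIntegerAdditiveExpression are taken from the text above.

```java
// Hand-written stand-ins that mirror the naming scheme described above.
// They are NOT the classes SableCC generates; the real ones carry more machinery.
abstract class Node {}

abstract class Token extends Node {
    private final String text;
    Token(String text) { this.text = text; }
    String getText() { return text; }
}

// Token classes for the tokens "integer" and "t_plus".
final class TInteger extends Token {
    TInteger(String text) { super(text); }
}

final class TTPlus extends Token {
    TTPlus() { super("+"); }
}

// One abstract class per non-terminal.
abstract class PExpression extends Node {}
abstract class PAdditiveExpression extends Node {}

// One concrete class per alternative, holding typed references to its children.
final class AIntegerAdditiveExpression extends PAdditiveExpression {
    private final TInteger first;
    private final TTPlus plus;
    private final TInteger second;

    AIntegerAdditiveExpression(TInteger first, TTPlus plus, TInteger second) {
        this.first = first;
        this.plus = plus;
        this.second = second;
    }

    TInteger getFirstInteger() { return first; }
    TTPlus getTPlus() { return plus; }
    TInteger getSecondInteger() { return second; }
}

final class AAdditiveExpression extends PExpression {
    private final PAdditiveExpression additiveExpression;

    AAdditiveExpression(PAdditiveExpression additiveExpression) {
        this.additiveExpression = additiveExpression;
    }

    // Hypothetical accessor name, chosen for this sketch.
    PAdditiveExpression getAdditiveExpression() { return additiveExpression; }
}

class AstSketch {
    // Walks the typed tree by hand; only the integer alternative is modelled here.
    static String describe(PExpression expression) {
        if (expression instanceof AAdditiveExpression) {
            PAdditiveExpression inner =
                ((AAdditiveExpression) expression).getAdditiveExpression();
            if (inner instanceof AIntegerAdditiveExpression) {
                AIntegerAdditiveExpression sum = (AIntegerAdditiveExpression) inner;
                return sum.getFirstInteger().getText() + " "
                    + sum.getTPlus().getText() + " "
                    + sum.getSecondInteger().getText();
            }
        }
        return "unsupported";
    }

    public static void main(String[] args) {
        PExpression e = new AAdditiveExpression(
            new AIntegerAdditiveExpression(new TInteger("1"), new TTPlus(), new TInteger("2")));
        System.out.println(describe(e)); // prints "1 + 2"
    }
}
```

Because every child reference has a specific type, a mistake such as asking an additive expression for a multiplication token is caught by the Java compiler rather than at run time.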
SableCC also generates traversal classes for depth-first
traversal of the tree, based on the "Visitor" pattern. These files
are kept in lglepfl\ocl\parser\analysis. The classes for checking the AST
or extracting information from it that are used throughout this project
are extensions of AnalysisAdapter or DepthFirstAdapter. DepthFirstAdapter
has "in", "out" and "case" methods. A "case" method (for example, caseAAdditiveExpression
for the above example) first calls the "in" method for the node ("inAAdditiveExpression"),
then calls the "case" methods for each of the children (thus effecting
a recursive descent) and finally calls the "out" method for that particular
node ("outAAdditiveExpression"). Any specific functionality at a node
(such as storing the type of the element at that
node in a symbol table) can be implemented by overriding any of these
methods. However, when overriding a "case" method, care has to be
taken to apply the traversal class to each of the children of the node,
to ensure that the depth-first traversal proceeds to the sub-tree below
the node whose "case" method is overridden.
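The in/case/out scheme can be illustrated with a miniature, hand-written traversal. Everything in the sketch below (Exp, ANumExp, ASumExp, DepthFirstSketch, TraceVisitor, SumEvaluator) is hypothetical and only imitates the pattern; the actual DepthFirstAdapter is generated by SableCC from the grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written miniature of the traversal scheme; not SableCC output.
abstract class Exp {
    abstract void apply(DepthFirstSketch visitor);
}

final class ANumExp extends Exp {
    final int value;
    ANumExp(int value) { this.value = value; }
    @Override void apply(DepthFirstSketch visitor) { visitor.caseANumExp(this); }
}

final class ASumExp extends Exp {
    final Exp left;
    final Exp right;
    ASumExp(Exp left, Exp right) { this.left = left; this.right = right; }
    @Override void apply(DepthFirstSketch visitor) { visitor.caseASumExp(this); }
}

// Plays the role of DepthFirstAdapter for this two-node-type tree.
class DepthFirstSketch {
    void inASumExp(ASumExp node) {}
    void outASumExp(ASumExp node) {}
    void caseANumExp(ANumExp node) {}

    // "in", then the children, then "out".
    void caseASumExp(ASumExp node) {
        inASumExp(node);
        node.left.apply(this);
        node.right.apply(this);
        outASumExp(node);
    }
}

// Overrides only "in"/"out" and the leaf case: the default descent is kept.
class TraceVisitor extends DepthFirstSketch {
    final List<String> trace = new ArrayList<>();
    @Override void inASumExp(ASumExp node) { trace.add("in"); }
    @Override void outASumExp(ASumExp node) { trace.add("out"); }
    @Override void caseANumExp(ANumExp node) { trace.add(String.valueOf(node.value)); }
}

// Overrides a "case" method, so it must apply itself to the children explicitly;
// leaving out the two apply calls would stop the traversal at this node.
class SumEvaluator extends DepthFirstSketch {
    int result;
    @Override void caseANumExp(ANumExp node) { result = node.value; }
    @Override void caseASumExp(ASumExp node) {
        node.left.apply(this);
        int leftValue = result;
        node.right.apply(this);
        result = leftValue + result;
    }
}
```

For the tree 1 + (2 + 3), TraceVisitor records in, 1, in, 2, 3, out, out, and SumEvaluator ends with a result of 6.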
The lexer class is present in the package lglepfl\ocl\parser\lexer.
The file generated by SableCC is not modified in any way for the purpose
of this tool.
The parser classes are present in lglepfl\ocl\parser\parser.
The default parser generated by SableCC is extended by the class lglepfl.ocl.parser.OclParser
(note that as already mentioned in the section on Directory Structure,
this file is not affected when SableCC is run afresh to generate a new
parser). The purpose of this class is to fix the AST generated by SableCC so
that it conforms to the OCL grammar. The fix is needed because the
following grammar rule of the standard OCL grammar is not LALR(1) parsable:
feature_call_parameters=
l_par declarator? actual_parameter_list? r_par;
It is therefore replaced by the following rules,
which generate a concrete syntax tree:
feature_call_parameters=
{empty} l_par r_par |
{concrete} l_par expression fcp_helper* r_par;
fcp_helper=
{comma} comma expression |
{colon} colon simple_type_specifier |
{iterate} semicolon name colon simple_type_specifier equal expression |
{bar} bar expression;
However, before moving on to further stages like
type-checking, it is necessary to convert the tree produced by the above modified
rules into the tree that would have been obtained from the original grammar
rules. This is done by the class OclParser, which uses an internal class
called ASTFix, an extension of AnalysisAdapter. OclParser also
checks that the AST conforms to the standard OCL rules if the standard
OCL option is specified. This is done through the internal class ASTCheck.
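One way to picture the conversion is as a split of the flat fcp_helper list at the bar helper. The sketch below is not the ASTFix class, whose internals are not described here. It assumes, from the shape of the original rule, that everything before a bar helper belongs to the declarator and everything from the bar onwards to the actual parameter list. The names FcpSplitSketch, Kind, declaratorPart and parameterPart are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: models each fcp_helper by the name of its alternative.
class FcpSplitSketch {
    enum Kind { COMMA, COLON, ITERATE, BAR }

    // Helpers that would belong to the declarator: those before the first BAR.
    // Assumption: without a BAR there is no declarator at all.
    static List<Kind> declaratorPart(List<Kind> helpers) {
        int bar = helpers.indexOf(Kind.BAR);
        if (bar < 0) {
            return new ArrayList<Kind>();
        }
        return new ArrayList<Kind>(helpers.subList(0, bar));
    }

    // Helpers that would belong to the actual parameter list: the BAR helper
    // (which carries the first parameter expression) and everything after it,
    // or the whole list when no BAR is present.
    static List<Kind> parameterPart(List<Kind> helpers) {
        int bar = helpers.indexOf(Kind.BAR);
        int start = (bar < 0) ? 0 : bar;
        return new ArrayList<Kind>(helpers.subList(start, helpers.size()));
    }
}
```

Under the same assumption, the leading expression of the {concrete} alternative is the first declared name when a bar helper is present and the first actual parameter otherwise.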