Organization of the Parser

The compiler compiler used for the syntax checker is SableCC, developed at McGill University, Canada. SableCC generates Java code for the parser and makes a clean separation between the grammar and the code that does semantic analysis. Conventional compiler compilers embed semantic actions in the grammar as translation schemes. SableCC instead generates a strictly typed Abstract Syntax Tree (AST) on the basis of the lexical and syntactic analysis alone, and semantic analysis is done by traversing the AST. This section gives a brief overview of what SableCC produces when it is run on a grammar file, including the structure of the AST and the tree traversal classes it generates.

SableCC generates a strictly typed Abstract Syntax Tree from the grammar in the input file. It generates an abstract class for each non-terminal in the grammar, and this abstract class is extended by one or more concrete classes, each representing one alternative in the production for that non-terminal. Each concrete class holds references to objects of the classes that represent the non-terminals and terminals occurring in that alternative, together with accessor methods for obtaining these references.

For example, let the following rules be part of some dummy grammar (written in the SableCC grammar specification format):

expression =

{additive} additive_expression |

{multiplicative} multiplicative_expression;

additive_expression =

{integer} [first]:integer t_plus [second]:integer |

{real} [first]:real t_plus [second]:real;

multiplicative_expression =

[first]:real t_mult [second]:real;

Here, t_plus, t_mult, integer and real are tokens. For these rules SableCC will generate the token classes TTPlus, TTMult, TInteger and TReal (the name is prefixed with T, the first letter is capitalized, underscores are omitted and the letter immediately following an underscore is capitalized). The abstract classes generated to represent the non-terminals will be PExpression, PAdditiveExpression and PMultiplicativeExpression. PExpression will be extended by the concrete classes AAdditiveExpression and AMultiplicativeExpression, PAdditiveExpression by AIntegerAdditiveExpression and ARealAdditiveExpression, and PMultiplicativeExpression by AMultiplicativeExpression (since the only alternative does not require a name). The concrete classes will have accessor methods to get the objects representing the entities on the right-hand side of the grammar rule. For example, AIntegerAdditiveExpression will have methods called getFirst(), getTPlus() and getSecond(); an element that is given a name in square brackets is accessed under that name.
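To make the class layout concrete, the following is a minimal hand-written sketch of what the classes for the {integer} alternative of additive_expression could look like. It is not SableCC output: the wrapper class AstSketch and the method demo() are hypothetical, the real generated classes carry more machinery (parent links, setters, apply() and clone() methods), and the accessor names getFirst() and getSecond() assume that accessors are named after the bracketed element names.

```java
final class AstSketch {
    // Common base of all token classes; keeps the matched text.
    static abstract class Token {
        private final String text;
        Token(String text) { this.text = text; }
        String getText() { return text; }
    }

    static final class TInteger extends Token {
        TInteger(String text) { super(text); }
    }

    static final class TTPlus extends Token {
        TTPlus() { super("+"); }
    }

    // Abstract class for the non-terminal additive_expression.
    static abstract class PAdditiveExpression {}

    // Concrete class for the alternative
    //   {integer} [first]:integer t_plus [second]:integer
    static final class AIntegerAdditiveExpression extends PAdditiveExpression {
        private final TInteger first;
        private final TTPlus tPlus;
        private final TInteger second;

        AIntegerAdditiveExpression(TInteger first, TTPlus tPlus, TInteger second) {
            this.first = first;
            this.tPlus = tPlus;
            this.second = second;
        }

        // Assumed naming: one accessor per element on the right-hand side.
        TInteger getFirst() { return first; }
        TTPlus getTPlus() { return tPlus; }
        TInteger getSecond() { return second; }
    }

    // Hypothetical helper: builds the tree for "1 + 2" by hand and reads it
    // back through the typed accessors.
    static String demo() {
        PAdditiveExpression e = new AIntegerAdditiveExpression(
                new TInteger("1"), new TTPlus(), new TInteger("2"));
        AIntegerAdditiveExpression sum = (AIntegerAdditiveExpression) e;
        return sum.getFirst().getText() + " "
                + sum.getTPlus().getText() + " "
                + sum.getSecond().getText();
    }
}
```

Because every alternative has its own class with typed accessors, code that walks the tree needs no casts beyond selecting the alternative, which is what "strictly typed" refers to here.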

SableCC also generates traversal classes for depth-first traversal of the tree, based on the "Visitor" pattern. These files are kept in lglepfl\ocl\parser\analysis. The classes used throughout this project for checking the AST or extracting information from it are extensions of AnalysisAdapter or DepthFirstAdapter. DepthFirstAdapter has "in", "out" and "case" methods. A "case" method (for example, caseAAdditiveExpression for the above example) first calls the "in" method for the node ("inAAdditiveExpression"), then calls the "case" methods for each of the children (thus effecting a recursive descent) and finally calls the "out" method for that node ("outAAdditiveExpression"). Any specific functionality at a node (such as storing the type of the element at that node in a symbol table) can be implemented by overriding any of these methods. However, when overriding a "case" method, care has to be taken to apply the traversal class to each of the children of the node, so that the depth-first traversal proceeds into the sub-tree below the node whose "case" method is overridden.
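The in/case/out protocol can be illustrated with a small self-contained sketch. The classes below are simplified stand-ins written for this example, not the generated lglepfl classes: TraversalSketch, Tracer and Summer are hypothetical names, and the node classes expose plain fields instead of the generated accessors. Tracer overrides only "in", "out" and a token "case" method, so the default traversal still runs. Summer overrides a "case" method and therefore has to apply itself to the children explicitly.

```java
import java.util.ArrayList;
import java.util.List;

final class TraversalSketch {
    // Simplified stand-in for a generated node: it only accepts a visitor.
    interface Node { void apply(DepthFirstAdapter visitor); }

    static final class TInteger implements Node {
        final String text;
        TInteger(String text) { this.text = text; }
        public void apply(DepthFirstAdapter visitor) { visitor.caseTInteger(this); }
    }

    static final class AIntegerAdditiveExpression implements Node {
        final TInteger first;
        final TInteger second;
        AIntegerAdditiveExpression(TInteger first, TInteger second) {
            this.first = first;
            this.second = second;
        }
        public void apply(DepthFirstAdapter visitor) {
            visitor.caseAIntegerAdditiveExpression(this);
        }
    }

    // Simplified stand-in for the generated adapter: "case" = in, children, out.
    static class DepthFirstAdapter {
        void inAIntegerAdditiveExpression(AIntegerAdditiveExpression node) {}
        void outAIntegerAdditiveExpression(AIntegerAdditiveExpression node) {}
        void caseAIntegerAdditiveExpression(AIntegerAdditiveExpression node) {
            inAIntegerAdditiveExpression(node);
            node.first.apply(this);
            node.second.apply(this);
            outAIntegerAdditiveExpression(node);
        }
        void caseTInteger(TInteger node) {}
    }

    // Overrides "in"/"out" only, so the inherited "case" still visits the children.
    static final class Tracer extends DepthFirstAdapter {
        final List<String> log = new ArrayList<>();
        @Override void inAIntegerAdditiveExpression(AIntegerAdditiveExpression node) { log.add("in"); }
        @Override void outAIntegerAdditiveExpression(AIntegerAdditiveExpression node) { log.add("out"); }
        @Override void caseTInteger(TInteger node) { log.add(node.text); }
    }

    // Overrides "case": without the two apply() calls the children would be skipped.
    static final class Summer extends DepthFirstAdapter {
        int sum;
        @Override void caseAIntegerAdditiveExpression(AIntegerAdditiveExpression node) {
            node.first.apply(this);
            node.second.apply(this);
        }
        @Override void caseTInteger(TInteger node) { sum += Integer.parseInt(node.text); }
    }

    static List<String> trace() {
        Tracer tracer = new Tracer();
        new AIntegerAdditiveExpression(new TInteger("1"), new TInteger("2")).apply(tracer);
        return tracer.log;
    }

    static int sum() {
        Summer summer = new Summer();
        new AIntegerAdditiveExpression(new TInteger("1"), new TInteger("2")).apply(summer);
        return summer.sum;
    }
}
```

If the two apply() calls in Summer were removed, caseTInteger would never be reached and the sum would stay at zero, which is the pitfall described above.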

The lexer class is present in the package lglepfl\ocl\parser\lexer. The file generated by SableCC is not modified in any way for the purpose of this tool.

The parser classes are present in lglepfl\ocl\parser\parser. The default parser generated by SableCC is extended by the class lglepfl.ocl.parser.OclParser (note that as already mentioned in the section on Directory Structure, this file is not affected when SableCC is run afresh to generate a new parser). The purpose of this is to fix the AST generated by SableCC, to make it conformant to the OCL grammar. This problem arises because the following grammar rule of the standard OCL grammar is not LALR(1) parsable:

feature_call_parameters=

l_par declarator? actual_parameter_list? r_par;

It is therefore replaced by the following rules, which generate a concrete syntax tree:

feature_call_parameters=

{empty} l_par r_par |

{concrete} l_par expression fcp_helper* r_par;

fcp_helper=

{comma} comma expression |

{colon} colon simple_type_specifier |

{iterate} semicolon name colon simple_type_specifier equal expression |

{bar} bar expression;

However, before moving on to further stages such as type checking, the tree produced by the modified rules above must be converted to the tree that would have been obtained from the original grammar rules. This is done by the class OclParser, which uses an internal class called ASTFix, an extension of AnalysisAdapter. OclParser also checks that the AST conforms to the standard OCL rules if the standard OCL option is specified. This is done through the internal class ASTCheck.
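The regrouping idea behind such a fix can be sketched as follows. This is not the code of ASTFix: RegroupSketch, Helper, Params and regroup are hypothetical names, every expression or type is collapsed to a plain string, and the sketch assumes that a bar helper, when present, marks the end of the declarator. The flat list "expression fcp_helper*" is split at the bar. Everything before it forms the declarator, and the bar's expression together with the following comma helpers forms the actual parameter list.

```java
import java.util.ArrayList;
import java.util.List;

final class RegroupSketch {
    enum Kind { COMMA, COLON, ITERATE, BAR }

    // One fcp_helper node; 'text' holds its expression or type as plain text.
    static final class Helper {
        final Kind kind;
        final String text;
        Helper(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Shape of the original rule: declarator? actual_parameter_list?
    static final class Params {
        final List<String> declarator = new ArrayList<>(); // empty means "no declarator"
        final List<String> actuals = new ArrayList<>();
    }

    static Params regroup(String first, List<Helper> helpers) {
        Params params = new Params();

        // Locate the '|' that terminates a declarator, if there is one.
        int bar = -1;
        for (int i = 0; i < helpers.size(); i++) {
            if (helpers.get(i).kind == Kind.BAR) {
                bar = i;
                break;
            }
        }

        if (bar < 0) {
            // No '|': the leading expression is the first actual parameter.
            params.actuals.add(first);
        } else {
            // Everything before '|' belongs to the declarator. A real fix
            // would also verify that these entries are names and types.
            params.declarator.add(first);
            for (Helper h : helpers.subList(0, bar)) {
                params.declarator.add(h.text);
            }
            params.actuals.add(helpers.get(bar).text);
        }

        // The remaining helpers must be ", expression" entries.
        for (Helper h : helpers.subList(bar + 1, helpers.size())) {
            if (h.kind != Kind.COMMA) {
                throw new IllegalArgumentException("unexpected " + h.kind + " in parameter list");
            }
            params.actuals.add(h.text);
        }
        return params;
    }
}
```

For a call such as forAll(a, b : Person | a <> b), the flat parse yields the expression a followed by a comma, a colon and a bar helper. The sketch regroups these into the declarator [a, b, Person] and the single actual parameter a <> b.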