CSc 4340/6340, Introduction to Compilers
Fall 2009
PROJECT - Phase II
Phase II: (Due: October 25, 2009 (Sunday)
Perform the following Semantic Checks on the DRC expression tree:
- Schema Check: Under the schema checks the following
will be tested for:
- Predicate names correspond to relation names.
- Predicate arities are correct.
- Predicate arguments satisfy type constraints
- "Constant" arguments are of correct types
- Repeated "Variable" arguments are of correct types.
- Comparison arguments satisfy type constraints
- Safety Check:
Safe DRC Expressions
- Miscellaneous Check:
- No duplicate variables in Var List of query node as well as Var List of
exists and forall nodes.
- Variables in Var List of query node matches Free Variables of DRC
expression.
Before these checks are made, it will be a good idea if you perform the
following transformations to the DRC Expression tree:
- First, remove double "nots"
- Next, compress maximal sequence of "and" and "or" nodes respectively
- Finally, apply DeMorgan's Law transformation to "not-or" sequences
Phase 2 Notes
- Parent Pointer in DRCNode:
Adding a "parent" pointer to DRCNode is useful in several Phase 2
functions: e.g. Demorgan and double not transformations.
In addition, an integer field, called childPositionInParent,
will also be useful. This will be set to i, where the DRCNode
is the ith child of its parent.
- Please implement: compressTree, removeDoubleNots and
applyDeMorgan methods to transform the binary tree produced
by the parser.
- The following three variables will be extremely useful in the
DRCNode data structure:
Vector freeVariables; // will record the free variables for the node.
HashMap freeVariableType; // will record the data type of the free
// variable (indexed by variable name).
HashMap isFreeVariableLimited; // will record "yes" if free variable
// is Limited, "no" otherwise
The following describes how the above 3 variables are computed
for each node type (in an inductive manner):
- predicate node:
each variable argument is a FREE variable (is LIMITED as well)
and its DATA TYPE is determined by the database schema
- comparison node:
each variable argument is a FREE variable;
DATA TYPE of the variable is "UNKNOWN" if both arguments are variables;
otherwise the constant argument will dictate the datatype of the other variable
argument; we will disallow both arguments being constants.
if comparison is "=" and one of the arguments is a constant (num or str)
then the other argument is limited; otherwise all variable arguments are
not limited.
- not node:
each free variable in the child node is a free variable, none of which
are limited; data types are the same as in the child node.
- exists node:
each free variable in the child node which is not in the exists varList
is a free variable; data types are the same as in the child node;
the limitedness is the same as in the child node.
- and node:
free variables are the union of the free variables from the children.
for multiple occurrences of free variables across children, types
must match - otherwise report error; if the data types of one of the
occurrences is UNKNOWN then it can be set to the known data type of another
occurrence. Limitedness is true if at least one occurrence is limited.
- or node:
free variables are the union of the free variables from the children.
for multiple occurrences of free variables across children, types
must match - otherwise report error; if the data types of one of the
occurrences is UNKNOWN then it can be set to the known data type of another
occurrence.
limitedness is true if at ALL occurrences are limited.
- query node:
each free variable in the child node
is a free variable; data types are the same as in the child node;
the limitedness is the same as in the child node.
Semantic Checks
- Predicate Check:
Make sure all predicate names used in query correspond to database tables.
Also make sure that arities of predicate names match number of columns in
database table.
- Type Check:
Type check all variables at each node (types are obtained from constants in
query as well as data types in database.
- Safety Check:
query variables on lhs of | must match free variables on rhs.
free variables in all disjuncts of or-nodes must match.
All free variables in and-nodes must be limited.
parent of not-node MUST be and-node
parent of comparison node MUST be and-node.
Misc
- Transform not-comparison sequence: for example not(x=20) should be
transformed into (x<>20)
- Disallow constant COMPARISON constant. For example 20 = 40
[raj@tinman phase1]$ java DRC db