Object-Oriented Query Languages

Query-related features of ODL:

 A third kind of property of classes: methods

  Declaring method signatures in ODL: (code is not part of ODL)

  interface Movie (extent Movies key (title,year)) {
    attribute string title;
    attribute integer year;
    attribute integer length;
    attribute enumeration(color,blackAndWhite) filmType;
    relationship Set stars inverse Star::starredIn;
    relationship Studio ownedBy inverse Studio::owns;
    float lengthInHours() raises (noLengthFound);
    starNames(out Set);
    otherMovies(in Star, out Set) raises(noSuchStar);
  };

  - extent of the class is a name for the current set of objects of the
     class; OQL queries refer to the extent of a class, not to the class name

  - in, out, inout parameters in methods

  - functions may raise exceptions

  - there is no guarantee that the function implements what their names
    suggest!

OQL (Object Query Language -- ODMG standard) 

  - OQL queries may be interpreted (as in SQL*Plus of Oracle)
    or may be embedded in a host language such as C++, Java.
  - Much easier to embed OQL queries in host language since both 
    are compatible (values easily transferred between the two)

OQL Type system:

  Constants are constructed as follows:
    - Basic Types: atomic types integers, floats, characters, strings, and
                      booleans (surrounded by ")
                   enumerations declared in ODL!
    - Complex Types: built using
          Set(...)
          Bag(...)
          List(...)
          Array(...)
          Struct(...)

         examples: bag(2,1,2)
                   struct(foo:bag(2,1,2), bar: "baz")
                   set(struct(title:"My Fair Lady",year:1965),
                       struct(title:"ET",year:1985),
                       struct(title:"Jaws",year:1981))
  Path Expressions:

    if a denotes an object belonging to class C and p is some property of the
      class (attribute or relationship or method) then a.p is a path expression
      interpreted as follows:
    - if p is an attribute then a.p is the value of that attribute in object a
    - if p is a relationship then a.p is the object or collection of objects
        related to a by relationship p
    - if p is a method (perhaps with parameters) then a.p is the result of
      applying p to a

 examples: Movie myMovie;
           Set sNames;

    myMovie.length
    myMovie.lengthInHours()
    myMovie.stars
    myMovie.starNames(sNames) 
    myMovie.ownedBy.name
 
  OQL Queries:

  (1) Find the year of movie "Gone With the Wind"

      select m.year
      from Movies m
      where m.title = "Gone With the Wind"

  In general the select-from-where statement in OQL is constructed as follows:

  SELECT keyword followed by a list of expressions (using constants and
      variables defined in the FROM clause)

  FROM keyword followed by a list of variable declarations
    variable is declared by
      - giving an expression whose value is a collection type (typically an
        extent, but not necessarily; could be another select-from-where
        statement)
      - followed by an optional AS keyword
      - followed by the name of the variable

  WHERE keyword followed by a boolean-valued expression (can use only constants
      and variables declared in the FROM clause); Similar operators as in SQL
       except != instead of <>

  The OQL query produces a bag of objects.

  (2) Find the names of the stars of "Casablanca"

    select s.name
    from  Movies m, m.stars s
    where m.title = "Casablanca"

  (3) Eliminating duplicates (distinct keyword)
      Find the names of stars of "Disney" movies

    select distinct s.name
    from Movies m, m.stars s
    where m.ownedBy.name = "Disney"

Complex output type:

  (4) Get set of pairs of stars living at the same address

    select distinct Struct(star1: s1, star2: s2)
    from Stars s1, Stars s2
    where s1.addr = s2.addr and s1.name < s2.name

  The result type of this query is
      Set

  Note: Such a type cannot appear in an ODL declaration!

    shortcut: select star1: s1, star2: s2

Subqueries:

  (5) Get the stars in movies made by Disney.

      select m
      from Movies m
      where m.ownedBy.name = "Disney"
 
     gives us the Disney Movies. This can be used in the from clause as
       follows:

      select distinct s.name
      from (select m
            from Movies m
            where m.ownedBy.name = "Disney") d, d.stars s

Ordering the Result:

   (6) Get Disney movies ordered by length (ties broken by title)

      select m
      from Movies m
      where m.ownedBy.name = "Disney"
      order by m.length, m.title

    - asc or desc may be spevcified after order by (default is asc)
      
For-all, There-exists, aggregation operators, set operators etc:

Quantifier Expressions:

  for all x in S: C(x)
  exists x in S: C(x)

  (7) Get stars acting in Disney movies

    select s
    from Stars s
    where exists m in s.starredIn : m.ownedBy.name = "Disney"

  (8) Get stars who appear only in Disney movies

    select s
    from Stars s
    where for all m in s.starredIn : m.ownedBy.name = "Disney"

Aggregation Expressions:

same 5 operations as in SQL: avg, min, max, sum, count
  These apply to collections whose members are of a suitable type.
  count applies to any collection
  sum, avg apply to any collection of numbers
  min, max apply to any collection in which the members can be compared.

(9) Find the average length of all movies.

  avg(select m.length from Movies m)

a bag of movie lengths is created; then the avg operator is applied.
(note: set of movie lengths would be incorrect!)

Group By:

The form of the group by expression is:

  group by f1:e1, f2:e2, ..., fn:en

where fi:ei is called a partition attribute (field fi expression ei)
The expressions ei may refer to variables mentioned in the FROM clause

Let us assume there is only one variable, x, in the from clause (for
sake of explanation of how the group by works)

The value of x ranges over some collection, C.

For each member of C, say i, that satisfies the where clause condition,
the expressions e1(i), ..., en(i) are calculated. Value i belongs to
this list of values.

The actual value returned by GROUP BY is a set of structures of the form:
  Struct(f1:v1, ..., fn:vn, partition:p)

where v1, ..., vn are  e1(i), ..., en(i) respectively for a value of i that
  satisfies the where clause. These values indicate the group.
The last field, partition, is a special field, whose value, p, is a bag of
Struct(x:i)  where i belongs to this group.

The select clause may only refer to fields f1, ..., fn, partition.
Through partition, the select clause may refer to the fields of x,
but only via an aggregation operator.

(10) For each studio and each year, list the total length of movies.

  select std, yr, sumLength: sum(select q.m.length from partition q)
  from   Movies m
  group by std:m.studio, yr:m.year

Collection C is the collection of all movies (variable m)
Group By returns objects of the form:
  Struct(std:"Disney", yr:1980, partition: p1)
  Struct(std:"Disney", yr:1981, partition: p2)
  Struct(std:"MGM", yr:1980, partition: p3)
  ...
  ...
  where p1 is a bag of structures: Struct(m:lionking), struct(m:mulan), ...
        etc. where lionking, mulan etc are movie objects.

  variable q in the select clause ranges over objects of the form Struct(m:o)
  so, q.m.length refers to the length of the movie m.

Final answer of the query is a bag of structures:
  Struct(std: "Disney", yr:1980, sumLength:1200)
  Struct(std: "Disney", yr:1981, sumLength:1680)
  ...

In the even that there are more than one variable in the From clause, say
x1,...,xk then the following generalizations can be made in the interpretation
of queries with GROUP BY clause:
  - All variables x1,...,xn may be used in each of the ei's of the GROUP BY
    clause
  - the value of the partition field consists of structures with the fields
     x1,...,xk.
  - Suppose i1,...,ik are the values for variables x1,...,xk that make the
    WHERE clause true then result of GROUP BY application is of the form:

    Struct(f1:e1(i1,...,ik), ..., fn:en(i1,...,ik),partition:P)

  and partition has the structure Struct(x1:i1,....xk,ik)

HAVING clause:

A GROUP BY clause may be followed by a HAVING clause.
  HAVING 
Purpose is to eliminate some of the groups created by the GROUP BY clause.
The  applies to the value of the partition field. If condition
evaluates to true the partition is passed on to next step otherwise it is
eliminated from further consideration.

(11) For each studio and each year get the sum of the lengths of the movies,
  but only if the one of the movies produced by that studio in that year
  is longer than 120 minutes

  select std, yr, sumLength:sum(select q.m.length from partition q)
  from Movies m
  group by std:m.studio, yr:m.year
  having max(select q.m.length from partition q) > 120

Set Operators:

  union, difference, and intersection on two objects of set or bag type.

  (12) Find movies starring "Harrison Ford" that were not made by "Disney"

    (select distinct m
     from   Movies m, m.stars s
     where s.name = "Harrison Ford")
    except
    (select distinct m
     from   Movies m
     where m.ownedBy.name = "Disney")

Note: If the one or both operands of these set operations is a bag, the "bag"
  meaning is used. Say x appears n1 times in B1 and n2 times in B2 then
   x appears n1+n2 times in (B1 union B2)
   x appears min(n1,n2) times in (B1 intersect B2)
   x appears 0 times in (B1 difference B2) if n1 <= n2
             (n1-n2) times otherwise
The result of the query is a set if both operands are sets otherwise
   it is a bag.


Object Assignment and Creation in OQL.

OQL and host language (good fit!)

Assigning Values to host variables:

  Set oldMovies;
  oldMovies = select distinct m from movies m where m.year < 1920;

Notice the ease with which results of a query are transferred to a host
variable.

Extracting Elements of Collections:

  Movie gwtw;
  gwtw = element(select m from Movies m where m.title = "Gone With the Wind");

element function extracts single element fro bag of one  element.

Extracting each element from a colleaction:

List movieList;
movieList = (select m from Movies m order by m.title,m.year);
movieList[i] now refers to the ith movie in the list.

Note: order by clause automatically converts the result type of query to a list
instead of bag/set.

small program fragment to display movie titles, years and lengths:

  List movieList;
  movieList = (select m from Movies m order by m.title,m.year);
  int numberOfMovies = count(movies);
  for (int i=0;i