CSc 8711. Databases and the Web
Spring 2011, Programming Project 3 (XML Stuff)
Due: 19 March 2011

Solve the following problems related to XML Schemas, XQuery, and XSL Transforms.

  1. Consider the XML document from exam 1:
    <Transcript> 
      <Sid>111111111</Sid> 
      <Name>John Doe</Name> 
      <Courses> 
        <Course> 
          <Code>CS308</Code> 
          <Sem>F1997</Sem> 
          <Grade>B</Grade> 
        </Course> 
        … 
      </Courses> 
    </Transcript> 
    
    The <Sid> element contains 9 digit social security numbers, the <Sem> element contains a letter (F, S, or U) followed by a 4-digit year, the <Code> element contains a string beginning with an upper-case letter followed by 2 or 3 upper- or lower-case letters followed by 3 digits, and the <Grade> element contains one of the grades: A, B, C, D, F, IP, S, or U. Write an XML Schema specification for the transcript document. Submit the schema under the file transcript.xsd.
  2. Consider the binary-tree XML document from exam 1:
    <node value="5">
      <node value="8">
        <node value="6"/>
        <node value="18">
          <node value="2">
            <node value="60"/>
          </node>
        </node>
      </node>
      <node value="9">
        <node value="14">
          <node/>
          <node value="80"/>
        </node>
        <node value="24"/>
      </node>
    </node>
    
    Write an XML Schema for such XML documents. Submit the schema under the file btree.xsd.
  3. Consider the XML document gradebook.xml. This document describes grade book data as kept by instructors of courses in a university. Here are some constraints in the data: Write an XML Schema for the gradebook XML documents. Submit the schema under the file gradebook.xsd.
  4. Consider the XML document related to movies available at movie.xml. Write XQuery expressions to answer the following queries:
    1. Get the title and years of movies in the Crime genre.
    2. Get the names of persons who have acted in a movie and have directed it as well.
    3. Get the title and years of movies in which James Caan has acted.
    4. Get the names of performers and the number of movies in which they have acted.
    5. Get the names of performers who have acted in at least 10 movies and directed at least 2 movies.
    6. Get the names of the youngest performers.
    7. Get the names of the performers who have directed some actor who is older than they are.
  5. Consider the XML document pubsData.xml which records data about bars, beers, and drinkers in a local neighborhood. Write XQuery expressions to answer the following queries:
    1. Find bars that serve a beer that Donald likes.
    2. Find drinkers who frequent at least one bar that serves a beer they like.
    3. Find drinkers who frequent at least all those bars that Donald frequents.
    4. Find drinkers who frequent no bar that serves a beer they like.
    5. Find drinkers who frequent only bars that serve at least one beer they like.
  6. Consider the two XML documents msgrads.xml and phdgrads.xml. The msgrads.xml contains most data for the past 10+ years but the phdgrads.xml file needs to be populated (need help in doing this - any volunteers?).
    1. Write a XSL Transform program that takes as input the phdgrads.xml file and produces the Web page shown at: phd-graduates.html.
    2. Write a XSL Transform program that takes as input the msgrads.xml file and produces the Web page shown at: ms-graduates.html.
    3. Write a Java Servlet (called PhDGraduatesByAdvisor.java) which produces a tabular output of the number of Ph.D. graduates per advisor. The tabular listing should have 2 columns: advisor name and number of graduates. Initially the table should be sorted by advisor name. The column headings should be hyperlinked so that the table gets sorted by the column when the hyperlink gets clicked.
    4. Write a Java Servlet (called MSGraduatesByAdvisor.java) which produces a tabular output of the number of M.S. graduates per advisor. The tabular listing should have 4 columns: advisor name, number of M.S. thesis graduates, number of M.S. Project graduates, and total number of graduates. Initially the table should be sorted by advisor name. The column headings should be hyperlinked so that the table gets sorted by the column when the hyperlink gets clicked.
    5. Write a Java Servlet (called PhDGraduatesByYear.java) which produces a tabular output of the number of Ph.D. graduates per year. The tabular listing should have 2 columns: Year and number of Ph.D. graduates. The table should be listed with most recent year at the top. No column sorting is required.
    6. Write a Java Servlet (called MSGraduatesByYear.java) which produces a tabular output of the number of M.S. graduates per year. The tabular listing should have 4 columns: Year, number of M.S. thesis graduates, number of M.S. Project graduates, and total number of graduates. The table should be listed with most recent year at the top. No column sorting is required.