Consider transcript.xml
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<transcripts>
<transcript>
<student sid="111111111" name="John Doe"/>
<course code="CS308" semester="F1997" grade="B"/>
<course code="MAT123" semester="F1997" grade="B"/>
<course code="EE101" semester="F1997" grade="A"/>
<course code="CS305" semester="F1995" grade="A"/>
</transcript>
<transcript>
<student sid="987654321" name="Bart Simpson" />
<course code="CS305" semester="F1995" grade="C"/>
<course code="CS308" semester="F1994" grade="B"/>
</transcript>
<transcript>
<student sid="123454321" name="Joe Blow" />
<course code="CS315" semester="S1997" grade="A" />
<course code="CS305" semester="S1996" grade="A" />
<course code="MAT123" semester="S1996" grade="C" />
</transcript>
<transcript>
<student sid="023456789" name="Homer Simpson" />
<course code="EE101" semester="F1995" grade="B" />
<course code="CS305" semester="S1996" grade="A" />
</transcript>
</transcripts>
In basex run the following commands to create a database called enrollDB:
> create db enrollDB
> add transcript.xml
> exit
FOR variable declarations
LET variable := expression,
variable := expression, …
WHERE condition
ORDER BY fields
RETURN document
Example 1: Find students who took MAT123
(: students who took MAT123 :)
for $t in db:open("enrollDB")//transcript
where $t/course/@code = "MAT123"
return $t/student
Result:
<student sid="111111111" name="John Doe" />
<student sid="123454321" name="Joe Blow" />
Previous XQuery expression doesn’t produce a well-formed XML document; the following does:
<studentlist>
{
for $t in db:open("enrollDB")//transcript
where $t/course/@code = "MAT123"
return $t/student
}
</studentlist>
for
binds $t
to transcript
elements one by one, filters using where
, then places student
-children as e-children of studentlist
using return
FOR = iteration
for $t in db:open("enrollDB")//transcript
return <result> { $t } </result>
will return the following result:
<result> <transcript> ... </transcript> </result>
<result> <transcript> ... </transcript> </result>
<result> <transcript> ... </transcript> </result>
...
LET = assignment of set value to variable
let $t := db:open("enrollDB")//transcript
return <result> { $t } </result>
will return the following result:
<result>
<transcript> ... </transcript>
<transcript> ... </transcript>
<transcript> ... </transcript>
...
</result>
siblingdata.
Can we write a XQUERY query to restructure the XML document so that the rosters of each class is available immediately? YES!
ATTEMPT 1
<rosters>
{
for $c in db:open("enrollDB")//course
order by $c/@code
return
<roster code="{$c/@code}" semester="{$c/@semester}">
{
for $t in db:open("enrollDB")//course
order by $t/../student/@sid
where $t/@code = $c/@code and
$t/@semester = $c/@semester
return $t/../student
}
</roster>
}
</rosters>
Output elements have the form:
<roster code="CS305" semester="F1995">
<student sid="111111111" name="John Doe"/>
<student sid="987654321" name="Bart Simpson"/>
</roster>
But the above element will be output twice, once for each of the following two bindings of $c
:
<course code="CS305” semester="F1995” grade="C”/>
<course code="CS305” semester="F1995” grade="A”/>
Note: grades are different; so distinct-values()
won’t eliminate transcript records that refer to same class!
ATTEMPT 2
We will extract all courses in a variable; then set up iteration on distinct elements in this set.
import module namespace functx = 'http://www.functx.com';
<rosters>
{
let $courses :=
<courses>
{
for $c in db:open("enrollDB")//course
return <course code="{$c/@code}" semester="{$c/@semester}" />
}
</courses>
for $c in functx:distinct-deep($courses/course)
order by $c/@code
return
<roster code="{$c/@code}" semester="{$c/@semester}">
{
for $t in db:open("enrollDB")//course
order by $t/../student/@sid
where $t/@code = $c/@code and
$t/@semester = $c/@semester
return $t/../student
}
</roster>
}
</rosters>
ATTEMPT 3
Consider the additional XML data (class.xml
):
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<classes>
<class code="CS308" semester="F1997">
<title>SE</title>
<instructor>Adrian Jones</instructor>
</class>
<class code="MAT123" semester="F1997">
<title>Algebra</title>
<instructor>Ann White</instructor>
</class>
<class code="EE101" semester="F1997">
<title>Circuits</title>
<instructor>David Jones</instructor>
</class>
<class code="CS305" semester="F1995">
<title>Databases</title>
<instructor>Mary Doe</instructor>
</class>
<class code="EE101" semester="F1995">
<title>Circuits</title>
<instructor>David Jones</instructor>
</class>
<class code="CS308" semester="F1994">
<title>SE</title>
<instructor>Adrian Jones</instructor>
</class>
<class code="CS315" semester="S1997">
<title>TP</title>
<instructor>John Smyth</instructor>
</class>
<class code="CS305" semester="S1996">
<title>Databases</title>
<instructor>John Doe</instructor>
</class>
<class code="MAT123" semester="S1996">
<title>Algebra</title>
<instructor>Ann White</instructor>
</class>
<class code="CS8711" semester="S2018">
<title>Databases and the Web</title>
<instructor>Raj Sunderraman</instructor>
</class>
</classes>
In basex run the following commands to add a XML resource into enrollDB:
> open db enrollDB
> add class.xml
> exit
The following query would produce the rosters for each class (by oterating over each class in class.xml):
<rosters>
{
for $c in db:open("enrollDB")/classes/class
order by $c/@code
return
<roster code="{$c/@code}" semester="{$c/@semester}">
{
for $t in db:open("enrollDB")/transcripts/transcript/course
order by $t/../student/@sid
where $t/@code = $c/@code and $t/@semester = $c/@semester
return $t/../student
}
</roster>
}
</rosters>
Notice that the answer contains
<roster code="CS8711" semester="S2018"/>
which has no students! To avoid printing such rosters, we may have to modify the query (by adding a where clause to outer-for) as follows:
<rosters>
{
for $c in db:open("enrollDB")/classes/class
where db:open("enrollDB")/
transcripts/transcript/
course[@code = $c/@code and @semester = $c/@semester]
order by $c/@code
return
<roster code="{$c/@code}" semester="{$c/@semester}">
{
for $t in db:open("enrollDB")/transcripts/transcript/course
order by $t/../student/@sid
where $t/@code = $c/@code and $t/@semester = $c/@semester
return $t/../student
}
</roster>
}
</rosters>
for
clauseThe for clause binds each variable to a list of nodes specified by an XQuery expression. The expression can be:
End result of a for clause:
Example (bindings):
$A
and $B
$A
to document nodes {v,w}
; $B
to {x,y,z}
$A
and $B
:
$A/v
, $B/x
$A/v
, $B/y
$A/v
, $B/z
$A/w
, $B/x
$A/w
, $B/y
$A/w
, $B/z
where
clausewhere
trueExample:
Consider
where $A/course/@code = $B/class/@code
and the bindings:
$A/w
, where w = <course code="CS308" …/>
$B/x
, where x = <class code="CS308" … />
Then, w/course/@code = x/class/@code
, so the where
condition is satisfied and the binding is retained
return
clauseAggregation Example
Produce a list of students along with the number of courses each student took:
for $t in db:open("enrollDB")//transcript,
$s in $t/student
let $c := $t/course,
$result :=
<StudentSummary
sid = "{$s/@sid}"
name = "{$s/@name}"
numCourses = "{fn:count($c)}"
/>
order by $result/@numCourses descending, $result/@name
return $result
The grouping effect is achieved because $c
is bound to a new set of nodes for each binding of $t
XQuery supports explicit quantification:
Example: Find students who have taken MAT123.
for $t in db:open("enrollDB")//transcript
where
some $ct in $t/course satisfies $ct/@code = "MAT123"
return $t/student
Example: Retrieve class codes for classes which are taken by EVERY student (need not be taken in the same semester).
distinct-values(
for $c in db:open("enrollDB")//class
let $g :=
(: transcripts for $c :)
for $t in db:open("enrollDB")//transcript
where $t/course/@code = $c/@code
return $t
let $h :=
(: all transcripts :)
for $t in db:open("enrollDB")//transcript
return $t
where
every $t1 in $h
satisfies
some $t2 in $g
satisfies $t1/student/@sid = $t2/student/@sid
return $c/@code
)
FOR-LET-WHERE-ORDER BY-RETURN = FLWOR