XPath – core query language. Very limited, a glorified selection operator. Very useful, though: used in XML Schema, XSLT, XQuery, many other XML standards
XQuery – W3C standard. Very powerful, fairly intuitive, SQL-style
XSLT – a functional style document transformation language. Very powerful
Why Query XML?

Sample Document Corresponding to the Tree
<?xml version="1.0" ?>
<!-- Some comment -->
<students>
<student sid="111111111" >
<name>
<first>John</first>
<last>Doe</last>
</name>
<status>U2</status>
<course code="CS308" semester="F1997" grade="4"/>
<course code="MAT123" semester="F1997" grade="3"/>
</student>
<student sid="987654321" >
<name>
<first>Bart</first>
<last>Simpson</last>
</name>
<status>U4</status>
<course code="CS308" semester="F1994" grade="3" />
</student>
<student sid="444444444" >
<name>
<last>Simpson</last>
</name>
<status>U4</status>
</student>
</students>
<!-- Some other comment -->
Expressions that start with / are absolute path expressions
/ returns the root node of the XPath tree
/students/student returns all student elements that are children of students elements, which in turn must be children of the root
/student returns the empty set (no such children at root)
Current (or context) node – exists during the evaluation of XPath expressions (and in other XML query languages)
. denotes the current node; .. denotes the parent node
foo/bar returns all bar nodes that are children of foo nodes, which in turn are children of the current node
./foo/bar is the same as above
../abc/cde returns all cde e-children of abc e-children of the parent of the current node
Expressions that don't start with / are relative (to the current node)
Attributes, Text, etc.
/students/student/@sid returns all sid a-children of student, which are e-children of students, which are children of the root
/students/student/name/last/text() returns all t-children of last e-children of …
/comment() returns comment nodes under root
An XPath expression is:
/locationStep1/locationStep2/…
or
locationStep1/locationStep2/…
Location step:
Axis::nodeSelector[predicate]
Navigation axis:
child, parent, ancestor, descendant, ancestor-or-self, descendant-or-self, following-sibling, preceding-sibling, etc.
Node selector: node name or wildcard; e.g.,
./child::student (we used ./student, which is an abbreviation)
./child::* – any e-child (abbreviation: ./*)
Predicate: a selection condition; e.g.,
/students/student[course/@code = "MAT123"]
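The path expressions above can be tried out with a short script. What follows is only a minimal sketch using Python's standard xml.etree.ElementTree module, which implements a small subset of XPath: paths are always relative to an element, attribute and text() steps cannot be returned as results, and comments are discarded on parsing. Those parts are therefore emulated in plain Python. The variable names (DOC, root, sids, lasts, mat123) are illustrative and are not part of the original material.

```python
import xml.etree.ElementTree as ET

# The sample document from above (whitespace condensed).
DOC = """<?xml version="1.0" ?>
<!-- Some comment -->
<students>
  <student sid="111111111">
    <name><first>John</first><last>Doe</last></name>
    <status>U2</status>
    <course code="CS308" semester="F1997" grade="4"/>
    <course code="MAT123" semester="F1997" grade="3"/>
  </student>
  <student sid="987654321">
    <name><first>Bart</first><last>Simpson</last></name>
    <status>U4</status>
    <course code="CS308" semester="F1994" grade="3"/>
  </student>
  <student sid="444444444">
    <name><last>Simpson</last></name>
    <status>U4</status>
  </student>
</students>
<!-- Some other comment -->
"""

# fromstring() returns the <students> element, so a path relative to it,
# such as "student", plays the role of the absolute path /students/student.
root = ET.fromstring(DOC)

# /students/student/@sid (attribute values are read with get(), because
# ElementTree cannot return attribute nodes from a path).
sids = [s.get("sid") for s in root.findall("student")]

# /students/student/name/last/text()
lasts = [e.text for e in root.findall("student/name/last")]

# Relative paths: with the first student as the current node,
# name/last and ./name/last select the same nodes.
ctx = root.find("student")
same = ctx.findall("name/last") == ctx.findall("./name/last")

# /students/student[course/@code = "MAT123"], with the predicate emulated
# by checking for a course child that carries that code.
mat123 = [s.get("sid") for s in root.findall("student")
          if s.find("course[@code='MAT123']") is not None]

print(sids, lasts, same, mat123)
```

A full XPath engine would accept the expressions exactly as written on the slides; the emulation here is only meant to make the selected node sets concrete.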
The meaning of the expression locationStep1/locationStep2/… is the set of all document nodes obtained as follows:
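Find all nodes reachable by locationStep1 from the current node
For each node N found, find all nodes reachable from N by locationStep2; take the union of all these nodes
For each node in that union, find all nodes reachable by locationStep3, etc.
locationStep1/locationStep2/… means:
Find all nodes specified by locationStep1
For each such node N, find all nodes specified by locationStep2 using N as the current node; take the union
For each node returned by locationStep2 do the same
locationStep = axis::node[predicate]
Find all nodes specified by axis::node
Keep only those that satisfy predicate
2nd course child of 1st student child of students: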
/students/student[1]/course[2]
All last course elements within each student element:
/students/student/course[last()]
Wildcards are useful when the exact structure of the document is not known
Descendant-or-self axis, //: allows descending any number of levels (including 0)
//course – all course nodes under the root
/students//@sid – all sid attribute nodes under the element students
./last and last are the same
.//last and //last are different
The * wildcard:
* (any element), e.g. /student/*/text()
@* (any attribute), e.g. /students//@*
Axis::nodeSelector[predicate]
Axis::nodeSelector[predicate] ⊆ Axis::nodeSelector but contains only the nodes that satisfy predicate
https://www.w3.org/XML/Group/qtspecs/specifications/xpath-functions-31/html/Overview.html
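The step-by-step meaning of a path expression (apply one location step to every node found so far, then take the union) can be sketched in a few lines. The helper eval_path below is hypothetical and written only for illustration; it delegates each individual step to xml.etree.ElementTree, which supports positional predicates, last(), * and // in a limited form. Paths are written relative to the students element because ElementTree does not accept absolute paths.

```python
import xml.etree.ElementTree as ET

# The sample document from above (whitespace condensed).
DOC = """<?xml version="1.0" ?>
<!-- Some comment -->
<students>
  <student sid="111111111">
    <name><first>John</first><last>Doe</last></name>
    <status>U2</status>
    <course code="CS308" semester="F1997" grade="4"/>
    <course code="MAT123" semester="F1997" grade="3"/>
  </student>
  <student sid="987654321">
    <name><first>Bart</first><last>Simpson</last></name>
    <status>U4</status>
    <course code="CS308" semester="F1994" grade="3"/>
  </student>
  <student sid="444444444">
    <name><last>Simpson</last></name>
    <status>U4</status>
  </student>
</students>
<!-- Some other comment -->
"""

root = ET.fromstring(DOC)  # the <students> element


def eval_path(context, steps):
    """Hypothetical helper: apply location steps one at a time."""
    nodes = [context]
    for step in steps:
        found = []
        for n in nodes:                # use each node N as the current node
            for m in n.findall(step):  # nodes selected by this step from N
                if not any(m is f for f in found):
                    found.append(m)    # union: keep each node only once
        nodes = found
    return nodes


# /students/student[1]/course[2]
second_codes = [c.get("code")
                for c in eval_path(root, ["student[1]", "course[2]"])]

# /students/student/course[last()]
last_courses = eval_path(root, ["student", "course[last()]"])
last_codes = [c.get("code") for c in last_courses]

# The step-by-step result agrees with evaluating the whole path at once.
agrees = last_courses == root.findall("student/course[last()]")

# //course, written .//course relative to the <students> element
all_courses = root.findall(".//course")

# /students/* : every element child of <students>
child_tags = [e.tag for e in root.findall("*")]

# /students//@sid, emulated by scanning <students> and all its descendants
sids = [e.get("sid") for e in root.iter() if "sid" in e.attrib]

print(second_codes, last_codes, agrees, len(all_courses), child_tags, sids)
```

Note that course[last()] is evaluated separately for each student, which is why the result holds one course per student that has any, rather than the single last course in the document.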
(1) Students who have taken CS308:
//student[course/@code="CS308"]
True if "CS308" ∈ course/@code, evaluated with the student as the current node
(2) A more complex example:
//student[status="U2" and
starts-with(.//last, "D") and
contains(string-join(.//@code),"MAT") and
not (.//last = .//first) ]
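To make the semantics of examples (1) and (2) concrete, the predicates can be spelled out as ordinary Python conditions over the sample document. This is a sketch under the assumption that a comparison against a node set is true when at least one node in the set satisfies it; the helper names codes and complex_predicate are made up for this illustration, and xml.etree.ElementTree is used only to walk the tree, since it does not support XPath functions.

```python
import xml.etree.ElementTree as ET

# The sample document from above (whitespace condensed).
DOC = """<?xml version="1.0" ?>
<!-- Some comment -->
<students>
  <student sid="111111111">
    <name><first>John</first><last>Doe</last></name>
    <status>U2</status>
    <course code="CS308" semester="F1997" grade="4"/>
    <course code="MAT123" semester="F1997" grade="3"/>
  </student>
  <student sid="987654321">
    <name><first>Bart</first><last>Simpson</last></name>
    <status>U4</status>
    <course code="CS308" semester="F1994" grade="3"/>
  </student>
  <student sid="444444444">
    <name><last>Simpson</last></name>
    <status>U4</status>
  </student>
</students>
<!-- Some other comment -->
"""

root = ET.fromstring(DOC)
students = root.findall(".//student")  # //student


def codes(s):
    """.//@code relative to student s, as a list of attribute values."""
    return [e.get("code") for e in s.iter() if "code" in e.attrib]


# (1) //student[course/@code="CS308"]: true if some course code is "CS308"
took_cs308 = [s.get("sid") for s in students
              if any(c.get("code") == "CS308" for c in s.findall("course"))]


def complex_predicate(s):
    """Example (2), one Python condition per XPath conjunct."""
    lasts = [e.text for e in s.findall(".//last")]
    firsts = [e.text for e in s.findall(".//first")]
    return (
        # status="U2"
        any(e.text == "U2" for e in s.findall("status"))
        # starts-with(.//last, "D"); the sample has one last name per student
        and bool(lasts) and lasts[0].startswith("D")
        # contains(string-join(.//@code), "MAT")
        and "MAT" in "".join(codes(s))
        # not(.//last = .//first): no last name equals any first name
        and not any(l == f for l in lasts for f in firsts)
    )


complex_match = [s.get("sid") for s in students if complex_predicate(s)]

print(took_cs308, complex_match)
```

Only John Doe satisfies example (2): his status is U2, his last name starts with "D", his joined course codes "CS308MAT123" contain "MAT", and his last name differs from his first name.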
(3) Testing whether a subnode exists:
Students who have a grade (for some course):
//student[course/@grade]
Students who have either a first name or have taken a course in some semester or have status U4:
//student[name/first or course/@semester or status/text() = "U4"]
(4) Aggregation: sum( ), count( )
//student[course/@grade and sum(.//@grade) div count(.//@grade) > 3.2]
(5) Union operator |
//course[@semester="F1994"] | //course[@semester="F1997"]
Union lets us define heterogeneous collections of nodes
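Examples (3) to (5) can be checked against the sample document in the same way. The sketch below emulates the existence tests, the aggregation, and the union in plain Python on top of xml.etree.ElementTree, which has no sum(), count(), or | operator of its own. The helper names has_course_grade and grades are hypothetical, and the third existence test uses status U4 as in the description of that example.

```python
import xml.etree.ElementTree as ET

# The sample document from above (whitespace condensed).
DOC = """<?xml version="1.0" ?>
<!-- Some comment -->
<students>
  <student sid="111111111">
    <name><first>John</first><last>Doe</last></name>
    <status>U2</status>
    <course code="CS308" semester="F1997" grade="4"/>
    <course code="MAT123" semester="F1997" grade="3"/>
  </student>
  <student sid="987654321">
    <name><first>Bart</first><last>Simpson</last></name>
    <status>U4</status>
    <course code="CS308" semester="F1994" grade="3"/>
  </student>
  <student sid="444444444">
    <name><last>Simpson</last></name>
    <status>U4</status>
  </student>
</students>
<!-- Some other comment -->
"""

root = ET.fromstring(DOC)
students = root.findall(".//student")  # //student


def has_course_grade(s):
    """course/@grade used as a condition: true when the node set is non-empty."""
    return any("grade" in c.attrib for c in s.findall("course"))


def grades(s):
    """.//@grade relative to student s, converted to numbers."""
    return [float(e.get("grade")) for e in s.iter() if "grade" in e.attrib]


# (3) //student[course/@grade]
has_grade = [s.get("sid") for s in students if has_course_grade(s)]

# (3) //student[name/first or course/@semester or status/text() = "U4"]
first_sem_or_u4 = [
    s.get("sid") for s in students
    if s.find("name/first") is not None
    or any("semester" in c.attrib for c in s.findall("course"))
    or any(e.text == "U4" for e in s.findall("status"))
]

# (4) //student[course/@grade and sum(.//@grade) div count(.//@grade) > 3.2]
# The first conjunct ensures there is at least one grade to average.
good_avg = [s.get("sid") for s in students
            if has_course_grade(s) and sum(grades(s)) / len(grades(s)) > 3.2]

# (5) //course[@semester="F1994"] | //course[@semester="F1997"]
picked = (root.findall(".//course[@semester='F1994']")
          + root.findall(".//course[@semester='F1997']"))
# A union is a set of nodes: list each node once, here in document order.
union = [c for c in root.iter("course") if any(c is p for p in picked)]
union_semesters = [c.get("semester") for c in union]

print(has_grade, first_sem_or_u4, good_avg, union_semesters)
```

John Doe averages (4 + 3) / 2 = 3.5 and passes the 3.2 threshold, while Bart Simpson averages 3.0 and does not; the student without courses is excluded by the existence test.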