XPath in XSLT
To do anything significant with XSLT, you must work with the XML Path Language (XPath). XPath is a W3C Recommendation that is used for identifying elements, attributes, text and other nodes within an XML document. XPath looks at an XML document as a tree. Each element is a branch that may have branches of its own.
In XSLT, XPath is used to match nodes from the source document for output templates via the match attribute of the xsl:template tag.
<xsl:template match="SOME_XPATH">
XPath is also used in the xsl:value-of tag to specify elements, attributes and text to decide what to output.
<xsl:value-of select="SOME_XPATH"/>
In addition, XPath is used in conditionals and flow control statements, which are covered in the lesson on Flow Control.
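As a minimal sketch of both uses together, the stylesheet below pairs a match pattern with a select expression. It assumes a source document shaped like the BusinessLetter.xml sample shown later in this lesson (a Head element containing Recipient/Name/LastName). The empty template for Body and Foot is an illustrative choice, added only to keep the built-in template rules from copying their text to the output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- match holds an XPath pattern: this template fires for each Head element -->
  <xsl:template match="Head">
    <!-- select holds an XPath expression, evaluated relative to the matched Head -->
    <xsl:value-of select="Recipient/Name/LastName"/>
  </xsl:template>

  <!-- Assumption for this sketch: suppress Body and Foot entirely -->
  <xsl:template match="Body | Foot"/>
</xsl:stylesheet>
```

Run against that sample, the output should be the recipient's last name (Lockwood), possibly surrounded by incidental whitespace from the source document.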
XPath Expression
XPath expressions are statements used by an XSLT processor to produce a result in the form of one of the following:
- node-set (an unordered set of zero or more nodes, such as elements, attributes, or text nodes)
- Boolean (true/false)
- number
- string
The table below explains some common terms used in XPath.
Term | Definition |
---|---|
Context Node | The starting point for the expression. In XSLT, the context node is often (but not always) determined by the XPath in the match attribute of the xsl:template tag. |
Current Node | Changes as an expression is evaluated. The next part of the expression uses the last current node as its context node. |
Context Size | The number of nodes being evaluated at any point in an expression. |
Proximity Position | The position of a node relative to other nodes in a node list. The proximity position of the first node in a node list is always one (1). |
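These terms are easier to see in a running transform. The sketch below assumes the BusinessLetter.xml sample shown later in this lesson and uses xsl:for-each (covered in the Flow Control lesson) so that each ListItem becomes the context node in turn; position() then reports the proximity position and last() the context size.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- The selected ListItem nodes form the node list being iterated -->
    <xsl:for-each select="BusinessLetter/Body/List/ListItem">
      <!-- Proximity position of the current ListItem (starts at 1) -->
      <xsl:value-of select="position()"/>
      <xsl:text> of </xsl:text>
      <!-- Context size: how many ListItem nodes were selected -->
      <xsl:value-of select="last()"/>
      <xsl:text>: </xsl:text>
      <!-- "." is the context node; normalize-space() trims its line breaks -->
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the two ListItem elements in that sample, the lines should begin "1 of 2:" and "2 of 2:".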
Location Paths
Location paths are used to point to and select portions of an XML document. The syntax of a location path is shown below.
axis::node_test[predicate]
The table below explains the parts of the location path.
Term | Definition |
---|---|
axis | Indicates the relationship between the selected node and the context node. |
node test | Provides the name or class of the nodes to reference. |
predicate | Further filters the nodeset. |
We will examine each part of the location path soon, but first let’s look at some examples, some of which have been taken as is from the W3C documentation; others have been slightly modified.
- child::firstname selects the firstname element children of the context node
- child::* selects all element children of the context node
- child::text() selects all text node children of the context node
- child::node() selects all the children of the context node, whatever their node type
- attribute::name selects the name attribute of the context node
- attribute::* selects all the attributes of the context node
- descendant::firstname selects the firstname element descendants of the context node
- ancestor::name selects all name ancestors of the context node
- ancestor-or-self::div selects the div ancestors of the context node and, if the context node is a div element, the context node as well
- descendant-or-self::para selects the para element descendants of the context node and, if the context node is a para element, the context node as well
- self::para selects the context node if it is a para element, and otherwise selects nothing
- child::chapter/descendant::para selects the para element descendants of the chapter element children of the context node
- child::*/child::para selects all para grandchildren of the context node
- / selects the document root (which is always the parent of the document element)
- /descendant::para selects all the para elements in the same document as the context node
- /descendant::olist/child::item selects all the item elements that have an olist parent and that are in the same document as the context node
- child::para[position()=1] selects the first para child of the context node
- child::para[position()=last()] selects the last para child of the context node
- child::para[position()=last()-1] selects the last but one para child of the context node
- child::para[position()>1] selects all the para children of the context node other than the first para child of the context node
- following-sibling::chapter[position()=1] selects the next chapter sibling of the context node
- preceding-sibling::chapter[position()=1] selects the previous chapter sibling of the context node
- /descendant::figure[position()=42] selects the forty-second figure element in the document
- /child::doc/child::chapter[position()=5]/child::section[position()=2] selects the second section of the fifth chapter of the doc document element
- child::para[attribute::type="warning"] selects all para children of the context node that have a type attribute with value warning
- child::para[attribute::type='warning'][position()=5] selects the fifth para child of the context node that has a type attribute with value warning
- child::para[position()=5][attribute::type="warning"] selects the fifth para child of the context node if that child has a type attribute with value warning
- child::chapter[child::title='Introduction'] selects the chapter children of the context node that have one or more title children with string-value equal to Introduction
- child::chapter[child::title] selects the chapter children of the context node that have one or more title children
- child::*[self::chapter or self::appendix] selects the chapter and appendix children of the context node
- child::*[self::chapter or self::appendix][position()=last()] selects the last chapter or appendix child of the context node
Location paths are either relative (i.e., starting with the context node) or absolute (i.e., starting with the root node of the XML document). A location path is divided into location steps, separated by slashes. Each step consists of an axis followed by a node test. The node test may be followed by predicates, which are used to further specify the nodes to be accessed. In the next sections, we will examine the different portions of these location steps.
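The difference between relative and absolute paths can be sketched as follows, again assuming the BusinessLetter.xml sample shown later in this lesson. Inside the Closing template the context node is the Closing element, so the relative path is resolved from there, while the absolute path is resolved from the root no matter where the template was invoked.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Absolute path of three location steps, each an axis plus a node test -->
    <xsl:apply-templates select="/child::BusinessLetter/child::Foot/child::Closing"/>
  </xsl:template>

  <xsl:template match="Closing">
    <!-- Relative: starts at the context node (Closing) -->
    <xsl:value-of select="child::JobTitle"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Absolute: starts at the root, ignoring the context node -->
    <xsl:value-of select="/child::BusinessLetter/child::Head/child::SendDate"/>
  </xsl:template>
</xsl:stylesheet>
```

For that sample this should print "VP of Operations" followed by "November 29, 2005".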
Node Test
axis::node_test[predicate]
Each location step has a node test, which provides the name or class of the nodes to reference. The processor looks through the nodes at the specified axis and returns a nodeset including all nodes with the name or class specified in the node test. Some examples are shown below:
Example | Description |
---|---|
/ | Indicates the root, which is one level above the document element. |
FirstName | Indicates a FirstName node. Depending on the axis, this could be an element or an attribute. |
text() | Indicates a text node. |
comment() | Indicates a comment node. |
processing-instruction() | Indicates a processing instruction. |
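A rough way to compare node tests is to count what each one selects. The sketch below assumes the BusinessLetter.xml sample shown later in this lesson, including its xml-stylesheet processing instruction, and borrows count() and name() from the functions section purely to make the results visible.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- node(): every child of the root, whatever its type -->
    <xsl:value-of select="count(child::node())"/>
    <xsl:text>&#10;</xsl:text>
    <!-- *: element children only, i.e. the document element -->
    <xsl:value-of select="count(child::*)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- processing-instruction(): reported here by its target name -->
    <xsl:value-of select="name(child::processing-instruction())"/>
    <xsl:text>&#10;</xsl:text>
    <!-- text(): text node children of the last Para, which mixes text with Phone and Email -->
    <xsl:value-of select="count(child::BusinessLetter/child::Body/child::Para[position()=last()]/child::text())"/>
    <xsl:text>&#10;</xsl:text>
    <!-- comment(): that sample contains no comments -->
    <xsl:value-of select="count(descendant::comment())"/>
  </xsl:template>
</xsl:stylesheet>
```

For that sample the five lines should read 2, 1, xml-stylesheet, 3 and 0. Note that the XML declaration on the first line of the file is not a node and is not counted.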
Axis
axis::node_test[predicate]
An axis indicates the relationship between the selected node and the context node. Below is a reference table of the available axes.
Axis | Description |
---|---|
child | children of the context node |
descendant | descendants of the context node |
parent | parent of the context node |
ancestor | ancestors of the context node |
following-sibling | all siblings that follow the context node |
preceding-sibling | all siblings that precede the context node |
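following | all nodes after the context node in document order, excluding its descendants, attributes, and namespace nodes |
preceding | all nodes before the context node in document order, excluding its ancestors, attributes, and namespace nodes |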
attribute | attributes of the context node |
namespace | namespace nodes of the context node |
self | the context node |
descendant-or-self | the context node and all its descendants |
ancestor-or-self | the context node and all its ancestors |
Some location paths using just the axis and the node test are shown below.
Example | Description |
---|---|
child::FirstName | Indicates the FirstName element children of the context node. |
child::* | Indicates all element children of the context node. |
child::text() | Indicates all text node children of the context node. |
child::node() | Indicates all the children of the context node, whatever their node type. Note that attributes are not considered children of elements. |
parent::node() | Indicates the parent of the context node regardless of type. |
parent::* | Indicates the parent of the context node if that parent is an element (the only other possibility is that the parent is the document root). |
parent::Topic | Indicates the parent of the context node if that parent is an element named "Topic". |
attribute::href | Indicates the href attribute of the context node. |
attribute::* | Indicates all the attributes of the context node. |
Example | Description |
---|---|
descendant::FirstName | Indicates the FirstName element descendants of the context node. |
ancestor::Topics | Indicates all Topics ancestors of the context node. |
ancestor-or-self::div | Indicates the div ancestors of the context node and, if the context node is a div element, the context node as well. |
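descendant-or-self::List | Indicates the List element descendants of the context node and, if the context node is a List element, the context node as well. |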
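Several axes can be combined to navigate away from a single starting node. The sketch below assumes the BusinessLetter.xml sample shown later in this lesson and starts from the LastName element in the Foot; the output format (name, job title, recipient) is only an illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:apply-templates
      select="/child::BusinessLetter/child::Foot/child::Closing/child::Name/child::LastName"/>
  </xsl:template>

  <xsl:template match="LastName">
    <!-- preceding-sibling: the FirstName that comes before this LastName -->
    <xsl:value-of select="preceding-sibling::FirstName"/>
    <xsl:text> </xsl:text>
    <!-- self: the context node itself -->
    <xsl:value-of select="self::node()"/>
    <xsl:text>, </xsl:text>
    <!-- parent, then following-sibling: JobTitle follows the parent Name element -->
    <xsl:value-of select="parent::Name/following-sibling::JobTitle"/>
    <xsl:text> (letter to </xsl:text>
    <!-- ancestor: climb to BusinessLetter, then descend into the Head branch -->
    <xsl:value-of select="ancestor::BusinessLetter/child::Head/descendant::LastName"/>
    <xsl:text>)</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For that sample the result should be "Bill Smith, VP of Operations (letter to Lockwood)".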
Predicate
axis::node_test[predicate]
Predicates are used to filter node sets selected in the node test. Predicates are placed in square brackets following a node test. Multiple steps in a location path may have predicates.
Predicates can be relatively simple, such as child::para[position()=1] , which returns the first para child of the context node. They can also be fairly complicated, such as:
/child::doc/child::chapter[position()=5]/child::section[position()=2]
which returns the second section of the fifth chapter of the doc document element.
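The sketch below tries a few predicate styles against the BusinessLetter.xml sample shown later in this lesson (an assumption of this example): a positional predicate, an attribute-existence test, a child-value test chained with a position, and a predicate inside count().

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Positional: the Email inside the last Para of Body -->
    <xsl:value-of select="/child::BusinessLetter/child::Body/child::Para[position()=last()]/child::Email"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Existence: only Name elements that carry a Title attribute -->
    <xsl:value-of select="/descendant::Name[attribute::Title]/child::FirstName"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Chained: Names whose FirstName is Bill, then the first of those -->
    <xsl:value-of select="/descendant::Name[child::FirstName='Bill'][position()=1]/child::LastName"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Value test: how many Address elements have a State of NY -->
    <xsl:value-of select="count(/descendant::Address[child::State='NY'])"/>
  </xsl:template>
</xsl:stylesheet>
```

For that sample the four lines should be bsmith@webucator.com, Joshua, Smith and 2. The order of chained predicates matters: each predicate filters the result of the one before it.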
There are several example XSLT files in XPath/Demos that illustrate how XPath works. Take a moment to run through these examples by transforming XPath/Demos/Beatles.xml against each. They include:
- child.xsl
- childstar.xsl
- childtext.xsl
- attribute.xsl
- attributestar.xsl
- descendant.xsl
- self.xsl
Exercise: Accessing Nodes
In this exercise, you will practice using XPath by modifying the XSLT used to transform the XML document below.
Code Sample: XPath/Exercises/BusinessLetter.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="XPaths.xsl"?>
<BusinessLetter>
  <Head>
    <SendDate>November 29, 2005</SendDate>
    <Recipient>
      <Name Title="Mr.">
        <FirstName>Joshua</FirstName>
        <LastName>Lockwood</LastName>
      </Name>
      <Company>Lockwood &amp; Lockwood</Company>
      <Address>
        <Street>291 Broadway Ave.</Street>
        <City>New York</City>
        <State>NY</State>
        <Zip>10007</Zip>
        <Country>United States</Country>
      </Address>
    </Recipient>
  </Head>
  <Body>
    <List>
      <Heading>
        Along with this letter, I have enclosed the following items:
      </Heading>
      <ListItem>
        two original, execution copies of the Webucator
        Master Services Agreement
      </ListItem>
      <ListItem>
        two original, execution copies of the Webucator Premier Support for
        Developers Services Description between
        Lockwood &amp; Lockwood and Webucator, Inc.
      </ListItem>
    </List>
    <Para>Please sign and return all four original, execution copies to me at
      your earliest convenience. Upon receipt of the executed copies, we will
      immediately return a fully executed, original copy of both agreements
      to you.</Para>
    <Para>
      Please send all four original execution copies to my attention as follows:
      <Person>
        <Name>
          <FirstName>Bill</FirstName>
          <LastName>Smith</LastName>
        </Name>
        <Address>
          <Company>Webucator, Inc.</Company>
          <Street>4933 Jamesville Rd.</Street>
          <City>Jamesville</City>
          <State>NY</State>
          <Zip>13078</Zip>
          <Country>USA</Country>
        </Address>
      </Person>
    </Para>
    <Para>If you have any questions, feel free to call me at
      <Phone>
      x123</Phone> or e-mail me at
      <Email>bsmith@webucator.com</Email>.</Para>
  </Body>
  <Foot>
    <Closing>
      <Name>
        <FirstName>Bill</FirstName>
        <LastName>Smith</LastName>
      </Name>
      <JobTitle>VP of Operations</JobTitle>
    </Closing>
  </Foot>
</BusinessLetter>
Please follow these steps.
- Open XPath/Exercises/XPaths.xsl for editing.
- This file contains three xsl:template elements, one each matching Head, Body, and Foot. In each template is a comment showing what the goal output is. You will use xsl:value-of tags and XPath to create this output from the XML file shown above.
- You may need to use some other XSL tags, such as the xsl:text tag.
- To test your solutions, transform XPath/Exercises/BusinessLetter.xml against XPath/Exercises/XPaths.xsl.
Abbreviated Syntax
The XPath syntax used above is lengthy. Thankfully, there is an abbreviated syntax that is much more commonly used. The table below shows some of these abbreviations.
Abbreviation | Long Form or Meaning |
---|---|
(none) | child:: |
. | self::node() |
.. | parent::node() |
@ | attribute:: |
.// | ./descendant-or-self::node()/ |
// | /descendant-or-self::node()/ |
* | all child elements of the context node |
@* | all attributes of the context node |
[n] | [position() = n] |
The child:: axis is the default axis, so it can go unnamed; hence the "(none)" entry in the first row of the table above.
Long Form | Abbreviated Syntax |
---|---|
child::firstname | firstname |
child::* | * |
child::text() | text() |
attribute::name | @name |
attribute::* | @* |
descendant::firstname | .//firstname |
child::chapter/descendant::para | chapter//para |
child::*/child::para | */para |
/descendant::para | //para |
child::para[position()=1] | para[1] |
child::para[attribute::type="warning"] | para[@type="warning"] |
child::*[self::chapter or self::appendix] | *[self::chapter or self::appendix] |
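One way to convince yourself that the two notations are equivalent is to evaluate them side by side. The sketch below assumes the BusinessLetter.xml sample from the first exercise; each pair of xsl:value-of instructions should print the same value twice.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- child:: is implied and @ stands for attribute:: -->
    <xsl:value-of select="/child::BusinessLetter/child::Head/child::Recipient/child::Name/attribute::Title"/>
    <xsl:text> = </xsl:text>
    <xsl:value-of select="/BusinessLetter/Head/Recipient/Name/@Title"/>
    <xsl:text>&#10;</xsl:text>

    <!-- // stands for /descendant-or-self::node()/ -->
    <xsl:value-of select="count(/descendant-or-self::node()/child::ListItem)"/>
    <xsl:text> = </xsl:text>
    <xsl:value-of select="count(//ListItem)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- .. stands for parent::node() -->
    <xsl:value-of select="name(/descendant-or-self::node()/child::Email/parent::node())"/>
    <xsl:text> = </xsl:text>
    <xsl:value-of select="name(//Email/..)"/>
  </xsl:template>
</xsl:stylesheet>
```

For that sample the three lines should read "Mr. = Mr.", "2 = 2" and "Para = Para".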
Exercise: Accessing Nodes with Abbreviated Syntax
This exercise is identical to the previous exercise except that you will be using the abbreviated syntax of XPath.
Please follow these steps.
- Open XPath/Exercises/XPathsAbbr.xsl for editing.
- This file contains three xsl:template elements, one each matching Head, Body, and Foot. In each template is a comment showing what the goal output is. You will use xsl:value-of tags and XPath to create this output.
- You may need to use some other XSL tags, such as the xsl:text tag.
- To test your solutions, transform XPath/Exercises/BusinessLetter2.xml against XPath/Exercises/XPathsAbbr.xsl.
XPath Functions
Functions are often used within predicates to help identify a node or node set or to find out information about a node or node set. Below are reference tables showing some of the more common core XPath functions.
Function | Description |
---|---|
last() | Returns the number of nodes in the current node set (the context size). |
position() | Returns the position of the context node in the selected node set. |
count() | Takes a node-set (usually written as a location path) as an argument and returns the number of nodes in it. |
id() | Takes an id as an argument and returns the node that has that id. |
Function | Description |
---|---|
starts-with() | Takes a string and substring as arguments. Returns true if the string begins with the substring. Otherwise, returns false. |
contains() | Takes a string and substring as arguments. Returns true if the string contains the substring. Otherwise, returns false. |
substring-before(string, substring ) | Returns the portion of the string to the left of the first occurrence of the substring. |
substring-after(string, substring ) | Returns the portion of the string to the right of the first occurrence of the substring. |
substring() | Takes a string, start position and length as arguments. Returns the substring of length characters beginning with the character at start position. |
string-length() | Takes a string as an argument and returns its length. |
name() | Returns the name of an element. |
text() | Selects the text child nodes of an element. Strictly speaking, text() is a node test rather than a function, but it is often used alongside the string functions. |
Function | Description |
---|---|
boolean() | Takes an object as an argument. Returns true if the object is a number other than zero or NaN, a non-empty node-set, or a string with at least one character. Otherwise, returns false. |
Function | Description |
---|---|
sum() | Takes a node-set as an argument and returns the sum of the nodes' string values, each converted to a number. |
ceiling() | Takes a number as an argument and returns the rounded-up value. |
floor() | Takes a number as an argument and returns the rounded-down value. |
round() | Takes a number as an argument and returns the rounded value. |
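A few of the string and number functions can be tried against the BusinessLetter.xml sample from the first exercise (an assumption of this sketch). The expected values in the comments follow from that sample's content; summing zip codes is of course meaningless and is done here only to show sum() working on a node-set.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- SendDate is "November 29, 2005": text before the first space is "November" -->
    <xsl:value-of select="substring-before(//SendDate, ' ')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Text after the first ", " is "2005" -->
    <xsl:value-of select="substring-after(//SendDate, ', ')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Boolean results are written out as "true" or "false" -->
    <xsl:value-of select="starts-with(//Email, 'bsmith')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- A node-set passed as a string uses its first node: the first Zip is "10007", length 5 -->
    <xsl:value-of select="string-length(//Zip)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- sum() converts every Zip to a number: 10007 + 13078 = 23085 -->
    <xsl:value-of select="sum(//Zip)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- 7 div 2 is 3.5: floor gives 3, ceiling gives 4, round gives 4 -->
    <xsl:value-of select="floor(7 div 2)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="ceiling(7 div 2)"/>
    <xsl:text> </xsl:text>
    <xsl:value-of select="round(7 div 2)"/>
  </xsl:template>
</xsl:stylesheet>
```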
XPath Operators
The table below shows the XPath operators.
Operator | Description |
---|---|
and | Boolean AND |
or | Boolean OR |
= | Equals |
!= | Not equal |
< | Less than |
<= | Less than or equal |
> | Greater than |
>= | Greater than or equal |
+ | Addition |
- | Subtraction |
* | Multiplication |
div | Division |
mod | Modulus |
The sample below shows how some operators and functions are used in practice.
Code Sample: XPath/Demos/BeatlesFunctions.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/">
    <h1>Functions and Operators</h1>

    <h2>count()</h2>
    <code>count(beatles/beatle):</code>
    <b>
      <xsl:value-of select="count(beatles/beatle)"/>
    </b>

    <h2>contains()</h2>
    <code>contains(//beatle[last()]/@link,'webucator'):</code>
    <b>
      <xsl:value-of select="contains(//beatle[last()]/@link,'webucator')"/>
    </b><br/>
    <code>contains(//beatle[last()]/@link,'ringostarr'):</code>
    <b>
      <xsl:value-of select="contains(//beatle[last()]/@link,'ringostarr')"/>
    </b>

    <h2>=</h2>
    <code>beatles/beatle[ @real = 'no' ]//firstname:</code>
    <b>
      <xsl:value-of select="beatles/beatle[ @real = 'no' ]//firstname"/>
    </b>

    <h2>!=</h2>
    <code>beatles/beatle[ @real != 'no' ]//firstname:</code>
    <b>
      <xsl:value-of select="beatles/beatle[ @real != 'no' ]//firstname"/>
    </b>

    <h2>not()</h2>
    <code>beatles/beatle[ not(@real) ][2]//firstname:</code>
    <b>
      <xsl:value-of select="beatles/beatle[ not(@real) ][2]//firstname"/>
    </b>

    <h2>last()</h2>
    <code>beatles/beatle[ not(@real) ][last()]//firstname:</code>
    <b>
      <xsl:value-of select="beatles/beatle[ not(@real) ][last()]//firstname"/>
    </b>

    <h2>not() &amp; =</h2>
    <code>beatles/beatle[ not(@real='no') ][2]//firstname:</code>
    <b>
      <xsl:value-of select="beatles/beatle[ not(@real='no') ][2]//firstname"/>
    </b>

    <h2>not() &amp; = &amp; last()</h2>
    <code>beatles/beatle[ not(@real='no') ][last()]//firstname:</code>
    <b>
      <xsl:value-of select="beatles/beatle[ not(@real='no') ][last()]//firstname"/>
    </b>
  </xsl:template>
</xsl:stylesheet>
When XPath/Demos/BeatlesFunctions.xml, which has the same XML as XSLTBasics/Demos/Beatles.xml, is transformed against XPath/Demos/BeatlesFunctions.xsl and viewed in a browser, the output looks like this: