XPath expressions (II)




Basic XPath patterns

Pattern | Description | Example
/ | at the start of a path: the document root; between two steps: a child of the preceding node (parent/child) | /html/body/div
// | all descendants that match the following step (ancestor//descendant) | /html/body//p
* | any element |
  | example: all children of body | /html/body/*
  | example: all descendants of body | /html/body//*
node() | any child node of the current element (elements, text, comments) | /html/body/node()
text() | matches text nodes | /html/body/div/h1/text()
comment() | matches comment nodes | /html/body/comment()
@attr | the attribute named attr (returns its value) | /html/body/div/@name
@* | all attributes of an element |
  | example: all attributes of all div elements | //div/@*
[expr] | filter condition (predicate) |
  | example: any elements that have at least one attribute | //*[@*]
  | example: any elements that have a class attribute | //*[@class]
  | example: any elements with class = 'description' | //*[@class='description']
  | example: div elements with name = 'div_c' | //div[@name='div_c']
<, > | numeric comparison | //img[@width > 10]
and | both conditions must hold | //div[@name="div_b3" and @class="description"]
or | at least one condition must hold | //div[@name="div_b3" or @class="description"]
not() | negates a condition | //p[not(@name="p1")]
.. | parent node | //h1[text()="First Heading"]/..
. | current node |
  | example (python/lxml): search relative to the current node | .//p
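
The patterns above can be tried out with python/lxml. This is a minimal sketch, not the page's original sample: the HTML document in DOC is hypothetical and only borrows the element and attribute names used in the examples (div_b, div_c, p1, "First Heading"). The variable names are assumptions as well.

```python
from lxml import etree

# Hypothetical sample document, invented for this sketch.
DOC = """<html>
  <body>
    <!-- sample comment -->
    <div name="div_a"><h1>First Heading</h1></div>
    <div name="div_b" class="description">
      <p name="p1">one</p>
      <p name="p2">two</p>
    </div>
    <div name="div_c"><img width="20"/><img width="5"/><p name="p3">three</p></div>
  </body>
</html>"""

root = etree.fromstring(DOC)

divs = root.xpath('/html/body/div')                # direct children only
all_p = root.xpath('/html/body//p')                # descendants at any depth
heading = root.xpath('/html/body/div/h1/text()')   # text nodes come back as strings
comments = root.xpath('/html/body/comment()')      # comment nodes
div_names = root.xpath('/html/body/div/@name')     # attribute values as strings
with_class = root.xpath('//*[@class]')             # elements that have a class attribute
wide_imgs = root.xpath('//img[@width > 10]')       # numeric comparison on an attribute
not_p1 = root.xpath('//p[not(@name="p1")]')        # negated condition
heading_parent = root.xpath('//h1[text()="First Heading"]/..')  # parent of the match

# Relative vs. absolute search from a context node:
div_b = root.xpath('//div[@name="div_b"]')[0]
inside_b = div_b.xpath('.//p')   # only the p elements below div_b
whole_doc = div_b.xpath('//p')   # starts at the document root again

print(len(inside_b), len(whole_doc))
```

Note the last two lines: a path that starts with // is always evaluated from the document root, even when xpath() is called on an inner element, so the leading dot is what makes the search relative.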


XPath axes (node-sets relative to the current node)

Definitions

ancestor
= an element's parent, its parent's parent, and so on up to the root element
descendant
= an element's children, their children, and so on
sibling
= the other child elements of the same parent, in document order (the element itself is excluded)
...-or-self
= the same node set, plus the current node itself

Axis name | Result
ancestor | Selects all ancestors (parent, grandparent, etc.) of the current node
ancestor-or-self | Selects all ancestors (parent, grandparent, etc.) of the current node and the current node itself
attribute | Selects all attributes of the current node
child | Selects all children of the current node
descendant | Selects all descendants (children, grandchildren, etc.) of the current node
descendant-or-self | Selects all descendants (children, grandchildren, etc.) of the current node and the current node itself
following | Selects everything in the document after the closing tag of the current node
following-sibling | Selects all siblings after the current node
namespace | Selects all namespace nodes of the current node
parent | Selects the parent of the current node
preceding | Selects all nodes that appear before the current node in the document, except ancestors, attribute nodes and namespace nodes
preceding-sibling | Selects all siblings before the current node
self | Selects the current node
Source: W3Schools
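
To see what each axis returns, the sketch below evaluates several axes relative to one context node with python/lxml. The document in DOC is a hypothetical stand-in (not the page's original sample), and the helper name axis() is an assumption introduced for illustration.

```python
from lxml import etree

# Hypothetical sample document, invented for this sketch.
DOC = """<html>
  <body>
    <div name="div_a"><h1>First Heading</h1></div>
    <div name="div_b"><p name="p1">one</p><p name="p2">two</p></div>
    <div name="div_c"><p name="p3">three</p></div>
  </body>
</html>"""

root = etree.fromstring(DOC)
div_b = root.xpath('//div[@name="div_b"]')[0]  # the context ("current") node


def axis(expr):
    """Evaluate expr relative to div_b; describe each hit as (tag, name attribute)."""
    return [(node.tag, node.get('name')) for node in div_b.xpath(expr)]


ancestors = axis('ancestor::*')                   # html and body
ancestors_or_self = axis('ancestor-or-self::*')   # html, body and div_b itself
children = axis('child::*')                       # p1 and p2
descendants_or_self = axis('descendant-or-self::*')  # div_b, p1 and p2
next_siblings = axis('following-sibling::*')      # div_c
prev_siblings = axis('preceding-sibling::*')      # div_a
following = axis('following::*')                  # div_c and its p3
preceding = axis('preceding::*')                  # div_a and its h1, but no ancestors

# The attribute axis yields attribute values (strings), not elements.
attributes = div_b.xpath('attribute::*')

for label, result in [('ancestor', ancestors), ('following', following),
                      ('preceding', preceding)]:
    print(label, result)
```

The preceding axis leaves out html and body because they are ancestors of div_b, whereas following and preceding both include nodes nested inside the neighbouring div elements.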

Axis name | Abbreviation | Example | Equivalent example
self:: | . | self::*//p | .//p
parent:: | .. | //div[@name="div_c"]/parent::* | //div[@name="div_c"]/..
child:: | none needed (child is the default axis, so the axis name can be omitted) | /html/body/child::div | /html/body/div
ancestor:: | - | //div[@name="div_b"]/ancestor::* | -
ancestor-or-self:: | - | //div[@name="div_b"]/ancestor-or-self::* | -
descendant:: | no direct abbreviation (// is short for /descendant-or-self::node()/, so X//* selects the same nodes as X/descendant::*) | //div[@name="div_b"]/descendant::* | /html/body/div[@name="div_b"]//*
descendant-or-self:: | - | //div[@name="div_b"]/descendant-or-self::* | -
attribute:: | @ | //div[attribute::name="div_c"] | //div[@name='div_c']
following-sibling:: | - | //div[@name="div_b"]/following-sibling::* | -
preceding-sibling:: | - | //div[@name="div_b"]/preceding-sibling::* | -
following:: | - | //div[@name="div_b"]/following::* | -
preceding:: | - | //div[@name="div_b"]/preceding::* | -
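
One way to check the long and abbreviated forms against each other is to run both with python/lxml and compare the results. This is a sketch under assumptions: the document in DOC is hypothetical (not the page's original sample) and the helper same_result() is a name introduced here.

```python
from lxml import etree

# Hypothetical sample document, invented for this sketch.
DOC = """<html>
  <body>
    <div name="div_a"><h1>First Heading</h1></div>
    <div name="div_b"><p name="p1">one</p><p name="p2">two</p></div>
    <div name="div_c"><p name="p3">three</p></div>
  </body>
</html>"""

root = etree.fromstring(DOC)
div_b = root.xpath('//div[@name="div_b"]')[0]


def same_result(context, long_form, short_form):
    """Return True if both expressions select the same non-empty list of nodes."""
    long_nodes = context.xpath(long_form)
    short_nodes = context.xpath(short_form)
    return len(long_nodes) > 0 and long_nodes == short_nodes


checks = {
    'self': same_result(div_b, 'self::*//p', './/p'),
    'parent': same_result(root, '//div[@name="div_c"]/parent::*',
                          '//div[@name="div_c"]/..'),
    'child': same_result(root, '/html/body/child::div', '/html/body/div'),
    'descendant': same_result(root, '//div[@name="div_b"]/descendant::*',
                              '/html/body/div[@name="div_b"]//*'),
    'attribute': same_result(root, '//div[attribute::name="div_c"]',
                             "//div[@name='div_c']"),
}

for axis_name, ok in checks.items():
    print(axis_name, 'equivalent' if ok else 'different')
```

The comparison relies on lxml handing back the same Python object for the same element while a reference to it is still held, which is why plain list equality is enough here.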