XPath Axes

XPath is a query language for navigating XML documents. Its 13 axes let you select nodes in an XML or HTML document by their position relative to a starting node, and each axis defines a different direction in which XPath searches the node tree. In this article, we look at each axis, how it is used, and how it can simplify XML parsing.

Understanding XPath Axes

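XPath axes are used to navigate through XML or HTML documents in different directions. They allow you to select nodes based on their relationship to the context node, which is the node currently being evaluated. A step is written as axisname::nodetest, for example child::td. The Syntax lines in the sections below prefix each step with //, which evaluates the step from every node in the document. Let’s explore the 13 XPath axes and see how they work.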

XPath Axis Name	Description
self	Contains only the context node.
ancestor	Contains the ancestors of the context node, such as the parent node, its parent, and so on.
ancestor-or-self	Contains both the context node and its ancestors.
attribute	Contains all the attribute nodes of the context node, if any.
child	Contains the children of the context node.
descendant	Contains the children of the context node, and the children of those children, and so on.
descendant-or-self	Contains the context node and all of its descendants.
following	Contains all nodes that occur after the context node in document order, excluding its descendants and any attribute or namespace nodes.
following-sibling	Contains all siblings that come after the context node.
namespace	Contains all the namespace nodes of the context node, if any.
parent	Contains the parent node of the context node, if it has one.
preceding	Contains all nodes that appear before the context node in document order, excluding its ancestors and any attribute or namespace nodes.
preceding-sibling	Contains all preceding siblings of the context node.

1. Self Axis

The self axis refers to the context node itself. It helps to select the node you are currently working with. This axis can be used when you want to stay focused on the node without navigating to any other.

Syntax:

//self::node()

This axis is most useful inside predicates, where it tests the context node itself. It is usually written with the abbreviation . (a single dot), which is short for self::node().
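As a rough illustration of the self axis in a predicate, the sketch below uses Python with the third-party lxml library (an assumption; any XPath 1.0 processor should behave the same way). The sample markup and variable names are made up for this example.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical sample document, used only for illustration.
doc = etree.fromstring("<page><h1>Title</h1><p>Intro</p><h2>Section</h2></page>")

# self:: inside a predicate keeps a node only if the node itself passes the test.
headings = doc.xpath("//*[self::h1 or self::h2]")
heading_tags = [el.tag for el in headings]
print(heading_tags)
```

Here the predicate filters the elements matched by //* down to those that are themselves h1 or h2.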

2. Ancestor Axis

The ancestor axis contains all ancestor nodes of the context node. These are the parent, grandparent, and so on, up to the root node. It helps to go up the tree structure and find parent elements.

Syntax:

//ancestor::node()

This axis is crucial when you need to locate elements higher in the document hierarchy.
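The following sketch shows the ancestor axis walking up from a nested element. It assumes Python with the lxml library installed; the markup and ids are hypothetical.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical markup, for illustration only.
doc = etree.fromstring(
    "<html><body><div id='outer'><div id='inner'>"
    "<input id='email'/></div></div></body></html>"
)

# Every element ancestor of the input: both divs, body, and html.
ancestors = doc.xpath("//input[@id='email']/ancestor::*")
ancestor_tags = {el.tag for el in ancestors}

# ancestor:: is a reverse axis, so [1] picks the nearest matching ancestor.
nearest_div = doc.xpath("//input[@id='email']/ancestor::div[1]")[0]
print(sorted(ancestor_tags), nearest_div.get("id"))
```

Because positions on a reverse axis count backward from the context node, ancestor::div[1] is the innermost div rather than the outermost one.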

3. Ancestor-or-Self Axis

The ancestor-or-self axis is similar to the ancestor axis but also includes the context node itself. It’s useful when you want to include the current node in the search results.

Syntax:

//ancestor-or-self::node()

With this axis, you can retrieve both the context node and its ancestors, allowing for a broader selection range.
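One common use of ancestor-or-self is looking up an inherited setting: check the node itself first, then its ancestors. The sketch below assumes Python with lxml installed; the document and the effective_lang helper are hypothetical.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical document: one paragraph overrides the language, one inherits it.
doc = etree.fromstring(
    "<section lang='en'><article>"
    "<p lang='fr'>Bonjour</p><p>Hello</p>"
    "</article></section>"
)

def effective_lang(element):
    """Return the lang of the element itself or, failing that, its nearest ancestor."""
    # [@lang] keeps nodes that carry the attribute; [1] is the nearest one
    # because ancestor-or-self is a reverse axis.
    return element.xpath("string(ancestor-or-self::*[@lang][1]/@lang)")

langs = [effective_lang(p) for p in doc.xpath("//p")]
print(langs)
```

With the plain ancestor axis, the first paragraph would report the value inherited from section instead of its own lang attribute.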

4. Attribute Axis

The attribute axis allows you to select all attribute nodes of the context node. It helps in extracting specific attributes of elements, like id, class, or other attributes.

Syntax:

//attribute::node()

This axis is useful for reading attribute values such as id or class. The abbreviation @ stands for attribute::, so @id is the same as attribute::id.
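A minimal sketch of the attribute axis, assuming Python with lxml installed (lxml returns attribute nodes as string values). The form markup is hypothetical.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical markup, for illustration only.
doc = etree.fromstring("<form><input id='email' type='text' class='field'/></form>")

# attribute::* selects every attribute of the input element.
all_values = doc.xpath("//input/attribute::*")

# The long form and the @ abbreviation select the same attribute.
long_form = doc.xpath("//input/attribute::id")
short_form = doc.xpath("//input/@id")
print(sorted(all_values), long_form, short_form)
```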

5. Child Axis

The child axis defines all child nodes of the context node. This axis is commonly used to extract nested elements within an element.

Syntax:

//child::node()

This is the default axis and doesn’t need to be explicitly specified. However, you can always use child:: when you want to make the axis clear.
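To see that child is the default axis, the sketch below runs the same path with and without child::. It assumes Python with lxml installed, and the table markup is hypothetical.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical markup, for illustration only.
doc = etree.fromstring(
    "<table><tr><td>A</td><td>B</td></tr><tr><td>C</td></tr></table>"
)

# Explicit child:: steps and the abbreviated form select the same cells.
explicit = [td.text for td in doc.xpath("/table/child::tr/child::td")]
implicit = [td.text for td in doc.xpath("/table/tr/td")]
print(explicit, implicit)
```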

6. Descendant Axis

The descendant axis refers to all the child nodes of the context node, including their children, grandchildren, and so on. It helps you get all descendant nodes in a tree structure.

Syntax:

//descendant::node()

You would use this axis when you want to capture all nested elements below the context node.
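The difference between child and descendant shows up with nested structures. This sketch assumes Python with lxml installed; the nested list is hypothetical.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical nested list; the root element <ul> is the context node.
doc = etree.fromstring("<ul><li>One<ul><li>Two</li></ul></li><li>Three</li></ul>")

# child:: stops at the first level below the context node.
children = [li.text for li in doc.xpath("child::li")]

# descendant:: keeps going through every nested level.
descendants = [li.text for li in doc.xpath("descendant::li")]
print(children, descendants)
```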

7. Descendant-or-Self Axis

The descendant-or-self axis is a combination of the descendant axis and the context node itself. It retrieves the context node and its entire descendant tree.

Syntax:

//descendant-or-self::node()

This is useful when you want to include the context node along with its entire subtree. The abbreviation // is short for /descendant-or-self::node()/.
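The sketch below contrasts descendant-or-self with descendant when the context node itself matches the node test. It assumes Python with lxml installed; the markup is hypothetical.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical markup; the outer <div> is the context node.
doc = etree.fromstring("<div id='a'><div id='b'><span/></div></div>")

# The context node is a div, so descendant-or-self includes it.
with_self = doc.xpath("descendant-or-self::div/@id")

# The plain descendant axis starts one level down.
without_self = doc.xpath("descendant::div/@id")
print(with_self, without_self)
```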

8. Following Axis

The following axis contains all nodes that come after the context node in document order, excluding descendants of the context node as well as attribute and namespace nodes. The result is not limited to siblings of the context node.

Syntax:

//following::node()

It helps in selecting elements that occur after the current node.
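A small sketch of what the following axis includes and leaves out, assuming Python with lxml installed. The element names are placeholders chosen for this example.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical tree: <b> is the context node and has one descendant, <c>.
doc = etree.fromstring("<root><a><b id='ctx'><c/></b><d/></a><e><f/></e></root>")

# Elements after <b> in document order; its descendant <c> is not included,
# and neither are its ancestors <a> and <root>.
after = doc.xpath("//b[@id='ctx']/following::*")
after_tags = [el.tag for el in after]
print(after_tags)
```

Note that e and f are returned even though they are not siblings of the context node.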

9. Following-Sibling Axis

The following-sibling axis selects all sibling nodes that come after the context node. This axis is useful when you need to find elements that are at the same hierarchical level as the context node.

Syntax:

//following-sibling::node()

It is ideal for working with nodes that share the same parent.
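As a sketch of following-sibling, the example below uses a row of form controls that share one parent. It assumes Python with lxml installed; the markup and ids are hypothetical.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical form: four controls with the same parent.
doc = etree.fromstring(
    "<form><select id='day'/><select id='month'/>"
    "<select id='year'/><button id='go'/></form>"
)

# Every later sibling of the month select, whatever its element name.
all_after = doc.xpath("//select[@id='month']/following-sibling::*/@id")

# Only later siblings that are select elements.
selects_after = doc.xpath("//select[@id='month']/following-sibling::select/@id")
print(all_after, selects_after)
```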

10. Namespace Axis

The namespace axis selects all namespace nodes associated with the context node. This axis is often used when dealing with XML documents that use namespaces.

Syntax:

//namespace::node()

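It helps to isolate and work with namespaces, especially in XML documents. The axis is empty unless the context node is an element, and it is deprecated as of XPath 2.0, so some processors do not support it.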

11. Parent Axis

The parent axis contains the immediate parent of the context node. If the context node is the root node, this axis will be empty.

Syntax:

//parent::node()

This is used when you need to move up one level in the document hierarchy. The abbreviation .. is short for parent::node().
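The sketch below shows the parent axis in its long and abbreviated forms, plus two cases where it returns nothing. It assumes Python with lxml installed; the markup is hypothetical.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical markup, for illustration only.
doc = etree.fromstring("<form><div class='row'><input id='email'/></div></form>")

# Long form and the .. abbreviation both move up exactly one level.
parent = doc.xpath("//input[@id='email']/parent::*")[0]
same_parent = doc.xpath("//input[@id='email']/..")[0]

# A name test that the parent does not satisfy gives an empty result:
# the form is the grandparent here, not the parent.
no_match = doc.xpath("//input[@id='email']/parent::form")

# The document element has no element parent.
root_parent = doc.xpath("/*/parent::*")
print(parent.tag, same_parent.tag, no_match, root_parent)
```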

12. Preceding Axis

The preceding axis contains all nodes that occur before the context node in document order, excluding its ancestors as well as attribute and namespace nodes. It’s helpful when you want to get nodes that come before a certain element.

Syntax:

//preceding::node()

This axis allows you to search for elements that appear earlier in the document.
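A small sketch of what the preceding axis includes and leaves out, assuming Python with lxml installed. The element names are placeholders chosen for this example.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical tree: <c> is the context node; <b> and <root> are its ancestors.
doc = etree.fromstring("<root><a><x/></a><b><c id='ctx'/></b></root>")

# Elements before <c> in document order; its ancestors are not included.
before = doc.xpath("//c[@id='ctx']/preceding::*")
before_tags = {el.tag for el in before}

# preceding:: is a reverse axis, so [1] is the closest preceding element.
nearest = doc.xpath("//c[@id='ctx']/preceding::*[1]")[0]
print(sorted(before_tags), nearest.tag)
```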

13. Preceding-Sibling Axis

The preceding-sibling axis selects all siblings of the context node that come before it in the document. It’s used to find elements that share the same parent but appear earlier in the document.

Syntax:

//preceding-sibling::node()

This axis helps when you need to go backward in the document and find earlier siblings.
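As a sketch of preceding-sibling, the example below looks backward along a row of form controls. It assumes Python with lxml installed; the markup and ids are hypothetical.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical form: three selects with the same parent.
doc = etree.fromstring(
    "<form><select id='day'/><select id='month'/><select id='year'/></form>"
)

# Every earlier select sibling of the year select.
all_before = set(doc.xpath("//select[@id='year']/preceding-sibling::select/@id"))

# On a reverse axis, [1] is the sibling immediately before the context node.
nearest_before = doc.xpath("//select[@id='year']/preceding-sibling::select[1]/@id")
print(sorted(all_before), nearest_before)
```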

XPath Axes in Action

XPath axes are powerful tools for navigating through XML and HTML documents. When combined with specific queries, they enable you to extract the exact elements you need. Let’s look at some examples of how you can use these axes to select nodes efficiently.

Child Axis Example

To select every td element under a table body except the first td of each parent element:

//table/tbody//child::*/child::td[position()>1]

The position()>1 predicate skips the first td child of each parent, so only the second and later cells of each row are returned.
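To check what the expression above returns, the sketch below runs it against a small table. It assumes Python with lxml installed; the table contents are made up for this example.

```python
from lxml import etree  # assumption: lxml is installed (pip install lxml)

# Hypothetical table with two rows of different lengths.
doc = etree.fromstring(
    "<table><tbody>"
    "<tr><td>1</td><td>2</td><td>3</td></tr>"
    "<tr><td>4</td><td>5</td></tr>"
    "</tbody></table>"
)

# position() is counted among the td children of each parent separately.
cells = doc.xpath("//table/tbody//child::*/child::td[position()>1]")
cell_texts = [td.text for td in cells]
print(cell_texts)
```

The first cell of each row (1 and 4) is skipped because position() restarts for every tr.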

Parent Axis Example

To select the parent of an element with id='email':

//input[@id='email']/parent::*

This selects the parent node of the input element.

Following Axis Example

To select every element that follows an input element with id='email':

//input[@id='email']/following::*

This selects all elements that come after the input element in document order, wherever they sit in the tree.

Following-Sibling Axis Example

To select the select elements that are later siblings of the element with id='month':

//select[@id='month']/following-sibling::select

This selects every select sibling that appears after the element with id='month'.

Preceding Axis Example

To select the tr elements that precede an input element with id='pass':

//input[@id='pass']/preceding::tr

This selects every tr element that ends before the input element begins. A tr that contains the input is an ancestor, so it is not included.

Preceding-Sibling Axis Example

To select the select elements that are earlier siblings of the element with id='day':

//select[@id='day']/preceding-sibling::select

This selects every select sibling that appears before the element with id='day'.

XPath axes are an essential tool for navigating XML and HTML documents. By understanding and leveraging the 13 different axes, you can efficiently locate and select the exact nodes you need. Whether you’re working with web scraping, data extraction, or XML parsing, mastering XPath axes will make your work much easier and more effective.
