XPath is a specification for addressing parts of an XML document.
The XML Path Language (XPath) was originally developed as part of a draft of XSL but was eventually extracted into its own standard. Its primary purpose is to address parts of an XML document.
- XPath uses a compact, non-XML syntax to facilitate use of XPath within URIs and XML attribute values.
- XPath models an XML document as a tree of nodes.
- XPath defines a way to compute a string-value for each type of node.
2. XPath Concepts
- XPath emphasizes the hierarchical relationships among the nodes of the tree.
- An XPath expression is essentially a path to a particular set of nodes in the tree or a value extracted from the XML document.
- XPath expressions are always evaluated in the context of a particular node called the context node. The context node represents the starting point of an XPath query.
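The role of the context node can be sketched with Python's standard xml.etree.ElementTree module, which supports a limited subset of XPath. The sample document and the helper name titles_from are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
library = ET.fromstring(
    "<library>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title><author>Y</author></book>"
    "</library>"
)


def titles_from(context_node):
    """Evaluate the relative path 'book/title' starting at context_node."""
    return [title.text for title in context_node.findall("book/title")]


# The same expression gives different results for different context nodes.
from_library = titles_from(library)        # context node: <library>
from_first_book = titles_from(library[0])  # context node: the first <book>

print(from_library)     # ['A', 'B']
print(from_first_book)  # []
```

The second call returns nothing because a book element has no book children, so the relative path matches no nodes from that starting point.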
3. Pattern, Path, and Expression
- The pattern : ‘/book/title’
- The path : ‘preceding::title’
- The expression : count(/book/author) > 5
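The expression form can be approximated in Python. As far as I know, xml.etree.ElementTree implements only a subset of XPath and offers neither the count() function nor the preceding axis, so the sketch below emulates count(/book/author) > 5 by counting matches of a relative path from the document element. The sample document and the helper name count_exceeds are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document whose document element is <book>.
book = ET.fromstring(
    "<book><title>T</title>"
    "<author>A1</author><author>A2</author><author>A3</author>"
    "</book>"
)


def count_exceeds(document_element, child_tag, threshold):
    """Emulate count(/<root>/<child_tag>) > threshold.

    document_element stands in for the node selected by the first step.
    """
    return len(document_element.findall(child_tag)) > threshold


more_than_five = count_exceeds(book, "author", 5)
more_than_two = count_exceeds(book, "author", 2)

print(more_than_five)  # False
print(more_than_two)   # True
```

A full XPath engine would evaluate the expression directly and return a boolean; the emulation only shows that an expression yields a value rather than a set of nodes.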
4. Data Model
XPath operates on an XML document as a tree. The tree contains nodes. There are seven types of node:
- root node
- element nodes
- text nodes
- attribute nodes
- namespace nodes
- processing instruction (PI) nodes
- comment nodes
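Most of these node types can be observed by walking a parsed document. The sketch below uses the standard xml.dom.minidom module, which implements the DOM rather than the XPath data model, so it is only an approximation: the DOM Document node stands in for the root node, and namespace nodes have no DOM counterpart. The sample document and the helper name node_kinds are assumptions.

```python
from xml.dom import Node, minidom

# Hypothetical sample document containing several kinds of node.
SOURCE = '<?pi data?><book id="b1"><!-- note --><title>XPath</title></book>'

# Map DOM node-type constants to the XPath names used in the list above.
NAMES = {
    Node.DOCUMENT_NODE: "root",
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing instruction",
}


def node_kinds(node, found=None):
    """Collect the kind of every node, visiting the tree in document order."""
    found = [] if found is None else found
    found.append(NAMES.get(node.nodeType, "other"))
    if node.nodeType == Node.ELEMENT_NODE:
        # Attributes are not children, so they are counted separately.
        found.extend("attribute" for _ in range(node.attributes.length))
    for child in node.childNodes:
        node_kinds(child, found)
    return found


kinds = node_kinds(minidom.parseString(SOURCE))
print(kinds)
```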
5. Root Node
The root node is the root of the tree. The element node for the document element is a child of the root node.
6. Element Nodes
There is an element node for every element in the document. The children of an element node are the element nodes, comment nodes, processing instruction nodes and text nodes for its content.
An element node may have a unique identifier (ID). This is the value of the attribute that is declared in the DTD as type ID.
7. Attribute Nodes
Each element node has an associated set of attribute nodes; the element is the parent of each of these attribute nodes; however, an attribute node is not a child of its parent element. This is different from the DOM, which does not treat the element bearing an attribute as the parent of the attribute.
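The DOM side of this comparison can be checked with the standard xml.dom.minidom module. This is a sketch of DOM behaviour only, since minidom does not implement the XPath data model; the sample document is hypothetical.

```python
from xml.dom import minidom

# Hypothetical sample document with one attribute on the document element.
doc = minidom.parseString('<book id="b1"><title>XPath</title></book>')
book = doc.documentElement
id_attr = book.getAttributeNode("id")

# In the DOM the attribute is not among the element's children...
attr_in_children = any(child is id_attr for child in book.childNodes)
# ...and it has no parent; the element is reachable only as ownerElement.
dom_parent = id_attr.parentNode
owner = id_attr.ownerElement

print(attr_in_children)  # False
print(dom_parent)        # None
print(owner is book)     # True
```

In the XPath data model the first observation still holds, but the parent of the attribute node would be the book element rather than nothing.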
8. Namespace Nodes
Each element has an associated set of namespace nodes, one for each distinct namespace prefix that is in scope for the element (including the xml prefix, which is implicitly declared by the XML Namespaces Recommendation) and one for the default namespace if one is in scope for the element.
9. Processing Instruction Nodes
There is a processing instruction node for every processing instruction, except for any processing instruction that occurs within the document type declaration. The XML declaration is not a processing instruction. Therefore, there is no processing instruction node corresponding to the XML declaration.
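The difference between the XML declaration and a processing instruction can be observed with the standard xml.dom.minidom module. This is a minimal sketch on a hypothetical document; the stylesheet name style.xsl is an assumption.

```python
from xml.dom import Node, minidom

# Hypothetical document: an XML declaration followed by one real
# processing instruction and an empty document element.
SOURCE = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet href="style.xsl" type="text/xsl"?>'
    "<book/>"
)

doc = minidom.parseString(SOURCE)

# Collect the processing instruction nodes that sit beside the document element.
pi_nodes = [
    node for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]
pi_targets = [node.target for node in pi_nodes]

print(pi_targets)  # ['xml-stylesheet']
```

Only the xml-stylesheet instruction appears; the parser consumes the XML declaration without producing a node for it.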
10. Comment Nodes
There is a comment node for every comment, except for any comment that occurs within the document type declaration.
11. Text Nodes
Character data is grouped into text nodes. Characters inside comments, processing instructions and attribute values do not produce text nodes.
12. String values of Nodes
For every type of node, there is a way of determining a string-value.
- root nodes : the concatenation of the string-values of all text node descendants of the root node in document order
- element nodes : the concatenation of the string-values of all text node descendants of the element node in document order
- text nodes : the character data
- attribute nodes : the normalized value of the attribute
- namespace nodes : the namespace URI
- processing instruction nodes : the part of the processing instruction following the target and any whitespace
- comment nodes : the content of the comment
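The rule for element nodes can be sketched with the standard xml.etree.ElementTree module, whose itertext() method yields the text inside an element in document order. The helper name string_value and the sample document are assumptions; the default parser drops comments, so they contribute nothing to the result here.

```python
import xml.etree.ElementTree as ET


def string_value(element):
    """Concatenate all descendant text of an element in document order."""
    return "".join(element.itertext())


# Hypothetical sample document with nested markup and a comment.
book = ET.fromstring(
    "<book>"
    "<title>XPath <em>in</em> brief</title>"
    "<!-- draft -->"
    "<author>Lee</author>"
    "</book>"
)

title_value = string_value(book.find("title"))
book_value = string_value(book)

print(title_value)  # XPath in brief
print(book_value)   # XPath in briefLee
```

Note that no separator is inserted between the title text and the author text, which matches the definition as a plain concatenation.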