This package contains the JAST 1.1 XPath Search Engine, © Anthony J H Simons, 2010-2015. This software is currently on experimental alpha release, and is offered as-is, under a free experimental license (see full terms below).

Java Abstract Syntax Trees, v1.1

If you are seeking to use any of the above software, please refer to the brief instructions immediately below, and to the documentation on the JAST website for more details.

Licensing Terms

This alpha-release software is free to use by academic and commercial users. The terms of the license are that you are free to use the software in any product (whether free or commercial), provided that any usage is acknowledged by citing "©Anthony J H Simons" as the copyright holder and referring to the JAST website "http://staffwww.dcs.shef.ac.uk/people/A.Simons/jast/" as the source. While this alpha license is perpetual and not subject to any restriction, we reserve the right to change the licensing terms of subsequent releases. The software is offered as-is, without any implied warranty for fitness of purpose. Please refer to the JAST website for further details.

The XPath Search Engine

The following assumes that you, the developer, wish to build a third-party application, which incorporates the JAST 1.1 default XML processing tools. The components for reading and writing XML files, and for manipulating in-memory XML trees using the default memory model, are to be found in another package org.jast.xml. The components for conducting XPath searches in XML memory-trees are to be found here, in this package org.jast.xpath. The basic components for filtering XML memory-trees are to be found in another package org.jast.filter.

The W3C XPath Search Protocol

The World-Wide Web Consortium (W3C) specifies a standard for conducting searches in XML trees (whether in memory or on disk) called XPath. This standard specifies the syntax of a language for constructing pattern expressions, which denote paths through an XML tree. The pattern expressions are matched against an XML tree and select one or more nodes (if the match succeeds) that match the pattern. The result may be a set of elements, of attributes, or of other kinds of content, according to the pattern.

The full XPath pattern language syntax is quite rich, including specifiers for which axis to explore, what kinds of node to select and what predicates to apply to the nodes. The full syntax includes a large subset of the functions normally found in a programming language. The W3C also defines an abbreviated syntax, which focuses on matching elements and attributes by their names and values. Examples of this more convenient abbreviated syntax include the following:

.                              selects the current node, known as the context
..                             selects the parent node of the context node (a relative path)
/.                             selects the document node containing the context node (an absolute path, starting from the document root)
//.                            selects all nodes in the current document (an absolute pattern starting from the document root)
Film                           selects the Film children of the context node (a relative path)
Film/Director                  selects the Director children of the Film children of the context node (a relative path)
/Catalogue/Film                selects the Film children of the Catalogue root element of the document node (an absolute path)
@year                          selects the year attribute of the context node (a relative path)
Film/@year                     selects the year attributes of the Film children of the context node (a relative path)
//Film                         selects all Film descendants of the document node (an absolute path)
Catalogue//Title               selects all Title descendants of the Catalogue child of the context node (a relative path)
Film[@year=1976]               selects all Film children of the context node, whose year attribute has the value 1976
Film[Director='George Lucas']  selects all Film children of the context node, whose Director child node has the value "George Lucas"
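
The abbreviated patterns above can be tried out against any conforming XPath 1.0 engine. The following is a minimal sketch that uses the standard javax.xml.xpath API shipped with the JDK, not the JAST classes. The Catalogue data, the class name AbbreviatedSyntaxDemo and its helper methods parse and select are invented for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AbbreviatedSyntaxDemo {

    // Hypothetical sample data, invented for this illustration.
    static final String XML =
          "<Catalogue>"
        + "<Film year='1977'><Title>Star Wars</Title><Director>George Lucas</Director></Film>"
        + "<Film year='1976'><Title>Taxi Driver</Title><Director>Martin Scorsese</Director></Film>"
        + "<Film year='1973'><Title>American Graffiti</Title><Director>George Lucas</Director></Film>"
        + "</Catalogue>";

    /** Parses an XML string into a W3C DOM document (not a JAST memory-tree). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Evaluates a pattern against a context node; returns the text of each selected node. */
    static List<String> select(Object context, String pattern) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(pattern, context, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Object catalogue = doc.getDocumentElement();  // context node for relative paths

        // Absolute paths start from the document root.
        System.out.println(select(doc, "/Catalogue/Film/Title"));
        System.out.println(select(doc, "//Director"));

        // Relative paths start from the context node (here, the Catalogue element).
        System.out.println(select(catalogue, "Film/@year"));
        System.out.println(select(catalogue, "Film[@year=1976]/Title"));
        System.out.println(select(catalogue, "Film[Director='George Lucas']/Title"));
    }
}
```

The same pattern gives different results depending on the context: Film/@year only finds attributes when the context node is the Catalogue element, whereas /Catalogue/Film/@year works from any node in the document.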

The JAST XPath Search Engine

The main class of interest is XPath, which represents a compiled XPath pattern and also contains the search engine for matching a pattern against an XML memory-tree. JAST only supports matching XPaths against memory-trees; it does not support matching against XML files on disk. An XPath object is created using the constructor XPath(String), supplying the XPath pattern string as the argument. Behind the scenes, this invokes a parser called XPathReader, which compiles the pattern string into a sequence of navigation rules and filters, known as XPathRules. When you perform an XPath search, these rules are applied, one at a time, to the current context, each yielding a new context. The matching process is top-down and breadth-first, and finishes when the rules are exhausted or when no further nodes match the pattern.
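
The rule-by-rule matching described above can be pictured with a small sketch. This is not JAST's implementation: the Node, Rule, child and match names below are hypothetical stand-ins, and only a simple child step is modelled. It shows how each rule turns the whole current context into a new one before the next rule runs, stopping early when nothing matches:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

class RulePipelineSketch {

    /** Hypothetical minimal element node; a stand-in, not JAST's Content class. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            this.children.addAll(Arrays.asList(kids));
        }
    }

    /** A rule maps one context node to the nodes it contributes to the next context. */
    interface Rule extends Function<Node, List<Node>> {}

    /** Child step: selects the children of a node that have the given name. */
    static Rule child(String name) {
        return node -> {
            List<Node> selected = new ArrayList<>();
            for (Node c : node.children) {
                if (c.name.equals(name)) {
                    selected.add(c);
                }
            }
            return selected;
        };
    }

    /** Applies the rules one at a time; the whole context advances together (breadth-first). */
    static List<Node> match(List<Rule> rules, List<Node> context) {
        for (Rule rule : rules) {
            List<Node> next = new ArrayList<>();
            for (Node n : context) {
                next.addAll(rule.apply(n));
            }
            context = next;
            if (context.isEmpty()) {
                break;  // no further nodes match, so later rules cannot succeed
            }
        }
        return context;
    }

    public static void main(String[] args) {
        Node document = new Node("#document",
            new Node("Catalogue",
                new Node("Film", new Node("Title")),
                new Node("Film", new Node("Title")),
                new Node("Book")));

        // Corresponds to the pattern /Catalogue/Film, starting from the document node.
        List<Rule> rules = Arrays.asList(child("Catalogue"), child("Film"));
        List<Node> films = match(rules, Collections.singletonList(document));
        System.out.println(films.size() + " Film nodes matched");
    }
}
```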

To initiate an XPath search, a program need only construct an XPath instance and invoke one of its match() methods on a single node, or list of nodes, which serve as the starting context for the match. For example:

	XPath findFilms = new XPath("/Catalogue/Film");
	List<Content> films = findFilms.match(document);

	XPath findYears = new XPath("@year");
	List<Content> years = findYears.match(films);

	XPath findText = new XPath("//Director/text()");
	List<Content> names = findText.match(document);

The first XPath searches for all Film elements under the root element Catalogue of the document, and returns a list of nodes. These become the starting context for the second XPath search, which returns all year attributes of the Films returned by the first search. The third example shows how to return a list of Text nodes storing the values of all the Director elements anywhere in the document. If the same XPath search pattern is needed many times, it is most efficient to create the XPath object once and re-use it; otherwise the XPathReader will be invoked again each time to recompile the pattern string.

Developers need not be concerned with the various XPathRule classes, but it may help to know that these rules use exactly the same kinds of Filter as those supplied in the org.jast.filter package. In general, a rule advances one step in the XPath search and then applies one or more filters to the resulting context, such as testing its element-type, its name, or applying a constraint to its value, or its attribute's value, or its child's value. Only those nodes passing the filters are passed on to the next cycle.

XPath Features Supported by JAST

The JAST implementation of XPath supports the W3C abbreviated syntax for most simple XPath patterns. It supports searching along the self, parent, child, attribute and descendant-or-self axes. It supports absolute paths starting from the root and relative paths starting from the context node. It supports predicates testing the self-axis (by value), the attribute-axis (by name and by value) and the child-axis (by name and by value); and supports position selection at the positions n, last() and last()-n, where n is an integer. Predicates on values may use any of the six usual comparison operators (=, !=, <, <=, > and >=). JAST supports the node(), text(), comment() and processing-instruction() content selectors in addition to the default child selector. The wildcard * may be given in place of a whole attribute or element name, but may not be used as part of a name.
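
Several of the supported forms listed above can be illustrated as follows. This sketch runs the patterns through the JDK's standard javax.xml.xpath engine rather than JAST, so it shows what the patterns mean under the W3C rules, not how JAST evaluates them. The sample data and the PredicateDemo class with its select helper are invented for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PredicateDemo {

    // Hypothetical sample data, invented for this illustration.
    static final String XML =
          "<Catalogue>"
        + "<Film year='1977'><Title>Star Wars</Title><Director>George Lucas</Director></Film>"
        + "<Film year='1976'><Title>Taxi Driver</Title><Director>Martin Scorsese</Director></Film>"
        + "<Film year='1973'><Title>American Graffiti</Title><Director>George Lucas</Director></Film>"
        + "</Catalogue>";

    /** Evaluates a pattern from the document node; returns the text of each selected node. */
    static List<String> select(String pattern) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(pattern, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] patterns = {
            "/Catalogue/Film[1]/Title",          // position n (counted from 1)
            "/Catalogue/Film[last()]/Title",     // last position
            "/Catalogue/Film[last()-1]/Title",   // offset from the last position
            "/Catalogue/Film[@year>=1976]/Title",// comparison on an attribute value
            "/Catalogue/Film[@year!=1976]/Title",
            "//Director/text()",                 // text() content selector
            "/Catalogue/*/@*"                    // wildcards for whole names
        };
        for (String pattern : patterns) {
            System.out.println(pattern + " -> " + select(pattern));
        }
    }
}
```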

The JAST implementation does not support the W3C full syntax for XPath. It does not yet support the preceding, following, preceding-sibling, following-sibling, ancestor, ancestor-or-self or (just) descendant axes, which have no short form in the W3C abbreviated syntax. It only supports predicates defined on the context node, its immediate children or attributes (not on general nested XPaths). It does not support the position()<n predicate. It does not yet support alternative paths using |, or predicates combined using the explicit "and" and "or" operators or the not() function (but a sequence of predicates is implicitly conjoined). Arbitrary arithmetical and string functions are not supported. These restrictions were chosen to ensure optimum efficiency for the majority of cases where XPath searches on a memory-document are appropriate.
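
Because a sequence of predicates is implicitly conjoined, a condition that would otherwise need an explicit "and" can be written as consecutive predicates. The sketch below checks this equivalence under the W3C rules using the JDK's javax.xml.xpath engine (which accepts both forms), not JAST. The sample data and the ChainedPredicateDemo class with its titles helper are invented for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChainedPredicateDemo {

    // Hypothetical sample data, invented for this illustration.
    static final String XML =
          "<Catalogue>"
        + "<Film year='1977'><Title>Star Wars</Title><Director>George Lucas</Director></Film>"
        + "<Film year='1976'><Title>Taxi Driver</Title><Director>Martin Scorsese</Director></Film>"
        + "<Film year='1973'><Title>American Graffiti</Title><Director>George Lucas</Director></Film>"
        + "</Catalogue>";

    /** Returns the Title text of every Film selected by the given Film pattern. */
    static List<String> titles(String filmPattern) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(filmPattern + "/Title", doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Explicit conjunction: listed above as not yet supported by JAST.
        List<String> explicit =
            titles("/Catalogue/Film[@year>1975 and Director='George Lucas']");

        // Consecutive predicates: each one filters the result of the previous one.
        List<String> chained =
            titles("/Catalogue/Film[@year>1975][Director='George Lucas']");

        System.out.println(explicit);
        System.out.println(chained);
        System.out.println("same result: " + explicit.equals(chained));
    }
}
```

The equivalence holds for value tests such as these. It should not be assumed when a positional predicate is part of the sequence, because each predicate renumbers the nodes that survive the one before it.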