This package contains the JAST 1.1 XML Filter Kit, © Anthony J H Simons, 2010-2015. This software is currently on experimental alpha release, and is offered as-is, under a free experimental license (see full terms below).
org.jast.ast contains tools for mapping XML files to user-defined Java syntax trees, and vice versa.
org.jast.xml contains tools for mapping XML files to JAST's standard XML memory model, and vice versa.
org.jast.xpath contains an XPath search engine for use with the standard XML model.
org.jast.dtd contains a document validation engine for use with the standard XML model.
org.jast.filter contains filters for searching and validating the standard XML model.
This alpha-release software is free to use by academic and commercial users. The terms of the license are that you are free to use the software in any product (whether free or commercial), provided that any usage is acknowledged by citing "©Anthony J H Simons" as the copyright holder and referring to the JAST website "http://staffwww.dcs.shef.ac.uk/people/A.Simons/jast/" as the source. While this alpha license is perpetual and not subject to any restriction, we reserve the right to change the licensing terms of subsequent releases. The software is offered as-is, without any implied warranty for fitness of purpose. Please refer to the JAST website for further details:
The following assumes that you, the developer, wish to build a third-party
application that incorporates the JAST 1.1 default XML processing tools.
The components for reading and writing XML files, and for manipulating
in-memory XML trees using the default memory model, are found in a separate
package, org.jast.xml. The components for filtering XML memory trees are
found here, in this package, org.jast.filter. The components for conducting
XPath searches in XML memory trees are found in a separate package,
org.jast.xpath.
Filters can be used to select subsets of nodes from an XML memory tree.
Every XML memory tree is rooted in some node that conforms to the base type
Content. This class provides an API for retrieving all the children, or
subsets of the children, of a node. Relevant methods in the API of Content
include:
    public List<Content> getContents();           // all contents
    public List<Content> getContents(Filter);     // a filtered subset
    public Content getContent(int);               // node at index
    public Content getContent(Filter, int);       // filtered at index
where the kind of Filter supplied determines what types of node are
returned. Filtering may return nodes of one or more specific types, or
with a given name, or constraint on the value, or any combination of these.
The Content API also includes methods for removing single or multiple
children:
    public List<Content> removeContents(Filter);  // remove a subset
    public Content removeContent(Content);        // remove a node
    public Content removeContent(int);            // remove at index
Removing a node, or list of nodes, returns the node, or node list, that was
removed. Removed nodes no longer have a parent node, so may be reattached
elsewhere, if desired.
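
The retrieval and removal behaviour described above can be sketched as follows. This is a minimal illustration only: SketchNode, SketchFilter and ContentApiSketch are hypothetical stand-ins written for this example, not the JAST classes, and returning null for an out-of-range index is an assumption rather than documented JAST behaviour.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins, NOT the JAST classes: a minimal node and filter
// that mimic the shape of the Content methods listed above.
interface SketchFilter {
    boolean accept(SketchNode node);
}

class SketchNode {
    final String name;
    SketchNode parent;                       // null when detached
    private final List<SketchNode> contents = new ArrayList<>();

    SketchNode(String name) { this.name = name; }

    // Attach a child and return this node, so calls can be chained.
    SketchNode add(SketchNode child) {
        child.parent = this;
        contents.add(child);
        return this;
    }

    // All contents (a copy, so callers cannot alter the tree directly).
    List<SketchNode> getContents() { return new ArrayList<>(contents); }

    // The filtered subset, in document order.
    List<SketchNode> getContents(SketchFilter filter) {
        List<SketchNode> result = new ArrayList<>();
        for (SketchNode node : contents) {
            if (filter.accept(node)) result.add(node);
        }
        return result;
    }

    // The index-th node among those accepted by the filter, or null
    // (the null result is an assumption made for this sketch).
    SketchNode getContent(SketchFilter filter, int index) {
        List<SketchNode> subset = getContents(filter);
        return index >= 0 && index < subset.size() ? subset.get(index) : null;
    }

    // Remove and return the filtered subset; removed nodes lose their parent.
    List<SketchNode> removeContents(SketchFilter filter) {
        List<SketchNode> removed = getContents(filter);
        for (SketchNode node : removed) {
            contents.remove(node);
            node.parent = null;
        }
        return removed;
    }
}

class ContentApiSketch {
    static SketchNode sampleTree() {
        return new SketchNode("People")
            .add(new SketchNode("Person"))
            .add(new SketchNode("Note"))
            .add(new SketchNode("Person"));
    }

    public static void main(String[] args) {
        SketchNode root = sampleTree();
        SketchFilter persons = node -> node.name.equals("Person");
        System.out.println(root.getContents(persons).size());   // 2
        List<SketchNode> removed = root.removeContents(persons);
        System.out.println(removed.get(0).parent == null);      // true
        System.out.println(root.getContents().size());          // 1
    }
}
```

Because the removed nodes are detached, they could be added to a different parent afterwards, which matches the reattachment described above.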
Filters can be used freely by the programmer. They provide an efficient
Java implementation of node-subset selection, and are used internally
within the JAST software. For example, the methods of the Element class
that return named children are encoded using NameFilter objects to select
contents that are nodes of the type Element and have the desired name. The
following calls are nearly identical:
    List<Content> contents = element.getContents(
        new NameFilter("Person", Content.ELEMENT));
    List<Element> children = element.getChildren("Person");
except that the second expression returns a list of Element, rather than a
list of Content, for convenience.
The main API class to use is Filter and its descendants. All Filter
subclasses provide a method:
    public boolean accept(Content);   // does the filter accept the node?
which returns true if the particular filter accepts the node and false
otherwise. This method is implemented differently in each subclass of
Filter, according to the test that the filter is seeking to apply.
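
The idea that each subclass supplies its own accept() test can be sketched as below. Item, ItemFilter, ItemNameFilter, ItemValueFilter and AcceptSketch are hypothetical names invented for this illustration; they only mirror the pattern and are not the JAST Filter hierarchy.

```java
// Hypothetical stand-ins, NOT the JAST classes.
class Item {
    final String name;
    final String value;

    Item(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

// The root of the sketch hierarchy declares the single test method.
abstract class ItemFilter {
    // Does the filter accept the node?
    abstract boolean accept(Item item);
}

// Accepts items whose name equals the reference name.
class ItemNameFilter extends ItemFilter {
    private final String name;

    ItemNameFilter(String name) { this.name = name; }

    @Override
    boolean accept(Item item) { return name.equals(item.name); }
}

// Accepts items whose value equals the reference value.
class ItemValueFilter extends ItemFilter {
    private final String value;

    ItemValueFilter(String value) { this.value = value; }

    @Override
    boolean accept(Item item) { return value.equals(item.value); }
}

class AcceptSketch {
    public static void main(String[] args) {
        Item age = new Item("Age", "42");
        System.out.println(new ItemNameFilter("Age").accept(age));   // true
        System.out.println(new ItemValueFilter("7").accept(age));    // false
    }
}
```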
Filters may be used individually, or in combination, using methods that
logically combine filters in the most efficient way. The most commonly
used filters are:
NodeFilter - filters for general nodes of a specified content-type, or set of content-types.
NameFilter - filters for markup nodes with a specified name, optionally restricting the content-type.
PrefixFilter - filters for markup nodes with a specified namespace prefix, optionally of a restricted content-type.
ValueFilter - filters for value nodes with a specified value, optionally restricting the content-type.
RangeFilter - filters for value nodes containing a value from a specified enumerated range of values.
TypeFilter - filters for value nodes whose value conforms to a specified XSD or DTD type.
RestrictFilter - filters for value nodes whose value obeys a specified equality or inequality restriction.
WidthFilter - filters for value nodes whose text field width obeys a specified inequality restriction.
ChildFilter - filters for general nodes whose children satisfy a specified filter pattern.
PropertyFilter - filters for general nodes whose attributes satisfy a specified filter pattern.
NodeFilter uses a bitmask pattern to select nodes whose content-type
matches the pattern. Comparison uses fast bitwise arithmetic on
content-type identifiers. NameFilter compiles the name pattern to decide
whether to match against a short name, or a full identifier including the
XML namespace prefix. All symbols are interned and compared using basic
equality for speed. ValueFilter compiles a comparison operator and a
reference value into a closure that may be applied to Element or Attribute
values, which are converted to the correct types before comparison. The
closure may be applied efficiently, multiple times, to many nodes.
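
The two speed techniques mentioned above, bitwise content-type tests and comparison of interned names, can be sketched as follows. The bit values and the names BitmaskSketch, typeMatches and nameMatches are assumptions made for this example; the actual JAST constants and internals may differ.

```java
// Hypothetical sketch, NOT the JAST source: the bit values and names
// below are assumptions chosen only for illustration.
class BitmaskSketch {
    // One bit per content-type, so a set of types fits in a single int.
    static final int ELEMENT = 1;
    static final int ATTRIBUTE = 1 << 1;
    static final int TEXT = 1 << 2;
    static final int COMMENT = 1 << 3;

    // True if the node's content-type is one of the types in the mask.
    static boolean typeMatches(int mask, int nodeType) {
        return (mask & nodeType) != 0;
    }

    // Interned strings with equal text are the same object, so a name
    // test can use reference equality instead of String.equals().
    // Both arguments must already be interned.
    static boolean nameMatches(String internedPattern, String internedName) {
        return internedPattern == internedName;
    }

    public static void main(String[] args) {
        int markup = ELEMENT | ATTRIBUTE;
        System.out.println(typeMatches(markup, ATTRIBUTE));   // true
        System.out.println(typeMatches(markup, TEXT));        // false

        String pattern = "Person".intern();
        String parsed = new String("Person").intern();
        System.out.println(nameMatches(pattern, parsed));     // true
    }
}
```

The reference comparison is only safe when every name passes through intern() first, which is why the sketch interns both the pattern and the parsed name.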
Filters may be combined with other filters using logical combination
methods that return the most suitable filter. The root class Filter
declares the following API, which is suitably implemented in each
descendant:
    public Filter and(Filter other);   // Satisfy this and other
    public Filter or(Filter other);    // Satisfy this or other
    public Filter not();               // Satisfy the complement
Sometimes, these methods construct a compound filter; but frequently they
merge the constraints of the pair of filters, returning a more efficient
filter than any manually created compound filter.
For example, whereas and() will by default return the AndFilter compound
filter, when any filter is combined with a NodeFilter, this simply results
in a specialised version of the same filter, accepting a more restricted
type.
Similarly, whereas not() will by default return a NotFilter, asking for
the complement of the latter will return the original filter. Similar
rules apply De Morgan's laws over compound filters to simplify the levels
of nesting, and also merge child and attribute constraints when combining
multiple ChildFilter or PropertyFilter filters.
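
Two of the simplifications described above, merging a content-type restriction into an existing filter and cancelling a double complement, can be sketched as below. Rule, TypeRule, NameRule, AndRule, NotRule and CombineSketch are hypothetical classes written for this illustration, with assumed bit values; the sketch does not cover or() or the De Morgan rewrites, and it is not the JAST implementation.

```java
// Hypothetical sketch classes, NOT the JAST implementation.
abstract class Rule {
    static final int ELEMENT = 1, ATTRIBUTE = 2, TEXT = 4;  // assumed bit values
    static final int ANY = ELEMENT | ATTRIBUTE | TEXT;

    abstract boolean accept(int type, String nodeName);

    // Defaults: build a compound rule. Subclasses override to simplify.
    Rule and(Rule other) { return new AndRule(this, other); }
    Rule not() { return new NotRule(this); }
}

// Accepts any node whose content-type is in the mask.
class TypeRule extends Rule {
    final int mask;
    TypeRule(int mask) { this.mask = mask; }

    @Override
    boolean accept(int type, String nodeName) { return (mask & type) != 0; }

    @Override
    Rule and(Rule other) {
        // Two type rules merge into one rule over the intersected mask.
        if (other instanceof TypeRule) return new TypeRule(mask & ((TypeRule) other).mask);
        // Let a name rule absorb this type restriction.
        if (other instanceof NameRule) return other.and(this);
        return super.and(other);
    }
}

// Accepts nodes with the given name and a content-type in the mask.
class NameRule extends Rule {
    final String name;
    final int mask;
    NameRule(String name, int mask) { this.name = name; this.mask = mask; }

    @Override
    boolean accept(int type, String nodeName) {
        return (mask & type) != 0 && name.equals(nodeName);
    }

    @Override
    Rule and(Rule other) {
        // Combining with a type rule yields the same kind of rule,
        // restricted to fewer types, instead of a compound AndRule.
        if (other instanceof TypeRule) return new NameRule(name, mask & ((TypeRule) other).mask);
        return super.and(other);
    }
}

class AndRule extends Rule {
    final Rule left, right;
    AndRule(Rule left, Rule right) { this.left = left; this.right = right; }

    @Override
    boolean accept(int type, String nodeName) {
        return left.accept(type, nodeName) && right.accept(type, nodeName);
    }
}

class NotRule extends Rule {
    final Rule inner;
    NotRule(Rule inner) { this.inner = inner; }

    @Override
    boolean accept(int type, String nodeName) { return !inner.accept(type, nodeName); }

    // The complement of a complement is the original rule.
    @Override
    Rule not() { return inner; }
}

class CombineSketch {
    public static void main(String[] args) {
        Rule merged = new NameRule("Person", Rule.ANY).and(new TypeRule(Rule.ELEMENT));
        System.out.println(merged instanceof NameRule);               // true
        System.out.println(merged.accept(Rule.ATTRIBUTE, "Person"));  // false

        Rule text = new TypeRule(Rule.TEXT);
        System.out.println(text.not().not() == text);                 // true
    }
}
```

Overriding and() and not() in each subclass keeps the simplification local to the class that knows how to merge its own constraints, while the base class supplies the compound fallback.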