The JAST User Guide
The Java Abstract Syntax Trees toolkit
provides a complete Java library for processing XML files. The various
components support reading XML files into DOM-trees, writing DOM-trees
to XML files, scanning large XML files to build Java structures on
demand, marshalling bespoke Java models to serial XML, and unmarshalling
XML to bespoke Java models consisting of the programmer's own classes.
Further components exist for checking document validity according to DTD
or XSD specifications, and for searching or filtering XML DOM-trees
according to the XPath abbreviated syntax. The most frequently used
readers and writers may be found in the top-level package:
uk.ac.sheffield.jast.
A number of demonstration programs are supplied in the package:
uk.ac.sheffield.jast.test; these require the XML files found
in the unzipped download bundle. Please refer to the
Download Guide.
Parsing Document Object Models
The standard XML toolkit allows Java programs to read and write XML
files to and from a standard Document Object Model (DOM), a syntax tree that
corresponds exactly to the hierarchical structure of the XML document.
The nodes of the DOM-tree have obvious Java class names, such as
Declaration, Instruction, Element,
Attribute, Text, Data and
Comment, inspired by the W3C XML specification. The top-level
package: uk.ac.sheffield.jast contains XMLReader
for reading XML files into DOM-trees, and XMLWriter for writing
DOM-trees as XML files. Please refer to the
DOM Processing Guide.
Once read into memory, the DOM-tree is returned as an instance of the
type Document, from which the root Element may
be extracted. The root Element and all of its descendant nodes
may be inspected and manipulated using methods of the relevant nodes.
The APIs of all the nodes used for building a DOM-tree are
described in the package: uk.ac.sheffield.jast.xml. By default,
XML documents are only checked for well-formedness. It is also possible to
validate a document against a Document Type Definition (DTD), or alternatively
against an XML Schema Definition (XSD). Tools for doing this are provided
in the package: uk.ac.sheffield.jast.valid; validation can
also be triggered automatically when reading with XMLReader.
Please refer to the
XML Validation Guide.
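The read, inspect and write cycle described above can be sketched with the DOM classes built into the JDK (javax.xml.parsers, org.w3c.dom and javax.xml.transform). These stand in for JAST's XMLReader, Document and XMLWriter, whose method names are not given in this overview, so no JAST signatures are assumed. The class name DomRoundTrip and the sample XML are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// JDK-based stand-in for the XMLReader / XMLWriter pair; not the JAST API.
class DomRoundTrip {
    /** Parses XML text into a DOM-tree (well-formedness check only). */
    static Document read(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Serialises a DOM-tree back to XML text, without the XML declaration. */
    static String write(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = read("<catalogue><film title=\"Alien\"/></catalogue>");
        Element root = doc.getDocumentElement();   // extract the root element
        root.setAttribute("year", "1979");         // manipulate a node in place
        System.out.println(root.getTagName());
        System.out.println(write(doc));
    }
}
```

In JAST the same three steps would presumably go through XMLReader, the Document type and XMLWriter; the DOM Processing Guide is the place to check the actual calls.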
Marshalling Bespoke Java Models
The custom AST toolkit allows Java programs to marshal a custom Abstract
Syntax Tree (AST) to a serialised XML file and to unmarshal the XML file
back to an in-memory AST. The nodes of the AST are provided as custom
Java classes designed by the programmer. The Java AST model may be a simple
tree, or a cyclic and re-entrant graph of arbitrarily-connected Java objects.
Marshalling will write such structures to serial XML files without duplication,
writing references when an object is encountered more than once. The top-level
package:
uk.ac.sheffield.jast contains ASTReader for
unmarshalling XML files into Java ASTs, and ASTWriter for
marshalling ASTs as XML files. Please refer to the
Java Binding Guide.
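The idea of writing a reference when an object is met a second time can be illustrated with a minimal sketch. It is not JAST's implementation: the Node class, the id and ref attribute names and the numbering scheme are all assumptions made for illustration. An identity-based map records which objects have already been written, so that a cyclic graph terminates and no object is serialised twice.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: shows reference-writing for re-entrant graphs, not the JAST format.
class GraphMarshalSketch {
    /** Hypothetical model class: a named node linked to other nodes. */
    static class Node {
        final String name;
        final List<Node> links = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Writes the graph reachable from start, emitting each object in full only once. */
    static String marshal(Node start) {
        StringBuilder out = new StringBuilder();
        write(start, new IdentityHashMap<>(), out);
        return out.toString();
    }

    private static void write(Node node, Map<Node, Integer> seen, StringBuilder out) {
        Integer id = seen.get(node);
        if (id != null) {
            // Already written: emit a reference instead of a duplicate.
            out.append("<Node ref=\"").append(id).append("\"/>");
            return;
        }
        id = seen.size() + 1;          // assumed numbering scheme: order of first visit
        seen.put(node, id);
        // Names are not XML-escaped here; a real writer would need to escape them.
        out.append("<Node id=\"").append(id)
           .append("\" name=\"").append(node.name).append("\">");
        for (Node next : node.links) {
            write(next, seen, out);
        }
        out.append("</Node>");
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.links.add(b);
        b.links.add(a);                // cycle: b points back to a
        System.out.println(marshal(a));
    }
}
```

An identity map is used rather than an equals-based one because two distinct objects with equal contents should still be written as two separate elements.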
The AST node classes supplied by the programmer are designed according to simple
API conventions, rather like Java Beans, and do not require complicated Java
annotations or XML mapping files to support conversion to and from XML. Data is
stored in these classes and accessed using the usual strongly-typed setter- and
getter-methods familiar to the programmer; these methods are automatically discovered
by the marshalling framework through Java reflection. All textual data is converted
into suitable strongly-typed values, before being stored in the programmer's own
classes. By way of example, a collection of AST classes for modelling a film
catalogue is provided in the package: uk.ac.sheffield.jast.ast.
Please refer to the
Java Binding Guide.
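How getter- and setter-methods can be discovered by reflection, and how text can be converted to a strongly-typed value before it is stored, might look roughly as follows. This is a sketch under stated assumptions, not the JAST marshalling framework: the Film class is a hypothetical stand-in for a programmer-supplied model class, and only String and int properties are handled.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Illustrative reflection sketch; the real framework's conventions may differ.
class BeanMarshalSketch {
    /** Hypothetical model class following bean-like naming conventions. */
    public static class Film {
        private String title;
        private int year;
        public String getTitle() { return title; }
        public void setTitle(String title) { this.title = title; }
        public int getYear() { return year; }
        public void setYear(int year) { this.year = year; }
    }

    /** Reads every zero-argument getXxx() property into a name-to-text map. */
    static Map<String, String> read(Object bean) throws Exception {
        Map<String, String> props = new TreeMap<>();
        for (Method m : bean.getClass().getMethods()) {
            String name = m.getName();
            if (name.startsWith("get") && name.length() > 3
                    && m.getParameterCount() == 0 && !name.equals("getClass")) {
                props.put(name.substring(3), String.valueOf(m.invoke(bean)));
            }
        }
        return props;
    }

    /** Finds setXxx(value) and converts the text to the setter's parameter type. */
    static void write(Object bean, String property, String text) throws Exception {
        for (Method m : bean.getClass().getMethods()) {
            if (m.getName().equals("set" + property) && m.getParameterCount() == 1) {
                Class<?> type = m.getParameterTypes()[0];
                // Only int and String are converted in this sketch.
                Object value = (type == int.class) ? (Object) Integer.valueOf(text) : text;
                m.invoke(bean, value);
                return;
            }
        }
        throw new NoSuchMethodException("set" + property);
    }

    public static void main(String[] args) throws Exception {
        Film film = new Film();
        write(film, "Title", "Alien");
        write(film, "Year", "1979");   // text is converted to int before storing
        System.out.println(read(film));
    }
}
```

Because the methods are found by name and signature alone, the model class needs no annotations or mapping files, which is the property the paragraph above describes.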
Scanning Very Large XML Streams
The Simple API for XML (SAX) allows Java programs to scan very large XML
files and perform programmer-defined building-actions when specific XML events are
encountered. This is a suitable strategy when the XML files to be processed
are simply too large to hold in memory (although JAST outclasses all other
DOM-tree builders in its tree-storing capacity). The components for building
are provided in the package: uk.ac.sheffield.jast.build. The programmer
must supply the scanning XMLParser with a builder-class that conforms
to the Builder interface, and which defines how to respond to
specific events scanned by the streaming XMLParser. Please refer
to the SAX Builder Guide.
The programmer's builder-class may be provided more quickly and simply by inheriting
from BasicBuilder, which defines empty responses to each event.
The structures created by a builder are arbitrary, depending on how the programmer
defines the methods of the builder-class. There is no corresponding method
for inserting the extracted data back into the XML file. However, two builders
are provided, called XMLBuilder and ASTBuilder, which
mimic exactly the behaviour of XMLReader and ASTReader,
and whose data can be serialised using the corresponding writers. Please refer
to the SAX Builder Guide.
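The pattern of inheriting empty event responses and overriding only the events of interest can be illustrated with the JDK's own SAX classes. Here org.xml.sax.helpers.DefaultHandler plays the role that BasicBuilder plays in JAST, and javax.xml.parsers.SAXParser plays the role of the scanning XMLParser; the JAST Builder interface itself is not shown in this overview, so none of its methods are assumed. The TitleCollector class and the film and title names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// JDK SAX analogue of a builder-class; not the JAST Builder API.
class TitleCollector extends DefaultHandler {
    /** The structure built on demand: just a list of film titles. */
    final List<String> titles = new ArrayList<>();

    // Only this event is overridden; every other event keeps its empty default response.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (qName.equals("film")) {
            titles.add(atts.getValue("title"));
        }
    }

    /** Streams through the XML text once, never holding a whole tree in memory. */
    static List<String> scan(String xml) throws Exception {
        TitleCollector handler = new TitleCollector();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.titles;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalogue><film title=\"Alien\"/><film title=\"Brazil\"/></catalogue>";
        System.out.println(scan(xml));
    }
}
```

The handler keeps only the data it needs, which is what makes this approach suitable for files too large to load as a DOM-tree.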
Searching and Filtering XML Documents
The XPath search engine is an accompaniment to the standard XML toolkit. It
implements a subset of the W3C XPath standard, supporting the abbreviated
syntax for XPath searching. A search query is represented by an
XPath object, which compiles the query-string into a collection of
rules that filter and navigate through the XML memory tree. A search is initiated
by matching an XPath against a starting point in a DOM-tree, and
returns a list of matching nodes. The XPath search engine is provided in the
package: uk.ac.sheffield.jast.xpath. Please refer to the
XPath Query Guide.
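Compiling a query-string and matching it against a starting node can be sketched with the JDK's javax.xml.xpath package, which stands in here for JAST's own XPath class (its constructor and matching method are not given in this overview, so they are not assumed). The queries use only the abbreviated syntax, such as // for descendants and @ for attributes. The class name XPathSketch and the sample catalogue are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// JDK XPath analogue; not the uk.ac.sheffield.jast.xpath API.
class XPathSketch {
    /** Compiles the query, matches it from the root element, returns matched node values. */
    static List<String> match(String xml, String query) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPathExpression compiled = XPathFactory.newInstance().newXPath().compile(query);
        NodeList hits = (NodeList) compiled.evaluate(
                doc.getDocumentElement(), XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            values.add(hits.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalogue>"
                + "<film title=\"Alien\" year=\"1979\"/>"
                + "<film title=\"Brazil\" year=\"1985\"/>"
                + "</catalogue>";
        // Abbreviated syntax: child step, attribute predicate, attribute step.
        System.out.println(match(xml, "film[@year='1979']/@title"));
        // Abbreviated syntax: // selects descendants at any depth.
        System.out.println(match(xml, "//film/@title"));
    }
}
```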
The XML tree filtering kit is an accompaniment to the standard XML
toolkit. It provides a useful set of Filter classes that
may be used to construct arbitrarily complex criteria for matching
different kinds of XML node in a DOM-tree. Different filters can
test the content-type, the name, the value, the attributes or the
children of different kinds of node. The tree filtering kit is
also an integral component of the XPath search engine and the document
validation engine. The XML filtering kit is provided in the package:
uk.ac.sheffield.jast.filter. Please refer to the
XML Filter Guide.
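The idea of combining simple filters into more complex matching criteria can be illustrated with java.util.function.Predicate over JDK DOM nodes. This is only an analogy for JAST's Filter classes, whose names and methods are not listed in this overview; the helper names isElement, named and hasAttribute are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Predicate-based analogue of composable node filters; not the JAST Filter API.
class FilterSketch {
    /** Tests the content-type of a node. */
    static Predicate<Node> isElement() {
        return n -> n.getNodeType() == Node.ELEMENT_NODE;
    }

    /** Tests the name of a node. */
    static Predicate<Node> named(String name) {
        return n -> name.equals(n.getNodeName());
    }

    /** Tests one attribute of an element node. */
    static Predicate<Node> hasAttribute(String attr, String value) {
        return n -> n instanceof Element && value.equals(((Element) n).getAttribute(attr));
    }

    /** Walks the tree below start (inclusive) and collects every node the filter accepts. */
    static List<Node> select(Node start, Predicate<Node> filter) {
        List<Node> out = new ArrayList<>();
        collect(start, filter, out);
        return out;
    }

    private static void collect(Node node, Predicate<Node> filter, List<Node> out) {
        if (filter.test(node)) {
            out.add(node);
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), filter, out);
        }
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<catalogue>"
                + "<film title=\"Alien\" year=\"1979\"/>"
                + "<film title=\"Brazil\" year=\"1985\"/>"
                + "</catalogue>");
        // Compose three simple filters into one criterion.
        Predicate<Node> filter = isElement().and(named("film")).and(hasAttribute("year", "1979"));
        for (Node hit : select(doc, filter)) {
            System.out.println(((Element) hit).getAttribute("title"));
        }
    }
}
```

Small single-purpose tests joined with and, or and negate mirror the way the guide describes building arbitrarily complex criteria from a set of basic filters.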