public class XMLReader extends BasicReader
uk.ac.sheffield.jast.xml
XMLReader uses the low-level token scanner XMLScanner to supply it with integer tokens, members of the Tokens class, that represent different recognised XML events. It consumes these events and, where events are associated with segmented text, it consumes the associated text. The resulting model is returned as an instance of Document, which by default is compact (discarding formatting whitespace surrounding XML elements). XMLReader may be requested to preserve full native format (keeping all original layout text as extra Text nodes).
By default, XMLReader will return a well-formed XML Document, or report that the XML syntax was faulty. XMLReader may be requested to validate the Document, to ensure that it conforms to some grammar defined by a Doctype, or an XML schema, so long as the Document refers to these. It may also be instructed to expand new entity references as part of a Doctype. XMLReader implements the Closeable interface by virtue of inheriting from BasicReader.
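The difference between the compact default and the preserved native format can be pictured with a small stand-alone sketch. It does not use JAST at all: the class `LayoutSketch`, the method `keepNodes` and the representation of nodes as plain strings are assumptions made for illustration, and the rule applied here (drop any whitespace-only text between elements) is a simplification of whatever XMLReader actually does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LayoutSketch {

    /**
     * Hypothetical helper, not part of JAST. Keeps every node when
     * preserveLayout is true; otherwise drops whitespace-only nodes,
     * which stand in for layout text between elements.
     */
    public static List<String> keepNodes(List<String> nodes, boolean preserveLayout) {
        List<String> kept = new ArrayList<>();
        for (String node : nodes) {
            boolean isLayout = node.trim().isEmpty();
            if (preserveLayout || !isLayout) {
                kept.add(node);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // A flattened view of: <a>\n  <b>text</b>\n</a>
        List<String> nodes = Arrays.asList(
                "<a>", "\n  ", "<b>", "text", "</b>", "\n", "</a>");

        System.out.println(keepNodes(nodes, false).size()); // prints 5 (compact)
        System.out.println(keepNodes(nodes, true).size());  // prints 7 (layout kept)
    }
}
```

In the real reader the equivalent switch is preserveLayout(boolean), called before readDocument().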
Modifier and Type | Field and Description |
---|---|
protected Document | document: The top-level Document constructed by this reader. |
private boolean | hasDoctype: Flag indicating whether the Document has a Doctype node. |
private boolean | hasElement: Flag indicating whether the Document has a root Element. |
private boolean | rawLayout: Flag indicating whether to preserve the native format of the XML. |

Inherited fields: lastToken, lexicon, scanner, validation

Constructor and Description |
---|
XMLReader(java.io.File file): Creates an XMLReader reading from a file, using the default UTF-8 character encoding. |
XMLReader(java.io.File file, java.lang.String encoding): Creates an XMLReader reading from a file, using the specified character encoding. |
XMLReader(java.io.InputStream stream, java.lang.String encoding): Creates an XMLReader reading from a basic input stream, using the specified character encoding. |
XMLReader(java.io.Reader reader, java.lang.String encoding): Creates an XMLReader reading from a character reader, using the specified character encoding. |
XMLReader(java.net.URL url): Creates an XMLReader reading from a URL, using the default ISO-8859-1 (Latin-1) character encoding specified for the MIME-type text/xml. |
XMLReader(java.net.URL url, java.lang.String encoding): Creates an XMLReader reading from a URL, using the specified character encoding. |

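The file and URL constructors default to different encodings (UTF-8 and ISO-8859-1 respectively), so the same bytes can decode to different text depending on which constructor is used. The sketch below shows the effect with JDK classes only. It assumes, without confirmation from the JAST source, that the reader decodes its input much as a `java.io.InputStreamReader` would; `EncodingSketch`, `decode` and `isUnsupported` are illustrative names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;

public class EncodingSketch {

    /** Decodes the bytes using the named character encoding. */
    public static String decode(byte[] bytes, String encoding) throws IOException {
        StringBuilder out = new StringBuilder();
        try (Reader reader =
                new InputStreamReader(new ByteArrayInputStream(bytes), encoding)) {
            int c;
            while ((c = reader.read()) != -1) {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    /** Returns true if the JDK rejects the encoding name. */
    public static boolean isUnsupported(String encoding) {
        try {
            decode(new byte[0], encoding);
            return false;
        } catch (UnsupportedEncodingException e) {
            return true;   // the same exception type the constructors declare
        } catch (IOException e) {
            return false;  // some other I/O failure, not an encoding problem
        }
    }

    public static void main(String[] args) throws IOException {
        // The two bytes that encode U+00E9 (e with acute accent) in UTF-8.
        byte[] eAcute = {(byte) 0xC3, (byte) 0xA9};

        System.out.println(decode(eAcute, "UTF-8").length());      // prints 1
        System.out.println(decode(eAcute, "ISO-8859-1").length()); // prints 2
        System.out.println(isUnsupported("no-such-encoding"));     // prints true
    }
}
```

When the encoding of a remote document is known, passing it explicitly to the two-argument URL constructor avoids relying on the Latin-1 default.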
Modifier and Type | Method and Description |
---|---|
protected Content | parseAnyContent(): Parses one kind of XML content. |
protected Attribute | parseAttribute(): Parses an XML attribute. |
protected Comment | parseComment(): Parses an XML comment. |
protected Declaration | parseDeclaration(): Parses the XML Declaration. |
protected Doctype | parseDoctype(): Parses a doctype declaration. |
protected void | parseDocumentBody(): Parses the rest of the Document after any required Declaration (for XML files) or Doctype (for HTML files). |
protected Element | parseElement(): Parses an XML element. |
protected Data | parseEscapedData(): Parses some escaped character data. |
protected Instruction | parseInstruction(): Parses an optional XML processing instruction. |
protected Text | parseTextContent(): Parses some text content. |
void | preserveLayout(boolean value): Instructs this XMLReader whether to preserve or ignore layout text. |
Document | readDocument(): Parses XML data from the input source and creates a classic Java XML syntax tree. |
protected void | validate(Document document): Validates the Document using its own XMLSchema. |

Inherited methods:
checkEncoding, close, endOfStream, getContext, getEncoding, getLexicon, getLineNumber, parseQuotedValue, setLexicon, setValidation
encodingError, semanticError, syntaxError

public XMLReader(java.io.File file) throws java.io.FileNotFoundException, java.io.UnsupportedEncodingException
Parameters:
file - the input text file.
Throws:
java.io.FileNotFoundException - if the file cannot be found or opened.
java.io.UnsupportedEncodingException - if UTF-8 is not supported.

public XMLReader(java.io.File file, java.lang.String encoding) throws java.io.FileNotFoundException, java.io.UnsupportedEncodingException
Parameters:
file - the input file.
encoding - the name of the character encoding.
Throws:
java.io.FileNotFoundException - if the file cannot be found or opened.
java.io.UnsupportedEncodingException - if the character encoding is not supported.

public XMLReader(java.net.URL url) throws java.io.IOException, java.io.UnsupportedEncodingException
Parameters:
url - the absolute URL giving the location of the XML data.
Throws:
java.io.IOException - if the attempt to connect to the URL and open an input stream to read from it fails.
java.io.UnsupportedEncodingException - if ISO-8859-1 is not supported.

public XMLReader(java.net.URL url, java.lang.String encoding) throws java.io.IOException, java.io.UnsupportedEncodingException
Parameters:
url - the absolute URL giving the location of the XML data.
encoding - the name of the character encoding.
Throws:
java.io.IOException - if the attempt to connect to the URL and open an input stream to read from it fails.
java.io.UnsupportedEncodingException - if the character encoding is not supported.

public XMLReader(java.io.InputStream stream, java.lang.String encoding) throws java.io.UnsupportedEncodingException
Parameters:
stream - the basic InputStream.
encoding - the name of the character encoding.
Throws:
java.io.UnsupportedEncodingException - if the character encoding is not supported.

public XMLReader(java.io.Reader reader, java.lang.String encoding)
Parameters:
reader - the character Reader.
encoding - the name of the character encoding.

public void preserveLayout(boolean value)
Parameters:
value - true, to retain a Document with all original layout text.

public Document readDocument() throws java.io.UnsupportedEncodingException, java.io.IOException, SyntaxError, SemanticError
Throws:
java.io.UnsupportedEncodingException - if an encoding fault occurs.
java.io.IOException - if any other I/O error occurs.
SyntaxError - if the XML syntax is faulty.
SemanticError - if model constraints are broken.

protected void parseDocumentBody() throws java.io.IOException, SyntaxError
Throws:
java.io.IOException - if any other I/O error occurs.
SyntaxError - if the XML syntax is faulty.

protected Content parseAnyContent() throws java.io.IOException, SyntaxError
Throws:
java.io.IOException - if the token stream fails.
SyntaxError - if the XML syntax is faulty.

protected Declaration parseDeclaration() throws java.io.UnsupportedEncodingException, java.io.IOException, SyntaxError
Throws:
java.io.UnsupportedEncodingException - if the declared encoding does not match the token stream's actual encoding.
java.io.IOException - if the token stream fails.
SyntaxError - if the XML syntax is faulty.

protected Instruction parseInstruction() throws java.io.IOException, SyntaxError
Throws:
java.io.IOException - if the token stream fails.
SyntaxError - if the XML syntax is faulty.

protected Doctype parseDoctype() throws java.io.IOException, SyntaxError, SemanticError
Throws:
java.io.IOException - if the token stream fails.
SyntaxError - if the XML syntax is faulty.
SemanticError - if more than one doctype is declared.

protected Element parseElement() throws java.io.IOException, SyntaxError, SemanticError
Throws:
java.io.IOException - if the token stream fails.
SyntaxError - if the XML syntax is faulty.
SemanticError - if the DOM is complete.

protected Attribute parseAttribute() throws java.io.IOException, SyntaxError
Throws:
java.io.IOException - if the token stream fails.
SyntaxError - if the XML syntax is faulty.

protected Text parseTextContent() throws java.io.IOException, SyntaxError
Throws:
java.io.IOException - if the token stream fails.
SyntaxError - if the XML syntax is faulty.

protected Data parseEscapedData() throws java.io.IOException, SyntaxError
Throws:
java.io.IOException - if the token stream fails.
SyntaxError - if the XML syntax is faulty.

protected Comment parseComment() throws java.io.IOException, SyntaxError
Throws:
java.io.IOException - if the token stream fails.
SyntaxError - if the XML syntax is faulty.

protected void validate(Document document) throws java.io.UnsupportedEncodingException, java.io.IOException, SyntaxError, SemanticError
Parameters:
document - the Document.
Throws:
java.io.UnsupportedEncodingException - if an encoding fault occurs.
java.io.IOException - if any other I/O error occurs.
SyntaxError - if the XML syntax is faulty.
SemanticError - if schema constraints are broken.
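
Several members, including readDocument() and validate(Document), declare both java.io.UnsupportedEncodingException and java.io.IOException. In the JDK the former is a subclass of the latter, so a caller who wants to report encoding faults separately has to catch the more specific type first. The sketch below demonstrates that ordering with JDK types only; `CatchOrderSketch`, `ReadStep` and `classify` are hypothetical names standing in for a call into the reader, and the JAST types SyntaxError and SemanticError are deliberately not modelled.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.UnsupportedEncodingException;

public class CatchOrderSketch {

    /** Hypothetical stand-in for a reading step that may fail with an I/O error. */
    public interface ReadStep {
        void run() throws IOException;
    }

    /** Runs the step and reports which kind of failure, if any, occurred. */
    public static String classify(ReadStep step) {
        try {
            step.run();
            return "ok";
        } catch (UnsupportedEncodingException e) {
            // Must come first: it is a subclass of IOException.
            return "encoding fault";
        } catch (IOException e) {
            // Everything else, including FileNotFoundException.
            return "I/O error";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(() -> { }));
        // prints "ok"
        System.out.println(classify(() -> {
            throw new UnsupportedEncodingException("bad encoding");
        }));
        // prints "encoding fault"
        System.out.println(classify(() -> {
            throw new FileNotFoundException("missing.xml");
        }));
        // prints "I/O error"
    }
}
```

Reversing the two catch clauses would not compile, because the second clause could never be reached.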