/*
 * Copyright (c) 2004 World Wide Web Consortium,
 *
 * (Massachusetts Institute of Technology, European Research Consortium for
 * Informatics and Mathematics, Keio University). All Rights Reserved. This
 * work is distributed under the W3C(r) Software License [1] in the hope that
 * it will be useful, but WITHOUT ANY WARRANTY; without even the implied
 * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 *
 * [1] http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231
 */

package org.w3c.dom.ls;

import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

/**
 * An interface to an object that is able to build, or augment, a DOM tree
 * from various input sources.
 *
 * LSParser provides an API for parsing XML and building the corresponding
 * DOM document structure. An LSParser instance can be obtained by invoking
 * the DOMImplementationLS.createLSParser() method.
 *
 * As specified in [DOM Level 3 Core], when a document is first made
 * available via the LSParser, the value and nodeValue attributes of an
 * Attr node initially return the XML 1.0 normalized value. However, if the
 * parameters "validate-if-schema" and "datatype-normalization" are set to
 * true, depending on the attribute normalization used, the attribute
 * values may differ from the ones obtained by the XML 1.0 attribute
 * normalization. If the parameter "datatype-normalization" is set to
 * false, the XML 1.0 attribute normalization is guaranteed to occur, and
 * if the attributes list does not contain namespace declarations, the
 * attributes attribute on an Element node represents the property
 * [attributes] defined in [XML Information Set].
 *
 * Asynchronous LSParser objects are expected to also implement the
 * events::EventTarget interface so that event listeners can be registered
 * on asynchronous LSParser objects.
*
 * Events supported by asynchronous LSParser objects are:
 *
 * load
 *     The LSParser finishes loading the document. See also the definition
 *     of the LSLoadEvent interface.
 * progress
 *     The LSParser signals progress as data is parsed. This specification
 *     does not attempt to define exactly when progress events should be
 *     dispatched; that is intentionally left implementation-dependent.
 *     Here is one example of how an application might dispatch progress
 *     events: once the parser starts receiving data, a progress event is
 *     dispatched to indicate that the parsing starts. From there on, a
 *     progress event is dispatched for every 4096 bytes of data that is
 *     received and processed. This is only one example, though, and
 *     implementations can choose to dispatch progress events at any time
 *     while parsing, or not dispatch them at all. See also the definition
 *     of the LSProgressEvent interface.
 *
 * Note: All events defined in this specification use the namespace URI
 * "http://www.w3.org/2002/DOMLS".
*
 * While parsing an input source, errors are reported to the application
 * through the error handler (LSParser.domConfig's "error-handler"
 * parameter). This specification does not attempt to define all possible
 * errors that can occur while parsing XML, or any other markup, but some
 * common error cases are defined. The types (DOMError.type) of errors and
 * warnings defined by this specification are:
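 *
 * "check-character-normalization-failure" [error]
 *     Raised if the parameter "check-character-normalization" is set to
 *     true and a string is encountered that fails normalization checking.
 * "doctype-not-allowed" [fatal]
 *     Raised if the configuration parameter "disallow-doctype" is set to
 *     true and a doctype is encountered.
 * "no-input-specified" [fatal]
 *     Raised when loading a document and no input is specified in the
 *     LSInput object.
 * "pi-base-uri-not-preserved" [warning]
 *     Raised if a processing instruction is encountered in a location
 *     where the base URI of the processing instruction cannot be
 *     preserved. One example of a case where this warning will be raised
 *     is if the configuration parameter "entities" is set to false and the
 *     following XML file is parsed:
 *
 *         <!DOCTYPE root [ <!ENTITY e SYSTEM 'subdir/myentity.ent' ]>
 *         <root> &e; </root>
 *
 *     And subdir/myentity.ent contains:
 *
 *         <one> <two/> </one> <?pi 3.14159?>
 *         <more/>
 *
 * "unbound-prefix-in-entity" [warning]
 *     An implementation-dependent warning that may be raised if the
 *     configuration parameter "namespaces" is set to true and an unbound
 *     namespace prefix is encountered in an entity's replacement text.
 *     Raising this warning is not enforced since some existing parsers may
 *     not recognize unbound namespace prefixes in the replacement text of
 *     entities.
 * "unknown-character-denormalization" [fatal]
 *     Raised if the configuration parameter
 *     "ignore-unknown-character-denormalizations" is set to false and a
 *     character is encountered for which the processor cannot determine
 *     the normalization properties.
 * "unsupported-encoding" [fatal]
 *     Raised if an unsupported encoding is encountered.
 * "unsupported-media-type" [fatal]
 *     Raised if the configuration parameter "supported-media-types-only"
 *     is set to true and an unsupported media type is encountered.
 *
 * In addition to raising the defined errors and warnings, implementations
 * are expected to raise implementation-specific errors and warnings for
 * any other error and warning cases such as IO errors (file not found,
 * permission denied, ...), XML well-formedness errors, and so on.
 *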
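 * The error reporting described above can be sketched as follows. This is
 * a minimal illustration, not part of the specification: the class
 * LsErrorReportSketch and its methods are hypothetical names, and the
 * sketch assumes that a DOM implementation supporting the "LS" feature can
 * be found through DOMImplementationRegistry. Each reported DOMError is
 * recorded as a line of text, and a PARSE_ERR marker is added when the
 * parser gives up.

```java
import org.w3c.dom.DOMError;
import org.w3c.dom.DOMErrorHandler;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSException;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the class and method names are illustrative only.
final class LsErrorReportSketch {

    // Looks up a DOM implementation that supports the "LS" feature.
    static DOMImplementationLS loadAndSave() {
        try {
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            return (DOMImplementationLS) registry.getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
    }

    // Parses the text and returns one entry per reported DOMError, plus a
    // final "PARSE_ERR" entry if the document could not be loaded.
    static List<String> collectErrors(String xml) {
        DOMImplementationLS ls = loadAndSave();
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

        List<String> reports = new ArrayList<>();
        DOMErrorHandler handler = (DOMError error) -> {
            reports.add(error.getSeverity() + " " + error.getType() + ": " + error.getMessage());
            // Returning true asks the parser to continue where that is possible.
            return true;
        };
        parser.getDomConfig().setParameter("error-handler", handler);

        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        try {
            parser.parse(input);
        } catch (LSException e) {
            if (e.code == LSException.PARSE_ERR) {
                reports.add("PARSE_ERR");
            } else {
                throw e;
            }
        }
        return reports;
    }
}
```

 * The exact type and message strings of the reported errors are
 * implementation-specific, so the sketch records them without
 * interpreting them.
 *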
 * See also the Document Object Model (DOM) Level 3 Load and Save
 * Specification.
*/
public interface LSParser {
/**
 * The DOMConfiguration object used when parsing an input source. This
 * DOMConfiguration is specific to the parse operation. No parameter values
 * from this DOMConfiguration object are passed automatically to the
 * DOMConfiguration object on the Document that is created, or used, by the
 * parse operation. The DOM application is responsible for passing any
 * needed parameter values from this DOMConfiguration object to the
 * DOMConfiguration object referenced by the Document object.
 *
 * In addition to the parameters recognized on the DOMConfiguration
 * interface defined in [DOM Level 3 Core], the DOMConfiguration objects
 * for LSParser add or modify the following parameters:
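 *
 * "charset-overrides-xml-encoding"
 *     true: If a higher-level protocol provides an indication of the
 *         character encoding of the input stream being processed, that
 *         indication overrides any encoding specified in the XML or text
 *         declaration. Explicitly setting an encoding in the LSInput
 *         overrides any encoding from the protocol.
 *     false: The parser ignores any character set encoding information
 *         from higher-level protocols.
 * "disallow-doctype"
 *     true: Raise a fatal "doctype-not-allowed" error if a doctype is
 *         found while parsing the document.
 *     false: Allow doctype nodes in the document.
 * "ignore-unknown-character-denormalizations"
 *     true: Ignore any possible denormalizations caused by characters for
 *         which the processor cannot determine the normalization
 *         properties.
 *     false: Report a fatal "unknown-character-denormalization" error if a
 *         character is encountered for which the processor cannot
 *         determine the normalization properties.
 * "infoset"
 *     See the definition of DOMConfiguration for a description of this
 *     parameter. Unlike in [DOM Level 3 Core], this parameter will default
 *     to true for LSParser.
 * "namespaces"
 *     true: Perform namespace processing.
 *     false: Do not perform namespace processing.
 * "resource-resolver"
 *     A reference to an LSResourceResolver object, or null. If the value
 *     of this parameter is not null when an external resource (such as an
 *     external XML entity or an XML schema location) is encountered, the
 *     implementation will request that the LSResourceResolver referenced
 *     in this parameter resolves the resource.
 * "supported-media-types-only"
 *     true: Check that the media type of the parsed resource is a
 *         supported media type. If an unsupported media type is
 *         encountered, a fatal "unsupported-media-type" error is raised.
 *     false: Accept any media type.
 * "validate"
 *     See the definition of DOMConfiguration for a description of this
 *     parameter. Unlike in [DOM Level 3 Core], the processing of the
 *     internal subset is always accomplished, even if this parameter is
 *     set to false.
 * "validate-if-schema"
 *     See the definition of DOMConfiguration for a description of this
 *     parameter. Unlike in [DOM Level 3 Core], the processing of the
 *     internal subset is always accomplished, even if this parameter is
 *     set to false.
 * "well-formed"
 *     See the definition of DOMConfiguration for a description of this
 *     parameter. Unlike in [DOM Level 3 Core], this parameter cannot be
 *     set to false.
 */
public DOMConfiguration getDomConfig();
/**
 * When a filter is provided, the implementation will call out to the
 * filter as it is constructing the DOM tree structure. The filter can
 * choose to remove elements from the document being constructed, or to
 * terminate the parsing early.
 *
 * The filter is invoked after the operations requested by the
 * DOMConfiguration parameters have been applied. For example, if
 * "validate" is set to true, the validation is done before invoking the
 * filter.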
*/
public LSParserFilter getFilter();
/**
 * When a filter is provided, the implementation will call out to the
 * filter as it is constructing the DOM tree structure. The filter can
 * choose to remove elements from the document being constructed, or to
 * terminate the parsing early.
 *
 * The filter is invoked after the operations requested by the
 * DOMConfiguration parameters have been applied. For example, if
 * "validate" is set to true, the validation is done before invoking the
 * filter.
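 *
 * As an illustration of removing elements during parsing, here is a
 * minimal sketch of a filter. The names LsFilterSketch, DropElementFilter
 * and childNamesWithout are hypothetical, and the sketch assumes an
 * implementation with the "LS" feature is available through
 * DOMImplementationRegistry and that it honors the filter for child
 * elements of the document element.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSParserFilter;
import org.w3c.dom.traversal.NodeFilter;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the class and method names are illustrative only.
final class LsFilterSketch {

    // Rejects every completed element with the given name and keeps the rest.
    static final class DropElementFilter implements LSParserFilter {
        private final String unwantedName;

        DropElementFilter(String unwantedName) {
            this.unwantedName = unwantedName;
        }

        public short startElement(Element element) {
            // Decide later, in acceptNode, once the element is complete.
            return LSParserFilter.FILTER_ACCEPT;
        }

        public short acceptNode(Node node) {
            return unwantedName.equals(node.getNodeName())
                    ? LSParserFilter.FILTER_REJECT
                    : LSParserFilter.FILTER_ACCEPT;
        }

        public int getWhatToShow() {
            // Only element nodes are passed to acceptNode.
            return NodeFilter.SHOW_ELEMENT;
        }
    }

    // Parses the text with the filter installed and lists the names of the
    // element children that remain under the document element.
    static List<String> childNamesWithout(String xml, String unwantedName) {
        DOMImplementationLS ls;
        try {
            ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        parser.setFilter(new DropElementFilter(unwantedName));

        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        Document doc = parser.parse(input);

        List<String> names = new ArrayList<>();
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.add(n.getNodeName());
            }
        }
        return names;
    }
}
```

 * Returning FILTER_INTERRUPT instead of FILTER_REJECT would be the way to
 * terminate the parsing early.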
*/
public void setFilter(LSParserFilter filter);
/**
 * true if the LSParser is asynchronous, false if it is synchronous.
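 *
 * A minimal sketch of how an application might inspect this attribute.
 * The names LsModeSketch, createParser and describe are hypothetical. The
 * sketch assumes that an implementation may refuse the asynchronous mode
 * with NOT_SUPPORTED_ERR, in which case it falls back to a synchronous
 * parser.

```java
import org.w3c.dom.DOMException;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSParser;

// Hypothetical helper; the class and method names are illustrative only.
final class LsModeSketch {

    // Tries the asynchronous mode first when asked to, and falls back to
    // the synchronous mode if the implementation does not support it.
    static LSParser createParser(DOMImplementationLS ls, boolean preferAsync) {
        if (preferAsync) {
            try {
                return ls.createLSParser(DOMImplementationLS.MODE_ASYNCHRONOUS, null);
            } catch (DOMException e) {
                if (e.code != DOMException.NOT_SUPPORTED_ERR) {
                    throw e;
                }
            }
        }
        return ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
    }

    // Reports the mode and the busy state of a freshly created parser.
    static String describe(boolean preferAsync) {
        DOMImplementationLS ls;
        try {
            ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        LSParser parser = createParser(ls, preferAsync);
        return (parser.getAsync() ? "asynchronous" : "synchronous")
                + (parser.getBusy() ? ", busy" : ", idle");
    }
}
```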
*/
public boolean getAsync();
/**
 * true if the LSParser is currently busy loading a document, otherwise
 * false.
*/
public boolean getBusy();
/**
 * Parse an XML document from a resource identified by an LSInput.
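 *
 * A minimal sketch of a synchronous parse from a byte stream. The names
 * LsByteStreamSketch and rootTextOf are hypothetical, and the sketch
 * assumes an implementation with the "LS" feature is available through
 * DOMImplementationRegistry.

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

import java.io.ByteArrayInputStream;

// Hypothetical helper; the class and method names are illustrative only.
final class LsByteStreamSketch {

    // Parses the bytes with the stated encoding and returns the text
    // content of the document element.
    static String rootTextOf(byte[] xmlBytes, String encoding) {
        DOMImplementationLS ls;
        try {
            ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);

        LSInput input = ls.createLSInput();
        input.setByteStream(new ByteArrayInputStream(xmlBytes));
        input.setEncoding(encoding);

        // A synchronous parser returns the populated Document directly;
        // an asynchronous one would return null here.
        Document doc = parser.parse(input);
        return doc.getDocumentElement().getTextContent();
    }
}
```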
 * @param input The LSInput from which the source of the document is to
 *   be read.
 * @return If the LSParser is a synchronous LSParser, the newly created
 *   and populated Document is returned. If the LSParser is asynchronous,
 *   null is returned since the document object may not yet be constructed
 *   when this method returns.
 * @exception DOMException
 *   INVALID_STATE_ERR: Raised if the LSParser's LSParser.busy attribute
 *   is true.
 * @exception LSException
 *   PARSE_ERR: Raised if the LSParser was unable to load the XML document.
 *   DOM applications should attach a DOMErrorHandler using the parameter
 *   "error-handler" if they wish to get details on the error.
*/
public Document parse(LSInput input)
throws DOMException, LSException;
/**
 * Parse an XML document from a location identified by a URI reference
 * [IETF RFC 2396]. If the URI contains a fragment identifier (see section
 * 4.1 in [IETF RFC 2396]), the behavior is not defined by this
 * specification; future versions of this specification may define the
 * behavior.
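 *
 * A minimal sketch of parsing by URI. The names LsParseUriSketch,
 * rootNameAt and demo are hypothetical. The demo writes a small temporary
 * file and passes its file URI (without a fragment identifier) to the
 * parser, assuming an implementation with the "LS" feature is available
 * through DOMImplementationRegistry.

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSParser;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the class and method names are illustrative only.
final class LsParseUriSketch {

    // Returns the name of the document element, or null if no document
    // was returned (asynchronous parser, or an error reported as null).
    static String rootNameAt(String uri) {
        DOMImplementationLS ls;
        try {
            ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                    .getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        Document doc = parser.parseURI(uri);
        return doc == null ? null : doc.getDocumentElement().getNodeName();
    }

    // Writes a small document to a temporary file and parses it by URI.
    static String demo() {
        try {
            Path file = Files.createTempFile("ls-sketch", ".xml");
            try {
                Files.write(file, "<catalog><item/></catalog>".getBytes(StandardCharsets.UTF_8));
                return rootNameAt(file.toUri().toString());
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```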
 * @param uri The location of the XML document to be read.
 * @return If the LSParser is a synchronous LSParser, the newly created
 *   and populated Document is returned, or null if an error occurred. If
 *   the LSParser is asynchronous, null is returned since the document
 *   object may not yet be constructed when this method returns.
 * @exception DOMException
 *   INVALID_STATE_ERR: Raised if the LSParser.busy attribute is true.
 * @exception LSException
 *   PARSE_ERR: Raised if the LSParser was unable to load the XML document.
 *   DOM applications should attach a DOMErrorHandler using the parameter
 *   "error-handler" if they wish to get details on the error.
*/
public Document parseURI(String uri)
throws DOMException, LSException;
// ACTION_TYPES
/**
 * Append the result of the parse operation as children of the context
 * node. For this action to work, the context node must be an Element or
 * a DocumentFragment.
*/
public static final short ACTION_APPEND_AS_CHILDREN = 1;
/**
 * Replace all the children of the context node with the result of the
 * parse operation. For this action to work, the context node must be an
 * Element, a Document, or a DocumentFragment.
*/
public static final short ACTION_REPLACE_CHILDREN = 2;
/**
 * Insert the result of the parse operation as the immediately preceding
 * sibling of the context node. For this action to work, the context
 * node's parent must be an Element or a DocumentFragment.
*/
public static final short ACTION_INSERT_BEFORE = 3;
/**
 * Insert the result of the parse operation as the immediately following
 * sibling of the context node. For this action to work, the context
 * node's parent must be an Element or a DocumentFragment.
*/
public static final short ACTION_INSERT_AFTER = 4;
/**
 * Replace the context node with the result of the parse operation. For
 * this action to work, the context node must have a parent, and the
 * parent must be an Element or a DocumentFragment.
*/
public static final short ACTION_REPLACE = 5;
/**
 * Parse an XML fragment from a resource identified by an LSInput and
 * insert the content into an existing document at the position specified
 * with the contextArg and action arguments. When parsing the input stream,
 * the context node (or its parent, depending on where the result will be
 * inserted) is used for resolving unbound namespace prefixes. The context
 * node's ownerDocument node (or the node itself if the node is of type
 * DOCUMENT_NODE) is used to resolve default attributes and entity
 * references.
 *
 * If the context node is a Document node and the action is
 * ACTION_REPLACE_CHILDREN, then the document that is passed as the context
 * node will be changed such that its xmlEncoding, documentURI, xmlVersion,
 * inputEncoding, xmlStandalone, and all other such attributes are set to
 * what they would be set to if the input source was parsed using
 * LSParser.parse().
 *
 * This method is always synchronous, even if the LSParser is asynchronous
 * (LSParser.async is true).
 *
 * If an error occurs while parsing, the caller is notified through the
 * DOMErrorHandler instance associated with the "error-handler" parameter
 * of the DOMConfiguration.
 *
 * When calling parseWithContext, the values of the following configuration
 * parameters will be ignored and their default values will always be used
 * instead: "validate", "validate-if-schema", and
 * "element-content-whitespace". Other parameters will be treated normally,
 * and the parser is expected to call the LSParserFilter just as if a whole
 * document was parsed.
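 *
 * A minimal sketch of appending a parsed fragment to an existing element
 * with ACTION_APPEND_AS_CHILDREN. The names LsContextSketch and
 * appendFragment are hypothetical. Support for this method is optional
 * (an implementation may raise NOT_SUPPORTED_ERR), so the sketch reports
 * that case with -1 instead of failing.

```java
import org.w3c.dom.DOMException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;

// Hypothetical helper; the class and method names are illustrative only.
final class LsContextSketch {

    // Appends the parsed fragment under a fresh <root> element and returns
    // the resulting number of children, or -1 if the implementation does
    // not support parseWithContext.
    static int appendFragment(String fragmentXml) {
        DOMImplementation impl;
        try {
            impl = DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("DOM implementation registry unavailable", e);
        }
        DOMImplementationLS ls = (DOMImplementationLS) impl;
        Document doc = impl.createDocument(null, "root", null);

        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = ls.createLSInput();
        input.setStringData(fragmentXml);
        try {
            parser.parseWithContext(input, doc.getDocumentElement(),
                    LSParser.ACTION_APPEND_AS_CHILDREN);
        } catch (DOMException e) {
            if (e.code == DOMException.NOT_SUPPORTED_ERR) {
                return -1;
            }
            throw e;
        }
        return doc.getDocumentElement().getChildNodes().getLength();
    }
}
```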
 * @param input The LSInput from which the source document is to be read.
 *   The source document must be an XML fragment, i.e. anything except a
 *   complete XML document (except in the case where the context node is
 *   of type DOCUMENT_NODE, and the action is ACTION_REPLACE_CHILDREN), a
 *   DOCTYPE (internal subset), entity declaration(s), notation
 *   declaration(s), or XML or text declaration(s).
 * @param contextArg The node that is used as the context for the data
 *   that is being parsed. This node must be a Document node, a
 *   DocumentFragment node, or a node of a type that is allowed as a child
 *   of an Element node, e.g. it cannot be an Attribute node.
 * @param action This parameter describes which action should be taken
 *   between the new set of nodes being inserted and the existing children
 *   of the context node. The set of possible actions is defined in
 *   ACTION_TYPES above.
 * @return Return the node that is the result of the parse operation. If
 *   the result is more than one top-level node, the first one is
 *   returned.
 * @exception DOMException
 *   HIERARCHY_REQUEST_ERR: Raised if the content cannot replace, be
 *   inserted before, after, or as a child of the context node (see also
 *   Node.insertBefore or Node.replaceChild in [DOM Level 3 Core]).
 *   NOT_SUPPORTED_ERR: Raised if the LSParser doesn't support this
 *   method, or if the context node is of type Document and the DOM
 *   implementation doesn't support the replacement of the DocumentType
 *   child or Element child.
 *   INVALID_STATE_ERR: Raised if the LSParser.busy attribute is true.
 * @exception LSException
 *   PARSE_ERR: Raised if the LSParser was unable to load the XML fragment.
 *   DOM applications should attach a DOMErrorHandler using the parameter
 *   "error-handler" if they wish to get details on the error.
*/
public Node parseWithContext(LSInput input,
Node contextArg,
short action)
throws DOMException, LSException;
/**
 * Abort the loading of the document that is currently being loaded by
 * the LSParser. If the LSParser is currently not busy, a call to this
 * method does nothing.
*/
public void abort();
}