QXmlStreamReader Class
The QXmlStreamReader class provides a fast parser for reading well-formed XML via a simple streaming API.
| Header: | #include <QXmlStreamReader> |
| qmake: | QT += core |
| Since: | Qt 4.3 |
Note: All functions in this class are reentrant.
Public Types
| enum | Error { NoError, CustomError, NotWellFormedError, PrematureEndOfDocumentError, UnexpectedElementError } |
| enum | ReadElementTextBehaviour { ErrorOnUnexpectedElement, IncludeChildElements, SkipChildElements } |
| enum | TokenType { NoToken, Invalid, StartDocument, EndDocument, StartElement, …, ProcessingInstruction } |
Properties
- namespaceProcessing : bool
Public Functions
| bool | namespaceProcessing() const |
| void | setNamespaceProcessing(bool) |
Detailed Description
QXmlStreamReader provides a simple streaming API to parse well-formed XML. It is an alternative to first loading the complete XML into a DOM tree (see QDomDocument). QXmlStreamReader reads data either from a QIODevice (see setDevice()), or from a raw QByteArray (see addData()).
Qt provides QXmlStreamWriter for writing XML.
The basic concept of a stream reader is to report an XML document as a stream of tokens, similar to SAX. The main difference between QXmlStreamReader and SAX is how these XML tokens are reported. With SAX, the application must provide handlers (callback functions) that receive so-called XML events from the parser at the parser's convenience. With QXmlStreamReader, the application code itself drives the loop and pulls tokens from the reader, one after another, as it needs them. This is done by calling readNext(), where the reader reads from the input stream until it completes the next token, at which point it returns the tokenType(). A set of convenience functions, including isStartElement() and text(), can then be used to examine the token and obtain information about what has been read. The main advantage of this pull approach is that it lets you build recursive descent parsers, meaning you can easily split your XML parsing code into different methods or classes. This makes it easy to keep track of the application's own state when parsing XML.
A typical loop with QXmlStreamReader looks like this:
    QXmlStreamReader xml;
    ...
    while (!xml.atEnd()) {
        xml.readNext();
        ... // do processing
    }
    if (xml.hasError()) {
        ... // do error handling
    }
QXmlStreamReader is a well-formed XML 1.0 parser that does not include external parsed entities. As long as no error occurs, the application code can thus be assured that the data provided by the stream reader satisfies the W3C's criteria for well-formed XML. For example, you can be certain that all tags are indeed nested and closed properly, that references to internal entities have been replaced with the correct replacement text, and that attributes have been normalized or added according to the internal subset of the DTD.
If an error occurs while parsing, atEnd() and hasError() return true, and error() returns the error that occurred. The functions errorString(), lineNumber(), columnNumber(), and characterOffset() can be used to construct an appropriate error or warning message. To simplify application code, QXmlStreamReader contains a raiseError() mechanism that lets you raise custom errors that trigger the same error handling described above.
The QXmlStream Bookmarks Example illustrates how to use the recursive descent technique to read an XML bookmark file (XBEL) with a stream reader.
Namespaces
QXmlStreamReader understands and resolves XML namespaces. For example, in the case of a StartElement, namespaceUri() returns the namespace the element is in, and name() returns the element's local name. The combination of namespaceUri() and name() uniquely identifies an element. If a namespace prefix was not declared in the XML entities parsed by the reader, the namespaceUri() is empty.
If you parse XML data that does not use namespaces according to the XML specification, or does not use namespaces at all, you can use the element's qualifiedName() instead. A qualified name is the element's prefix(), followed by a colon, followed by the element's local name(), exactly as the element appears in the raw XML data. Since the mapping from namespaceUri() to prefix is neither unique nor universal, qualifiedName() should be avoided for namespace-compliant XML data.
In order to parse standalone documents that do use undeclared namespace prefixes, you can turn off namespace processing completely with the namespaceProcessing property.
Incremental Parsing
QXmlStreamReader is an incremental parser. It can handle the case where the document can't be parsed all at once because it arrives in chunks (e.g. from multiple files, or over a network connection). When the reader runs out of data before the complete document has been parsed, it reports a PrematureEndOfDocumentError. When more data arrives, either because of a call to addData() or because more data is available through the network device(), the reader recovers from the PrematureEndOfDocumentError and continues parsing the new data with the next call to readNext().
For example, if your application reads data from the network using a network access manager, you would issue a network request to the manager and receive a network reply in return. Since a QNetworkReply is a QIODevice, you connect its readyRead() signal to a custom slot, e.g. slotReadyRead() in the code snippet shown in the discussion for QNetworkAccessManager. In this slot, you read all available data with readAll() and pass it to the XML stream reader using addData(). Then you call your custom parsing function that reads the XML events from the reader.
Performance and Memory Consumption
QXmlStreamReader is memory-conservative by design, since it doesn't store the entire XML document tree in memory, but only the current token at the time it is reported. In addition, QXmlStreamReader avoids the many small string allocations that it normally takes to map an XML document to a convenient and Qt-ish API. It does this by reporting all string data as QStringRef rather than real QString objects. QStringRef is a thin wrapper around QString substrings that provides a subset of the QString API without the memory allocation and reference-counting overhead. Calling toString() on any of those objects returns an equivalent real QString object.
Member Type Documentation
enum QXmlStreamReader::Error
This enum specifies the different error cases.
| Constant | Value | Description |
|---|---|---|
| QXmlStreamReader::NoError | 0 | No error has occurred. |
| QXmlStreamReader::CustomError | 2 | A custom error has been raised with raiseError(). |
| QXmlStreamReader::NotWellFormedError | 3 | The parser internally raised an error due to the read XML not being well-formed. |
| QXmlStreamReader::PrematureEndOfDocumentError | 4 | The input stream ended before a well-formed XML document was parsed. Recovery from this error is possible if more XML arrives in the stream, either by calling addData() or by waiting for it to arrive on the device(). |
| QXmlStreamReader::UnexpectedElementError | 1 | The parser encountered an element that was different from those it expected. |
enum QXmlStreamReader::ReadElementTextBehaviour
This enum specifies the different behaviours of readElementText().
| Constant | Value | Description |
|---|---|---|
| QXmlStreamReader::ErrorOnUnexpectedElement | 0 | Raise an UnexpectedElementError and return what was read so far when a child element is encountered. |
| QXmlStreamReader::IncludeChildElements | 1 | Recursively include the text from child elements. |
| QXmlStreamReader::SkipChildElements | 2 | Skip child elements. |
This enum was introduced or modified in Qt 4.6.
enum QXmlStreamReader::TokenType
This enum specifies the type of token the reader just read.
| Constant | Value | Description |
|---|---|---|
| QXmlStreamReader::NoToken | 0 | The reader has not yet read anything. |
| QXmlStreamReader::Invalid | 1 | An error has occurred, reported in error() and errorString(). |
| QXmlStreamReader::StartDocument | 2 | The reader reports the XML version number in documentVersion(), and the encoding as specified in the XML document in documentEncoding(). If the document is declared standalone, isStandaloneDocument() returns true; otherwise it returns false. |
| QXmlStreamReader::EndDocument | 3 | The reader reports the end of the document. |
| QXmlStreamReader::StartElement | 4 | The reader reports the start of an element with namespaceUri() and name(). Empty elements are also reported as StartElement, followed directly by EndElement. The convenience function readElementText() can be called to concatenate all content until the corresponding EndElement. Attributes are reported in attributes(), namespace declarations in namespaceDeclarations(). |
| QXmlStreamReader::EndElement | 5 | The reader reports the end of an element with namespaceUri() and name(). |
| QXmlStreamReader::Characters | 6 | The reader reports characters in text(). If the characters are all white-space, isWhitespace() returns true. If the characters stem from a CDATA section, isCDATA() returns true. |
| QXmlStreamReader::Comment | 7 | The reader reports a comment in text(). |
| QXmlStreamReader::DTD | 8 | The reader reports a DTD in text(), notation declarations in notationDeclarations(), and entity declarations in entityDeclarations(). Details of the DTD declaration are reported in dtdName(), dtdPublicId(), and dtdSystemId(). |
| QXmlStreamReader::EntityReference | 9 | The reader reports an entity reference that could not be resolved. The name of the reference is reported in name(), the replacement text in text(). |
| QXmlStreamReader::ProcessingInstruction | 10 | The reader reports a processing instruction in processingInstructionTarget() and processingInstructionData(). |
Property Documentation
namespaceProcessing : bool
This property holds the namespace-processing flag of the stream reader.
This property controls whether or not the stream reader processes namespaces. If enabled, the reader processes namespaces; otherwise it does not.
By default, namespace-processing is enabled.
Access functions:
| bool | namespaceProcessing() const |
| void | setNamespaceProcessing(bool) |