Details
Type: Bug
Resolution: Done
Priority: P2: Important
Affects Version: 5.9.1
Environment:
* macOS 10.13
* Windows 10 (VS 2015)
Commits: f286027e6b8fb89e05d64a2a848ec18a95934526 (qt/qtbase/5.12)
Description
Scenario: reading an XML file with QXmlStreamReader and writing every (unmodified) token to another file with QXmlStreamWriter:
while (!reader.atEnd())
{
    reader.readNext();
    writer.writeCurrentToken(reader);
}
Input file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<root>
<father><child xmlns:unknown='http://mydomain'>Text</child></father>
</root>
Expected: output should equal input. Actual:
<?xml version="1.0" encoding="UTF-8"?><root>
<father xmlns:unknown="http://mydomain"><child>Text</child></father>
</root>
Note that the xmlns:unknown namespace declaration was moved from the "child" tag to the "father" tag. If any whitespace (space, tab, newline) is added after the opening "father" tag, the namespace declaration is correctly placed on the "child" tag. Tags and namespaces seem to be read correctly by QXmlStreamReader, so the problem appears to be on the writer side.
The problem can be reproduced with the XML Stream Lint Example from the Qt docs and the attached XML file.
If confirmed, this is a severe issue for any application using XML data.
Also note two smaller problems:
1. the "standalone" attribute is removed from the XML declaration (the start-document token);
2. the first newline (after the XML declaration) is removed.
A workaround for 1. is to handle the StartDocument token explicitly:
if( reader.isStartDocument() )
{
    writer.setCodec( QTextCodec::codecForName( reader.documentEncoding().toLatin1() ) );
    writer.writeStartDocument( reader.documentVersion().toString(), reader.isStandaloneDocument() );
}
else
    writer.writeCurrentToken( reader );
Side note: the reader API returns QStringRef while the writer takes const QString &, so every string ref has to be converted (e.g. with toString()), which requires an unnecessary copy each time.