Details
-
Bug
-
Resolution: Done
-
P1: Critical
-
5.7.0
-
None
-
Qt 5.6.0 and Qt 5.7.0 on OS X 10.11 x86_64
(tested same Problem for Windows 8.1 and Windows 10)
-
60563855e589cb51b4966ca51a4d390a2f1609a4 6e8fcab7e07717526c8ea6eac8785bf27fa090c3
Description
Problem
Qt 5.7 changed how QIODevice::peek works internally. For a use case presented here, this leads to a big performance drop.
Related to
"QIODevice: handle incomplete reads
" (7d257aab8175f3d4fd3b838d4f47457a8d868885) https://codereview.qt-project.org/#/c/114792/
https://bugreports.qt.io/browse/QTBUG-44418
Use Case
In my particular use case, I am parsing a file with variable length encoded data (always peeking maximum theoretical size of a field and after parsing incrementing read pointer by a lesser amount). This could also be done in other ways avoiding the problem described here (using map() or ungetChar()). Still, this reveals a performance problem which not existed before.
How to reproduce
char readBuf[20];
while(!testFile.atEnd())Unknown macro: { testFile.peek(readBuf, 20); testFile.read(readBuf, 15); }
Compile the attached MWE "qiodevice-peek-performance.zip" on Qt 5.6 and 5.7 and run it with a file as argument with size at least 1gb.
For a 1.6GB file on SSD drive, run on the same system, output is following:
Time: 0m 5s 914ms (Qt 5.6.0)
Time: 4m 19s 682ms (Qt 5.7.0)
I'm happy to provide any more data you might need.
Qt 5.6 vs. Qt 5.7 technical details
For the presented case, here is what 5.6 and 5.7 does:
Qt 5.6
- peek() issues qfiledevice read
- peek() immediately writes the read data back into the linear buffer with ungetBlock()
- a followup call to read() will read the peek'ed data from the linear buffer, the qfiledevice is only read for new data
- from perspective of qfiledevice, it is read sequentially (because peek'ed data is buffered for the second access)
Qt 5.7
- peek() makes sure a transaction is started
- peek() issues qfiledevice read
- peek() will always clear the qringbuffer before returning (either through seekBuffer() within a previous transaction or through roolBackTransaction() calling seekBuffer(), seekBuffer clears the qringbuffer when going 'back' on random access devices)
- a follow up call to read() will invoke a seek of qfiledevice and a second read of same data on the qfiledevice, because the qringbuffer was cleared by peek()
- effectively from perspective of qfiledevice, it is not a sequential read, but jumping around and reading data twice
Thoughts on improvement
These are loosely thoughts from different angles to improve current situation.
Put peek'ed data back into buffer
Does peek() really need a transaction? Within an active transaction or not, the iodevice is expected to keep the internal read and transaction positions. This should be possible with just reading the device and putting everything back into the qringbuffer (as qt 5.6 does it with linear buffer). Similar like ungetChar, but for a whole block.
Avoid deallocation of memory
Clearing a classical ring buffer should be as fast as writing two pointers. I'm not sure, but I think seekBuffer() calling qringbuffer::clear deallocates memory which is slow when done often (currently for every peek!).
Support rewinding of qringbuffer
When seeking back, currently everyting is cleard. For a ring buffer, it could potentially be possible to rewind the read pointer. Instead of clearing, the buffer could provide the contents without a device-read as long it was not overwritten by the write pointer in the mean time.
Implement custom read-like function for peek
peek() could be implemented to always read directly into the qringbuffer. First it would try the qringbuffer::peek if enough data is already available and if an underlying device read is necessary, it would read into qringbuffer and copy data to the user's buffer. This could be faster than using ::read() (incrementing the data position with freeing memory of qringbuffer) and than writing it back to the internal buffer again.
Documentation change
If proposed improvements are not suiteable (e.g. because of other use cases I did not think of or any misguided thoughts on my part), a note could be included in documentation of QIODevice::peek saying that from 5.7 performance is compareable of doing a device read+backseek for random access devices.
Attachments
Issue Links
- relates to
-
QTBUG-44418 QDataStream should gracefully handle incomplete reads from QIODevice
- Closed