Thread: Read gzip file line-by-line

  1. #1

    Hi,

    I'm currently trying to read a gzipped file. It contains an ASCII table, so I would like to read it line-by-line in order to get the values from the table. For this I wrote a small class CompressedFile, which is derived from QIODevice.

    The overridden readLineData function looks like this:

    Qt Code:
        qint64 CompressedFile::readLineData(char *data, qint64 maxlen)
        {
            qint64 len = 0;
            while (len < maxlen) {
                if (_bufPos >= _bufSize) {
                    fillBuffer();
                    if (_bufSize == 0) return 0;
                }
                data[len++] = _buffer[_bufPos++];
                if (data[len-1] == '\n') {
                    data[len-1] = '\0';
                    break;
                }
            }
            return len;
        }

    Since I had problems with qUncompress, I'm using zlib directly, so fillBuffer() looks like this:

    Qt Code:
        void CompressedFile::fillBuffer()
        {
            _bufOffset += _bufSize;
            _bufSize = gzread(_file, _buffer, BUFSIZE);
            _bufPos = 0;
        }

    So in principle I uncompress the file into a buffer (I tried sizes between 1 kB and 1 MB) and read from that. When I reach the end of the buffer, I fill it again.

    I use it like this:

    Qt Code:
        // open file
        CompressedFile file(filename);
        file.open(QIODevice::ReadOnly);

        // start reading
        QString line;
        while (!(line = file.readLine()).isEmpty()) {
            // do something
        }

    My problem is that reading one of my files this way takes about 35 seconds. I have some IDL and Python code here that does the same in ~10 seconds. Any ideas how I can speed things up?

    Cheers,
    fallen

  2. #2

    IMHO this code never gets called, or only once, if _bufPos is initialized with something other than zero that is greater than _bufSize when it gets initialized:
    Qt Code:
        if (_bufPos >= _bufSize) {
            fillBuffer();
            if (_bufSize == 0) return 0;
        }

    Since in here, _bufPos is always set to 0:
    Qt Code:
        void CompressedFile::fillBuffer()
        {
            _bufOffset += _bufSize;
            _bufSize = gzread(_file, _buffer, BUFSIZE);
            _bufPos = 0; //<--- always 0.
        }

    Is there a reason you are reading char by char rather than reading a whole line with QIODevice::readLine()?
    Because that is what takes so long...
    ==========================signature=============== ==================
    S.O.L.I.D principles (use them!):
    https://en.wikipedia.org/wiki/SOLID_...iented_design)

    Do you write clean code? - if you are TDD'ing then maybe, if not, your not writing clean code.

  3. #3

    Hi,

    Originally posted by high_flyer:
        This code never gets called IMHO, or only once, if _bufPos is initialized with something other than zero which is greater than _bufSize when it gets initialized:
    _bufPos is being increased in readLineData() and is reset in fillBuffer(), since I start reading at the beginning of the buffer after filling it. So the code works fine, it's just quite slow...

    Originally posted by high_flyer:
        Is there a reason you are reading char after char and not a whole line using QIODevice::readLine()?
        Because this is what takes so long...
    But there is no QIODevice yet; that is what I am trying to create. The CompressedFile class is a wrapper for a gzipped file that inherits QIODevice, and readLineData() is the protected function that readLine() calls internally. I'm reading the buffer char by char because I'm looking for a line break...

    Cheers,
    fallen

  4. #4

        _bufPos is being increased in readLineData()
    Oh, right, I missed that one... sorry.

        And I'm reading the buffer char after char, since I'm looking for a line break...
    But readLine() will also recognize line breaks... that is exactly what it does!

    Why not just unzip the file, get the buffer, put the buffer in a QTextStream, and read it line by line?

  5. #5

    Originally posted by high_flyer:
        Why not just unzip the file, get the buffer, put the buffer in a QTextStream, and read it line by line?
    The file is too large, a couple of hundred MB. But maybe I'll do that anyway... Thanks.
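
    If holding a couple of hundred MB in memory is the concern, the lines can also be split while the data streams through, so that only the current chunk plus one partial line is kept at any time. A minimal sketch, assuming the decompressed chunks arrive in order (for example from repeated gzread() calls); LineSplitter, feed() and finish() are made-up names for illustration, not Qt or zlib API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <string>
#include <utility>

// Hypothetical helper (assumption, not part of Qt or zlib): feed it the
// decompressed chunks in order and it calls onLine once per complete line,
// without the trailing '\n'.
class LineSplitter {
public:
    using OnLine = std::function<void(const std::string&)>;

    explicit LineSplitter(OnLine onLine) : onLine_(std::move(onLine)) {
        assert(static_cast<bool>(onLine_));  // a callback is required
    }

    // Splits one chunk; a line cut off at the chunk boundary is carried over
    // in partial_ until the next chunk completes it.
    void feed(const char* data, std::size_t n) {
        const char* p = data;
        const char* end = data + n;
        while (p < end) {
            const std::size_t left = static_cast<std::size_t>(end - p);
            const char* nl = static_cast<const char*>(std::memchr(p, '\n', left));
            if (!nl) {
                partial_.append(p, left);  // no line break yet: keep the tail
                break;
            }
            partial_.append(p, static_cast<std::size_t>(nl - p));
            onLine_(partial_);
            partial_.clear();
            p = nl + 1;  // continue after the line break
        }
    }

    // Call once after the last chunk to flush a final line without '\n'.
    void finish() {
        if (!partial_.empty()) {
            onLine_(partial_);
            partial_.clear();
        }
    }

private:
    OnLine onLine_;
    std::string partial_;  // bytes of the line currently being assembled
};
```

    Memory use then depends on the chunk size and the longest line rather than on the size of the uncompressed file.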
