
Thread: qUncompress data from gzip

  1. #1
    Join Date
    Dec 2008
    Location
    Poland
    Posts
    383
    Thanks
    52
    Thanked 42 Times in 42 Posts
    Qt products
    Qt4
    Platforms
    MacOS X Unix/X11 Windows Android

    qUncompress data from gzip

    Hello,
    I have been struggling with this for almost a month (with breaks, of course) and can't figure out how to decompress gzip data received over HTTP.
    I use the following code in the slot connected to QNetworkAccessManager's SIGNAL( finished(QNetworkReply *) ):

    Qt Code:
    const QByteArray ba = reply->readAll();
    QByteArray dataPlusSize;
    const unsigned size = ba.size();

    dataPlusSize.prepend( ((size >> 24) & 0xFF));
    dataPlusSize.prepend( ((size >> 16) & 0xFF));
    dataPlusSize.prepend( ((size >> 8) & 0xFF));
    dataPlusSize.prepend( ((size >> 0) & 0xFF));

    dataPlusSize.prepend(ba);

    dataPlusSize = qUncompress(dataPlusSize);
    qDebug() << dataPlusSize;
    and the output is, as always when the data is wrong, "qUncompress: Z_DATA_ERROR: Input data is corrupted".
    So my question is: does readAll() actually return ONLY the gzip data, or some additional chunk of data as well? Or am I doing something wrong? (I checked with Wireshark and the data is indeed gzipped; I tested with qCompress and everything works fine, so my only conclusion is that readAll() returns something more.)

    Best regards

  2. #2
    Join Date
    Dec 2008
    Location
    Poland
    Posts
    383
    Thanks
    52
    Thanked 42 Times in 42 Posts
    Qt products
    Qt4
    Platforms
    MacOS X Unix/X11 Windows Android

    Re: qUncompress data from gzip

    After some more reading, I found out that gzip saves the uncompressed data size in the last 4 bytes (hence the 4 GB limit on the size gzip can record, and hence prepending the size for qUncompress, which expects that information in the first 4 bytes, not the last), and that the data from readAll() is indeed all gzip stream.
    So my only problem, it seems, is big-endian byte order. I don't know, or rather can't quite tell from the docs, whether only the size needs to be big-endian or the entire compressed data.
    My code so far:
    Qt Code:
    const char dat[40] = {
        0x1F, 0x8B, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0xAA, 0x2E, 0x2E, 0x49, 0x2C, 0x29,
        0x2D, 0xB6, 0x4A, 0x4B, 0xCC, 0x29, 0x4E, 0xAD, 0x05, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x03, 0x00,
        0x2A, 0x63, 0x18, 0xC5, 0x0E, 0x00, 0x00, 0x00
    };

    // dat decompresses to the string {status:false}
    // the last 4 bytes of gzip hold the uncompressed length: 0x0E = 14, which is correct
    QByteArray data;
    data.fromRawData( dat, 40);

    QByteArray uncomp = qUncompress( data );
    qDebug() << uncomp;
    I also tried moving the last 4 bytes to the beginning, without success. (I also tried placing 0x0E both first and last within the 32-bit word.)
    So if some kind soul could point out my mistake, I would be more than grateful.
    Best regards
    Last edited by Talei; 21st April 2010 at 05:54. Reason: updated contents

  3. #3
    Join Date
    Feb 2007
    Location
    Karlsruhe, Germany
    Posts
    469
    Thanks
    17
    Thanked 90 Times in 88 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Re: qUncompress data from gzip

    Hi Talei!

    You are using fromRawData incorrectly. It is a static member function that returns a QByteArray.

    Qt Code:
    #include <QtCore>
    #include <QtGui>

    int main(int argc, char *argv[])
    {
        QApplication a(argc, argv);

        // Round trip through qCompress/qUncompress for comparison
        QString test = "{status:false}";
        QByteArray ba = qCompress(test.toUtf8());
        qDebug() << "compressed: " << ba.toHex();
        qDebug() << "uncompressed: " << qUncompress(ba);

        const char dat[40] = {
            0x1F, 0x8B, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0xAA, 0x2E, 0x2E, 0x49, 0x2C, 0x29,
            0x2D, 0xB6, 0x4A, 0x4B, 0xCC, 0x29, 0x4E, 0xAD, 0x05, 0x00, 0x00, 0x00, 0xFF, 0xFF, 0x03, 0x00,
            0x2A, 0x63, 0x18, 0xC5, 0x0E, 0x00, 0x00, 0x00
        };

        // dat decompresses to the string {status:false}
        // the last 4 bytes of gzip hold the uncompressed length: 0x0E = 14, which is correct
        QByteArray data = QByteArray::fromRawData(dat, 40); // static function: use its return value
        qDebug() << "faulty data: " << data.toHex();
        QByteArray uncomp = qUncompress( data );
        qDebug() << uncomp.toHex();

        return 0;
    }

    Have you read: http://doc.qt.nokia.com/4.6/qbytearray.html#qUncompress ?

    It states that you just need to prepend the size in big-endian order. Nothing else. Correct me if I'm wrong, but the prepend snippet you provided prepends ba last, which puts it in front of the size bytes. So effectively the size is appended after the data, in "most significant byte last" order.
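A minimal sketch of the byte order described above, written in plain C++ rather than Qt so that it stands alone. `withSizePrefix` is a hypothetical helper name, not a Qt function; it only illustrates that the most significant byte of the size comes first and the compressed payload follows. As I read the linked documentation, the payload itself is still expected to be zlib-compressed data.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Qt): returns a copy of `payload` with the
// expected uncompressed size prepended as four big-endian bytes.
std::vector<std::uint8_t> withSizePrefix(const std::vector<std::uint8_t> &payload,
                                         std::uint32_t uncompressedSize)
{
    std::vector<std::uint8_t> out;
    out.reserve(payload.size() + 4);
    // Big-endian: most significant byte first.
    out.push_back(static_cast<std::uint8_t>((uncompressedSize >> 24) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((uncompressedSize >> 16) & 0xFF));
    out.push_back(static_cast<std::uint8_t>((uncompressedSize >> 8) & 0xFF));
    out.push_back(static_cast<std::uint8_t>(uncompressedSize & 0xFF));
    // The compressed bytes follow the size, unchanged.
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}
```

For a size of 14 this yields the prefix 00 00 00 0e followed by the payload bytes.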

    Could you provide your dat array exactly as you read it from the file?

    Johannes

  4. #4
    Join Date
    Dec 2008
    Location
    Poland
    Posts
    383
    Thanks
    52
    Thanked 42 Times in 42 Posts
    Qt products
    Qt4
    Platforms
    MacOS X Unix/X11 Windows Android

    Re: qUncompress data from gzip

    Thank you very much for the input.
    So, first things first: my second snippet indeed doesn't prepend the size at the beginning, but I have also tried it this way:

    Qt Code:
    // gzip stream, with header and trailer
    static const char dat[40] = {
        0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0xaa, 0x2e, 0x2e, 0x49, 0x2c, 0x29,
        0x2d, 0xb6, 0x4a, 0x4b, 0xcc, 0x29, 0x4e, 0xad, 0x05, 0x00, 0x00, 0x00, 0xff, 0xff, 0x03, 0x00,
        0x2a, 0x63, 0x18, 0xc5, 0x0e, 0x00, 0x00, 0x00
    };
    QByteArray data = QByteArray::fromRawData(dat, 40);
    unsigned int size = 14; // expected uncompressed size, taken from the last 4 bytes: 0x0e, 0x00, 0x00, 0x00 = 0x0e = 14

    QByteArray dataPlusSize; // empty array; add the uncompressed size at the beginning
    // big-endian order
    dataPlusSize.append( (char)((size >> 24) & 0xFF));
    dataPlusSize.append( (char)((size >> 16) & 0xFF));
    dataPlusSize.append( (char)((size >> 8) & 0xFF));
    dataPlusSize.append( (char)((size >> 0) & 0xFF));

    dataPlusSize.append( data );
    QByteArray uncomp = qUncompress( dataPlusSize );
    qDebug() << uncomp;

    qDebug() prints the size and data as:
    Qt Code:
    //data
    "1f8b0800000000000003aa2e2e492c292db64a4bcc294ead05000000ffff03002a6318c50e000000" data size: 40
    //dataPlusSize, this I want to qUncompress
    "0000000e1f8b0800000000000003aa2e2e492c292db64a4bcc294ead05000000ffff03002a6318c50e000000" dataPlusSize size: 44

    dat[] is a gzip stream captured with Wireshark, and it really is a gzip stream (when I save it to a file, e.g. data.gzip, zip/rar tools can open and decompress it, and so can Wireshark).
    It consists of a 10-byte header, the DEFLATE payload, and an 8-byte trailer (4-byte CRC32 + 4-byte uncompressed size).
    The gzip header consists of (in my numbering the first element is at position 1, not 0):
    * Header size: 10 bytes
    * First byte: ID1 = 0x1F
    * Second byte: ID2 = 0x8B
    * Third byte: CM, compression method: 0-7 reserved, 0x08 == DEFLATE
    * Byte 4: FLG, flags for optional fields such as file name, comment, and CRC16; bytes 5-8: MTIME; byte 9: XFL (extra flags); byte 10: OS (operating system). In my array bytes 4-9 are 0, and byte 10 is 0x03, which is UNIX (that is also correct: the web server I got this data from is a *nix machine).
    Then comes the gzip payload (the actual DEFLATE stream).
    After the stream come the 4-byte CRC32 and, in the last 4 bytes, ISIZE, the UNCOMPRESSED SIZE of the DEFLATE stream.
    Information source: RFC1952, RFC2616, RFC1951, zlib web page.
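To make the layout above concrete, here is a small sketch in plain C++ (no Qt, no zlib) that reads the fixed header fields and the trailer from the 40-byte capture. `GzipInfo`, `readLe32` and `parseGzip` are hypothetical names made up for this illustration. It assumes a header without optional fields, and it reads the trailer as two little-endian 32-bit values as RFC 1952 specifies.

```cpp
#include <cstddef>
#include <cstdint>

// The 40-byte capture from this post.
static const std::uint8_t kCapture[40] = {
    0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03, 0xaa, 0x2e, 0x2e, 0x49, 0x2c, 0x29,
    0x2d, 0xb6, 0x4a, 0x4b, 0xcc, 0x29, 0x4e, 0xad, 0x05, 0x00, 0x00, 0x00, 0xff, 0xff, 0x03, 0x00,
    0x2a, 0x63, 0x18, 0xc5, 0x0e, 0x00, 0x00, 0x00
};

// Hypothetical container for the fields discussed in the post.
struct GzipInfo {
    bool magicOk;         // ID1 == 0x1f and ID2 == 0x8b
    std::uint8_t method;  // CM, 8 == DEFLATE
    std::uint8_t flags;   // FLG
    std::uint8_t os;      // OS, 3 == UNIX
    std::uint32_t crc32;  // CRC32 of the uncompressed data
    std::uint32_t isize;  // uncompressed size
};

// gzip stores its 32-bit trailer fields least significant byte first.
static std::uint32_t readLe32(const std::uint8_t *p)
{
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

// Returns a zeroed GzipInfo if the buffer cannot hold header plus trailer.
GzipInfo parseGzip(const std::uint8_t *data, std::size_t len)
{
    GzipInfo info = {};
    if (len < 18)  // 10-byte header + 8-byte trailer
        return info;
    info.magicOk = data[0] == 0x1f && data[1] == 0x8b;
    info.method = data[2];
    info.flags = data[3];
    info.os = data[9];
    info.crc32 = readLe32(data + len - 8);
    info.isize = readLe32(data + len - 4);
    return info;
}
```

Calling `parseGzip(kCapture, sizeof kCapture)` gives method 8, OS 3 and an uncompressed size of 14, matching the values listed above.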
    I don't know why qUncompress wants the size at the beginning; according to the gzip standard it belongs in the last 4 bytes, not the first, hence the "magic" with prepending the size.
    I debugged the above code, and this is what I found:
    in inflate.c, it seems that inflate allocates the stream correctly, but at inflate.c line 596:
    Qt Code:
    if ((state->wrap & 2) && hold == 0x8b1f) { /* gzip header */ ... } // == false, why?
    and then, at inflate.c lines 606 to 613:
    Qt Code:
    if (!(state->wrap & 1) || /* check if zlib header allowed */
    #else
    if (
    #endif
        ((BITS(8) << 8) + (hold >> 8)) % 31) { // why is this true?
        strm->msg = (char *)"incorrect header check"; // <- the error is set here
        state->mode = BAD; // mode becomes BAD, then break
        break;
    }
    and the miserable result:
    Qt Code:
    qUncompress: Z_DATA_ERROR: Input data is corrupted
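The arithmetic in the quoted check can be reproduced on its own. The sketch below is plain C++ and only mirrors the expression from inflate.c: the first two bytes, read as a 16-bit big-endian value, must be a multiple of 31. `passesZlibHeaderCheck` is a hypothetical name. The interpretation is an assumption on my part: the false `state->wrap & 2` test suggests that gzip headers are not enabled on this code path, so the gzip magic bytes are tested as if they were a zlib header.

```cpp
#include <cstdint>

// Same expression as ((BITS(8) << 8) + (hold >> 8)) % 31 in the quoted code,
// with `first` being the first byte of the stream and `second` the next one.
bool passesZlibHeaderCheck(std::uint8_t first, std::uint8_t second)
{
    return ((static_cast<unsigned>(first) << 8) + second) % 31 == 0;
}
```

With the gzip magic bytes, 0x1f8b = 8075 and 8075 % 31 = 15, so the check fails and "incorrect header check" is reported. A typical zlib header such as 0x78 0x9c gives 30876 = 31 * 996 and passes.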
    It seems that the error is in either inflateInit() or inflateInit2().
    But when I decompress this myself, like this:
    Qt Code:
    #include "zlib.h"
    QByteArray gzipHttpDec::gzipDecompress( QByteArray compressData )
    {
        // Decompress GZIP data.
        // Strip the 10-byte header and the 8-byte trailer (CRC32 + ISIZE).
        // This assumes a header without optional fields, as in dat[] above.
        if (compressData.size() < 18)
            return QByteArray();
        compressData.remove(0, 10);
        compressData.chop(8);

        const int buffersize = 16384;
        quint8 buffer[buffersize];

        z_stream cmpr_stream;
        cmpr_stream.next_in = (unsigned char *)compressData.data();
        cmpr_stream.avail_in = compressData.size();
        cmpr_stream.total_in = 0;

        cmpr_stream.next_out = buffer;
        cmpr_stream.avail_out = buffersize;
        cmpr_stream.total_out = 0;

        cmpr_stream.zalloc = Z_NULL;
        cmpr_stream.zfree = Z_NULL;
        cmpr_stream.opaque = Z_NULL;

        // Negative windowBits selects a raw DEFLATE stream (no zlib/gzip wrapper).
        if( inflateInit2(&cmpr_stream, -MAX_WBITS ) != Z_OK) {
            qDebug() << "cmpr_stream error!";
            return QByteArray();
        }

        QByteArray uncompressed;
        int status;
        do {
            status = inflate( &cmpr_stream, Z_SYNC_FLUSH );
            if(status != Z_OK && status != Z_STREAM_END)
                break; // corrupted or truncated input

            // Copy what was produced and reuse the output buffer.
            uncompressed.append(QByteArray::fromRawData((char *)buffer, buffersize - cmpr_stream.avail_out));
            cmpr_stream.next_out = buffer;
            cmpr_stream.avail_out = buffersize;
        } while(status == Z_OK);

        inflateEnd(&cmpr_stream); // release zlib state on every path
        return uncompressed;
    }
    the data is decoded correctly (not only the dat[40] posted here, but ANY gzip data that comes from the web; tested and works fine). The function above is not perfect, because it doesn't check whether the stream is valid; it is only a draft, but it works with the dat[] array above.
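One validity check such a draft could add is sketched below in plain C++. A gzip header is only 10 bytes long when FLG is 0; RFC 1952 allows optional fields (extra data, file name, comment, header CRC) that move the start of the DEFLATE payload. `gzipPayloadOffset` is a hypothetical helper written for this illustration, not a zlib or Qt function, and it validates only the header layout, not the CRC32.

```cpp
#include <cstddef>
#include <cstdint>

// Returns the offset of the first DEFLATE byte, or 0 if the buffer does not
// look like a gzip stream or is too short. The payload then runs from the
// returned offset up to len - 8 (the trailer).
std::size_t gzipPayloadOffset(const std::uint8_t *d, std::size_t len)
{
    if (len < 18 || d[0] != 0x1f || d[1] != 0x8b || d[2] != 8)
        return 0;
    const std::uint8_t flg = d[3];
    std::size_t pos = 10;                       // fixed part of the header
    if (flg & 0x04) {                           // FEXTRA: 2-byte length + data
        if (pos + 2 > len)
            return 0;
        const std::size_t xlen = d[pos] | (std::size_t(d[pos + 1]) << 8);
        pos += 2 + xlen;
    }
    if (flg & 0x08) {                           // FNAME: zero-terminated string
        while (pos < len && d[pos] != 0)
            ++pos;
        ++pos;                                  // skip the terminator
    }
    if (flg & 0x10) {                           // FCOMMENT: zero-terminated string
        while (pos < len && d[pos] != 0)
            ++pos;
        ++pos;
    }
    if (flg & 0x02)                             // FHCRC: 2-byte header CRC
        pos += 2;
    return (pos + 8 <= len) ? pos : 0;          // the 8-byte trailer must still fit
}
```

For the dat[] array above, where FLG is 0, this returns 10, which matches the fixed header size used in the draft.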

    So, to sum up, I don't have the slightest idea why qUncompress doesn't decompress it. I saw in inflate.c that inflateInit2() is used, and my data is indeed passed to inflateInit2(), but the error still occurs.
    I don't really want to write my own decompressor, but at the moment it works, so if someone knows where my mistake is, please point it out.
    Best regards.
    EDIT: Here is my other post, made due to the lack of response here: http://stackoverflow.com/questions/2...ress-gzip-data.
    EDIT2: I assume that only the SIZE should be big-endian, not the DATA itself; maybe that's the mistake?
    Last edited by Talei; 26th April 2010 at 04:38. Reason: updated contents

  5. #5
    Join Date
    Feb 2007
    Location
    Karlsruhe, Germany
    Posts
    469
    Thanks
    17
    Thanked 90 Times in 88 Posts
    Qt products
    Qt4
    Platforms
    Unix/X11 Windows

    Re: qUncompress data from gzip

    Hi!

    Maybe qUncompress doesn't like that zlib version? The header seems different between your zlib stream and the one produced by qCompress. If you debug qUncompress, are there any branches for different headers?

    Good luck,

    Johannes

  6. #6
    Join Date
    Sep 2010
    Posts
    6

    Re: qUncompress data from gzip

    Hi,
    Did you succeed in uncompressing gzip data with qUncompress?
    Can you give me a hint?

    I have the same problem when trying to decode an SWF file header...

    Thanks

  7. #7
    Join Date
    Dec 2008
    Location
    Poland
    Posts
    383
    Thanks
    52
    Thanked 42 Times in 42 Posts
    Qt products
    Qt4
    Platforms
    MacOS X Unix/X11 Windows Android

    Re: qUncompress data from gzip

    No. I wrote my own function to do that, see last code snippet.
