Thread: Parsing/extracting from a binary QByteArray

  1. #1
    Forgive me if "parse" only applies to text-based data, but the idea is the same. I have a chunk of bytes stored in a 30-byte QByteArray that I extracted from a very organized external file via QFile::read(). I know that the first 12 bytes are 3 longs, the next 2 bytes are an unsigned short, etc.
    Qt Code:
    QFile infile("c:/temp/file.bin");  // forward slashes: "\t" and "\f" inside a string literal are escape sequences
    //... open the file, etc etc
    QByteArray chunk = infile.read(30);
    I then pass that QByteArray into a parser function running in another thread while the main thread moves on to the next chunk. This happens millions of times in sequence, so wrapping that QByteArray in a QDataStream as suggested in the documentation slows the process down so much that I want to try manipulating the QByteArray directly.

    How do I (for example) read the first 4 bytes from a QByteArray, concatenate them, then cast the result into a long int? I'm looking for a faster equivalent to this that doesn't require me to construct a QDataStream (and, by extension, a QBuffer):
    Qt Code:
    QDataStream io(chunk);
    qint32 data;
    io >> data;
    I've tried this:
    Qt Code:
    bool ok = true;
    chunk.mid(0,4).toInt(&ok); //ok==false
    but it doesn't work since the data is binary, not characters.

    The only option I've come up with looks really ugly:
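    Qt Code:
    // Convert each byte to unsigned first: at() returns a (possibly signed) char,
    // and a byte >= 0x80 would otherwise sign-extend into the upper bits.
    quint32 a = static_cast<quint8>(chunk.at(0));
    quint32 b = static_cast<quint8>(chunk.at(1));
    quint32 c = static_cast<quint8>(chunk.at(2));
    quint32 d = static_cast<quint8>(chunk.at(3));
    quint32 abcd = (a << 24) | (b << 16) | (c << 8) | d;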
    I'm not even sure that this works or if it's faster! Of course, I still have to deal with endianness and all that, but let's just assume that's not a problem for now.
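For reference, a minimal sketch of the shift-and-OR idea, written in plain C++ so it compiles without Qt. It assumes the bytes are stored most significant first (matching the `a << 24` ordering in the snippet), and `readBE32` is a hypothetical helper name, not a Qt function. With a QByteArray, the pointer and size would presumably come from `chunk.constData()` and `chunk.size()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Qt): assemble a big-endian 32-bit value
// from the four bytes starting at `offset`. Each byte is read as unsigned,
// because plain `char` may be signed and a byte >= 0x80 would otherwise
// sign-extend and corrupt the upper bits of the result.
inline std::uint32_t readBE32(const char *data, std::size_t size, std::size_t offset)
{
    assert(offset + 4 <= size);  // caller must supply at least four bytes
    const unsigned char *p = reinterpret_cast<const unsigned char *>(data) + offset;
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}
```

Because the value is built from individual bytes, the result does not depend on the byte order of the machine running the code, only on the byte order of the file.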

    Any suggestions?

  2. #2
    The approach below works only if the hardware platform and the data chunk have the same endianness. If they differ, you are looking at a different problem: every multi-byte field then has to be byte-swapped, which will certainly degrade performance.

    It is plain C style, with not much C++ to it:

    Qt Code:
    // The fields must mirror the file layout exactly (no compiler padding, same byte order).
    struct Record
    {
        quint32 Long1;
        quint32 Long2;
        quint32 Long3;
        quint32 Long4;
        quint16 Short1;
        quint16 Short2;
        //...
    };

    // View the raw bytes as a Record; the pointer is only valid while chunk is alive and unmodified.
    const Record *record = reinterpret_cast<const Record *>(chunk.constData());
    quint32 long_data1 = record->Long1;
    quint32 long_data2 = record->Long2;
    quint16 short_data1 = record->Short1;
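The pointer cast depends on the struct having no padding and on the buffer being suitably aligned. A hedged alternative, sketched in plain C++ so it needs no Qt, copies each field out with `std::memcpy` instead. The names `Record` and `parseRecord` are hypothetical, and the layout (three 32-bit values followed by one 16-bit value) is an assumption taken from the first post; the rest of the 30 bytes is not described in the thread, so it is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed layout, from the first post: three 32-bit values, then one
// 16-bit value. Later fields are unknown and therefore omitted.
struct Record
{
    std::uint32_t long1;
    std::uint32_t long2;
    std::uint32_t long3;
    std::uint16_t short1;
};

// Hypothetical helper: copy each field out of the raw bytes with memcpy.
// Unlike a pointer cast, this does not depend on struct padding or on the
// buffer's alignment. It still assumes file and host share a byte order.
inline Record parseRecord(const char *data, std::size_t size)
{
    assert(size >= 14);  // 3 * 4 bytes + 2 bytes
    Record r;
    std::memcpy(&r.long1,  data + 0,  sizeof r.long1);
    std::memcpy(&r.long2,  data + 4,  sizeof r.long2);
    std::memcpy(&r.long3,  data + 8,  sizeof r.long3);
    std::memcpy(&r.short1, data + 12, sizeof r.short1);
    return r;
}
```

Compilers typically turn these small fixed-size copies into plain loads, so this is not necessarily slower than the cast, but that is worth measuring rather than assuming.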

  3. The following user says thank you to Santosh Reddy for this useful post:

    Phlucious (2nd December 2011)

  4. #3
    If the contents of the 30-byte chunks are fixed as you indicate, and if the endianness is the same, then Santosh's solution is the way to go.

    Otherwise you must decode it byte by byte like you did, but you can write it more compactly, like this:
    Qt Code:
    const unsigned char *pData = reinterpret_cast<const unsigned char *>(chunk.constData());
    // Little-endian: the byte at offset 0 is the least significant one.
    quint32 long_data1 = (quint32)pData[0] | ((quint32)pData[0+1]<<8) | ((quint32)pData[0+2]<<16) | ((quint32)pData[0+3]<<24);
    And to make it more readable, create a #define to extract the data, substituting the '0' with the parameter of your define. And create different #defines for different data types.
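The "one extractor per data type" idea can be sketched as follows. Note that this uses inline functions in place of `#define` macros, which is a substitution for the macro approach suggested above (functions are type-checked and evaluate the offset only once). The names `readLE16` and `readLE32` are hypothetical, and the code is plain C++ so it compiles without Qt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers: read little-endian values at a byte offset, mirroring
// the pData[0] | pData[0+1]<<8 | ... expression with `offset` in place of 0.
inline std::uint16_t readLE16(const unsigned char *p, std::size_t size, std::size_t offset)
{
    assert(offset + 2 <= size);  // bounds check in debug builds
    return static_cast<std::uint16_t>(p[offset] | (p[offset + 1] << 8));
}

inline std::uint32_t readLE32(const unsigned char *p, std::size_t size, std::size_t offset)
{
    assert(offset + 4 <= size);  // bounds check in debug builds
    return  static_cast<std::uint32_t>(p[offset])            |
           (static_cast<std::uint32_t>(p[offset + 1]) << 8)  |
           (static_cast<std::uint32_t>(p[offset + 2]) << 16) |
           (static_cast<std::uint32_t>(p[offset + 3]) << 24);
}
```

With the layout from the first post, the three longs would be read at offsets 0, 4 and 8, and the unsigned short at offset 12.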

    As a sidenote... I don't think that parsing such small chunks of data in a separate thread will speed up things. Possibly the effort of task switching is greater than the effort of parsing.

    Best regards,
    Marc

  5. The following user says thank you to marcvanriet for this useful post:

    Phlucious (2nd December 2011)

  6. #4
    Thanks to both of you for useful responses. I hadn't thought of re-casting the char* into a struct. Wouldn't I have to add an extra char to the end of my struct because QByteArray::data() and QByteArray::constData() both return a null-terminated string?

    Qt has some useful endian-conversion functions in QtEndian that I might be able to use after importing.
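For illustration, here is roughly what such an endian conversion involves, sketched in plain C++ rather than with the QtEndian functions themselves (such as `qFromBigEndian`), so that it compiles without Qt. The names `hostIsLittleEndian`, `byteSwap32` and `loadBigEndian32` are hypothetical and are not Qt API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: detect the host byte order at run time by looking at
// the first byte in memory of a known 16-bit value.
inline bool hostIsLittleEndian()
{
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Reverse the four bytes of a 32-bit value.
inline std::uint32_t byteSwap32(std::uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Load four raw bytes that were stored big-endian and return the value in
// host byte order, swapping only when the host is little-endian.
inline std::uint32_t loadBigEndian32(const unsigned char *src, std::size_t size)
{
    assert(size >= 4);
    std::uint32_t v;
    std::memcpy(&v, src, sizeof v);
    return hostIsLittleEndian() ? byteSwap32(v) : v;
}
```

When file and host byte order already match, the conversion reduces to a plain copy, which is why a matching byte order is the cheap case.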

    Originally posted by marcvanriet:
    As a sidenote... I don't think that parsing such small chunks of data in a separate thread will speed up things. Possibly the effort of task switching is greater than the effort of parsing.
    My reasoning was that it'd be faster to read a 30-byte (or 60-byte, or 190-byte... there's a handful of possible record formats) chunk from the disk and parse it from memory instead of reading it 1-8 bytes at a time, since HDD read/write tends to be very slow unless you have an SSD. Is that reasoning sound, or does Qt cache the file more than I realize? I've had issues trying to rapidly process a lot of data straight off the hard drive in the past; it tends to be many times slower than loading the whole file into memory first.
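The "load the whole file into memory first" approach can be sketched like this, with `std::ifstream` standing in for QFile so the example needs no Qt. The helper names `readWholeFile` and `completeRecords` are hypothetical, and whether this beats many small reads on a given system is something to measure rather than assume.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file into memory with a single read.
// Returns an empty vector if the file cannot be opened or read.
inline std::vector<char> readWholeFile(const std::string &path)
{
    std::ifstream in(path, std::ios::binary | std::ios::ate);  // open at end
    if (!in)
        return {};
    const std::streamoff end = in.tellg();  // position at end == file size
    if (end < 0)
        return {};
    in.seekg(0, std::ios::beg);
    std::vector<char> bytes(static_cast<std::size_t>(end));
    if (end > 0 && !in.read(bytes.data(), static_cast<std::streamsize>(end)))
        return {};
    return bytes;
}

// Number of complete fixed-size records in a buffer of `byteCount` bytes.
inline std::size_t completeRecords(std::size_t byteCount, std::size_t recordSize)
{
    assert(recordSize > 0);
    return byteCount / recordSize;
}
```

Record `i` of a 30-byte format then starts at `bytes.data() + i * 30`, and any trailing partial record is ignored by `completeRecords`.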

    You make a good point about task switching, though... maybe I'll thread a couple thousand records at a time instead of one at a time.
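Handing out a couple thousand records at a time could be organized along these lines. This is only a sketch of the batching arithmetic in plain C++, with no threading code; `makeBatches` is a hypothetical helper name and the batch size is left to the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: split `recordCount` records into consecutive batches
// of at most `batchSize` records each. Every pair holds
// (index of the first record in the batch, number of records in the batch).
inline std::vector<std::pair<std::size_t, std::size_t>>
makeBatches(std::size_t recordCount, std::size_t batchSize)
{
    assert(batchSize > 0);
    std::vector<std::pair<std::size_t, std::size_t>> batches;
    for (std::size_t first = 0; first < recordCount; first += batchSize) {
        const std::size_t count = std::min(batchSize, recordCount - first);
        batches.emplace_back(first, count);
    }
    return batches;
}
```

With 30-byte records, a batch `(first, count)` covers the bytes from `first * 30` up to `(first + count) * 30` of the buffer, and each batch could be passed to a worker thread as one unit of work instead of one record at a time.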

    I read in this thread that the system might run faster if I used the native int instead of the smaller integers. Supposing for a second that speed's more important than memory, would it be worthwhile to re-cast the records into a struct of native integers after importing? I guess there's only one way to really find out...

  7. #5
    I read in ... that the system might run faster if I used the native int instead of the smaller integers. Supposing for a second that speed's more important than memory, would it be worthwhile to re-cast the records into a struct of native integers after importing?
    That won't speed up the reading part of course.

    It may only be useful if you run complex numerical algorithms on the data (for instance image manipulation, Fourier transforms, color space transformations, ...).

    It won't make a difference if you're just displaying the data in a graph or logging the data in a report or so.

    Best regards,
    Marc
    Last edited by marcvanriet; 2nd December 2011 at 20:37.

  8. The following user says thank you to marcvanriet for this useful post:

    Phlucious (2nd December 2011)
