Thread: QRegExp for octal escape sequence in QString

  1. #1
    Join Date
    May 2007
    Posts
    131
    Thanks
    17
    Thanked 4 Times in 2 Posts

    QRegExp for octal escape sequence in QString

    Hello,


    I have a QString that contains octal escape sequences which I have to parse.

    A string can for example look like this:

    Qt Code:
    "test123/032test/032/0352"

    which should result in

    Qt Code:
    "test123 test #2"

    after I am done with replacing the octal escape sequences.

    I tried to use QRegExp for this, but I am not able to filter the sequences.



    Would really appreciate any help.

    Thank you in advance.

  2. #2
    Join Date
    Oct 2009
    Posts
    483
    Thanked 97 Times in 94 Posts
    Qt products
    Qt4 Qt5
    Platforms
    Unix/X11 Windows

    Re: QRegExp for octal escape sequence in QString

    Quote (originally posted by bmn):
    "I tried to use QRegExp for this, but I am not able to filter the sequences."

    What have you tried? Please show some code and we will help you towards a solution.

    You do not have to use QRegExp (or QRegularExpression if you use Qt 5) for this. A hand-written parser with a basic state machine does the trick.

  3. #3
    Re: QRegExp for octal escape sequence in QString

    Thank you for your response.

    I tried the following way, but since the escape sequence is treated as a single character it is obviously not working.

    Qt Code:
    [\\\d\d\d]{4,4}

    After I convert the given escape sequence to a char, '\032' yields 26 decimal, which is not what I want, since it should be treated as 32 decimal to represent a ' ' (space) in ASCII.
    But I can overcome this by simply adding 6 to whatever result I get.

    Qt Code:
    char tmp = myStr.at(i).toLatin1(); // '\032' -> 26 dec
    char ascii = tmp + 6; // convert to ASCII encoding: 26 dec -> 32 dec (' ')
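
    A note on the pattern above: square brackets define a character class, so the pattern matches a run of characters drawn from a set rather than a backslash followed by three digits. Below is a minimal sketch of a sequence-based pattern. It assumes the escape is a literal backslash followed by exactly three decimal digits, and it uses std::regex and std::string in place of QRegExp and QString so that it compiles without Qt. decodeDecimalEscapes is a hypothetical helper name, not a Qt function.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of Qt): replaces each "\ddd" (backslash plus
// three decimal digits) with the single byte that has that decimal value.
std::string decodeDecimalEscapes(const std::string &input)
{
    // Raw string literal: the regex is \\(\d{3}), i.e. an escaped backslash
    // followed by a captured group of exactly three digits.
    static const std::regex escape(R"(\\(\d{3}))");

    std::string output;
    auto last = input.cbegin();
    for (std::sregex_iterator it(input.cbegin(), input.cend(), escape), end;
         it != end; ++it) {
        const std::smatch &m = *it;
        output.append(last, m[0].first);              // text before the match
        const int value = std::stoi(m[1].str());      // "032" -> 32 (base 10)
        if (value <= 255)
            output.push_back(static_cast<char>(value));
        else
            output.append(m[0].first, m[0].second);   // out of range: keep as is
        last = m[0].second;
    }
    output.append(last, input.cend());                // remaining tail
    return output;
}
```

    The digits are read as a decimal number here, so "\032" becomes the space character (32), which is the behaviour asked for in the first post.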
    Last edited by bmn; 6th January 2016 at 10:25. Reason: wrap code

  4. #4
    Re: QRegExp for octal escape sequence in QString

    This is confusing.

    You mentioned "octal escape sequences" in your first post, although you apparently want to interpret them as decimal numbers. What do you mean by "octal", then?

    Your first post suggests that your escape sequences consist of a slash followed by three decimal digits, but your second post suggests that they begin with a backslash.

    Your second post apparently confuses C octal escape sequences interpreted by the compiler in your source code with the escape sequences that your program shall interpret at runtime. These are two completely unrelated things.

    So, could you please:
    • precisely describe the format and meaning of your escape sequences;
    • provide the complete code that you wrote, e.g. in the form of a function that takes a QString and returns the QString obtained by interpreting the escape sequences?
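
    To illustrate the difference between the two kinds of escape sequences mentioned above, here is a small sketch in plain C++ (std::string instead of QString, so that it compiles without Qt). The function names are made up for the illustration.

```cpp
#include <string>

// In source code, the compiler replaces the octal escape \032 with ONE char
// whose value is 032 octal = 26 decimal. The running program never sees
// the backslash or the digits.
inline std::string compiledLiteral()
{
    return "a\032b";    // 3 characters: 'a', char(26), 'b'
}

// Text that arrives at runtime (from a file, a socket, an API) holds the
// backslash and the digits as separate characters. To write such text in
// source code, the backslash itself has to be escaped.
inline std::string runtimeText()
{
    return "a\\032b";   // 6 characters: 'a', '\', '0', '3', '2', 'b'
}
```

    Only the second form is something a decoder in your program can parse. How the digits are to be read (octal or decimal) is then up to the format being decoded, not up to the C++ compiler.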

  5. #5
    Re: QRegExp for octal escape sequence in QString

    Hello,

    The purpose is to decode a full service name string from the DNS-SD API. I ended up writing a small state machine, as you suggested. In case anybody is interested, here is the code.
    The QString 'tmp' holds the escaped name of the service.


    Qt Code:
    /*
     * All strings used in the DNS-SD APIs are UTF-8 strings. Apart from the exceptions noted below,
     * the APIs expect the strings to be properly escaped, using the conventional DNS escaping rules:
     *
     * '\\' represents a single literal '\' in the name
     * '\.' represents a single literal '.' in the name
     * '\ddd', where ddd is a three-digit decimal value from 000 to 255,
     * represents a single literal byte with that value.
     * A bare unescaped '.' is a label separator, marking a boundary between domain and subdomain.
     */
    // Note: QChar(nr) maps the byte to the code point with the same value (Latin-1),
    // so bytes that belong to multi-byte UTF-8 sequences are not recombined here.
    enum State { startChar, digit1, digit2, digit3 };
    State state = startChar;
    int signPos = 0;   // index of the backslash that opened the current escape
    QString nrStr;     // digits collected so far for a '\ddd' escape
    for (int i = 0; i < tmp.size(); i++) {
        QChar ch = tmp.at(i);
        switch (state) {
        case startChar:
            if (ch == '\\') {
                signPos = i;
                nrStr.clear();
                state = digit1;
            }
            break;
        case digit1:
            if ((ch == '\\') || (ch == '.')) {
                // '\\' or '\.': drop the escaping backslash and keep the literal.
                tmp.remove(i - 1, 1);
                i--; // the literal moved to i-1; the loop's i++ resumes right after it
                state = startChar;
            } else if ((ch >= '0') && (ch <= '2')) {
                nrStr.append(ch);
                state = digit2;
            } else {
                state = startChar; // malformed escape: leave the text untouched
            }
            break;
        case digit2:
            if ((ch >= '0') && (ch <= '9')) {
                nrStr.append(ch);
                state = digit3;
            } else {
                i--; // malformed escape: rescan this character as plain text
                state = startChar;
            }
            break;
        case digit3:
            if ((ch >= '0') && (ch <= '9')) {
                nrStr.append(ch);
                int nr = nrStr.toInt();
                if (nr <= 255) {
                    // Replace the 4 characters '\ddd' with the decoded character.
                    tmp.replace(signPos, 4, QChar(nr));
                    i = i - 3; // string is shorter by 3; resume after the decoded character
                }
            } else {
                i--; // malformed escape: rescan this character as plain text
            }
            state = startChar;
            break;
        }
    }

  6. #6
    Re: QRegExp for octal escape sequence in QString

    Your state machine code looks good. What is unusual is that it unescapes the QString in place. That works, but it will not scale well to long strings: each call to remove() or replace() presumably moves the whole remainder of the string, which gives quadratic time complexity in the length of the string. If this becomes a performance bottleneck in your program, consider changing the interface of your decoder so that it can be fed the encoded string in chunks and outputs chunks of the decoded string (see QTextDecoder to get an idea).
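
    As a rough idea of what such a chunk-based interface could look like, here is a minimal sketch. It is plain C++ with std::string rather than QString so that it compiles without Qt, and EscapeDecoder is a hypothetical class name, not an existing Qt class. Each input character is visited once and decoded bytes are appended to a separate output string, so the work stays linear in the input length. How malformed escapes and values above 255 are treated is an assumption of the sketch.

```cpp
#include <string>

// Hypothetical chunk-based decoder (not a Qt class). It keeps its state
// between calls, so an escape may be split across two chunks.
class EscapeDecoder
{
public:
    // Decodes one chunk and returns the bytes that became available.
    std::string feed(const std::string &chunk)
    {
        std::string out;
        out.reserve(chunk.size());
        for (char ch : chunk) {
            const bool isDigit = (ch >= '0' && ch <= '9');
            switch (state_) {
            case Plain:
                if (ch == '\\')
                    state_ = AfterBackslash;
                else
                    out.push_back(ch);
                break;
            case AfterBackslash:
                if (isDigit) {
                    value_ = ch - '0';
                    digits_ = 1;
                    state_ = Digits;
                } else {
                    out.push_back(ch);   // "\\" -> '\', "\." -> '.'
                    state_ = Plain;
                }
                break;
            case Digits:
                if (!isDigit) {
                    // Assumption: a malformed escape is dropped and the
                    // current character is kept as plain text.
                    out.push_back(ch);
                    state_ = Plain;
                    break;
                }
                value_ = value_ * 10 + (ch - '0');
                if (++digits_ == 3) {
                    if (value_ <= 255)   // assumption: larger values are dropped
                        out.push_back(static_cast<char>(value_));
                    state_ = Plain;
                }
                break;
            }
        }
        return out;
    }

private:
    enum State { Plain, AfterBackslash, Digits };
    State state_ = Plain;
    int value_ = 0;    // number accumulated so far
    int digits_ = 0;   // how many digits of the escape have been read
};
```

    A caller would create one EscapeDecoder per name and concatenate the strings returned by successive feed() calls.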

    Here is an optimization for your current code: instead of accumulating digits in nrStr before converting them to a uint, you could progressively compute the number nr: replace nrStr.clear() with nr = 0, and replace nrStr.append(ch) with nr = nr * 10 + ch.toLatin1() - '0'.
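
    The progressive computation can be checked in isolation with a few lines of plain C++. This is only a sketch: accumulateDigits is a made-up helper name, it takes a std::string so that it compiles without Qt, and it assumes every character is a decimal digit.

```cpp
#include <string>

// Hypothetical helper: converts a run of decimal digit characters to its
// value one digit at a time, without building an intermediate string.
// Assumes every character in 'digits' is between '0' and '9'.
unsigned accumulateDigits(const std::string &digits)
{
    unsigned nr = 0;
    for (char ch : digits) {
        // For "032": 0*10+0 = 0, then 0*10+3 = 3, then 3*10+2 = 32.
        nr = nr * 10 + static_cast<unsigned>(ch - '0');
    }
    return nr;
}
```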
