Hi,
I have the following string:
"²¹Ðĺ£?#xce;"
What does this mean? How can I use QTextCodec to convert it to readable characters?
Thanks!
Exactly how many "characters" are in that string? I think the forum is playing with your input. I see literally this:
"²¹Ðĺ£?#xce;"
But I suspect you pasted some smaller number of "gibberish" characters.
Quote: "What does this mean?"
It is impossible to say out of context. This is part of the reason that Unicode is so useful.
Quote: "How can I use QTextCodec to convert it to readable characters?"
Impossible to know without knowing where the source material originated and what encoding was used.
Sorry, don't know what more information I should give.
It is supposed to be some Chinese characters. I think each block of ";" is a character. Normally they are readable Chinese characters that I can display in QTextEdit, but then I am hit by this unreadable string, and can't display it properly in QTextEdit.
The encoding is supposed to be GB2312...
Do you literally have "²¹Ðĺ£?#xce;" in your string? Am I correct in assuming the "?" is a typo and should be "&"?
Edit: The "&#xhh;" is an HTML/XML numeric character reference; here each one stands for a single byte with hexadecimal ("x") value hh.
GB2312 is a character set that can be encoded several ways. The "GB18030" QTextCodec is what you probably want to use:
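Qt Code:
#include <QtCore>

int main(int argc, char **argv)
{
    Q_UNUSED(argc);
    Q_UNUSED(argv);
    // Assumed input: the six complete bytes taken from the &#xhh; escapes
    // (the trailing 0xCE has no lead byte and is left out here)
    const QByteArray input("\xb2\xb9\xd0\xc4\xba\xa3");
    QTextCodec *codec = QTextCodec::codecForName("GB18030");
    const QString string = codec->toUnicode(input);
    qDebug() << string;
    // Outputs three Chinese characters:
    // "补心海"
    return 0;
}
Note that the last byte in the input string is an incomplete character. Does the result look correct?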
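Getting from the "&#xhh;" text to raw bytes is a separate step before the codec is applied. A minimal sketch in plain C++ follows, assuming every escape has exactly two hex digits; the helper name unescapeHexBytes is hypothetical and not part of Qt:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from Qt): replaces every well-formed "&#xhh;"
// escape (exactly two hex digits) with the single byte 0xhh. Anything else,
// including damaged fragments such as "?#xce;", is copied through unchanged.
std::string unescapeHexBytes(const std::string &in)
{
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const bool isEscape =
            i + 5 < in.size() &&
            in[i] == '&' && in[i + 1] == '#' &&
            (in[i + 2] == 'x' || in[i + 2] == 'X') &&
            std::isxdigit(static_cast<unsigned char>(in[i + 3])) &&
            std::isxdigit(static_cast<unsigned char>(in[i + 4])) &&
            in[i + 5] == ';';
        if (isEscape) {
            // Convert the two hex digits into one byte.
            out.push_back(static_cast<char>(std::stoi(in.substr(i + 3, 2), nullptr, 16)));
            i += 6;
        } else {
            out.push_back(in[i]);
            ++i;
        }
    }
    return out;
}
```

The returned bytes could then be wrapped as QByteArray(bytes.data(), int(bytes.size())) and passed to QTextCodec::toUnicode() with the "GB18030" codec.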
Last edited by ChrisW67; 20th December 2013 at 00:19. Reason: Expanded
Wow! Thanks! It is "补心海拔".
But now in some places I am getting a normal string like "补心海拔", and in other places I get that unreadable string. How can I make my program handle both cases correctly?
Edit 1: "?" is not a typo, I copy straight from what I get.
Edit 2: It should be 4 Chinese characters, but I only get 3 with your code...
Edit 3: Even worse than this, I am getting mixed strings like "X?#xf8;标", which should be "X坐标". I suppose the user has a broken database. But the user doesn't care, they blame me! So I have to find a workaround.
Edit 4: On further investigation, I found that the "?" represents a missing piece of the string. In this example, "?" should be "°&", so the complete string should be (I think) "²¹Ðĺ£°Î". Then it gives "补心海拔".
With all this mess, what should I do? Just tell the user that they have a bad database and ask them to fix it? I think they would not be happy to hear that...
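One possible workaround is to sort each field into one of three cases before deciding how to display it. This is only a heuristic sketch, assuming escapes always have two hex digits; FieldState and classifyField are hypothetical names. A field counts as damaged when something shaped like "#xhh;" has lost its leading "&", which is how the "?" records above look.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical classification of a database field (names are illustrative).
enum class FieldState { Plain, Escaped, Damaged };

// Plain:   no "#xhh;" pattern at all, display the string as it is.
// Escaped: every "#xhh;" is preceded by '&', so it can be unescaped and decoded.
// Damaged: at least one "#xhh;" has lost its '&', so a byte is probably missing.
FieldState classifyField(const std::string &s)
{
    bool sawEscape = false;
    for (std::size_t i = 0; i + 4 < s.size(); ++i) {
        const bool looksLikeEscape =
            s[i] == '#' && (s[i + 1] == 'x' || s[i + 1] == 'X') &&
            std::isxdigit(static_cast<unsigned char>(s[i + 2])) &&
            std::isxdigit(static_cast<unsigned char>(s[i + 3])) &&
            s[i + 4] == ';';
        if (!looksLikeEscape)
            continue;
        if (i == 0 || s[i - 1] != '&')
            return FieldState::Damaged;
        sawEscape = true;
    }
    return sawEscape ? FieldState::Escaped : FieldState::Plain;
}
```

Damaged fields cannot be repaired reliably in code because the missing byte is simply gone, so flagging them for the database owner is probably the safest option. Note that ordinary text which happens to contain something like "#xab;" would also be flagged.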
Thank you!
Last edited by lni; 20th December 2013 at 11:36.
OK,
The user agrees it is their database problem. Thank you ChrisW67!