QString::replace difference in behavior
Hello there,
I have lost so much time trying to figure this out, and it doesn't make much sense to me!
The problem is that QString::replace behaves differently under Qt5-Windows compared to Qt4-Windows or Qt5-Linux!
Using:
============
Windows 10 Pro
21H1
19043.1165
Windows Feature Experience Pack 120.2212.3530.0
============
Linux:
NAME="openSUSE Tumbleweed"
# VERSION="20210912"
ID="opensuse-tumbleweed"
ID_LIKE="opensuse suse"
VERSION_ID="20210912"
PRETTY_NAME="openSUSE Tumbleweed"
============
Qt 4.8.7 (only on Windows)
Qt 5.15.2 (Windows and Linux)
MSVC 2019 compiler under Windows
============
I just want to replace some "special" characters with others. The code I use should be portable; at least IMHO there is no magic in it at all.
But first, here is the code (not minimal, but it runs standalone) so you can better understand what I want:
Code:
#include <QDebug>
// -----------------------------------------------------------------------------
// -----------------------------------------------------------------------------
void checkStrings(const QString& str1, const QString& str2)
{
    if (str1 == str2) {
        qDebug() << "=== GOOD ===";
    } else {
        qDebug() << "### BAD ###";
    }
    qDebug() << "str1:" << str1;
    qDebug() << "str2:" << str2;
}
// -----------------------------------------------------------------------------
// -----------------------------------------------------------------------------
void replaceStrings()
{
    qDebug() << "\t----> ENTERING: " << Q_FUNC_INFO;
    const QString src = "[aa,bb,AA,BB,aa,bb,AA,BB]";
    QString tmp = src;
    tmp.replace("a", "-");
    const QString dst = "[--,bb,AA,BB,--,bb,AA,BB]";
    checkStrings(tmp, dst);
}
// -----------------------------------------------------------------------------
// -----------------------------------------------------------------------------
void replaceUmlauts()
{
    qDebug() << "\t----> ENTERING: " << Q_FUNC_INFO;
    const QString src = "[ää,ÄÄ,öö,ÖÖ,üü,ÜÜ,ää,ÄÄ,öö,ÖÖ,üü,ÜÜ]";
    QString tmp = src;
    tmp.replace("ä", "-");
    // tmp.replace(QString("ä"), QString("-")); // doesn't matter if we do it this or that way...
    const QString dst = "[--,ÄÄ,öö,ÖÖ,üü,ÜÜ,--,ÄÄ,öö,ÖÖ,üü,ÜÜ]";
    checkStrings(tmp, dst);
}
// -----------------------------------------------------------------------------
// -----------------------------------------------------------------------------
int main(int argc, char *argv[])
{
    Q_UNUSED(argc);
    Q_UNUSED(argv);
    replaceStrings();
    replaceUmlauts();
    return 0;
}
// EOF
The most accurate output should be, and is, the one under Linux:
Linux (Qt5) output:
Code:
----> ENTERING: void replaceStrings()
=== GOOD ===
str1: "[--,bb,AA,BB,--,bb,AA,BB]"
str2: "[--,bb,AA,BB,--,bb,AA,BB]"
----> ENTERING: void replaceUmlauts()
=== GOOD ===
str1: "[--,ÄÄ,öö,ÖÖ,üü,ÜÜ,--,ÄÄ,öö,ÖÖ,üü,ÜÜ]"
str2: "[--,ÄÄ,öö,ÖÖ,üü,ÜÜ,--,ÄÄ,öö,ÖÖ,üü,ÜÜ]"
The Windows outputs don't display every character correctly because the console uses code page 850. But that is OK!
Windows 10 - Qt4 output:
Code:
----> ENTERING: void __cdecl replaceStrings(void)
=== GOOD ===
str1: "[--,bb,AA,BB,--,bb,AA,BB]"
str2: "[--,bb,AA,BB,--,bb,AA,BB]"
----> ENTERING: void __cdecl replaceUmlauts(void)
=== GOOD ===
str1: "[--,??,÷÷,ÍÍ,³³,??,--,??,÷÷,ÍÍ,³³,??]"
str2: "[--,??,÷÷,ÍÍ,³³,??,--,??,÷÷,ÍÍ,³³,??]"
Windows 10 - Qt5 output:
Code:
----> ENTERING: void __cdecl replaceStrings(void)
=== GOOD ===
str1: "[--,bb,AA,BB,--,bb,AA,BB]"
str2: "[--,bb,AA,BB,--,bb,AA,BB]"
----> ENTERING: void __cdecl replaceUmlauts(void)
### BAD ###
str1: "[--,--,--,--,--,--,--,--,--,--,--,--]"
str2: "[--,??,??,??,??,??,--,??,??,??,??,??]"
The problem is that the second comparison (line 6 of the output), in "replaceUmlauts()", results in "### BAD ###".
WHYYYYYYYYYYYY is that so???!?!?!?!??!
I am just trying to change:
the "a"s to "-"s in the first source QString, "[aa,bb,AA,BB,aa,bb,AA,BB]",
and the "ä"s to "-"s in the second one, "[ää,ÄÄ,öö,ÖÖ,üü,ÜÜ,ää,ÄÄ,öö,ÖÖ,üü,ÜÜ]".
Thanks for ANY help on that in advance!
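A quick way to narrow down a case like this is to look at the raw bytes the compiler actually stored for a literal such as "ä". The sketch below is plain C++ (no Qt needed), and `hexDump` is a made-up helper name, not a Qt function.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of Qt): returns the bytes of a narrow string
// as space-separated lowercase hex, e.g. "c3 a4".
std::string hexDump(const std::string& bytes)
{
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned int value = static_cast<unsigned char>(bytes[i]);
        std::snprintf(buf, sizeof buf, "%02x", value);
        if (i != 0) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

Printing `hexDump("ä")` from each build shows what ended up in the binary: `c3 a4` is the UTF-8 encoding of "ä", while a single `e4` is its Latin-1 / Windows-1252 encoding.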
Re: QString::replace difference in behavior
Try converting your strings to Latin1 or utf-8 before doing the replace on the resulting QByteArray. Don't know if this will help. Character conversions and codepage issues can be very tricky.
Re: QString::replace difference in behavior
Thanks for the answer d_stranz.
I don't want to change the code, because Qt4 under Windows DOES work as intended, and so does Qt5 under Linux. I am not doing anything fancy there, just using the "QString::replace()" function, and I expect it to behave as the documentation describes.
I just want to understand why Qt5-Windows behaves differently. Any clues?
BTW, the source file is ALWAYS in UTF-8 (I will add this to the first post, I forgot about that).
Re: QString::replace difference in behavior
OK, the problem seems to be the MSVC 2019 compiler.
I tried another kit today and added the MinGW 8.1.0 compiler along with Qt5 for Windows. Result:
Windows 10 - Qt5 - MinGW output:
Code:
----> ENTERING: void replaceStrings()
=== GOOD ===
str1: "[--,bb,AA,BB,--,bb,AA,BB]"
str2: "[--,bb,AA,BB,--,bb,AA,BB]"
----> ENTERING: void replaceUmlauts()
=== GOOD ===
str1: "[--,??,÷÷,ÍÍ,³³,??,--,??,÷÷,ÍÍ,³³,??]"
str2: "[--,??,÷÷,ÍÍ,³³,??,--,??,÷÷,ÍÍ,³³,??]"
So, after finding out who the bad guy really was while trying to make a small program for you here, I also found out how to make it work as wanted.
Just add to your .pro file:
Code:
QMAKE_CXXFLAGS += -source-charset:utf-8
QMAKE_CXXFLAGS += -execution-charset:utf-8
somewhere, along with any conditions you need, and it also works well with MSVC 2019.
I hope this saves some time for everyone who encounters it!
Thanks and till next time!
freeman_w
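If changing the compiler flags is not an option, one alternative (a sketch, not something tested in this thread) is to spell the non-ASCII characters as explicit UTF-8 hex escapes, so the compiler's charset settings cannot alter them. The example uses std::string so it runs without Qt; `replaceAll` and the two constants are hypothetical names made up for illustration.

```cpp
#include <cstddef>
#include <string>

// UTF-8 bytes spelled as hex escapes, so the compiler's source/execution
// charset settings cannot change them. Kept as separate constants because a
// hex escape greedily swallows any hex digits that follow it in a literal.
const std::string kSmallAUmlaut = "\xC3\xA4"; // U+00E4, "ä"
const std::string kBigAUmlaut   = "\xC3\x84"; // U+00C4, "Ä"

// Hypothetical stand-in for QString::replace(before, after): replaces every
// occurrence of `before` in `text` with `after`, comparing raw bytes.
std::string replaceAll(std::string text, const std::string& before,
                       const std::string& after)
{
    if (before.empty()) {
        return text; // nothing sensible to search for
    }
    std::size_t pos = 0;
    while ((pos = text.find(before, pos)) != std::string::npos) {
        text.replace(pos, before.size(), after);
        pos += after.size(); // continue after the inserted text
    }
    return text;
}
```

With Qt, the same escaped bytes could be handed to `QString::fromUtf8("\xC3\xA4")`, which always decodes its argument as UTF-8, and the resulting QString passed to `replace()`.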
Re: QString::replace difference in behavior
Quote:
QMAKE_CXXFLAGS += -source-charset:utf-8
QMAKE_CXXFLAGS += -execution-charset:utf-8
Good detective work. Surprising to me that the compiler (or maybe the IDE) would let you type characters with umlauts as part of your source code and then mangle them when it built the code.
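One possible reading of the mangling, which is an assumption and not confirmed anywhere in this thread: if MSVC recognised the file as UTF-8 but, without those flags, stored the narrow literals in the local ANSI code page (its default execution charset), then "ä" would end up as the single byte 0xE4 (Windows-1252) instead of 0xC3 0xA4. Qt5 decodes `const char*` as UTF-8, where a lone 0xE4 is invalid and gets replaced by U+FFFD, so every umlaut would look the same to `replace()`. Qt4 decoded `const char*` as Latin-1 by default, which would explain why it still worked. The sketch below only illustrates the byte-level part in plain C++; `looksLikeUtf8` is a made-up helper, not a Qt function.

```cpp
#include <cstddef>
#include <string>

// Simplified structural check (hypothetical helper): verifies lead bytes and
// continuation bytes only; it does not reject overlong forms or surrogates.
bool looksLikeUtf8(const std::string& bytes)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t extra = 0; // number of continuation bytes expected
        if (lead < 0x80) {
            extra = 0; // plain ASCII
        } else if ((lead & 0xE0) == 0xC0) {
            extra = 1; // 110xxxxx
        } else if ((lead & 0xF0) == 0xE0) {
            extra = 2; // 1110xxxx
        } else if ((lead & 0xF8) == 0xF0) {
            extra = 3; // 11110xxx
        } else {
            return false; // stray continuation byte or invalid lead byte
        }
        if (i + extra >= bytes.size()) {
            return false; // sequence is cut off at the end of the string
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80) {
                return false; // continuation bytes must be 10xxxxxx
            }
        }
        i += extra + 1;
    }
    return true;
}
```

Here `looksLikeUtf8("\xC3\xA4")` is true (the UTF-8 form of "ä"), while `looksLikeUtf8("\xE4")` is false (the Windows-1252 form), which is the kind of input a UTF-8 decoder has to replace or reject.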