Qt toUtf8 function: what is QString::toUtf8 doing?

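This article uses a simple example to examine a UTF-8 encoding problem with QString's toUtf8 function in Qt. When the QString constructor is given a string that is already UTF-8 encoded, double UTF-8 encoding can occur. The fixes are to convert the source file to Latin-1, to use an escape sequence, or to build the string with QString::fromUtf8(). In Qt 5 the problem no longer occurs, because the constructor uses QString::fromUtf8() by default.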

This may sound like an obvious question, but I'm missing something about either how UTF-8 is encoded or how the toUtf8 function works.

Let's look at a very simple program:

QString str("Müller");

qDebug() << str << str.toUtf8().toHex();

Then I get the output:

"Müller" "4dc383c2bc6c6c6572"

But I thought the letter ü should have been encoded as c3bc, not c383c2bc.

Thanks

Johan

Solution

It depends on the encoding of your source code.

Your file is most likely already encoded in UTF-8, so the character ü is stored as the two bytes C3 BC.

You're calling the QString::QString(const char *str) constructor which, according to http://doc.qt.io/qt-4.8/qstring.html#QString-8, converts your string to Unicode using the QString::fromAscii() method, which by default treats the input as Latin-1.

Since C3 and BC are both valid Latin-1 bytes, representing à and ¼ respectively, each of them becomes its own character, and converting those characters to UTF-8 yields:

à (C3) -> C3 83

¼ (BC) -> C2 BC

which leads to the string you get: "4d c3 83 c2 bc 6c 6c 65 72"

To sum things up, it's double UTF-8 encoding.
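The byte arithmetic above can be reproduced without Qt. The following is a minimal sketch in standard C++, not Qt's actual implementation: the helper names latin1ToUtf8, toHex and questionOutput are made up for illustration, and the input bytes are assumed to be what a UTF-8 source file holds for "Müller".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: treat every input byte as one Latin-1 character
// (code point 0x00..0xFF) and encode that code point as UTF-8.
std::string latin1ToUtf8(const std::string& bytes) {
    std::string out;
    for (unsigned char c : bytes) {
        if (c < 0x80) {
            out += static_cast<char>(c);                  // ASCII: one byte, unchanged
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));    // lead byte: C2 or C3
            out += static_cast<char>(0x80 | (c & 0x3F));  // continuation byte
        }
    }
    return out;
}

// Hypothetical helper: lowercase hex dump, similar in spirit to QByteArray::toHex().
std::string toHex(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char c : bytes) {
        out += digits[c >> 4];
        out += digits[c & 0x0F];
    }
    return out;
}

// Reproduces the bytes from the question: UTF-8 source bytes read as Latin-1.
std::string questionOutput() {
    const std::string sourceBytes = "M\xC3\xBCller";  // assumed UTF-8 file contents
    return toHex(latin1ToUtf8(sourceBytes));
}
```

Calling assert(questionOutput() == "4dc383c2bc6c6c6572"); from main passes, which matches the output in the question: the two bytes of ü are each encoded a second time.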

There are several options to solve this issue:

1) You can convert your source file to Latin-1 using your favorite text editor.

2) You can escape the ü character as \xFC in the string literal, so the string won't depend on the file's encoding.

3) You can keep the file and string as UTF-8 data and use QString str = QString::fromUtf8("Müller");
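To see why options 2 and 3 both end up with c3bc, it helps to separate decoding (bytes to code points) from encoding (code points to bytes). The sketch below uses only standard C++ and does not call Qt. All function names are hypothetical, and the UTF-8 handling is deliberately limited to 1- and 2-byte sequences (U+0000 to U+07FF), which is enough for ü.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Latin-1 decoding: every byte is exactly one code point.
std::vector<std::uint32_t> decodeLatin1(const std::string& bytes) {
    std::vector<std::uint32_t> cps;
    for (unsigned char c : bytes) cps.push_back(c);
    return cps;
}

// UTF-8 decoding, restricted to 1- and 2-byte sequences for this sketch.
// It does not reject overlong forms; a real decoder would.
std::vector<std::uint32_t> decodeUtf8Short(const std::string& bytes) {
    std::vector<std::uint32_t> cps;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        unsigned char c = bytes[i];
        if (c < 0x80) {
            cps.push_back(c);
        } else if ((c & 0xE0) == 0xC0 && i + 1 < bytes.size()) {
            unsigned char next = bytes[i + 1];
            if ((next & 0xC0) != 0x80)
                throw std::invalid_argument("bad continuation byte");
            cps.push_back(((c & 0x1Fu) << 6) | (next & 0x3Fu));
            ++i;  // the continuation byte has been consumed
        } else {
            throw std::invalid_argument("unsupported or malformed sequence");
        }
    }
    return cps;
}

// UTF-8 encoding for code points up to U+07FF.
std::string encodeUtf8Short(const std::vector<std::uint32_t>& cps) {
    std::string out;
    for (std::uint32_t cp : cps) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            throw std::invalid_argument("code point out of range for this sketch");
        }
    }
    return out;
}

// Lowercase hex dump of a byte string.
std::string toHex(const std::string& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char c : bytes) {
        out += digits[c >> 4];
        out += digits[c & 0x0F];
    }
    return out;
}
```

With these helpers, toHex(encodeUtf8Short(decodeLatin1("M\xFCller"))) models option 2 and toHex(encodeUtf8Short(decodeUtf8Short("M\xC3\xBCller"))) models option 3. Both give 4dc3bc6c6c6572, because in each case the decoder matches the bytes it is handed. The double encoding only appears when UTF-8 bytes are passed to the Latin-1 decoder.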

Update: This issue is no longer relevant in Qt 5. http://doc.qt.io/qt-5/qstring.html#QString-8 states that the constructor now uses QString::fromUtf8() internally instead of QString::fromAscii(). So, as long as the source file is UTF-8 encoded, the constructor decodes the literal correctly by default.
