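QString provides a Unicode character string, stored as a sequence of 16-bit code units.
Unicode is a superset of US-ASCII and Latin-1.
Each character stored in a QString occupies 16 bits and is represented by a QChar. QString uses copy-on-write to reduce memory usage and avoid unnecessary copying of data. For most purposes QString is the class you want, because it is used throughout the Qt API; in addition, its Unicode support makes your application easier to translate and extend. For traditional 8-bit strings terminated by '\0', Qt provides QByteArray.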
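To make the 16-bit storage concrete without depending on Qt, here is a minimal sketch using the standard library's std::u16string. It assumes the text is encoded as UTF-16, where a character beyond U+FFFF takes two 16-bit units (a surrogate pair). The helper name countCodePoints is hypothetical and is not part of any Qt API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Qt function): counts Unicode code points in a
// UTF-16 string by treating each high/low surrogate pair as one character.
std::size_t countCodePoints(const std::u16string &s)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t unit = s[i];
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHigh && i + 1 < s.size()) {
            const char16_t next = s[i + 1];
            if (next >= 0xDC00 && next <= 0xDFFF)
                ++i; // the low surrogate belongs to the same character
        }
        ++count;
    }
    return count;
}
```

For a string such as u"A\U0001D11E", size() reports three 16-bit units while the helper counts two characters, so the number of 16-bit units is not always the number of characters a user sees.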
Constructors at a glance
QString ()
QString ( const QChar * unicode, int size )
QString ( const QChar * unicode )
QString ( QChar ch )
QString ( int size, QChar ch )
QString ( const QLatin1String & str )
QString ( const QString & other )
QString ( const char * str )
QString ( const QByteArray & ba )
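Initializing a string
One way to initialize a QString is to pass it a const char * string literal:
QString str = "Hello";
QString then converts the const char * data to Unicode using the fromAscii() function. By default, in all QString functions that take const char * parameters, the const char * is interpreted as a classic C-style string terminated by '\0'. Passing 0 as a const char * parameter is legal.
You can also supply the string data as an array of QChars:
static const QChar data[4] = { 0x0055, 0x006e, 0x10e3, 0x03a3 };
QString str(data, 4);
Another approach is to set the size of the string with resize() and then assign the characters one by one. QString uses 0-based indexes: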
QString str;
str.resize(4);
str[0] = QChar('U');
str[1] = QChar('n');
str[2] = QChar(0x10e3);
str[3] = QChar(0x03a3);
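For read-only access, an alternative syntax is the at() function. It can be faster than operator[]() because it never causes the string's data to be copied: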
QString str;
for (int i = 0; i < str.size(); ++i) {
    if (str.at(i) >= QChar('a') && str.at(i) <= QChar('f'))
        qDebug() << "Found character in range [a-f]";
}
QString provides the following basic functions for modifying the character data:
QString str = "and";
str.prepend("rock "); // str == "rock and"
str.append(" roll"); // str == "rock and roll"
str.replace(5, 3, "&"); // str == "rock & roll"
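Notes for C programmers:
Because of C++'s type system and the fact that QString is implicitly shared, QStrings can be treated like ints or other basic types, for instance returned by value. For example: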
QString Widget::boolToString(bool b)
{
    QString result;
    if (b)
        result = "True";
    else
        result = "False";
    return result;
}
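Returning a string by value, as above, is cheap because copies share one buffer until one of them is modified. The following is a minimal sketch of that copy-on-write idea using only the standard library. SharedText and all of its members are hypothetical names chosen for illustration; this is not Qt's actual implementation, and the sketch is not thread-safe.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration of copy-on-write; not how QString is written.
class SharedText
{
public:
    explicit SharedText(std::string text = std::string())
        : data_(std::make_shared<std::string>(std::move(text))) {}

    // Read-only access never copies the buffer.
    char at(std::size_t i) const { return data_->at(i); }
    std::size_t size() const { return data_->size(); }

    // Write access first makes sure this object owns its buffer.
    void set(std::size_t i, char c)
    {
        detach();
        data_->at(i) = c;
    }

    // True while both objects still point at the same buffer.
    bool sharesBufferWith(const SharedText &other) const
    {
        return data_ == other.data_;
    }

private:
    // Deep-copies the buffer only if another object also refers to it.
    // Assumption: single-threaded use; use_count() is not a safe test
    // when several threads share the same buffer.
    void detach()
    {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
    }

    std::shared_ptr<std::string> data_;
};
```

Copying a SharedText only copies a pointer; the deep copy is deferred until set() is called on a shared object. This is also the reason a read-only accessor such as at() can be cheaper than a mutable one: it never needs to detach.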
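QString distinguishes between a null string and an empty string. A null string is one that is initialized with QString's default constructor or by passing (const char *)0 to the constructor. An empty string is any string whose size is 0. A null string is always empty, but an empty string isn't necessarily null: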
QString().isNull(); // returns true
QString().isEmpty(); // returns true
QString("").isNull(); // returns false
QString("").isEmpty(); // returns true
QString("abc").isNull(); // returns false
QString("abc").isEmpty(); // returns false
We recommend that you always use the isEmpty() function and avoid isNull().
Converting from QString to 8-bit strings:
QString provides the following four functions that return a const char * version of the string as a QByteArray: toAscii(), toLatin1(), toUtf8(), and toLocal8Bit().
Converting from 8-bit strings to QString:
To convert from one of these encodings, QString provides fromAscii(), fromLatin1(), fromUtf8(), and fromLocal8Bit(). Other encodings are supported through the QTextCodec class.
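Because Unicode is a superset of Latin-1, converting Latin-1 bytes to 16-bit Unicode is conceptually a one-to-one widening: each byte value becomes the code point with the same number. The sketch below shows that idea with the standard library only. The function name latin1ToUtf16 is hypothetical; it stands in for the concept behind fromLatin1() and is not Qt code.

```cpp
#include <string>

// Hypothetical stand-in for the idea behind a Latin-1 to Unicode conversion:
// every Latin-1 byte maps to the Unicode code point with the same value.
std::u16string latin1ToUtf16(const std::string &latin1)
{
    std::u16string out;
    out.reserve(latin1.size());
    // Read each byte as unsigned so values 0x80..0xFF are not sign-extended.
    for (unsigned char byte : latin1)
        out.push_back(static_cast<char16_t>(byte));
    return out;
}
```

The same byte sequence decoded as UTF-8 or as the local 8-bit encoding could yield different characters, which is why choosing the right conversion function matters.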
As mentioned above, QString provides a lot of functions and operators that make it easy to interoperate with const char * strings. But this functionality is a double-edged sword: It makes QString more convenient to use if all strings are US-ASCII or Latin-1, but there is always the risk that an implicit conversion from or to const char * is done using the wrong 8-bit encoding. To minimize these risks, you can turn off these implicit conversions by defining the following two preprocessor symbols:
QT_NO_CAST_FROM_ASCII disables automatic conversions from C string literals and pointers to Unicode.
QT_NO_CAST_TO_ASCII disables automatic conversion from QString to C strings.
One way to define these preprocessor symbols globally for your application is to add the following entry to your qmake project file:
DEFINES += QT_NO_CAST_FROM_ASCII \
QT_NO_CAST_TO_ASCII
You then need to explicitly call fromAscii(), fromLatin1(), fromUtf8(), or fromLocal8Bit() to construct a QString from an 8-bit string, or use the lightweight QLatin1String class, for example:
QString url = QLatin1String("http://www.unicode.org/");
Similarly, you must call toAscii(), toLatin1(), toUtf8(), or toLocal8Bit() explicitly to convert the QString to an 8-bit string. (Other encodings are supported through the QTextCodec class.)
toAscii() returns an 8-bit string encoded using the codec specified by QTextCodec::codecForCStrings (by default, that is Latin 1).
toLatin1() returns a Latin-1 (ISO 8859-1) encoded 8-bit string.
toUtf8() returns a UTF-8 encoded 8-bit string. UTF-8 is a superset of US-ASCII (ANSI X3.4-1986) that supports the entire Unicode character set through multibyte sequences.
toLocal8Bit() returns an 8-bit string using the system's local encoding.
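To illustrate what "multibyte sequences" means for UTF-8, here is a minimal sketch of encoding a single Unicode code point, written with the standard library only. The function name encodeUtf8 is hypothetical; it is not how toUtf8() is implemented, only an illustration of the encoding rules.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes one Unicode code point as 1 to 4 UTF-8 bytes.
std::string encodeUtf8(char32_t cp)
{
    // Precondition: a valid scalar value (in range and not a surrogate).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::string out;
    if (cp < 0x80) {
        // US-ASCII range: one byte, identical to the ASCII value.
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Applied to the four characters used in the QChar array earlier, 0x0055 and 0x006e each take one byte, 0x03a3 takes two bytes, and 0x10e3 takes three, while plain US-ASCII text is byte-for-byte unchanged.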