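1 Character encodings
1.1 Comparison of common encodings
Encoding | English letters | Chinese characters | Notes |
ASCII | 1 byte (7 bits used; upper and lower case have distinct codes) | not representable | Covers only unaccented English letters, digits, punctuation and control codes, so it cannot represent text in most other languages |
ANSI (Windows code pages) | 1 byte | 2 bytes | Bytes 0x00~0x7F are identical to ASCII; the values above that are defined per locale by a code page: GB2312 (Simplified Chinese), Shift_JIS (Japanese), EUC-KR (Korean). One byte in 0x00~0x7F encodes one English character; other characters use a multi-byte value in the range 0x80~0xFFFF |
Unicode (UTF-16, as used by Windows) | 2 bytes | 2 bytes (4 bytes, a surrogate pair, for characters outside the Basic Multilingual Plane) | Created to resolve the conflicts between the national ANSI code pages: one character set for all scripts, which removes the main cause of garbled text |
UTF-8 | 1 byte | 3 bytes (4 bytes for characters outside the Basic Multilingual Plane) | Variable-length encoding of Unicode: each character takes 1~4 bytes depending on its code point |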
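The variable-length rule in the UTF-8 row can be made concrete with a small sketch. The helper names `utf8_sequence_length` and `utf8_code_point_count` are hypothetical (they do not come from any library), and the counting function assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <string>

// Number of bytes in the UTF-8 sequence that starts with `lead`.
// Returns 0 for a byte that matches no lead-byte pattern
// (for example a continuation byte 10xxxxxx).
std::size_t utf8_sequence_length(unsigned char lead) {
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx: most Chinese characters
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Counts code points in well-formed UTF-8 by counting the bytes
// that are not continuation bytes.
std::size_t utf8_code_point_count(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

For the bytes of "Hello!" followed by two Chinese characters, the string holds 12 bytes (6 + 2*3) but only 8 code points.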
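2 String types
2.1 Categories
Category | Type | Description | Example |
Basic types | char* | Character pointer holding the address of the first character of a null-terminated string. A string literal is read-only, so a pointer to one should be const char* | const char* pString = "string content"; |
 | char[] | Character array: it carries both the start address and the size of the string | char szString[] = "string content"; char* pString = szString; size_t nSize = sizeof(szString); |
 | const char* | Add const whenever the contents should not be modified, especially for function parameters | |
 | wchar_t* | Wide-character string; on Windows wchar_t is 2 bytes and holds Unicode (UTF-16) text | |
 | CHAR* \ WCHAR* \ TCHAR* | CHAR = char and WCHAR = wchar_t. TCHAR uses a macro to follow the project's character-set setting: it is char in a multi-byte (ANSI) build and wchar_t in a wide-character (Unicode) build | |
Windows types (LP***STR) | LPSTR | = char* | typedef char* LPSTR; |
 | LPCSTR | = const char* | typedef const char* LPCSTR; |
 | LPWSTR | = wchar_t* | typedef wchar_t* LPWSTR; |
 | LPCWSTR | = const wchar_t* | typedef const wchar_t* LPCWSTR; |
 | LPTSTR | = TCHAR* | typedef TCHAR* LPTSTR; |
 | LPCTSTR | = const TCHAR* | typedef const TCHAR* LPCTSTR; |
STL strings | std::string | Comes from the STL; like the STL containers it is implemented as a class template. It is an instantiation of basic_string for char (ANSI) strings | typedef basic_string<char, …> string; |
 | std::wstring | The same template instantiated for wchar_t (Unicode) strings | typedef basic_string<wchar_t, …> wstring; |
ATL strings | BSTR | Length-prefixed wide string used by COM; OLECHAR is wchar_t on Win32 | typedef OLECHAR* BSTR; |
 | CString | CString is likewise an instantiation of the class template CStringT. In the author's view its design is more refined and more practical than std::string. Because it is built on TCHAR, it follows the project's character-set setting automatically | typedef CStringT<TCHAR, …> CString; |
COM strings | _bstr_t | _bstr_t is the COM string wrapper class. It is a plain class rather than a class template, and the one class supports both char* and wchar_t* | |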
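The macro technique behind TCHAR and the LP***STR typedefs can be imitated in a few lines. This is only a sketch: `SKETCH_UNICODE`, `sketch_tchar`, `SKETCH_T` and `sketch_tcslen` are hypothetical stand-ins, while the real definitions live in the Windows headers and are controlled by the UNICODE and _UNICODE macros.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-ins for TCHAR and _T(); not the real Windows definitions.
#ifdef SKETCH_UNICODE
typedef wchar_t sketch_tchar;
#define SKETCH_T(x) L##x      // wide literal in a "Unicode" build
#else
typedef char sketch_tchar;
#define SKETCH_T(x) x         // narrow literal otherwise
#endif

typedef sketch_tchar*       sketch_lptstr;   // plays the role of LPTSTR
typedef const sketch_tchar* sketch_lpctstr;  // plays the role of LPCTSTR

// Length in elements of a string of the build's character type,
// in the spirit of _tcslen.
std::size_t sketch_tcslen(sketch_lpctstr s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}
```

Code written against `sketch_tchar` and `SKETCH_T` compiles unchanged in either configuration, which is the point of the TCHAR family.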
3 String functions
3.1 Categories
Category | tchar.h | UNICODE | Non-UNICODE | Notes |
Copy | _tcscpy \ _tcscpy_s | wcscpy \ wcscpy_s | strcpy \ strcpy_s | |
 | _tcsncpy \ _tcsncpy_s | wcsncpy \ wcsncpy_s | strncpy \ strncpy_s | |
 | _stprintf \ _stprintf_s | _swprintf \ swprintf_s | sprintf \ sprintf_s | |
 | memcpy \ memcpy_s \ wmemcpy_s \ memset | | | |
Concatenate | _tcscat \ _tcscat_s | wcscat \ wcscat_s | strcat \ strcat_s | |
Length | _tcslen | wcslen | strlen | |
 | _tcsnlen | wcsnlen | strnlen | |
Encoding conversion | WideCharToMultiByte | | | |
 | MultiByteToWideChar | | | |
 | SysAllocString | | | |
ATL macros <atlconv.h> | T2A \ T2W | T2A = W2A | T2W = A2W | Include the header and declare USES_CONVERSION before use. The macros allocate on the stack (1 MB by default in VC), so keep the following in mind: 1. they are suitable only for short strings; 4. for cases 2 and 3, use MultiByteToWideChar() and WideCharToMultiByte() instead |
 | A2T \ W2T | A2T = A2W | W2T = W2A | |
 | T2CA \ T2CW | T2CA = W2CA | T2CW = A2CW | |
 | A2CT \ W2CT | A2CT = A2CW | W2CT = W2CA | |
String to number | _ttoi \ _tstoi | _wtoi | atoi | string to int; requires <stdlib.h> |
 | _ttol \ _tstol | _wtol | atol | string to long; requires <stdlib.h> |
 | _ttoi64 \ _tstoi64 | _wtoi64 | _atoi64 | string to __int64; requires <stdlib.h> |
 | _tstof | _wtof | atof | string to double; requires <stdlib.h> |
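The `_s` functions in the copy rows take the size of the destination, whereas strncpy leaves the destination unterminated when the source does not fit. A portable sketch of a bounded copy that always terminates is shown here; `copy_truncated` is a hypothetical helper name, not a CRT function.

```cpp
#include <cstddef>
#include <cstring>

// Copies src into dst, truncating if necessary. `capacity` is the size of
// dst in bytes, including room for the terminator. The result is always
// null-terminated when capacity > 0. Returns true if the whole string fit.
bool copy_truncated(char* dst, std::size_t capacity, const char* src) {
    if (dst == nullptr || src == nullptr || capacity == 0) return false;
    std::size_t len = std::strlen(src);
    std::size_t n = (len < capacity) ? len : capacity - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return len < capacity;
}
```

Note that truncation here is byte-based, so with ANSI or UTF-8 data it can cut a multi-byte character in half.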
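3.2 Key functions
3.2.1 strlen and wcslen
char str[] = "Hello!测试";
printf( "string length: %zu, bytes: %zu\n", strlen( str ), sizeof( str ) ); // 10 11 when the literal is stored as ANSI (GBK)
wchar_t wstr[] = L"Hello!测试";
printf( "wide string length (wcslen): %zu, bytes: %zu\n", wcslen( wstr ), sizeof( wstr ) ); // 8 18 on Windows, where wchar_t is 2 bytes
1. strlen: returns the number of bytes before the terminating '\0'. Chinese and English characters count differently, and the result depends on the encoding of the char data (ANSI: Chinese 2, English 1; UTF-8: Chinese 3, English 1). It does not apply to Unicode (UTF-16) data, which is measured with wcslen.
2. wcslen: returns the number of wide characters before the terminating '\0'. Chinese and English characters both count as 1.
3. sizeof: applied to an array, returns its size in bytes including the terminating '\0'; in a wide string the terminator also occupies two bytes.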
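The three measurements above can be checked portably if the characters are written as escapes, so that the result does not depend on how the source file is saved. This is a minimal sketch using UTF-8 for the narrow string (the sample above uses ANSI instead); the function and constant names are illustrative only.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// "Hello!" followed by U+6D4B U+8BD5, spelled with escapes so the stored
// values do not depend on the source file encoding.
static const char    kUtf8[] = "Hello!\xE6\xB5\x8B\xE8\xAF\x95";
static const wchar_t kWide[] = L"Hello!\u6D4B\u8BD5";

std::size_t utf8_bytes()       { return std::strlen(kUtf8); }  // bytes before '\0'
std::size_t utf8_array_bytes() { return sizeof(kUtf8); }       // bytes including '\0'
std::size_t wide_units()       { return std::wcslen(kWide); }  // wchar_t elements before the terminator
std::size_t wide_array_bytes() { return sizeof(kWide); }       // (elements + 1) * sizeof(wchar_t)
```

With UTF-8 the narrow string measures 12 bytes by strlen and 13 by sizeof; the wide string has 8 elements, and its array size is 9 * sizeof(wchar_t), which is 18 on Windows.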
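4 String operations
4.1 String conversion
4.1.1 Type conversion
Notes:
1. Converting char* \ string to wchar_t* \ wstring is a conversion from ANSI to Unicode encoding; see 4.1.2 for the method
2. string \ wstring and CString cannot be converted to each other directly; convert to the intermediate basic type (const char* \ const wchar_t*) first
3. Converting const char* \ const wchar_t* to char* \ wchar_t* requires copying the data
Code example: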
// TestDemo.cpp : Defines the entry point for the console application.
//
/************************************************************************/
/* Type conversion */
/************************************************************************/
#include "stdafx.h"
#include "atlstr.h"
#include <string>
#include <cstring>
int _tmain(int argc, _TCHAR* argv[])
{
//1-1 char* -> string
const char* pChar = "Test字符串"; // string literals are read-only, hence const
std::string str = pChar != NULL ? pChar : ""; // way 1: construct from a C string
std::string str1;
str1 = pChar != NULL ? pChar : ""; // way 2: assignment
const char* pChar2 = NULL;
//std::string str2 = pChar2; // error: a null pointer is not a valid C string
std::string str2 = pChar2 != NULL ? pChar2 : "";
std::string str3;
//str3.append(pChar2); // also crashes on a null pointer
str3.append(pChar); // way 3: append
std::string str4;
str4.append(str1);
std::string str5;
str5.assign(pChar); // way 4: assign discards the old contents and stores the new ones
//str5.assign(pChar2); // also crashes on a null pointer
str5.assign(str1);
//1-2 string -> char*
const char* pChar3 = str.c_str(); // get a const char* first, then copy it into a writable buffer
size_t nSize = strlen(pChar3) * sizeof(char);
char* pChar4 = new char[nSize + 1];
memset(pChar4, 0, nSize + 1);
strcpy(pChar4, pChar3);
delete[] pChar4;
pChar4 = NULL;
//1-3 char* \ const char* -> CString
CString strCS = pChar; // char* -> CString
CString strCS1 = str.c_str(); // const char* -> CString
CString strCS2(pChar); // constructor
CString strCS3(str.c_str()); // constructor
//1-4 CString -> const char*
CStringA strCSA(strCS); // convert to multi-byte first
const char* pChar5 = strCSA.GetString();
const char* pChar6 = strCSA.GetBuffer(); // GetBuffer returns a writable pointer; call ReleaseBuffer after modifying through it
const char* pChar7 = strCSA;
//1-5 CString -> char*
size_t nSize8 = strlen(pChar5) * sizeof(char); // get a const char* first, then copy it into a writable buffer
char* pChar8 = new char[nSize8 + 1];
memset(pChar8, 0, nSize8 + 1);
strcpy(pChar8, pChar5);
delete[] pChar8;
pChar8 = NULL;
//2-1 wchar_t* -> wstring
const wchar_t* pWchar = L"Test宽字符串"; // L"" is always wide; _T("") would be a narrow literal in a multi-byte build
std::wstring wsStr(pWchar); // way 1: constructor
std::wstring wsStr1 = pWchar; // way 2: copy-initialization
std::wstring wsStr2;
wsStr2 = pWchar; // way 3: assignment
std::wstring wsStr3;
wsStr3.append(pWchar); // way 4: append
std::wstring wsStr4;
wsStr4.assign(pWchar); // way 5: assign
//2-2 wstring -> wchar_t*
const wchar_t* pWchar1 = wsStr.c_str();
size_t nwLen = wcslen(pWchar1); // length in wchar_t elements, not bytes
wchar_t* pWchar2 = new wchar_t[nwLen + 1];
memset(pWchar2, 0, (nwLen + 1) * sizeof(wchar_t));
wcscpy(pWchar2, pWchar1);
delete[] pWchar2;
pWchar2 = NULL;
//2-3 wchar_t* \ const wchar_t* -> CString
CStringW strWCS = pWchar; // wchar_t* -> CString (CStringW keeps 2-4 valid in multi-byte builds as well)
CStringW strWCS1 = wsStr.c_str(); // const wchar_t* -> CString
CStringW strWCS2(pWchar); // constructor
CStringW strWCS3(wsStr.c_str()); // constructor
//2-4 CString -> const wchar_t*
const wchar_t* strCWChar = strWCS;
const wchar_t* strCWChar1 = strWCS.GetString();
const wchar_t* strCWChar2 = strWCS.GetBuffer();
//2-5 CString -> wchar_t*
size_t nWCharLen = wcslen(strCWChar); // length in wchar_t elements, not bytes
wchar_t* strPWChar = new wchar_t[nWCharLen + 1];
memset(strPWChar, 0, (nWCharLen + 1) * sizeof(wchar_t));
wcscpy(strPWChar, strCWChar);
delete[] strPWChar;
strPWChar = NULL;
return 0;
}
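Note 3 above (a writable char* \ wchar_t* needs a copy) is implemented in the listing with new[], strcpy and delete[]. As an alternative sketch, the copy can be held in a std::vector so that the memory is released automatically; `to_mutable_buffer` is a hypothetical helper name.

```cpp
#include <string>
#include <vector>

// Returns a writable, null-terminated copy of s. The vector owns the
// memory, so there is no delete[] to forget.
std::vector<char> to_mutable_buffer(const std::string& s) {
    std::vector<char> buf(s.begin(), s.end());
    buf.push_back('\0');
    return buf;
}

// Same idea for wide strings.
std::vector<wchar_t> to_mutable_buffer(const std::wstring& s) {
    std::vector<wchar_t> buf(s.begin(), s.end());
    buf.push_back(L'\0');
    return buf;
}
```

The pointer returned by `buf.data()` can be passed to an API that expects char* or wchar_t*, and it stays valid for as long as the vector is alive and not resized.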
4.1.2 Encoding conversion
// TestDemo.cpp : Defines the entry point for the console application.
//
/************************************************************************/
/* Encoding conversion */
/************************************************************************/
#include "stdafx.h"
#include "atlstr.h"
#include <string>
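/*=============================================================================
Function : AnsiToUnicode
Purpose  : Convert a char buffer (ANSI encoding) to Unicode and copy the result safely into the given WCHAR buffer, truncating if it does not fit
Params   : const char* pchSrc [in] source string
           WCHAR* pchDest [out] destination buffer
           int nDestLen [in] capacity of the destination buffer in WCHAR elements (not bytes), including the terminator
Note     : the returned buffer is allocated with new[]; the caller must release it with delete[]
Returns  : the complete converted string in a newly allocated buffer, or NULL if an argument is invalid
=============================================================================*/
WCHAR* AnsiToUnicode(IN const char* pchSrc, OUT WCHAR* pchDest,IN int nDestLen )
{
if ( pchSrc == NULL || pchDest == NULL || nDestLen <= 0 )
{
return NULL;
}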
int nTmpLen = MultiByteToWideChar(CP_ACP, 0, pchSrc, -1, NULL, 0);
WCHAR* pWTemp = new WCHAR[nTmpLen + 1];
memset(pWTemp, 0, (nTmpLen + 1) * sizeof(WCHAR));
MultiByteToWideChar(CP_ACP, 0, pchSrc, -1, pWTemp, nTmpLen + 1);
UINT nLen = wcslen(pWTemp);
if (nLen + 1 > (nDestLen /*/ sizeof(WCHAR)*/))
{
wcsncpy(pchDest, pWTemp, nDestLen /*/ sizeof(WCHAR)*/ - 1);
pchDest[nDestLen /*/ sizeof(WCHAR)*/ - 1] = 0;
}
else
{
wcscpy(pchDest, pWTemp);
}
//delete []pWTemp;
return pWTemp;
}
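/*=============================================================================
Function : UnicodeToAnsi
Purpose  : Convert a WCHAR buffer (Unicode encoding) to ANSI and copy the result safely into the given char buffer, truncating if it does not fit
Params   : const WCHAR* pchSrc [in] source string
           char* pchDest [out] destination buffer
           int nDestLen [in] capacity of the destination buffer in bytes, including the terminator
Note     : the returned buffer is allocated with new[]; the caller must release it with delete[]. Truncation is byte-based and may cut a multi-byte character in half
Returns  : the complete converted string in a newly allocated buffer, or NULL if an argument is invalid
=============================================================================*/
char* UnicodeToAnsi(IN const WCHAR* pchSrc, OUT char* pchDest, IN int nDestLen )
{
if ( pchDest == NULL || pchSrc == NULL || nDestLen <= 0 )
{
return NULL;
}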
const WCHAR* pWStrSRc = pchSrc;
int nTmplen = WideCharToMultiByte(CP_ACP, 0, pWStrSRc, -1, NULL, 0, NULL, NULL);
char* pTemp = new char[nTmplen + 1];
memset(pTemp, 0, nTmplen + 1);
WideCharToMultiByte(CP_ACP, 0, pWStrSRc, -1, pTemp, nTmplen + 1, NULL, NULL);
int nLen = strlen(pTemp);
if (nLen + 1 > nDestLen)
{
strncpy(pchDest, pTemp, nDestLen - 1);
pchDest[nDestLen - 1] = 0;
}
else
{
strcpy(pchDest, pTemp);
}
//delete []pTemp;
return pTemp;
}
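/*=============================================================================
Function : Utf8ToUnicode
Purpose  : Convert a char buffer (UTF-8 encoding) to Unicode and copy the result safely into the given WCHAR buffer, truncating if it does not fit
Params   : const char* pchSrc [in] source string
           WCHAR* pchDest [out] destination buffer
           int nDestLen [in] capacity of the destination buffer in WCHAR elements (not bytes), including the terminator
Note     : the returned buffer is allocated with new[]; the caller must release it with delete[]
Returns  : the complete converted string in a newly allocated buffer, or NULL if an argument is invalid
=============================================================================*/
WCHAR* Utf8ToUnicode(IN const char* pchSrc, OUT WCHAR* pchDest, IN int nDestLen )
{
if ( pchSrc == NULL || pchDest == NULL || nDestLen <= 0 )
{
return NULL;
}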
int nTmpLen = MultiByteToWideChar(CP_UTF8, 0, pchSrc, -1, NULL, 0);
WCHAR* pWTemp = new WCHAR[nTmpLen + 1];
memset(pWTemp, 0, (nTmpLen + 1) * sizeof(WCHAR));
MultiByteToWideChar(CP_UTF8, 0, pchSrc, -1, pWTemp, nTmpLen + 1);
UINT nLen = wcslen(pWTemp);
if (nLen + 1 > (nDestLen /*/ sizeof(WCHAR)*/))
{
wcsncpy(pchDest, pWTemp, nDestLen /*/ sizeof(WCHAR)*/ - 1);
pchDest[nDestLen/* / sizeof(WCHAR)*/ - 1] = 0;
}
else
{
wcscpy(pchDest, pWTemp);
}
//delete []pWTemp;
return pWTemp;
}
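/*=============================================================================
Function : UnicodeToUtf8
Purpose  : Convert a WCHAR buffer (Unicode encoding) to UTF-8 and copy the result safely into the given char buffer, truncating if it does not fit
Params   : const WCHAR* pchSrc [in] source string
           char* pchDest [out] destination buffer
           int nDestLen [in] capacity of the destination buffer in bytes, including the terminator
Note     : the returned buffer is allocated with new[]; the caller must release it with delete[]. Truncation is byte-based and may cut a multi-byte character in half
Returns  : the complete converted string in a newly allocated buffer, or NULL if an argument is invalid
=============================================================================*/
char* UnicodeToUtf8(IN const WCHAR* pchSrc, OUT char* pchDest, IN int nDestLen )
{
if ( pchDest == NULL || pchSrc == NULL || nDestLen <= 0 )
{
return NULL;
}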
const WCHAR* pWStrSRc = pchSrc;
int nTmplen = WideCharToMultiByte(CP_UTF8, 0, pWStrSRc, -1, NULL, 0, NULL, NULL);
char* pTemp = new char[nTmplen + 1];
memset(pTemp, 0, nTmplen + 1);
WideCharToMultiByte(CP_UTF8, 0, pWStrSRc, -1, pTemp, nTmplen + 1, NULL, NULL);
int nLen = strlen(pTemp);
if (nLen + 1 > nDestLen)
{
strncpy(pchDest, pTemp, nDestLen - 1);
pchDest[nDestLen - 1] = 0;
}
else
{
strcpy(pchDest, pTemp);
}
//delete []pTemp;
return pTemp;
}
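/*=============================================================================
Function : AnsiToUtf8
Purpose  : Convert a char buffer (ANSI encoding) to UTF-8 and copy the result safely into the given char buffer, truncating if it does not fit
Params   : const char* pchSrc [in] source string
           char* pchDest [out] destination buffer
           int nDestLen [in] capacity of the destination buffer in bytes, including the terminator
Note     : the returned buffer is allocated with new[]; the caller must release it with delete[]. Truncation is byte-based and may cut a multi-byte character in half
Returns  : the complete converted string in a newly allocated buffer, or NULL if an argument is invalid
=============================================================================*/
char* AnsiToUtf8(IN const char* pchSrc,OUT char* pchDest, IN int nDestLen )
{
if (pchSrc == NULL || pchDest == NULL || nDestLen <= 0)
{
return NULL;
}
// First convert ANSI to Unicode
int nUnicodeBufLen = MultiByteToWideChar(CP_ACP, 0, pchSrc, -1, NULL, 0);
WCHAR* pUnicodeTmpBuf = new WCHAR[nUnicodeBufLen + 1];
memset(pUnicodeTmpBuf, 0, (nUnicodeBufLen + 1) * sizeof(WCHAR));
MultiByteToWideChar(CP_ACP, 0, pchSrc, -1, pUnicodeTmpBuf, nUnicodeBufLen + 1);
// Then convert Unicode to UTF-8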
int nUtf8BufLen = WideCharToMultiByte(CP_UTF8, 0, pUnicodeTmpBuf, -1, NULL, 0, NULL, NULL);
char* pUtf8TmpBuf = new char[nUtf8BufLen + 1];
memset(pUtf8TmpBuf, 0, nUtf8BufLen + 1);
WideCharToMultiByte(CP_UTF8, 0, pUnicodeTmpBuf, -1, pUtf8TmpBuf, nUtf8BufLen + 1, NULL, NULL);
int nLen = strlen(pUtf8TmpBuf);
if (nLen + 1 > nDestLen)
{
strncpy(pchDest, pUtf8TmpBuf, nDestLen - 1);
pchDest[nDestLen - 1] = 0;
}
else
{
strcpy(pchDest, pUtf8TmpBuf);
}
//delete[]pUtf8TmpBuf;
delete[]pUnicodeTmpBuf;
pUnicodeTmpBuf = NULL;
return pUtf8TmpBuf;
}
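/*=============================================================================
Function : Utf8ToAnsi
Purpose  : Convert a char buffer (UTF-8 encoding) to ANSI and copy the result safely into the given char buffer, truncating if it does not fit
Params   : const char* pchSrc [in] source string
           char* pchDest [out] destination buffer
           int nDestLen [in] capacity of the destination buffer in bytes, including the terminator
Note     : the returned buffer is allocated with new[]; the caller must release it with delete[]. Truncation is byte-based and may cut a multi-byte character in half
Returns  : the complete converted string in a newly allocated buffer, or NULL if an argument is invalid
=============================================================================*/
char* Utf8ToAnsi(IN const char* pchSrc,OUT char* pchDest,IN int nDestLen)
{
if (pchSrc == NULL || pchDest == NULL || nDestLen <= 0)
{
return NULL;
}
// First convert UTF-8 to Unicode
int nUnicodeBufLen = MultiByteToWideChar(CP_UTF8, 0, pchSrc, -1, NULL, 0);
WCHAR* pUnicodeTmpBuf = new WCHAR[nUnicodeBufLen + 1];
memset(pUnicodeTmpBuf, 0, (nUnicodeBufLen + 1) * sizeof(WCHAR));
MultiByteToWideChar(CP_UTF8, 0, pchSrc, -1, pUnicodeTmpBuf, nUnicodeBufLen + 1);
// Then convert Unicode to ANSI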
int nAnsiBuflen = WideCharToMultiByte(CP_ACP, 0, pUnicodeTmpBuf, -1, NULL, 0, NULL, NULL);
char* pAnsiTmpBuf = new char[nAnsiBuflen + 1];
memset(pAnsiTmpBuf, 0, nAnsiBuflen + 1);
WideCharToMultiByte(CP_ACP, 0, pUnicodeTmpBuf, -1, pAnsiTmpBuf, nAnsiBuflen + 1, NULL, NULL);
int nLen = strlen(pAnsiTmpBuf);
if (nLen + 1 > nDestLen)
{
strncpy(pchDest, pAnsiTmpBuf, nDestLen - 1);
pchDest[nDestLen - 1] = 0;
}
else
{
strcpy(pchDest, pAnsiTmpBuf);
}
//delete []pAnsiTmpBuf;
delete []pUnicodeTmpBuf;
pUnicodeTmpBuf = NULL;
return pAnsiTmpBuf;
}
int _tmain(int argc, _TCHAR* argv[])
{
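char pStr[] = "Ansic字符串"; // ANSI encoding
//const char* pStr = "Ansic字符串"; // same contents
WCHAR pWStr[] = L"Unicode字符串"; // Unicode encoding
//const WCHAR* pWStr = L"Unicode字符串"; // same contents
//const TCHAR* pTStr = _T("ceshi字符串"); // equivalent to one of the two forms above, depending on the project's character set
int nSizeofChar = sizeof(char); //1
int nSizeofWchar = sizeof(WCHAR); //2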
//1-1 Unicode to ANSI
int nU2ALenSrc = wcslen(pWStr); //10
int nU2ASize = sizeof(WCHAR);
int nU2ALen = nU2ALenSrc*nU2ASize;
CHAR* pStrAnsi = new CHAR[nU2ALen + 1];
memset(pStrAnsi, 0, nU2ALen + 1);
char* pStrAnsiDest = UnicodeToAnsi(pWStr,pStrAnsi,nU2ALen);
int nA2ULenTest = strlen(pStrAnsi);
delete[]pStrAnsi;
pStrAnsi = NULL;
//1-2 ANSI to Unicode
int nA2ULenSrc = strlen(pStrAnsiDest); //13 = 7+3*2
int nA2USize = sizeof(char);
int nA2ULen = nA2ULenSrc*nA2USize;
WCHAR* pWStrUnicode = new WCHAR[nA2ULen + 1];
memset(pWStrUnicode, 0, (nA2ULen + 1) * sizeof(WCHAR));
// The destination must be at least as large as the source, especially when the source contains Chinese:
// ANSI uses 1 byte per English letter and 2 per Chinese character, while Unicode uses two bytes for each character
WCHAR* pWStrUnicodeDest = AnsiToUnicode(pStrAnsiDest,pWStrUnicode, nA2ULen);
delete[]pWStrUnicode;
delete[]pStrAnsiDest;
delete[]pWStrUnicodeDest;
pWStrUnicode = NULL;
pStrAnsiDest = NULL;
pWStrUnicodeDest = NULL;
//2-1 Unicode to UTF-8
int nU2UtfLenSrc = wcslen(pWStr);
int nU2UtfSize = 3; // one WCHAR needs at most 3 bytes in UTF-8; sizeof(WCHAR) would be too small for mostly Chinese text
int nU2UtfLen = nU2UtfLenSrc*nU2UtfSize;
char* pStrUnicode2Utf = new char[nU2UtfLen + 1];
memset(pStrUnicode2Utf, 0, nU2UtfLen + 1);
char* pStrUnicode2UtfDest = UnicodeToUtf8(pWStr,pStrUnicode2Utf,nU2UtfLen);
int nU2UtfLenTest = strlen(pStrUnicode2Utf);
delete[]pStrUnicode2Utf;
pStrUnicode2Utf = NULL;
//2-2 UTF-8 to Unicode
int nUtf2ULenSrc = strlen(pStrUnicode2UtfDest); //16 = 7+3*3; in UTF-8 a Chinese character takes 3 bytes and an English letter 1
int nUtf2USize = sizeof(char);
int nUtf2ULen = nUtf2ULenSrc*nUtf2USize;
WCHAR* pWsUtf2Unicode = new WCHAR[nUtf2ULen + 1];
memset(pWsUtf2Unicode, 0, (nUtf2ULen + 1) * sizeof(WCHAR));
WCHAR* pWsUtf2UnicodeDest = Utf8ToUnicode(pStrUnicode2UtfDest,pWsUtf2Unicode,nUtf2ULen);
delete[]pWsUtf2Unicode;
delete[]pStrUnicode2UtfDest;
delete[]pWsUtf2UnicodeDest;
pWsUtf2Unicode = NULL;
pStrUnicode2UtfDest = NULL;
pWsUtf2UnicodeDest = NULL;
//3-1 ANSI to UTF-8
int nA2UtfLenSrc = strlen(pStr); //11 = 5+3*2
int nA2UtfSize = sizeof(char);
int nA2UtfLen = nA2UtfLenSrc*nA2UtfSize*3 + 1; // the UTF-8 result (14 bytes here) can be longer than the ANSI source; 3 bytes per source byte plus the terminator is a safe upper bound
char* pStrAnsi2Utf = new char[nA2UtfLen + 1];
memset(pStrAnsi2Utf, 0, nA2UtfLen + 1);
char* pStrAnsi2UtfDest = AnsiToUtf8(pStr,pStrAnsi2Utf,nA2UtfLen);
int nAnsi2UtfLenTest = strlen(pStrAnsi2Utf);
delete[]pStrAnsi2Utf;
pStrAnsi2Utf = NULL;
//3-2 UTF-8 to ANSI
int nUtf2ALenSrc = strlen(pStrAnsi2UtfDest); //14 = 5*1+3*3
int nUtf2ASize = sizeof(char);
int nUtf2ALen = nUtf2ALenSrc*nUtf2ASize;
char* pStrUtf2Ansi = new char[nUtf2ALen + 1];
memset(pStrUtf2Ansi, 0, nUtf2ALen + 1);
char* pStrUtf2AnsiDest = Utf8ToAnsi(pStrAnsi2UtfDest,pStrUtf2Ansi,nUtf2ALen);
int nUtf2ALenTest = strlen(pStrUtf2Ansi);
int nUtf2ALenTest2 = strlen(pStrUtf2AnsiDest);
delete[]pStrUtf2Ansi;
delete[]pStrAnsi2UtfDest;
delete[]pStrUtf2AnsiDest;
pStrUtf2Ansi = NULL;
pStrAnsi2UtfDest = NULL;
pStrUtf2AnsiDest = NULL;
return 0;
}
///
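/*================================================ Notes ================================================
1. strlen: returns the number of bytes, excluding '\0' (the terminator). Chinese and English characters count differently, and the value depends on the encoding (ANSI: Chinese 2, English 1; UTF-8: Chinese 3, English 1). It does not apply to Unicode (UTF-16) data; use wcslen there
2. wcslen: returns the number of wide characters, excluding '\0' (the terminator). Chinese and English characters both count as 1
3. sizeof: returns the size in bytes including the '\0' terminator; in Unicode the terminator also occupies two bytes
4. Size of the string data in bytes: len = strlen(s) * sizeof(char); wlen = wcslen(ws) * sizeof(WCHAR)
5. The length passed to the conversion functions is the capacity of the destination buffer, counted in elements of the destination type (char or WCHAR); the sample passes a capacity that is large enough rather than the exact size required
6. ANSI and UTF-8 cannot be converted to each other directly; convert to Unicode first and use it as the "bridge" between them
========================================================================================================= */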
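To show what the UTF-8 to Unicode step does at the byte level, here is a hand-written decoder that turns UTF-8 into UTF-16 code units. It is a portable illustration only, not the Win32 implementation and not a replacement for MultiByteToWideChar(CP_UTF8, ...). The name `utf8_to_utf16_sketch` is hypothetical, and the sketch does not reject overlong encodings.

```cpp
#include <cstddef>
#include <string>

// Decodes UTF-8 into UTF-16 code units. Malformed or truncated sequences
// are replaced with U+FFFD.
std::u16string utf8_to_utf16_sketch(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0xFFFD;   // stays U+FFFD if the lead byte is invalid
        std::size_t extra = 0;  // continuation bytes expected after the lead
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= in.size()) { ok = false; break; }          // input ended early
            unsigned char c = static_cast<unsigned char>(in[i]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }      // not 10xxxxxx
            cp = (cp << 6) | (c & 0x3F);
            ++i;
        }
        if (!ok || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) cp = 0xFFFD;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // outside the BMP: split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

Because the function returns a std::u16string, the caller never has to guess a destination size or release a buffer, which are the two points that need care in the listing above.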
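4.1.3 Converting to numbers
There are two ways to convert a string to a number: 1. use a string stream object; 2. use the standard library functions. Note: string streams require #include <sstream> and #include <iostream>; the standard functions require <stdlib.h>
Code example: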
#include <iostream>
#include <sstream>
#include <string>
#include <stdlib.h>
using namespace std;
int main()
{
/*===========================================================================================================
ANSI
===============================================================================================================*/
string str = "John 20 50";
const char *cstr = "Amy 30 42";
ostringstream ostr; // The ostringstream object to write to
string name;
int score1, score2, average_score;
//1-1 string -> number
// Read name and scores and compute average then write to ostr
istringstream istr1 (str); // istr1 will read from str
istr1 >> name >> score1 >> score2;
average_score = (score1 + score2)/2;
ostr << name << " has average score " << average_score << "\n";
cout << ostr.str();
ostr.str(""); // clear the buffer so that earlier lines are not printed again
//1-2 const char* -> number
istringstream istr2; // istr2 will read from cstr
istr2.str(cstr);
istr2 >> name >> score1 >> score2;
average_score = (score1 + score2)/2;
ostr << name << " has average score " << average_score << "\n";
cout << ostr.str();
ostr.str("");
ostr << hex; // switch integer output to hexadecimal
ostr << name << "'s scores in hexadecimal are: " << score1 << " and " << score2 << "\n";
cout << ostr.str();
//1-3 using the standard functions
string intStr = "11";
int iValue = atoi(intStr.c_str());
string fStr = "3.141592688";
double fValue = atof(fStr.c_str()); // atof returns double
string lStr = "90000";
long lValue = atol(lStr.c_str());
/*===========================================================================================================
Unicode
===============================================================================================================*/
wstring wStr = L"XiaoMing 18 180";
const wchar_t* cWstr = L"Lily 21 170";
wostringstream wostr; // output stream
//2-1 wstring -> number
wistringstream wistr1(wStr); // input stream
int height = 0;
int age = 0;
wstring wsName = L"";
wistr1 >> wsName;
wistr1 >> age;
wistr1 >> height;
wostr << wsName << L" age:" << age << L" height:" << height << L"\n";
//2-2 const wchar_t* -> number
wistringstream wistr2(cWstr); // input stream
wistr2 >> wsName;
wistr2 >> age;
wistr2 >> height;
wostr << wsName << L" age:" << age << L" height:" << height << L"\n";
//2-3 using the standard functions (_wtoi, _wtof and _wtol are Microsoft-specific)
wstring intwStr = L"11";
int iwValue = _wtoi(intwStr.c_str());
wstring fwStr = L"3.141592688";
double fwValue = _wtof(fwStr.c_str()); // returns double
wstring lwStr = L"90000";
long lwValue = _wtol(lwStr.c_str());
wistringstream wistr3(fwStr);
double fwValueTemp = 0.0;
wistr3 >> fwValueTemp;
return 0;
}
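One limitation of atoi and its relatives is that they return 0 when nothing can be converted, so a failed parse cannot be told apart from the string "0". A minimal sketch of a checked conversion built on the standard strtol follows; `parse_int_checked` is a hypothetical helper name.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Parses a base-10 int from the whole of s (leading whitespace is accepted,
// as in strtol). Returns false on empty input, trailing characters or
// overflow; `out` is left untouched in that case.
bool parse_int_checked(const std::string& s, int& out) {
    if (s.empty()) return false;
    errno = 0;
    char* end = nullptr;
    long value = std::strtol(s.c_str(), &end, 10);
    if (end == s.c_str() || *end != '\0') return false;  // no digits, or leftovers
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) return false;
    out = static_cast<int>(value);
    return true;
}
```

The wide-character counterpart of strtol is wcstol, which can be used in the same way for wstring input.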