Converting std::string between UTF-8 and the ANSI code page

ListenAlone

Last modified: 2022-06-24 15:12:52

First published: 2022-06-24 15:09:26

Original article: https://blog.csdn.net/Stone_Wang_MZ/article/details/106761471

#include <windows.h>
#include <string>

// Convert src from code page `from` to code page `to`, going through UTF-16.
// Returns an empty string if the input is empty or a conversion fails.
static std::string ConvertCodePage(const std::string& src, UINT from, UINT to)
{
    if (src.empty()) return std::string();
    const int srcLen = static_cast<int>(src.size());

    // Step 1: source code page -> UTF-16. The first call only queries the size.
    const int wideLen = ::MultiByteToWideChar(from, 0, src.data(), srcLen, NULL, 0);
    if (wideLen <= 0) return std::string();
    std::wstring wide(wideLen, L'\0');
    ::MultiByteToWideChar(from, 0, src.data(), srcLen, &wide[0], wideLen);

    // Step 2: UTF-16 -> target code page, again sizing the buffer first.
    const int outLen = ::WideCharToMultiByte(to, 0, wide.data(), wideLen, NULL, 0, NULL, NULL);
    if (outLen <= 0) return std::string();
    std::string out(outLen, '\0');
    ::WideCharToMultiByte(to, 0, wide.data(), wideLen, &out[0], outLen, NULL, NULL);
    return out;
}

// System ANSI code page (CP_ACP) -> UTF-8.
std::string stringUTF8(const std::string& str)
{
    return ConvertCodePage(str, CP_ACP, CP_UTF8);
}

// UTF-8 -> system ANSI code page (CP_ACP).
std::string UTF8string(const std::string& strTemp)
{
    return ConvertCodePage(strTemp, CP_UTF8, CP_ACP);
}
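
The functions above rely on the Windows API for both steps. To show what the UTF-16 to UTF-8 step amounts to, here is a minimal portable sketch. `Utf16ToUtf8` is a hypothetical helper that is not part of the original post, and replacing unpaired surrogates with U+FFFD is an assumption of this sketch rather than a statement about how any particular API behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the original post): encode UTF-16 as UTF-8
// without any platform API. Unpaired surrogates are replaced by U+FFFD.
std::string Utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {             // high surrogate
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                // Combine the pair into one code point above U+FFFF.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;                            // unpaired high surrogate
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;                                // stray low surrogate
        }
        assert(cp <= 0x10FFFF);                         // always a valid code point here

        if (cp < 0x80) {                                // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {                        // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {                      // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                                        // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For example, `Utf16ToUtf8(u"\u4E2D")` yields the three bytes E4 B8 AD, which is why a UTF-8 std::string holding Chinese text is longer than its character count.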

Reference: "The UTF-8 format of string in C++", 三石目's blog on CSDN

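Supplement: differences between wstring and string

wstring holds wide characters (wchar_t). On Windows each wide character is 2 bytes and carries UTF-16, which makes wstring the usual type there for Unicode text such as Chinese. wstring and string differ in three ways: element size, encoding, and typical use.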

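I. Different element sizes
1. wstring: a sequence of wide characters (wchar_t). Each element is 2 bytes (16 bits) on Windows; on most Linux and macOS toolchains wchar_t is 4 bytes.

2. string: a sequence of narrow characters (char). Each element is 1 byte (8 bits).

In other words, on Windows each wide character takes 16 bits, the size of two chars. Most commonly used Chinese characters fit in a single 16-bit UTF-16 unit; characters outside the Basic Multilingual Plane take two units (a surrogate pair).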
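
To make the size difference concrete, the sketch below measures how many bytes the same character occupies in different string types. `StorageBytes` is a hypothetical helper introduced only for illustration, and std::u16string stands in for a Windows wstring so that the numbers do not depend on the platform's wchar_t size.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: bytes occupied by the characters of a string,
// that is, the number of elements times the size of one element.
template <class String>
std::size_t StorageBytes(const String& s)
{
    return s.size() * sizeof(typename String::value_type);
}

// A char is always exactly one byte; char16_t is two bytes on mainstream platforms.
static_assert(sizeof(std::string::value_type) == 1, "char is one byte");
```

With this helper, the character U+4E2D occupies 3 bytes as UTF-8 in a std::string (`"\xE4\xB8\xAD"`) and 2 bytes as UTF-16 in a std::u16string (`u"\u4E2D"`), while a character outside the Basic Multilingual Plane such as U+1F600 needs two UTF-16 units, or 4 bytes.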

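II. Different encodings
1. wstring: usually holds Unicode text, which on Windows means UTF-16, with each code unit the size of two chars.

2. string: usually holds ASCII or a multi-byte encoding such as UTF-8 or the ANSI code page (for example GBK), with each code unit one char. A non-ASCII character such as a Chinese character then spans several chars.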
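
Because std::string stores bytes rather than characters, its size() is not the character count once the text is UTF-8. The sketch below counts code points by skipping continuation bytes. `CountCodePoints` is a hypothetical helper, and it assumes the input is already valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count Unicode code points in a UTF-8 std::string.
// Assumes valid UTF-8; every byte that is not a continuation byte
// (bit pattern 10xxxxxx) starts a new code point.
std::size_t CountCodePoints(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the UTF-8 bytes of "a" followed by U+4E2D (`"a\xE4\xB8\xAD"`), size() reports 4 while `CountCodePoints` reports 2.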

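III. Different uses
1. wstring: typically used for text containing Chinese characters, especially when calling the wide-character (W) Windows APIs.

2. string: typically used for English (ASCII) text. It can also carry Chinese text as UTF-8 or code-page bytes, which is the kind of data the conversion functions above operate on.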

https://blog.csdn.net/liuming690452074/article/details/115765683