A pitfall with jsoncpp: garbled Chinese text when writing JSON

zhenfei2017

Article link: https://blog.csdn.net/qq_16135205/article/details/86299508

1. Problem description
When Chinese text is written with an old version of jsoncpp and then read with a new version, it comes out garbled.

2. jsoncpp source repository:
https://github.com/open-source-parsers/jsoncpp.git

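3. Types of escaped streams
string-escape: the escaping works on the raw binary byte stream of the string.
unicode-escape: the escaping works on the stream of Unicode code point values.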
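To make the two views concrete, here is a minimal sketch that prints the same character both ways. The helpers `bytesToHex` and `firstCodePoint` are hypothetical names made up for this illustration, they are not part of jsoncpp, and the decoder assumes well-formed UTF-8 input.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: hex dump of the raw bytes of a string (the "byte stream" view).
std::string bytesToHex(const std::string& utf8) {
    std::string out;
    char buf[4];
    for (unsigned char b : utf8) {
        if (!out.empty()) out += ' ';
        std::snprintf(buf, sizeof buf, "%02X", static_cast<unsigned>(b));
        out += buf;
    }
    return out;
}

// Hypothetical helper: decode the first code point of a string
// (the "code point" view). Assumes well-formed UTF-8; no validation is done.
std::uint32_t firstCodePoint(const std::string& utf8) {
    if (utf8.empty()) return 0;
    const unsigned char b0 = static_cast<unsigned char>(utf8[0]);
    std::uint32_t cp;
    std::size_t extra;  // number of continuation bytes that follow the lead byte
    if (b0 < 0x80)      { cp = b0;        extra = 0; }
    else if (b0 < 0xE0) { cp = b0 & 0x1F; extra = 1; }
    else if (b0 < 0xF0) { cp = b0 & 0x0F; extra = 2; }
    else                { cp = b0 & 0x07; extra = 3; }
    for (std::size_t i = 1; i <= extra && i < utf8.size(); ++i)
        cp = (cp << 6) | (static_cast<unsigned char>(utf8[i]) & 0x3F);
    return cp;
}
```

For the character 中, stored in UTF-8 as the three bytes `"\xE4\xB8\xAD"`, `bytesToHex` gives `E4 B8 AD`, while `firstCodePoint` gives `0x4E2D`, that is U+4E2D. A byte-oriented writer sees three bytes; a code-point-oriented writer sees one value.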

4. The commit record on GitHub
Revision: 42a161fc80b32cc63c1a3e7ab1c9ed588a83edaa
Author: Paweł Kierski pkierski@gmail.com
Date: 2017/10/4 9:19:20
Message:
Serialize UTF-8 string with Unicode escapes (#687)
Squashed and merged.

Modified: src/lib_json/json_writer.cpp

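5. Code comparison
Versions before 2017/10/4
String serialization works byte by byte on the UTF-8 string value: only control characters are written in hexadecimal, and every other byte is output unchanged (string-escape)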

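std::string valueToQuotedString( const char *value ){
    …
    switch(*c){
    default: // not one of the special characters
        if ( isControlCharacter( *c ) ){
            // control characters are written as \u followed by four hexadecimal digits
            std::ostringstream oss;
            oss << "\\u" << std::hex << std::uppercase << std::setfill('0') << std::setw(4) << static_cast<int>(*c);
            result += oss.str();
        }
        else{
            // every other byte, including the bytes of multi-byte UTF-8 sequences, is copied unchanged
            result += *c;
        }
        break;
    }
}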

bool isControlCharacter(char ch){
return ch > 0 && ch <= 0x1F;
}
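The byte-oriented behaviour can be reproduced in isolation. The following is a simplified sketch, not the jsoncpp source: `escapeBytewise` is a hypothetical name, and the short escapes for quotes, backslashes and newlines that the real writer applies are left out on purpose.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the pre-2017/10/4 behaviour: look at one byte
// at a time, escape control characters, copy everything else verbatim.
std::string escapeBytewise(const std::string& value) {
    std::string result;
    for (char c : value) {
        if (c > 0 && c <= 0x1F) {  // same test as isControlCharacter
            char buf[8];
            std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(c));
            result += buf;
        } else {
            result += c;  // bytes of multi-byte UTF-8 sequences end up here
        }
    }
    return result;
}
```

With this sketch, the UTF-8 bytes of 中 (`"\xE4\xB8\xAD"`) come out exactly as they went in, so the written JSON contains the raw Chinese bytes, whereas the single byte `0x01` comes out as the six characters `\u0001`.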

Versions after 2017/10/4
The UTF-8 string is decoded into code points, and the code point values are written in hexadecimal (unicode-escape)

static JSONCPP_STRING valueToQuotedStringN(const char* value, unsigned length) {
    //
    // convert to a code point; as I understand it, this is the character's value in the Unicode code table
    unsigned int cp = utf8ToCodepoint(c, end);
    // don't escape non-control characters
    // (short escape sequence are applied above)
    if (cp < 0x80 && cp >= 0x20)
        result += static_cast<char>(cp);
    else if (cp < 0x10000) { // codepoint is in Basic Multilingual Plane
        result += "\\u";
        result += toHex16Bit(cp);
    } else { // codepoint is not in Basic Multilingual Plane
        // convert to surrogate pair first
        cp -= 0x10000;
        result += "\\u";
        result += toHex16Bit((cp >> 10) + 0xD800);
        result += "\\u";
        result += toHex16Bit((cp & 0x3FF) + 0xDC00);
    }
}
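The code-point-oriented behaviour can also be tried on its own. This is a minimal sketch under stated assumptions rather than the jsoncpp implementation: `escapeCodePoints` and `appendHex16` are hypothetical names, the input is assumed to be well-formed UTF-8, lowercase hex digits are a choice made for this sketch, and the short escapes for quotes and backslashes are omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: append "\u" plus four hex digits for a 16-bit value.
static void appendHex16(std::string& out, std::uint32_t v) {
    char buf[8];
    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(v & 0xFFFF));
    out += buf;
}

// Hypothetical stand-in for the post-2017/10/4 behaviour: decode each UTF-8
// sequence to a code point, then escape everything outside printable ASCII.
std::string escapeCodePoints(const std::string& s) {
    std::string result;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b0 = static_cast<unsigned char>(s[i++]);
        std::uint32_t cp;
        int extra;  // number of continuation bytes after the lead byte
        if (b0 < 0x80)      { cp = b0;        extra = 0; }
        else if (b0 < 0xE0) { cp = b0 & 0x1F; extra = 1; }
        else if (b0 < 0xF0) { cp = b0 & 0x0F; extra = 2; }
        else                { cp = b0 & 0x07; extra = 3; }
        while (extra-- > 0 && i < s.size())
            cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);

        if (cp >= 0x20 && cp < 0x80) {
            result += static_cast<char>(cp);  // printable ASCII stays as it is
        } else if (cp < 0x10000) {
            appendHex16(result, cp);          // Basic Multilingual Plane
        } else {
            cp -= 0x10000;                    // split into a surrogate pair
            appendHex16(result, (cp >> 10) + 0xD800);
            appendHex16(result, (cp & 0x3FF) + 0xDC00);
        }
    }
    return result;
}
```

Here the three bytes of 中 become the six ASCII characters `\u4e2d`, and a character outside the Basic Multilingual Plane such as U+1F600 (`"\xF0\x9F\x98\x80"`) becomes the surrogate pair `\ud83d\ude00`. The output no longer contains any raw Chinese bytes, which is the visible difference from the older writer.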

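6. Remember: do not upgrade a library casually. Once integration with the server or with other clients has been debugged and works, do not upgrade without a good reason.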
