Symptom (seen through an example)
Example:
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>

int main(int argc, char** argv) {
    uint8_t in = std::stoi(std::string(argv[1]));
    printf("in: %c, %d\n", in, in);
    // Serialize
    std::ostringstream ostream;
    ostream.str("");
    ostream << in;
    // Deserialize
    std::istringstream istream(ostream.str());
    uint8_t out = 0;  // stays 0 if the extraction below fails
    istream >> out;
    printf("out: %c, %d\n", out, out);
}
Output:
As expected:
root@DAVINCI-D01-001:~/CLionProjectst/Test/cmake-build-debug# ./main 1
in: , 1
out: , 1
Not as expected:
root@DAVINCI-D01-001:~/CLionProjectst/Test/cmake-build-debug# ./main 9
in: , 9
out: , 0
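For some inputs the deserialized value does not match the value that was serialized, so deserialization fails.
Attempt: replace operator>> with the get method of istringstream. This works.
- Look at how uint8_t is defined on Linux, in /usr/include/stdint.h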
typedef unsigned char uint8_t;
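So uint8_t is an unsigned char.
- Look at the definition of operator>> for istringstream.
The member overloads of operator>> include none for char, which raised the question of whether std supports char poorly. (The overloads for char, signed char and unsigned char are in fact non-member functions, so they are easy to miss in the member list.)
- Look at the other methods istringstream offers for reading characters
get: Get characters (public member function)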
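The observation that uint8_t is a character type can be checked directly. The following is a minimal sketch, assuming a platform where uint8_t is a typedef for unsigned char and the execution character set is ASCII; the helper names (write_as_char, write_as_number, read_as_char, check_char_semantics) are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Writes the value with operator<<, exactly as the example does.
// Because uint8_t is a character type here, this emits one raw byte.
std::string write_as_char(uint8_t value) {
    std::ostringstream os;
    os << value;
    return os.str();
}

// Widening to int first selects the integer overload, which emits decimal digits.
std::string write_as_number(uint8_t value) {
    std::ostringstream os;
    os << static_cast<int>(value);
    return os.str();
}

// Reading into uint8_t extracts a single character, not a parsed number.
uint8_t read_as_char(const std::string& text) {
    std::istringstream is(text);
    uint8_t out = 0;
    is >> out;
    return out;
}

// Hypothetical self-check collecting the observations above (ASCII assumed).
void check_char_semantics() {
    assert(write_as_char(65) == "A");     // one byte, ASCII 'A'
    assert(write_as_number(65) == "65");  // two digit characters
    assert(read_as_char("65") == '6');    // only the first character is read
}
```

Calling check_char_semantics() from main exercises all three helpers. The point is that both directions of the round trip in the example move a single raw byte, never a decimal number.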
Modified code:
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>

int main(int argc, char** argv) {
    uint8_t in = std::stoi(std::string(argv[1]));
    printf("in: %c, %d\n", in, in);
    // Serialize
    std::ostringstream ostream;
    ostream.str("");
    ostream << in;
    // Deserialize
    std::istringstream istream(ostream.str());
    uint8_t out;
    // istream >> out;
    out = istream.get();
    printf("out: %c, %d\n", out, out);
}
Output:
root@DAVINCI-D01-001:~/CLionProjectst/Test/cmake-build-debug# ./main 1
in: , 1
out: , 1
root@DAVINCI-D01-001:~/CLionProjectst/Test/cmake-build-debug# ./main 9
in: , 9
out: , 9
As if by magic, the problem is solved.
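One detail worth keeping in mind when switching to get: it returns an int_type rather than a char, so that end of file can be told apart from a real byte. A minimal sketch of a checked read follows; read_byte and check_read_byte are hypothetical helpers, not part of the original program.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one raw byte with get() and reports success.
bool read_byte(std::istream& is, uint8_t& out) {
    typedef std::istream::traits_type traits;
    const std::istream::int_type c = is.get();
    if (traits::eq_int_type(c, traits::eof())) {
        return false;  // nothing left to read; get() has set eofbit and failbit
    }
    out = static_cast<uint8_t>(traits::to_char_type(c));
    return true;
}

// Hypothetical self-check: a lone tab byte is read intact, then the stream is empty.
void check_read_byte() {
    std::istringstream with_tab(std::string(1, '\t'));
    uint8_t value = 0;
    assert(read_byte(with_tab, value) && value == 9);  // the tab byte survives
    assert(!read_byte(with_tab, value));               // stream is now exhausted
}
```

Assigning the result of get() straight to a uint8_t, as the modified code does, works for this example but would turn an end-of-file result into the byte 255 on typical platforms, which is why the sketch checks first.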
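Reflection: why? What is the difference between operator>> and get?
- When operator>> gives an unexpected result, did the operation actually fail?
The current state flags of a stream can be read with good() or rdstate(). For the failing case, add the following code to inspect the stream state after the read: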
printf("istream: good:%d \n", istream.good());
if (istream.rdstate() & std::ios_base::eofbit) {
printf("istream'flag is eofbit\n");
}
if (istream.rdstate() & std::ios_base::failbit) {
printf("istream'flag is failbit\n");
}
Output with operator>>:
root@DAVINCI-D01-001:~/CLionProjectst/Test/cmake-build-debug# ./main 9
in: , 9
istream: good:0
istream'flag is eofbit
istream'flag is failbit
out: , 0
Output with get:
root@DAVINCI-D01-001:~/CLionProjectst/Test/cmake-build-debug# ./main 9
in: , 9
istream: good:1
out: , 9
Property | operator>> | get |
---|---|---|
Stream state | eofbit and failbit set | good |
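The comparison in the table can be reproduced without the command-line program. This is a sketch under the assumption that the stream holds only the tab byte (decimal 9); describe_state, state_after_extraction, state_after_get and check_states are illustrative names.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: lists the state bits that are currently set.
std::string describe_state(const std::ios& stream) {
    if (stream.good()) return "good";
    std::string text;
    auto add = [&text](const char* name) {
        if (!text.empty()) text += '|';
        text += name;
    };
    const std::ios_base::iostate state = stream.rdstate();
    if (state & std::ios_base::eofbit)  add("eofbit");
    if (state & std::ios_base::failbit) add("failbit");
    if (state & std::ios_base::badbit)  add("badbit");
    return text;
}

// State after reading a lone tab byte with operator>>.
std::string state_after_extraction() {
    std::istringstream is("\t");
    unsigned char out = 0;
    is >> out;
    return describe_state(is);
}

// State after reading the same byte with get().
std::string state_after_get() {
    std::istringstream is("\t");
    is.get();
    return describe_state(is);
}

// Hypothetical self-check matching the two rows of observations.
void check_states() {
    assert(state_after_extraction() == "eofbit|failbit");
    assert(state_after_get() == "good");
}
```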
- Compare the source code of operator>> and get
Source of get:
template<typename _CharT, typename _Traits>
typename basic_istream<_CharT, _Traits>::int_type
basic_istream<_CharT, _Traits>::
get(void)
{
  const int_type __eof = traits_type::eof();
  int_type __c = __eof;
  _M_gcount = 0;
  ios_base::iostate __err = ios_base::goodbit;
  sentry __cerb(*this, true); // this is the point of difference
  if (__cerb)
    {
      __try
        {
          __c = this->rdbuf()->sbumpc();
          // 27.6.1.1 paragraph 3
          if (!traits_type::eq_int_type(__c, __eof))
            _M_gcount = 1;
          else
            __err |= ios_base::eofbit;
        }
      __catch(__cxxabiv1::__forced_unwind&)
        {
          this->_M_setstate(ios_base::badbit);
          __throw_exception_again;
        }
      __catch(...)
        { this->_M_setstate(ios_base::badbit); }
    }
  if (!_M_gcount)
    __err |= ios_base::failbit;
  if (__err)
    this->setstate(__err);
  return __c;
}
Source of operator>>:
// 27.6.1.2.3 Character extraction templates
template<typename _CharT, typename _Traits>
basic_istream<_CharT, _Traits>&
operator>>(basic_istream<_CharT, _Traits>& __in, _CharT& __c)
{
  typedef basic_istream<_CharT, _Traits> __istream_type;
  typedef typename __istream_type::int_type __int_type;

  typename __istream_type::sentry __cerb(__in, false); // this is the point of difference
  if (__cerb)
    {
      ios_base::iostate __err = ios_base::goodbit;
      __try
        {
          const __int_type __cb = __in.rdbuf()->sbumpc();
          if (!_Traits::eq_int_type(__cb, _Traits::eof()))
            __c = _Traits::to_char_type(__cb);
          else
            __err |= (ios_base::eofbit | ios_base::failbit);
        }
      __catch(__cxxabiv1::__forced_unwind&)
        {
          __in._M_setstate(ios_base::badbit);
          __throw_exception_again;
        }
      __catch(...)
        { __in._M_setstate(ios_base::badbit); }
      if (__err)
        __in.setstate(__err);
    }
  return __in;
}
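Both functions first construct a sentry object on the stream, which prepares the stream for input, but operator>> and get pass different arguments to the sentry constructor. Let's look at what the constructor's parameters mean: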
/**
 *  @brief  The constructor performs all the work.
 *  @param  __is  The input stream to guard.
 *  @param  __noskipws  Whether to consume whitespace or not.
 *
 *  If the stream state is good (@a __is.good() is true), then the
 *  following actions are performed, otherwise the sentry state
 *  is false (<em>not okay</em>) and failbit is set in the
 *  stream state.
 *
 *  The sentry's preparatory actions are:
 *
 *  -# if the stream is tied to an output stream, @c is.tie()->flush()
 *     is called to synchronize the output sequence
 *  -# if @a __noskipws is false, and @c ios_base::skipws is set in
 *     @c is.flags(), the sentry extracts and discards whitespace
 *     characters from the stream.  The currently imbued locale is
 *     used to determine whether each character is whitespace.
 *
 *  If the stream state is still good, then the sentry state becomes
 *  true (@a okay).
 */
explicit
sentry(basic_istream<_CharT, _Traits>& __is, bool __noskipws = false);
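The second parameter is a flag that controls whether leading whitespace is consumed: false means leading whitespace is extracted and discarded (provided ios_base::skipws is set in the stream's flags), and true means whitespace is left in place.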
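The effect of that flag can be probed by constructing a sentry by hand, since std::istream::sentry is a public class. The sketch below assumes the default stream flags (skipws set) and the default "C" locale; sentry_ok and check_sentry are hypothetical helpers.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Constructs a sentry on a stream holding `content` and reports whether
// the sentry considers the stream ready for extraction.
bool sentry_ok(const std::string& content, bool noskipws) {
    std::istringstream is(content);
    std::istream::sentry guard(is, noskipws);
    return static_cast<bool>(guard);
}

// Hypothetical self-check for the two argument values used by operator>> and get().
void check_sentry() {
    assert(!sentry_ok("\t", false));  // as in operator>>: the tab is skipped, then EOF
    assert(sentry_ok("\t", true));    // as in get(): the tab is left in place
    assert(sentry_ok("\tA", false));  // whitespace skipped, 'A' is still available
}
```

The first case is the interesting one: the sentry itself consumes the only byte in the stream and reports failure before the extraction code ever runs.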
Could it be that operator>> treated the byte 9 as whitespace and consumed it? Let's verify.
Change istream >> out; to istream >> std::noskipws >> out;
Result:
root@DAVINCI-D01-001:~/CLionProjectst/Test/cmake-build-debug# ./main 9
in: , 9
istream: good:1
out: , 9
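A caveat when using std::noskipws: the manipulator changes the stream's formatting flags, and the change stays in effect for every later read on the same stream. The following sketch saves and restores the flags around the single-byte read; read_byte_noskipws and check_noskipws are illustrative names, not part of the original code.

```cpp
#include <cassert>
#include <cstdint>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: extracts one byte without whitespace skipping and
// restores the stream's previous formatting flags afterwards.
bool read_byte_noskipws(std::istream& is, uint8_t& out) {
    const std::ios_base::fmtflags saved = is.flags();
    is >> std::noskipws >> out;
    is.flags(saved);  // noskipws would otherwise stay in effect for later reads
    return !is.fail();
}

// Hypothetical self-check: a raw tab byte followed by a formatted integer.
void check_noskipws() {
    std::istringstream is("\t 42");
    uint8_t byte = 0;
    assert(read_byte_noskipws(is, byte) && byte == 9);

    // skipws is active again, so the integer read skips the space before 42.
    int number = 0;
    is >> number;
    assert(number == 42);
    assert((is.flags() & std::ios_base::skipws) != 0);
}
```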
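The true cause revealed
By default, operator>> on an istream skips leading whitespace, so the character with decimal value 9 (horizontal tab) is discarded. The stream is then empty, the read hits end of file, and both eofbit and failbit are set. Which characters count as whitespace? See the description of isspace: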
Check if character is a white-space
Checks whether c is a white-space character.
For the "C" locale, white-space characters are any of:
Character | Code | Description |
---|---|---|
' ' | (0x20) | space (SPC) |
'\t' | (0x09) | horizontal tab (TAB) |
'\n' | (0x0a) | newline (LF) |
'\v' | (0x0b) | vertical tab (VT) |
'\f' | (0x0c) | form feed (FF) |
'\r' | (0x0d) | carriage return (CR) |
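So all of the characters above are whitespace. When they sit at the front of a stream, operator>> consumes them first by default, which means the uint8_t values 9 through 13 and 32 cannot survive the round trip in the original example.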
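To see the full extent of the problem, the round trip from the first example can be run over every possible byte value. This is a sketch assuming the default "C" locale is in effect (a different global locale could classify more characters as whitespace); roundtrip_ok, failing_values and check_failing_values are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <vector>

// Round-trips one byte through operator<< and the default operator>>.
bool roundtrip_ok(uint8_t in) {
    std::ostringstream os;
    os << in;
    std::istringstream is(os.str());
    uint8_t out = 0;
    is >> out;
    return !is.fail() && out == in;
}

// Collects every byte value that does not survive the round trip.
std::vector<int> failing_values() {
    std::vector<int> failed;
    for (int v = 0; v <= 255; ++v) {
        if (!roundtrip_ok(static_cast<uint8_t>(v))) failed.push_back(v);
    }
    return failed;
}

// Hypothetical self-check, valid under the "C" locale assumption.
void check_failing_values() {
    // Exactly the six isspace characters from the table fail.
    const std::vector<int> expected = {9, 10, 11, 12, 13, 32};
    assert(failing_values() == expected);
}
```

The failing set matches the isspace table one to one, which is consistent with whitespace skipping being the only cause of the failed reads.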