Encoding a string to GB18030 and decoding it back is reversible:
String aa = "RE怈鼡腀呈n蹒-;�";
byte[] xx = aa.getBytes("GB18030");
System.out.println(new String(xx, "GB18030"));
The output is unchanged: RE怈鼡腀呈n蹒-;�
Decoding bytes as GB18030 and encoding them back is sometimes not reversible:
byte[] input = {82, 69, -112, 64, -4, -109, -60, 64, -77, -54, 110, -11, -25, 45, 59, -19};
String str = new String(input, "GB18030");
byte[] output = str.getBytes("GB18030");
for (byte b : output) {
    System.out.println(b);
}
Comparing the bytes before (first line) and after (second line):
[82, 69, -112, 64, -4, -109, -60, 64, -77, -54, 110, -11, -25, 45, 59, -19]
[82, 69, -112, 64, -4, -109, -60, 64, -77, -54, 110, -11, -25, 45, 59, -124, 49, -92, 55]
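The only difference is the final -19 (0xED). What was it decoded to? An unknown character, the replacement character U+FFFD. Why?
Because -19 is the lead byte of a two-byte Chinese character, but here it is left on its own with no trail byte after it. The decoder therefore turns it into the unknown character, and when that character is encoded again it is written as the four-byte sequence -124, 49, -92, 55 (0x84 0x31 0xA4 0x37).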
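
One way to catch this kind of input before it silently changes is to decode strictly. The following is a minimal sketch, not part of the original note: the class name Gb18030RoundTrip and the helpers roundTrips and isWellFormed are hypothetical names chosen for illustration. It assumes the JRE provides the GB18030 charset, and it uses java.nio.charset.CharsetDecoder with CodingErrorAction.REPORT, so a dangling lead byte raises an exception instead of being replaced.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.util.Arrays;

public class Gb18030RoundTrip {
    // Assumes the runtime ships the GB18030 charset.
    static final Charset GB18030 = Charset.forName("GB18030");

    // Hypothetical helper: true if decode followed by encode gives back the same bytes.
    static boolean roundTrips(byte[] input) {
        String decoded = new String(input, GB18030);
        return Arrays.equals(input, decoded.getBytes(GB18030));
    }

    // Hypothetical helper: true if strict decoding accepts the bytes.
    // REPORT makes the decoder throw rather than substitute U+FFFD.
    static boolean isWellFormed(byte[] input) {
        try {
            GB18030.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(input));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] input = {82, 69, -112, 64, -4, -109, -60, 64, -77, -54, 110, -11, -25, 45, 59, -19};
        // Same bytes with the dangling lead byte -19 removed.
        byte[] trimmed = Arrays.copyOf(input, input.length - 1);

        System.out.println("input round-trips:    " + roundTrips(input));
        System.out.println("input well-formed:    " + isWellFormed(input));
        System.out.println("trimmed round-trips:  " + roundTrips(trimmed));
        System.out.println("trimmed well-formed:  " + isWellFormed(trimmed));
    }
}
```

If the bytes come from a stream that was cut at an arbitrary position, a check like isWellFormed can signal that more bytes should be read before decoding, rather than letting the decoder substitute the unknown character.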