Start with a set of bytes that encode a string in UTF-8, create a string from that data, then get the bytes that encode that string in a different encoding:
// requires: import java.nio.charset.Charset;
byte[] utf8bytes = { (byte)0xc3, (byte)0xa2, 0x61, 0x62, 0x63, 0x64 };
Charset utf8charset = Charset.forName("UTF-8");
Charset iso88591charset = Charset.forName("ISO-8859-1");
String string = new String ( utf8bytes, utf8charset );
System.out.println(string);
// "When I do a getbytes(encoding) and "
byte[] iso88591bytes = string.getBytes(iso88591charset);
for ( byte b : iso88591bytes )
    System.out.printf("%02x ", b);
System.out.println();
// "then create a new string with the bytes in ISO-8859-1 encoding"
String string2 = new String ( iso88591bytes, iso88591charset );
// "I get a two different chars"
System.out.println(string2);
This correctly outputs the string and the ISO-8859-1 bytes:
âabcd
e2 61 62 63 64
âabcd
So your byte array wasn't paired with the correct encoding:
String failString = new String ( utf8bytes, iso88591charset );
System.out.println(failString);
Outputs
Ã¢abcd
(Either that, or you just wrote the UTF-8 bytes to a file and read them elsewhere as ISO-8859-1.)
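If the mismatch has already happened, it can sometimes be undone. The following is a minimal sketch, not a general fix: it assumes the bytes really were UTF-8, that they were decoded as ISO-8859-1 (not windows-1252 or another charset), and that the string was not modified afterwards. The class name `MojibakeRepair` and the helper `repair` are hypothetical names introduced for illustration. It relies on the fact that Java's ISO-8859-1 charset maps every byte value 0x00-0xFF to the character with the same code point, so re-encoding gives back the original bytes.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {

    // Hypothetical helper: undoes a "UTF-8 bytes decoded as ISO-8859-1" mistake.
    // Assumption: the input was produced by exactly that mistake and nothing else.
    static String repair(String misdecoded) {
        // ISO-8859-1 is a one-to-one byte <-> char mapping for 0x00-0xFF,
        // so this recovers the original byte sequence unchanged.
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        // Decode the recovered bytes with the encoding they were written in.
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] utf8bytes = { (byte) 0xc3, (byte) 0xa2, 0x61, 0x62, 0x63, 0x64 };

        // Reproduce the failure: UTF-8 bytes paired with the wrong charset.
        String failString = new String(utf8bytes, StandardCharsets.ISO_8859_1);
        System.out.println(failString.length()); // 6 chars: one per byte

        // Reverse it.
        String repaired = repair(failString);
        System.out.println(repaired.length());             // 5 chars
        System.out.println(repaired.equals("\u00e2abcd")); // true
    }
}
```

The better fix is still to decode with the right charset in the first place; a repair step like this only helps when the wrongly decoded text is all you have left.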