Some Notes on Character Encoding in Java (repost)

Article link: https://blog.csdn.net/weixin_30954817/article/details/114732058

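This is an exercise (slightly modified) from the Java job-training video course by 张孝祥 (Zhang Xiaoxiang):

Write the following program, then run it and analyze its output: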

import java.io.*;

public class TestCodeIO {

    public static void main(String[] args) throws Exception {
        // Decode the raw bytes from standard input as ISO-8859-1
        InputStreamReader isr = new InputStreamReader(System.in, "iso8859-1");
        BufferedReader br = new BufferedReader(isr);
        String strLine = br.readLine();
        br.close();
        isr.close();
        System.out.println(strLine);
    }
}

After running the program and typing the two characters “中国”, the output is ???ú
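The ???ú output can be reproduced without a console. The sketch below assumes the console charset is GBK (a superset of gb2312); the class name MojibakeDemo and its helper methods are made up for illustration. It encodes 中国 into the bytes such a console would send, decodes them as ISO-8859-1 the way the program does, and then re-encodes the result the way println would.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Assumption: the console encoding is GBK; the original post says gb2312.
    static final Charset CONSOLE = Charset.forName("GBK");

    /** What readLine() returns when the typed bytes are decoded as ISO-8859-1. */
    static String misread(String typed) {
        return new String(typed.getBytes(CONSOLE), StandardCharsets.ISO_8859_1);
    }

    /** What the console shows after println encodes the string with the console charset. */
    static String shownOnConsole(String s) {
        return new String(s.getBytes(CONSOLE), CONSOLE);
    }

    public static void main(String[] args) {
        // "中国" written with Unicode escapes so the source file encoding does not matter
        String typed = "\u4e2d\u56fd";
        String strLine = misread(typed);

        System.out.println(strLine.length()); // 4: one char per input byte
        for (char c : strLine.toCharArray()) {
            System.out.printf("U+%04X ", (int) c); // U+00D6 U+00D0 U+00B9 U+00FA
        }
        System.out.println();

        // Ö, Ð and ¹ have no GBK encoding and become '?'; ú does exist in GBK
        System.out.println(shownOnConsole(strLine).equals("???\u00fa")); // true
    }
}
```

The four input bytes D6 D0 B9 FA turn into four Latin-1 characters, and only the last of them survives the trip back to a GBK console, which matches the three question marks followed by ú.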

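Modify the program in each of the following two ways so that the Chinese input is printed correctly:

1. Modify this statement in the program: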

InputStreamReader isr = new InputStreamReader(System.in,"iso8859-1");

2. Leave the statement above unchanged and modify this one instead:

System.out.println(strLine);

The first fix is simple: change the statement as follows. I won't discuss it in detail here.

InputStreamReader isr = new InputStreamReader(System.in,"gb2312");
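As a rough illustration of why the first fix works, the sketch below feeds the same bytes to an InputStreamReader twice, once per charset name. It is only a simulation: a ByteArrayInputStream stands in for System.in, and the class name ReadWithCharset and its readLine helper are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;

public class ReadWithCharset {
    /** Reads one line from rawInput, decoding the bytes with the named charset. */
    static String readLine(byte[] rawInput, String charsetName) {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(rawInput), charsetName))) {
            return br.readLine();
        } catch (IOException e) {
            // Also covers UnsupportedEncodingException for an unknown charset name
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed input: the bytes a gb2312 console sends for 中国, followed by Enter
        byte[] typed = {(byte) 0xD6, (byte) 0xD0, (byte) 0xB9, (byte) 0xFA, '\n'};

        System.out.println(readLine(typed, "gb2312").equals("\u4e2d\u56fd")); // true
        System.out.println(readLine(typed, "iso8859-1").length());            // 4
    }
}
```

With gb2312 the reader pairs the bytes up into two Chinese characters; with iso8859-1 it produces one character per byte.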

What I want to discuss in detail is the second fix.

At first I changed it like this:

System.out.println(new String (strLine.getBytes(),"iso8859-1"));

After typing “中国”, the output is no longer the garbage shown above, but it is still garbled, so this fix is clearly wrong!

I would like to thank 软件民工 for telling me the correct fix, which made everything click:

System.out.println(new String (strLine.getBytes("iso8859-1")));

What exactly is the difference between these two fixes? For ease of reading, here are the correct and the wrong versions together:

import java.io.*;

public class TestCodeIO {

    public static void main(String[] args) throws Exception {
        // Create an InputStreamReader that decodes the input bytes as iso8859-1
        InputStreamReader isr = new InputStreamReader(System.in, "iso8859-1");
        BufferedReader br = new BufferedReader(isr);
        String strLine = br.readLine();
        br.close();
        isr.close();

        System.out.println(strLine);

        // Wrong fix: encodes strLine into bytes using the platform's default
        // charset (gb2312 here), then constructs a new String by decoding those
        // bytes as iso8859-1. strLine was produced by the iso8859-1 decoder, so
        // only encoding it with iso8859-1 gives back the original bytes; gb2312
        // cannot represent most of its characters, so the original bytes are lost.
        System.out.println(new String(strLine.getBytes(), "iso8859-1"));

        // Correct fix: encodes strLine into bytes using the named charset
        // (iso8859-1), which restores the bytes that were typed, then constructs
        // a new String by decoding them with the platform's default charset (gb2312).
        System.out.println(new String(strLine.getBytes("iso8859-1")));
    }
}

The English comments above already explain this, but let me walk through it anyway:

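First, the wrong fix: System.out.println(new String (strLine.getBytes(),"iso8859-1")); This statement converts the string in strLine into a byte sequence using the system default encoding (gb2312 here), then constructs a new String object by decoding those bytes with the specified encoding (iso8859-1 here), and prints it to the screen.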
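The two steps of the wrong fix can be traced at the byte level. This is a sketch under the assumption that the platform default charset is GBK; WrongFixTrace, hex and wrongFix are illustrative names, and the default charset is passed explicitly so the result does not depend on the machine running it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WrongFixTrace {
    // Assumption: the platform default charset is GBK (the post says gb2312).
    static final Charset DEFAULT = Charset.forName("GBK");

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    /** Equivalent to new String(strLine.getBytes(), "iso8859-1") on a GBK platform. */
    static String wrongFix(String strLine) {
        byte[] lossy = strLine.getBytes(DEFAULT); // unmappable chars become '?'
        return new String(lossy, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The bytes typed for 中国, decoded as ISO-8859-1 as in the original program
        byte[] typed = {(byte) 0xD6, (byte) 0xD0, (byte) 0xB9, (byte) 0xFA};
        String strLine = new String(typed, StandardCharsets.ISO_8859_1);

        System.out.println(hex(typed));                                   // D6 D0 B9 FA
        System.out.println(hex(strLine.getBytes(DEFAULT)));               // 3F 3F 3F A8 B2
        System.out.println(wrongFix(strLine).equals("???\u00a8\u00b2"));  // true
    }
}
```

The byte sequence after getBytes() is no longer D6 D0 B9 FA, so no later decoding step can bring 中国 back.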

So where is the mistake?

Look at this piece of code:

InputStreamReader isr = new InputStreamReader(System.in,"iso8859-1");

BufferedReader br = new BufferedReader (isr);

String strLine = br.readLine();

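Here the content of strLine was produced by decoding the input bytes with the specified encoding (iso8859-1), so each character in strLine stands for exactly one byte of the original input. When strLine is converted back to bytes, however, strLine.getBytes() uses the system default encoding, gb2312, which does not give back those original bytes, so of course the output is garbled! The new String object is then built from the gb2312-encoded bytes using iso8859-1, which is why the garbled output differs from that of System.out.println(strLine).

The correct fix hardly needs a detailed explanation: first convert strLine into a byte sequence using iso8859-1, which restores the original input bytes, then construct a new String object from them using the system default encoding (gb2312), and print it.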
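The correct fix relies on new String(bytes) picking up gb2312 as the platform default, which is not guaranteed: since JDK 18 the default charset is UTF-8 unless configured otherwise. A more portable variant names both charsets explicitly. The sketch below is one way to write it; RecoverDemo and recover are hypothetical names, and GBK is assumed to be the encoding the bytes were really in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RecoverDemo {
    /**
     * Undoes a wrong ISO-8859-1 decode: ISO-8859-1 maps every byte to the
     * character with the same code point, so encoding with it restores the
     * original bytes, which are then decoded with the charset they were in.
     */
    static String recover(String misdecoded, Charset actual) {
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, actual);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK"); // assumption: the input bytes were GBK
        String typed = "\u4e2d\u56fd";        // 中国

        // Simulate the original program: GBK bytes decoded as ISO-8859-1
        String strLine = new String(typed.getBytes(gbk), StandardCharsets.ISO_8859_1);

        System.out.println(recover(strLine, gbk).equals(typed)); // true
        // The charset that new String(bytes) would use on this machine
        System.out.println(Charset.defaultCharset());
    }
}
```

Passing the charset as a parameter keeps the behavior the same on every machine, at the cost of having to know which encoding the input really used.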