New Findings on Garbled Chinese Output from Servlets (1)

Latest recommended article published 2022-05-25 14:28:21

leobug

Category: Java. Tags: Servlet, Tomcat, browser, Apache, HTML

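I ran into garbled Chinese output from a servlet again, which was infuriating. After some digging, I came away with new findings and a better understanding.

Original code:

Java code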

protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    PrintWriter pw = response.getWriter();
    response.setCharacterEncoding("utf-8");
    response.setContentType("text/html; charset=utf-8");
    pw.print("中文");
}

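Whether lines 3 and 4 of this code (the setCharacterEncoding and setContentType calls) are set to gbk or to utf-8, the page always shows ??.

In frustration I captured the traffic with WPE and found that, with either utf-8 or gbk, the response was always:

HTTP code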

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=ISO-8859-1
Content-Length: 2
Date: Thu, 08 Mar 2007 06:04:55 GMT

??

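This shows that lines 3 and 4 had no effect. After reviewing the code, I tried moving line 2 (the getWriter() call) after lines 3 and 4, and the garbling was resolved.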
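The effect of the reordering can be sketched without a servlet container. The class below is only an illustration: MiniResponse is a hypothetical stand-in, not the real HttpServletResponse, and it assumes that the writer's charset is bound when the writer is created and that a later setCharacterEncoding call is ignored. The string "\u4e2d\u6587" is the escaped form of 中文.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterOrderDemo {

    // Hypothetical stand-in for a servlet response; NOT the real HttpServletResponse.
    static class MiniResponse {
        private Charset encoding = StandardCharsets.ISO_8859_1; // default named in the servlet API docs
        private final ByteArrayOutputStream body = new ByteArrayOutputStream();
        private PrintWriter writer;

        void setCharacterEncoding(String name) {
            // Assumption of this sketch: a call made after getWriter() is ignored.
            if (writer == null) {
                encoding = Charset.forName(name);
            }
        }

        PrintWriter getWriter() {
            if (writer == null) {
                // The charset is bound to the writer here, when it is created.
                writer = new PrintWriter(new OutputStreamWriter(body, encoding));
            }
            return writer;
        }

        byte[] bodyBytes() {
            getWriter().flush();
            return body.toByteArray();
        }
    }

    static final String TEXT = "\u4e2d\u6587"; // the two characters printed in the article

    // Order from the original code: writer first, encoding afterwards.
    static byte[] writerFirst() {
        MiniResponse response = new MiniResponse();
        PrintWriter pw = response.getWriter();
        response.setCharacterEncoding("utf-8");
        pw.print(TEXT);
        return response.bodyBytes();
    }

    // Corrected order: encoding first, writer afterwards.
    static byte[] encodingFirst() {
        MiniResponse response = new MiniResponse();
        response.setCharacterEncoding("utf-8");
        PrintWriter pw = response.getWriter();
        pw.print(TEXT);
        return response.bodyBytes();
    }

    public static void main(String[] args) {
        System.out.println("writer first:   " + writerFirst().length + " bytes");
        System.out.println("encoding first: " + encodingFirst().length + " bytes");
    }
}
```

Under these assumptions the first ordering produces two '?' bytes, and the second produces the six-byte UTF-8 encoding of the text. In the real servlet the corresponding fix is the one described above: call setCharacterEncoding and setContentType before getWriter().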

Checking the API documentation, I found the following description:

PrintWriter getWriter() throws IOException

Returns a PrintWriter object that can send character text to the client. The PrintWriter uses the character encoding returned by getCharacterEncoding(). If the response's character encoding has not been specified as described in getCharacterEncoding (i.e., the method just returns the default value ISO-8859-1), getWriter updates it to ISO-8859-1.
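The ISO-8859-1 default mentioned here would also account for both the ?? body and the Content-Length: 2 in the capture: ISO-8859-1 has no mapping for Chinese characters, so Java's encoder substitutes one '?' byte per character. A small sketch using only the standard library (the class and method names are made up for illustration; "\u4e2d\u6587" is the escaped form of 中文):

```java
import java.nio.charset.StandardCharsets;

public class Latin1Demo {

    static final String TEXT = "\u4e2d\u6587";

    // Characters that ISO-8859-1 cannot represent are replaced by the
    // charset's replacement byte, which is '?' (0x3F).
    static byte[] asLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // UTF-8 can represent the text; each of these characters takes three bytes.
    static byte[] asUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] latin1 = asLatin1(TEXT);
        System.out.println(latin1.length + " byte(s): "
                + new String(latin1, StandardCharsets.ISO_8859_1));
        System.out.println(asUtf8(TEXT).length + " byte(s) in UTF-8");
    }
}
```

Once the text has been encoded this way the original characters are gone, which is why no charset setting in the browser can bring them back.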

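From this I infer that the character encoding used by the PrintWriter returned from getWriter() is already fixed by the time that method returns. However, I found no evidence as to whether it is held as an internal property of the returned PrintWriter or controlled at runtime.

While looking at the implementation of the setCharacterEncoding method in Tomcat, I found the following code:

Java code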