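Handling Garbled Chinese Text in JavaWeb
When learning JavaWeb you will often run into garbled Chinese text. To display the content correctly, the data has to be decoded with the right character encoding.
The browser sends form data as UTF-8 bytes (the page is declared as UTF-8), but older Tomcat versions decode the query string of a GET request as ISO-8859-1 by default. With the default settings, Chinese parameters therefore come out garbled.
Below we test two Tomcat versions.
Tomcat 7
XML configuration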
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee http://xmlns.jcp.org/xml/ns/javaee/web-app_4_0.xsd"
         version="4.0">
    <servlet>
        <servlet-name>Test</servlet-name>
        <servlet-class>TestServlet</servlet-class>
    </servlet>
    <servlet-mapping>
        <servlet-name>Test</servlet-name>
        <url-pattern>/test</url-pattern>
    </servlet-mapping>
</web-app>
GET request:
index.jsp
<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<html>
<head>
    <title>Student</title>
</head>
<body>
<form action="/test">
    Name: <input type="text" name="name"><br/>
    Gender: <input type="text" name="gender"><br/>
    Education: <input type="text" name="grade"><br/>
    <input type="submit" value="Submit">
</form>
</body>
</html>
Servlet
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;

public class TestServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        // Tell the browser the response body is UTF-8 encoded HTML.
        resp.setContentType("text/html;charset=UTF-8");
        PrintWriter out = resp.getWriter();
        // Parameters are used exactly as the container decoded them.
        String name = req.getParameter("name");
        String gender = req.getParameter("gender");
        String grade = req.getParameter("grade");
        out.println("<html>");
        out.println("<head><title>xxxxxxx</title></head>");
        out.println("<body>");
        out.println("Name: " + name + "<br/>");
        out.println("Gender: " + gender + "<br/>");
        out.println("Education: " + grade);
        out.println("</body>");
        out.println("</html>");
    }
}
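The result is as follows:
The front end sends Chinese text:
What comes back is garbled.
Under Tomcat 7 the GET parameters are decoded as ISO-8859-1, so every byte of the original UTF-8 sequence becomes a separate character in the string. If we do nothing, we write that wrong string back, and the browser, which has been told to read the page as UTF-8, naturally shows garbage.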
contentType="text/html;charset=UTF-8"
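The mechanism described above can be reproduced without a servlet container. The following is only a minimal sketch: the class name MojibakeDemo and the method decodeAsLatin1 are made up for illustration, and the method merely imitates what a container does when it decodes UTF-8 request bytes as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // Imitates a container that decodes UTF-8 request bytes as ISO-8859-1
    // (hypothetical helper, not part of any servlet API).
    static String decodeAsLatin1(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "\u5f20\u4e09" is a two-character Chinese name, written with
        // escapes so the source file encoding does not matter.
        String original = "\u5f20\u4e09";
        String garbled = decodeAsLatin1(original);

        // Each of the two characters is 3 bytes in UTF-8, and every byte
        // turns into its own ISO-8859-1 character.
        System.out.println(original.length() + " -> " + garbled.length()); // 2 -> 6
        System.out.println(garbled.equals(original)); // false
    }
}
```

The string itself is already wrong at this point, so no response setting can repair it afterwards.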
How do we fix it?
For GET, we only need to re-decode the parameters we read.
new String(req.getParameter("xxxx").getBytes("ISO-8859-1"), "UTF-8");
For example, re-decoding only name:
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;

public class TestServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        resp.setContentType("text/html;charset=UTF-8");
        PrintWriter out = resp.getWriter();
        // Only name is re-decoded: recover the raw bytes, then read them as UTF-8.
        String name = new String(req.getParameter("name").getBytes("ISO-8859-1"), "UTF-8");
        // gender and grade are left as the container decoded them.
        String gender = req.getParameter("gender");
        String grade = req.getParameter("grade");
        out.println("<html>");
        out.println("<head><title>xxxxxxx</title></head>");
        out.println("<body>");
        out.println("Name: " + name + "<br/>");
        out.println("Gender: " + gender + "<br/>");
        out.println("Education: " + grade);
        out.println("</body>");
        out.println("</html>");
    }
}
Send the same data.
The result is as follows:
We only need new String(req.getParameter("xxxx").getBytes("ISO-8859-1"), "UTF-8"); to re-decode the given string successfully.
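Since getParameter returns null when a parameter is missing, calling getBytes directly on its result can throw a NullPointerException. Below is a sketch of a small null-safe wrapper around the same conversion. The names ParamRecoder and latin1ToUtf8 are hypothetical, and the sketch assumes the value really was mis-decoded as ISO-8859-1 in the first place.

```java
import java.nio.charset.StandardCharsets;

final class ParamRecoder {
    // Hypothetical helper: undoes an ISO-8859-1 decode of UTF-8 bytes.
    static String latin1ToUtf8(String value) {
        if (value == null) {
            return null; // parameter was not sent
        }
        // ISO-8859-1 maps each byte to exactly one character, so this
        // recovers the original bytes without loss.
        byte[] rawBytes = value.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u5f20\u4e09";
        // Build the garbled form the servlet would see under Tomcat 7.
        String garbled = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(latin1ToUtf8(garbled).equals(original)); // true
        System.out.println(latin1ToUtf8(null) == null);             // true
    }
}
```

In the servlet this would be used as latin1ToUtf8(req.getParameter("name")).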
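Note, however, that calling setCharacterEncoding("UTF-8") in a GET request does not fix the problem, because it only applies to the request body and not to the query string in the URL.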
Servlet
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;

public class TestServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        resp.setContentType("text/html;charset=UTF-8");
        // Has no effect on GET parameters, which come from the query string.
        req.setCharacterEncoding("UTF-8");
        resp.setCharacterEncoding("UTF-8");
        PrintWriter out = resp.getWriter();
        String name = req.getParameter("name");
        String gender = req.getParameter("gender");
        String grade = req.getParameter("grade");
        out.println("<html>");
        out.println("<head><title>xxxxxxx</title></head>");
        out.println("<body>");
        out.println("Name: " + name + "<br/>");
        out.println("Gender: " + gender + "<br/>");
        out.println("Education: " + grade);
        out.println("</body>");
        out.println("</html>");
    }
}
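The result is as follows:
POST request:
Set the form to send a POST request (method="post" on the form element) and read the data in the servlet's doPost method.
Send the same data to the servlet and re-decode only name; the result is the same as for the GET request.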
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;

public class TestServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        resp.setContentType("text/html;charset=UTF-8");
        PrintWriter out = resp.getWriter();
        // Only name is re-decoded; gender and grade are left untouched.
        String name = new String(req.getParameter("name").getBytes("ISO-8859-1"), "UTF-8");
        String gender = req.getParameter("gender");
        String grade = req.getParameter("grade");
        out.println("<html>");
        out.println("<head><title>xxxxxxx</title></head>");
        out.println("<body>");
        out.println("Name: " + name + "<br/>");
        out.println("Gender: " + gender + "<br/>");
        out.println("Education: " + grade);
        out.println("</body>");
        out.println("</html>");
    }
}
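The result is as follows:
Note: with POST there is no need to go through the trouble of re-decoding the parameters one by one. Calling req.setCharacterEncoding("UTF-8") once, before any parameter is read, handles all of them.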
Servlet
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;

public class TestServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        resp.setContentType("text/html;charset=UTF-8");
        // Must be called before the first getParameter call to take effect.
        req.setCharacterEncoding("UTF-8");
        resp.setCharacterEncoding("UTF-8");
        PrintWriter out = resp.getWriter();
        String name = req.getParameter("name");
        String gender = req.getParameter("gender");
        String grade = req.getParameter("grade");
        out.println("<html>");
        out.println("<head><title>xxxxxxx</title></head>");
        out.println("<body>");
        out.println("Name: " + name + "<br/>");
        out.println("Gender: " + gender + "<br/>");
        out.println("Education: " + grade);
        out.println("</body>");
        out.println("</html>");
    }
}
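The result is as follows:
From this we can see:
In Tomcat 7, garbled GET parameters have to be re-decoded one by one with new String(req.getParameter("xxxx").getBytes("ISO-8859-1"), "UTF-8"); the setCharacterEncoding("UTF-8") call that works for POST has no effect on them. POST parameters can be re-decoded the same way, but it is more convenient to call setCharacterEncoding("UTF-8") once for all of them. Set the encoding on the request for data coming in from the front end, and on the response for data the servlet sends out. That takes care of the encoding problem.
Tomcat 9
The front-end data is unchanged and we send a GET request. The servlet is as follows; only name is re-decoded.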
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;

public class TestServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        resp.setContentType("text/html;charset=UTF-8");
        PrintWriter out = resp.getWriter();
        // Same re-decoding as in the Tomcat 7 test, applied to name only.
        String name = new String(req.getParameter("name").getBytes("ISO-8859-1"), "UTF-8");
        String gender = req.getParameter("gender");
        String grade = req.getParameter("grade");
        out.println("<html>");
        out.println("<head><title>xxxxxxx</title></head>");
        out.println("<body>");
        out.println("Name: " + name + "<br/>");
        out.println("Gender: " + gender + "<br/>");
        out.println("Education: " + grade);
        out.println("</body>");
        out.println("</html>");
    }
}
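The result is as follows:
Here is the interesting part: the result is exactly the opposite. The parameter we re-decoded is garbled, while the ones we left alone display correctly.
With POST requests, on the other hand, everything behaves as it did before.
Why?
A quick search on Baidu gave the answer: Tomcat 8 and later already decode the GET query string as UTF-8 by default (the default of the connector's URIEncoding setting changed from ISO-8859-1 to UTF-8), so re-decoding it a second time has the opposite effect.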
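Why the second conversion does damage can be seen in plain Java. This is a sketch only, with a made-up class name DoubleConversionDemo: it applies the Tomcat 7 style re-decoding to a string that is already correct. Chinese characters have no ISO-8859-1 representation, so getBytes replaces each of them with the byte for '?', and the original text is lost.

```java
import java.nio.charset.StandardCharsets;

final class DoubleConversionDemo {
    // The same conversion used for Tomcat 7 GET parameters.
    static String redecode(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Already decoded correctly, as Tomcat 8+ delivers it by default.
        String alreadyCorrect = "\u5f20\u4e09";

        // Each unmappable character becomes a single '?' byte.
        String damaged = redecode(alreadyCorrect);

        System.out.println(damaged.equals("??"));           // true
        System.out.println(damaged.equals(alreadyCorrect)); // false
    }
}
```

Unlike the Tomcat 7 case, this loss cannot be reversed, because the replacement bytes no longer carry any information about the original characters.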
Verification:
The front end sends a GET request, and the servlet is as follows:
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.PrintWriter;

public class TestServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
        resp.setContentType("text/html;charset=UTF-8");
        // req.setCharacterEncoding("UTF-8");
        // resp.setCharacterEncoding("UTF-8");
        PrintWriter out = resp.getWriter();
        // String name = req.getParameter("name");
        // String name = new String(req.getParameter("name").getBytes("ISO-8859-1"),"UTF-8");
        // Print the raw parameter values to the console, with no conversion at all.
        String gender = req.getParameter("gender");
        System.out.println(gender);
        String grade = req.getParameter("grade");
        System.out.println(grade);
        out.println("<html>");
        out.println("<head><title>xxxxxxx</title></head>");
        out.println("<body>");
        // out.println("Name: " + name + "<br/>");
        out.println("Gender: " + gender + "<br/>");
        out.println("Education: " + grade);
        out.println("</body>");
        out.println("</html>");
    }
}
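The console output is as follows:
Nothing at all was done to the GET parameters here, yet the Chinese text arrives intact.
Trying POST gives the following result:
This shows that Tomcat 8 and later do decode GET parameters as UTF-8 up front, while POST bodies get no such treatment.
For POST requests, calling setCharacterEncoding("UTF-8") is all that is needed.
Summary:
In Tomcat 8 and later, GET parameters reach the servlet already decoded as UTF-8 and display correctly, so we only need to set the response encoding and the request encoding for POST. If we re-decode GET parameters anyway, text that was already correct gets converted again and ends up garbled.
In versions before Tomcat 8, neither GET nor POST parameters are decoded as UTF-8 by default, so we have to handle the conversion ourselves.
Check which Tomcat version you are running so you don't fall into this trap!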
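If the same code has to run on both versions, one possible approach is to re-decode only when the value actually looks mis-decoded. The sketch below is a heuristic of my own, not something Tomcat or the servlet API provides, and the names SafeRecoder and fixIfMisdecoded are hypothetical. It assumes that a string whose characters all fit in one byte, and whose bytes form valid UTF-8, was originally UTF-8 text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class SafeRecoder {
    // Heuristic (assumption): re-decode only if the value could be the
    // ISO-8859-1 view of a valid UTF-8 byte sequence.
    static String fixIfMisdecoded(String value) {
        if (value == null) {
            return null;
        }
        boolean sawNonAscii = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c > 0xFF) {
                return value; // cannot come from an ISO-8859-1 decode
            }
            if (c > 0x7F) {
                sawNonAscii = true;
            }
        }
        if (!sawNonAscii) {
            return value; // pure ASCII is identical in both encodings
        }
        // Strict decoder: throws instead of silently replacing bad input.
        CharsetDecoder strict = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            byte[] raw = value.getBytes(StandardCharsets.ISO_8859_1);
            return strict.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            return value; // not valid UTF-8, so treat it as real Latin-1 text
        }
    }
}
```

The heuristic can still misfire on genuine ISO-8859-1 text that happens to form a valid UTF-8 sequence, so knowing the Tomcat version and configuring the encoding explicitly remains the more reliable route.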