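In Tomcat, a request is received by the Connector component, passes through a series of processing steps, and finally reaches the user's servlet for business processing. If the request contains non-English characters, those characters are usually encoded. The usual approach is to specify the encoding in the front-end page and to register a global filter that sets the encoding when the Tomcat container starts. This solves the problem for POST requests, but for GET requests you will find that it still has no effect. This problem bothered me for a long time, and I eventually found the relevant settings described in the Tomcat documentation: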
Tomcat will use ISO-8859-1 as the default character encoding of the entire URL, including the query string ("GET parameters") (though see Tomcat 8 notice below).
There are two ways to specify how GET parameters are interpreted:
- Set the URIEncoding attribute on the <Connector> element in server.xml to something specific (e.g. URIEncoding="UTF-8").
- Set the useBodyEncodingForURI attribute on the <Connector> element in server.xml to true. This will cause the Connector to use the request body's encoding for GET parameters.
In Tomcat 8 starting with 8.0.0 (8.0.0-RC3, to be specific), the default value of the URIEncoding attribute on the <Connector> element depends on the "strict servlet compliance" setting. The default value of URIEncoding (strict compliance is off) is now UTF-8. If "strict servlet compliance" is enabled, the default value is ISO-8859-1.
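To see why the URI encoding matters for GET parameters, here is a minimal sketch in plain Java that percent-encodes a value as UTF-8 (as a browser typically would) and then decodes it with each of the two charsets discussed above. The class name `QueryDecodingDemo` and its helper methods are made up for illustration; this is not Tomcat's own decoding code, only an approximation of the effect.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class QueryDecodingDemo {

    // Hypothetical helper: percent-encodes a value using UTF-8 bytes.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Hypothetical helper: decodes a percent-encoded value with the given charset name.
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // two Chinese characters
        String raw = encodeUtf8(original); // what ends up in the query string

        // Decoding with the same charset restores the original value.
        System.out.println(decode(raw, "UTF-8").equals(original));

        // Decoding with ISO-8859-1 turns each UTF-8 byte into its own character.
        String garbled = decode(raw, "ISO-8859-1");
        System.out.println(garbled.equals(original));
        System.out.println(garbled.length());
    }
}
```

With ISO-8859-1, each of the six UTF-8 bytes becomes a separate character, which is the kind of garbling you see in GET parameters when the Connector's encoding does not match the one the client used.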
POST requests should specify the encoding of the parameters and values they send. Since many clients fail to set an explicit encoding, the default is used (ISO-8859-1).
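The fallback described above can be sketched roughly as follows: look for a charset parameter in the Content-Type header and use ISO-8859-1 when there is none. The class `BodyCharsetDemo` and the method `charsetOf` are hypothetical names used only for illustration; the container's real parsing is more thorough than this.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class BodyCharsetDemo {

    // Hypothetical helper: returns the charset named in a Content-Type value,
    // or ISO-8859-1 when the client did not specify one.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    return Charset.forName(name); // throws if the name is unknown
                }
            }
        }
        return StandardCharsets.ISO_8859_1; // default when nothing is declared
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("application/x-www-form-urlencoded; charset=UTF-8"));
        System.out.println(charsetOf("application/x-www-form-urlencoded"));
    }
}
```

This is why a character encoding filter helps with POST: it sets the encoding explicitly before the parameters are read, so the default is never used.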
Official recommendation:
Using UTF-8 as your character encoding for everything is a safe bet. This should work for pretty much every situation.
In order to completely switch to using UTF-8, you need to make the following changes:
- Set URIEncoding="UTF-8" on your <Connector> in server.xml. References: HTTP Connector, AJP Connector.
- Use a character encoding filter with the default encoding set to UTF-8.
- Change all your JSPs to include charset name in their contentType.
For example, use <%@page contentType="text/html; charset=UTF-8" %> for the usual JSP pages and <jsp:directive.page contentType="text/html; charset=UTF-8" /> for the pages in XML syntax (aka JSP Documents).
- Change all your servlets to set the content type for responses and to include charset name in the content type to be UTF-8.
Use response.setContentType("text/html; charset=UTF-8") or response.setCharacterEncoding("UTF-8").
- Change any content-generation libraries you use (Velocity, Freemarker, etc.) to use UTF-8 and to specify UTF-8 in the content type of the responses that they generate.
- Disable any valves or filters that may read request parameters before your character encoding filter or jsp page has a chance to set the encoding to UTF-8. For more information see