How Tomcat decodes request data

Latest recommended article published on 2023-05-21 13:55:25

轻风无言

Category: java网络 (Java networking)

Article link: https://blog.csdn.net/u012664375/article/details/54882122

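Everything sent over the network is a stream of bytes. So how does the server turn those bytes back into characters, and how does it know which charset to use? Let's dig into the Tomcat connector and look at this closely.
First, consider the following code, a very simple form.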

 <form action="demo01?name=中国"   method="post">
           <input type="text" name="name1" value="张三"/>
           <input type="submit" value="提交"/>
 </form>

In the controller we read from HttpServletRequest directly instead of letting Spring bind the parameters.

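// The form submits with method="post", so the handler must be mapped to POST.
@RequestMapping(value = "/demo01", method = RequestMethod.POST)
     public String dologin1(HttpServletRequest request) throws UnsupportedEncodingException {
           // null when neither the client nor the application has specified an encoding
           log.info(request.getCharacterEncoding());
           // "name" comes from the query string, "name1" from the request body
           log.info("name:中国" + request.getParameter("name"));
           log.info("name1:张三" + request.getParameter("name1"));
           return "login";
}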

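Run Tomcat and submit the form: the Chinese characters in the log come out garbled.
Fiddler can be used to inspect the details of the request.

Let's verify what is happening with a small test: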

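 // Hex is assumed here to be org.apache.commons.codec.binary.Hex (Apache Commons Codec).
 @Test
     public void test() throws UnsupportedEncodingException {
           String str = "中国";
           byte[] bytes = str.getBytes("utf-8");                 // the bytes the browser sends
           System.out.println(Hex.encodeHex(bytes));             // hex dump of those bytes
           System.out.println(new String(bytes, "iso8859-1"));   // the same bytes decoded as ISO-8859-1
           String str1 = "张三";
           byte[] bytes1 = str1.getBytes("utf-8");
           System.out.println(Hex.encodeHex(bytes1));
           System.out.println(new String(bytes1, "iso8859-1"));
     }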

Output:

e4b8ade59bbd
ä¸å›½
e5bca0e4b889
å¼ ä¸‰

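This shows that Chrome, the browser I used, encodes the Chinese text as UTF-8, while Tomcat decodes it as ISO-8859-1 by default. The same bytes map to different characters in the two charsets, and that mismatch is what produces the garbled text.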
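The mismatch can be reproduced, and reversed, without a server. The following is a minimal sketch rather than anything Tomcat does internally; the class and method names (`MojibakeDemo`, `garble`, `repair`) are made up for illustration. It relies on the fact that ISO-8859-1 maps every byte to exactly one character, so no information is lost when UTF-8 bytes are decoded with it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
class MojibakeDemo {

    // Decode UTF-8 bytes as ISO-8859-1, the way a server with the wrong charset would.
    static String garble(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Recover the original bytes from the garbled string and decode them as UTF-8.
    static String repair(String garbled) {
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u56fd"; // "中国", written with escapes to avoid source-encoding issues
        String garbled = garble(text);
        System.out.println(garbled.length());             // 6: one char per UTF-8 byte
        System.out.println(repair(garbled).equals(text)); // true
    }
}
```

Re-decoding like this is only a workaround for data that has already been misread; configuring the right charset up front is the cleaner fix.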
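Since this is an encoding problem, there must be a way to fix it, so let's check the Tomcat documentation.
The Tomcat connector accepts a URIEncoding attribute that sets the charset for the URI: This specifies the character encoding used to decode the URI bytes, after %xx decoding the URL. If not specified, ISO-8859-1 will be used. (From Tomcat 8 onward the default is UTF-8 unless strict servlet compliance is enabled.)
Configure it in server.xml as follows: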

<Connector connectionTimeout="20000" port="8080" protocol="HTTP/1.1" URIEncoding="utf-8" redirectPort="8443"/>

Run Tomcat again and the URI parameter is now decoded correctly.
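To see why the charset matters for the query string, here is a small sketch that percent-decodes the same value with both charsets using the JDK's `java.net.URLDecoder`. It only illustrates the idea behind URIEncoding and is not how Tomcat implements it; the class name `UriDecodingDemo` and the helper `decode` are assumptions made for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not Tomcat code.
class UriDecodingDemo {

    // Percent-decode a query-string value, interpreting the resulting bytes in the given charset.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // What a browser sends for name=中国 when it encodes the value as UTF-8.
        String raw = "%E4%B8%AD%E5%9B%BD";
        System.out.println(decode(raw, "UTF-8").length());      // 2: the two original characters
        System.out.println(decode(raw, "ISO-8859-1").length()); // 6: one char per byte, i.e. garbled
    }
}
```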

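How are the request body parameters decoded, then? The Servlet source shows that the body encoding can be set before the parameters are read. From this we can infer that Tomcat parses body parameters lazily, the first time they are accessed. That makes sense: decoding strings costs CPU time, so if the parameters are never used there is no reason to parse them and pay that cost.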

/**
     * Overrides the name of the character encoding used in the body of this
     * request. This method must be called prior to reading request parameters
     * or reading input using getReader(). Otherwise, it has no effect.
     *
     * @param env      <code>String</code> containing the name of
     *                 the character encoding.
     * @throws         UnsupportedEncodingException if this
     *                 ServletRequest is still in a state where a
     *                 character encoding may be set, but the specified
     *                 encoding is invalid
     */
    public void setCharacterEncoding(String env) throws UnsupportedEncodingException;
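The "must be called prior to reading request parameters" rule follows naturally from lazy parsing. The sketch below is a deliberately simplified model, not Tomcat's actual implementation: `LazyBodyRequest` is a hypothetical class that decodes its body on first access and ignores any encoding set afterwards.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical model of a request whose body is decoded lazily; not Tomcat code.
class LazyBodyRequest {
    private final byte[] body;
    private Charset encoding = StandardCharsets.ISO_8859_1; // assumed default when none is set
    private String parsed;                                  // stays null until the first read

    LazyBodyRequest(byte[] body) {
        this.body = body;
    }

    // Takes effect only while the body has not been parsed yet.
    void setCharacterEncoding(Charset charset) {
        if (parsed == null) {
            encoding = charset;
        }
    }

    // Decodes the body on first access and caches the result.
    String getBody() {
        if (parsed == null) {
            parsed = new String(body, encoding);
        }
        return parsed;
    }

    public static void main(String[] args) {
        String text = "\u5f20\u4e09"; // "张三"
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

        LazyBodyRequest early = new LazyBodyRequest(bytes);
        early.setCharacterEncoding(StandardCharsets.UTF_8); // set before the first read
        System.out.println(early.getBody().equals(text));   // true

        LazyBodyRequest late = new LazyBodyRequest(bytes);
        late.getBody();                                     // first read fixes the decoding
        late.setCharacterEncoding(StandardCharsets.UTF_8);  // too late, ignored
        System.out.println(late.getBody().equals(text));    // false
    }
}
```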

Change the controller code to set the encoding to UTF-8 before reading any parameter:

  @RequestMapping(value = "/demo01", method = RequestMethod.POST)
     public String dologin(HttpServletRequest request) throws UnsupportedEncodingException {
           request.setCharacterEncoding("utf-8");
           log.info(request.getCharacterEncoding());
           log.info("name:中国" + request.getParameter("name"));
           log.info("name1:张三" + request.getParameter("name1"));
           return "login";
     }

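Run Tomcat again and the encoding problem is fully resolved.
Do we have to set the encoding before every parameter read? There must be a less tedious way, and there is: a filter. Spring ships one ready-made, org.springframework.web.filter.CharacterEncodingFilter. Here is its code: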

@Override
     protected void doFilterInternal(
                HttpServletRequest request, HttpServletResponse response, FilterChain filterChain)
                throws ServletException, IOException {

           if (this.encoding != null && (this.forceEncoding || request.getCharacterEncoding() == null)) {
                request.setCharacterEncoding(this.encoding);
                if (this.forceEncoding) {
                     response.setCharacterEncoding(this.encoding);
                }
           }
           filterChain.doFilter(request, response);
     }