Why call encodeURIComponent twice?

yakoo5

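From what I have seen in practice, when JavaScript encodes a value once with encodeURIComponent and sends it as part of a URL request, the value is automatically decoded once along the way. I attribute this to the browser using its default encoding, though a server container that decodes request parameters with its own default charset would produce the same effect. After a single encoding, then, non-ASCII text such as Chinese arrives at the backend garbled, even when the same encoding is configured at every stage. The fix is to call encodeURIComponent twice in the front-end JavaScript: after the automatic decode the value is still percent-encoded, pure-ASCII text, so it reaches the backend intact. You then have to decode it once more yourself on the backend.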
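The round trip described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `doubleEncode`, `automaticDecode`, and `backendDecode` are hypothetical names, and `automaticDecode` merely stands in for whatever component performs the implicit decode.

```javascript
// Front end: encode the value twice before placing it in the query string.
function doubleEncode(value) {
  return encodeURIComponent(encodeURIComponent(value));
}

// Hypothetical stand-in for the automatic decoding step.
// After double encoding, this step only turns "%25" back into "%",
// which is plain ASCII, so the charset it uses does not matter.
function automaticDecode(value) {
  return decodeURIComponent(value);
}

// Back end: the explicit second decode that recovers the original text.
function backendDecode(value) {
  return decodeURIComponent(value);
}

const sentValue = doubleEncode("人才招聘");
const afterAutomaticDecode = automaticDecode(sentValue);
const recoveredValue = backendDecode(afterAutomaticDecode);

console.log(sentValue);            // %25E4%25BA%25BA%25E6%2589%258D%25E6%258B%259B%25E8%2581%2598
console.log(afterAutomaticDecode); // %E4%BA%BA%E6%89%8D%E6%8B%9B%E8%81%98
console.log(recoveredValue);       // 人才招聘
```

The point of the second encoding is that every "%" becomes "%25", so the intermediate decode never has to interpret multi-byte characters.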

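Whether this is the real explanation is hard to say; I arrived at it by experiment. Take a request such as http://localhost:8080/sxkj/news/actionNewsByCategoryId.do?categoryId=3&categoryName=%E4%BA%BA%E6%89%8D%E6%8B%9B%E8%81%98 : the browser automatically decoded the value after categoryName into the Chinese "人才招聘", and the backend received garbled text. When I encoded "%E4%BA%BA%E6%89%8D%E6%8B%9B%E8%81%98" once more and sent the result as the parameter, the backend received the correct Chinese characters.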
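One plausible way for the singly encoded value to end up garbled is that its bytes are decoded with a charset other than UTF-8. The sketch below assumes ISO-8859-1 as the wrong charset purely for illustration; the experiment above does not establish which charset was actually involved. `percentDecodeToBytes` and `decodeAsLatin1` are hypothetical helpers.

```javascript
// Turn a well-formed percent-encoded ASCII string into the raw bytes it represents.
function percentDecodeToBytes(encoded) {
  const bytes = [];
  for (let i = 0; i < encoded.length; i++) {
    if (encoded[i] === "%") {
      bytes.push(parseInt(encoded.slice(i + 1, i + 3), 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      bytes.push(encoded.charCodeAt(i));
    }
  }
  return Uint8Array.from(bytes);
}

// ISO-8859-1 maps every byte to the code point with the same number.
function decodeAsLatin1(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

const encodedOnce = encodeURIComponent("人才招聘");
const rawBytes = percentDecodeToBytes(encodedOnce);

console.log(encodedOnce);                               // %E4%BA%BA%E6%89%8D%E6%8B%9B%E8%81%98
console.log(new TextDecoder("utf-8").decode(rawBytes)); // 人才招聘
console.log(decodeAsLatin1(rawBytes) === "人才招聘");    // false
```

The same twelve bytes give four Chinese characters under UTF-8 and twelve unrelated characters under ISO-8859-1, which is the kind of garbling described above.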

An online tool for encoding and decoding: http://meyerweb.com/eric/tools/dencoder/

The Java API documentation for URLDecoder says:

The following rules are applied in the conversion:

The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
The special characters ".", "-", "*", and "_" remain the same.
The plus sign "+" is converted into a space character " ".
A sequence of the form "%xy" will be treated as representing a byte where xy is the two-digit hexadecimal representation of the 8 bits. Then, all substrings that contain one or more of these byte sequences consecutively will be replaced by the character(s) whose encoding would result in those consecutive bytes. The encoding scheme used to decode these characters may be specified, or if unspecified, the default encoding of the platform will be used.

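The last English rule is hard to follow; the others are simple. The official Chinese translation reads as follows when rendered back into English (its version of the last rule is just as muddled):

The following rules are used in the conversion:

The alphanumeric characters "a" to "z", "A" to "Z" and "0" to "9" remain unchanged.
The special characters ".", "-", "*" and "_" remain unchanged.
The plus sign "+" is converted into the space character " ".
A sequence in the "%xy" format will be treated as one byte, where xy is the two-digit hexadecimal representation of 8 bits. Then, all substrings that consecutively contain one or more of these byte sequences will be replaced by the characters whose encoding can produce these consecutive bytes. The encoding mechanism used to decode these characters may be specified, or, if it is not specified, the platform's default encoding mechanism is used.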
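The last rule is easier to follow in code. The sketch below is not Java's URLDecoder; it is a small JavaScript approximation of the quoted rules, with `urlDecodeSketch` a hypothetical name and the charset labels being those accepted by TextDecoder rather than Java charset names. Each run of consecutive "%xy" sequences is collected into bytes first and only then decoded as a unit, which is what lets multi-byte characters survive.

```javascript
// Approximation of the quoted decoding rules; not a drop-in URLDecoder replacement.
function urlDecodeSketch(input, charset = "utf-8") {
  // ignoreBOM: true keeps a decoded byte-order mark instead of silently dropping it.
  const decoder = new TextDecoder(charset, { ignoreBOM: true });
  let out = "";
  let i = 0;
  while (i < input.length) {
    const ch = input[i];
    if (ch === "+") {
      // Rule: "+" becomes a space.
      out += " ";
      i += 1;
    } else if (ch === "%") {
      // Rule: collect the whole run of consecutive %xy sequences as bytes...
      const run = [];
      while (input[i] === "%") {
        const hex = input.slice(i + 1, i + 3);
        if (!/^[0-9A-Fa-f]{2}$/.test(hex)) {
          throw new Error("Malformed escape at index " + i);
        }
        run.push(parseInt(hex, 16));
        i += 3;
      }
      // ...then decode the run in the chosen charset as one unit.
      out += decoder.decode(Uint8Array.from(run));
    } else {
      // Rule: every other character is copied through unchanged.
      out += ch;
      i += 1;
    }
  }
  return out;
}

console.log(urlDecodeSketch("%E4%BA%BA%E6%89%8D%E6%8B%9B%E8%81%98")); // 人才招聘
console.log(urlDecodeSketch("a+b%21"));                               // a b!
```

Passing a different charset changes only how each byte run is interpreted, which is why a mismatch between the encoding side and the decoding side shows up as garbled text rather than as an error.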