URLDecoder decoding exception: URLDecoder: Illegal hex characters in escape (%) pattern - For input string: "xxx"

凌晨九点半

Last modified: 2022-03-29 17:23:19

Column: 开发问题集合 (Development Issues Collection). Tags: java

First published: 2022-03-29 17:19:49

Article link: https://blog.csdn.net/Darker2017/article/details/123826351

Problem: decoding a title with URLDecoder throws an exception.

URLDecoder.decode(title, "utf-8")

Exception message:

java.lang.IllegalArgumentException: URLDecoder: Illegal hex characters in escape (%) pattern - For input string: "xxx"

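Cause: the source of URLDecoder.decode shows that '%' and '+' in the input string are handled specially. A '%' must be followed by two hexadecimal digits, otherwise an IllegalArgumentException is thrown, and every '+' is converted to a space. The source is as follows: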

public static String decode(String s, String enc)
        throws UnsupportedEncodingException{

        boolean needToChange = false;
        int numChars = s.length();
        StringBuffer sb = new StringBuffer(numChars > 500 ? numChars / 2 : numChars);
        int i = 0;

        if (enc.length() == 0) {
            throw new UnsupportedEncodingException ("URLDecoder: empty string enc parameter");
        }

        char c;
        byte[] bytes = null;
        while (i < numChars) {
            c = s.charAt(i);
            switch (c) {
            case '+':
                sb.append(' ');
                i++;
                needToChange = true;
                break;
            case '%':
                /*
                 * Starting with this instance of %, process all
                 * consecutive substrings of the form %xy. Each
                 * substring %xy will yield a byte. Convert all
                 * consecutive  bytes obtained this way to whatever
                 * character(s) they represent in the provided
                 * encoding.
                 */

                try {

                    // (numChars-i)/3 is an upper bound for the number
                    // of remaining bytes
                    if (bytes == null)
                        bytes = new byte[(numChars-i)/3];
                    int pos = 0;

                    while ( ((i+2) < numChars) &&
                            (c=='%')) {
                        int v = Integer.parseInt(s.substring(i+1,i+3),16);
                        if (v < 0)
                            throw new IllegalArgumentException("URLDecoder: Illegal hex characters in escape (%) pattern - negative value");
                        bytes[pos++] = (byte) v;
                        i+= 3;
                        if (i < numChars)
                            c = s.charAt(i);
                    }

                    // A trailing, incomplete byte encoding such as
                    // "%x" will cause an exception to be thrown

                    if ((i < numChars) && (c=='%'))
                        throw new IllegalArgumentException(
                         "URLDecoder: Incomplete trailing escape (%) pattern");

                    sb.append(new String(bytes, 0, pos, enc));
                } catch (NumberFormatException e) {
                    throw new IllegalArgumentException(
                    "URLDecoder: Illegal hex characters in escape (%) pattern - "
                    + e.getMessage());
                }
                needToChange = true;
                break;
            default:
                sb.append(c);
                i++;
                break;
            }
        }

        return (needToChange? sb.toString() : s);
    }
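The behavior described above can be checked with a small experiment. This is a minimal sketch, not part of the original post: the class name DecodeBehaviorDemo, the helper tryDecode, and the "ERROR" marker are hypothetical names chosen for illustration. Only URLDecoder.decode(String, String) comes from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: shows how URLDecoder treats '+' and '%'.
final class DecodeBehaviorDemo {

    // Returns the decoded text, or the marker "ERROR" when URLDecoder
    // rejects the input with an IllegalArgumentException.
    static String tryDecode(String input) {
        try {
            return URLDecoder.decode(input, "UTF-8");
        } catch (IllegalArgumentException e) {
            return "ERROR";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this branch is not expected to run.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDecode("a+b"));    // '+' is decoded as a space
        System.out.println(tryDecode("a%2Bb"));  // %2B is decoded as a literal '+'
        System.out.println(tryDecode("50%zz"));  // '%' followed by non-hex characters
        System.out.println(tryDecode("100%"));   // trailing, incomplete escape
    }
}
```

The last two inputs are rejected for different reasons: "%zz" fails hex parsing inside the loop, while the trailing "%" is caught by the incomplete-escape check after it.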

Solution: before decoding, replace stray '%' and '+' characters with their ASCII hex escapes (%25 and %2B). Code:

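title = title.replaceAll("%(?![0-9a-fA-F]{2})", "%25"); // escape '%' not followed by two hex digits
title = title.replaceAll("\\+", "%2B");                 // escape '+' so it is not decoded as a space
title = URLDecoder.decode(title, "utf-8");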

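This uses a special regular-expression construct, the zero-width negative lookahead assertion, written (?!pattern). It matches a position in the string only when the characters immediately after that position do not match pattern.

%(?![0-9a-fA-F]{2}) therefore matches a '%' that is not followed by two hexadecimal digits (0-9, a-f, A-F). Valid escapes such as %2B are excluded, and what remains is exactly the stray '%' characters that need to be replaced.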
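The fix can be wrapped in a reusable helper. This is a sketch under stated assumptions rather than a definitive implementation: the class SafeUrlDecoder and its methods escapeStrayCharacters and decodeLeniently are hypothetical names, and UTF-8 is assumed as the encoding. The regex is the one from the post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper: pre-escapes stray '%' and '+' before calling URLDecoder.
final class SafeUrlDecoder {

    // Escapes every '%' that is not followed by two hex digits, then every '+'.
    static String escapeStrayCharacters(String raw) {
        return raw
                .replaceAll("%(?![0-9a-fA-F]{2})", "%25") // negative lookahead
                .replace("+", "%2B");                     // literal replacement, no regex
    }

    // Decodes text that may mix valid %xy escapes with literal '%' and '+'.
    static String decodeLeniently(String raw) {
        try {
            return URLDecoder.decode(escapeStrayCharacters(raw), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available, so this branch is not expected to run.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeLeniently("50% off a+b"));
        System.out.println(decodeLeniently("%E4%BD%A0%E5%A5%BD"));
    }
}
```

Two limitations follow from the approach. A literal '%' that happens to be followed by two hex digits (for example "100%abc") is still treated as an escape. And because every '+' is preserved, input that uses '+' to mean a space (HTML form encoding) will keep the '+' characters.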

Reference:

https://codeantenna.com/a/rVm18ZjBKT
