A pitfall when replacing \r\n in Java

ElevenVitaminC

Article link: https://blog.csdn.net/u013868665/article/details/79971419

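In a project I needed to convert the \r\n sequences stored in the database into <br />, which an HTML page can render, so I used text.replaceAll("(\\r\\n|\\r|\\n|\\n\\r)", "<br />"); for the replacement. To my surprise, nothing was replaced: the printed output was completely unchanged. (The database text, written as a Java literal, was "这是一个段落\\r\\n", meaning "This is a paragraph\\r\\n": the \r\n had turned into \\r\\n, i.e. the literal characters backslash, r, backslash, n rather than a real carriage return and line feed.)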

I then tried other approaches, such as replacing through Pattern directly, but they had no effect either.
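
The symptom can be reproduced with a small sketch. The class name, method name, and sample strings below are hypothetical stand-ins for the project data, which the post does not show; the only thing taken from the post is the pattern itself.

```java
final class LiteralEscapeDemo {
    // The original pattern from the post. After the Java compiler processes
    // the literal, the regex engine sees \r and \n, which match a real
    // carriage return and a real line feed.
    static String replaceRealLineBreaks(String text) {
        return text.replaceAll("(\\r\\n|\\r|\\n|\\n\\r)", "<br />");
    }

    public static void main(String[] args) {
        // Hypothetical sample values, not the author's actual data.
        String realBreak = "paragraph\r\n";      // 11 chars: ends with CR, LF
        String literalText = "paragraph\\r\\n";  // 13 chars: ends with \, r, \, n

        // The real line break is replaced.
        System.out.println(replaceRealLineBreaks(realBreak));
        // The literal backslash sequences are left untouched.
        System.out.println(replaceRealLineBreaks(literalText));
    }
}
```

Only the first string is changed, which matches the behavior described above: the text from the database contains no real line-break characters for the pattern to find.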

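Later I happened to look at the source of replaceAll, and everything suddenly made sense. The code is as follows, first String.replaceAll and then the Matcher.replaceAll it delegates to (the solution is at the end):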

public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}

public String replaceAll(String replacement) {
    reset();
    boolean result = find();
    if (result) {
        StringBuffer sb = new StringBuffer();
        do {
            appendReplacement(sb, replacement);
            result = find();
        } while (result);
        appendTail(sb);
        return sb.toString();
    }
    return text.toString();
}
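
The loop in Matcher.replaceAll can also be written out by hand with the public Matcher methods, which makes it easier to see where appendReplacement comes in. This is a minimal sketch; the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ManualReplaceAll {
    // Same find / appendReplacement / appendTail sequence as the JDK loop above.
    static String replaceAllByHand(String input, String regex, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Copies the text before the match, then the processed replacement.
            m.appendReplacement(sb, replacement);
        }
        // Copies whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // Real line feeds here, so the regex \n matches them.
        System.out.println(replaceAllByHand("a\nb\nc", "\\n", "<br />"));
    }
}
```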

Here is the key part:

public Matcher appendReplacement(StringBuffer sb, String replacement) {

        // If no match, return error
        if (first < 0)
            throw new IllegalStateException("No match available");

        // Process substitution string to replace group references with groups
        int cursor = 0;
        StringBuilder result = new StringBuilder();

        while (cursor < replacement.length()) {
            char nextChar = replacement.charAt(cursor);
            if (nextChar == '\\') {
                cursor++;
                nextChar = replacement.charAt(cursor);
                result.append(nextChar);
                cursor++;
            } else if (nextChar == '$') {
                // Skip past $
                cursor++;
                // A StringIndexOutOfBoundsException is thrown if
                // this "$" is the last character in replacement
                // string in current implementation, a IAE might be
                // more appropriate.
                nextChar = replacement.charAt(cursor);
                int refNum = -1;
                if (nextChar == '{') {
                    cursor++;
                    StringBuilder gsb = new StringBuilder();
                    while (cursor < replacement.length()) {
                        nextChar = replacement.charAt(cursor);
                        if (ASCII.isLower(nextChar) ||
                            ASCII.isUpper(nextChar) ||
                            ASCII.isDigit(nextChar)) {
                            gsb.append(nextChar);
                            cursor++;
                        } else {
                            break;
                        }
                    }
                    if (gsb.length() == 0)
                        throw new IllegalArgumentException(
                            "named capturing group has 0 length name");
                    if (nextChar != '}')
                        throw new IllegalArgumentException(
                            "named capturing group is missing trailing '}'");
                    String gname = gsb.toString();
                    if (ASCII.isDigit(gname.charAt(0)))
                        throw new IllegalArgumentException(
                            "capturing group name {" + gname +
                            "} starts with digit character");
                    if (!parentPattern.namedGroups().containsKey(gname))
                        throw new IllegalArgumentException(
                            "No group with name {" + gname + "}");
                    refNum = parentPattern.namedGroups().get(gname);
                    cursor++;
                } else {
                    // The first number is always a group
                    refNum = (int)nextChar - '0';
                    if ((refNum < 0)||(refNum > 9))
                        throw new IllegalArgumentException(
                            "Illegal group reference");
                    cursor++;
                    // Capture the largest legal group string
                    boolean done = false;
                    while (!done) {
                        if (cursor >= replacement.length()) {
                            break;
                        }
                        int nextDigit = replacement.charAt(cursor) - '0';
                        if ((nextDigit < 0)||(nextDigit > 9)) { // not a number
                            break;
                        }
                        int newRefNum = (refNum * 10) + nextDigit;
                        if (groupCount() < newRefNum) {
                            done = true;
                        } else {
                            refNum = newRefNum;
                            cursor++;
                        }
                    }
                }
                // Append group
                if (start(refNum) != -1 && end(refNum) != -1)
                    result.append(text, start(refNum), end(refNum));
            } else {
                result.append(nextChar);
                cursor++;
            }
        }
        // Append the intervening text
        sb.append(text, lastAppendPosition, first);
        // Append the match substitution
        sb.append(result);

        lastAppendPosition = last;
        return this;
    }

In the code above we find the following fragment:

char nextChar = replacement.charAt(cursor);
if (nextChar == '\\') {
    cursor++;
    nextChar = replacement.charAt(cursor);
    result.append(nextChar);
    cursor++;
}

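In other words, while building the replacement text it treats a backslash as an escape character, so two consecutive backslashes ('\\') collapse into a single backslash ('\').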
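
A short sketch of that backslash handling in the replacement string. The class and method names are hypothetical; Matcher.quoteReplacement is the standard JDK helper for inserting a value literally.

```java
import java.util.regex.Matcher;

final class ReplacementEscapeDemo {
    // "\\\\" in source is the two-character string \\ at runtime;
    // appendReplacement collapses it into one backslash in the output.
    static String slashToBackslash(String path) {
        return path.replaceAll("/", "\\\\");
    }

    // quoteReplacement escapes every \ and $ in the value, so the value
    // is inserted exactly as given.
    static String slashToLiteral(String path, String literal) {
        return path.replaceAll("/", Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(slashToBackslash("a/b"));      // a, one backslash, b
        System.out.println(slashToLiteral("a/b", "$1"));  // a$1b, no group lookup
    }
}
```

Note that this escaping applies to the second argument of replaceAll only; the first argument is a regex and has its own escaping rules.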

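By now the fix should be clear. The regex argument is escaped twice in the same way: the Java compiler turns the literal "\\r" into the two characters \r, and the regex engine then reads \r as a carriage return rather than as a backslash followed by r. So the '\\r' we want to replace has to be written as '\\\\r', with four consecutive backslashes.

That is: text.replaceAll("(\\\\r\\\\n|\\\\r|\\\\n|\\\\n\\\\r)", "<br />");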

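This problem mainly shows up when the \r\n in data read from the database has become \\r\\n. If the text contains a normal \r\n, the ordinary replacement is all you need.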

After testing, this line of code replaced the line-break sequences in the content correctly.
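
The fix can be checked with a sketch like the one below. The class name, method names, and sample strings are hypothetical. The second method is an alternative that is not in the original post: String.replace takes literal text instead of a regex, so two backslashes in the source are enough.

```java
final class LineBreakToBr {
    // The pattern from the post: four backslashes in source become the
    // regex \\, which matches one literal backslash in the text.
    static String viaRegex(String text) {
        return text.replaceAll("(\\\\r\\\\n|\\\\r|\\\\n|\\\\n\\\\r)", "<br />");
    }

    // Alternative without regex (an assumption, not the author's method):
    // replace the longest sequence first, then the single ones.
    static String viaLiteralReplace(String text) {
        return text.replace("\\r\\n", "<br />")
                   .replace("\\r", "<br />")
                   .replace("\\n", "<br />");
    }

    public static void main(String[] args) {
        String literalText = "line one\\r\\nline two"; // contains \, r, \, n
        System.out.println(viaRegex(literalText));
        System.out.println(viaLiteralReplace(literalText));
    }
}
```

Both methods leave real carriage returns and line feeds untouched, so text that may contain either form would need the ordinary replacement as well. One detail of the pattern: its last alternative is never reached, because the \\\\n alternative before it already matches at the same position, so a literal \n\r ends up as two <br /> tags.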

This problem cost me almost an hour, so I am posting it here for reference.
