本人在项目中需要将数据库中的\r\n转换成html页面可识别的<br />,于是使用了text.replaceAll("(\\r\\n|\\r|\\n|\\n\\r)", "<br />");来进行替换,发现竟然替换不了!!!打印输出的内容中毫无变化。(数据库文本----"这是一个段落\\r\\n",发现\r\n变成了\\r\\n)
然后尝试换其他的方法Pattern替换等等,依旧无效。
后来偶然查看了下replaceAll 的源码,瞬间豁然开朗!代码如下:(解决方案在最后!!)
public String replaceAll(String regex, String replacement) {
return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
public String replaceAll(String replacement) {
reset();
boolean result = find();
if (result) {
StringBuffer sb = new StringBuffer();
do {
appendReplacement(sb, replacement);
result = find();
} while (result);
appendTail(sb);
return sb.toString();
}
return text.toString();
}
关键来了!!!
public Matcher appendReplacement(StringBuffer sb, String replacement) {
// If no match, return error
if (first < 0)
throw new IllegalStateException("No match available");
// Process substitution string to replace group references with groups
int cursor = 0;
StringBuilder result = new StringBuilder();
while (cursor < replacement.length()) {
char nextChar = replacement.charAt(cursor);
if (nextChar == '\\') {
cursor++;
nextChar = replacement.charAt(cursor);
result.append(nextChar);
cursor++;
} else if (nextChar == '$') {
// Skip past $
cursor++;
// A StringIndexOutOfBoundsException is thrown if
// this "$" is the last character in replacement
// string in current implementation, a IAE might be
// more appropriate.
nextChar = replacement.charAt(cursor);
int refNum = -1;
if (nextChar == '{') {
cursor++;
StringBuilder gsb = new StringBuilder();
while (cursor < replacement.length()) {
nextChar = replacement.charAt(cursor);
if (ASCII.isLower(nextChar) ||
ASCII.isUpper(nextChar) ||
ASCII.isDigit(nextChar)) {
gsb.append(nextChar);
cursor++;
} else {
break;
}
}
if (gsb.length() == 0)
throw new IllegalArgumentException(
"named capturing group has 0 length name");
if (nextChar != '}')
throw new IllegalArgumentException(
"named capturing group is missing trailing '}'");
String gname = gsb.toString();
if (ASCII.isDigit(gname.charAt(0)))
throw new IllegalArgumentException(
"capturing group name {" + gname +
"} starts with digit character");
if (!parentPattern.namedGroups().containsKey(gname))
throw new IllegalArgumentException(
"No group with name {" + gname + "}");
refNum = parentPattern.namedGroups().get(gname);
cursor++;
} else {
// The first number is always a group
refNum = (int)nextChar - '0';
if ((refNum < 0)||(refNum > 9))
throw new IllegalArgumentException(
"Illegal group reference");
cursor++;
// Capture the largest legal group string
boolean done = false;
while (!done) {
if (cursor >= replacement.length()) {
break;
}
int nextDigit = replacement.charAt(cursor) - '0';
if ((nextDigit < 0)||(nextDigit > 9)) { // not a number
break;
}
int newRefNum = (refNum * 10) + nextDigit;
if (groupCount() < newRefNum) {
done = true;
} else {
refNum = newRefNum;
cursor++;
}
}
}
// Append group
if (start(refNum) != -1 && end(refNum) != -1)
result.append(text, start(refNum), end(refNum));
} else {
result.append(nextChar);
cursor++;
}
}
// Append the intervening text
sb.append(text, lastAppendPosition, first);
// Append the match substitution
sb.append(result);
lastAppendPosition = last;
return this;
}
从上面这段代码中我们发现这样一个片段
char nextChar = replacement.charAt(cursor);
nextChar = replacement.charAt(cursor);
if (nextChar == '\\') {
cursor++;
nextChar = replacement.charAt(cursor);
result.append(nextChar);
cursor++;
}
也就是说它把连续的两个反斜杠(‘\\’)变成了一个反斜杠(‘\’)
到这里我想各位都已经明白了,咱们要替换的‘\\r’应该用‘\\\\r’来代替,用连续的4个反斜杠
也就是text.replaceAll("(\\\\r\\\\n|\\\\r|\\\\n|\\\\n\\\\r)", "<br />");
这个问题主要出现在数据库读取出的数据中\r\n变成了\\r\\n,如果是正常的\r\n,只需要正常替换就可以了。
经过测试,这行代码完美的替换掉了内容中的换行字符。
这个问题坑了我快一个小时,放到这里供各位参考。