Today, while the product team was replacing some copy, an error came up. The message was:
java.lang.IllegalArgumentException: Illegal group reference: group index is missing
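It is just a plain regex replacement; the only unusual part is that it replaces "dollar" with the $ sign. In the replacement string passed to replaceAll, $ is not a literal character: it introduces a capture-group reference such as $1. A bare $ with no group index after it therefore makes the call throw, as in this simple example: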
import java.util.ArrayList;
import java.util.List;
import org.junit.jupiter.api.Test; // assuming JUnit 5

@Test
void result() {
    List<String> goodsList = new ArrayList<>();
    goodsList.add("free service: 0 Dollar");
    goodsList.add("small hamburger: 0.99 dollar");
    goodsList.add("normal hamburger: 2.99 Dollar");
    goodsList.add("big hamburger: 4.99 Dollar");
    String searchStr = "dollar";
    String replaceStr = "$";
    for (String goods : goodsList) {
        // (?i) makes the match case-insensitive
        // Throws IllegalArgumentException: "$" is read as an incomplete group reference
        goods = goods.replaceAll("(?i)" + searchStr, replaceStr);
        System.out.println(goods);
    }
}
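To make the role of $ in the replacement string concrete, here is a minimal sketch. The class and method names are made up for illustration; only replaceAll and Matcher.quoteReplacement come from the JDK. It assumes Java 8 or later, where a trailing bare $ is reported as an IllegalArgumentException.

```java
import java.util.regex.Matcher;

public class ReplacementDollarDemo {
    // In a replacement string, $1 and $2 refer to the captured groups.
    static String swap(String input) {
        return input.replaceAll("(\\w+)-(\\w+)", "$2-$1");
    }

    // A bare "$" has no group index after it, so replaceAll rejects it.
    static boolean bareDollarThrows() {
        try {
            "1 dollar".replaceAll("dollar", "$");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // Quoting the replacement turns "$" into a literal character.
    static String quoted() {
        return "1 dollar".replaceAll("dollar", Matcher.quoteReplacement("$"));
    }

    public static void main(String[] args) {
        System.out.println(swap("left-right"));   // right-left
        System.out.println(bareDollarThrows());   // true
        System.out.println(quoted());             // 1 $
    }
}
```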
Searching online, most answers use Matcher.quoteReplacement() to handle the special characters, so I changed the code to this:
for (String goods : goodsList) {
    // Bug: this quotes the already-quoted string again on every iteration
    replaceStr = Matcher.quoteReplacement(replaceStr);
    // (?i) makes the match case-insensitive
    goods = goods.replaceAll("(?i)" + searchStr, replaceStr);
    System.out.println(goods);
}
The output:
free service: 0 $
small hamburger: 0.99 \$
normal hamburger: 2.99 \\\$
big hamburger: 4.99 \\\\\\\$
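This is clearly wrong: extra backslashes show up. The escape character "\" is itself special in a replacement string, so Matcher.quoteReplacement() escapes it as well, and because I had put the call inside the loop body it ran on its own output in every iteration. Each call roughly doubles the number of escapes: after n calls the replacement string contains 2^n - 1 backslashes. The loops in our project run far more iterations than this one, so it would run out of memory in no time.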
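The growth is easy to check with a small sketch. The class and method names here are hypothetical; the only JDK call is Matcher.quoteReplacement, applied repeatedly to its own result just like the buggy loop does.

```java
import java.util.regex.Matcher;

public class QuoteGrowthDemo {
    // Counts the backslashes in the replacement string after quoting "$"
    // the given number of times, each time on the previous result.
    static int backslashesAfter(int calls) {
        String replacement = "$";
        for (int i = 0; i < calls; i++) {
            replacement = Matcher.quoteReplacement(replacement);
        }
        // Everything except the single trailing "$" is a backslash.
        return replacement.length() - 1;
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " call(s): " + backslashesAfter(n) + " backslashes");
        }
    }
}
```

For 1 to 5 calls this prints 1, 3, 7, 15 and 31 backslashes. The printed lines in the post show fewer, because replaceAll collapses each escaped pair back into one character when it applies the replacement.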
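The fix is simple: move the Matcher.quoteReplacement() call out of the loop. It is also better not to assign the quoted result back to the same variable; store it in a new string variable instead. The improved code: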
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import org.junit.jupiter.api.Test; // assuming JUnit 5

@Test
void result() {
    List<String> goodsList = new ArrayList<>();
    goodsList.add("free service: 0 dollar");
    goodsList.add("small hamburger: 0.99 dollar");
    goodsList.add("normal hamburger: 2.99 Dollar");
    goodsList.add("big hamburger: 4.99 Dollar");
    String searchStr = "dollar";
    String replaceStr = "$";
    // Quote once, outside the loop, and keep the result in its own variable
    String replaceStrQuote = Matcher.quoteReplacement(replaceStr);
    for (String goods : goodsList) {
        // (?i) makes the match case-insensitive
        goods = goods.replaceAll("(?i)" + searchStr, replaceStrQuote);
        System.out.println(goods);
    }
}
The final result:
free service: 0 $
small hamburger: 0.99 $
normal hamburger: 2.99 $
big hamburger: 4.99 $
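As a possible variation on the same fix, not something the original code does, the search text can be quoted too and the pattern compiled once, so that neither side of the replacement is interpreted as regex syntax. A sketch under those assumptions, with a hypothetical helper name replaceAllIgnoreCase:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralReplaceDemo {
    // Replaces every case-insensitive occurrence of `search` with `replacement`,
    // treating both as plain text rather than regex or group-reference syntax.
    static List<String> replaceAllIgnoreCase(List<String> lines, String search, String replacement) {
        // Compile once; Pattern.quote makes the search text literal.
        Pattern pattern = Pattern.compile(Pattern.quote(search), Pattern.CASE_INSENSITIVE);
        // Quote once, outside the loop, into its own variable.
        String safeReplacement = Matcher.quoteReplacement(replacement);
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            result.add(pattern.matcher(line).replaceAll(safeReplacement));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> goodsList = Arrays.asList(
                "free service: 0 dollar",
                "big hamburger: 4.99 Dollar");
        for (String goods : replaceAllIgnoreCase(goodsList, "dollar", "$")) {
            System.out.println(goods);
        }
    }
}
```

Quoting the search string only matters if it can contain regex metacharacters such as "." or "(", which is plausible when the text comes from product copy.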