Escaping special characters in Java regular expressions

Is there any method in Java or any open source library for escaping (not quoting) a special character (meta-character), in order to use it as a regular expression?

This would be very handy in dynamically building a regular expression, without having to manually escape each individual character.

For example, consider a simple regex like \d+\.\d+ that matches numbers with a decimal point like 1.2, as well as the following code:

import java.util.regex.Pattern;

String digit = "d";
String point = ".";

// Hand-written regex; backslashes are doubled for the Java string literal.
String regex1 = "\\d+\\.\\d+";
// Built from parts, then quoted as a whole, so every character becomes a literal.
String regex2 = Pattern.quote(digit + "+" + point + digit + "+");

Pattern numbers1 = Pattern.compile(regex1);
Pattern numbers2 = Pattern.compile(regex2);

System.out.println("Regex 1: " + regex1);
if (numbers1.matcher("1.2").matches()) {
    System.out.println("\tMatch");
} else {
    System.out.println("\tNo match");
}

System.out.println("Regex 2: " + regex2);
if (numbers2.matcher("1.2").matches()) {
    System.out.println("\tMatch");
} else {
    System.out.println("\tNo match");
}

Not surprisingly, the output produced by the above code is:

Regex 1: \d+\.\d+
    Match
Regex 2: \Qd+.d+\E
    No match

That is, regex1 matches 1.2 but regex2 (which is "dynamically" built) does not (instead, it matches the literal string d+.d+).

So, is there a method that would automatically escape each regex meta-character?

If there were, let's say, a static escape() method in java.util.regex.Pattern, the output of

Pattern.escape('.')

would be the string "\.", but

Pattern.escape(',')

should just produce ",", since it is not a meta-character. Similarly,

Pattern.escape('d')

could produce "\d", since 'd' is used to denote digits (although escaping may not make sense in this case, as 'd' could mean a literal 'd', which the regex interpreter would not mistake for something else, as it would with '.').

Solution

I'm not 100% sure this is what you are asking here. If you are looking for a way to create constants that you can use in your regex patterns then just prepending them with "\\" would work:

String digit = "\\d";
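
As a minimal sketch of that idea, the fragments below are stored already escaped and then concatenated into the regex from the question. The class name RegexConstants and the helper decimalRegex() are made up for illustration; only Pattern.compile and Matcher.matches come from the JDK.

```java
import java.util.regex.Pattern;

public class RegexConstants {
    // Each constant is already valid regex source, so plain concatenation is enough.
    static final String DIGIT = "\\d";   // regex \d : any digit
    static final String POINT = "\\.";   // regex \. : a literal dot

    // Hypothetical helper: assembles \d+\.\d+ from the constants above.
    static String decimalRegex() {
        return DIGIT + "+" + POINT + DIGIT + "+";
    }

    public static void main(String[] args) {
        Pattern numbers = Pattern.compile(decimalRegex());
        System.out.println("Regex: " + decimalRegex());
        System.out.println(numbers.matcher("1.2").matches());
        System.out.println(numbers.matcher("1x2").matches());
    }
}
```

The drawback is that whoever defines each constant still has to know whether it needs a backslash, which is exactly the manual step the question wants to avoid.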

There is no Pattern method that I know of that does this for you. Unfortunately, although there is "\\d" for digits, "\\w" for word characters, etc., there are also () for grouping, + and * for repetition, and so on. There is no common way to deal with every part of a regular expression.
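
If the goal is only to make punctuation meta-characters literal, one possible hand-rolled helper is sketched below. The class RegexEscaper, the method escape(), and the META character list are assumptions for this example, not part of java.util.regex. The sketch relies on the documented rule that a backslash before a non-alphabetic character is always allowed in a Java pattern. It deliberately leaves letters alone, so 'd' stays a literal d rather than becoming \d, and it does not cover characters that are special only inside a character class or under the COMMENTS flag.

```java
import java.util.regex.Pattern;

public class RegexEscaper {
    // Assumed set of meta-characters outside character classes (illustrative, not exhaustive).
    private static final String META = "\\^$.|?*+()[]{}";

    // Hypothetical helper: prefixes each meta-character with a backslash.
    static String escape(String literal) {
        StringBuilder sb = new StringBuilder(literal.length() * 2);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (META.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("."));   // the dot gets a backslash
        System.out.println(escape(","));   // the comma is left unchanged
        // Mix regex source (\d+) with an escaped literal separator.
        String regex = "\\d+" + escape(".") + "\\d+";
        System.out.println(Pattern.matches(regex, "1.2"));
    }
}
```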

In your post you use the Pattern.quote(string) method. As you probably know, it wraps the string between "\\Q" and "\\E" so that it is matched literally even if it happens to contain special regex characters (+, ., \\d, etc.).
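
Pattern.quote can still help with a dynamically built regex if it is applied only to the parts that must be literal, rather than to the whole expression. The sketch below assumes the separator is the only literal part; QuoteDemo and decimalOf() are hypothetical names for this example.

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Hypothetical helper: digits, then a literal separator, then digits.
    // Only the separator is quoted; \d+ stays ordinary regex source.
    static String decimalOf(String separator) {
        return "\\d+" + Pattern.quote(separator) + "\\d+";
    }

    public static void main(String[] args) {
        // Quoting the whole string turns every character into a literal.
        Pattern literal = Pattern.compile(Pattern.quote("d+.d+"));
        System.out.println(literal.matcher("d+.d+").matches());
        System.out.println(literal.matcher("1.2").matches());

        // Quoting only the separator keeps the digit classes working.
        System.out.println(decimalOf("."));
        System.out.println(Pattern.matches(decimalOf("."), "1.2"));
        System.out.println(Pattern.matches(decimalOf("."), "1x2"));
    }
}
```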
