Java regex character escaping: escape characters in Java regular expressions

When matching certain characters (such as line feed), you can use the regex "\\n" or indeed just "\n". For example, the following splits a string into an array of lines:

String[] lines = allContent.split("\\r?\\n");

But the following works just as well:

String[] lines = allContent.split("\r?\n");

My question:

Do the above two work in exactly the same way, or is there any subtle difference? If the latter, can you give an example case where you get different results?

Or is there a difference only in [possible/theoretical] performance?

Solution

There is no difference in the current scenario. String escape sequences are written with a single backslash followed by a valid escape char ("\n", "\r", etc.) and are resolved by the Java compiler, so the regex engine receives the actual character. Regex escape sequences are written with a literal backslash (that is, a double backslash in the Java string literal) followed by a valid regex escape char ("\\n", "\\d", etc.) and are resolved by the regex engine itself.

"\n" (an escape sequence) is a literal LF (newline) and "\\n" is a regex escape sequence that matches an LF symbol.

"\r" (an escape sequence) is a literal CR (carriage return) and "\\r" is a regex escape sequence that matches a CR symbol.

"\t" (an escape sequence) is a literal tab symbol and "\\t" is a regex escape sequence that matches a tab symbol.

See the Java regex docs for the list of supported regex escapes.
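As a rough illustration of the point above, the following sketch splits the same text with both forms and prints the length of the pattern text the regex engine actually receives. The class name EscapeDemo and its helper methods are made up for this example; only String.split, Arrays.toString and Pattern.compile(...).pattern() come from the standard library.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class EscapeDemo {
    // Regex escape form: the regex engine receives a backslash followed by 'r' or 'n'.
    static String[] splitWithRegexEscapes(String text) {
        return text.split("\\r?\\n");
    }

    // String escape form: the compiler has already turned \r and \n into raw CR/LF chars.
    static String[] splitWithLiteralChars(String text) {
        return text.split("\r?\n");
    }

    public static void main(String[] args) {
        String text = "one\r\ntwo\nthree";
        System.out.println(Arrays.toString(splitWithRegexEscapes(text))); // [one, two, three]
        System.out.println(Arrays.toString(splitWithLiteralChars(text))); // [one, two, three]

        // The pattern text differs even though the matching behaviour does not:
        System.out.println(Pattern.compile("\\n").pattern().length()); // 2 (backslash + n)
        System.out.println(Pattern.compile("\n").pattern().length());  // 1 (a raw LF)
    }
}
```

Both calls produce the same array here; the only visible difference is in what the pattern string contains.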

However, if you use the Pattern.COMMENTS flag (also available inline as (?x); it lets you add comments and format a pattern nicely by making the regex engine ignore all unescaped whitespace in the pattern), a raw "\n" or "\r" in the pattern is treated as ignorable whitespace. In that mode you need either "\\n" or "\\\n" in the Java string literal to match a newline (LF), and "\\r" or "\\\r" to match a carriage return (CR).

String s = "\n";
System.out.println(s.replaceAll("\n", "LF")); // => LF
System.out.println(s.replaceAll("\\n", "LF")); // => LF
System.out.println(s.replaceAll("(?x)\\n", "LF")); // => LF
System.out.println(s.replaceAll("(?x)\\\n", "LF")); // => LF
System.out.println(s.replaceAll("(?x)\n", ""));
// =>
//

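Why does the last one still print the newline? Because "(?x)\n" is equal to "", an empty pattern: in comments mode the unescaped LF is ignored, so the pattern matches the empty string before the newline and after it, and replacing those empty matches with "" leaves the string unchanged.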
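To make the empty-pattern behaviour visible, here is a small sketch that records where each pattern matches inside the one-character string "\n". The class CommentsModeDemo and the helper matchStarts are hypothetical names introduced for this illustration; Pattern, Matcher.find and Matcher.start are the standard java.util.regex API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentsModeDemo {
    // Returns the start offset of every match of regex in input.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        String s = "\n";
        // Raw LF is ignored in comments mode, leaving an empty pattern:
        System.out.println(matchStarts("(?x)\n", s));  // [0, 1]
        // A truly empty pattern matches at the same two positions:
        System.out.println(matchStarts("", s));        // [0, 1]
        // The regex escape survives comments mode and matches the LF itself:
        System.out.println(matchStarts("(?x)\\n", s)); // [0]
    }
}
```

The two zero-length matches at offsets 0 and 1 are the "before" and "after" positions mentioned above; using a visible replacement such as "-" instead of "" would yield "-\n-".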
