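In JDK 6 and earlier, Java regular expressions did not support named capturing groups; a group could only be accessed by its index. When a regular expression is complex and contains many capturing and non-capturing groups, working out a group's index by counting opening parentheses from left to right is tedious. It also hurts readability, and editing the regular expression can change the group indexes that existing code depends on.
The solution is to give capturing groups names, as the regular expression flavors of Python, PHP, .NET, and Perl do.
JDK 7 adds the following support for named capturing groups: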
(1) (?<NAME>X) defines a named group "NAME"
(2) \k<NAME> is a back-reference to the named group "NAME"
(3) ${NAME} refers to the captured group in the matcher's replacement string
(4) group(String NAME) returns the input subsequence captured by the named group "NAME"
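The four constructs above can be seen together in one small sketch. It is not taken from the original post: the class name NamedGroupSyntaxDemo, the group name "word", and the repeated-word task are assumptions chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the java.util.regex calls are real API.
public class NamedGroupSyntaxDemo {
    // (?<word>...) defines the group; \k<word> requires the same text to appear again.
    private static final Pattern REPEATED_WORD =
            Pattern.compile("\\b(?<word>\\w+)\\s+\\k<word>\\b");

    // Returns the first immediately repeated word, or null if there is none.
    static String firstRepeatedWord(String text) {
        Matcher m = REPEATED_WORD.matcher(text);
        return m.find() ? m.group("word") : null;
    }

    // Collapses each repeated pair to a single word via ${word} in the replacement string.
    static String collapseRepeats(String text) {
        return REPEATED_WORD.matcher(text).replaceAll("${word}");
    }

    public static void main(String[] args) {
        String text = "this is is a test test";
        System.out.println(firstRepeatedWord(text));
        System.out.println(collapseRepeats(text));
    }
}
```

The trailing \b keeps a prefix such as "is" in "is island" from counting as a repeat.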
The following examples compare the two styles:
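// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static void indexedCaptureTest() { // indexed access, the only option before JDK 7
    String names = "fred or barney";
    Matcher m = Pattern.compile("(\\w+) or (\\w+)").matcher(names);
    if (m.find()) {
        // Groups are numbered by their opening parenthesis, starting at 1.
        System.out.println(m.group(1) + "," + m.group(2));
    }
}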
public static void namedCaptureTest() { // JDK 7 allows capturing groups to be named
    String names = "fred or barney";
    Matcher m = Pattern.compile("(?<name1>\\w+) or (?<name2>\\w+)").matcher(names);
    if (m.find()) {
        // The same captures, looked up by name instead of by position.
        System.out.println(m.group("name1") + "," + m.group("name2"));
    }
}
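Named groups in Java are still numbered like ordinary capturing groups, so both access styles can be mixed on the same pattern. The sketch below illustrates this with the pattern from the example above; the class name NamedAndIndexedDemo and the helper byNameAndIndex are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class showing that a named group also keeps its index.
public class NamedAndIndexedDemo {
    private static final Pattern PAIR =
            Pattern.compile("(?<name1>\\w+) or (?<name2>\\w+)");

    // Returns each capture twice: once looked up by name, once by index.
    static String[] byNameAndIndex(String input) {
        Matcher m = PAIR.matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        return new String[] {
            m.group("name1"), m.group(1),
            m.group("name2"), m.group(2)
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(byNameAndIndex("fred or barney")));
    }
}
```

Asking for a name that the pattern does not define, such as m.group("name3") here, throws an IllegalArgumentException.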
Now consider an example that uses a back-reference and a replacement string:
String input = "aabbbccdddef";
How can this string be split into the array [aa, bbb, cc, ddd, e, f]?
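public static void indexedCaptureReplace() {
    String input = "aabbbccdddef";
    // Group 2 holds the last character consumed; the lazy group 1 grows
    // until the next character differs from it (or the input ends).
    String regex = "((.)+?)(?!\\2)";
    String temp = input.replaceAll(regex, "$1,");
    // split drops the trailing empty string left by the final comma.
    String[] arr = temp.split(",");
    System.out.println(java.util.Arrays.toString(arr));
}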
public static void namedCaptureReplace() {
    String input = "aabbbccdddef";
    String regex = "(?<name2>(?<name1>.)+?)(?!\\k<name1>)"; // an ugly implementation!
    String temp = input.replaceAll(regex, "${name2},");
    String[] arr = temp.split(",");
    System.out.println(java.util.Arrays.toString(arr));
}
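A possibly tidier alternative, not from the original post, is to match each run directly and collect the matches with find(), which avoids the intermediate comma-joined string and so also works when the input itself contains commas. The class name RunSplitter, the method splitRuns, and the group name "ch" are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that splits a string into runs of identical characters.
public class RunSplitter {
    // One character, then as many repeats of that same character as possible.
    private static final Pattern RUN = Pattern.compile("(?<ch>.)\\k<ch>*");

    static List<String> splitRuns(String input) {
        List<String> runs = new ArrayList<String>();
        Matcher m = RUN.matcher(input);
        while (m.find()) {
            runs.add(m.group()); // the whole match is one run
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(splitRuns("aabbbccdddef"));
    }
}
```

Because "." does not match line terminators by default, newline characters in the input are skipped unless the pattern is compiled with Pattern.DOTALL.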
Reference:
http://blog.csdn.net/goldenfish1919/article/details/7317962