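import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookBehindDemo {
    public static void main(String[] args) {
        // Pattern shape: (?<=A)B(?=C)
        // Find a string that starts with A, ends with C, and has B in between, then match only B.
        String input = "<html>xxxxx</html>";
        // The unbounded \\w+ inside the look-behind is what triggers the exception reported below.
        Pattern p = Pattern.compile("(?<=<(\\w+)>).*(?=<\\/\\1>)");
        Matcher m = p.matcher(input);
        System.out.println(m.find());
        System.out.println(m.group());
    }
}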
Exception: Look-behind group does not have an obvious maximum length
Related posts I found online:
http://www.bennadel.com/blog/1132-REMatchGroup-UDF-To-Return-Only-Specified-Group-In-RegEx-Pattern.htm
http://stackoverflow.com/questions/1971652/existence-of-obvious-maximum-length-of-look-behind-group-in-java
http://stackoverflow.com/questions/1536915/regex-look-behind-without-obvious-maximum-length-in-java
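Workaround: give the quantifier inside the look-behind an explicit upper bound.
Pattern p = Pattern.compile("(?<=<(\\w{0," + (Integer.MAX_VALUE - 1) + "})>).*(?=<\\/\\1>)");
1. Why the bound has to be "Integer.MAX_VALUE-1" rather than "Integer.MAX_VALUE" is still unclear to me; I will look into it later.
2. Replacing (Integer.MAX_VALUE-1) with a concrete number also works.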
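As a rough illustration of note 2, here is a minimal sketch that uses a small concrete bound instead of Integer.MAX_VALUE-1. The limit of 32 characters for a tag name, the class name BoundedLookBehind, and both helper method names are my own assumptions for the example, not something from the posts linked above. The second helper is a different technique: it drops the look-behind entirely and reads the inner text from a capturing group, which avoids the maximum-length question altogether.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundedLookBehind {
    // Assumption for this sketch: tag names are at most 32 word characters long.
    private static final int MAX_TAG_LENGTH = 32;

    // Same look-behind / look-ahead shape as the workaround, with a small explicit bound.
    private static final Pattern LOOK_AROUND =
            Pattern.compile("(?<=<(\\w{1," + MAX_TAG_LENGTH + "})>).*(?=</\\1>)");

    // Alternative without look-behind: the inner text is captured in group 2.
    private static final Pattern CAPTURING =
            Pattern.compile("<(\\w+)>(.*)</\\1>");

    /** Returns the text between matching tags using look-around, or null if nothing matches. */
    static String innerByLookAround(String input) {
        Matcher m = LOOK_AROUND.matcher(input);
        return m.find() ? m.group() : null;
    }

    /** Returns the text between matching tags using a capturing group, or null if nothing matches. */
    static String innerByGroup(String input) {
        Matcher m = CAPTURING.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "<html>xxxxx</html>";
        System.out.println(innerByLookAround(input));
        System.out.println(innerByGroup(input));
    }
}
```

If the tag names can be longer than the chosen bound, the look-around version simply stops matching them, so the capturing-group version is probably the safer choice when the length is not known in advance.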
Appendix:
In a CSDN thread, the user "火龙果" gives a very good explanation of regular expressions. The link is:
http://topic.csdn.net/u/20080325/17/fb7a3e8d-029a-4d8e-89ae-77a9d28ec301.html
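I hear "火龙果" is very good at regular expressions, so I should visit CSDN more often to learn from those threads!
Another place to learn regular expressions: http://www.regexlab.com/zh/regref.htm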