Java regular expression cheat-sheet notes


Regular expressions in Java serve two main purposes: checking whether a string matches a pattern, and extracting part of a string with a pattern.

Checking whether a string matches a pattern

		String str = "33as";
		// Pattern.matches() tests the whole string, so the trailing $ is redundant here
		String pattern = "\\d{2}.*$";
		System.out.println(Pattern.matches(pattern, str)); // true
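
Pattern.matches() only succeeds when the entire input matches, which is easy to confuse with a substring search. The sketch below contrasts it with Matcher.find(); the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class MatchVsFind {
    // Whole-string test: true only if the entire input matches the pattern.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Substring test: true if the pattern matches anywhere in the input.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("\\d{2}", "33as"));    // false
        System.out.println(containsMatch("\\d{2}", "33as")); // true
        System.out.println(wholeMatch("\\d{2}.*", "33as"));  // true
    }
}
```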

Extracting substrings with a regular expression

Simple extraction

		String str = "a112b234c543d";
		String pattern = "\\d+";
		
		Pattern r = Pattern.compile(pattern);
		
		Matcher matcher = r.matcher(str);
		
		while(matcher.find())
			System.out.println(matcher.group());

Output

112
234
543

Capture groups

Capture groups are read with the find() and group() methods.

	private static void common3() {
		// extract each match and print its capture groups
		String str = "a123b456c7890d";
		String pattern = "(\\d)\\d(\\d+)";
		
		Pattern r = Pattern.compile(pattern);
		
		Matcher matcher = r.matcher(str);
		
		while(matcher.find()){
			System.out.println(matcher.group());
			// capture groups are numbered from 1 to groupCount()
			for(int i = 1; i <= matcher.groupCount(); i++){
				System.out.println(matcher.group(i));
			}
		}
	}

Output

123
1
3
456
4
6
7890
7
90

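group() returns the text matched by the whole expression (the same as group(0)), and group(int i) returns the text matched by the i-th pair of parentheses, counted from 1. In the example above, the first pair of parentheses in (\\d)\\d(\\d+) captures the first digit and is read with group(1); the second pair captures the third digit and everything after it and is read with group(2).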
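
To make the numbering concrete, here is a small sketch that collects group(0) through group(groupCount()) for the first match of the same pattern. The helper name firstMatchGroups is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupIndexDemo {
    // Returns group(0), group(1), ..., group(groupCount()) of the first match,
    // or an empty list when nothing matches.
    static List<String> firstMatchGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> groups = new ArrayList<>();
        if (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // index 0 is the whole match, 1 and 2 are the two parenthesized groups
        System.out.println(firstMatchGroups("(\\d)\\d(\\d+)", "a123b456c7890d")); // [123, 1, 3]
    }
}
```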

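Non-capturing group (?:pattern)

Matches pattern but does not store the matched text as a capture group.

It can be read as a way of grouping alternatives written with | without capturing them:
asd(?:g|fg) is equivalent to asdg|asdfg
asd(?:g|fg)(?:gh|h) is equivalent to asdggh|asdgh|asdfggh|asdfgh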

	private static void v1(){
		String str = "asdfgh";
		// equivalent to asdggh|asdgh|asdfggh|asdfgh
		String pattern = "asd(?:g|fg)(?:gh|h)";
		
		Pattern r = Pattern.compile(pattern);
		
		Matcher matcher = r.matcher(str);
		
		while(matcher.find()){
			System.out.println(matcher.group());
			System.out.println(matcher.groupCount());
		}
	}

Output

asdfgh
0

The parenthesized subexpressions take part in the match, but nothing is stored for them: groupCount() is 0.
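
As a quick comparison, the following sketch counts the capture groups of the same expression written with capturing and with non-capturing parentheses. The class name is invented for the example; Matcher.groupCount() reports the groups defined by the pattern, so no match has to be performed first.

```java
import java.util.regex.Pattern;

final class GroupCountDemo {
    // Number of capturing groups declared in the pattern.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(groupCount("asd(g|fg)(gh|h)"));     // 2
        System.out.println(groupCount("asd(?:g|fg)(?:gh|h)")); // 0
    }
}
```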

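Positive lookahead (?=pattern), non-capturing

windows(?=\\d{3}) matches windows only where it is immediately followed by three digits. The digits are checked but are not part of the match.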

	private static void v3(){
		String str = "windows123";
		String pattern = "windows(?=\\d{3})";
		
		Pattern r = Pattern.compile(pattern);
		
		Matcher matcher = r.matcher(str);
		
		while(matcher.find()){
			System.out.println(matcher.group());
		}
	}

Output

windows
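
Because a lookahead consumes no characters, a replacement touches only the word itself. This is a minimal sketch under that reading; the names shorten and firstMatchEnd are assumptions for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookaheadDemo {
    private static final Pattern WINDOWS_3_DIGITS = Pattern.compile("windows(?=\\d{3})");

    // Replaces only the matched word; the digits checked by the lookahead stay in place.
    static String shorten(String input) {
        return WINDOWS_3_DIGITS.matcher(input).replaceAll("win");
    }

    // End offset of the first match, or -1 when there is none.
    static int firstMatchEnd(String input) {
        Matcher m = WINDOWS_3_DIGITS.matcher(input);
        return m.find() ? m.end() : -1;
    }

    public static void main(String[] args) {
        System.out.println(shorten("windows123 windows7")); // win123 windows7
        System.out.println(firstMatchEnd("windows123"));    // 7
    }
}
```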
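
Negative lookahead (?!pattern), non-capturing

"Negative" here means that the condition is negated, not that the search runs in the opposite direction.
linux(?!\\d{2}) matches the linux in linux1, but not the one in linux12 or linux123.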

	private static void v4(){
		String str = "linux1";
		String pattern = "linux(?!\\d{2})";
		
		Pattern r = Pattern.compile(pattern);
		
		Matcher matcher = r.matcher(str);
		
		while(matcher.find()){
			System.out.println(matcher.group());
		}
	}

Output

linux
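
The behaviour described above can be checked against several inputs at once. The sketch below uses a hypothetical helper named found and the same pattern.

```java
import java.util.regex.Pattern;

final class NegativeLookaheadDemo {
    private static final Pattern LINUX_NOT_2_DIGITS = Pattern.compile("linux(?!\\d{2})");

    // True if "linux" occurs somewhere without two digits directly after it.
    static boolean found(String input) {
        return LINUX_NOT_2_DIGITS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("linux"));    // true
        System.out.println(found("linux1"));   // true
        System.out.println(found("linux12"));  // false
        System.out.println(found("linux123")); // false
    }
}
```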
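
Backreferences

\1 is used inside the regular expression itself and stands for the text matched by the first capture group.
For example, matching two identical consecutive characters: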

		String content = "aa";
		String pattern = "(.)\\1";
		boolean isMatch = Pattern.matches(pattern, content);
		System.out.println(isMatch); // true

Checking whether a string contains a run of repeated characters

		String content = "asdffjhkjkk";
		String pattern = ".*(.)\\1+.*";
		boolean isMatch = Pattern.matches(pattern, content);
		System.out.println(isMatch); // true
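
Backreferences work on whole words as well as single characters. As an illustration not taken from the original notes, this sketch lists words that are typed twice in a row; the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RepeatedWords {
    // A word, some whitespace, then the same word again (\1 refers to group 1).
    private static final Pattern DOUBLED = Pattern.compile("\\b(\\w+)\\s+\\1\\b");

    static List<String> doubledWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher m = DOUBLED.matcher(text);
        while (m.find()) {
            words.add(m.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(doubledWords("this is is a test test case")); // [is, test]
    }
}
```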

Practice exercises

Extract the href of every <a> tag

	private static void test1(){
		String str = "<a href='localhost:9999'></a>\r\r\n  ";
		str += "<a id='a1' href=\"https://www.baidu.com\" class=\"a-cs1\">bbb</a>";
		
//		localhost:9999
//		https://www.baidu.com
		
		str = str.replace("\"", "'");
		
		String pattern = "<a(?:[^>]*)href='([^']*)";
		Pattern r = Pattern.compile(pattern);
		Matcher matcher = r.matcher(str);
		
		while(matcher.find()){
//			System.out.println(matcher.group());
			for(int i = 0; i < matcher.groupCount(); i++){
				System.out.println(matcher.group(i + 1));
			}
		}
			
	}
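
The exercise above normalizes double quotes to single quotes before matching. A possible alternative, sketched here with a backreference, accepts either quote style directly; the class name HrefExtractor is an assumption, and a regular expression like this only suits simple, well-formed snippets rather than arbitrary HTML.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HrefExtractor {
    // Group 1 is the opening quote, group 2 the URL, \1 requires the same closing quote.
    private static final Pattern HREF = Pattern.compile("<a\\b[^>]*?\\bhref=(['\"])(.*?)\\1");

    static List<String> hrefs(String html) {
        List<String> result = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            result.add(m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<a href='localhost:9999'></a>\r\r\n  "
                + "<a id='a1' href=\"https://www.baidu.com\" class=\"a-cs1\">bbb</a>";
        System.out.println(hrefs(html)); // [localhost:9999, https://www.baidu.com]
    }
}
```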

Extract the href of every <a> tag whose class contains a-cs1

	private static void test2(){
		String str = "<a href='localhost:9999' class=\"a-cs2\"></a>\r\r\n  ";
		str += "<a id='a1' href=\"https://www.baidu.com\" class=\"a-cs1\">bbb</a>";
		str += "<a id='a2' href=\"https://bbs.csdn.net/forums/Java\" class=\"a-cs1; rrwer\">c</a>";
		
//		https://www.baidu.com
//		https://bbs.csdn.net/forums/Java
		
		str = str.replace("\"", "'");
		
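		// [^']* keeps the search inside the class attribute value. A class such as [^a-cs1]
		// would exclude the single characters a, b, c, s and 1, not the text a-cs1.
		// The trailing [^>]*> extends the match to the end of the opening tag.
		String pattern = "<a\\b[^>]*class='[^']*a-cs1[^']*'[^>]*>";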
		Pattern r = Pattern.compile(pattern);
		Matcher matcher = r.matcher(str);
		
		while(matcher.find()){
//			System.out.println(matcher.group());
			String temp = matcher.group();
			
			String pattern2 = "href='(.*?)'";
			Pattern r2 = Pattern.compile(pattern2);
			Matcher matcher2 = r2.matcher(temp);
			
			while(matcher2.find()){
				System.out.println(matcher2.group(1));
			}
		}
			
	}

Snake case to camel case

	private static void test3(){
		String str = "aaa_bbb_c_ddd";//aaaBbbCDdd
		
		String pattern = "_[a-z]";
		Pattern r = Pattern.compile(pattern);
		Matcher matcher = r.matcher(str);
		
		StringBuffer sb = new StringBuffer();
		
		while(matcher.find()){
			matcher.appendReplacement(sb, matcher.group().toUpperCase().replace("_", ""));
		}
		
		matcher.appendTail(sb);
		System.out.println(sb);
	}
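
On Java 9 or later, the appendReplacement/appendTail loop can be shortened with Matcher.replaceAll(Function). This is a sketch assuming such a JDK is available; the class name and the use of Locale.ROOT are choices made for the example.

```java
import java.util.Locale;
import java.util.regex.Pattern;

final class SnakeToCamel {
    // Group 1 is the lowercase letter that follows an underscore.
    private static final Pattern UNDERSCORE_LETTER = Pattern.compile("_([a-z])");

    static String toCamel(String snake) {
        // The lambda receives each MatchResult and returns its replacement text.
        return UNDERSCORE_LETTER.matcher(snake)
                .replaceAll(mr -> mr.group(1).toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(toCamel("aaa_bbb_c_ddd")); // aaaBbbCDdd
    }
}
```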

Camel case to snake case

	private static void test4(){
		String str = "aaaBbbCDdd";//aaa_bbb_c_ddd
		
		String pattern = "[A-Z]";
		Pattern r = Pattern.compile(pattern);
		Matcher matcher = r.matcher(str);
		
		StringBuffer sb = new StringBuffer();
		
		while(matcher.find()){
			matcher.appendReplacement(sb, "_" + matcher.group().toLowerCase());
		}
		
		matcher.appendTail(sb);
		System.out.println(sb);
	}
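
For this direction a single String.replaceAll call with a group reference may be enough. The sketch below assumes the input starts with a lowercase letter (a leading capital would produce a leading underscore); the class name is hypothetical.

```java
import java.util.Locale;

final class CamelToSnake {
    static String toSnake(String camel) {
        // Put an underscore in front of every capital letter, then lowercase everything.
        return camel.replaceAll("([A-Z])", "_$1").toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(toSnake("aaaBbbCDdd")); // aaa_bbb_c_ddd
    }
}
```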

Formatting a number in accounting style (a comma every three digits)

	private static void test6(){
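		String str = "23456789.12545"; // expected result: 23,456,789.13
		// an integer input such as 23456789 becomes 23,456,789.00
		
		// RoundingMode (java.math.RoundingMode) replaces the deprecated BigDecimal.ROUND_HALF_UP constant
		str = new BigDecimal(str).setScale(2, RoundingMode.HALF_UP).toString();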
		str = str.replaceAll("(\\d)(?=(\\d{3})+\\.)", "$1,");
		
		System.out.println(str);
	}
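
For comparison only, the same output can be produced without a regular expression by the standard formatter, whose , flag inserts grouping separators. This sketch fixes the locale to Locale.US so the separators are a comma and a dot; the class name is invented.

```java
import java.math.BigDecimal;
import java.util.Locale;

final class AccountingFormat {
    // "%,.2f" groups the integer part by thousands and rounds half up to two decimals.
    static String format(String number) {
        return String.format(Locale.US, "%,.2f", new BigDecimal(number));
    }

    public static void main(String[] args) {
        System.out.println(format("23456789.12545")); // 23,456,789.13
        System.out.println(format("23456789"));       // 23,456,789.00
    }
}
```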

Extracting the parameters from a URL

	private static void test7(){
		String str = "http://aaa.bbb.com?aa=12&b1=b2b&ccc=123";
		
//		aa = 12
//		b1 = b2b
//		ccc = 123
		
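		// inside a character class | is a literal character, so [?&] is all that is needed;
		// [^=&]+ stops a parameter name from running across an & separator
		String pattern = "[?&]([^=&]+)=([^&]+)";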
		Pattern r = Pattern.compile(pattern);
		Matcher matcher = r.matcher(str);
		
		while(matcher.find()){
			System.out.println(matcher.group(1) + " = " + matcher.group(2));
		}
		
	}
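
If the parameters are needed as data rather than printed, they can be collected into a map. A minimal sketch, assuming no URL decoding and no #fragment handling is required; the class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QueryParams {
    // Name is everything up to '=', value is everything up to the next '&'.
    private static final Pattern PARAM = Pattern.compile("[?&]([^=&]+)=([^&]*)");

    static Map<String, String> parse(String url) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps the order of appearance
        Matcher m = PARAM.matcher(url);
        while (m.find()) {
            params.put(m.group(1), m.group(2));
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(parse("http://aaa.bbb.com?aa=12&b1=b2b&ccc=123")); // {aa=12, b1=b2b, ccc=123}
    }
}
```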

Collapsing runs of repeated characters

		String str = "aaa...123321222ggg";
		String regex = "(.)\\1+";
		Matcher matcher = Pattern.compile(regex).matcher(str);
		String res = matcher.replaceAll("$1");
		System.out.println(res);//a.123212g

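A brief explanation: $1 stands for the text matched by the first pair of parentheses, so Matcher.replaceAll replaces every match of the expression "(.)\\1+" with the single character captured inside the parentheses.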
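
Group references in the replacement string can also reorder captured text, and Matcher.quoteReplacement turns the $ syntax off when a literal dollar sign is wanted. The sketch below illustrates both; the date format and method names are assumptions chosen for the example.

```java
import java.util.regex.Matcher;

final class ReplacementRefs {
    // $1, $2, $3 refer to the year, month and day groups of each match.
    static String isoToDayFirst(String text) {
        return text.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    // quoteReplacement makes "$1" a literal string instead of a group reference.
    static String digitsToLiteralDollarOne(String text) {
        return text.replaceAll("\\d+", Matcher.quoteReplacement("$1"));
    }

    public static void main(String[] args) {
        System.out.println(isoToDayFirst("due 2024-01-31"));       // due 31/01/2024
        System.out.println(digitsToLiteralDollarOne("price: 10")); // price: $1
    }
}
```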
