Regular Expressions

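I. Overview

A regular expression is a pattern used to match strings.
  Form: character{quantity}position
  Uses: validation (matching), splitting, searching, and replacing.
  Common symbols: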
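A few of the standard java.util.regex constructs can be sketched as follows. This is only a small selection chosen for illustration, not the author's original list; the class name CommonSymbolsDemo and the helper isMatch are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: a small selection of standard regex constructs.
class CommonSymbolsDemo {
    // Returns true when the whole input matches the pattern.
    static boolean isMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(isMatch("[abc]", "b"));       // character class: one of a, b, c
        System.out.println(isMatch("[^abc]", "d"));      // negated class: anything except a, b, c
        System.out.println(isMatch("\\d{3}", "123"));    // \d is a digit, {3} means exactly three
        System.out.println(isMatch("\\w+", "user_01"));  // \w is a word character, + means one or more
        System.out.println(isMatch("a\\s*b", "a   b"));  // \s is whitespace, * means zero or more
        System.out.println(isMatch("colou?r", "color")); // ? means zero or one
        System.out.println(isMatch(".{2,4}", "xyz"));    // . is any character, {2,4} means two to four
    }
}
```

Every line prints true. Note that each backslash is doubled inside a Java string literal.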

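II. Applications

1. Matching

The String class provides the method boolean matches(String regex). It tests the entire string against the pattern; as soon as any position fails to fit the rule, matching stops and the method returns false.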


public class MatchesDemo {
	/*
	 Validate a QQ number.
	 Rules: 5 to 15 digits, must not start with 0, digits only.
	 */
	// Approach 1: without regular expressions
	public static void qqCheck_1(String qq){
		if(!qq.startsWith("0"))
		{
			if(qq.length() >= 5 && qq.length() <= 15){
				try{
					// parseLong throws if qq contains a non-digit character
					// (it does accept a leading + or -, which the regex version rejects)
					Long.parseLong(qq);
					System.out.println("QQ number is valid");
				}catch(NumberFormatException e){
					System.out.println("Contains illegal characters!");
				}
			}
			else
				System.out.println("Illegal length!");
		}
		else
			System.out.println("A number cannot start with 0!");
	}
	// Approach 2: with a regular expression
	public static void qqCheck_2(String qq){
		// first digit 1-9, followed by 4 to 14 more digits
		String regex = "[1-9]\\d{4,14}";
		if (qq.matches(regex))
			System.out.println("QQ number is valid");
		else
			System.out.println(qq+": illegal number!");
	}
	/*
	 Validate a mobile phone number.
	 Rules: starts with 13, 15 or 18, 11 digits in total.
	 */
	public static void phoneCheck(String phone){
		String regex = "1[358]\\d{9}";
		if(phone.matches(regex))
			System.out.println("Phone number is valid");
		else
			System.out.println("Invalid phone number!");
	}
	public static void main(String[] args){
		String qq = "1234234";
		qqCheck_1(qq);
		qqCheck_2(qq);
		
		String phone = "13812345678";
		phoneCheck(phone);
	}
}
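Because matches() tests the whole string, it behaves differently from a search for a matching substring. The sketch below contrasts the two using the QQ pattern from above; the class name WholeMatchDemo and its helper methods are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: whole-string matching versus substring search.
class WholeMatchDemo {
    static final Pattern QQ = Pattern.compile("[1-9]\\d{4,14}");

    // True only when the entire string fits the pattern (same as String.matches).
    static boolean matchesWhole(String s) {
        return QQ.matcher(s).matches();
    }

    // True when any substring fits the pattern.
    static boolean containsMatch(String s) {
        return QQ.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("1234234"));   // true
        System.out.println(matchesWhole("012345"));    // false: starts with 0
        System.out.println(containsMatch("012345"));   // true: the substring 12345 fits
        System.out.println(matchesWhole("qq:1234234")); // false: extra prefix
    }
}
```

For validation, whole-string matching is the one to use; a substring search would accept "012345".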

2. Splitting

The String class provides the method String[] split(String regex).

public class SplitDemo {
	public static void main(String[] args){
		// Split on "." (the dot must be escaped because it is a metacharacter)
		String regex1 = "\\.";
		String[] arr = "192.168.1.62".split(regex1);
		print(arr);
		// Split on spaces; + means one or more spaces in a row
		String regex2 = " +";
		arr = "God helps    those  who    help themselves".split(regex2);
		print(arr);
		// Split on runs of a repeated character (the same character two or more times in a row)
		String regex3 = "(.)\\1+";
		arr = "ItthhiinnkkkkiiloveUUUu".split(regex3);
		print(arr);
	}
	public static void print(String[] arr){
		for (String s: arr)
			System.out.println(s);
	}
}
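split() has a few edge cases worth knowing: trailing empty strings are dropped unless a negative limit is passed, a leading empty string is kept, and adjacent matches produce empty strings in between. A minimal sketch; the class name SplitEdgeCases and the helper splitOnRuns are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class: edge cases of String.split.
class SplitEdgeCases {
    // Same rule as regex3 above: split on runs of a repeated character.
    static String[] splitOnRuns(String s) {
        return s.split("(.)\\1+");
    }

    public static void main(String[] args) {
        // Trailing empty strings are removed by default.
        System.out.println(Arrays.toString("a,b,,".split(",")));     // [a, b]
        // A negative limit keeps them.
        System.out.println(Arrays.toString("a,b,,".split(",", -1))); // [a, b, , ]
        // A leading empty string is kept.
        System.out.println(Arrays.toString(",a,b".split(",")));      // [, a, b]
        // Adjacent runs leave empty strings between them.
        System.out.println(Arrays.toString(splitOnRuns("ItthhiinnkkkkiiloveUUUu")));
    }
}
```

This explains the blank lines printed by the third example in SplitDemo: the runs tt, hh, ii, nn, kkkk and ii sit next to each other, so the pieces between them are empty.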

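3. Replacement

The String class provides the method String replaceAll(String regex, String replacement).
  The text captured by a group can be retrieved with $n, where n is the group number. Inside a regular expression $ marks the end of a line, so it cannot refer to a group there (the pattern uses \n for that); $n is used in the replacement string.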
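The two kinds of group reference can be sketched side by side: \1 inside the pattern and $1, $2, $3 inside the replacement. The class name GroupReferenceDemo, its methods, and the sample inputs are hypothetical and only serve as an illustration.

```java
import java.util.regex.Matcher;

// Hypothetical demo class: group references in the pattern and in the replacement.
class GroupReferenceDemo {
    // \1 in the pattern refers back to what group 1 matched; $1 in the replacement inserts it.
    static String collapseRepeats(String s) {
        return s.replaceAll("(.)\\1+", "$1");
    }

    // $1..$3 in the replacement reorder the three captured groups.
    static String reorderDate(String s) {
        return s.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
    }

    // A literal $ in the replacement must be escaped; $0 is the whole match.
    static String toPrice(String s) {
        return s.replaceAll("\\d+", Matcher.quoteReplacement("$") + "$0");
    }

    public static void main(String[] args) {
        System.out.println(collapseRepeats("aabbbc"));  // abc
        System.out.println(reorderDate("2018-05-25"));  // 25/05/2018
        System.out.println(toPrice("cost 15"));         // cost $15
    }
}
```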

public class ReplaceDemo {
	public static void main(String[] args){
		// Replace every run of 3 or more consecutive digits with #
		String regex1 = "\\d{3,}";
		String s1 = "God323is594a490girl";
		s1 = s1.replaceAll(regex1, "#");
		System.out.println(s1);
		// Collapse each run of a repeated character into one; $1 is the character captured by group 1
		String regex2 = "(.)\\1+";
		String s2 = "Itiisssabeeauttiffffffuuullnniighhhttt";
		s2 =s2.replaceAll(regex2, "$1");
		System.out.println(s2);
	}
}

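4. Extraction

Steps:
  1) Compile the regular expression into an object.
  2) Associate that object with the string to operate on.
  3) The association yields a matcher, the regex matching engine.
  4) Use the matcher to work with the substrings that fit the rule, for example to retrieve them.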

import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class PatternDemo {
	public static void main(String[] args){
		String s = "Tomorrow is a good day, and we plan to have a trip"; 
		// Extract the four-letter (lowercase) words from the string
		String regex = "\\b[a-z]{4}\\b";
		get(s, regex);
	}
	public static void get(String s, String regex){
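		// Compile the rule into a Pattern object
		Pattern p = Pattern.compile(regex);
		// Associate the pattern with the target string and obtain a Matcher
		Matcher m = p.matcher(s);
		// find() applies the rule to the string and looks for the next matching substring
		while(m.find()){
			// group() returns the substring that was just matched
			System.out.println(m.group());
			// start() is the index of the first matched character; end() is the index just past the last one
			System.out.println(m.start()+"..."+m.end());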
		}
	}
}
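Besides group(), which returns the whole match, Matcher.group(n) returns the text captured by the n-th pair of parentheses. A minimal sketch under assumed input; the class name GroupExtractDemo, the key=value pattern, and the sample string are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: reading individual capturing groups from each match.
class GroupExtractDemo {
    // Group 1 is the key, group 2 is the numeric value.
    static final Pattern KEY_VALUE = Pattern.compile("(\\w+)=(\\d+)");

    static List<String> extract(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = KEY_VALUE.matcher(s);
        while (m.find()) {
            out.add(m.group(1) + " -> " + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(extract("width=640, height=480")); // [width -> 640, height -> 480]
    }
}
```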

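III. Examples

Tips for choosing an operation:
  1) To find out only whether a string is valid or not, use matching.
  2) To turn an existing string into another string, use replacement.
  3) To break a string into several strings in a custom way, use splitting. It yields the substrings that fall outside the rule.
  4) To pull out the substrings that meet a requirement, use extraction. It yields the substrings that fit the rule.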
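The four choices above can be sketched on one sample string. The class name FourOperationsDemo, its helper methods, and the phone-like sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: matching, replacement, splitting and extraction on one input.
class FourOperationsDemo {
    static final String PHONE = "\\d{3}-\\d{4}";

    // 1) Matching: is the whole string valid?
    static boolean isPhone(String s) {
        return s.matches(PHONE);
    }

    // 2) Replacement: turn the string into another string.
    static String mask(String s) {
        return s.replaceAll("\\d{4}", "****");
    }

    // 3) Splitting: keep what lies between the separators.
    static String[] fields(String s) {
        return s.split(" +");
    }

    // 4) Extraction: keep the substrings that fit the rule.
    static List<String> phones(String s) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(PHONE).matcher(s);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String s = "tel: 010-1234  010-5678";
        System.out.println(isPhone("010-1234"));          // true
        System.out.println(mask(s));                      // tel: 010-****  010-****
        System.out.println(Arrays.toString(fields(s)));   // [tel:, 010-1234, 010-5678]
        System.out.println(phones(s));                    // [010-1234, 010-5678]
    }
}
```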

/*
 Remove the "." characters from the string and collapse repeated characters into a single one
 */
public class ReplaceTest {
	public static void main(String[] args){
		String s = "我我...有...蛇蛇蛇....蛇蛇....精精精....病....病病.....";
		String regex = "\\.+";
		s = s.replaceAll(regex, "");
		regex = "(.)\\1+";
		s = s.replaceAll(regex, "$1");
		System.out.println(s);
	}
}
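/*
 Sort IP addresses in segment order
 Approach:
 1. Pad every segment to 3 digits
 2. Sort with a collection
 3. Strip the zeros that were padded in front of each segment
 */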
import java.util.TreeSet;
public class IPSortTest {
	public static void main(String[] args){
		String ip = "192.68.1.254 102.49.23.013 10.10.10.10 2.2.2.2 8.109.90.301";
		String regex = "(\\d+)";
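		ip = ip.replaceAll(regex,"00$1"); // pad so that every segment has at least 3 digits
		System.out.println(ip);
		regex = "0*(\\d{3})";
		ip = ip.replaceAll(regex,"$1"); // keep only the last 3 digits of each segment
		System.out.println(ip);
		regex = " ";
		String[] arr = ip.split(regex); // split on spaces
		// Use a TreeSet so the elements are sorted by their natural (string) ordering
		TreeSet<String> ts = new TreeSet<String>();
		for(String str: arr){
			ts.add(str);
		}
		// Strip the leading zeros of each segment; \d+ captures the rest of the segment
		// so that zeros inside it (as in 102) are preserved
		regex = "0*(\\d+)";
		for(String s: ts){
			System.out.println(s.replaceAll(regex, "$1"));
		}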
	}
}
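As a point of comparison, the same ordering can be obtained without padding by comparing the segments numerically. This is an alternative sketch, not the approach used above: the class name IpNumericSort and the helper toKey are hypothetical, and unlike the TreeSet version it keeps duplicates and leaves each address as originally written.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical demo class: sort dotted addresses by the numeric value of their segments.
class IpNumericSort {
    // Packs the segments into one number, three decimal digits per segment.
    // Assumes every segment has at most 3 digits, as in the sample data.
    static long toKey(String ip) {
        long key = 0;
        for (String part : ip.split("\\.")) {
            key = key * 1000 + Integer.parseInt(part);
        }
        return key;
    }

    static String[] sort(String ips) {
        String[] arr = ips.trim().split(" +");
        Arrays.sort(arr, Comparator.comparingLong(IpNumericSort::toKey));
        return arr;
    }

    public static void main(String[] args) {
        String ip = "192.68.1.254 102.49.23.013 10.10.10.10 2.2.2.2 8.109.90.301";
        for (String s : sort(ip)) {
            System.out.println(s);
        }
    }
}
```

The padding approach has the advantage of needing nothing but replaceAll and a sorted set, which is why it makes a good regex exercise.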
/*
 Validate an email address
 Required form: xxxxx@xx(.xx), where the part in parentheses occurs 1 to 3 times
 */
public class CheckMail {
	public static void main(String[] args){
		String mail = "123sfsl7@sina.com.cn";
		String regex = "\\w+@[a-zA-Z0-9]+(\\.[a-zA-Z]+){1,3}";
		System.out.println(mail.matches(regex));
	}
}
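To see where the pattern draws the line, it helps to run it against a few inputs that should fail. A minimal sketch; the class name MailCheckDemo and the sample addresses other than the one above are hypothetical. The pattern is a deliberate simplification and does not cover everything real email syntax allows.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: the email rule from CheckMail applied to several inputs.
class MailCheckDemo {
    // Compiled once and reused; same rule as in CheckMail.
    static final Pattern MAIL = Pattern.compile("\\w+@[a-zA-Z0-9]+(\\.[a-zA-Z]+){1,3}");

    static boolean isValid(String mail) {
        return MAIL.matcher(mail).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("123sfsl7@sina.com.cn")); // true
        System.out.println(isValid("user@host"));            // false: no ".xx" part
        System.out.println(isValid("a@b.c.d.e.f"));          // false: four ".xx" parts
        System.out.println(isValid("@sina.com"));            // false: empty user name
    }
}
```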
/*
Web crawler (spider)
*/
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class Spider {
	public static void main(String[] args) throws Exception{
		getWebMail();
	}
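	public static void getWebMail() throws Exception{
		// Wrap the page address
		URL url = new URL("http://tieba.baidu.com/p/1390896758");
		// Connect to the server
		URLConnection conn = url.openConnection();
		// Regular expression for email addresses
		String regex = "\\w+@\\w+(\\.\\w+)+";
		Pattern p = Pattern.compile(regex); // compile the regular expression
		// Buffered reader over the page; try-with-resources closes it when done
		try (BufferedReader br = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
			String line = null;
			// Read the page line by line
			while((line = br.readLine()) != null){
				// Associate the pattern with the line
				Matcher m = p.matcher(line);
				// Look for email addresses
				while (m.find()){
					System.out.println(m.group()); // print each address found
				}
			}
		}
	}
	// Extract the email addresses from a given file, using Pattern and Matcher.
	public static void getFileMail() throws Exception{
		// Wrap the file in an object
		File file = new File("E:\\mail.txt");
		// Regular expression (the second character class needs [ rather than ( to compile)
		String regex = "\\w+@[a-zA-Z]+(\\.[a-zA-Z]+)+";
		// Create the Pattern object that holds the regular expression
		Pattern p = Pattern.compile(regex);
		// Buffered reader over the file; try-with-resources closes it when done
		try (BufferedReader br = new BufferedReader(new FileReader(file))) {
			String line = null;
			// Read the file line by line
			while((line = br.readLine()) != null){
				// Associate the pattern with the line
				Matcher m = p.matcher(line);
				while(m.find()){ // look for matching substrings
					System.out.println(m.group()); // print each match
				}
			}
		}
	}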
}
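The extraction loop in Spider can be tried without a network connection or a file by feeding it lines held in memory. A minimal sketch; the class name MailExtractor, the method extract, and the sample lines are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: the spider's find loop applied to in-memory lines.
class MailExtractor {
    // Same rule as in getWebMail.
    static final Pattern MAIL = Pattern.compile("\\w+@\\w+(\\.\\w+)+");

    // Collects every address found in the given lines, in order of appearance.
    static List<String> extract(List<String> lines) {
        List<String> found = new ArrayList<>();
        for (String line : lines) {
            Matcher m = MAIL.matcher(line);
            while (m.find()) {
                found.add(m.group());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "contact: tom@example.com, amy@mail.example.org.",
                "no address on this line");
        System.out.println(extract(lines)); // [tom@example.com, amy@mail.example.org]
    }
}
```

The trailing comma and full stop are left out of the results because neither is a word character.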