Using Regular Expressions in Java

-前进ヾ

Last modified 2024-04-29 23:20:02

Tags: java, regular expressions, mysql

First published 2024-04-29 03:12:56

Article link: https://blog.csdn.net/m0_57665454/article/details/138254591

Matching Rules

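For the full set of matching rules, see the 菜鸟教程 regex tutorial and 韩顺平's video series on regular expressions. Note that in Java source code the backslash of a regex escape must itself be escaped inside a string literal, so the regex \d is written as "\\d".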
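
To make the double backslash concrete, here is a minimal sketch. The class name EscapeDemo and the constant DIGITS are made up for illustration; only standard java.util.regex calls are used.

```java
import java.util.regex.Pattern;

public class EscapeDemo {
    // The regex \d+ written as a Java string literal: the backslash is doubled.
    static final String DIGITS = "\\d+";

    public static void main(String[] args) {
        // The literal holds three characters: backslash, 'd', '+'.
        System.out.println(DIGITS.length());                   // 3
        System.out.println(Pattern.matches(DIGITS, "2024"));   // true
        // Matching a literal backslash takes four backslashes in source code.
        System.out.println(Pattern.matches("a\\\\b", "a\\b")); // true
    }
}
```

The compiler turns each "\\" into one backslash, and the regex engine then interprets that single backslash as the start of an escape.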

A few things to note:

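Using ^ and $: Pattern.compile("[1-9]{6}") and Pattern.compile("^[1-9]{6}$") both match six consecutive digits in the range 1-9, but with ^ and $ the string may not contain anything else. For example, with the string "123456abc", a find() on Pattern.compile("[1-9]{6}") succeeds because the string contains six consecutive digits from 1 to 9, whereas a find() on Pattern.compile("^[1-9]{6}$") fails because the string holds other characters besides those six digits.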
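
The difference can be checked with a short sketch. The class and method names below are illustrative only; both methods use find() so that the anchors are the only thing that differs.

```java
import java.util.regex.Pattern;

public class AnchorDemo {
    // True if the string contains six consecutive digits 1-9 anywhere.
    static boolean containsSixDigits(String s) {
        return Pattern.compile("[1-9]{6}").matcher(s).find();
    }

    // True only if the string consists of six digits 1-9 and nothing else.
    static boolean isExactlySixDigits(String s) {
        return Pattern.compile("^[1-9]{6}$").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(containsSixDigits("123456abc"));  // true
        System.out.println(isExactlySixDigits("123456abc")); // false
        System.out.println(isExactlySixDigits("123456"));    // true
    }
}
```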

Backreferences

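A backreference can be thought of as a copy of the text captured by an earlier capture group. It can be used inside the regex or outside it: inside the regex it is written \\N (N being the group number), and outside it, for example in a replacement string, it is written $N. In the example code below, (\\d)\\1{2} captures one digit and then references that group twice, so it matches the same digit three times in a row.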

public void Test(){
        String content = "12321-111222333";
        //String content2 = "12321-333333333";
        Pattern compile = Pattern.compile("\\d{5}-(\\d)\\1{2}(\\d)\\2{2}(\\d)\\3{2}");
        Matcher matcher = compile.matcher(content);
        while (matcher.find()){
            // 12321-111222333
            System.out.println(matcher.group(0));
        }
    }
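
The example above uses a backreference inside the regex. As a hedged sketch of combining it with the outside form ($N in a replacement string), the following collapses runs of a repeated character. The class BackrefDemo and the method collapseRepeats are hypothetical names, not part of the original article.

```java
public class BackrefDemo {
    // (.)\1+ : one character followed by one or more copies of itself.
    // $1 in the replacement stands for that captured character.
    static String collapseRepeats(String s) {
        return s.replaceAll("(.)\\1+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseRepeats("aaabbbccd")); // abcd
    }
}
```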

Common Methods

import java.util.regex.Matcher;
import java.util.regex.Pattern;
 
public class RegexMatches
{
    public static void main( String[] args ){
 
      // Search the string for the given pattern
      String line = "This order was placed for QT3000! OK?";
      String pattern = "(\\D*)(\\d+)(.*)";
 
      // Create a Pattern object
      Pattern r = Pattern.compile(pattern);
 
      // Now create a Matcher object
      Matcher m = r.matcher(line);
      if (m.find( )) {
         System.out.println("Found value: " + m.group(0) ); // the whole line
         System.out.println("Found value: " + m.group(1) ); // This order was placed for QT
         System.out.println("Found value: " + m.group(2) ); // 3000
         System.out.println("Found value: " + m.group(3) ); // ! OK?
      } else {
         System.out.println("NO MATCH");
      }
   }
}

The Matcher Class

The find() Method

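Locates the next substring that satisfies the given pattern.
Once found, the substring's start index is recorded in the matcher's internal int[] groups array as groups[0], and the index of its last character plus 1 is recorded as groups[1].
The matcher's internal oldLast field is also set to the index of the last character plus 1, which is where the next call to find() starts matching.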
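
The internal fields are not accessible from user code, but the same behavior is visible through the public start() and end() methods. A minimal sketch (FindDemo and spans are illustrative names):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindDemo {
    // Collects "start-end" for every match; end is exclusive.
    static List<String> spans(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.start() + "-" + m.end());
        }
        return out;
    }

    public static void main(String[] args) {
        // Each search resumes where the previous match ended,
        // so the matches do not overlap.
        System.out.println(spans("aa", "aaaaa")); // [0-2, 2-4]
    }
}
```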

The group() Method

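In the JDK source of group(), the return statement takes a substring of the input. For group(0) it returns the text between the indices stored in groups[0] and groups[1], i.e. it cuts out the part of the string that matched the pattern. For example, if groups[0]=a and groups[1]=b, the substring covers [a, b). This is also why groups[1] stores the index of the last matched character plus 1. In general, group(n) reads groups[2n] and groups[2n+1].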

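For a regex that groups with parentheses ( ), such as Pattern.compile("(\\d\\d)(\\d\\d)") in the example code below, the pattern matches four consecutive digits but splits them into two groups. Stepping through in a debugger shows groups[0]=0 and groups[1]=4, so calling matcher.group(0) returns the characters of content in [0, 4), i.e. "2024". groups[2]=0 and groups[3]=2 delimit what the first group (\\d\\d) matched, "20"; groups[4]=2 and groups[5]=4 delimit what the second group (\\d\\d) matched, "24". So matcher.group(1) returns 20 and matcher.group(2) returns 24. [Note: do not confuse the indices of the internal groups array with the argument passed to Matcher's group() method.]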

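To sum up: matcher.group(0) returns the whole matched string. If the regex contains several parenthesized groups, two in this example, matcher.group(1) returns the text matched by the first group and matcher.group(2) returns the text matched by the second group.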

public void Test(){
        String content = "2024年4月25日，神舟十八号载人飞船发射取得圆满成功。";
        Pattern compile = Pattern.compile("(\\d\\d)(\\d\\d)");
        Matcher matcher = compile.matcher(content);
        while (matcher.find()){
            System.out.println(matcher.group(0)); //2024
            System.out.println(matcher.group(1)); //20
            System.out.println(matcher.group(2)); //24
        }
    }
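
The relationship between group(n) and the stored index pairs can be reproduced with public API only, since start(n) and end(n) expose the same boundaries. The sketch below is an illustration under that assumption; GroupDemo and describe are made-up names, and the input string is a simplified stand-in for the one used above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // For the first match, returns one line per group: "n: text [start, end)".
    static List<String> describe(String regex, String text) {
        List<String> lines = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        if (m.find()) {
            for (int g = 0; g <= m.groupCount(); g++) {
                // Cutting the input at [start(g), end(g)) yields group(g).
                String cut = text.substring(m.start(g), m.end(g));
                lines.add(g + ": " + cut + " [" + m.start(g) + ", " + m.end(g) + ")");
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describe("(\\d\\d)(\\d\\d)", "2024-04-25")) {
            System.out.println(line);
        }
        // 0: 2024 [0, 4)
        // 1: 20 [0, 2)
        // 2: 24 [2, 4)
    }
}
```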

Index Methods

public void Test(){
        String content = "2024年4月25日，神舟十八号载人飞船发射取得圆满成功。";
        Pattern compile = Pattern.compile("\\d\\d\\d\\d");
        Matcher matcher = compile.matcher(content);
        while (matcher.find()){
            System.out.println(matcher.group(0)); //2024
            System.out.println(matcher.start()); //0
            System.out.println(matcher.end()); //4
        }
    }

The replaceAll() Method

Replaces every substring that matches the pattern with the string passed to replaceAll() and returns the resulting new string.

public void Test(){
        String content = "2024年4月25日，神舟十八号载人飞船发射取得圆满成功。2024";
        Pattern compile = Pattern.compile("2024");
        Matcher matcher = compile.matcher(content);
        String newContent = matcher.replaceAll("二零二四");
        //二零二四年4月25日，神舟十八号载人飞船发射取得圆满成功。二零二四
        System.out.println(newContent);
    }

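String has a replaceAll() method as well, and its first argument can be a regex. Below is an example taken from an algorithm exercise. Note that replaceAll() can combine regex matching with backreferences: wrap the part of the regex you want to reuse in parentheses () so that it becomes a capture group, then refer to it in the replacement string as $1.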

public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        String input = in.next();
        String output = input.replaceAll("(\\d+)", "*$1*");
        System.out.println(output);
    }
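
Since the snippet above reads from standard input, here is the same replacement on a fixed string so the result is visible. The class and method names are illustrative. The second method shows that $0 refers to the whole match, so a capture group is only needed when you want part of the match.

```java
public class ReplaceDemo {
    // Wraps every run of digits in asterisks via capture group 1.
    static String starDigits(String s) {
        return s.replaceAll("(\\d+)", "*$1*");
    }

    // Same result without parentheses: $0 is the entire match.
    static String starDigitsWholeMatch(String s) {
        return s.replaceAll("\\d+", "*$0*");
    }

    public static void main(String[] args) {
        System.out.println(starDigits("Jkdi234klowe90a3"));           // Jkdi*234*klowe*90*a*3*
        System.out.println(starDigitsWholeMatch("Jkdi234klowe90a3")); // Jkdi*234*klowe*90*a*3*
    }
}
```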

The replaceFirst() Method

Like replaceAll(), except that replaceFirst() replaces only the first matching substring. The return value is again the string after replacement.

public void Test(){
        String content = "2024年4月25日，神舟十八号载人飞船发射取得圆满成功。2024";
        Pattern compile = Pattern.compile("2024");
        Matcher matcher = compile.matcher(content);
        String newContent = matcher.replaceFirst("二零二四");
        //二零二四年4月25日，神舟十八号载人飞船发射取得圆满成功。2024
        System.out.println(newContent);
    }

The Pattern Class

The matches() Method

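Matches the whole string and returns a boolean; it is typically used to validate user input. The entire input must match even without ^ and $, so the anchors are optional here. Because | has the lowest precedence, the alternatives must be grouped so that \\d{9} applies to every prefix: boolean flag = Pattern.matches("^(13|14|15|18)\\d{9}$", content);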
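
Alternation binds more loosely than anything else in a regex, so a pattern such as ^13|14|15|18\d{9}$ reads as four separate alternatives: ^13, 14, 15, and 18\d{9}$. The sketch below contrasts it with a grouped version; the class and constant names are made up for illustration.

```java
import java.util.regex.Pattern;

public class AlternationDemo {
    // Four alternatives: "^13", "14", "15", "18\d{9}$".
    static final String UNGROUPED = "^13|14|15|18\\d{9}$";
    // One of four prefixes, followed by nine digits.
    static final String GROUPED = "^(?:13|14|15|18)\\d{9}$";

    public static void main(String[] args) {
        System.out.println(Pattern.matches(UNGROUPED, "13"));          // true
        System.out.println(Pattern.matches(UNGROUPED, "13696581731")); // false
        System.out.println(Pattern.matches(GROUPED, "13"));            // false
        System.out.println(Pattern.matches(GROUPED, "13696581731"));   // true
    }
}
```

The (?:...) form groups without capturing; plain parentheses would work equally well here.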

String also has a matches() method; its argument is a regex and it returns a boolean. For example:

public static void main(String[] args) {
        String input = "13696581731";
        boolean flag = input.matches("13\\d{9}");
        System.out.println(flag); //true
    }

Examples

1. Simple email format validation

public static void main(String[] args) {
        String input = "qianjin1920@qq.com";
        boolean mailNotify = input.matches(".+@.+\\.com");
        if (mailNotify){
            System.out.println("Email validation succeeded");
        }else {
            System.out.println("Validation failed");
        }
    }

2. Check whether a number is an integer or a decimal

public static void main(String[] args) {
        String num1 = "0.5";
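        // Optional sign, integer part (0 or no leading zero), optional fraction;
        // the original alternation rejected inputs such as "0", "1.5" and "-0.5".
        if (num1.matches("[-+]?(?:[1-9]\\d*|0)(?:\\.\\d+)?")){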
            System.out.println("yes");
        }else {
            System.out.println("No");
        }
    }
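
To see which inputs an integer-or-decimal pattern accepts, it helps to run it over a handful of cases. The pattern below is one possible formulation and an assumption of this sketch (optional sign, integer part without leading zeros, optional fraction); NumberCheck and isNumber are illustrative names.

```java
import java.util.regex.Pattern;

public class NumberCheck {
    // Optional sign, then 0 or a non-zero-led integer, then an optional fraction.
    static final Pattern NUMBER =
            Pattern.compile("[-+]?(?:[1-9]\\d*|0)(?:\\.\\d+)?");

    static boolean isNumber(String s) {
        return NUMBER.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"0.5", "-12", "+3.14", "0", "007", "1.", ".5"};
        for (String in : inputs) {
            System.out.println(in + " -> " + isNumber(in));
        }
        // 0.5, -12, +3.14 and 0 are accepted; 007, "1." and ".5" are rejected.
    }
}
```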

3. Parsing a URL

public static void main(String[] args) {
        String url = "http://www.sohu.com:8080/abc/index.htm";
        Matcher matcher = Pattern.compile("([a-z]+)://(www\\.[a-z]+\\.[a-z]+):(\\d+)[\\w/-]*/([\\w.]+)").matcher(url);
        while (matcher.find()){
            System.out.println(matcher.group(0));
            System.out.println("Protocol: " + matcher.group(1));
            System.out.println("Domain: " + matcher.group(2));
            System.out.println("Port: " + matcher.group(3));
            System.out.println("File name: " + matcher.group(4));
        }
    }

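Running it prints the full URL, followed by the protocol (http), the domain (www.sohu.com), the port (8080), and the file name (index.htm).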

4. Password strength rating

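A password is scored by the rules below and assigned a security level based on its total score.

1. Password length:

5 points: 4 characters or fewer
10 points: 5 to 7 characters
25 points: 8 characters or more

2. Letters:

0 points: no letters
10 points: all letters are lowercase, or all are uppercase
20 points: the letters mix uppercase and lowercase

3. Digits:

0 points: no digits
10 points: 1 digit
20 points: more than 1 digit

4. Symbols:

0 points: no symbols
10 points: 1 symbol
25 points: more than 1 symbol

5. Bonus (only the highest applicable bonus is awarded):

2 points: letters and digits
3 points: letters, digits, and symbols
5 points: uppercase and lowercase letters, digits, and symbols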

Final rating:

>= 90: Very secure (Very_Secure)
>= 80: Secure (Secure)
>= 70: Very strong (Very_Strong)
>= 60: Strong (Strong)
>= 50: Average (Average)
>= 25: Weak (Weak)
>= 0: Very weak (Very_Weak)

The corresponding output is:

VERY_SECURE
SECURE
VERY_STRONG
STRONG
AVERAGE
WEAK
VERY_WEAK

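Rate the security of the password string given as input.
Notes:
Letters: a-z, A-Z
Digits: 0-9
Symbols are the following (the ASCII table can be viewed in UltraEdit via the menu view->ASCII Table):
!"#$%&'()*+,-./ (ASCII: 0x21~0x2F)
:;<=>?@ (ASCII: 0x3A~0x40)
[\]^_` (ASCII: 0x5B~0x60)
{|}~ (ASCII: 0x7B~0x7E)

Input:

A password string

Output:

The password's security level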

import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HWTest {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        String pwd = in.nextLine();
        int score = 0;
        // Bonus flags: uppercase letters, lowercase letters, digits, symbols
        int[] reward = {0, 0, 0, 0};
        // Length
        if (pwd.length() >= 8){
            score += 25;
        }else if (pwd.length() >= 5){
            score += 10;
        }else{
            score += 5;
        }
        System.out.println("Score after length: " + score);
        // Letters
        Matcher matcher = Pattern.compile("(?=.*[a-z])(?=.*[A-Z])").matcher(pwd);
        Matcher matcher1 = Pattern.compile("[a-z]+|[A-Z]+").matcher(pwd);
        if (matcher.find()){ // mixed uppercase and lowercase
            score += 20;
            reward[0] = reward[1] = 1;
        }else if (matcher1.find()){ // letters of a single case only
            score += 10;
            reward[0] = 1; // with a single case, set the uppercase flag to mark that letters are present
        } // no letters: no points
        System.out.println("Score after letters: " + score);

        // Digits
        Matcher matcher2 = Pattern.compile("\\d").matcher(pwd);
        int count = 0;
        while (matcher2.find()){
            count++;
        }
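        if (count == 1){ // exactly one digit
            score += 10;
            reward[2] = 1;
        }else if(count > 1){ // more than one digit
            score += 20;
            reward[2] = 1;
        }
        System.out.println("Score after digits: " + score);

        // Symbols: \p{Punct} is the ASCII punctuation set listed in the problem
        // statement; the hand-written class omitted the backslash (0x5C).
        String regex = "\\p{Punct}";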
        Matcher matcher3 = Pattern.compile(regex).matcher(pwd);
        count = 0;
        while (matcher3.find()){
            count++;
        }
        if (count == 1){ // exactly one symbol
            score += 10;
            reward[3] = 1;
        }else if(count > 1){ // more than one symbol
            score += 25;
            reward[3] = 1;
        }
        System.out.println("Score after symbols: " + score);

        // Bonus
        if (reward[0] == 1 && reward[1] == 1 && reward[2] == 1 && reward[3] == 1){
            score += 5;
        }else if (reward[0] == 1 && reward[2] == 1 && reward[3] == 1){
            score += 3;
        }else if (reward[0] == 1 && reward[2] == 1){
            score += 2;
        }
        System.out.println("Score after bonus: " + score);

        if (score >= 90){
            System.out.println("VERY_SECURE");
        }else if (score >= 80){
            System.out.println("SECURE");
        }else if (score >= 70){
            System.out.println("VERY_STRONG");
        }else if (score >= 60){
            System.out.println("STRONG");
        }else if (score >= 50){
            System.out.println("AVERAGE");
        }else if (score >= 25){
            System.out.println("WEAK");
        }else {
            System.out.println("VERY_WEAK");
        }
        System.out.println("Total score: " + score);
    }
}
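
As an alternative sketch of the same scoring rules, the logic can be pulled into pure methods so it can be checked without typing input. This is not the author's solution: the class PasswordRating and the methods count, score, and level are hypothetical names, and using \p{Punct} for the symbol set is an assumption based on it covering the ASCII punctuation listed in the problem statement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PasswordRating {
    // Counts the non-overlapping matches of regex in s.
    static int count(String regex, String s) {
        Matcher m = Pattern.compile(regex).matcher(s);
        int n = 0;
        while (m.find()) n++;
        return n;
    }

    static int score(String pwd) {
        int lower = count("[a-z]", pwd);
        int upper = count("[A-Z]", pwd);
        int digits = count("[0-9]", pwd);
        int symbols = count("\\p{Punct}", pwd);
        boolean letters = lower + upper > 0;
        boolean mixed = lower > 0 && upper > 0;

        // Length, letters, digits, symbols
        int score = pwd.length() >= 8 ? 25 : pwd.length() >= 5 ? 10 : 5;
        score += mixed ? 20 : letters ? 10 : 0;
        score += digits > 1 ? 20 : digits == 1 ? 10 : 0;
        score += symbols > 1 ? 25 : symbols == 1 ? 10 : 0;

        // Bonus: only the highest applicable one
        if (mixed && digits > 0 && symbols > 0) score += 5;
        else if (letters && digits > 0 && symbols > 0) score += 3;
        else if (letters && digits > 0) score += 2;
        return score;
    }

    static String level(String pwd) {
        int s = score(pwd);
        if (s >= 90) return "VERY_SECURE";
        if (s >= 80) return "SECURE";
        if (s >= 70) return "VERY_STRONG";
        if (s >= 60) return "STRONG";
        if (s >= 50) return "AVERAGE";
        if (s >= 25) return "WEAK";
        return "VERY_WEAK";
    }

    public static void main(String[] args) {
        System.out.println(score("38$@NoNoNo") + " " + level("38$@NoNoNo")); // 95 VERY_SECURE
        System.out.println(score("abc12") + " " + level("abc12"));           // 42 WEAK
        System.out.println(score("abc") + " " + level("abc"));               // 15 VERY_WEAK
    }
}
```

Counting each character class once up front keeps the four scoring steps and the bonus step reading from the same numbers, which avoids the separate flag array.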