Java Regular Expressions


bobshute


Tags: regex, JSE

Article link: https://blog.csdn.net/bobshute/article/details/79968738


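1. Overview

1.1 Predefined character classes

Symbol	Description
.	Any character (may or may not match line terminators)
\d	A digit: [0-9]
\D	A non-digit: [^0-9]
\s	A whitespace character: [ \t\n\x0B\f\r]
\S	A non-whitespace character: [^\s]
\w	A word character: [a-zA-Z_0-9]
\W	A non-word character: [^\w]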

\ is the escape character: for example, "\\" matches "\" and "\{" matches "{".
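The predefined classes above can be tried out with a minimal sketch like the following. The class name `PredefinedClassDemo`, its helper methods, and the sample strings are hypothetical and chosen only for illustration. Note that inside a Java string literal every backslash of the pattern must itself be escaped, so `\d` is written `"\\d"`.

```java
import java.util.regex.Pattern;

public class PredefinedClassDemo {

    // True if the whole input consists of one or more digits (\d).
    static boolean isAllDigits(String s) {
        return Pattern.matches("\\d+", s);
    }

    // Collapses each run of whitespace (\s) into a single space.
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // Removes every non-word character (\W), keeping only [a-zA-Z_0-9].
    static String keepWordChars(String s) {
        return s.replaceAll("\\W", "");
    }

    // "\\{" escapes the brace so that it matches a literal '{'.
    static boolean startsWithBrace(String s) {
        return Pattern.compile("\\{").matcher(s).lookingAt();
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2023"));            // true
        System.out.println(isAllDigits("20a3"));            // false
        System.out.println(collapseWhitespace("a \t\n b")); // a b
        System.out.println(keepWordChars("a-b_c!1"));       // ab_c1
        System.out.println(startsWithBrace("{x}"));         // true
    }
}
```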

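1.2 Quantifiers

Symbol	Description
*	Equivalent to {0,}: matches the preceding character zero or more times. For example, "zo*" matches "z" as well as "zoo"; ".*" matches any string (by default, any string without line terminators).
+	Equivalent to {1,}: matches the preceding subexpression one or more times. For example, "9+" matches 9, 99, 999, and so on.
?	Equivalent to {0,1}: matches the preceding subexpression zero or one time. For example, "do(es)?" matches "do" or "does". This metacharacter has a second use: placed after another quantifier, it selects non-greedy matching, which is covered later.
{n}	Matches exactly n times. For example, "e{2}" does not match the single "e" in "bed", but it matches the two "e"s in "seed".
{n,}	Matches at least n times. For example, "e{2,}" does not match the "e" in "bed", but it matches all the "e"s in "seeeeeeeed".
{n,m}	Matches at least n and at most m times. For example, "e{1,3}" matches the first three "e"s in "seeeeeeeed".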
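The quantifier rows above can be checked with a short sketch. `QuantifierDemo` and its helper `firstMatch` are hypothetical names introduced for this illustration, and the sample string "seeeed" (four "e"s) is a shortened stand-in for the longer string in the table.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {

    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("zo*", "z"));        // true
        System.out.println(Pattern.matches("zo*", "zoo"));      // true
        System.out.println(Pattern.matches("9+", "999"));       // true
        System.out.println(Pattern.matches("do(es)?", "do"));   // true
        System.out.println(Pattern.matches("do(es)?", "does")); // true
        System.out.println(firstMatch("e{2}", "seed"));         // ee
        System.out.println(firstMatch("e{2}", "bed"));          // null
        System.out.println(firstMatch("e{2,}", "seeeed"));      // eeee
        System.out.println(firstMatch("e{1,3}", "seeeed"));     // eee
    }
}
```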

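1.3 Boundary matchers

Symbol	Description
^	The beginning of a line
$	The end of a line
\b	A word boundary
\B	A non-word boundary
\A	The beginning of the input
\G	The end of the previous match
\Z	The end of the input, except for the final line terminator, if any
\z	The end of the input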
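A small sketch of how some of these boundary matchers behave in Java. One detail worth knowing: by default `^` and `$` match only at the start and end of the whole input, and they match at every line only when `Pattern.MULTILINE` is set. The class `BoundaryDemo`, the helper `count`, and the sample strings are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BoundaryDemo {

    // Counts how many times the pattern is found in the input.
    static int count(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // \b: "cat" as a whole word only; "concat" and "cats" are not counted.
        System.out.println(count(Pattern.compile("\\bcat\\b"), "cat concat cats cat")); // 2

        String lines = "one\ntwo\nthree";
        // Without MULTILINE, ^ matches only at the start of the whole input.
        System.out.println(count(Pattern.compile("^\\w+"), lines)); // 1
        // With MULTILINE, ^ matches at the start of every line.
        System.out.println(count(Pattern.compile("^\\w+", Pattern.MULTILINE), lines)); // 3

        // \Z tolerates one final line terminator; \z does not.
        System.out.println(Pattern.compile("end\\Z").matcher("the end\n").find()); // true
        System.out.println(Pattern.compile("end\\z").matcher("the end\n").find()); // false
    }
}
```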

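1.4 Other common symbols

Symbol	Description
[]	Matches any single character listed inside the brackets
[abc]	a, b, or c (simple class)
[^abc]	Any character except a, b, or c (negation)
[a-zA-Z]	a through z or A through Z, inclusive (range)
[a-d[m-p]]	a through d, or m through p: [a-dm-p] (union)
[a-z&&[def]]	d, e, or f (intersection)
[a-z&&[^bc]]	a through z, except for b and c: [ad-z] (subtraction)
[a-z&&[^m-p]]	a through z, but not m through p: [a-lq-z] (subtraction)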
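The union, intersection, and subtraction forms can be verified by filtering the alphabet through each class. This is only a sketch: `CharClassDemo` and its helper `keep` are hypothetical names, and testing one character at a time is chosen for clarity rather than speed.

```java
import java.util.regex.Pattern;

public class CharClassDemo {

    // Keeps only the characters of input that match the given character class.
    static String keep(String charClass, String input) {
        Pattern p = Pattern.compile(charClass);
        StringBuilder sb = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String alphabet = "abcdefghijklmnopqrstuvwxyz";
        System.out.println(keep("[abc]", alphabet));         // abc
        System.out.println(keep("[a-d[m-p]]", alphabet));    // abcdmnop
        System.out.println(keep("[a-z&&[def]]", alphabet));  // def
        System.out.println(keep("[a-z&&[^bc]]", alphabet));  // adefghijklmnopqrstuvwxyz
        System.out.println(keep("[a-z&&[^m-p]]", alphabet)); // abcdefghijklqrstuvwxyz
    }
}
```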

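Symbol	Description
Use of ()	Groups
()	Defines the expression enclosed in () as a "group" and saves the text it matches in a temporary area, which is very useful when extracting parts of a string. Capturing groups are numbered by counting their opening parentheses from left to right.
(\d)	Group 1
((A)(B(C)))	Group 1: ((A)(B(C))); group 2: (A); group 3: (B(C)); group 4: (C)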
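The numbering rule for nested groups can be confirmed with a short sketch. `GroupNumberingDemo` and its helper `groups` are hypothetical names; group 0 is always the whole match and is not included in `groupCount()`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupNumberingDemo {

    // Returns the text captured by each group; index 0 is the whole match.
    // An empty array means the input did not match.
    static String[] groups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return new String[0];
        }
        String[] out = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            out[i] = m.group(i);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] g = groups("((A)(B(C)))", "ABC");
        for (int i = 0; i < g.length; i++) {
            System.out.println("group " + i + " = " + g[i]);
        }
        // group 0 = ABC
        // group 1 = ABC
        // group 2 = A
        // group 3 = BC
        // group 4 = C
    }
}
```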

2. Examples


import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TestRegex {


    public static void main(String[] args) {
        // Matcher.lookingAt()

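        // 1.1 Full match: check whether a string is an email address
        testMatch_email();
        // 1.2 Full match: check the start and end of a string
        testMatch_StartAndEnd();
        // 2. Check whether a match exists and read the matched text
        testMatchChar1();
        // 3. Find matches: greedy vs. non-greedy (multiple matches)
        testMatchChar2();
        // 4. Capturing groups
        testGroup();
        // 5. Replacing with a regular expression
        testReplace();
        // 6. Splitting with a regular expression
        testSplit();
        // 7. The difference between find() and lookingAt()
        testLookAt();
        // 8. Commonly used regular expressions
        commonRegex();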

    }

    // 1.1 Full match: check whether a string is an email address
    public static void testMatch_email() {
        // The dot before each domain label is escaped so that it matches a literal '.'
        String regex = "[a-zA-Z0-9_.-]+@[a-zA-Z0-9_-]+(\\.[a-zA-Z0-9_-]+)+";
        Pattern p = Pattern.compile(regex);

        String email = "2111@qq.com";
        Matcher m = p.matcher(email);
        boolean matchesResult = m.matches();
        System.out.println("matchesResult->" + matchesResult);// matchesResult->true

        String email2 = "2111#qq.com";
        Matcher m2 = p.matcher(email2);
        boolean matchesResult2 = m2.matches();
        System.out.println("matchesResult2->" + matchesResult2);// matchesResult2->false
    }

    // 1.2 Full match: check the start and end of a string
    public static void testMatch_StartAndEnd(){
        String str = "testhascharhastest";
        String regex = "^t\\w*t$";
        // Case-sensitive by default
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        // To ignore case instead:
        //Pattern pat = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        boolean matchesResult = m.matches();
        System.out.println("matchesResult->" + matchesResult);// matchesResult->true
    }

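    // 2. Check whether a match exists and read the matched text
    public static void testMatchChar1() {
        String str = "testhascharhastest";
        String regex = "has\\w{1}";
        // Case-sensitive by default
        Pattern pattern = Pattern.compile(regex);
        // To ignore case instead:
        // Pattern pat = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(str);
        // find() reports whether any part of the string matches the regex
        boolean isHas = matcher.find();
        // isHas-2-1->true
        System.out.println("isHas-2-1->" + isHas);
        // The find() call above already consumed the first match ("hasc"),
        // so reset the matcher to make the loop report every match.
        matcher.reset();
        while (matcher.find()) {
            String matcherStr = matcher.group();
            // matcherStr->hasc,start:4-8
            // matcherStr->hast,start:11-15
            System.out.println("matcherStr->" + matcherStr + ",start:" + matcher.start() + "-" + matcher.end());
        }
    }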



    // 3. Find matches: greedy vs. non-greedy (multiple matches)
    /*
     ====Greedy mode=====
    <td>hello world</td><td>hello regex</td>   position: [11,51]
    ====Non-greedy mode=====
    <td>hello world</td>   position: [11,31]
    <td>hello regex</td>   position: [31,51]
     */
    public static void testMatchChar2(){
        // Extract the td elements
        String str="<table><tr><td>hello world</td><td>hello regex</td></tr></table>";
        // Greedy mode: * + {n,} are greedy by default
        System.out.println("====Greedy mode=====");
        // Compile the regular expression into a Pattern object
        String regex = "<td>.*</td>";
        Pattern p=Pattern.compile(regex);
        // To ignore case instead:
        //Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        // Obtain a matcher
        Matcher m=p.matcher(str);
        // find() looks for the next match and returns true if one is found, false otherwise
        while(m.find()){
            // group() returns the substring found by the preceding find(); start() and end() return its start and end positions
            System.out.println(m.group()+"   position: ["+m.start()+","+m.end()+"]");
        }

        // Non-greedy mode: a ? placed after * + {n,} etc. makes that quantifier non-greedy.
        // Do not confuse it with a ? placed after a subexpression, which means "zero or one time".
        System.out.println("====Non-greedy mode=====");
        String regex2 = "<td>.*?</td>";
        p=Pattern.compile(regex2);
        m=p.matcher(str);
        while(m.find()){
            System.out.println(m.group()+"   position: ["+m.start()+","+m.end()+"]");
        }
    }

    // 4. Capturing groups
    /*
    Groups are defined by (): group(0) is the whole match, group(1) is the content of the first pair of parentheses, and group(2) is the content of the second pair.
    Output:
    Group 0:World!,Start:6 End:12
    Group 1:or,Start:7 End:9
    Group 2:ld!,Start:9 End:12
    Wor
     */
    public static void testGroup(){
           String str = "Hello,World! in Java.";
           Pattern pattern = Pattern.compile("W(or)(ld!)");
           Matcher matcher = pattern.matcher(str);
           while(matcher.find()){
                // Group 0: the whole match, with its indices
                System.out.println("Group 0:"+matcher.group(0)+",Start:"+matcher.start(0)+" End:"+matcher.end(0));
                // Group 1: the text matched by (or), with its indices
                System.out.println("Group 1:"+matcher.group(1)+",Start:"+matcher.start(1)+" End:"+matcher.end(1));
                // Group 2: the text matched by (ld!), with its indices; a group is simply a subexpression
                System.out.println("Group 2:"+matcher.group(2)+",Start:"+matcher.start(2)+" End:"+matcher.end(2));
                // Substring from the start of the whole match to the end of group 1: "Wor"
                System.out.println(str.substring(matcher.start(0),matcher.end(1)));
           }
    }

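    // 5. Replacing with a regular expression
    // Remove every digit from the string
    public static void testReplace(){
        String str = "a1b2c34d56e78f90g11h12i23j34k45l56m67n";
        // \\d+ rather than \\d* avoids zero-length matches between the letters
        String regex = "\\d+";
        String str2 = str.replaceAll(regex, "");
        System.out.println("str2->"+str2); //str2->abcdefghijklmn
    }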

    // 6. Splitting with a regular expression
    // Split a string on every run of digits
    /*
     Output:
    ->a
    ->b
    ->c
    ->d
    ->e
    ->f
    ->g
     */
    public static void testSplit(){
        String str = "a1b2c34d56e78f90g";
        String regex = "\\d+";
        String[] strArray = str.split(regex);
        for(int i=0;i<strArray.length;i++){
            System.out.println("->"+strArray[i]);
        }
    }

    // 7. The difference between find() and lookingAt()
    /*
     *  Output:
        false
        true
        true
        true
        false
        true
        true
        true
        true
     */
    public static void testLookAt(){
          Pattern p = Pattern.compile("\\d{3,5}");
          String s = "123-34345-234-00";
          Matcher m = p.matcher(s);
          // matches() must match the entire input, so it returns false here even though the prefix "123" fits the pattern.
          System.out.println(m.matches());
          // Reset the matcher to the start of the input so the find() calls below do not depend on state left by matches().
          m.reset();
          System.out.println(m.find());// find() is a partial match: it succeeds as soon as some substring matches, and each call resumes after the previous match. This call matches "123".
          System.out.println(m.find());// Continues in the remaining "-34345-234-00" and matches "34345".
          System.out.println(m.find());// Continues in the remaining "-234-00" and matches "234".
          System.out.println(m.find());// Only "-00" remains and nothing in it matches, so this call returns false.

          System.out.println(m.lookingAt());// lookingAt() is also a partial match, but it always starts at the beginning of the input, so it matches "123" every time.
          System.out.println(m.lookingAt());
          System.out.println(m.lookingAt());
          System.out.println(m.lookingAt());
    }

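    // 8. Commonly used regular expressions
    public static  void commonRegex(){
         // Username: a letter followed by 5 to 20 word characters
        String REGEX_USERNAME = "^[a-zA-Z]\\w{5,20}$";

         // Password: 6 to 20 letters or digits
        String REGEX_PASSWORD = "^[a-zA-Z0-9]{6,20}$";

         // Mobile phone number (the commas inside the original character classes matched a literal ',' and are removed)
        String REGEX_MOBILE = "^((17[0-9])|(14[0-9])|(13[0-9])|(15[0-35-9])|(18[05-9]))\\d{8}$";

         // Email address (the '|' inside the original character class matched a literal '|' and is removed)
        String REGEX_EMAIL = "^([a-z0-9A-Z]+[-.]?)+[a-z0-9A-Z]@([a-z0-9A-Z]+(-[a-z0-9A-Z]+)?\\.)+[a-zA-Z]{2,}$";

         // Chinese characters only (the stray comma before the quantifier is removed)
        String REGEX_CHINESE = "^[\u4e00-\u9fa5]{0,}$";

         // ID card number: 18 or 15 digits
        String REGEX_ID_CARD = "(^\\d{18}$)|(^\\d{15}$)";

         // URL
        String REGEX_URL = "http(s)?://([\\w-]+\\.)+[\\w-]+(/[\\w- ./?%&=]*)?";

         // IPv4 address: four octets in the range 0-255 separated by dots (the original pattern covered a single octet only)
        String REGEX_IP_ADDR = "^(25[0-5]|2[0-4]\\d|[0-1]\\d{2}|[1-9]?\\d)(\\.(25[0-5]|2[0-4]\\d|[0-1]\\d{2}|[1-9]?\\d)){3}$";
    }
}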
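The patterns in the last method are only declared, never applied. As a minimal sketch of how such patterns might be used for validation, the following compiles two of them once and reuses them. The class `ValidatorDemo`, its method names, and the sample inputs are hypothetical. The octet alternation is the one from the listing, repeated four times, and `matches()` already requires the whole input to match, so the `^` and `$` anchors are omitted here.

```java
import java.util.regex.Pattern;

public class ValidatorDemo {

    // One octet in the range 0-255 (same alternation as in the listing).
    private static final String OCTET = "(25[0-5]|2[0-4]\\d|[0-1]\\d{2}|[1-9]?\\d)";

    // Four octets separated by literal dots; compiled once and reused.
    private static final Pattern IPV4 = Pattern.compile(OCTET + "(\\." + OCTET + "){3}");

    // A letter followed by 5 to 20 word characters.
    private static final Pattern USERNAME = Pattern.compile("[a-zA-Z]\\w{5,20}");

    static boolean isIpv4(String s) {
        return IPV4.matcher(s).matches();
    }

    static boolean isUsername(String s) {
        return USERNAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIpv4("192.168.0.1"));   // true
        System.out.println(isIpv4("256.1.1.1"));     // false
        System.out.println(isIpv4("1.2.3"));         // false
        System.out.println(isUsername("bob_shute")); // true
        System.out.println(isUsername("1abcdef"));   // false
        System.out.println(isUsername("abc"));       // false
    }
}
```

Compiling a `Pattern` once into a static constant avoids recompiling the expression on every call, which matters when the same check runs many times.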
