Heima Programmer Java Self-Study Summary 17: Regular Expressions - CSDN Blog

Article link: https://blog.csdn.net/u011425950/article/details/9856275

This summary is based on Mr. Bi's Java fundamentals video course at Heima.

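Regular expression: an expression that follows a defined set of rules.
   Learning regular expressions means learning how to use a set of special symbols.
   Purpose: dedicated to operating on strings.
   Characteristic: specific symbols stand in for coding operations, which shortens what you have to write.

   Advantage: complex string operations become much simpler.
   Drawback: the more symbols a regex uses, the longer it gets and the harder it is to read.

See the API documentation for the meaning of each symbol.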

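Specific operations:

1. Matching: the String matches method applies the pattern to the entire string and returns a boolean.

2. Splitting: the String split method.

3. Replacing: the String replaceAll method.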

 class  RegexDemo    
 {    
     public static void main(String[] args)     
     {    
         matchDemo("2334498398239","1[358]\\d{9}");
         matchDemo("13842229504","1[358]\\d{9}");
         System.out.println("------------matching----------------");

         splitDemo("andy.lili.tony","\\.");// split on "."; the dot must be escaped as \\.
         splitDemo("kdsjfksf  kdsfjksdf    dkffj kdfj jf"," +");// split on one or more spaces
         splitDemo("g:\\java\\code\\day25","\\\\");// split on a single backslash
         splitDemo("dfkdddjfkdaadfzzzzkkkq","(.)\\1+");// split on runs of a repeated character
         System.out.println("------------splitting----------------");

         replaceDemo("aadddeeeffdoe","(.)\\1+","$1");// collapse each run to one character (iiii -> i); $1 refers to group 1 of the regex
         replaceDemo("andy12;number1388882228384;","\\d{5,}","*");// replace each run of five or more digits with *
         replaceDemo("cdddffeeaadfafkkjkqqccg","(.)\\1+","#");// replace each run of a repeated character with #
     }    
     public static void matchDemo(String s,String reg)    
     {    
         boolean flag = s.matches(reg);    
         if(flag)    
             System.out.println("Valid mobile number");
         else
             System.out.println("Invalid number");
     }    
     public static void splitDemo(String s,String reg)    
     {    
         String[] arr = s.split(reg);    
         System.out.println(arr.length);    
         for(String str:arr)    
         {    
             System.out.println(str);    
         }    
     }    
     public static void replaceDemo(String s,String reg,String rp)    
     {    
         String ar= s.replaceAll(reg,rp);    
         System.out.println(ar);    
     }    
 }    
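
Two behaviors of these methods are easy to trip over, so here is a small sketch of them. The class name RegexNotesDemo and the helpers isPhone and cut are made up for illustration; the regexes are the ones used in the listing above. It shows that matches only succeeds when the pattern covers the whole string, and that split with a single argument keeps empty strings in the middle of the result but drops them at the end.

```java
import java.util.Arrays;

class RegexNotesDemo
{
    // matches() succeeds only if the regex covers the entire string.
    static boolean isPhone(String s)
    {
        return s.matches("1[358]\\d{9}");
    }

    // split() with one argument drops trailing empty strings,
    // but keeps empty strings that fall between two delimiters.
    static String[] cut(String s, String reg)
    {
        return s.split(reg);
    }

    public static void main(String[] args)
    {
        System.out.println(isPhone("13842229504"));     // the whole string fits the pattern
        System.out.println(isPhone("13842229504abc"));  // extra characters make it fail
        // "zzzz" and "kkk" are adjacent, so an empty string sits between them
        System.out.println(Arrays.toString(cut("dfkdddjfkdaadfzzzzkkkq", "(.)\\1+")));
        // the two empty strings after "b" are dropped
        System.out.println(Arrays.toString(cut("a.b..", "\\.")));
    }
}
```

This is why splitDemo prints a length of 5 for the repeated-character case: one of the five elements is an empty string.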

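4. Extraction: for more complex retrieval, used together with Pattern and Matcher.

Steps:

1) Import the regex package: import java.util.regex.*;

2) Compile the regular expression into a Pattern object.

3) Associate the Pattern object with the string to operate on.

4) The association yields the regex matching engine, a Matcher.

5) Use the engine to operate on the substrings that match the pattern.

Code: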

 import java.util.regex.*;//1    
 class GetDemo     
 {    
     public static void main(String[] args)     
     {    
         String s = "adfddfdfflkljkhjjj;; woifieujjfjfslfjdfad;w";    
     
         String reg="(.)\\1+";    
     
         Pattern p = Pattern.compile(reg);//2    
     
         Matcher m = p.matcher(s);//3, 4

         while(m.find())//5
         {    
             System.out.println(m.group());    
         }    
     }    
 }    
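
Beyond group(), the Matcher can also report where each match sits and what each capture group held. The following is a minimal sketch of that, assuming the same repeated-character regex as above; the class name GroupDemo and the method repeats are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo
{
    // Describes every run of a repeated character as "char:start-end".
    static List<String> repeats(String s)
    {
        List<String> found = new ArrayList<String>();
        Matcher m = Pattern.compile("(.)\\1+").matcher(s);
        while (m.find())
        {
            // group(1) is the captured character, group() would be the whole run;
            // start() is inclusive and end() is exclusive.
            found.add(m.group(1) + ":" + m.start() + "-" + m.end());
        }
        return found;
    }

    public static void main(String[] args)
    {
        for (String r : repeats("abbcdddd"))
        {
            System.out.println(r);
        }
    }
}
```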

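How do you decide which operation to use?

1. To find out only whether a string is valid or not, use matching.

2. To turn a string into another string, use replacing.

3. To break a string into several strings in a specified way, use splitting.

4. To pull out the substrings that meet a requirement, use extraction.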
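
To make the four choices concrete, here is a rough sketch that applies each of them to the same input. The class ChooseDemo, its method names and the sample string are invented for illustration; the phone pattern is the one from the first listing. Note that in find mode this pattern could also match inside a longer run of digits, so treat it as a demonstration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChooseDemo
{
    static final String PHONE = "1[358]\\d{9}";

    // 1. Matching: is the whole string a valid number?
    static boolean isValid(String s)
    {
        return s.matches(PHONE);
    }

    // 2. Replacing: turn the string into another string.
    static String mask(String s)
    {
        return s.replaceAll(PHONE, "***");
    }

    // 3. Splitting: break the string into several strings.
    static String[] records(String s)
    {
        return s.split(", *");
    }

    // 4. Extraction: pull out the substrings that fit the pattern.
    static List<String> phones(String s)
    {
        List<String> out = new ArrayList<String>();
        Matcher m = Pattern.compile(PHONE).matcher(s);
        while (m.find())
        {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args)
    {
        String s = "andy 13842229504, lili 15800001111";
        System.out.println(isValid(s));
        System.out.println(mask(s));
        System.out.println(records(s).length);
        System.out.println(phones(s));
    }
}
```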

Exercises:

 import java.util.*;    
 class  Test    
 {    
     public static void main(String[] args)     
     {    
         test1();     
         ipSort();    
         checkMail();    
     }    
     public static void test1()    
     {    
         String str = "黑黑.....黑黑黑....马程...程程....程.....程序...序..序序序.....员...员";     
          //str=str.replaceAll(str,"黑马程序员");    
          str=str.replaceAll("\\.","");    
          str=str.replaceAll("(.)\\1+","$1");    
          System.out.println(str);    
     }    
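     /*
      Requirement 2: sort the following IP addresses in numeric order of their segments.
      192.168.1.15    23.48.56.109    10.73.91.18    254.253.252.1    1.23.25.26

      Approach:
      Natural string ordering still works, provided every segment has exactly 3 digits.
      1. Prefix every segment with the largest number of zeros any segment could need, so each segment has at least 3 digits.
      2. Keep only the last three digits of each segment, so every segment of every address is 3 digits long.
      */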
       public static void ipSort()      
      {      
         String ip = "192.168.1.15 23.48.56.109 10.73.91.18 254.253.252.1 1.23.25.26";
         ip = ip.replaceAll("(\\d+)","00$1");      
         System.out.println(ip);      
         ip = ip.replaceAll("0+(\\d{3})","$1");      
         String [] arr = ip.split(" +");      
         //Arrays.sort(arr);// use this instead if duplicate IP addresses must be kept; a TreeSet discards duplicates
         TreeSet<String> ts = new TreeSet<String>();           
         for(String str:arr)      
         {         
             //System.out.println(str);                
             ts.add(str);                  
         }      
         for(String s:ts)      
         {      
             System.out.println(s.replaceAll("0*(\\d+)","$1"));// 0* rather than 0+, so an inner zero (as in 109) is not stripped
         }      
               
         /*    
         Iterator<String> it = ts.iterator();    
         while(it.hasNext())    
         {    
             String str = it.next();    
             str=str.replaceAll("0*(\\d+)","$1");
             System.out.println(str);    
         }    
         */      
      }    
      /*
     Requirement 3: validate an email address. This one is essential to master.
     */
     public  static void checkMail()      
     {      
         String mail ="asdfas12@sina.com.cn";      
         //mail = "1@1.1";      
         String reg = "[a-zA-Z0-9_]+@[a-zA-Z0-9]+(\\.[a-zA-Z]+){1,3}";      
         //reg = "\\w+@\\w+(\\.\\w+)+";// a looser, less precise match
         //mail.indexOf("@")!=-1; // do not validate this way
         System.out.println(mail.matches(reg));      
     }         
     
 }    
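
The zero-padding trick above keeps everything in string form. As an alternative (not the approach taken in the exercise), the segments can be compared as numbers with a Comparator. The sketch below assumes every address is a well-formed four-segment IPv4 address; the class IpSortSketch and the method sortIps are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;

class IpSortSketch
{
    // Sorts space-separated IPv4 addresses segment by segment, numerically.
    // Assumes each address has exactly four numeric segments.
    static String[] sortIps(String ips)
    {
        String[] arr = ips.trim().split(" +");
        Arrays.sort(arr, new Comparator<String>()
        {
            public int compare(String a, String b)
            {
                String[] x = a.split("\\.");
                String[] y = b.split("\\.");
                for (int i = 0; i < 4; i++)
                {
                    // segments are 0..255, so the subtraction cannot overflow
                    int diff = Integer.parseInt(x[i]) - Integer.parseInt(y[i]);
                    if (diff != 0)
                        return diff;
                }
                return 0;
            }
        });
        return arr;
    }

    public static void main(String[] args)
    {
        String ip = "192.168.1.15 23.48.56.109 10.73.91.18 254.253.252.1 1.23.25.26";
        for (String s : sortIps(ip))
        {
            System.out.println(s);
        }
    }
}
```

Because this sorts an array rather than filling a TreeSet, duplicate addresses are kept, and no padding has to be removed afterwards.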

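Web crawler (spider):

One of the techniques behind search engines; it retrieves information from the web, such as email addresses, blogs and keywords.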

 import java.io.*;    
 import java.util.regex.*;    
 import java.net.*;    
 class  RegexTest2    
 {    
     public static void main(String[] args) throws IOException    
     {    
         getMails_2();           
     }    
     //Requirement 2: extract email addresses from a web page
     public static void getMails_2() throws IOException
     {
         //Open a URLConnection, the application-layer endpoint for the connection
         URL url = new URL("http://www.163.com/");
         URLConnection conn = url.openConnection();
         //Wrap the response in a reader to read the data returned by the server
         BufferedReader bufIn = new BufferedReader(new InputStreamReader(conn.getInputStream()));
         String line = null;
         //Define the regular expression and compile it into a Pattern object
         String reg = "\\w+@\\w+(\\.\\w+)+";
         Pattern p = Pattern.compile(reg);
         while((line = bufIn.readLine())!=null)
         {
             //Get the Matcher that associates the pattern with the current line
             Matcher m = p.matcher(line);    
             while(m.find())    
             {    
                 System.out.println(m.group());    
             }               
         }    
     }    
     /*
     Requirement 1: extract the email addresses from a given document.
     Uses the extraction operation: Pattern and Matcher.
     */
     public static void getMails() throws IOException    
     {    
         //http://localhost:8080/myweb/mail.html         
         BufferedReader bufr = new BufferedReader(new FileReader("mail.txt"));    
         String line = null;         
         String reg = "\\w+@\\w+(\\.\\w+)+";    
         Pattern p = Pattern.compile(reg);    
         while((line = bufr.readLine())!=null)    
         {               
             Matcher m = p.matcher(line);    
             while(m.find())    
             {    
                 System.out.println(m.group());    
             }    
                 
         }    
     }    
 }
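
Both methods above print every hit, so an address that appears several times is printed several times. A possible variation, sketched here without any network or file access, collects the matches into a set first. The class MailExtractSketch, the method extract and the sample text are assumptions made for this example; the regex is the one used in the listing.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MailExtractSketch
{
    // Compiled once and reused for every piece of text.
    private static final Pattern MAIL = Pattern.compile("\\w+@\\w+(\\.\\w+)+");

    // Returns each distinct address once, in order of first appearance.
    static Set<String> extract(String text)
    {
        Set<String> mails = new LinkedHashSet<String>();
        Matcher m = MAIL.matcher(text);
        while (m.find())
        {
            mails.add(m.group());
        }
        return mails;
    }

    public static void main(String[] args)
    {
        String text = "contact abc@sina.com or abc@sina.com, admin@163.com.cn!";
        for (String mail : extract(text))
        {
            System.out.println(mail);
        }
    }
}
```

In a crawler, extract could be called on each line read from the stream, with the results added to one shared set.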