Using Regular Expression Groups in Java

Latest recommended article published 2024-07-24 18:41:00

Code0cean

Category: Regular Expressions. Tags: regular expressions, java

Article link: https://blog.csdn.net/huangjhai/article/details/104077540

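A group is a part of a regular expression delimited by parentheses, and it can be referred to by its group number. Group 0 is the entire expression, group 1 is the group opened by the first left parenthesis counting from left to right, and so on.
For example:
A(B(CD))E contains three groups: group 0 is ABCDE, group 1 is BCD, and group 2 is CD.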
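
The numbering rule above can be checked with a small sketch. The class name GroupNumberingDemo and the helper groupsOf are made up for this illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNumberingDemo {
    // Hypothetical helper: returns groups 0..groupCount() of the first match,
    // or an empty list when the regex does not match the input.
    static List<String> groupsOf(String regex, String input) {
        List<String> groups = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        if (m.find()) {
            for (int i = 0; i <= m.groupCount(); i++) {
                groups.add(m.group(i));
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> groups = groupsOf("A(B(CD))E", "ABCDE");
        for (int i = 0; i < groups.size(); i++) {
            System.out.println("Group " + i + ": " + groups.get(i));
        }
        // Prints:
        // Group 0: ABCDE
        // Group 1: BCD
        // Group 2: CD
    }
}
```

Note that groupCount() returns 2 for this pattern, because group 0 is not counted; the loop therefore runs from 0 up to and including groupCount().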
The Matcher object provides a set of methods for retrieving information about groups:

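Method	Description
public int groupCount()	Returns the number of capturing groups in this matcher's pattern; group 0 is not included in the count
public String group()	Returns group 0 (the entire match) from the previous match operation
public String group(int i)	Returns the text captured by group i during the previous match operation; returns null if the match succeeded but that group did not match any part of the input
public int start(int group)	Returns the start index of the given group as found in the previous match operation
public int end(int group)	Returns the index of the last character of the given group as found in the previous match operation, plus one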
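
One case the table only hints at is a group that takes no part in an otherwise successful match. The following is a minimal sketch of that case, assuming a made-up pattern (\d+)(px)? and a hypothetical helper describeGroups; for such a group, group(int) returns null and start(int) and end(int) return -1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalGroupDemo {
    // Hypothetical helper: formats every group of the first match as
    // "number=text@start..end", separated by spaces.
    static String describeGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i <= m.groupCount(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(i).append('=').append(m.group(i))
              .append('@').append(m.start(i)).append("..").append(m.end(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Group 2, (px)?, is optional and does not take part in this match.
        System.out.println(describeGroups("(\\d+)(px)?", "width 42"));
        // Prints: 0=42@6..8 1=42@6..8 2=null@-1..-1
    }
}
```

Code that reads optional groups should therefore check for null before using the returned string.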

Example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupsDemo {
    public static final String POEM = "Twas brilling, and the slithy toves\n"
            + "Did gyre and gimble in the wabe.\n"
            + "All mimsy were the borogoves,\n"
            + "And the mome raths outgrabe.\n";

    public static void main(String[] args) {
        /*
         * Pattern.MULTILINE is a pattern flag that enables multiline mode. In this mode,
         * ^ and $ match at the beginning and end of each line, in addition to the
         * beginning and end of the input string.
         * \S+ matches one or more non-whitespace characters and \s+ matches one or more
         * whitespace characters, so the pattern captures the last three words of each line.
         */
        Matcher m = Pattern.compile("(\\S+)\\s+((\\S+)\\s+(\\S+))$", Pattern.MULTILINE).matcher(POEM);
        while (m.find()) {
            // Print group 0 (the whole match) followed by groups 1 to 4.
            for (int i = 0; i <= m.groupCount(); i++) {
                System.out.print("Group " + i + ": [" + m.group(i) + "]   ");
            }
            System.out.println();
        }
    }
}

Output:

Group 0: [the slithy toves]   Group 1: [the]   Group 2: [slithy toves]   Group 3: [slithy]   Group 4: [toves]
Group 0: [in the wabe.]   Group 1: [in]   Group 2: [the wabe.]   Group 3: [the]   Group 4: [wabe.]
Group 0: [were the borogoves,]   Group 1: [were]   Group 2: [the borogoves,]   Group 3: [the]   Group 4: [borogoves,]
Group 0: [mome raths outgrabe.]   Group 1: [mome]   Group 2: [raths outgrabe.]   Group 3: [raths]   Group 4: [outgrabe.]
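
To see what the MULTILINE flag contributes, the same pattern can be compiled with and without it. This is a minimal sketch; the class name MultilineFlagDemo and the helper matches are made up for the illustration. Without the flag, $ only matches at the end of the input (before the final line terminator), so only the last line of the poem produces a match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MultilineFlagDemo {
    static final String POEM = "Twas brilling, and the slithy toves\n"
            + "Did gyre and gimble in the wabe.\n"
            + "All mimsy were the borogoves,\n"
            + "And the mome raths outgrabe.\n";

    // Hypothetical helper: collects every whole match (group 0) found in POEM
    // when the pattern is compiled with the given flags (0 means no flags).
    static List<String> matches(int flags) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile("(\\S+)\\s+((\\S+)\\s+(\\S+))$", flags).matcher(POEM);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(matches(0));
        // Prints: [mome raths outgrabe.]
        System.out.println(matches(Pattern.MULTILINE));
        // Prints: [the slithy toves, in the wabe., were the borogoves,, mome raths outgrabe.]
    }
}
```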

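Using start() and end():
After a successful match operation, start() returns the index at which the previous match began, and end() returns the index of the last matched character plus one. Calling start() or end() after a failed match operation (or before any match operation has been attempted) throws an IllegalStateException.
Here is a usage example: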

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupsDemo {
    public static final String POEM = "Twas brilling, and the slithy toves\n"
            + "Did gyre and gimble in the wabe.\n"
            + "All mimsy were the borogoves,\n";

    public static void main(String[] args) {
        /*
         * Same pattern as in the previous example: with Pattern.MULTILINE, $ matches at
         * the end of each line, so the pattern captures the last three words of each line.
         */
        Matcher m = Pattern.compile("(\\S+)\\s+((\\S+)\\s+(\\S+))$", Pattern.MULTILINE).matcher(POEM);
        while (m.find()) {
            // start() and end() without arguments refer to group 0, the whole match.
            System.out.println("Match start index: " + m.start() + ", end index: " + m.end());

            // start(i) and end(i) report the position of each individual group.
            for (int i = 0; i <= m.groupCount(); i++) {
                System.out.print("Group " + i + ": [" + m.group(i) + "]   ");
                System.out.println("start index: " + m.start(i) + ", end index: " + m.end(i));
            }
            System.out.println();
        }
    }
}

Output:

Match start index: 19, end index: 35
Group 0: [the slithy toves]   start index: 19, end index: 35
Group 1: [the]   start index: 19, end index: 22
Group 2: [slithy toves]   start index: 23, end index: 35
Group 3: [slithy]   start index: 23, end index: 29
Group 4: [toves]   start index: 30, end index: 35

Match start index: 56, end index: 68
Group 0: [in the wabe.]   start index: 56, end index: 68
Group 1: [in]   start index: 56, end index: 58
Group 2: [the wabe.]   start index: 59, end index: 68
Group 3: [the]   start index: 59, end index: 62
Group 4: [wabe.]   start index: 63, end index: 68

Match start index: 79, end index: 98
Group 0: [were the borogoves,]   start index: 79, end index: 98
Group 1: [were]   start index: 79, end index: 83
Group 2: [the borogoves,]   start index: 84, end index: 98
Group 3: [the]   start index: 84, end index: 87
Group 4: [borogoves,]   start index: 88, end index: 98
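
The IllegalStateException case mentioned earlier is not exercised by the example above, so here is a minimal sketch of it. The class name MatcherStateDemo, the helper startThrows, and the input string are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherStateDemo {
    // Hypothetical helper: reports whether m.start() throws IllegalStateException
    // in the matcher's current state.
    static boolean startThrows(Matcher m) {
        try {
            m.start();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("\\d+").matcher("abc 123");

        // No match operation has been attempted yet.
        System.out.println(startThrows(m)); // true

        // The first find() succeeds on "123", so start() is usable.
        System.out.println(m.find());       // true
        System.out.println(startThrows(m)); // false

        // The second find() fails, and start() throws again.
        System.out.println(m.find());       // false
        System.out.println(startThrows(m)); // true
    }
}
```

In practice this means start(), end(), and group() should only be called inside a branch or loop guarded by a successful find(), matches(), or lookingAt().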