Java Regular Expressions: Matcher Methods Explained (Part 1)

  1. appendReplacement(StringBuffer sb, String replacement)
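    As the name suggests, this is an append-and-replace method. It appends to sb the part of the input that lies before the current match (starting from where the previous append left off), then appends replacement in place of the matched text. In the example below find() is called only once, so only the first match is processed.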
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by a123 on 16/12/27.
 */
public class Test {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("java");
        Matcher m = p.matcher("The java book is  program book c");
        StringBuffer sb = new StringBuffer();
        if (m.find()) {
            m.appendReplacement(sb, "c++");
        }
//        m.appendTail(sb);
        System.out.println(sb);
    }
}
/**
Output:
The c++

Process finished with exit code 0
*/
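
One detail the example above does not exercise is that the replacement string is not purely literal: `$n` refers to capturing group n, and a backslash escapes the next character. The sketch below is only an illustration (the class and method names are made up for this example); it shows a group reference, and `Matcher.quoteReplacement` for the case where a `$` should be appended as-is.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; class and method names are hypothetical.
class ReplacementSyntaxSketch {

    // Replaces the first "word-word" pair with the two words swapped.
    static String swapFirstPair(String input) {
        Matcher m = Pattern.compile("(\\w+)-(\\w+)").matcher(input);
        StringBuffer sb = new StringBuffer();
        if (m.find()) {
            // $1 and $2 in the replacement refer to capturing groups 1 and 2.
            m.appendReplacement(sb, "$2-$1");
        }
        return sb.toString();
    }

    // Replaces the first "price" with the literal text "$5".
    static String literalDollar(String input) {
        Matcher m = Pattern.compile("price").matcher(input);
        StringBuffer sb = new StringBuffer();
        if (m.find()) {
            // quoteReplacement escapes '$' and '\' so they are appended as-is.
            m.appendReplacement(sb, Matcher.quoteReplacement("$5"));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(swapFirstPair("see left-right now")); // see right-left
        System.out.println(literalDollar("the price is fixed")); // the $5
    }
}
```

Without `quoteReplacement`, the replacement "$5" would be read as a reference to group 5, which this pattern does not have.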

2 . appendTail(StringBuffer sb)
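As the name suggests, this method appends the tail: the part of the input after the last match that has not yet been added to sb. In the previous example, appendReplacement() added only the text up to and including the replaced match and left the rest out. Calling appendTail() afterwards adds that remainder to the end of sb.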

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by a123 on 16/12/27.
 */
public class Test {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("java");
        Matcher m = p.matcher("The java book is  program book c");
        StringBuffer sb = new StringBuffer();
        if (m.find()) {
            m.appendReplacement(sb, "c++");
        }
        m.appendTail(sb);
        System.out.println(sb);
    }
}
/**
 Output:
 The c++ book is  program book c

 Process finished with exit code 0
 */
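
The two methods are usually used together in a loop: call appendReplacement() for every match, then appendTail() once at the end. The following is a minimal sketch of that pattern, assuming the replacement should be treated as literal text; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; class and method names are hypothetical.
class ReplaceEverySketch {

    // Replaces every match of regex in input with literalReplacement.
    static String replaceEvery(String input, String regex, String literalReplacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Appends the text since the previous match, then the replacement.
            m.appendReplacement(sb, Matcher.quoteReplacement(literalReplacement));
        }
        // Appends whatever follows the last match.
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String out = replaceEvery("The java book is java program book c", "java", "c++");
        System.out.println(out); // The c++ book is c++ program book c
    }
}
```

For a fixed replacement like this one, `Matcher.replaceAll` gives the same result; the explicit loop is useful when the replacement has to be computed for each match.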

3 . end()
The offset after the last character matched
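This method returns the offset just past the last character of the most recent match, that is, the index of the last matched character plus one. In the example below the first "java" occupies indices 4 to 7, so end() returns 8.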

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by a123 on 16/12/27.
 */
public class Test {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("java");
        Matcher m = p.matcher("The java book is java program book c");
        if (m.find()) {
            System.out.println(m.end());//8
        }
    }
}

Note: the if (m.find()) check in the code above cannot be omitted. Calling end() before any successful match throws IllegalStateException, and the source code shows why:

/**
* The range of string that last matched the pattern. If the last
* match failed then first is -1; last initially holds 0 then it
* holds the index of the end of the last match (which is where the
* next search starts).
*/
int first = -1, last = 0;

 public boolean find() {
        int nextSearchIndex = last;
        if (nextSearchIndex == first)
            nextSearchIndex++;

        // If next search starts before region, start it at region
        if (nextSearchIndex < from)
            nextSearchIndex = from;

        // If next search starts beyond region then it fails
        if (nextSearchIndex > to) {
            for (int i = 0; i < groups.length; i++)
                groups[i] = -1;
            return false;
        }
        return search(nextSearchIndex);
    }

 public int end() {
        if (first < 0)
            throw new IllegalStateException("No match available");
        return last;
    }

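In Matcher, first is an instance field initialized to -1, and only a successful match sets it to a non-negative value. If end() is called before any match has been attempted, or after a failed one, first is still negative and the exception is certain to be thrown.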
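
To see both behaviours side by side, here is a small sketch (hypothetical class and method names) that collects end() for every match and checks when end() throws.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; class and method names are hypothetical.
class EndOffsetSketch {

    // Collects end() for each successive match of regex in input.
    static List<Integer> endOffsets(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<Integer> ends = new ArrayList<>();
        while (m.find()) {
            ends.add(m.end());
        }
        return ends;
    }

    // Reports whether end() throws IllegalStateException in the matcher's current state.
    static boolean endThrows(Matcher m) {
        try {
            m.end();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(endOffsets("The java book is java program book c", "java")); // [8, 21]

        Matcher m = Pattern.compile("java").matcher("The java book");
        System.out.println(endThrows(m)); // true: no match attempted yet
        m.find();
        System.out.println(endThrows(m)); // false: the last find() succeeded
        m.find();
        System.out.println(endThrows(m)); // true: the last find() failed
    }
}
```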

4 . groupCount()

Returns the number of capturing groups in this matcher’s pattern.

Group zero denotes the entire pattern by convention. It is not included in this count.

Any non-negative integer smaller than or equal to the value returned by this method is guaranteed to be a valid group index for this matcher.

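This returns the number of capturing groups in the pattern. A common shortcut is to count the opening parentheses in the pattern, and my tests agree with that, although I could not follow the source code well enough to confirm it. Strictly speaking, only parentheses that open a capturing group count: non-capturing groups such as (?:...) and escaped parentheses such as \( are excluded.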

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by a123 on 16/12/27.
 */
public class Test {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("yties|ad|((java)|(book))");
        Matcher m = p.matcher("");
        System.out.println(m.groupCount());
    }
}
//3
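
The "count the opening parentheses" rule can be checked against a few edge cases. This is a small sketch with a hypothetical helper; it relies only on `Pattern.compile` and `Matcher.groupCount`.

```java
import java.util.regex.Pattern;

// Illustrative sketch; class and method names are hypothetical.
class GroupCountSketch {

    // Returns the number of capturing groups in regex.
    static int count(String regex) {
        // groupCount() depends only on the pattern, so the input can be empty.
        return Pattern.compile(regex).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(count("yties|ad|((java)|(book))")); // 3
        System.out.println(count("(?:java)|(book)"));          // 1: (?:...) does not capture
        System.out.println(count("\\(java\\)"));               // 0: escaped parentheses are literals
        System.out.println(count("[(]java"));                  // 0: '(' inside a character class is a literal
        System.out.println(count("(?<lang>java)"));            // 1: a named group still captures
    }
}
```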

4 . find()

Attempts to find the next subsequence of the input sequence that matches the pattern.

This method starts at the beginning of this matcher’s region, or, if a previous invocation of the method was successful and the matcher has not since been reset, at the first character not matched by the previous match.

If the match succeeds then more information can be obtained via the start, end, and group methods.

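This method looks for the next subsequence of the input that matches the pattern and returns true if it finds one, false otherwise.
Note: if the input contains several subsequences that match the pattern, the first call to find() matches the first of them, the second call matches the second, and so on. For example: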

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by a123 on 16/12/27.
 */
public class Test {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("java");
        Matcher m = p.matcher("The java book is  program book c");
        System.out.println(m.find());//true
        System.out.println(m.find());//false
        System.out.println(m.find());//false
    }
}
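// The first find() call matches, but the second and third do not: the input contains only one "java", and each call resumes where the previous match ended. The find() source shows the mechanism.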
public boolean find() {
        int nextSearchIndex = last;
        if (nextSearchIndex == first)
            nextSearchIndex++;

        // If next search starts before region, start it at region
        if (nextSearchIndex < from)
            nextSearchIndex = from;

        // If next search starts beyond region then it fails
        if (nextSearchIndex > to) {
            for (int i = 0; i < groups.length; i++)
                groups[i] = -1;
            return false;
        }
        return search(nextSearchIndex);
    }

boolean search(int from) {
        this.hitEnd = false;
        this.requireEnd = false;
        from        = from < 0 ? 0 : from;
        this.first  = from;
        this.oldLast = oldLast < 0 ? from : oldLast;
        for (int i = 0; i < groups.length; i++)
            groups[i] = -1;
        acceptMode = NOANCHOR;
        boolean result = parentPattern.root.match(this, from, text);
        if (!result)
            this.first = -1;
        this.oldLast = this.last;
        return result;
    }
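// As the source shows, every find() call starts searching at last, and a successful match updates last to the end of that match, so last records where the most recent match ended.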
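
Because find() keeps its position between calls, the usual idiom is a while loop that runs until find() returns false. The sketch below (hypothetical class and method names) counts matches that way and also shows that reset() makes the matcher start over from the beginning.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; class and method names are hypothetical.
class FindLoopSketch {

    // Counts the non-overlapping matches of regex in input.
    static int countMatches(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Exhausts all matches, resets the matcher, and reports whether find() succeeds again.
    static boolean findsAgainAfterReset(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            // keep going until find() returns false
        }
        m.reset(); // discards the match state; the next find() starts at the beginning
        return m.find();
    }

    public static void main(String[] args) {
        System.out.println(countMatches("The java book is java program book c", "java")); // 2
        System.out.println(findsAgainAfterReset("The java book is  program book c", "java")); // true
    }
}
```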

5 . find(int start)

Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.

If the match succeeds then more information can be obtained via the start, end, and group methods, and subsequent invocations of the find() method will start at the first character not matched by this match.

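This method overloads find() and behaves much like it. The difference is that find() starts at the beginning of the region on its first call and then remembers where each match ended, whereas find(int start) first resets the matcher and then searches from the given index, so repeated calls with the same argument give the same result. For example: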

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by a123 on 16/12/27.
 */
public class Test {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("java");
        Matcher m = p.matcher("The java book is  program book c");
        System.out.println(m.find());//true
        System.out.println(m.find(1));//true
        System.out.println(m.find(1));//true
        System.out.println(m.find(5));//false
    }
}
/**
Both find(1) calls return the same result.
find(5) returns false because no "java" starts at or after index 5.
*/
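
The Javadoc quoted above also says that a plain find() after find(int start) continues from that match. Here is a short sketch of that interaction, using a made-up input with three occurrences; the class and method names are hypothetical. It also shows that an out-of-range start index is rejected with IndexOutOfBoundsException.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; class and method names are hypothetical.
class FindFromIndexSketch {

    // Returns the match start offsets seen by find(5), then find(), then find(0).
    static int[] trace() {
        // "java" starts at indices 0, 5 and 10 in this input.
        Matcher m = Pattern.compile("java").matcher("java java java");
        int[] starts = new int[3];
        starts[0] = m.find(5) ? m.start() : -1; // resets, then searches from index 5
        starts[1] = m.find() ? m.start() : -1;  // continues after the previous match
        starts[2] = m.find(0) ? m.start() : -1; // resets again, searches from index 0
        return starts;
    }

    // Reports whether find(start) rejects the index for the given input.
    static boolean rejectsStart(String input, int start) {
        try {
            Pattern.compile("java").matcher(input).find(start);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(trace())); // [5, 10, 0]
        System.out.println(rejectsStart("java", 4));  // false: start may equal the input length
        System.out.println(rejectsStart("java", 5));  // true: start is beyond the input length
        System.out.println(rejectsStart("java", -1)); // true: start is negative
    }
}
```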

6 . String group()

Returns the input subsequence matched by the previous match.

For a matcher m with input sequence s, the expressions m.group() and s.substring(m.start(), m.end()) are equivalent.

Note that some patterns, for example a*, match the empty string. This method will return the empty string when the pattern successfully matches the empty string in the input.

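The English description above says it clearly: group() is essentially a substring operation. For a matcher m over an input string s, m.group() is equivalent to s.substring(m.start(), m.end()). Be careful with patterns such as a* that can match the empty string, since group() then returns an empty string. Calling group() repeatedly without another find() returns the same match each time, as shown below.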

import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by a123 on 16/12/27.
 */
public class Test {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("java");
        Matcher m = p.matcher("The java book is java program book c");

        if (m.find()) {
            System.out.println(m.group());//java
            System.out.println(m.group());//java
            System.out.println(m.group());//java
        }
    }
}

Here is the source. It shows group(int group) because group() simply calls group(0):

public String group(int group) {
        if (first < 0)
            throw new IllegalStateException("No match found");
        if (group < 0 || group > groupCount())
            throw new IndexOutOfBoundsException("No group " + group);
        if ((groups[group*2] == -1) || (groups[group*2+1] == -1))
            return null;
        return getSubSequence(groups[group * 2], groups[group * 2 + 1]).toString();
    }
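
The source above has two cases worth trying out: a group that did not take part in the match yields null, and group 0 can be an empty string. The sketch below is only an illustration with hypothetical class and method names.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; class and method names are hypothetical.
class GroupIndexSketch {

    // Returns group(0)..group(groupCount()) for the first match, or an empty array if none.
    static String[] groupsOfFirstMatch(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        String[] groups = new String[m.groupCount() + 1];
        for (int i = 0; i <= m.groupCount(); i++) {
            groups[i] = m.group(i); // null when group i did not participate in the match
        }
        return groups;
    }

    // Reports whether the first match of regex in input is the empty string.
    static boolean firstMatchIsEmpty(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() && m.group().isEmpty();
    }

    public static void main(String[] args) {
        // The alternative (java) is not used, so group 1 is null.
        System.out.println(Arrays.toString(groupsOfFirstMatch("a book", "(java)|(book)"))); // [book, null, book]
        // a* matches zero characters at index 0 of "bbb".
        System.out.println(firstMatchIsEmpty("bbb", "a*")); // true
    }
}
```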

7 . public boolean matches()

Attempts to match the entire region against the pattern.

If the match succeeds then more information can be obtained via the start, end, and group methods.

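This method tries to match the entire region (by default the whole input) against the pattern. It returns true only when the whole input matches, not when just part of it does, which is why the first call below returns false and the second returns true.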


import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Created by a123 on 16/12/27.
 */
public class Test {

    public static void main(String[] args) {
        Pattern p = Pattern.compile("java");
        Matcher m = p.matcher("The java book is java program book c");
        System.out.println(m.matches());//false

        Pattern p1 = Pattern.compile(".*java.*");
        Matcher m1 = p1.matcher("The java book is java program book c");
        System.out.println(m1.matches());//true
    }
}
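
The contrast with find() is easiest to see when both are run on the same input. This is a minimal sketch with hypothetical class and method names.

```java
import java.util.regex.Pattern;

// Illustrative sketch; class and method names are hypothetical.
class WholeVersusPartSketch {

    // True only if the entire input matches regex.
    static boolean wholeMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if any subsequence of input matches regex.
    static boolean partMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeMatch("The java book", "java")); // false
        System.out.println(partMatch("The java book", "java"));  // true
        System.out.println(wholeMatch("java", "java"));          // true
        System.out.println(partMatch("java", "java"));           // true
    }
}
```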