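What is a regular expression
A regular expression is a special string that describes a pattern matching a set of strings. It can be used to match, replace, and split strings.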
Regular expression syntax
Metacharacters, used to match strings that have a particular feature
Metacharacter | Description |
---|---|
\| | Find a match for any one of the patterns separated by \|, as in: cat\|dog\|fish |
. | Find just one instance of any character |
^ | Finds a match at the beginning of a string, as in: ^Hello |
$ | Finds a match at the end of the string as in: World$ |
\d | Find a digit |
\s | Find a whitespace character |
\b | Find a match at the beginning of a word like this: \bWORD, or at the end of a word like this: WORD\b |
\uxxxx | Find the Unicode character specified by the hexadecimal number xxxx |
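
The metacharacters in the table above can be tried out with a minimal sketch like the one below. The class name `MetacharacterDemo` and the helper `fullMatch` are hypothetical names introduced only for illustration; the sketch relies on `String.matches` and `String.replaceAll` from the Java standard library, and every backslash is doubled because the patterns are Java string literals.

```java
class MetacharacterDemo {
    // Hypothetical helper: true only when the WHOLE input matches the pattern,
    // which is how String.matches behaves.
    static boolean fullMatch(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // Alternation: any one of the listed words
        System.out.println(fullMatch("dog", "cat|dog|fish"));
        // Dot: exactly one character of any kind
        System.out.println(fullMatch("x", "."));
        // Digit and whitespace classes
        System.out.println(fullMatch("7", "\\d"));
        System.out.println(fullMatch(" ", "\\s"));
        // Unicode escape handled by the regex engine: hexadecimal 0041 is 'A'
        System.out.println(fullMatch("A", "\\u0041"));

        // Anchors and word boundaries are easier to see with a replacement
        System.out.println("Hello World".replaceAll("^Hello", "Bye"));
        System.out.println("Hello World".replaceAll("World$", "All"));
        // Only the standalone word "cat" starts at a word boundary
        System.out.println("cat concat".replaceAll("\\bcat", "dog"));
    }
}
```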
Brackets, used to find a character within a given set or range
Expression | Description |
---|---|
[abc] | Find one character from the options between the brackets |
[^abc] | Find one character NOT between the brackets |
[0-9] | Find one character from the range 0 to 9 |
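
As a rough illustration of the bracket expressions above, the following sketch replaces every matching character with `#` so the effect of each expression is visible. `BracketDemo` and `mask` are hypothetical names chosen for this example.

```java
class BracketDemo {
    // Hypothetical helper: replaces every character matched by regex with '#'.
    static String mask(String input, String regex) {
        return input.replaceAll(regex, "#");
    }

    public static void main(String[] args) {
        // [abc] matches a, b or c
        System.out.println(mask("abcdef", "[abc]"));   // ###def
        // [^abc] matches any character except a, b or c
        System.out.println(mask("abcdef", "[^abc]"));  // abc###
        // [0-9] matches any single digit
        System.out.println(mask("a1b2c3", "[0-9]"));   // a#b#c#
    }
}
```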
Quantifiers, used to match how many times a character occurs
Quantifier | Description |
---|---|
n+ | Matches any string that contains at least one n |
n* | Matches any string that contains zero or more occurrences of n |
n? | Matches any string that contains zero or one occurrences of n |
n{x} | Matches any string that contains a sequence of x n's |
n{x,y} | Matches any string that contains a sequence of x to y n's |
n{x,} | Matches any string that contains a sequence of at least x n's |
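
The quantifiers above can be checked with a small sketch such as the following. Note that it uses `String.matches`, which requires the entire string to match, so the results describe whole strings rather than strings that merely contain a match. `QuantifierDemo` and `fullMatch` are hypothetical names.

```java
class QuantifierDemo {
    // Hypothetical helper wrapping String.matches (whole-string match).
    static boolean fullMatch(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("aaa", "a+"));       // true: one or more
        System.out.println(fullMatch("", "a+"));          // false: needs at least one
        System.out.println(fullMatch("", "a*"));          // true: zero or more
        System.out.println(fullMatch("a", "a?"));         // true: zero or one
        System.out.println(fullMatch("aa", "a?"));        // false: more than one
        System.out.println(fullMatch("aaa", "a{3}"));     // true: exactly three
        System.out.println(fullMatch("aaaa", "a{2,3}"));  // false: four is outside 2 to 3
        System.out.println(fullMatch("aaaa", "a{2,}"));   // true: at least two
    }
}
```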
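Matching strings
Use the matches method of String to test a string against a pattern. It returns true only if the entire string matches, and false otherwise.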
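
A minimal sketch of whole-string matching follows. The phone-number format, the class `MatchesDemo` and the method `looksLikePhone` are assumptions made up for this example and do not come from any library.

```java
class MatchesDemo {
    // Hypothetical validator: three digits, a hyphen, then four digits.
    static boolean looksLikePhone(String input) {
        return input.matches("\\d{3}-\\d{4}");
    }

    public static void main(String[] args) {
        System.out.println(looksLikePhone("555-1234"));       // true
        // The pattern occurs inside the text, but the whole string does not match
        System.out.println(looksLikePhone("call 555-1234"));  // false
    }
}
```

Because `matches` tests the entire string, a pattern that should match anywhere in the input has to allow for the surrounding text itself, for example `.*\\d{3}-\\d{4}.*`.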
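Replacing and splitting strings
Use the replaceAll method of String to replace every matching substring. The similar replaceFirst method replaces only the first match.
Use the split method to break a string into substrings around each matching delimiter.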
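
The three methods can be compared side by side in a short sketch. The class name `ReplaceSplitDemo` and the sample strings are chosen only for illustration.

```java
import java.util.Arrays;

class ReplaceSplitDemo {
    public static void main(String[] args) {
        String text = "a1b22c333";

        // replaceAll replaces every run of digits
        System.out.println(text.replaceAll("\\d+", "#"));    // a#b#c#
        // replaceFirst replaces only the first run of digits
        System.out.println(text.replaceFirst("\\d+", "#"));  // a#b22c333

        // split treats each comma or semicolon as a delimiter
        String[] parts = "Java,Python;Go".split("[,;]");
        System.out.println(Arrays.toString(parts));          // [Java, Python, Go]
    }
}
```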
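Notes
The metacharacters above are shown in their raw regular expression form. In a Java string literal, a backslash must itself be escaped as \\, so \d is written "\\d".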
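By default, quantifiers are greedy: they match as many characters as possible. For example,
System.out.println("Jaaavaa".replaceFirst("a+", "R"));
matches aaa and prints JRvaa. Adding ? directly after a quantifier makes it lazy, so it matches as few characters as possible. For example,
System.out.println("Jaaavaa".replaceFirst("a+?", "R"));
matches only the first a and prints JRaavaa.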
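
Both notes can be seen together in one more sketch, using a tag-like string where the difference between greedy and lazy matching is more visible. The class name `NotesDemo`, the helper `stripFirst` and the sample input are hypothetical.

```java
import java.util.Arrays;

class NotesDemo {
    // Hypothetical helper: removes the first match of regex from input.
    static String stripFirst(String input, String regex) {
        return input.replaceFirst(regex, "");
    }

    public static void main(String[] args) {
        String html = "<b>x</b>";

        // Greedy: .+ runs to the LAST '>' so the whole string is removed
        System.out.println(stripFirst(html, "<.+>").isEmpty());  // true
        // Lazy: .+? stops at the FIRST '>' so only "<b>" is removed
        System.out.println(stripFirst(html, "<.+?>"));           // x</b>

        // Escaping: a literal dot needs a backslash in the regex,
        // which is written as two backslashes in a Java string literal
        System.out.println(Arrays.toString("a.b.c".split("\\.")));  // [a, b, c]
    }
}
```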