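Basic definitions: | Meaning | Explanation | Notes |
ab | CONCATENATION | a followed by b |
a|b | OR (ALTERNATION) | a or b |
a* | ZERO OR MORE | zero or more a | greedy (takes as much of the matching pattern as possible) |
a+ | ONE OR MORE | one or more a | greedy |
a? | ZERO OR ONE | zero or one a | greedy |
+? or *? or ?? | LAZY (RELUCTANT) | the opposite of greedy: takes as little as possible and extends the match only when the rest of the pattern fails | matching an HTML tag (e.g. <HTML>): before <.+> ; after <.+?> or <.*?> |
Alternative to laziness | GREEDY REPETITION OF A NEGATED CHARACTER CLASS | no backtracking needed | even better: <[^>]+> |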
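The three ways of matching a tag listed above can be compared with a small sketch. The class name, the helper firstMatch, and the sample string are illustrative assumptions, not part of the original notes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String html = "<b>bold</b>"; // made-up sample input

        // Greedy: .+ runs to the end, then backs up to the LAST '>'.
        System.out.println(firstMatch("<.+>", html));    // <b>bold</b>

        // Lazy: .+? stops at the FIRST '>' that lets the match succeed.
        System.out.println(firstMatch("<.+?>", html));   // <b>

        // Negated class: [^>]+ can never run past a '>', so no backing up is needed.
        System.out.println(firstMatch("<[^>]+>", html)); // <b>
    }
}
```

The lazy and negated-class versions return the same tag here; the negated class simply reaches it without trial and error.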
|
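Shorthand character classes: |
. | ANY CHARACTER | any character (by default, not a line terminator) |
\d | ANY DIGIT | any digit; written "\\d" in a Java string literal | [0-9] |
\D | NEG OF (\d) | any non-digit | [^0-9] |
\w | WORD CHARACTER | a letter, digit, or underscore | [a-zA-Z_0-9] |
\W | NEG OF (\w) | any non-word character | [^\w] |
\s | WHITESPACE | a whitespace character | [ \t\n\x0B\f\r] (note the leading space) |
\S | NEG OF (\s) | any non-whitespace character | [^\s] |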
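A short sketch of the shorthand classes in use. The class name, the helper matches, and the sample strings are assumptions made for illustration; each backslash is doubled because the patterns are Java string literals.

```java
import java.util.regex.Pattern;

public class ShorthandClasses {
    // True if the WHOLE input matches regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("\\d+", "2024"));    // true: digits only
        System.out.println(matches("\\w+", "user_42")); // true: letters, digits, underscore
        System.out.println(matches("\\w+", "user-42")); // false: '-' is not a word character
        System.out.println(matches("\\s", "\t"));       // true: a tab is whitespace

        String mixed = "a1 b2\tc3"; // made-up sample input
        System.out.println(mixed.replaceAll("\\D", "")); // 123 (every non-digit removed)
        System.out.println(mixed.split("\\s+").length);  // 3 (split on runs of whitespace)
    }
}
```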
|
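Boundaries: |
^ | START | anchor: start of the input (or of each line in MULTILINE mode) |
$ | END | anchor: end of the input (or of each line in MULTILINE mode) |
\b | WORD BOUNDARY | a zero-width position between a word character (\w) and a non-word character or the start/end of the input; it consumes no characters, so pay attention to how it is determined |
\B | NEG OF (\b) | any position that is not a word boundary |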
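How the boundaries are decided is easiest to see by counting matches. This is a minimal sketch; the class name, the helper count, and the sample sentence are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundary {
    // Counts the non-overlapping matches of regex in input.
    static int count(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "cat concat cat, category"; // made-up sample input

        System.out.println(count("cat", text));         // 4: every occurrence of the letters
        System.out.println(count("\\bcat\\b", text));   // 2: only the standalone words
        System.out.println(count("\\Bcat", text));      // 1: only inside "concat"
        System.out.println(count("^cat", text));        // 1: the input starts with "cat"
        System.out.println(count("cat$", text));        // 0: the input ends with "category"
    }
}
```

Note that the comma after the second "cat" counts as a non-word character, so \b still holds there.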
|
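Brackets, negation, and others: |
\ | LITERAL (ESCAPE) | escapes the following metacharacter so that it matches literally | e.g. [ , \ , ^ , - , . , ] (a " needs escaping only in the Java string literal, not in the regex) |
(ab) | "()" TREAT AS A GROUP | parentheses treat their contents as one group |
[ab] | OR (MATCHES A SINGLE CHARACTER ONLY) | square brackets mean "one of": a or b |
[^ab] | NEG OF (a|b) | a caret at the start of the brackets means "not": any one character except a or b |
[a-z] | RANGE | a hyphen inside brackets denotes a range |
&& | INTERSECTION (BOOLEAN RELATION) | inside brackets, "&&" keeps only the characters common to both sets |
[a,b] | COMMA IS A LITERAL | a comma has no special meaning inside brackets: this matches a, a comma, or b |
((?!xxxx).)* | NEGATIVE LOOKAHEAD (ADVANCED USAGE) | matches a run of characters in which xxxx never begins | the conditionals (?(A)B|C) "if A then B else C" and (?(A)B) "if A then B" exist in some regex flavors, but java.util.regex does not support them | details: http://ocpsoft.org/opensource/guide-to-regular-expressions-in-java-part-2/ |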
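The negative-lookahead idiom ((?!xxxx).)* can be tried out as follows. java.util.regex has no conditional construct, so the second helper is only a rough emulation of "if A then B else C" using a pair of lookaheads, offered as a sketch. The class and method names and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Pattern;

public class NegativeLookahead {
    // True if input (without line breaks) never contains the forbidden substring.
    // Before consuming each character, (?!...) checks that the forbidden text does not start there.
    static boolean lacks(String forbidden, String input) {
        String regex = "((?!" + Pattern.quote(forbidden) + ").)*";
        return Pattern.matches(regex, input);
    }

    // Emulated conditional: if the input starts with a digit, require digits only;
    // otherwise require lowercase letters only.
    static boolean digitsElseLetters(String input) {
        return Pattern.matches("(?=\\d)\\d+|(?!\\d)[a-z]+", input);
    }

    public static void main(String[] args) {
        System.out.println(lacks("xxxx", "abc xxx def"));  // true: only three x's in a row
        System.out.println(lacks("xxxx", "abc xxxx def")); // false: xxxx is present

        System.out.println(digitsElseLetters("123")); // true
        System.out.println(digitsElseLetters("abc")); // true
        System.out.println(digitsElseLetters("1bc")); // false: starts with a digit, so digits are required
    }
}
```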
|
Repeating a specific character: |
X{N} | MATCH X EXACTLY N TIMES | matches the single character X exactly N times |
X{N,} | AT LEAST N TIMES | at least N times |
X{N,M} | AT LEAST N TIMES BUT NO MORE THAN M TIMES | at least N times and at most M times |
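The counted forms can be checked with a few whole-string matches. This is a small sketch; the class name, the helper matches, and the phone-like sample are illustrative assumptions.

```java
import java.util.regex.Pattern;

public class CountedRepetition {
    // True if the WHOLE input matches regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("a{3}", "aaa"));     // true: exactly 3
        System.out.println(matches("a{3}", "aa"));      // false: only 2
        System.out.println(matches("a{2,}", "aaaaa"));  // true: at least 2
        System.out.println(matches("a{2,}", "a"));      // false: fewer than 2
        System.out.println(matches("a{2,4}", "aaaa"));  // true: between 2 and 4
        System.out.println(matches("a{2,4}", "aaaaa")); // false: more than 4

        // The count applies to the preceding item, which may be a group or a class.
        System.out.println(matches("(ab){2}", "abab"));            // true
        System.out.println(matches("\\d{3}-\\d{4}", "555-1234"));  // true
    }
}
```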
|
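Points to note: |
In a Java string literal, add one extra "\" when escaping | \\ |
[(ab)(ba)] | "[]" WILL NOT WORK FOR STRINGS | square brackets match single characters, not strings |
[(ab)(bc)] = [()abc] | SAME THING | same as above: it matches any one of the characters ( ) a b c |
ab|ba or (ab|ba) | THIS IS THE PROPER USE FOR STRINGS | the correct way to match alternative strings |
[^a-z] | NOT a TO z |
[^abc] | NOT a OR b OR c |
"\\" -> "\\\\" | FOR JAVA | the regex \\ (one literal backslash) is written "\\\\" in Java source |
[a-z&&[def]] | INTERSECTION (BE CAREFUL) | in a-z and also one of d, e, f; in effect just d|e|f |
[a-d[m-p]] | UNION | a through d, or m through p |
[a-z&&[^bc]] | SUBTRACTION | a through z, except b and c |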
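The notes on brackets versus alternation, and on intersection, union, and subtraction, can be confirmed with whole-string matches. This is a minimal sketch; the class name and the helper matches are illustrative assumptions.

```java
import java.util.regex.Pattern;

public class CharacterClassSets {
    // True if the WHOLE input matches regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Brackets match ONE character, so they cannot express the strings "ab" or "ba".
        System.out.println(matches("[(ab)(ba)]", "ab")); // false: two characters
        System.out.println(matches("[(ab)(ba)]", "("));  // true: '(' is just a member of the set
        System.out.println(matches("ab|ba", "ba"));      // true: alternation handles strings

        // Intersection: in a-z AND in {d, e, f}.
        System.out.println(matches("[a-z&&[def]]", "e")); // true
        System.out.println(matches("[a-z&&[def]]", "a")); // false

        // Union: a-d OR m-p.
        System.out.println(matches("[a-d[m-p]]", "n"));   // true
        System.out.println(matches("[a-d[m-p]]", "f"));   // false

        // Subtraction: a-z except b and c.
        System.out.println(matches("[a-z&&[^bc]]", "d")); // true
        System.out.println(matches("[a-z&&[^bc]]", "b")); // false
    }
}
```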
|
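The Java escaping rule is a frequent source of mistakes, so here is a short sketch of it. The class name, the helper methods, and the sample path are assumptions made for illustration; Pattern.quote is shown as one way to avoid manual escaping.

```java
import java.util.regex.Pattern;

public class JavaEscaping {
    // Splits on a literal backslash: the regex is \\ and the Java literal is "\\\\".
    static String[] splitOnBackslash(String input) {
        return input.split("\\\\");
    }

    // Matches name.extension with a LITERAL dot: the regex is \w+\.\w+
    static boolean hasExtension(String input) {
        return Pattern.matches("\\w+\\.\\w+", input);
    }

    public static void main(String[] args) {
        String path = "C:\\temp\\file.txt"; // the actual characters are C:\temp\file.txt
        System.out.println(splitOnBackslash(path).length); // 3: "C:", "temp", "file.txt"

        System.out.println(hasExtension("file.txt")); // true
        System.out.println(hasExtension("fileXtxt")); // false: the dot must be a real dot

        // Without the escape, '.' matches any character.
        System.out.println(Pattern.matches("\\w+.\\w+", "fileXtxt")); // true

        // Pattern.quote turns arbitrary text into a literal pattern.
        System.out.println(Pattern.matches(Pattern.quote("a.b"), "a.b")); // true
        System.out.println(Pattern.matches(Pattern.quote("a.b"), "axb")); // false
    }
}
```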