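0. Regular expressions are a powerful tool for string processing: they are used to match, find, and replace characters.
1. It takes two backslashes to represent one literal \. The backslash is the escape character, so "\." is what stands for the literal "." that we see.
\\ | The backslash character |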
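As a minimal sketch of the escaping rule above, assuming the patterns are written as Java string literals (the tables in these notes follow `java.util.regex.Pattern`): each backslash of the regex has to be doubled in source code. The class and method names here are illustrative only.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // The regex \. (a literal dot) is written "\\." in Java source.
    static boolean hasLiteralDot(String s) {
        return Pattern.compile("\\.").matcher(s).find();
    }

    // The regex \\ (a literal backslash) is written "\\\\" in Java source.
    static boolean hasBackslash(String s) {
        return Pattern.compile("\\\\").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(hasLiteralDot("a.b"));     // true
        System.out.println(hasLiteralDot("ab"));      // false
        System.out.println(hasBackslash("C:\\dir"));  // true
    }
}
```

An unescaped "." would match any character, which is why the second call only returns false when the dot is escaped.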
2. Escapes for some special (control) characters
\t | The tab character ('\u0009') |
\n | The newline (line feed) character ('\u000A') |
\r | The carriage-return character ('\u000D') |
\f | The form-feed character ('\u000C') |
\a | The alert (bell) character ('\u0007') |
\e | The escape character ('\u001B') |
\cx | The control character corresponding to x |
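The escapes in this table can be tried out with a small sketch like the following. It assumes Java string literals (so the regex `\t` is written `"\\t"`); the helper names are hypothetical.

```java
class ControlCharDemo {
    // Split a tab-separated line using the regex escape \t.
    static String[] splitOnTab(String line) {
        return line.split("\\t");
    }

    // Replace CR LF or a lone CR with a single line feed.
    // The regex is \r\n? : a carriage return, optionally followed by a line feed.
    static String normalizeNewlines(String text) {
        return text.replaceAll("\\r\\n?", "\n");
    }

    public static void main(String[] args) {
        System.out.println(splitOnTab("a\tb\tc").length);              // 3
        System.out.println(normalizeNewlines("x\r\ny").equals("x\ny")); // true
    }
}
```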
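3. Character classes (ranges): a class matches exactly one character, and classes can be combined with union (OR) and intersection (AND).
Range examples:
1) [abc] one of a, b, or c; [^abc] any character that is not a, b, or c
2) [a-zA-Z] == [a-z]|[A-Z] == [a-z[A-Z]] one upper- or lower-case letter
3) [A-Z&&[^RFG]] one of A-Z, excluding R, F, and G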
Character classes | |
[abc] | a, b, or c (simple class) |
[^abc] | Any character except a, b, or c (negation) |
[a-zA-Z] | a through z or A through Z, inclusive (range) |
[a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) |
[a-z&&[def]] | d, e, or f (intersection) |
[a-z&&[^bc]] | a through z, except for b and c: [ad-z] (subtraction) |
[a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z] (subtraction) |
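The union, intersection, and subtraction forms in the table can be checked one character at a time. This is a small sketch assuming Java's `Pattern.matches`, which requires the whole input to match; the class and method names are illustrative.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // True when the single-character string s is a member of the class.
    static boolean inClass(String charClass, String s) {
        return Pattern.matches(charClass, s);
    }

    public static void main(String[] args) {
        System.out.println(inClass("[a-d[m-p]]", "n"));     // true  (union)
        System.out.println(inClass("[a-z&&[def]]", "a"));   // false (intersection)
        System.out.println(inClass("[A-Z&&[^RFG]]", "R"));  // false (subtraction)
    }
}
```

Because a class always stands for exactly one character, `inClass("[abc]", "ab")` is false: the two-character input cannot be matched by a single class.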
Predefined character classes | |
. | Any character (may or may not match line terminators) |
\d (d for digit) | A digit: [0-9] |
\D | A non-digit: [^0-9] |
\s (s for space) | A whitespace character: [ \t\n\x0B\f\r] |
\S | A non-whitespace character: [^\s] |
\w (w for word) | A word character: [a-zA-Z_0-9] (letters, digits, and underscore) |
\W | A non-word character: [^\w] |
Rule of thumb: the lowercase form matches the class; the uppercase form matches its negation.
\\d{2,5} means 2 to 5 digits, with both bounds included (the double backslash is the Java string-literal form of \d).
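A short sketch of the predefined classes and of the inclusive bounds of `\d{2,5}`, again assuming Java string literals. `PredefinedDemo` and its method are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

class PredefinedDemo {
    // True when the entire input matches the regex.
    static boolean full(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    public static void main(String[] args) {
        System.out.println(full("\\d{2,5}", "12"));      // true  (lower bound included)
        System.out.println(full("\\d{2,5}", "12345"));   // true  (upper bound included)
        System.out.println(full("\\d{2,5}", "123456"));  // false (six digits)
        System.out.println(full("\\w+", "user_01"));     // true
        System.out.println("a b\tc".split("\\s+").length); // 3
    }
}
```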
4. Common quantifiers: how many times an element may occur
Greedy quantifiers | |
---|---|
X? | X, once or not at all (0 or 1) |
X* | X, zero or more times |
X+ | X, one or more times |
X{n} | X, exactly n times |
X{n,} | X, at least n times (times >= n) |
X{n,m} | X, at least n but not more than m times, i.e. [n, m] with both bounds included |
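To make the table concrete, here is a minimal sketch of the quantifiers, including the greedy behavior (a greedy quantifier takes as much input as it can while still letting the overall match succeed). The helper names are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // True when the entire input matches the regex.
    static boolean full(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    // Text of the first match found in s, or null when there is none.
    static String firstMatch(String regex, String s) {
        Matcher m = Pattern.compile(regex).matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(full("colou?r", "color"));        // true
        System.out.println(full("a{2,3}", "aaa"));           // true
        System.out.println(full("a{2,3}", "aaaa"));          // false
        System.out.println(firstMatch("a{2,}", "caaaat"));   // aaaa
        System.out.println(firstMatch("<.+>", "<b>x</b>"));  // <b>x</b>
    }
}
```

The last call shows greediness: `.+` runs to the final `>` rather than stopping at the first one.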
5. Boundary matchers
Boundary matchers | |
---|---|
^ | The beginning of a line, e.g. "^a.*" matches a string that starts with a |
$ | The end of a line, e.g. ".*ir$" matches a string that ends with ir |
\b | A word boundary |
\B | A non-word boundary |
\A | The beginning of the input |
\G | The end of the previous match |
\Z | The end of the input but for the final terminator, if any |
\z | The end of the input |
Note: ^ means different things inside and outside []. Inside [] it means negation (NOT); outside [] it means the beginning.
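The boundary matchers can be exercised with a sketch like the one below. One detail worth noting, based on the documented behavior of Java's `Pattern`: by default `^` and `$` anchor to the whole input, and they only match at each line when `Pattern.MULTILINE` is set. The class and method names are illustrative.

```java
import java.util.regex.Pattern;

class BoundaryDemo {
    // True when the compiled pattern matches somewhere in s.
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    static boolean found(String regex, String s) {
        return found(Pattern.compile(regex), s);
    }

    public static void main(String[] args) {
        System.out.println(found("^a.*", "apple"));            // true
        System.out.println(found(".*ir$", "stairs"));          // false
        System.out.println(found("\\bcat\\b", "a cat sat"));   // true
        System.out.println(found("\\bcat\\b", "concatenate")); // false
        // Without MULTILINE, ^ matches only at the start of the input.
        System.out.println(found("^b", "a\nb"));               // false
        System.out.println(found(Pattern.compile("^b", Pattern.MULTILINE), "a\nb")); // true
        // Inside brackets, ^ negates the class instead of anchoring.
        System.out.println(found("[^a]", "aaa"));              // false
    }
}
```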
6. Plain parentheses () denote a (capturing) group.
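As an illustration of grouping, the sketch below pulls the parts of a date out of capturing groups with `Matcher.group(int)`. The date format and the `parseDate` helper are assumptions made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Three capturing groups: year, month, day. Group 0 is the whole match.
    private static final Pattern DATE = Pattern.compile("(\\d{4})-(\\d{2})-(\\d{2})");

    // Returns {year, month, day}, or null when the input is not in yyyy-MM-dd form.
    static String[] parseDate(String s) {
        Matcher m = DATE.matcher(s);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = parseDate("2024-03-15");
        System.out.println(parts[0] + " / " + parts[1] + " / " + parts[2]); // 2024 / 03 / 15
    }
}
```

Groups are numbered by the position of their opening parenthesis, counting from 1.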
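7. Constructs such as (?=a) are non-capturing: they are zero-width boundary checks, so the text they test is not included in the match.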
Special constructs (non-capturing) | |
---|---|
(?:X) | X, as a non-capturing group |
(?idmsuxU-idmsuxU) | Nothing, but turns match flags i d m s u x U on - off |
(?idmsux-idmsux:X) | X, as a non-capturing group with the given flags i d m s u x on - off |
(?=X) | X, via zero-width positive lookahead |
(?!X) | X, via zero-width negative lookahead |
(?<=X) | X, via zero-width positive lookbehind |
(?<!X) | X, via zero-width negative lookbehind |
(?>X) | X, as an independent, non-capturing group |
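A minimal sketch of several constructs from this table, assuming Java's `java.util.regex`. It shows that lookahead and lookbehind check context without including it in the match, and that `(?:X)` adds no capturing group. The names `LookaroundDemo` and `firstMatch` are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Text of the first match found in s, or null when there is none.
    static String firstMatch(String regex, String s) {
        Matcher m = Pattern.compile(regex).matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Positive lookahead: digits followed by "px"; "px" is not part of the match.
        System.out.println(firstMatch("\\d+(?=px)", "width: 100px"));  // 100
        // Positive lookbehind: digits preceded by a dollar sign.
        System.out.println(firstMatch("(?<=\\$)\\d+", "cost $42"));    // 42
        // Negative lookahead: "foo" not followed by "bar".
        System.out.println(firstMatch("foo(?!bar)", "foobar"));        // null
        // A non-capturing group does not count as a capturing group.
        System.out.println(Pattern.compile("(?:ab)+").matcher("abab").groupCount()); // 0
        // Inline flag i turns on case-insensitive matching.
        System.out.println(Pattern.matches("(?i)hello", "HELLO"));     // true
    }
}
```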