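1. What are Character Classes?
If you read the documentation of the Pattern class, you will see a table summarizing the regular-expression constructs that Pattern supports. The excerpt below is the part of that table that covers character classes.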
Construct | Description | Explanation |
---|---|---|
[abc] | a, b, or c (simple class) | Matches one character that is a, b, or c |
[^abc] | Any character except a, b, or c (negation) | Matches any one character other than a, b, and c |
[a-zA-Z] | a through z, or A through Z, inclusive (range) | Matches one character from a to z or from A to Z, endpoints included |
[a-d[m-p]] | a through d, or m through p: [a-dm-p] (union) | Matches one character from a to d or from m to p |
[a-z&&[def]] | d, e, or f (intersection) | The character must be in a to z and must also be d, e, or f, so only d, e, and f remain |
[a-z&&[^bc]] | a through z, except for b and c: [ad-z] (subtraction) | Matches one character from a to z that is not b or c |
[a-z&&[^m-p]] | a through z, and not m through p: [a-lq-z] (subtraction) | Matches one character from a to z that is not between m and p |
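A note on the word "classes": do not confuse these classes with classes in the Java language. Here "class" has its regular-expression meaning. A character class is a set of characters enclosed in square brackets, and it matches any single character of the input string that belongs to that set.
Below we experiment with each of these six types in turn: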
2 Experiments
2.1 Simple classes
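
A minimal sketch of a simple class. The class name `SimpleClassDemo`, the helper `matches`, and the sample words are illustrative assumptions, not part of the original article. The pattern `[bcr]at` accepts a word only when its first letter is one of `b`, `c`, or `r`.

```java
import java.util.regex.Pattern;

class SimpleClassDemo {
    // Returns true when the entire input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String regex = "[bcr]at"; // first character must be b, c, or r
        for (String input : new String[] {"bat", "cat", "rat", "hat"}) {
            // bat, cat, and rat print true; hat prints false
            System.out.println(input + " -> " + matches(regex, input));
        }
    }
}
```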
2.2 Negation
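
A minimal sketch of negation, reusing the same illustrative words as assumptions (the class name `NegationDemo` is hypothetical). Putting `^` right after the opening bracket inverts the set, so `[^bcr]at` accepts any first character except `b`, `c`, and `r`.

```java
import java.util.regex.Pattern;

class NegationDemo {
    // Returns true when the entire input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String regex = "[^bcr]at"; // first character must NOT be b, c, or r
        for (String input : new String[] {"bat", "cat", "rat", "hat"}) {
            // bat, cat, and rat print false; hat prints true
            System.out.println(input + " -> " + matches(regex, input));
        }
    }
}
```

Note that a negated class still consumes exactly one character, so the two-character input `at` does not match `[^bcr]at`.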
2.3 Ranges
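
A minimal sketch of ranges. The class name `RangeDemo` and the inputs are assumptions chosen for illustration. A hyphen between two characters covers everything from the first to the second, both endpoints included, and several ranges can sit side by side in one class.

```java
import java.util.regex.Pattern;

class RangeDemo {
    // Returns true when the entire input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // One range: the last character must be a digit from 1 to 5.
        System.out.println(matches("foo[1-5]", "foo1")); // true
        System.out.println(matches("foo[1-5]", "foo5")); // true
        System.out.println(matches("foo[1-5]", "foo6")); // false

        // Two ranges in one class: any single ASCII letter.
        System.out.println(matches("[a-zA-Z]", "q")); // true
        System.out.println(matches("[a-zA-Z]", "Q")); // true
        System.out.println(matches("[a-zA-Z]", "7")); // false
    }
}
```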
2.4 Unions
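
A minimal sketch of a union, using the `[a-d[m-p]]` construct from the table. The class name `UnionDemo` and the helper `keepMatches` are hypothetical; the helper keeps only the characters of the input that the class accepts, which makes the resulting set easy to see.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnionDemo {
    // Concatenates every substring of the input that the regex matches.
    static String keepMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder kept = new StringBuilder();
        while (m.find()) {
            kept.append(m.group());
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String input = "abcdefmnopqr";
        // Nested class: a to d, or m to p.
        System.out.println(keepMatches("[a-d[m-p]]", input)); // abcdmnop
        // The flat form from the table's description behaves the same way.
        System.out.println(keepMatches("[a-dm-p]", input));   // abcdmnop
    }
}
```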
2.5 Intersections
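
A minimal sketch of an intersection, using `[a-z&&[def]]` from the table. The class name `IntersectionDemo` and the helper `keepMatches` are assumptions for illustration. The `&&` operator keeps only characters that belong to both sides.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntersectionDemo {
    // Concatenates every substring of the input that the regex matches.
    static String keepMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder kept = new StringBuilder();
        while (m.find()) {
            kept.append(m.group());
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // Must be in a to z AND be one of d, e, f.
        System.out.println(keepMatches("[a-z&&[def]]", "abcdefgh")); // def
        // Uppercase D, E, F are outside a to z, so nothing is kept.
        System.out.println(keepMatches("[a-z&&[def]]", "DEF").isEmpty()); // true
    }
}
```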
2.6 Subtraction
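
A minimal sketch of subtraction, covering both subtraction rows of the table. The class name `SubtractionDemo` and the helper `keepMatches` are hypothetical. Subtraction is an intersection with a negated class: `[a-z&&[^bc]]` means "in a to z, and not b or c".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubtractionDemo {
    // Concatenates every substring of the input that the regex matches.
    static String keepMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder kept = new StringBuilder();
        while (m.find()) {
            kept.append(m.group());
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // a to z without b and c, equivalent to [ad-z].
        System.out.println(keepMatches("[a-z&&[^bc]]", "abcde"));     // ade
        // a to z without m to p, equivalent to [a-lq-z].
        System.out.println(keepMatches("[a-z&&[^m-p]]", "klmnopqr")); // klqr
    }
}
```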
Previous section: [Java Regular Expressions Series] 3. Strings