1.4.10 Parentheses and Backreferences
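Uses of parentheses:
1. Limiting the scope of alternation;
2. Grouping several characters into a single unit, so that a quantifier such as the question mark or star applies to the whole unit;
3. Backreferences, written as the metacharacter sequences 「\1」, 「\2」, and so on, which match the text that the corresponding set of parentheses actually matched.
For example, the following command finds doubled words such as "the the":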
% egrep -i '\<([a-z]+) +\1\>' files…
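The same idea can be sketched in Python. This is only an illustration, not the egrep command itself: Python's `re` module does not support 「\<」 and 「\>」, so the sketch assumes 「\b」 as a stand-in for both word boundaries, and `find_doubled_words` is a hypothetical helper name.

```python
import re

# \b replaces egrep's \< and \> (an assumption: Python has no separate
# start-of-word and end-of-word anchors). re.IGNORECASE plays the role of -i.
DOUBLED_WORD = re.compile(r'\b([a-z]+) +\1\b', re.IGNORECASE)

def find_doubled_words(text):
    """Return the first word of every doubled-word pair found in text."""
    return [m.group(1) for m in DOUBLED_WORD.finditer(text)]

print(find_doubled_words("This is the the test"))
```

The backreference 「\1」 repeats whatever text the parentheses matched, not the pattern inside them, which is why "is the" is not reported but "the the" is.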
1.4.11 The Great Escape
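Escape: outside a character class, a backslash placed before a metacharacter removes its special meaning, so that it is treated as an ordinary character.
For example:
「\.」: an escaped dot, which matches only a literal period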
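A small Python sketch of the difference between the escaped and unescaped dot. The sample strings and variable names are made up for illustration.

```python
import re

unescaped = re.compile(r'1.5')   # dot is a metacharacter: any one character
escaped = re.compile(r'1\.5')    # dot is escaped: only a literal period

# The unescaped dot accepts "x" between 1 and 5; the escaped one does not.
unescaped_hits_x = unescaped.search('1x5') is not None
escaped_hits_x = escaped.search('1x5') is not None
escaped_hits_dot = escaped.search('version 1.5') is not None

print(unescaped_hits_x, escaped_hits_x, escaped_hits_dot)
```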
1.5 Expanding the Foundation
1.5.1 A Few More Examples
1.5.1.1 A string within double quotes
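A simple solution to matching a string within double quotes might be: 「"[^"]*"」
The quotes at each end match the opening and closing quotes of the string. The text between them may contain any character except a double quote, so we use 「[^"]」 to match any single character other than a double quote, and 「*」 to allow any number of such characters between the two quotes.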
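As a quick check of this expression, here is a minimal Python sketch. `find_quoted` is a hypothetical helper name, and the sample sentence is invented.

```python
import re

# Opening quote, any number of non-quote characters, closing quote.
QUOTED = re.compile(r'"[^"]*"')

def find_quoted(text):
    """Return every double-quoted string in text, quotes included."""
    return QUOTED.findall(text)

print(find_quoted('He said "hello" and then "bye".'))
```

Because 「*」 also allows zero characters, an empty pair of quotes is matched as well.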
1.5.2 Regular Expression Nomenclature
1.5.2.1 Regex
"Regular expression" is commonly shortened to "regex".
1.5.2.2 Matching
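Saying that a regex "matches" a string really means that it matches within the string. Strictly speaking, the regex 「a」 does not match cat; it matches the a in cat.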
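In Python this distinction corresponds roughly to `re.search`, which looks for a match anywhere in the string, versus `re.fullmatch`, which requires the regex to account for the entire string. A short sketch:

```python
import re

found_within = re.search(r'a', 'cat')     # matches the a inside cat
whole_string = re.fullmatch(r'a', 'cat')  # the regex cannot cover all of cat

print(found_within.span())  # (1, 2)
print(whole_string)         # None
```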
1.5.2.3 Metacharacter
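Whether a character acts as a metacharacter depends on where it appears: it has its special meaning only when it is outside a character class and not escaped.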
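A brief Python sketch using the star as the example character. The variable names are illustrative only.

```python
import re

# Outside a class and unescaped: * is a quantifier applied to the a.
as_quantifier = re.fullmatch(r'a*', 'aaa')

# Inside a character class: * is an ordinary character.
in_class = re.fullmatch(r'a[*]', 'a*')

# Escaped with a backslash: * is an ordinary character.
escaped = re.fullmatch(r'a\*', 'a*')

# The literal forms no longer repeat the a.
literal_on_repeats = re.fullmatch(r'a\*', 'aaa')

print(bool(as_quantifier), bool(in_class), bool(escaped), bool(literal_on_repeats))
```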
1.5.2.4 Flavor
We mainly discuss the Perl flavor.
1.5.2.5 Subexpression
A "subexpression" is a part of a larger regular expression, usually the expression inside a pair of parentheses or one of the alternatives separated by 「|」.
1.5.2.6 Character
Here, a character means a byte interpreted in the ASCII encoding.
1.5.3 Summary
A summary of egrep's metacharacters.
Table 1-3. Egrep Metacharacter Summary
Metacharacter | Name | Matches |
Items to Match a Single Character | | |
. | dot | Matches any one character |
[…] | character class | Matches any one character listed |
[^…] | negated character class | Matches any one character not listed |
\char | escaped character | When char is a metacharacter, or the escaped combination is not otherwise special, matches the literal char |
Items Appended to Provide “Counting”: The Quantifiers | ||
? | question | One allowed, but it is optional |
* | star | Any number allowed, but all are optional |
+ | plus | At least one required; additional are optional |
{min, max} | specified range† | Min required, max allowed |
Items That Match a Position | ||
^ | caret | Matches the position at the start of the line |
$ | dollar | Matches the position at the end of the line |
\< | word boundary† | Matches the position at the start of a word |
\> | word boundary† | Matches the position at the end of a word |
Other | ||
「|」 | alternation | Matches either expression it separates |
(…) | parentheses | Limits scope of alternation, provides grouping for the quantifiers, and "captures" for backreferences |
\1, \2, ... | backreference† | Matches text previously matched within first, second, etc., set of parentheses. |
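The quantifier rows of the table can be tried out with a short Python sketch. The sample patterns are invented for illustration, and Python's `re` is assumed here rather than egrep; note that Python expects the range written without a space, as `{2,3}`.

```python
import re

# ^ and $ anchor each pattern so the quantifier must cover the whole string.
optional = re.compile(r'^colou?r$')    # ?  one u allowed, but optional
any_number = re.compile(r'^ab*c$')     # *  any number of b, including none
at_least_one = re.compile(r'^ab+c$')   # +  at least one b required
ranged = re.compile(r'^a{2,3}$')       # {min,max}  two required, three allowed

def accepts(pattern, text):
    """Return True if the compiled pattern matches text."""
    return pattern.search(text) is not None

print(accepts(optional, 'color'), accepts(optional, 'colour'))
print(accepts(any_number, 'ac'), accepts(at_least_one, 'ac'))
print(accepts(ranged, 'a'), accepts(ranged, 'aaa'))
```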