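Regular Expressions
1: What is a regular expression
A regular expression is a formula for operating on strings. Predefined special characters, and combinations of them, form a "pattern string" that expresses a filtering rule for text. A special syntax denotes character classes, quantifiers, and positional relationships; this syntax, combined with ordinary characters, describes a pattern, and that pattern is the regular expression.
Given a regular expression and a string, we can:
- check whether the string satisfies the expression's filtering rule (this is called "matching");
- extract the specific parts of the string that we want.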
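The two uses above can be sketched with grep. This is a minimal illustration, not part of the original walkthrough: the sample string and the variable names `text`, `matched`, and `id` are made up for the example. The `-o` option is not in POSIX, but both GNU and BSD grep provide it.

```shell
# Hypothetical sample string, not taken from the test file used later.
text='order id: 2023-0457, status: shipped'

# 1) Matching: does the string contain four digits, a hyphen, four digits?
#    -q prints nothing and reports the result through the exit status.
if printf '%s\n' "$text" | grep -q '[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]'; then
    matched=yes
else
    matched=no
fi
echo "matched: $matched"

# 2) Extraction: -o prints only the part of the line that matched.
id=$(printf '%s\n' "$text" | grep -o '[0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]')
echo "extracted: $id"
```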
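1.1: Characteristics of regular expressions
- Very flexible, logical, and powerful;
- Complex string handling can be expressed quickly and very concisely;
- Fairly cryptic and hard to read for newcomers.
Because regular expressions mainly operate on text, they are used in all kinds of text editors, from the small, well-known editor EditPlus to large ones such as Microsoft Word and Visual Studio, all of which can use regular expressions to process text.
1.2: Structure of a regular expression
1.3: Basic regular expressions
Regular expression syntax is divided by strictness and functionality into basic regular expressions and extended regular expressions. Basic regular expressions are the most fundamental part of the commonly used syntax. Among the common Linux text-processing tools, grep and sed support basic regular expressions, while egrep and awk support extended regular expressions.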
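One visible difference between the two syntaxes is the "+" character. The sketch below assumes a grep that accepts `-E` (the extended syntax that egrep uses); the sample lines and the variable names `sample`, `bre`, and `ere` are invented for illustration.

```shell
# Hypothetical sample lines for illustration.
sample=$(printf '%s\n' 'gd' 'god' 'good' 'go+d')

# Basic syntax (plain grep): "+" is an ordinary character,
# so only the literal text "go+d" matches.
bre=$(printf '%s\n' "$sample" | grep 'go+d')
echo "basic:"
echo "$bre"

# Extended syntax (grep -E): "+" means "one or more of the preceding item",
# so "god" and "good" match, and the literal "go+d" does not.
ere=$(printf '%s\n' "$sample" | grep -E 'go+d')
echo "extended:"
echo "$ere"
```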
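1.4: Basic regular expressions: the grep command
- "-n" displays line numbers
- "-i" ignores case
- After the command runs, the matched characters are shown in red (on systems where grep's color output is enabled)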
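A small sketch of the two options on three made-up lines (the sample text and the variable names are assumptions for this example, not the tutorial's test file):

```shell
# Hypothetical three-line sample.
sample=$(printf '%s\n' 'The first line' 'nothing here' 'on the third line')

# -n prefixes each matching line with its line number.
# Lowercase "the" occurs only on line 3.
with_n=$(printf '%s\n' "$sample" | grep -n 'the')
echo "$with_n"

# Adding -i ignores case, so "The" on line 1 matches as well.
with_in=$(printf '%s\n' "$sample" | grep -in 'the')
echo "$with_in"
```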
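We'll use the configuration directory of the installed httpd as an example.
Copy the httpd configuration to /opt and rename it httpd.txt.
1.41: Finding specific characters
# Load a text file for testing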
[root@server3 opt]# cat a.txt
A man may usually be known by the books he reads as well as by the company he keeps; for there is
a companionship of books as well as of men; and one should always live in the best company,
whether it be of books or of men.
A good book may be amonggggg the best of friends. It is the same today that it always was, and it will
never change. It is the most patient and cheerful of companions. It does not turn its back upon us
in times of adversity or distress. It always receives us with the same kindness; amusing and
instructing us in youth, and comforting and consoling us in age.
Men often discover their affinity to each other by the mutual love they have for a book just
as two persons sometimes discover a friend by the admirationgggg which both entertain for a third.
There is an old proverb, ‘Love me, love my dog.” But there is more wisdom in this:” Love me,
love my book.” The book is a truer and higher bond of union. Men can think, feel, and sympathize
with each other through their favorite author. They live in him together, and he in them.
a good book is often the best urn of a life enshrining the best that life could think out;
for the world of a man’s life is, for the most part, but the world of his thoughts.
Thus the best books are treasuries of good words, the golden thoughts, which,
remembered and cherished, become our constant companions and comforters.
books possess an essence of immortality. They are by far the most lasting products
of human effort. Temples and statues decay, but books survive. Time is of no account with
great thoughts, which are as fresh today as when they first passed through their author’s
minds, ages aggggggggggo. What was then said and thought still speaks to us as vividly as ever from
the printed page. The only effect of time have been to sift out the bad products; for nothing
in literature can long survive e but what is really good!!!!!!!
Books introduce us into the best society; they bring us into the presence of the greatest
minds that have ever lived. We hear what they said and did; we see the as if they were really
alive; we sympathize with them, enjoy with them, grieve with them; their experience becomes ours,
and we feel as if we were in a measure actors with them in the scenes which they describe.
1/3 = 0.33333333333333
I - 1
II - 2
III - 3
IV - 4
V – 5
VI - 6
VII – 7
VIII - 8
IX - 9
X – 10
The great and good do not die, even in this world. Embalmed in books, their spirits walk abroad.
The book is a living voice. It is an intellect to which on still listens.
[root@server3 opt]# grep -n 'the' /opt/a.txt
[root@server3 opt]# grep -in 'the' /opt/a.txt    # case-insensitive
[root@server3 opt]# grep -vn 'the' a.txt    # -v inverts the selection: show lines that do not contain "the"
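Because `-v` selects exactly the lines that the plain search rejects, the two counts always add up to the total number of lines. A minimal sketch on made-up input (the sample lines and variable names are assumptions); `-c` prints a count of selected lines instead of the lines themselves:

```shell
# Hypothetical five-line sample. Note that "other" contains "the" as a substring.
sample=$(printf '%s\n' 'the cat' 'a dog' 'the bird' 'a fish' 'other')

total=$(printf '%s\n' "$sample" | grep -c '')      # the empty pattern matches every line
hit=$(printf '%s\n' "$sample" | grep -c 'the')     # lines containing "the"
miss=$(printf '%s\n' "$sample" | grep -vc 'the')   # lines not containing "the"

echo "total=$total hit=$hit miss=$miss"
```

grep matches substrings, not whole words, which is why "other" is counted as a hit.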
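1.42: Using brackets "[]" to match a set of characters
- To find both "shirt" and "short", note that both strings contain "sh" and "rt"
- However many characters "[]" contains, it stands for exactly one character, so "[io]" matches either "i" or "o"
- The file does not contain "shirt" or "short", so we append three test lines ourselves and then search for them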
[root@server3 opt]# echo "shart" >> a.txt
[root@server3 opt]# echo "shirt" >> a.txt
[root@server3 opt]# echo "short" >> a.txt
[root@server3 opt]# grep -n 'sh[ioa]rt' a.txt
49:shart
50:shirt
51:short
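# This works as an "or": the brackets match exactly one of the listed characters at that position
# In grep, wrap the pattern in single quotes so the shell passes it to grep unchanged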
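The "exactly one character" rule can be checked on a self-contained word list. The words and variable names below are made up for this sketch; "shiort" and "shrt" are included to show what the brackets do not match.

```shell
# Hypothetical word list: two extra entries test the one-character rule.
words=$(printf '%s\n' 'shirt' 'short' 'shart' 'shiort' 'shrt')

# [io] stands for a single "i" or a single "o".
two=$(printf '%s\n' "$words" | grep 'sh[io]rt')
echo "$two"

# Adding "a" to the set also admits "shart".
three=$(printf '%s\n' "$words" | grep 'sh[ioa]rt')
echo "$three"
```

Neither pattern matches "shiort" (two characters between "sh" and "rt") or "shrt" (none).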
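1.43: Finding consecutive characters
To find lines containing the repeated character "oo", run the following command.
[root@server3 opt]# grep -n 'oo' a.txt
[root@server3 opt]# grep -n 'o' a.txt    ## searching for a single 'o' matches every line with at least one "o", including runs of consecutive o's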
[root@localhost opt]# grep -n '[^g]oo' httpd.txt
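# Matches "oo" preceded by a character other than "g"; "good" on line 5 is filtered out
- ^ inside the brackets negates the set: it matches any character that is not listed
- ^ outside the brackets anchors the match to the start of the line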
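The two meanings of "^" can be compared side by side. This is a sketch on invented sample lines (the lines and variable names are assumptions), not output from httpd.txt:

```shell
# Hypothetical sample lines.
lines=$(printf '%s\n' 'good morning' 'a book' 'too late' 'food' 'oops')

# ^ inside brackets: "oo" preceded by one character that is not "g".
# "good morning" is dropped; "oops" is also dropped because nothing
# precedes its "oo", and [^g] must consume an actual character.
not_g=$(printf '%s\n' "$lines" | grep '[^g]oo')
echo "$not_g"

# ^ outside brackets: the line must start with "oo".
starts=$(printf '%s\n' "$lines" | grep '^oo')
echo "$starts"
```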
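1.44: Consecutive characters not preceded by a lowercase letter
If "oo" must not be preceded by a lowercase letter, use the command "grep -n '[^a-z]oo' test.txt", where
- "a-z" denotes the lowercase letters,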