Learning Regular Expressions

yin258357

Last modified: 2024-09-27 00:06:15

Tags: regular expression learning

First published: 2024-09-26 15:26:58

Article link: https://blog.csdn.net/yin258357/article/details/142532888

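Contents

Foreword
I. File Contents
II. Usage Steps
Summary

Foreword

I. File Contents

All of the commands below are demonstrated on this one file, which I downloaded from the address given in 《鸟哥私房菜》.

vim test.txt ==> put the following data into this file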

"Open Source" is agood mechanism to develop  programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However, this dress is about $ 3183 dollars.^M
GNU is free air not free beer.^M
Her hair is very beauty.^M
I can't finish the test.^M
Oh!The soup taste good.^M
motorcycle is cheap than car.
This window is clear.
the symboyl 'the' is represented as start.
Hi! My god!
software is a library for drafting programs.^M
loo are the best is mean you are the no.1.
The world <Happy> is the same with "glad".
I like dog.
google is the best ztools for search keyword.
gooooole yes!
go!go! Let's go
#I am VBird
~

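II. Usage Steps

1. Special characters

[:alnum:] : the name looks like it means "all numbers", but it does not; it stands for digits and English letters: 0-9, a-z, A-Z
[:alpha:] : all English letters: a-z, A-Z
[:digit:] : digits: 0-9
[:lower:] : lowercase English letters: a-z
[:upper:] : uppercase English letters: A-Z

These can be understood at a glance. There are a few more special characters, which are probably easier to understand by experimenting.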

2. Revisiting grep

We already learned part of grep's usage in the bash basics.

[user@localhost ~]$ grep -n 'the' test.txt  ==> -n shows line numbers; 'the' selects lines containing "the"
8:I can't finish the test.^M
12:the symboyl 'the' is represented as start.
15:loo are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
18:google is the best ztools for search keyword.

[user@localhost ~]$ grep -vn 'the' test.txt ==> -n still shows line numbers; -v inverts the match, so this selects the lines that do NOT contain "the"
1:"Open Source" is agood mechanism to develop  programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
7:Her hair is very beauty.^M
9:Oh!The soup taste good.^M
10:motorcycle is cheap than car.
11:This window is clear.
13:Hi! My god!
14:software is a library for drafting programs.^M
17:I like dog.
19:gooooole yes!
20:go!go! Let's go
21:#I am VBird

[user@localhost ~]$ grep -in 'the' test.txt ==> -i ignores case: the letters t, h, e must appear in that order, in any mix of upper and lower case
8:I can't finish the test.^M
9:Oh!The soup taste good.^M
12:the symboyl 'the' is represented as start.
15:loo are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
18:google is the best ztools for search keyword.
[user@localhost ~]$ grep -in 'tHe' test.txt  ==> with -i the case used in the pattern does not matter either, so 'tHe' gives the same result
8:I can't finish the test.^M
9:Oh!The soup taste good.^M
12:the symboyl 'the' is represented as start.
15:loo are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
18:google is the best ztools for search keyword.
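
The three options used so far can be compared side by side. This is only a minimal sketch: the three-line `sample` variable is a hypothetical stand-in, not the article's test.txt.

```shell
# Hypothetical sample; not the article's test.txt.
sample='The soup tastes good.
I finished the test.
Hello world.'

# -n: prefix each matching line with its line number.
printf '%s\n' "$sample" | grep -n 'the'
# -v: invert the match and print the lines WITHOUT "the".
printf '%s\n' "$sample" | grep -vn 'the'
# -i: ignore case, so "The" on line 1 now matches too.
printf '%s\n' "$sample" | grep -in 'the'
```

Line 1 is only picked up by the last command, because it contains "The" but no lowercase "the".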

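[user@localhost ~]$ grep -n 't[ae]st' test.txt  ==> [] matches exactly one of the characters listed inside it, so this finds "tast" or "test"
8:I can't finish the test.^M
9:Oh!The soup taste good.^M
We just matched both "the" and "The" with -i; the same search can also be written with []
[user@localhost ~]$ grep -n '[Tt]he' test.txt  ==> find "he" preceded by t or T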
8:I can't finish the test.^M
9:Oh!The soup taste good.^M
12:the symboyl 'the' is represented as start.
15:loo are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
18:google is the best ztools for search keyword.
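
A small sketch of the same two bracket expressions on a hypothetical sample (the `sample` text is made up for illustration), chosen so that the two patterns select different lines:

```shell
# Hypothetical sample; not the article's test.txt.
sample='I passed the test.
Soup tastes good.
The window is clear.'

# t[ae]st: "t", then one character that is a or e, then "st".
printf '%s\n' "$sample" | grep -n 't[ae]st'
# [Tt]he: "The" or "the", without needing -i.
printf '%s\n' "$sample" | grep -n '[Tt]he'
```

The first command should list lines 1 and 2 ("test", "tast"), the second lines 1 and 3 ("the", "The").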

[user@localhost ~]$ grep -n 'oo' test.txt  ==> it is tempting to read this as "exactly two o's", but lines with more than two o's are listed below as well
1:"Open Source" is agood mechanism to develop  programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh!The soup taste good.^M
15:loo are the best is mean you are the no.1.
18:google is the best ztools for search keyword.
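19:gooooole yes!  ==> this line has several o's. My understanding is that grep scans from left to right: once it reaches "goo" in gooooole there are already two o's, so the pattern is satisfied and the line is printed no matter how many o's follow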

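[user@localhost ~]$ grep -n '[^g]oo' test.txt   ==> '[^g]oo': "oo" preceded by any character other than g; [^g] means "not g"
2:apple is my favorite food.            ==>foo
3:Football game is not use feet only.   ==>Foo
15:loo are the best is mean you are the no.1.  ==>loo
18:google is the best ztools for search keyword.  ==> goo, ztoo. Wait, wasn't "goo" supposed to be excluded?
==> Scanning this line from left to right, "goo" does not match, so nothing is selected yet; further along, "too" in ztools is "oo" preceded by a non-g character, so the line is printed
19:gooooole yes! ==> same principle as above: "goo" does not match, but one position to the right "ooo" (an o followed by oo) does, so the line is printed

[user@localhost ~]$ grep -n '[^a-z]oo' test.txt ==> [^a-z]oo: "oo" preceded by a character that is NOT in the range a to z; - means "from ... to ..."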
3:Football game is not use feet only.

This is line 19 of test.txt. Think about it: with so many o's, why is it not printed by '[^a-z]oo'?
19:gooooole yes!
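
One way to see which part of a line a negated bracket actually matched is grep's `-o` option, which prints only the matched text. The sketch below is an illustration under assumptions: the `sample` text is hypothetical, `-o` is a GNU/BSD grep option not used elsewhere in this article, and `LC_ALL=C` is set so that the range a-z means exactly the lowercase ASCII letters.

```shell
# Hypothetical sample; not the article's test.txt.
sample='a good day
gooooole yes
Football game
google tools'

# Lines containing "oo" preceded by something other than g.
printf '%s\n' "$sample" | LC_ALL=C grep -n '[^g]oo'
# -o shows the matched text itself: "ooo", "Foo", "too".
printf '%s\n' "$sample" | LC_ALL=C grep -o '[^g]oo'
# Only "Foo" has "oo" preceded by a non-lowercase character.
printf '%s\n' "$sample" | LC_ALL=C grep -n '[^a-z]oo'
```

In "gooooole" every "oo" is preceded by either g or o, both lowercase letters, so `[^g]oo` can match (via "ooo") while `[^a-z]oo` cannot.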

[user@localhost ~]$ grep -n '[0-9]' test.txt   ==> find lines containing a digit 0-9
5:However, this dress is about $ 3183 dollars.^M
15:loo are the best is mean you are the no.1.
[user@localhost ~]$ grep -n '[:digit:]' test.txt ==> huh, why is this an error?
grep: character class syntax is [[:space:]], not [:space:]  ==> I see, two pairs of brackets are needed. [:digit:] by itself stands for 0-9,
==> so the class form of [0-9] is written [[:digit:]]
[user@localhost ~]$ grep -n '[[:digit:]]' test.txt 
5:However, this dress is about $ 3183 dollars.^M
15:loo are the best is mean you are the no.1.
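
The double brackets make more sense once the outer pair is read as the bracket expression and the inner `[:name:]` as one item inside it. A minimal sketch on a hypothetical `sample` (not the article's file):

```shell
# Hypothetical sample; not the article's test.txt.
sample='dress costs 3183 dollars
Hi there
no.1 tool'

# Outer [] = bracket expression, inner [:digit:] = class name.
printf '%s\n' "$sample" | grep -n '[[:digit:]]'
# A class works with the line-start anchor like any other bracket.
printf '%s\n' "$sample" | grep -n '^[[:upper:]]'
# Classes can be negated and mixed with literal characters:
# lines with a character that is neither alphanumeric nor a space.
printf '%s\n' "$sample" | grep -n '[^[:alnum:] ]'
```

The last command should select only line 3, because of the "." in "no.1".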

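[user@localhost ~]$ grep -n '^the' test.txt   ==> why no [] this time?
12:the symboyl 'the' is represented as start. ==> doesn't ^ mean negation? Why is a line containing "the" twice printed?
==> ^ does not always negate. As the first character inside [] it negates the set; outside [] it anchors the match to the start of the line
==> '^the' finds the lines that begin with "the"

[user@localhost ~]$ grep -n '^[a-z]' test.txt  ==> the brackets pick one character from the set, so the line only has to start with any one of a-z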
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symboyl 'the' is represented as start.
14:software is a library for drafting programs.^M
15:loo are the best is mean you are the no.1.
18:google is the best ztools for search keyword.
19:gooooole yes!
20:go!go! Let's go

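^[] : ^ outside []: the line starts with any one of the characters in the brackets
[^] : ^ inside []: any one character that is NOT listed in the brackets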
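
Both meanings of ^ can appear in the same pattern. The sketch below uses a hypothetical `sample`, and sets `LC_ALL=C` as an assumption so that the a-z range behaves as plain ASCII:

```shell
# Hypothetical sample; not the article's test.txt.
sample='the symbol
The world
#comment
3 apples'

# ^ outside []: anchor. Lines that start with a lowercase letter.
printf '%s\n' "$sample" | LC_ALL=C grep -n '^[a-z]'
# First ^ anchors, second ^ negates: lines whose first
# character is NOT an ASCII letter.
printf '%s\n' "$sample" | LC_ALL=C grep -n '^[^a-zA-Z]'
```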

[user@localhost ~]$ grep -n '\.$' test.txt  ==> $ marks the end of the line; \ escapes the ., turning it from "any character" into a literal dot
1:"Open Source" is agood mechanism to develop  programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
11:This window is clear.
12:the symboyl 'the' is represented as start.
15:loo are the best is mean you are the no.1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best ztools for search keyword.
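
The lines shown with a trailing ^M in the file listing are missing from this output, presumably because ^M is a carriage return (DOS line ending) sitting between the dot and the end of the line. A minimal sketch of that effect, where `make_sample` is a hypothetical helper that emits one Unix-style and one DOS-style line:

```shell
# Hypothetical helper: line 1 ends in LF, line 2 in CR LF.
make_sample() { printf 'Unix line.\nDOS line.\r\n'; }

# Only the LF-terminated line ends in a literal dot: count is 1.
make_sample | grep -c '\.$'
# After deleting the carriage returns both lines match: count is 2.
make_sample | tr -d '\r' | grep -c '\.$'
```

Here `-c` (count matching lines) is used only to keep the output short.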

[user@localhost ~]$ grep -n '^$' test.txt  ==> ^ is the start and $ is the end; ^$ means the end comes immediately after the start, i.e. an empty line
22:
23:

[user@localhost ~]$ vim test
#Hello 
You are so good

real!!!!
~

[user@localhost ~]$ grep -v  '^$' test | grep  -v  '^#' ==> drop the empty lines, then drop the lines starting with #
You are so good
real!!!!
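
The same filter can be sketched on an inline sample, together with a single-command variant. The variant relies on grep's `-e` option (one `-e` per pattern), which is an addition here and not something the article uses; the `sample` text is hypothetical.

```shell
# Hypothetical sample with one comment line and one empty line.
sample='#Hello
You are so good

real'

# Two greps in a pipeline, as in the article.
printf '%s\n' "$sample" | grep -v '^$' | grep -v '^#'
# One grep: -v drops every line matching ANY of the -e patterns.
printf '%s\n' "$sample" | grep -v -e '^$' -e '^#'
```

Both commands should print the same two lines.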

3 …

. stands for any single character: exactly one, no more and no fewer

[user@localhost ~]$ grep  -n 'g..d' test.txt  ==> two dots between g and d: any two characters
1:"Open Source" is agood mechanism to develop  programs.
9:Oh!The soup taste good.^M
16:The world <Happy> is the same with "glad".
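
Because each dot consumes exactly one character, the number of dots matters. A small sketch on four hypothetical words:

```shell
# Four hypothetical words, one per line.
words() { printf '%s\n' agood glad god gd; }

# Two dots: exactly two characters between g and d.
words | grep -n 'g..d'
# One dot: exactly one character between g and d.
words | grep -n 'g.d'
```

The first command should select "agood" and "glad", the second only "god"; "gd" matches neither because a dot cannot match nothing.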

4. *

"*" repeats the preceding character zero or more times

vim test ==> view the contents of test
Hello
You are so good
goood
real!!!!
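
Before running patterns on this file, the effect of `*` can be isolated on a single line of hypothetical words. The sketch assumes a grep that supports `-o` (GNU and BSD grep do), which prints each matched piece on its own line:

```shell
# go*d: g, then zero or more o's, then d.
printf '%s\n' 'gd god good goood' | grep -o 'go*d'
# goo*d: the first o is mandatory, so "gd" no longer matches.
printf '%s\n' 'gd god good goood' | grep -o 'goo*d'
```

The star applies only to the single character right before it, which is why an extra literal o is needed to require at least one.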
[user@localhost ~]$ grep -n 'o*' test ==> zero or more o's, so every line matches
1:Hello
2:You are so good
3:goood
4:real!!!!
[user@localhost ~]$ grep -n 'oo*' test ==> one o followed by zero or more o's, i.e. at least one o
1:Hello
2:You are so good
3:goood
[user@localhost ~]$ grep -n 'ooo*' test  ==> two o's followed by zero or more o's, i.e. at least two in a row
2:You are so good
3:goood
[user@localhost ~]$ grep -n 'oooo*' test   ==> at least three o's in a row
3:goood
[user@localhost ~]$ grep -n 'ooooo*' test  ==> at least four o's in a row; no line qualifies
[user@localhost ~]$

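Suppose we want to find a string that starts with g and ends with d. Since . is any character and * repeats it zero or more times, g.*d matches a g, then anything at all (or nothing), then a d.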

[user@localhost ~]$ grep -n 'g.*d' test
2:You are so good
3:goood

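In grep's basic regular expressions the repetition braces are written \{ and \} (an unescaped { is just a literal character); the single quotes already keep bash from touching them
[user@localhost ~]$ grep -n 'o\{2\}' test ==> two o's in a row
2:You are so good
3:goood
[user@localhost ~]$ grep -n 'o\{2,5\}' test ==> two to five o's in a row
2:You are so good
3:goood
[user@localhost ~]$ grep -n 'o\{2,\}' test ==> two or more o's in a row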
2:You are so good
3:goood

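To summarize the RE characters:
^word : lines that begin with word
word$ : lines that end with word
. : any single character
* : repeat the preceding character zero or more times
[list] : one character from the listed set
[n1-n2] : one character in the range n1 to n2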

5. sed

sed is a stream editor that also works in a pipeline: it can read and process standard input

[user@localhost ~]$ cat test       ==> view the contents of test
Hello
You are so good
goood
real!!!!
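[user@localhost ~]$ cat test |sed '3,4d' ==> '3,4d' deletes lines 3 through 4
Hello
You are so good           ==> of the 4 lines, only the first two are printed
[user@localhost ~]$ cat test |sed '1a Hi' ==> '1a Hi' appends Hi on a new line after line 1
Hello
Hi                          
You are so good           ==> the original second line is now the third
goood
real!!!!
[user@localhost ~]$ cat test  ==> look at the contents of test now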
Hello
You are so good
goood
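real!!!!    ==> Huh, after all that editing nothing has changed. That is expected: by default sed
==> only prints a transformed copy of the file's contents and does not modify the file itself.
[user@localhost ~]$ cat test | sed '1a Hi \   ==> the trailing backslash continues the text onto the next line
> bie Hi le '   ==> everything to be added goes inside the single quotes: type the opening quote, then all the content, then the closing quote; otherwise pressing [Enter] at the line break would run the command
Hello
Hi 
bie Hi le             ==> two lines added
You are so good
goood
real!!!!
Appending is done; next comes substitution. How is it written?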
sed 's/text to replace/new text/g'

[user@localhost ~]$ cat test | sed  's/H/h/g' ==> replace H with h
hello
You are so good
goood
real!!!!
[user@localhost ~]$ cat test | sed  's/g/G/g' ==> replace g with G
Hello
You are so Good
Goood
real!!!!
[user@localhost ~]$ cat test | sed  '2s/g/G/g'  ==> replace g with G on line 2 only
Hello
You are so Good      ==> this is line 2: g became G
goood                ==> the g's on this line are unchanged
real!!!!
real!!!!

[user@localhost ~]$ cat test
Hello
You are so good
goood
real!!!!
[user@localhost ~]$ cat test | sed '2d' ==> delete line 2
Hello
goood
real!!!!
[user@localhost ~]$ cat test | sed '2,4d' ==> delete lines 2 through 4
Hello

[user@localhost ~]$ cat test  
Hello
You are so good
goood
real!!!!
[user@localhost ~]$ cat test | sed '1c This is first'   ==> 1c replaces line 1 with "This is first"
This is first
You are so good
goood
real!!!!
[user@localhost ~]$ cat test | sed '2,4c This is 2-4'   ==> 2,4c replaces lines 2 through 4 with "This is 2-4"
Hello
This is 2-4

[user@localhost ~]$ nl test |sed  -n '2p'    ==> print line 2
     2	You are so good
[user@localhost ~]$ nl test |sed  -n '2,4p'  ==> print lines 2 through 4
     2	You are so good
     3	goood
     4	real!!!!

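All of the append, delete, change and print operations above only display a transformed view of the file. sed can also modify the file itself.

[user@localhost ~]$ cat test    ==> view the file contents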
Hello
You are so good
goood
real!!!!
[user@localhost ~]$ sed -i '1a Hi' test  ==> append Hi after line 1, this time in the file itself
[user@localhost ~]$ cat test 
Hello
Hi                      ==> appended successfully
You are so good
goood
real!!!!
[user@localhost ~]$ sed -i 's/g/G/g' test   ==> replace g with G
[user@localhost ~]$ cat test 
Hello
Hi          
You are so Good
Goood                  ==> g replaced with G successfully
real!!!!
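
The difference between stream output and in-place editing can be sketched in one script. The assumptions here: the sample text mirrors the article's test file, the directory from `mktemp -d` is a throwaway location, and `-i.bak` (in-place with a backup suffix) is an optional safety net that the article itself does not use.

```shell
sample='Hello
You are so good
goood
real'

# Without -i, sed only writes the edited text to standard output.
printf '%s\n' "$sample" | sed '2,3d'        # delete lines 2-3
printf '%s\n' "$sample" | sed -n '2p'       # -n plus p: print only line 2
printf '%s\n' "$sample" | sed '2s/g/G/g'    # substitute on line 2 only
printf '%s\n' "$sample" | sed '1c\
This is first'                              # c: replace line 1

# With -i the file itself is rewritten; the suffix keeps a backup.
workdir=$(mktemp -d)
printf '%s\n' "$sample" > "$workdir/test"
sed -i.bak 's/good/great/' "$workdir/test"
cat "$workdir/test"        # edited file
cat "$workdir/test.bak"    # untouched original
rm -r "$workdir"
```

The `1c\` form with the text on the next line is the portable spelling; the one-line `1c text` form used in the article is accepted by GNU sed.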

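The following characters belong to extended regular expressions (grep -E):
+ : one or more of the preceding character
? : zero or one of the preceding character
| : or (alternation)
() : groups a string, e.g. for ( | ) alternation inside a longer pattern
()+ : repetition of a group, e.g. A(xyz)+B: starts with A, ends with B, with one or more xyz in between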
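
A minimal sketch of these operators with `grep -E`, on a hypothetical word list. It also shows that in extended syntax the interval braces need no backslashes, in contrast to the `o\{2\}` form used with plain grep earlier.

```shell
# Hypothetical words, one per line.
words() { printf '%s\n' god good goood gd AxyzB AxyzxyzB AB; }

words | grep -E 'go+d'       # + : one or more o
words | grep -E 'go?d'       # ? : zero or one o
words | grep -E 'gd|good'    # | : either alternative
words | grep -E 'A(xyz)+B'   # ()+ : the group xyz one or more times
words | grep -E 'go{2,}d'    # {2,} : two or more o, no backslashes
```

"AB" is not selected by `A(xyz)+B` because the group has to occur at least once.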

6. printf

The format specifiers work the same way as printf in other languages, so they are not explained here.
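
For readers who have not met those specifiers, a very small sketch with the shell's printf command follows. The names and numbers are made up; one shell-specific detail is that the format string is reused when there are more arguments than specifiers.

```shell
# %-8s: left-aligned string padded to 8 columns.
# %5d : right-aligned integer padded to 5 columns.
# Four arguments, two specifiers: the format is applied twice.
printf '%-8s|%5d|\n' alice 93 bob 7
```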

7. awk

awk is a data processing tool

[user@localhost ~]$ last -n 5   ==> last lists login records, handy for reviewing logins and audit trails
user     pts/0        :0               Thu Sep 26 21:20   still logged in   
user     :0           :0               Thu Sep 26 21:19   still logged in   
reboot   system boot  3.10.0-957.el7.x Thu Sep 26 21:17 - 23:45  (02:27)    
user     pts/0        :0               Thu Sep 26 14:14 - crash  (07:03)    
user     :0           :0               Thu Sep 26 14:13 - crash  (07:04)    

wtmp begins Sat Sep 21 02:53:57 2024
[user@localhost ~]$ last -n 5 | awk '{print $1 "\t"  $3}' ==> $1 is the first field, $3 the third
==> applied to every line, $1 and $3 each yield a whole column of data
user	:0
user	:0
reboot	boot
user	:0
user	:0
	
wtmp	Sat

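NF : number of fields in the current line ($0)
NR : number of the line currently being processed
FS : the current field separator (whitespace by default)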
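
A minimal sketch of the three variables on a hypothetical three-line `sample` (made up, not real `last` output). Setting FS inside a `BEGIN` block is one common way to change the separator before the first line is read; the article itself does not show it.

```shell
# Hypothetical sample: two space-separated lines, one colon-separated.
sample='user pts/0 :0
reboot system boot
root:x:0:0'

# Default FS (whitespace): print line number, field count, first field.
printf '%s\n' "$sample" | awk '{print NR, NF, $1}'
# FS set to ":" before reading: split line 3 on colons.
printf '%s\n' "$sample" | awk 'BEGIN {FS = ":"} NR == 3 {print $1 "\t" $3}'
```

With the default separator the third line counts as a single field, which is why changing FS is needed to pick it apart.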

The remaining commands and options are best learned through hands-on practice.

Summary
