grep notes

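1. Regular expressions and wildcards

First, let's review regular expressions and wildcards, which will help with the grep material that follows:

Basic regular expressions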

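RE character: meaning and example
^word  Meaning: the search string (word) is at the start of the line.
Example: find the lines that start with # and show line numbers.
grep -n '^#' regular_express.txt
word$  Meaning: the search string (word) is at the end of the line.
Example: print the lines that end with ! and show line numbers.
grep -n '!$' regular_express.txt
.  Meaning: stands for exactly one arbitrary character.
Example: the matched string can be (eve) (eae) (eee) (e e), but not just (ee). There must be exactly one character between the two e's, and a space counts as a character.
grep -n 'e.e' regular_express.txt
\  Meaning: escape character; removes the special meaning of the symbol that follows it.
Example: find the lines that contain a single quote '.
grep -n \' regular_express.txt
*  Meaning: repeats the preceding RE character zero or more times.
Example: find strings containing (es) (ess) (esss) and so on. Because * can mean zero occurrences, es also matches. Because * repeats the preceding RE character, it must immediately follow one; for example, "any run of characters" is written .*
grep -n 'ess*' regular_express.txt
[list]  Meaning: a bracket expression listing the characters you want to match.
Example: find the lines containing (gl) or (gd). Note that a [] stands for only one character to match; for example, a[afl]y matches aay, afy, or aly, because [afl] means a or f or l.
grep -n 'g[ld]' regular_express.txt
[n1-n2]  Meaning: a bracket expression giving a range of characters to match.
Example: find the lines that contain an uppercase letter. Inside [], the hyphen - is special: it stands for all consecutive characters between the two endpoints. Which characters count as consecutive depends on the encoding and collation order, so the locale must be set correctly (in bash, check the LANG and LANGUAGE variables). For example, all uppercase letters are [A-Z].
grep -n '[A-Z]' regular_express.txt
[^list]  Meaning: a bracket expression listing the characters or ranges you do not want.
Example: the matched string can be (oog) (ood) but not (oot). A ^ inside [] means the set is negated. For example, "not an uppercase letter" is [^A-Z]. Be careful, though: if you search with grep -n '[^A-Z]' regular_express.txt, every line in the file is listed. Why? Because [^A-Z] means "one character that is not uppercase", and every line contains such a character; the first line, "Open Source", has the lowercase letters p, e, n, o, and so on.
grep -n 'oo[^t]' regular_express.txt
\{n,m\}  Meaning: n to m consecutive occurrences of the preceding RE character. \{n\} means exactly n consecutive occurrences, and \{n,\} means n or more.
Example: strings with 2 to 3 o's between g and g, i.e. (goog) (gooog).
grep -n 'go\{2,3\}g' regular_express.txt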
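
To make a few rows of the table concrete, here is a minimal sketch that runs them against a handful of made-up lines instead of regular_express.txt. The sample lines and variable names are assumptions for illustration only; grep -c prints the number of matching lines.

```shell
# Made-up sample lines (not the contents of regular_express.txt).
sample=$(printf '%s\n' '# comment' 'goog' 'gooog' 'goooog' 'eve' 'ee' 'Done!')

# ^ anchors the match at the start of the line, $ at the end.
starts_hash=$(printf '%s\n' "$sample" | grep -c '^#')
ends_bang=$(printf '%s\n' "$sample" | grep -c '!$')

# . needs exactly one character between the two e's, so "ee" is skipped.
one_between=$(printf '%s\n' "$sample" | grep -c 'e.e')

# \{2,3\} is the BRE interval: 2 or 3 o's between the two g's.
two_or_three=$(printf '%s\n' "$sample" | grep -c 'go\{2,3\}g')

echo "start with #: $starts_hash"      # 1
echo "end with !:   $ends_bang"        # 1
echo "e.e:          $one_between"      # 1 (only eve)
echo "go{2,3}g:     $two_or_three"     # 2 (goog and gooog)
```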

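POSIX character classes

Class         Description
[[:alpha:]]   any letter, upper or lower case: [a-z], [A-Z]
[[:alnum:]]   any letter or digit: [0-9], [a-z], [A-Z]
[[:upper:]]   uppercase letters: [A-Z]
[[:lower:]]   lowercase letters: [a-z]
[[:digit:]]   digits: [0-9]
[[:blank:]]   space or tab
[[:cntrl:]]   control characters, including CR, LF, Tab, Del, and so on
[[:graph:]]   every visible character, i.e. everything except whitespace (space and Tab)
[[:print:]]   any printable character
[[:punct:]]   punctuation characters
[[:space:]]   any whitespace character: space, tab, NL, FF, VT, CR
[[:xdigit:]]  hexadecimal digits: 0-9, A-F, a-f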

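The classes [:alnum:], [:alpha:], [:upper:], [:lower:], and [:digit:] in the table above each stand for a set of characters. Note that in a match, one bracket expression such as [[:digit:]] stands for exactly one character.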
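
The "one character" point is easy to check. The sketch below assumes a grep that supports -o (print only the matched part, available in GNU and BSD grep); the sample line and variable names are made up.

```shell
line='order 3183 shipped'

# One bracket expression matches one character,
# so -o prints each digit on its own line: four matches.
single=$(printf '%s\n' "$line" | grep -o '[[:digit:]]' | wc -l)

# Repeat the class to match the whole run of digits as one string.
run=$(printf '%s\n' "$line" | grep -o '[[:digit:]][[:digit:]]*')

echo "single-character matches: $single"
echo "whole number: $run"     # 3183
```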

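Extended regular expressions

RE character: meaning and example
+  Meaning: repeats the preceding RE character one or more times.
Example: find strings such as (god) (good) (goood). Here o+ means "one or more o", so the command below lists lines 1, 9, and 13.
egrep -n 'go+d' regular_express.txt
?  Meaning: zero or one of the preceding RE character.
Example: find the two strings (gd) and (god). Here o? means "nothing or one o", so the command below lists lines 13 and 14. Notice that the results of 'go+d' and 'go?d' taken together are the same as the results of 'go*d'. Think about why.
egrep -n 'go?d' regular_express.txt
|  Meaning: finds any of several strings, joined by "or".
Example: find gd or good. Because this is an "or", lines 1, 9, and 14 are all printed. What if you also want to find dog?
egrep -n 'gd|good' regular_express.txt
egrep -n 'gd|good|dog' regular_express.txt
()  Meaning: groups part of the pattern.
Example: find (glad) or (good). Because g and d are common to both, la and oo can be placed inside ( ) and separated with |.
egrep -n 'g(la|oo)d' regular_express.txt
()+  Meaning: repeats a whole group.
Example: print 'AxyzxyzxyzxyzC' with echo, then search it as follows.
echo 'AxyzxyzxyzxyzC' | egrep 'A(xyz)+C'
The example above looks for a string that starts with A, ends with C, and has one or more "xyz" in between.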
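
The "think about why" question can be answered by running the three patterns on the same input. This is a small sketch on a made-up word list, using grep -E rather than egrep; paste -sd' ' just joins the matching lines onto one line for display.

```shell
# Made-up word list, one word per line.
words=$(printf '%s\n' gd god good goood glad dog)

plus=$(printf '%s\n' "$words" | grep -E 'go+d' | paste -sd' ' -)
opt=$(printf '%s\n' "$words" | grep -E 'go?d' | paste -sd' ' -)
star=$(printf '%s\n' "$words" | grep 'go*d' | paste -sd' ' -)
alt=$(printf '%s\n' "$words" | grep -E 'g(la|oo)d' | paste -sd' ' -)

echo "go+d:      $plus"    # god good goood
echo "go?d:      $opt"     # gd god
echo "go*d:      $star"    # gd god good goood
echo "g(la|oo)d: $alt"     # good glad
```

o+ covers "one or more" and o? covers "zero or one", so together they cover every count that o* (zero or more) accepts.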

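Linux wildcards

Symbol  Meaning
*       zero or more of any character
?       exactly one of any character
[ ]     exactly one character from those listed inside the brackets (not any character). For example, [abcd] means one character that is a, b, c, or d
[ - ]   a hyphen inside the brackets means all characters in that range of the encoding order. For example, [0-9] means all digits from 0 to 9, because the digits are encoded consecutively
[^ ]    if the first character inside the brackets is a caret (^), the set is negated. For example, [^abc] means exactly one character that is anything other than a, b, or c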
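
Here is a minimal sketch of these wildcards in a scratch directory. The file names are made up, and the negated set is written [!0-9], the POSIX spelling; bash also accepts [^0-9] as described in the table.

```shell
start_dir=$PWD
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a1 a2 ab abc b1

star=$(echo a*)        # a followed by anything, including nothing
qmark=$(echo a?)       # a followed by exactly one character
set1=$(echo a[12])     # a followed by 1 or 2
neg=$(echo a[!0-9])    # a followed by one non-digit

echo "a*      -> $star"     # a1 a2 ab abc
echo "a?      -> $qmark"    # a1 a2 ab
echo "a[12]   -> $set1"     # a1 a2
echo "a[!0-9] -> $neg"      # ab

cd "$start_dir" || exit 1
rm -rf "$dir"
```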

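Linux special characters

Symbol  Meaning
#       comment: most often used in scripts as an explanation; everything after it is not executed
\       escape: turns a special character or wildcard back into an ordinary character
|       pipe: separates two commands in a pipeline
;       command separator: separates commands that run one after another (note: not the same as a pipeline)
~       the user's home directory
$       variable prefix: placed before a variable name to substitute its value
&       job control: runs the command in the background
!       logical NOT
/       directory separator: separates the parts of a path
>, >>   output redirection: overwrite and append, respectively
<, <<   input redirection
' '     single quotes: no variable substitution ($ is plain text)
" "     double quotes: variable substitution still works ($ keeps its function)
` `     the command between the two backticks is executed first; $( ) can be used as well
( )     the enclosed commands run in a subshell
{ }     the enclosed commands form a command group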
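
Several of these symbols matter every time a pattern is passed to grep, so a short sketch of the quoting rules may help. All variable names here are made up for illustration.

```shell
name='grep'

single='$name'       # single quotes: $ stays plain text
double="$name"       # double quotes: the variable is substituted
escaped=\$name       # backslash: removes the special meaning of $
subst=$(printf '%s' "$name" | tr 'a-z' 'A-Z')   # $( ) runs the command first

x=outer
( x=inner )          # ( ) is a subshell, so this assignment does not leak out
{ y=grouped; }       # { } runs in the current shell, so y stays set

echo "$single $double $escaped $subst $x $y"
# $name grep $name GREP outer grouped
```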

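Putting the two side by side, we find that regular expressions and wildcards share some symbols, but with different meanings

Symbol  RE meaning                                  Wildcard meaning
*       zero or more of the preceding RE character  zero or more of any character
?       zero or one of the preceding RE character   exactly one of any character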
[root@node-249 test]# touch {101..110}
[root@node-249 test]# ls
101  102  103  104  105  106  107  108  109  110
[root@node-249 test]# ls|grep '10*'
101
102
103
104
105
106
107
108
109
110
[root@node-249 test]# ls 10*
101  102  103  104  105  106  107  108  109
[root@node-249 test]# ls |egrep '10?'
101
102
103
104
105
106
107
108
109
110
[root@node-249 test]# ls 10?
101  102  103  104  105  106  107  108  109

2. grep

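A text search tool

grep (the name comes from "Globally search a Regular Expression and Print") is a powerful text search tool. It searches text for a given pattern (including regular expressions) and prints the matching lines by default. The Unix grep family consists of grep, egrep, and fgrep. The comparable command on Windows is FINDSTR.

egrep and fgrep differ only slightly from grep. egrep is grep with extended regular expressions, which support more metacharacters. fgrep stands for fixed grep or fast grep: it treats the pattern as a fixed string, so regex metacharacters stand for their literal selves and are no longer special. Linux uses the GNU version of grep, which is more capable: the -G, -E, and -F command-line options select basic, extended, or fixed-string matching, so -E and -F provide the functionality of egrep and fgrep.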

In addition, two variant programs egrep and fgrep are available. egrep is the same as grep -E. fgrep is the same as grep -F. Direct invocation as either egrep or fgrep is deprecated, but is provided to allow historical applications that rely on them to run unmodified.

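Using egrep and fgrep directly is therefore not recommended.

By default grep supports only basic regular expressions; grep -E (or egrep) enables extended regular expressions.

For more on basic and extended regular expressions, see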
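
The difference between the three modes shows up as soon as a pattern contains a metacharacter. This is a minimal sketch assuming GNU grep behavior, where + is an ordinary character in a basic regular expression; the four input lines are made up.

```shell
text=$(printf '%s\n' 'a+b' 'aab' 'a.b' 'axb')

# Basic RE (default): + is literal, so only the line "a+b" matches.
bre=$(printf '%s\n' "$text" | grep 'a+b' | paste -sd' ' -)

# Extended RE (-E): + means "one or more a", so only "aab" matches.
ere=$(printf '%s\n' "$text" | grep -E 'a+b' | paste -sd' ' -)

# Fixed string (-F): . is literal, so only "a.b" matches.
fixed=$(printf '%s\n' "$text" | grep -F 'a.b' | paste -sd' ' -)

# As a regex, . matches any single character, so all four lines match.
regex=$(printf '%s\n' "$text" | grep 'a.b' | paste -sd' ' -)

echo "grep    'a+b': $bre"
echo "grep -E 'a+b': $ere"
echo "grep -F 'a.b': $fixed"
echo "grep    'a.b': $regex"
```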
https://blog.csdn.net/u010230019/article/details/132075257
https://blog.csdn.net/u010230019/article/details/132097203

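2.1 Basic usage

grep [-acinv] [--color=auto] 'search string' filename
# Options and arguments:
-a : search a binary file as if it were a text file
-c : count the lines that contain 'search string'
-i : ignore case, so upper and lower case are treated as the same
-n : also print line numbers
-v : invert the match, i.e. print the lines that do not contain 'search string'
--color=auto : highlight the matched text in color

Example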

[root@node-249 test]# cat txt
100
101
105
110
111
115
120
121
125
Dog
dog
[root@node-249 test]# grep -c '11' txt
3
[root@node-249 test]# vim txt
[root@node-249 test]# grep -i 'dog' txt
Dog
dog
[root@node-249 test]# grep  'dog' txt
dog
[root@node-249 test]# grep -ni 'dog' txt
10:Dog
11:dog
[root@node-249 test]# grep -v 'dog' txt
100
101
105
110
111
115
120
121
125
Dog
[root@node-249 test]# grep --color 'dog' txt
dog
[root@node-249 test]# grep --color '11' txt
110
111
115
[root@node-249 test]# grep --color '10' txt
100
101
105
110
[root@node-249 test]# alias
alias cp='cp -i'
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias grep='grep --color=auto'
alias l.='ls -d .* --color=auto'
alias ll='ls -l --color=auto'
alias ls='ls --color=auto'
alias mv='mv -i'
alias rm='rm -i'
alias which='alias | /usr/bin/which --tty-only --read-alias --show-dot --show-tilde'

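2.2 Advanced usage

[dmtsai@study ~]$ grep [-A] [-B] [--color=auto] 'search string' filename
# Options and arguments:
-A : takes a number and means "after": besides the matching line, also print the n lines that follow it
-B : takes a number and means "before": besides the matching line, also print the n lines that precede it
-n : show the line numbers of matches
-i : ignore case
--color=auto : highlight the matched text in color
-l : print only the names of files that contain a match

Example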

[root@node-249 test]# grep -A1 -B1 '110' txt
105
110
111
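
The -l option is listed above but not shown, so here is a small sketch with two made-up files in a scratch directory. It also uses -C, which GNU and BSD grep accept as shorthand for equal -A and -B values.

```shell
dir=$(mktemp -d)
printf '%s\n' 100 105 110 111 115 > "$dir/a.txt"
printf '%s\n' 200 205 > "$dir/b.txt"

# -C1 prints one line of context on each side, like -A1 -B1.
ctx=$(grep -C1 '110' "$dir/a.txt" | paste -sd' ' -)

# -l prints only the names of the files that contain a match.
files=$(cd "$dir" && grep -l '11' a.txt b.txt)

echo "context: $ctx"      # 105 110 111
echo "files:   $files"    # a.txt
rm -rf "$dir"
```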

2.3 Using grep with regular expressions

Here we write a file with more content

[root@node-249 test]# cat txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However, this dress is about $ 3183 dollars.^M
GNU is free air not free beer.^M
Her hair is very beauty.^M
I can't finish the test.^M
Oh! The soup taste good.^M
motorcycle is cheap than car.
This window is clear.
the symbol '*' is represented as start.
Oh! My god!
The gd software is a library for drafting programs.^M
You are the best is mean you are the no. 1.
The world <Happy> is the same with "glad".
I like dog.
google is the best tools for search keyword.
goooooogle yes!
go! go! Let's go.
# I am VBird
  • Using brackets [] to match a set of characters
# [ae] matches exactly one character, either a or e
[root@node-249 test]# grep -n 't[ae]ste*' txt
8:I can't finish the test.^M
9:Oh! The soup taste good.^M

# match "oo" that is not preceded by g or o
[root@node-249 test]# grep -n '[^go]oo' txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.

# match "oo" that is not preceded by a lowercase letter
[root@node-249 test]# grep -n '[^a-z]oo' txt
3:Football game is not use feet only.

[root@node-249 test]# grep -n '[^[:lower:]]oo' txt
3:Football game is not use feet only.

# match digits
[root@node-249 test]# grep -n '[0-9]' txt
5:However, this dress is about $ 3183 dollars.^M
15:You are the best is mean you are the no. 1.

# match digits with a POSIX class (quoted, so the shell cannot expand the pattern as a wildcard)
[root@node-249 test]# grep -n '[[:digit:]]' txt
5:However, this dress is about $ 3183 dollars.^M
15:You are the best is mean you are the no. 1.
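
Quoting the pattern matters because an unquoted bracket expression is also a valid shell wildcard. The sketch below sets up a scratch directory (file names and contents are made up) in which a file named 5 exists, so the shell rewrites the unquoted pattern before grep ever sees it.

```shell
start_dir=$PWD
dir=$(mktemp -d)
cd "$dir" || exit 1
printf '%s\n' 'no. 1' 'price 3183' 'room 5' > txt
touch 5     # a file whose name happens to be a single digit

# Unquoted: the shell expands [[:digit:]] to the file name 5,
# so this really runs: grep -c 5 txt
unquoted=$(grep -c [[:digit:]] txt)

# Quoted: grep receives the bracket expression itself.
quoted=$(grep -c '[[:digit:]]' txt)

echo "unquoted: $unquoted matching line(s)"    # 1
echo "quoted:   $quoted matching line(s)"      # 3

cd "$start_dir" || exit 1
rm -rf "$dir"
```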
  • Start-of-line and end-of-line anchors ^ $
# lines that start with "the"
[root@node-249 test]# grep -n '^the' txt
12:the symbol '*' is represented as start.

# lines that start with an uppercase letter
[root@node-249 test]# grep -n '^[A-Z]' txt
# lines that contain at least one character that is not an uppercase letter
[root@node-249 test]# grep -n '[^A-Z]' txt

# lines that start with a lowercase letter
[root@node-249 test]# grep -n '^[[:lower:]]' txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as start.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.

# lines that do not start with a lowercase letter
[root@node-249 test]# grep -n '^[^[:lower:]]' txt
1:"Open Source" is a good mechanism to develop programs.
3:Football game is not use feet only.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
7:Her hair is very beauty.^M
8:I can't finish the test.^M
9:Oh! The soup taste good.^M
11:This window is clear.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
21:# I am VBird

# lines that do not start with a letter
[root@node-249 test]# grep -n '^[^a-zA-Z]' txt
1:"Open Source" is a good mechanism to develop programs.
21:# I am VBird

[root@node-249 test]# grep -n '^[^[:alpha:]]' txt
1:"Open Source" is a good mechanism to develop programs.
21:# I am VBird

# lines that end with a period
[root@node-249 test]# grep -n '\.$' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
...

# match empty lines and non-empty lines
[root@node-249 test]# echo '' >> txt
[root@node-249 test]# grep -n '^$' txt
22:
[root@node-249 test]# grep -nv '^$' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
...
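
One thing to watch with the $ anchor: assuming the ^M shown at the end of several sample lines stands for a carriage return (a DOS line ending), those lines do not end with a period as far as grep on Linux is concerned, because the carriage return comes after it. A minimal sketch with two made-up lines:

```shell
# One Unix line and one DOS line (\r\n); both look like they end with a period.
plain=$(printf 'unix line.\ndos line.\r\n' | grep -c '\.$')

# Deleting the carriage returns first lets both lines match.
stripped=$(printf 'unix line.\ndos line.\r\n' | tr -d '\r' | grep -c '\.$')

echo "as is:       $plain"       # 1
echo "CRs removed: $stripped"    # 2
```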
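  • Any single character . and the repetition character *

. (period): means exactly one arbitrary character;
* (asterisk): means "repeat the preceding character zero or more times"; it always works in combination with the character before it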

[root@node-249 test]# grep -n 'g..d' txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.^M
16:The world <Happy> is the same with "glad".

# strings with two or more consecutive o's
[root@node-249 test]# grep -n 'ooo*' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.^M
18:google is the best tools for search keyword.
19:goooooogle yes!

# starts with g, ends with g, with at least one o in between
[root@node-249 test]# grep -n 'goo*g' txt
18:google is the best tools for search keyword.
19:goooooogle yes!

# starts with g, ends with g, with any number of characters in between
[root@node-249 test]# grep -n 'g.*g' txt
1:"Open Source" is a good mechanism to develop programs.
14:The gd software is a library for drafting programs.^M
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.

# at least two consecutive digits
[root@node-249 test]# grep -n '[0-9][0-9][0-9]*' txt
5:However, this dress is about $ 3183 dollars.^M
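
To see how much text . and * actually consume, it helps to print only the matched part. This sketch assumes a grep with -o (GNU or BSD); the first input is line 20 of the sample file, the others are made up.

```shell
line="go! go! Let's go."

# .* is greedy: the match runs from the first g to the last g on the line.
greedy=$(printf '%s\n' "$line" | grep -o 'g.*g')

# ooo* takes the whole run of o's, not just the first two.
run=$(printf 'goooooogle\n' | grep -o 'ooo*')

# o* alone can match the empty string, so even a line without o matches.
empty_ok=$(printf 'abc\n' | grep -c 'o*')

echo "g.*g -> $greedy"                      # go! go! Let's g
echo "ooo* -> $run"                         # oooooo
echo "o*   -> $empty_ok matching line(s)"   # 1
```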
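  • Limiting the repetition count of an RE character {}
    Restricts how many times the preceding character may repeat

Note: in a basic regular expression, { and } are ordinary characters, so they must be escaped as \{ and \} to act as a repetition count. With grep -E they are written without the backslashes.

# two consecutive o's (lines with longer runs also match, because the pattern is not anchored)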
[root@node-249 test]# grep -n 'o\{2\}' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! The soup taste good.^M
18:google is the best tools for search keyword.
19:goooooogle yes!

# limiting the count to a range
[root@node-249 test]# grep -n 'go\{2,5\}g' txt
18:google is the best tools for search keyword.
[root@node-249 test]# grep -n 'go\{2,\}g' txt
18:google is the best tools for search keyword.
19:goooooogle yes!
[root@node-249 test]# grep -n 'go\{,5\}g' txt
18:google is the best tools for search keyword.
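
The same interval can be written in basic or extended syntax. The following is a small sketch on a made-up word list; the last command shows that unescaped braces in a basic regular expression are taken literally, so nothing matches (the || true only keeps grep's "no match" exit status from stopping a strict script).

```shell
words=$(printf '%s\n' gog goog gooog goooooog)

bre=$(printf '%s\n' "$words" | grep 'go\{2,3\}g' | paste -sd' ' -)
ere=$(printf '%s\n' "$words" | grep -E 'go{2,3}g' | paste -sd' ' -)
atleast=$(printf '%s\n' "$words" | grep -E 'go{2,}g' | paste -sd' ' -)

# Without backslashes, BRE looks for the literal text "go{2,3}g".
literal=$(printf '%s\n' "$words" | grep -c 'go{2,3}g' || true)

echo "BRE go\{2,3\}g: $bre"        # goog gooog
echo "ERE go{2,3}g:   $ere"        # goog gooog
echo "ERE go{2,}g:    $atleast"    # goog gooog goooooog
echo "BRE go{2,3}g:   $literal matching line(s)"    # 0
```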

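2.4 Extended grep

The extended grep referred to here is egrep. Since egrep is not recommended, we use grep -E instead.

egrep is an enhanced grep that searches with extended regular expressions, which offer more metacharacters, more ways to express repetition, and alternation between several choices.

# match one or more repetitions of the string inside ()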
[root@node-249 test]#  echo 'AxyzxyzxyzxyzC' | grep -E 'A(xyz)+C'
AxyzxyzxyzxyzC
[root@node-249 test]#  echo 'AxyzxyzxyzxyzC' | grep -E 'A(x)+C'
[root@node-249 test]#

# match zero or one o (the pattern can match nothing at all, so every line is printed)
[root@node-249 test]# grep -En 'o?' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
7:Her hair is very beauty.^M
8:I can't finish the test.^M
9:Oh! The soup taste good.^M
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am VBird
22:

# match zero or more o (again, every line is printed)
[root@node-249 test]# grep -En 'o*' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
7:Her hair is very beauty.^M
8:I can't finish the test.^M
9:Oh! The soup taste good.^M
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
21:# I am VBird
22:

# match one or more o
[root@node-249 test]# grep -En 'o+' txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However, this dress is about $ 3183 dollars.^M
6:GNU is free air not free beer.^M
9:Oh! The soup taste good.^M
10:motorcycle is cheap than car.
11:This window is clear.
12:the symbol '*' is represented as start.
13:Oh! My god!
14:The gd software is a library for drafting programs.^M
15:You are the best is mean you are the no. 1.
16:The world <Happy> is the same with "glad".
17:I like dog.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.

# match good or glad
[root@node-249 test]# grep -En 'g(oo|la)d' txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.^M
16:The world <Happy> is the same with "glad".

# only o's between g and d, at least one
[root@node-249 test]# grep -En 'g(o)+d' txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! The soup taste good.^M
13:Oh! My god!
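
Counting matches makes the difference between ?, * and + easier to see than scrolling through full listings. A minimal sketch on three lines (two taken from the sample file, one made up):

```shell
sample=$(printf '%s\n' 'Her hair is very beauty.' 'Oh! My god!' 'xyz')

# o? and o* can both match the empty string, so every line counts as a match.
optional=$(printf '%s\n' "$sample" | grep -Ec 'o?')
star=$(printf '%s\n' "$sample" | grep -Ec 'o*')

# o+ needs at least one lowercase o, which only "Oh! My god!" has.
plus=$(printf '%s\n' "$sample" | grep -Ec 'o+')

echo "o?: $optional  o*: $star  o+: $plus"    # o?: 3  o*: 3  o+: 1
```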

Commonly used commands

# strip empty lines and comment lines, which is handy for reading configuration files
[root@node-249 test]# grep -Ev '^$|^#' txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
However, this dress is about $ 3183 dollars.^M
...
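
Real configuration files often indent their comments or contain lines with only spaces, which '^$|^#' lets through. One possible variant, sketched here on a made-up file, allows optional leading whitespace before the # or the end of the line:

```shell
conf=$(mktemp)
printf '%s\n' '# main settings' 'port=8080' '' '   # indented comment' '  ' 'host=localhost' > "$conf"

# The pattern from above keeps the indented comment and the whitespace-only line.
strict=$(grep -Evc '^$|^#' "$conf")

# Allowing leading whitespace removes those as well.
loose=$(grep -Ev '^[[:space:]]*(#|$)' "$conf" | paste -sd' ' -)

echo "lines kept by '^\$|^#': $strict"    # 4
echo "lines kept by variant: $loose"      # port=8080 host=localhost
rm -f "$conf"
```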