Shell Programming: grep

阿无@_@

Article link: https://blog.csdn.net/jiaofan_yun/article/details/122976339

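This article introduces basic and extended regular expressions and how to use them with the grep command. The examples show how to filter data with grep: matching specific characters, whole words, and lines, and matching a digit repeated a given number of times. It also covers the regular-expression features defined by POSIX and those added by GNU, such as word-boundary matching and special character classes. These techniques are useful for everyday data processing and text analysis.

Abstract generated automatically by CSDN.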

I. Data Filtering and Regular Expressions

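Usage:

grep  [options]  pattern  [file]
Common options:
        -i     Ignore letter case.
        -v     Invert the match (select the lines that do not match).
        -w     Match whole words only.
        -q     Quiet match: print nothing to the screen; the exit status reports whether a match was found.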

Example:

[root@localhost jiaofan]# cat test.txt             #<== file contents
th The ccc
the bbb
theabc
hello world
[root@localhost jiaofan]# grep the test.txt        #<== select lines containing "the" (case-sensitive)
the bbb
theabc
[root@localhost jiaofan]# grep -i the test.txt           #<== select lines containing "the" (case-insensitive)
th The ccc
the bbb
theabc
[root@localhost jiaofan]# grep -w the test.txt           #<== select lines containing the word "the" (theabc is not the word "the", so it is not selected)
the bbb
[root@localhost jiaofan]# grep -v the test.txt           #<== select lines that do not contain "the" (case-sensitive)
th The ccc
hello world
[root@localhost jiaofan]# grep -q the test.txt           #<== quiet match: no output, the result is in the exit status ($?)
[root@localhost jiaofan]#
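
Because -q prints nothing, it is normally used for its exit status inside a script. The following is a minimal sketch of that idea, assuming the same four-line sample file as above, recreated here in a temporary file. The -c option (count matching lines) is not covered in the list above and is used only to make the effect of each option visible as a number; the variable names are illustrative.

```shell
#!/bin/sh
# Recreate the sample file from the example above in a temporary location.
sample=$(mktemp)
printf '%s\n' 'th The ccc' 'the bbb' 'theabc' 'hello world' > "$sample"

# -q prints nothing; grep exits with 0 when a line matched and 1 when none did.
if grep -q 'the' "$sample"; then
    found=yes
else
    found=no
fi

# -c prints the number of selected lines instead of the lines themselves.
plain=$(grep -c 'the' "$sample")        # lines containing "the"
nocase=$(grep -ci 'the' "$sample")      # also counts the line with "The"
word=$(grep -cw 'the' "$sample")        # "the" as a whole word only
inverted=$(grep -cv 'the' "$sample")    # lines without "the"

echo "found=$found plain=$plain nocase=$nocase word=$word inverted=$inverted"
# prints: found=yes plain=2 nocase=3 word=1 inverted=2

rm -f "$sample"
```

Testing the exit status directly in the if statement avoids capturing output that is never used.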

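1) Basic regular expressions

Pattern	Meaning
c	Matches the letter c
.	Matches any single character
*	Matches zero or more occurrences of the preceding character
.*	Matches any string of characters, including the empty string
[]	Matches any single character in the set; the brackets may hold any number of characters
[x-y]	Matches any single character in the range x through y
^	Matches the start of a line
$	Matches the end of a line
[^]	Negation: matches any single character that is not in the bracketed set
\	Escapes the following character so that it is matched literally
\{n,m\}	Matches the preceding character repeated n to m times
\{n,\}	Matches the preceding character repeated at least n times
\(\)	Stores the text matched between \( and \) in a "hold space" (capture group); up to 9 can be stored
\n	Recalls the contents of a hold space through \1 to \9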

Example:

[root@localhost jiaofan]# cp /etc/passwd /tmp                      #<== copy a file to use as sample material
[root@localhost jiaofan]# grep "root" /tmp/passwd                #<== find lines containing root
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
[root@localhost jiaofan]# grep ":..0:" /tmp/passwd                 #<== find strings with any two characters between ":" and "0:", and print those lines
root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
games:x:12:100:games:/usr/games:/sbin/nologin
[root@localhost jiaofan]# grep "00*" /tmp/passwd                   #<== find lines containing at least one 0
root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
games:x:12:100:games:/usr/games:/sbin/nologin
... ...
[root@localhost jiaofan]# grep "o[os]t" /tmp/passwd                 #<== match lines with an o or an s between o and t
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
[root@localhost jiaofan]# grep "[0-9]" /tmp/passwd                    #<== match lines containing a digit
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
... ...
[root@localhost jiaofan]# grep "^root" /tmp/passwd                      #<== match lines that start with root
root:x:0:0:root:/root:/bin/bash
[root@localhost jiaofan]# grep "bash$" /tmp/passwd                    #<== match lines that end with bash
root:x:0:0:root:/root:/bin/bash
jiaofan:x:1000:1001::/home/jiaofan:/bin/bash
... ...
[root@localhost jiaofan]# grep "sbin[^n]" /tmp/passwd                  #<== match lines where sbin is followed by a character other than n
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
... ...
[root@localhost jiaofan]# grep "0\{1,2\}" /tmp/passwd                   #<== find lines with a run of one to two 0s (the pattern is unanchored, so any line containing a 0 matches)
root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
games:x:12:100:games:/usr/games:/sbin/nologin
... ...
[root@localhost jiaofan]# grep "\(root\).*\1"  /tmp/passwd              #<== match lines with two occurrences of root separated by any characters. \(root\) matches root and stores it in the hold space, .* matches any characters, and \1 recalls root from the hold space.
root:x:0:0:root:/root:/bin/bash
[root@localhost jiaofan]# grep "^$" /tmp/passwd                         #<== find empty lines
[root@localhost jiaofan]# grep -v "^$" /tmp/passwd                     #<== find all lines except empty ones
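
Interval expressions and back-references are easy to misread, so here is a small sketch that counts matches instead of relying on /etc/passwd. The sample lines are hypothetical passwd-style entries written to a temporary file, the variable names are illustrative, and -c (count matching lines) is used only to make the results easy to compare.

```shell
#!/bin/sh
# Hypothetical passwd-style sample data standing in for /tmp/passwd.
data=$(mktemp)
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'nobody:x:99:99:Nobody:/:/sbin/nologin' \
    'games:x:12:100:games:/usr/games:/sbin/nologin' \
    'jiaofan:x:1000:1001::/home/jiaofan:/bin/bash' > "$data"

# Unanchored, 0\{1,2\} selects every line that has at least one 0.
any_zero=$(grep -c '0\{1,2\}' "$data")

# Fencing the run in with other characters pins the count:
# a field made of one non-zero digit followed by exactly two zeros.
exactly_two=$(grep -c ':[1-9]0\{2\}:' "$data")

# \{3,\} has no upper bound: three or more consecutive zeros.
three_plus=$(grep -c '0\{3,\}' "$data")

# \(...\) captures the first field; \1 requires the same text again later.
repeated=$(grep -c '^\([a-z]*\):.*\1' "$data")

echo "any_zero=$any_zero exactly_two=$exactly_two three_plus=$three_plus repeated=$repeated"
# prints: any_zero=3 exactly_two=1 three_plus=1 repeated=3

rm -f "$data"
```

The nobody line is not counted by the last pattern because the later field is spelled Nobody, and a back-reference must repeat the captured text exactly.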

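2) Extended regular expressions

Pattern	Meaning
{n,m}	Same as \{n,m\} in basic regular expressions
+	Matches one or more occurrences of the preceding character
?	Matches zero or one occurrence of the preceding character
|	Logical OR: matches the string before or after the |
()	Groups a sub-expression and also captures it; same as \(\) in basic regular expressions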

Example: grep does not use extended regular expressions by default; filter with them by running grep -E or the egrep command.

[root@localhost jiaofan]# egrep "0{1,2}" /tmp/passwd              #<== find lines with a run of one to two 0s
root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
games:x:12:100:games:/usr/games:/sbin/nologin
... ...
[root@localhost jiaofan]# grep -E "0+" /tmp/passwd                #<== find lines containing at least one 0
root:x:0:0:root:/root:/bin/bash
sync:x:5:0:sync:/sbin:/bin/sync
games:x:12:100:games:/usr/games:/sbin/nologin
... ...
[root@localhost jiaofan]# egrep  "(root|jiaofan)" /tmp/passwd      #<== find lines containing root or jiaofan
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
jiaofan:x:1000:1001::/home/jiaofan:/bin/bash
[root@localhost jiaofan]#
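
The difference between ? and + is easiest to see on a word list. The sketch below uses a hypothetical list written to a temporary file and anchors each pattern with ^ and $ so that the whole line must match. It calls grep -E rather than egrep, since recent GNU grep releases treat egrep as obsolescent; the variable names are illustrative.

```shell
#!/bin/sh
# Hypothetical word list, one word per line.
words=$(mktemp)
printf '%s\n' 'color' 'colour' 'colouur' 'root' 'jiaofan' 'rt' > "$words"

# ? : the preceding character is optional (zero or one occurrence).
optional=$(grep -Ec '^colou?r$' "$words")       # color, colour

# + : the preceding character occurs one or more times.
one_plus=$(grep -Ec '^colou+r$' "$words")       # colour, colouur

# | inside () : either alternative may fill the group.
either=$(grep -Ec '^(root|jiaofan)$' "$words")  # root, jiaofan

# The same interval in extended and in basic syntax.
ere_interval=$(grep -Ec '^ro{1,2}t$' "$words")  # root
bre_interval=$(grep -c '^ro\{1,2\}t$' "$words") # root

echo "optional=$optional one_plus=$one_plus either=$either ere=$ere_interval bre=$bre_interval"
# prints: optional=2 one_plus=2 either=2 ere=1 bre=1

rm -f "$words"
```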

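3) POSIX regular expressions

Class	Meaning	Class	Meaning
[:alpha:]	Letters	[:graph:]	Visible characters (printable, excluding space)
[:alnum:]	Letters and digits	[:print:]	Printable characters (including space)
[:cntrl:]	Control characters	[:space:]	Whitespace characters (space, tab, newline, etc.)
[:digit:]	Digits	[:blank:]	Space and tab
[:xdigit:]	Hexadecimal digits	[:lower:]	Lowercase letters
[:punct:]	Punctuation	[:upper:]	Uppercase letters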

Example:

[root@localhost jiaofan]# grep "[[:digit:]]" /tmp/passwd    #<== find lines containing a digit
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
... ...
[root@localhost jiaofan]# grep "[[:alpha:]]" /tmp/passwd    #<== find lines containing a letter
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
... ...
[root@localhost jiaofan]# grep "[[:punct:]]" /tmp/passwd     #<== find lines containing punctuation
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
... ...
[root@localhost jiaofan]# grep "[[:space:]]" /tmp/passwd     #<== find lines containing whitespace
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
systemd-network:x:192:192:systemd Network Management:/:/sbin/nologin
... ...
[root@localhost jiaofan]#
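
A class name such as [:digit:] is only valid inside a bracket expression, which is why the examples above write [[:digit:]] with doubled brackets. Combined with ^ and $, a class can validate a whole value instead of merely finding one character. The following sketch assumes a hypothetical helper named is_uint and a small temporary sample file; neither comes from the original article.

```shell
#!/bin/sh
# is_uint is a hypothetical helper: it succeeds only when $1 consists of
# one or more digits and nothing else.
is_uint() {
    printf '%s\n' "$1" | grep -q '^[[:digit:]][[:digit:]]*$'
}

if is_uint 1001; then uid_ok=yes; else uid_ok=no; fi
if is_uint 12a;  then mixed_ok=yes; else mixed_ok=no; fi

# Three sample lines; the third contains a tab rather than a space.
lines=$(mktemp)
printf 'FTP User\nroot\ntab\there\n' > "$lines"

with_space=$(grep -c '[[:space:]]' "$lines")   # space or tab: 2 lines
with_blank=$(grep -c '[[:blank:]]' "$lines")   # same here: 2 lines
with_upper=$(grep -c '[[:upper:]]' "$lines")   # only "FTP User"

echo "uid_ok=$uid_ok mixed_ok=$mixed_ok space=$with_space blank=$with_blank upper=$with_upper"
# prints: uid_ok=yes mixed_ok=no space=2 blank=2 upper=1

rm -f "$lines"
```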

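4) GNU extensions

Pattern	Meaning
\b	Word boundary: matches the start or end of a word
\B	Opposite of \b: matches a position that is not a word boundary (inside a word)
\w	Same as [_[:alnum:]]: matches a letter, digit, or underscore
\W	Same as [^_[:alnum:]]: matches any character that is not a letter, digit, or underscore
\d	Any digit
\D	Any non-digit
\s	Any whitespace character (space, tab, etc.)
\S	Any non-whitespace character

Note: \d, \D, \s, and \S are supported by only some tools. With grep, add the -P option (optionally with --color); -P makes grep use Perl-compatible regular expressions. GNU grep also accepts \s and \S without -P, but \d and \D require it.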

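Example:

[root@localhost jiaofan]# grep "i\b" /tmp/passwd             #<== match lines containing a word that ends in i
jiaoi:x:1115:1115::/home/jiaoi:/bin/bash
[root@localhost jiaofan]# grep "\W" /tmp/passwd              #<== match lines containing a character that is not a letter, digit, or underscore
[root@localhost jiaofan]# grep "\w" /tmp/passwd              #<== match lines containing a letter, digit, or underscore
[root@localhost jiaofan]# grep -P --color "\d" /tmp/passwd   #<== match lines containing any digit
[root@localhost jiaofan]# grep -P --color "\D" /tmp/passwd   #<== match lines containing any non-digit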
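
The placement of \b decides which side of the word is pinned down, and \b on both sides selects the same lines as the -w option. The sketch below assumes GNU grep (other implementations may not support \b, \B, or \w) and a hypothetical four-line sample in a temporary file. Because -P is only available when grep was built with PCRE support, the last step probes for it first and otherwise falls back to [[:digit:]].

```shell
#!/bin/sh
# Hypothetical sample lines; GNU grep is assumed for \b, \B and \w.
text=$(mktemp)
printf '%s\n' 'the bbb' 'theabc' 'bathe now' 'user_1 ok' > "$text"

starts=$(grep -c '\bthe' "$text")      # "the" begins a word: the bbb, theabc
ends=$(grep -c 'the\b' "$text")        # "the" ends a word: the bbb, bathe now
whole=$(grep -c '\bthe\b' "$text")     # whole word: the bbb
whole_w=$(grep -cw 'the' "$text")      # -w selects the same line
inside=$(grep -c '\Bthe' "$text")      # "the" not at a word start: bathe now

# \w includes the underscore, so only "theabc" is made of word characters alone.
word_only=$(grep -c '^\w\w*$' "$text")

# \d needs -P; probe for PCRE support before relying on it.
if printf '1\n' | grep -qP '\d' 2>/dev/null; then
    digits=$(grep -cP '\d' "$text")          # user_1 ok
else
    digits=$(grep -c '[[:digit:]]' "$text")  # portable fallback
fi

echo "starts=$starts ends=$ends whole=$whole whole_w=$whole_w inside=$inside word_only=$word_only digits=$digits"
# prints: starts=2 ends=2 whole=1 whole_w=1 inside=1 word_only=1 digits=1

rm -f "$text"
```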