Linux text-processing commands: grep, sed, and awk

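Lately I have been hooked on the Linux shell, and its text-processing commands such as grep, sed, and awk really do have something to them. So here is a summary.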

1.grep

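First up is grep, which searches files for a given string. When the string is found, grep prints every line that contains it; if color output is enabled (many distributions alias grep to grep --color=auto), the matched string is also highlighted.
grep syntax: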

grep [-abcEFGhHilLnqrsvVwxy][-A<lines>][-B<lines>][-C<lines>]
[-d<action>][-e<pattern>][-f<pattern file>][--help][pattern][file or directory...]

That is a lot of options, so let's start with the simplest form: grep string file

[root@localhost ~]# cat /etc/test.txt #contents of the test file
Hi,what's your name?
I'm very happy to meet you. 
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
[root@localhost ~]# grep you /etc/test.txt 
Hi,what's your name?
I'm very happy to meet you. 
How do you do?
Would you like to drink sth?
Where are you from?
#Every line of test.txt that contains you is shown; the highlighting does not survive copy-and-paste, sorry.
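
A few other commonly used options fit the same grep string file shape. The sketch below recreates the nine-line test file through a helper function called sample (a name introduced here purely for illustration), so nothing under /etc is touched.

```shell
# Print the nine-line test file; a function stands in for /etc/test.txt.
sample() {
    cat <<'EOF'
Hi,what's your name?
I'm very happy to meet you.
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
EOF
}

sample | grep -n you      # -n prefixes each matching line with its line number
sample | grep -c you      # -c prints only the number of matching lines (5 here)
sample | grep -i LINUX    # -i ignores case, so this finds "Myname is Linux."
```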

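Now let's add an option, for example -r (recursive), which searches a whole directory tree. Here we recursively search /etc/sysconfig for lines containing the string update; the output shows not only each matching line but also the name of the file it came from.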

[root@localhost sysconfig]# grep -r update /etc/sysconfig 
/etc/sysconfig/network-scripts/ifup-TeamPort:       /usr/bin/teamdctl ${TEAM_MASTER} port config update ${DEVICE} "${TEAM_PORT_CONFIG}" || exit 1
/etc/sysconfig/network-scripts/ifdown-post:update_DNS_entries
/etc/sysconfig/network-scripts/ifup-aliases:# addrs will be updated on existing aliases, and new aliases will be setup.
/etc/sysconfig/network-scripts/ifup-aliases:        # update ARP cache of neighboring computers:
/etc/sysconfig/network-scripts/ifup-eth:            # update ARP cache of neighboring computers
/etc/sysconfig/network-scripts/ifup-eth:            if ! is_false "${arpupdate[$idx]}" && [ "${REALDEVICE}" != "lo" ]; then
/etc/sysconfig/network-scripts/ifup-post:    update_DNS_entries
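
To try recursive search without depending on what is installed under /etc/sysconfig, here is a small sketch on a throwaway directory. The directory layout and file names are invented for the example.

```shell
# Throwaway directory tree; the file names are made up for this sketch.
dir=$(mktemp -d)
mkdir "$dir/sub"
printf 'update me\n'    > "$dir/a.txt"
printf 'nothing here\n' > "$dir/sub/b.txt"
printf 'last update\n'  > "$dir/sub/c.txt"

grep -r update "$dir"     # path:line for every matching line
grep -rl update "$dir"    # -l lists only the names of files that contain a match
grep -rn update "$dir"    # -n adds the line number after the path

# Keep the count of matching files, then clean up.
matched_files=$(grep -rl update "$dir" | wc -l)
rm -rf "$dir"
```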

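Next, try -v, which inverts the match and prints only the lines that do not match the pattern, here you*. As covered in the previous post on regular expressions, * means zero or more of the preceding character, so you* matches yo followed by any number of u; it does not mean "starts with you". In test.txt, every line containing you or your also contains yo, so all of them are filtered out. The pattern is quoted so that the shell cannot expand you* as a filename wildcard.

[root@localhost /]# grep -v 'you*' /etc/test.txt 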
I'm fine,thanks.
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
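
A small sketch of what you* really matches, using made-up input lines rather than the test file:

```shell
# 'you*' is "yo" followed by zero or more "u", so a bare "yo" matches too.
printf '%s\n' 'yo-yo' 'your name' 'coffee' | grep 'you*'
# yo-yo
# your name

# With -v only the line without "yo" is left.
printf '%s\n' 'yo-yo' 'your name' 'coffee' | grep -v 'you*'
# coffee

# -w matches "you" only as a whole word, so "your" no longer counts.
printf '%s\n' 'thank you' 'your name' | grep -w 'you'
# thank you
```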

2.sed

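Next comes sed, which processes text according to a script. Following the commands in that script, sed can process and edit text files. It is mainly used to edit one or more files automatically, to simplify repetitive operations on files, and to write conversion programs.
sed syntax: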

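sed [-hnV][-e<script>][-f<script file>][text file]
#There are not many options, so here is a brief rundown
#-h prints help; GNU sed documents this as --help
#-n suppresses the automatic printing of every line, so only what the script explicitly prints (e.g. with p) is shown
#-V prints the version; GNU sed documents this as --version
#-e stands for expression: the script commands themselves, such as a (append), d (delete), i (insert), p (print), s (substitute)
#-f script file: the commands otherwise given with -e can be stored in a file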
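
As a minimal sketch of the last two options, the same two commands are given once with -e and once through a script file. The input lines and the temporary script file are invented for the example.

```shell
# Two -e expressions are applied in order to every input line.
printf '%s\n' one two three | sed -e '2d' -e 's/one/ONE/'
# ONE
# three

# The same commands stored in a script file and loaded with -f.
script=$(mktemp)
printf '%s\n' '2d' 's/one/ONE/' > "$script"
from_file=$(printf '%s\n' one two three | sed -f "$script")
printf '%s\n' "$from_file"
rm -f "$script"
```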

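Now let's use sed to append the string "I'm new here!" after the last line of test.txt.

[root@localhost /]# sed -e 9a\'I'm new here!' /etc/test.txt #the test file has 9 lines; the address is 9, not 10, because a appends after the addressed line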
Hi,what's your name?
I'm very happy to meet you. 
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
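'Im new here!  #the appended line; the shell reads 9a\'I'm new here!' as the script 9a'Im new here!, so the quote lands in front of the I
[root@localhost /]# cat /etc/test.txt #cat the file again: something seems to be missing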
Hi,what's your name?
I'm very happy to meet you. 
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
[root@localhost /]#

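Attentive readers will notice that cat shows test.txt unchanged. That is because sed operates on a stream and writes the result to standard output; it does not modify the file itself. Once we have confirmed that the output is correct, we can overwrite the source file.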

[root@localhost /]# sed -e 9a\'I'm new here!' /etc/test.txt > /etc/test.txt.tmp #redirect the output to a temporary file
[root@localhost /]# cat /etc/test.txt.tmp #check the temporary file
Hi,what's your name?
I'm very happy to meet you. 
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
'Im new here!
[root@localhost /]# mv /etc/test.txt.tmp /etc/test.txt #overwrite the source file
mv: overwrite '/etc/test.txt'? y
[root@localhost /]# cat /etc/test.txt
Hi,what's your name?
I'm very happy to meet you. 
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
'Im new here!  #the source file test.txt has now been modified.
[root@localhost /]# 
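#Another way is the -i option, which edits the file in place; it is not recommended, since there is no chance to check the result first.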
[root@localhost /]# sed -i 9a\'I'm new here!' /etc/test.txt 
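
The commands above append 'Im new here! because of how the shell handles the quotes. Below is a sketch of one way to get the intended I'm new here!, assuming GNU sed (the one-line a text form and -i with a suffix are used here). It works on a temporary file created for the example, not on /etc/test.txt.

```shell
# Work on a temporary copy, not on a real file under /etc.
f=$(mktemp)
printf '%s\n' 'line 1' 'line 2' > "$f"

# '\'' closes the single-quoted string, adds a literal ', and reopens it,
# so sed receives the script:  $a I'm new here!
# The address $ means "last line", so the line count need not be known.
# -i.bak edits in place but first saves the original as <file>.bak.
sed -i.bak '$a I'\''m new here!' "$f"

last_line=$(tail -n 1 "$f")          # the appended line
backup_lines=$(wc -l < "$f.bak")     # the backup still has the original 2 lines
printf '%s\n' "$last_line"
rm -f "$f" "$f.bak"
```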

Use sed to delete specific lines.

nl /etc/test.txt | sed '2d' #delete only line 2
nl /etc/test.txt | sed '3,$d' #delete from line 3 through the last line
#nl is a separate command that prints a file with line numbers added.

Use sed to search for lines containing a given string.

[root@localhost /]# nl /etc/test.txt | sed '/you/p'
     1  Hi,what's your name?
     1  Hi,what's your name?
     2  I'm very happy to meet you. 
     2  I'm very happy to meet you. 
     3  How do you do?
     3  How do you do?
     4  I'm fine,thanks.
     5  Would you like to drink sth?
     5  Would you like to drink sth?
     6  Myname is Linux.
     7  I wanna drink some coffee.
     8  It's a nice day,isn't it?
     9  Where are you from?
     9  Where are you from?
    10  'Im new here!

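Notice that the lines containing you are printed twice: sed prints every line by default, and the p command prints each matching line once more. With -n, only the lines containing you are printed.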

[root@localhost /]# nl /etc/test.txt | sed -n '/you/p'
     1  Hi,what's your name?
     2  I'm very happy to meet you. 
     3  How do you do?
     5  Would you like to drink sth?
     9  Where are you from?
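
The s (substitute) command listed among the script operations earlier is not demonstrated in this post. A minimal sketch on made-up input:

```shell
# s/old/new/ replaces the first match on each line; the g flag replaces all of them.
printf '%s\n' 'How do you do?' | sed 's/do/DO/'
# How DO you do?
printf '%s\n' 'How do you do?' | sed 's/do/DO/g'
# How DO you DO?

# An address restricts the command: substitute only on lines matching /you/.
printf '%s\n' 'thank you' 'thanks' | sed '/you/s/thank/THANK/'
# THANK you
# thanks
```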

3.awk

AWK is a language for processing text files and a powerful text-analysis tool.
awk syntax:

awk [options] 'script' var=value file(s)
awk [options] -f scriptfile var=value file(s)

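There are quite a few options, so instead of listing them let's go straight to examples. The outputs below correspond to a test file whose last line reads I'm new here!.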

[root@localhost /]# awk '{print $1,$4}' /etc/test.txt #print columns 1 and 4 of the file; columns are split on spaces or tabs by default.
Hi,what's 
I'm to
How do?
I'm 
Would to
Myname 
I some
It's day,isn't
Where from?
I'm 

The -F option sets the field separator. The test file contains quite a few ' characters, so let's split on '.

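[root@localhost /]# awk -F'  '{print $1,$4}' /etc/test.txt 
>
#No result: the quotes are unbalanced, so the shell shows its > continuation prompt and waits for more input. The ' has to be escaped for the shell (much like an escape character in a regular expression), so let's try a backslash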

[root@localhost /]# awk -F\'  '{print $1,$4}' /etc/test.txt 
Hi,what 
I 
How do you do? 
I 
Would you like to drink sth? 
Myname is Linux. 
I wanna drink some coffee. 
It 
Where are you from? 
I 
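
For reference, here is a sketch of three equivalent ways to hand a single quote to awk as the field separator, run on one line of the test file:

```shell
# Backslash-escaped quote, as in the command above.
printf '%s\n' "It's a nice day,isn't it?" | awk -F\' '{print $1, $2}'

# The quote wrapped in double quotes.
printf '%s\n' "It's a nice day,isn't it?" | awk -F"'" '{print $1, $2}'

# The built-in FS variable set with -v instead of -F.
printf '%s\n' "It's a nice day,isn't it?" | awk -v FS="'" '{print $1, $2}'

# Each command prints: It s a nice day,isn
```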

Next up are awk variables, which are rather neat.

awk -v var=value  #set a variable

For convenience, part of test.txt is modified.

[root@localhost /]# cat /etc/test.txt
3 Hi,what's your name?
6 I'm very happy to meet you. 
How do you do?
9 I'm fine,thanks.
Would you like to drink sth?
0 Myname is Linux.
1 I wanna drink some coffee.
It's a nice day,isn't it?
5 Where are you from?
I'm new here!

Compare the following two commands.

[root@localhost /]# awk -va=1 '{print $1,$1+a}' /etc/test.txt
3 4
6 7
How 1
9 10
Would 1
0 1
1 2
It's 1
5 6
I'm 1
[root@localhost /]# awk -va=1 '{print $1,$(1+a)}' /etc/test.txt
3 Hi,what's
6 I'm
How do
9 I'm
Would you
0 Myname
1 I
It's a
5 Where
I'm new

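The first result looks a little odd. $1+a is the value of the first column plus a, and a=1, so the lines that start with a number are easy to understand. When the first column is a string such as How, it is treated as 0 in arithmetic, so those lines all give 1. In the second command, $(1+a) is $2, so it simply prints the first and second columns. With that in mind, the next example should not be hard to follow (there, $1b concatenates $1 with the string variable b).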

[root@localhost /]# awk -va=1 -vb=s '{print $1,$1+a,$1b}' /etc/test.txt 
3 4 3s
6 7 6s
How 1 Hows
9 10 9s
Would 1 Woulds
0 1 0s
1 2 1s
It's 1 It'ss
5 6 5s
I'm 1 I'ms
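
The same two rules, arithmetic coercion and concatenation, can be seen on a few made-up input lines:

```shell
# Arithmetic forces a numeric value: a non-numeric string counts as 0,
# and a string with a leading number uses that leading number.
printf '%s\n' '3 a' 'How b' '12abc c' | awk -v a=1 '{print $1, $1+a}'
# 3 4
# How 1
# 12abc 13

# Writing two values side by side concatenates them as strings.
printf '%s\n' '3 a' | awk -v b=s '{print $1 b, $1 $2}'
# 3s 3a
```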

Filtering by value, for example selecting the lines whose first column is greater than 3.

[root@localhost /]# awk '$1>3' /etc/test.txt 
6 I'm very happy to meet you. 
How do you do?
9 I'm fine,thanks.
Would you like to drink sth?
It's a nice day,isn't it?
5 Where are you from?
I'm new here!

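You may wonder why the lines that start with a string were not filtered out. That is a habit of thought at work: above, strings were treated as 0, but that applies only to arithmetic. In a comparison, awk first checks whether the field looks like a number (0123, for example, is compared as 123). How does not, so the comparison is done on strings instead: the 3 is converted to the string "3" and the two are compared character by character (by ASCII code in the C locale), where letters sort after digits. That is why those lines pass the filter.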
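
If only numeric first columns should pass, the comparison can be forced to be numeric. A sketch on made-up input, showing two possible approaches:

```shell
# $1 > 3 is numeric only when $1 looks like a number; "How" is compared as a string.
printf '%s\n' '6 six' 'How do' '1 one' | awk '$1 > 3'
# 6 six
# How do

# Adding 0 forces a number, so "How" becomes 0 and is dropped.
printf '%s\n' '6 six' 'How do' '1 one' | awk '$1+0 > 3'
# 6 six

# Alternatively, require a purely numeric first field before comparing.
printf '%s\n' '6 six' 'How do' '1 one' | awk '$1 ~ /^[0-9]+$/ && $1 > 3'
# 6 six
```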
Finally, a few practical awk examples.
(1) Print the 9*9 multiplication table

[root@localhost /]# seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}'
1x1=1
1x2=2   2x2=4
1x3=3   2x3=6   3x3=9
1x4=4   2x4=8   3x4=12  4x4=16
1x5=5   2x5=10  3x5=15  4x5=20  5x5=25
1x6=6   2x6=12  3x6=18  4x6=24  5x6=30  6x6=36
1x7=7   2x7=14  3x7=21  4x7=28  5x7=35  6x7=42  7x7=49
1x8=8   2x8=16  3x8=24  4x8=32  5x8=40  6x8=48  7x8=56  8x8=64
1x9=9   2x9=18  3x9=27  4x9=36  5x9=45  6x9=54  7x9=63  8x9=72  9x9=81
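
How the pipeline works: seq 9 emits 1 to 9, sed 'H;g' turns each input line N into a blank line followed by 1..N, and awk with RS='' reads each blank-line-separated block as one record, so NR is the row and the field index is the column. The same table can also be produced with awk alone. The sketch below wraps it in a function named times_table, a name made up for this example.

```shell
# What the sed stage produces for a short input: blank, 1, blank, 1, 2, blank, 1, 2, 3.
seq 3 | sed 'H;g'

# The triangular 9x9 table with two nested loops; no input is read.
times_table() {
    awk 'BEGIN {
        for (row = 1; row <= 9; row++)
            for (col = 1; col <= row; col++)
                printf "%dx%d=%d%s", col, row, col * row, (col == row ? "\n" : "\t")
    }'
}

times_table
```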

(2) Filter by disk usage

[root@localhost ~]# df -h #show the file systems
Filesystem           Size  Used Avail  Use%  Mounted on
/dev/mapper/cl-root   50G  2.0G   49G    4%  /
devtmpfs             7.8G     0  7.8G    0%  /dev
tmpfs                7.8G     0  7.8G    0%  /dev/shm
tmpfs                7.8G   12M  7.8G    1%  /run
tmpfs                7.8G     0  7.8G    0%  /sys/fs/cgroup
/dev/sda1           1014M  177M  838M   18%  /boot
/dev/mapper/cl-home  441G   33M  441G    1%  /home
tmpfs                1.6G     0  1.6G    0%  /run/user/0

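Suppose we want the devices whose usage is above 3%. Counting columns, Use% is the fifth, i.e. $5, but its value ends in %. One approach is to split on both spaces and %, so that the fifth column is purely numeric.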

[root@localhost ~]# df -h | awk -F'[ %]+' '{if($5>3)print}'
Filesystem           Size  Used Avail  Use% Mounted on
/dev/mapper/cl-root   50G  2.0G   49G    4% /
/dev/sda1           1014M  177M  838M   18% /boot

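Yes, a regular expression again: [ %]+ matches one or more spaces or % characters. The header line is printed too, because its fifth field is text and is therefore compared with 3 as a string.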
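
Here is a sketch of a variant that drops the header and keeps the default separator. To stay reproducible it runs on canned output (abridged from the listing above, with an English header) instead of a live df; df_sample is a helper name invented for the example.

```shell
# Canned df -h output so the result does not depend on the local machine.
df_sample() {
    cat <<'EOF'
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/cl-root   50G  2.0G   49G   4% /
devtmpfs             7.8G     0  7.8G   0% /dev
/dev/sda1           1014M  177M  838M  18% /boot
EOF
}

# NR > 1 skips the header line; $5+0 turns "4%" into the number 4,
# so no custom field separator is needed.
df_sample | awk 'NR > 1 && $5+0 > 3 {print $1, $5}'
# /dev/mapper/cl-root 4%
# /dev/sda1 18%
```

On a real system, df_sample would be replaced by df -h.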

4.Summary

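For reasons of space, grep, sed, and awk have not been covered in detail here, and they still need further practice in day-to-day work and study. In short, grep is best for simply searching text, sed for editing the text it matches, and awk for formatting text and for more complex processing. Mastering these commands makes text handling easier and greatly improves efficiency.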
