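Lately I have been hopelessly hooked on the Linux shell, and its text-processing commands such as grep, sed and awk really have something to them. So here is a summary.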
1.grep
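First up is grep, which searches files for a given string. When a line contains the string, grep prints that whole line, and the matched string is highlighted when color output is enabled (for example through an alias to grep --color=auto).
grep syntax:
grep [-abcEFGhHilLnqrsvVwxy][-A<lines>][-B<lines>][-C<lines>]
[-d<action>][-e<pattern>][-f<pattern file>][--help][pattern][file or directory...]
That is a lot of options, so start with the simplest form: grep STRING FILE
[root@localhost ~]# cat /etc/test.txt #contents of the test file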
Hi,what's your name?
I'm very happy to meet you.
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
[root@localhost ~]# grep you /etc/test.txt
Hi,what's your name?
I'm very happy to meet you.
How do you do?
Would you like to drink sth?
Where are you from?
#Prints every line of test.txt that contains you. The highlighting does not survive copy and paste, sorry.
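A few of the single-letter options from the syntax line are worth trying right away. The following is only a minimal sketch on a scratch file (the temporary path and the variable names are my own, not part of the original session) showing -n, -c and -i:

```shell
# Scratch copy of a few lines from the test file (the temp path is hypothetical).
sample=$(mktemp)
printf '%s\n' "Hi,what's your name?" "I'm fine,thanks." 'How do you do?' 'Myname is Linux.' > "$sample"

# -n prefixes each matching line with its line number.
numbered=$(grep -n you "$sample")
printf '%s\n' "$numbered"

# -c prints only the number of matching lines.
count=$(grep -c you "$sample")
printf '%s\n' "$count"

# -i ignores case, so "linux" also matches "Linux".
nocase=$(grep -i linux "$sample")
printf '%s\n' "$nocase"
```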
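Now add an option, for example -r (recursive), to search recursively. The command below searches every file under /etc/sysconfig for lines containing the string update; besides each matching line, the output also shows the name of the file it came from.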
[root@localhost sysconfig]# grep -r update /etc/sysconfig
/etc/sysconfig/network-scripts/ifup-TeamPort: /usr/bin/teamdctl ${TEAM_MASTER} port config update ${DEVICE} "${TEAM_PORT_CONFIG}" || exit 1
/etc/sysconfig/network-scripts/ifdown-post:update_DNS_entries
/etc/sysconfig/network-scripts/ifup-aliases:# addrs will be updated on existing aliases, and new aliases will be setup.
/etc/sysconfig/network-scripts/ifup-aliases: # update ARP cache of neighboring computers:
/etc/sysconfig/network-scripts/ifup-eth: # update ARP cache of neighboring computers
/etc/sysconfig/network-scripts/ifup-eth: if ! is_false "${arpupdate[$idx]}" && [ "${REALDEVICE}" != "lo" ]; then
/etc/sysconfig/network-scripts/ifup-post: update_DNS_entries
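When a recursive search returns this much, it can help to add line numbers or to list only the file names. Here is a small sketch in a throwaway directory (the directory layout and file names are made up for illustration), assuming a grep that supports -r, as GNU grep does:

```shell
# Build a tiny directory tree to search (all names here are hypothetical).
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf '%s\n' 'update ARP cache' 'nothing here' > "$dir/sub/a.conf"
printf '%s\n' 'no match' > "$dir/b.conf"

# -r descends into subdirectories; -n adds the line number after the file name.
hits=$(grep -rn update "$dir")
printf '%s\n' "$hits"

# -l lists only the names of files that contain at least one match.
files=$(grep -rl update "$dir")
printf '%s\n' "$files"
```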
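Next try -v, which inverts the match and prints the lines that do not match the pattern you*. The previous article introduced regular expressions: here * repeats the preceding u zero or more times, so you* matches yo, you, youu and so on (it does not mean "strings starting with you"). In test.txt this filters out every line containing you or your. Quote the pattern so that the shell does not expand * as a filename wildcard.
[root@localhost /]# grep -v 'you*' /etc/test.txt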
I'm fine,thanks.
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
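To see what you* really matches, here is a quick sketch on made-up input lines (the sample words are mine). It also shows -w from the option list, which is one way to match you only as a whole word:

```shell
# `you*` is "yo" followed by zero or more "u", so plain "yo" matches too.
matches=$(printf '%s\n' 'yo' 'you' 'youuu' 'yes' | grep 'you*')
printf '%s\n' "$matches"

# -w matches whole words only, so "your" no longer counts as a hit;
# combined with -v, only lines containing the word "you" are dropped.
kept=$(printf '%s\n' 'Is that your yoyo?' 'How do you do?' "I'm fine,thanks." | grep -vw 'you')
printf '%s\n' "$kept"
```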
2.sed
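Next comes sed, which processes and edits text files according to a script. sed is mainly used to edit one or more files automatically, to simplify repetitive edits to files, and to write conversion programs.
sed syntax:
sed [-hnV][-e<script>][-f<script file>][text file]
#There are not many options, so here is a quick rundown
#-h needs little explanation: I need some help
#-n suppresses the automatic printing of every line, so only what the script prints explicitly (for example with p) is shown.
#-V is of course version
#-e stands for expression: it supplies the script commands to run, such as a (append), d (delete), i (insert), p (print) and s (substitute)
#-f script file: the commands given with -e can be put into a file instead.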
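As a rough illustration of how -e and -f relate, the sketch below runs the same two commands once from a script file and once from the command line. The script file, the input lines and the variable names are all invented for this example:

```shell
# Put two sed commands into a script file, one per line (the path is hypothetical).
script=$(mktemp)
printf '%s\n' '2d' 's/you/YOU/' > "$script"

input=$(printf '%s\n' 'How do you do?' 'drop me' 'Where are you from?')

# -f reads the commands from the file: delete line 2, then replace the
# first "you" on every remaining line.
from_file=$(printf '%s\n' "$input" | sed -f "$script")

# The same commands given directly with one -e per command.
from_args=$(printf '%s\n' "$input" | sed -e '2d' -e 's/you/YOU/')

printf '%s\n' "$from_file"
```

Both forms should print the same two lines, which makes a script file handy once the list of commands grows.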
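Now use sed to append the string "I'm new here!" after the last line of test.txt.
[root@localhost /]# sed -e 9a\'I'm new here!' /etc/test.txt #the test file has 9 lines; a appends after the addressed line, so the address is 9, not 10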
Hi,what's your name?
I'm very happy to meet you.
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
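'Im new here! #the newly appended line; the quote lands in front because the shell reads \' as a literal quote and treats the quote in I'm as the start of a quoted string
[root@localhost /]#cat /etc/test.txt #cat the file again: something seems to be missing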
Hi,what's your name?
I'm very happy to meet you.
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
[root@localhost /]#
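Attentive readers will notice that cat shows test.txt unchanged. That is because sed works on a stream (it reads the file and writes the edited text to standard output) rather than on the file itself. Once the output looks right, we can overwrite the source file with it.
[root@localhost /]# sed -e 9a\'I'm new here!' /etc/test.txt > /etc/test.txt.tmp #redirect the output to a temporary file
[root@localhost /]# cat /etc/test.txt.tmp #check that the temporary file is correct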
Hi,what's your name?
I'm very happy to meet you.
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
'Im new here!
[root@localhost /]# mv /etc/test.txt.tmp /etc/test.txt #overwrite the source file
mv: overwrite "/etc/test.txt"? y
[root@localhost /]# cat /etc/test.txt
Hi,what's your name?
I'm very happy to meet you.
How do you do?
I'm fine,thanks.
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It's a nice day,isn't it?
Where are you from?
'Im new here! #the source file test.txt has now been modified.
[root@localhost /]#
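#There is another way, the -i option, which edits the file in place. It is not recommended, because the original is overwritten before the result can be checked.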
[root@localhost /]# sed -i 9a\'I'm new here!' /etc/test.txt
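The appended line came out as 'Im new here! because of how the shell parses 9a\'I'm new here!': the \' is a literal quote, and the quote inside I'm opens a quoted string. Below is a minimal sketch of one way to quote it so the text arrives intact, and of a safer in-place edit that keeps a backup. It assumes GNU sed (the one-line "a text" form and -i.bak), and it works on a scratch file rather than /etc/test.txt:

```shell
# Work on a scratch copy so nothing under /etc is touched (the path is hypothetical).
tmp=$(mktemp)
printf '%s\n' 'How do you do?' 'Where are you from?' > "$tmp"

# '\'' closes the single-quoted string, adds a literal quote, and reopens it,
# so sed receives the script:  $a I'm new here!
# `$` addresses the last line, so the line count need not be known in advance.
appended=$(sed '$a I'\''m new here!' "$tmp")
printf '%s\n' "$appended"

# -i.bak edits in place but first saves the original as "$tmp.bak".
sed -i.bak '$a I'\''m new here!' "$tmp"
last_line=$(tail -n 1 "$tmp")
backup_lines=$(wc -l < "$tmp.bak")
printf '%s\n' "$last_line"
```

With the backup in place, a bad edit can be undone by moving the .bak file back.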
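Delete specific lines with sed.
nl /etc/test.txt | sed '2d' #delete only the second line
nl /etc/test.txt | sed '3,$d' #delete from the third line through the last line
#nl is another command; it prints the file contents with line numbers added.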
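The effect of the two addresses is easy to check on a short numbered stream. This sketch uses seq instead of the test file, purely for illustration:

```shell
# '2d' removes only line 2 of the stream.
second_removed=$(seq 5 | sed '2d')
printf '%s\n' "$second_removed"

# '3,$d' removes everything from line 3 through the last line ($).
tail_removed=$(seq 5 | sed '3,$d')
printf '%s\n' "$tail_removed"
```

As with the append example, only the printed stream changes; the input itself is left alone.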
Search for lines containing a given string with sed.
[root@localhost /]# nl /etc/test.txt | sed '/you/p'
1 Hi,what's your name?
1 Hi,what's your name?
2 I'm very happy to meet you.
2 I'm very happy to meet you.
3 How do you do?
3 How do you do?
4 I'm fine,thanks.
5 Would you like to drink sth?
5 Would you like to drink sth?
6 Myname is Linux.
7 I wanna drink some coffee.
8 It's a nice day,isn't it?
9 Where are you from?
9 Where are you from?
10 'Im new here!
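Notice that the lines containing you are printed twice: sed prints every input line by default, and the p command prints each matching line once more. With -n, only the lines containing you are printed.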
[root@localhost /]# nl /etc/test.txt | sed -n '/you/p'
1 Hi,what's your name?
2 I'm very happy to meet you.
3 How do you do?
5 Would you like to drink sth?
9 Where are you from?
3.awk
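AWK is a language for processing text files and a powerful text-analysis tool.
awk syntax:
awk [options] 'script' var=value file(s)
awk [options] -f scriptfile var=value file(s)
There are quite a few options, so instead of listing them, let's look at examples.
[root@localhost /]# awk '{print $1,$4}' /etc/test.txt #print fields 1 and 4 of each line; fields are separated by spaces or tabs by default.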
Hi,what's
I'm to
How do?
I'm
Would to
Myname
I some
It's day,isn't
Where from?
I'm
Use -F to specify the field separator. The test file contains quite a few ' characters, so let's split on '.
[root@localhost /]# awk -F' '{print $1,$4}' /etc/test.txt
>
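#No output: the shell treats the bare ' as the start of a quoted string and waits at the > prompt for it to be closed. The quote has to be escaped for the shell with a backslash, so let's try that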
[root@localhost /]# awk -F\' '{print $1,$4}' /etc/test.txt
Hi,what
I
How do you do?
I
Would you like to drink sth?
Myname is Linux.
I wanna drink some coffee.
It
Where are you from?
I
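Another way to pass a single quote as the separator is to wrap it in double quotes, which some may find easier to read than the backslash. A small sketch on one line from the test file (the variable names are mine):

```shell
line="It's a nice day,isn't it?"

# -F"'" hands awk a lone single quote as the field separator.
first=$(printf '%s\n' "$line" | awk -F"'" '{print $1}')
printf '%s\n' "$first"

# NF is the number of fields: two quotes split the line into three parts.
count=$(printf '%s\n' "$line" | awk -F"'" '{print NF}')
printf '%s\n' "$count"
```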
Next, awk variables, which are rather neat.
awk -v # set a variable
For convenience, modify part of test.txt.
[root@localhost /]# cat /etc/test.txt
3 Hi,what's your name?
6 I'm very happy to meet you.
How do you do?
9 I'm fine,thanks.
Would you like to drink sth?
0 Myname is Linux.
1 I wanna drink some coffee.
It's a nice day,isn't it?
5 Where are you from?
I'm new here!
Compare the following two commands.
[root@localhost /]# awk -va=1 '{print $1,$1+a}' /etc/test.txt
3 4
6 7
How 1
9 10
Would 1
0 1
1 2
It's 1
5 6
I'm 1
[root@localhost /]# awk -va=1 '{print $1,$(1+a)}' /etc/test.txt
3 Hi,what's
6 I'm
How do
9 I'm
Would you
0 Myname
1 I
It's a
5 Where
I'm new
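The first result looks a little odd. Note that $1+a is the value of the first field plus a, and a=1, so the lines that start with a number are easy to understand. When the first field is a non-numeric string such as How, it is treated as 0 in arithmetic, so those lines all give 1. In the second command, $(1+a) is $2, so it simply prints the first and second fields. With that, the next command should not be hard to follow: $1b concatenates the first field with the value of b.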
[root@localhost /]# awk -va=1 -vb=s '{print $1,$1+a,$1b}' /etc/test.txt
3 4 3s
6 7 6s
How 1 Hows
9 10 9s
Would 1 Woulds
0 1 0s
1 2 1s
It's 1 It'ss
5 6 5s
I'm 1 I'ms
Filter by value, for example keep only the lines whose first field is greater than 3.
[root@localhost /]# awk '$1>3' /etc/test.txt
6 I'm very happy to meet you.
How do you do?
9 I'm fine,thanks.
Would you like to drink sth?
It's a nice day,isn't it?
5 Where are you from?
I'm new here!
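You may wonder why the lines that start with a string were not filtered out. The earlier assumption that a string counts as 0 only holds in arithmetic. In a comparison, awk first checks whether the field looks like a number (0123, for example, is compared as 123). How does not, so the comparison falls back to strings: the constant 3 is treated as the string "3" and the two are compared character by character. Letters sort after digits in ASCII, so these lines pass the test.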
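If the string lines should be filtered out as well, one common trick is to add 0 to the field, which forces a numeric comparison. A minimal sketch on made-up input (the sample lines and variable names are assumptions for illustration):

```shell
data=$(printf '%s\n' '6 six' 'How are' '3 three' '9 nine')

# Plain comparison: the non-numeric field "How" is compared as a string and passes.
loose=$(printf '%s\n' "$data" | awk '$1>3')
printf '%s\n' "$loose"

# $1+0 is always a number (non-numeric text becomes 0), so "How" is now dropped.
strict=$(printf '%s\n' "$data" | awk '$1+0>3')
printf '%s\n' "$strict"
```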
Finally, a few practical awk examples.
(1)Print the 9*9 multiplication table
[root@localhost /]# seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}'
1x1=1
1x2=2 2x2=4
1x3=3 2x3=6 3x3=9
1x4=4 2x4=8 3x4=12 4x4=16
1x5=5 2x5=10 3x5=15 4x5=20 5x5=25
1x6=6 2x6=12 3x6=18 4x6=24 5x6=30 6x6=36
1x7=7 2x7=14 3x7=21 4x7=28 5x7=35 6x7=42 7x7=49
1x8=8 2x8=16 3x8=24 4x8=32 5x8=40 6x8=48 7x8=56 8x8=64
1x9=9 2x9=18 3x9=27 4x9=36 5x9=45 6x9=54 7x9=63 8x9=72 9x9=81
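In the one-liner above, sed 'H;g' prints the numbers accumulated so far after a blank line, and RS='' makes awk read each blank-line-separated block as one record, so record n has n fields. For comparison, here is a sketch of the same table built with two plain loops in awk alone. This is my own variant, not the original author's method:

```shell
table=$(awk 'BEGIN {
    for (n = 1; n <= 9; n++)          # n plays the role of NR (the row)
        for (i = 1; i <= n; i++)      # i plays the role of the field index
            printf("%dx%d=%d%s", i, n, i * n, i == n ? "\n" : "\t")
}')
printf '%s\n' "$table"
```

The loop version is longer but needs no input stream, since everything happens in the BEGIN block.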
(2)Filter by disk usage
[root@localhost ~]# df -h #show the file systems
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 50G 2.0G 49G 4% /
devtmpfs 7.8G 0 7.8G 0% /dev
tmpfs 7.8G 0 7.8G 0% /dev/shm
tmpfs 7.8G 12M 7.8G 1% /run
tmpfs 7.8G 0 7.8G 0% /sys/fs/cgroup
/dev/sda1 1014M 177M 838M 18% /boot
/dev/mapper/cl-home 441G 33M 441G 1% /home
tmpfs 1.6G 0 1.6G 0% /run/user/0
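Suppose we want the devices whose usage is above 3%. Counting columns, the usage percentage is the fifth, i.e. $5, but the value has a trailing %. One approach is to split on both spaces and %, which leaves the fifth field as a plain number.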
[root@localhost ~]# df -h | awk -F'[ %]+' '{if($5>3)print}'
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 50G 2.0G 49G 4% /
/dev/sda1 1014M 177M 838M 18% /boot
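Yes, a regular expression again: [ %]+ matches one or more spaces or % characters. The header line is printed as well because its fifth field is not a number, so it is compared with 3 as a string and sorts higher.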
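To keep the header out of the result, one option is to skip the first record with NR>1 and force a numeric comparison with +0, which also makes the custom separator unnecessary. The sketch below runs on a pasted sample instead of live df output so that the result is predictable (the sample is abbreviated from the listing above):

```shell
# A fixed sample in place of `df -h`, so the result does not depend on the machine.
sample='Filesystem Size Used Avail Use% Mounted on
/dev/mapper/cl-root 50G 2.0G 49G 4% /
tmpfs 7.8G 12M 7.8G 1% /run
/dev/sda1 1014M 177M 838M 18% /boot'

# NR > 1 skips the header; $5 + 0 turns "18%" into the number 18.
busy=$(printf '%s\n' "$sample" | awk 'NR > 1 && $5 + 0 > 3 { print $1, $5 }')
printf '%s\n' "$busy"
```

On a real system the same awk program can be fed from df -h directly.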
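4.Summary
For reasons of space, grep, sed and awk have not been covered in detail here, and they still need reinforcing through everyday work and study. To sum up: grep is best suited to plain text searching, sed to editing the text it matches, and awk to formatting text and handling more complex processing. Mastering these commands makes text processing easier and greatly improves productivity.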