Learning Shell (cut, sed, awk, sort)

Article link: https://blog.csdn.net/qq_31807385/article/details/83374194

cut

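cut does what its name says: it cuts pieces out of each line of a file. The cut command selects bytes, characters, or fields from every input line and writes them to standard output.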

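Option	Function
-f	Field number(s): which column(s) to extract
-d	Delimiter: split each line into columns on this character (default: tab)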

Extract the IP address of eth0:

[root@hadoop103 ~]# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:0C:29:45:40:B6  
          inet addr:192.168.1.103  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe45:40b6/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:216 errors:0 dropped:0 overruns:0 frame:0
          TX packets:158 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:19235 (18.7 KiB)  TX bytes:15639 (15.2 KiB)

[root@hadoop103 ~]# ifconfig eth0|grep "inet addr"
          inet addr:192.168.1.103  Bcast:192.168.1.255  Mask:255.255.255.0
[root@hadoop103 ~]# ifconfig eth0|grep "inet addr"| cut -d : -f 2
192.168.1.103  Bcast
[root@hadoop103 ~]# ifconfig eth0|grep "inet addr"| cut -d : -f 2|cut -d " " -f 1
192.168.1.103

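Backticks (`...`) perform command substitution: the shell runs the enclosed command and puts its output in place of the backquoted text. Used on its own, that output is then executed as a command, which is why the first attempt below fails; assigning it to a variable works as intended: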

[root@hadoop103 ~]# `ifconfig eth0|grep "inet addr"| cut -d : -f 2|cut -d " " -f 1`

-bash: 192.168.1.103: command not found


[root@hadoop103 ~]# A=`ifconfig eth0|grep "inet addr"| cut -d : -f 2|cut -d " " -f 1`
[root@hadoop103 ~]# echo $A
192.168.1.103
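
The same capture can also be written with $(...), which POSIX shells treat as equivalent to backticks and which is easier to nest. A minimal sketch, assuming a hard-coded sample line in place of live ifconfig output (the variable names sample and ip are illustrative only):

```shell
# Sample line copied from the ifconfig output above (hard-coded so the example runs anywhere).
sample='          inet addr:192.168.1.103  Bcast:192.168.1.255  Mask:255.255.255.0'

# $(...) runs the pipeline and substitutes its standard output, just like backticks.
ip=$(printf '%s\n' "$sample" | cut -d : -f 2 | cut -d ' ' -f 1)

echo "$ip"
```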

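To pick out Bcast from the result of the first cut (192.168.1.103  Bcast), ask for field 3, not field 2: cut treats every single space as a delimiter, so the two consecutive spaces produce an empty field between them.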
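
The effect of the two consecutive spaces can be checked with a short sketch. It assumes the same sample line as the transcript above, hard-coded so it runs without ifconfig; the variable names are illustrative:

```shell
sample='          inet addr:192.168.1.103  Bcast:192.168.1.255  Mask:255.255.255.0'

# Field 2 when splitting on ':' is "192.168.1.103  Bcast" (two spaces in the middle).
part=$(printf '%s\n' "$sample" | cut -d : -f 2)

# cut treats each space as a separator, so the two spaces enclose an empty field 2.
second=$(printf '%s\n' "$part" | cut -d ' ' -f 2)
third=$(printf '%s\n' "$part" | cut -d ' ' -f 3)

# The broadcast address itself is the start of field 3 when splitting on ':'.
bcast=$(printf '%s\n' "$sample" | cut -d : -f 3 | cut -d ' ' -f 1)

echo "field 2: [$second]  field 3: [$third]  broadcast: $bcast"
```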

sed

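sed is a stream editor that processes its input one line at a time. It copies the current line into a temporary buffer called the pattern space, applies the sed commands to that buffer, and then prints the buffer to standard output. It then moves on to the next line and repeats until the end of the input. The file itself is not changed unless you redirect the output to a file or use -i.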

Basic usage: sed [options] 'command' filename

Option	Function
-e	Supply the sed editing command directly on the command line
-i	Edit the original file in place

Command	Description
a	Append: the text after a is added on a new line below the matched line
d	Delete
s	Search and replace

[root@hadoop103 ~]# vim sed.txt

dong shen
hen
guan zhen
wo  wo
lai  lai

le  le

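Requirement 1: append 1234 after the second line. As I understand it, the output of cat streams into sed line by line: the first line passes through untouched; when the second line arrives, the a command adds the given text after it; the remaining lines pass through in the same way.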

[root@hadoop103 ~]# cat sed.txt|sed '2a1234'
dong shen
hen
1234
guan zhen
wo  wo
lai  lai

le  le

Requirement: append iseayou after the line that starts with zheng, then delete the line that starts with zheng.

[root@hadoop103 ~]# cat sed.txt|sed '/^zheng/aiseayou'
dong shen
zheng dong
iseayou
guan zhen
wo  wo
lai  lai

le  le
[root@hadoop103 ~]# cat sed.txt|sed '/^zheng/d'
dong shen
guan zhen
wo  wo
lai  lai

le  le

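In a sed command the pattern (address) comes first and the action follows it. When the pattern is a regular expression, it must be enclosed in slashes: /regex/.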

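Requirement: on every line that contains a z...e sequence, keep only that matched part. In the regular expression below, the backslashes in \( and \) are escapes that mark a group, and \1 refers to the first group.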

[root@hadoop103 ~]# cat sed.txt
dong shen
zheng dong
guan zhen
wo  wo
lai  lai

le  le
[root@hadoop103 ~]# cat sed.txt|sed 's/.*\(z.*e\).*/\1/g'
dong shen
zhe
zhe
wo  wo
lai  lai

le  le
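
To see what \( \) and \1 do in isolation, here is a minimal sketch on a single hard-coded line (the variable names swapped and kept are illustrative):

```shell
# \( \) captures whatever matches inside; \1 and \2 insert the captures back.
swapped=$(printf '%s\n' 'zheng dong' | sed 's/\(zheng\) \(dong\)/\2 \1/')
echo "$swapped"   # dong zheng

# The same substitution as above: the whole line is replaced by group 1 only.
kept=$(printf '%s\n' 'zheng dong' | sed 's/.*\(z.*e\).*/\1/')
echo "$kept"      # zhe
```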

Requirement 2: delete the second line:

[root@hadoop103 ~]# cat sed.txt|sed '2d'
dong shen
guan zhen
wo  wo
lai  lai

le  le
[root@hadoop103 ~]# cat sed.txt 
dong shen
hen
guan zhen
wo  wo
lai  lai

le  le

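Now for a surprising result. The shell opens and truncates sed.txt for the > redirection while it sets up the pipeline, so by the time cat reads the file there is usually nothing left in it. Without an intermediate file, the original is simply overwritten: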

[root@hadoop103 ~]# cat sed.txt|sed '2d' > sed.txt 
[root@hadoop103 ~]# cat sed.txt 
[root@hadoop103 ~]#

Requirement 3: delete the second line and save the change back to the source file

[root@hadoop103 ~]# cat sed.txt
dong shen
guan zhen
wo  wo
lai  lai

le  le
[root@hadoop103 ~]# cat sed.txt|sed '2d' > sed2.txt && cat sed2.txt > sed.txt
[root@hadoop103 ~]# cat sed.txt
dong shen
wo  wo
lai  lai

le  le
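
The intermediate-file approach can be wrapped up as below. This is only a sketch: it works on a scratch file created with mktemp rather than on sed.txt, and it uses mv instead of a second cat, which is one common variation rather than the article's exact command:

```shell
# Work on a scratch copy so the example does not touch real files.
f=$(mktemp)
printf '%s\n' 'dong shen' 'guan zhen' 'wo  wo' > "$f"

# Write the result to a temporary file first, then replace the original.
# Redirecting straight back into "$f" would truncate it before it is read.
tmp=$(mktemp)
sed '2d' "$f" > "$tmp" && mv "$tmp" "$f"

result=$(cat "$f")
echo "$result"
rm -f "$f"
```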

Another way to modify the source file directly is sed -i. Here isea_you is appended after the second line:

[root@hadoop103 ~]# cat sed.txt
dong shen
guan zhen
wo  wo
lai  lai

le  le
[root@hadoop103 ~]# sed -i '2aisea_you' sed.txt
[root@hadoop103 ~]# cat sed.txt
dong shen
guan zhen
isea_you
wo  wo
lai  lai

le  le
[root@hadoop103 ~]#

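Requirement 4: replace every lai with lailai, then replace only the second lai with lailai. The second command works by matching the lai that follows a space, which happens to be the second one on this line: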

[root@hadoop103 ~]# cat sed.txt
dong shen
guan zhen
wo  wo
lai  lai

le  le
[root@hadoop103 ~]# cat sed.txt|sed 's/lai/lailai/g'
dong shen
guan zhen
wo  wo
lailai  lailai

le  le
[root@hadoop103 ~]# cat sed.txt|sed 's/ lai/lailai/'
dong shen
guan zhen
wo  wo
lai lailai

le  le
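
Matching on the leading space depends on how the line happens to be laid out. The s command also accepts a number as a flag, which replaces only that occurrence on each line. A minimal sketch on one hard-coded line (the variable name is illustrative):

```shell
# The trailing 2 tells s to replace only the second match on the line.
second_only=$(printf '%s\n' 'lai  lai' | sed 's/lai/lailai/2')
echo "$second_only"   # lai  lailai (both spaces are preserved)
```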

Some exercises involving regular expressions:

Match lines that start with dong; ^ anchors the match to the start of the line:

[root@hadoop103 ~]# sed -i '1azheng dong' sed.txt
[root@hadoop103 ~]# cat sed.txt 
dong shen
zheng dong
guan zhen
wo  wo
lai  lai

le  le


[root@hadoop103 ~]# cat sed.txt| grep 'dong'
dong shen
zheng dong
[root@hadoop103 ~]# cat sed.txt| grep '^dong'
dong shen

Match empty lines; $ anchors the match to the end of the line:

[root@hadoop103 ~]# cat sed.txt| grep '^$'

[root@hadoop103 ~]#

Match shen followed by zero or more additional n characters, so that shen, shenn, and shennn all match:

[root@hadoop103 ~]# cat sed.txt| grep 'shenn*'
dong shen

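* differs from +: * matches zero or more occurrences of the preceding item, while + matches one or more. In grep, + requires extended regular expressions (grep -E).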

Match any sequence of characters (every line matches, including the empty one):

[root@hadoop103 ~]# cat sed.txt| grep '.*'
dong shen
zheng dong
guan zhen
wo  wo
lai  lai

le  le

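Match lines of 10 characters. In grep's default basic regular expressions an unescaped {10} is taken literally, so the first command below prints nothing; the interval has to be written as \{10\}, or as {10} with grep -E. Ten dots also work, although without ^ and $ anchors the pattern matches any line with at least 10 characters: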

[root@hadoop103 ~]# cat sed.txt| grep '.{10}'
[root@hadoop103 ~]# cat sed.txt| grep '..........'
zheng dong
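
Interval expressions are available in grep: as \{n\} in basic regular expressions and as {n} with -E. A minimal sketch on a scratch file (the file contents and variable names are illustrative):

```shell
f=$(mktemp)
printf '%s\n' 'dong shen' 'zheng dong' 'guan zhen' > "$f"

# Basic regular expression: the braces must be escaped.
bre=$(grep '^.\{10\}$' "$f")

# Extended regular expression: plain braces work.
ere=$(grep -E '^.{10}$' "$f")

# Without ^ and $ the pattern means "at least this many characters".
at_least_nine=$(grep -c -E '.{9}' "$f")

echo "$bre / $ere / $at_least_nine"
rm -f "$f"
```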

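Match lines that start with c or d. A bracket expression such as [a-z] matches any single character in the listed set or range. Inside brackets | is an ordinary character, so [c|d] also matches a literal |; [cd] or [c-d] is the accurate form: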

[root@hadoop103 ~]# cat sed.txt| grep '^[c|d]'
dong shen
[root@hadoop103 ~]# cat sed.txt| grep '^[c-d]'
dong shen
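
The difference between [c|d] and [cd] only shows up when a line starts with a literal |. A small sketch with made-up input lines (not taken from sed.txt):

```shell
# Three illustrative lines: one starts with '|', one with 'd', one with 'c'.
with_pipe=$(printf '%s\n' '|pipe' 'dong shen' 'cat' | grep -c '^[c|d]')
without_pipe=$(printf '%s\n' '|pipe' 'dong shen' 'cat' | grep -c '^[cd]')

echo "[c|d] matched $with_pipe lines, [cd] matched $without_pipe lines"
```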

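\d is a Perl-style shorthand for [0-9]. Plain grep and sed do not support it (GNU grep accepts it only with -P), so [0-9] is the recommended form.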

awk

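awk is a powerful text-analysis tool. It reads a file line by line, splits each line into fields (on whitespace by default), and then runs your processing on those fields. Anything cut can do, awk can do as well. Basic usage: awk [options] 'pattern1{action1} pattern2{action2}...' filename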

pattern: what awk looks for in the data, i.e. the matching condition

action: the commands to run when a line matches the pattern

Options

Option	Function
-F	Set the input field separator
-v	Assign a value to a user-defined variable

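How awk processes input: it takes the first line and tests it against the first pattern; if the line matches, action1 runs, otherwise nothing happens. It then tests the same line against the second pattern and runs action2 on a match, again doing nothing otherwise. The same steps are repeated for every following line.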
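
This flow can be observed with two patterns that overlap. The sketch below uses three made-up, shortened passwd-style lines, and the labels p1 and p2 are added only to show which pattern fired:

```shell
# Every line is tested against both patterns, in order.
out=$(printf '%s\n' 'root:x:0' 'adm:x:3' 'mysql:x:496' |
  awk -F : '/^root/{print "p1", $3} /:x:/{print "p2", $1}')

echo "$out"
# p1 0
# p2 root
# p2 adm
# p2 mysql
```

The root line triggers both actions, while the other lines trigger only the second one.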

Prepare the data:

[root@hadoop103 shell]# cp /etc/passwd ./
[root@hadoop103 shell]# ll
total 8
-rw-r--r--. 1 root root   48 Oct 24 09:14 f1.sh
-rw-r--r--. 1 root root 1348 Oct 25 11:01 passwd
[root@hadoop103 shell]# cat passwd 
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin
games:x:12:100:games:/usr/games:/sbin/nologin
gopher:x:13:30:gopher:/var/gopher:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
dbus:x:81:81:System message bus:/:/sbin/nologin
rtkit:x:499:499:RealtimeKit:/proc:/sbin/nologin
avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin
haldaemon:x:68:68:HAL daemon:/:/sbin/nologin
ntp:x:38:38::/etc/ntp:/sbin/nologin
saslauth:x:498:76:Saslauthd user:/var/empty/saslauth:/sbin/nologin
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
gdm:x:42:42::/var/lib/gdm:/sbin/nologin
pulse:x:497:496:PulseAudio System Daemon:/var/run/pulse:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
tcpdump:x:72:72::/:/sbin/nologin
mysql:x:496:493:MySQL server:/var/lib/mysql:/bin/bash

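Meaning of each field:

username : password placeholder (x) : user ID : group ID : comment (often a full name or description) : home directory : login shell.

Read the commands below and work out for yourself what each one does and why it prints what it does: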

[root@hadoop103 shell]# awk -F : '/^root/{print $3}' passwd
0

[root@hadoop103 shell]# awk -F : '/^root/{print $3} /^mysql/{print $1}' passwd
0
mysql
[root@hadoop103 shell]# awk -F : '/^root/{print $3} /^mysql/{print $1} /^a/{print $4}' passwd
0
4
170
173
mysql

[root@hadoop103 shell]# awk -F : -v sum=0 '{sum+=$3}END{print sum}' passwd
3061
[root@hadoop103 shell]# awk -F :'BEGIN{sum=0}{sum+=$3}END{print sum}' passwd
^C
[root@hadoop103 shell]# awk -F : 'BEGIN{sum=0}{sum+=$3}END{print sum}' passwd
3061

[root@hadoop103 shell]# awk -F : 'BEGIN{sum=0}{sum+=$3;print NR}END{print sum}' passwd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
3061
[root@hadoop103 shell]#

[root@hadoop103 shell]# awk -F : 'BEGIN{sum=0}{sum+=$3;print NF}END{print sum}' passwd
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
3061

awk patterns can be regular expressions: write the regex, enclosed in slashes, in the pattern position.
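
A regular expression can also be applied to a single field with the ~ operator instead of to the whole line. A minimal sketch on three lines copied from the passwd data above (hard-coded so it runs without the file):

```shell
# Print the line number and user name of accounts whose shell (field 7) ends in "bash".
out=$(printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'bin:x:1:1:bin:/bin:/sbin/nologin' \
  'mysql:x:496:493:MySQL server:/var/lib/mysql:/bin/bash' |
  awk -F : '$7 ~ /bash$/ {print NR, $1}')

echo "$out"
# 1 root
# 3 mysql
```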

sort

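The sort command is very useful on Linux. It sorts the lines of a file and writes the result to standard output.

Basic syntax: sort [options] [file...]

Option	Description
-n	Sort by numeric value
-r	Reverse the sort order
-t	Set the field separator character
-k	Specify the field (column) to sort by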

[root@hadoop103 shell]# cat passwd |grep '^a'
adm:x:3:4:adm:/var/adm:/sbin/nologin
avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin

# Sort by numeric value
[root@hadoop103 shell]# cat passwd |grep '^a' |sort -nt : -k 3
adm:x:3:4:adm:/var/adm:/sbin/nologin
avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin

# Sort the same field as text (character order), not as numbers
[root@hadoop103 shell]# cat passwd |grep '^a' |sort -t : -k 3
avahi-autoipd:x:170:170:Avahi IPv4LL Stack:/var/lib/avahi-autoipd:/sbin/nologin
abrt:x:173:173::/etc/abrt:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
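
Note that -k 3 uses everything from field 3 to the end of the line as the key; -k 3,3 restricts the key to field 3 alone, and the n and r modifiers can be attached to it. A minimal sketch with shortened, hard-coded versions of the three lines above (the variable names are illustrative):

```shell
data='adm:x:3:4
avahi-autoipd:x:170:170
abrt:x:173:173'

# Numeric, ascending, key limited to field 3; keep only the user names.
asc=$(printf '%s\n' "$data" | sort -t : -k 3,3n | cut -d : -f 1 | paste -sd ' ' -)

# Numeric, descending.
desc=$(printf '%s\n' "$data" | sort -t : -k 3,3nr | cut -d : -f 1 | paste -sd ' ' -)

# Text order: "170" < "173" < "3".
text=$(printf '%s\n' "$data" | sort -t : -k 3,3 | cut -d : -f 1 | paste -sd ' ' -)

echo "$asc | $desc | $text"
```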

Some interview questions:

Question 1: Use Linux commands to find the line numbers of the empty lines in file1 (sed.txt is used here)

[root@hadoop103 shell]# cd
[root@hadoop103 ~]# awk '/^$/{print NR}' sed.txt
6
[root@hadoop103 ~]# cat sed.txt 
dong shen
zheng dong
guan zhen
wo  wo
lai  lai

le  le

Question 2:

Use Linux commands to compute and print the sum of the second column. The file chengji.txt contains:

张三 40
李四 50
王五 60

[root@hadoop103 ~]# cat chengji.txt 
张三 40
李四 50
王五 60
[root@hadoop103 ~]# awk -F " " 'BEGIN{sum=0}{sum+=$2}END{print sum}' chengji.txt 
150
[root@hadoop103 ~]#

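Question 3: In a shell script, how do you check whether a file exists, and what should you do if it does not?

[root@hadoop103 shell]# sh exist.sh 
File does not exist
[root@hadoop103 shell]# vim exist.sh

#!/bin/bash
# -f is true when file.txt exists and is a regular file
if [ -f file.txt ]
then
        echo "File exists"
else
        echo "File does not exist"
fi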
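
The script above only reports the result. One possible way to handle the missing-file case is to create the file; this is an assumption about what "handling" should mean here, not something the question prescribes. A sketch that works in a scratch directory (the names dir, target, status, and after are illustrative):

```shell
# Use a scratch directory so the example leaves nothing behind.
dir=$(mktemp -d)
target="$dir/file.txt"

if [ -f "$target" ]; then
    status="exists"
else
    # Assumed handling: create an empty file when it is missing.
    status="created"
    touch "$target"
fi

# Check again after the handling step.
if [ -f "$target" ]; then after="exists"; else after="missing"; fi

echo "$status, now $after"
rm -rf "$dir"
```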

Question 4: Write a shell script that sorts an unordered column of numbers in a text file

[root@hadoop103 shell]# sort -n test.txt 


1
2
3
4
5
6
7
8
9
10
[root@hadoop103 shell]# vim test.txt

9
8
7
6
5
4
3
2
10
1
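
In the output above the empty lines of test.txt sort to the top. If they are unwanted, they can be filtered out before sorting. A minimal sketch with a few hard-coded values instead of test.txt:

```shell
# Drop empty lines, then sort numerically; paste joins the result onto one line for display.
sorted=$(printf '%s\n' 9 8 '' 10 1 | grep -v '^$' | sort -n | paste -sd ' ' -)
echo "$sorted"   # 1 8 9 10
```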

Question 5: Write a shell script that finds the names of all text files under the current directory (/home) whose contents include the string "shen"

[root@hadoop103 ~]# grep -r "shen" ./
./sed.txt:dong shen
[root@hadoop103 ~]# grep -r "shen" ./|cut -d : -f 1
./sed.txt
[root@hadoop103 ~]#
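
Cutting on : prints a file name once per matching line and breaks if a file name itself contains a colon. grep's -l option prints each matching file name once instead. A sketch in a scratch directory with made-up files (-r is available in GNU and BSD grep):

```shell
dir=$(mktemp -d)
printf '%s\n' 'dong shen' 'shen again' > "$dir/sed.txt"   # two matching lines
printf '%s\n' 'a:b shen' > "$dir/other.txt"               # match on a line containing ':'
printf '%s\n' 'nothing here' > "$dir/none.txt"            # no match

# -r searches recursively, -l lists each matching file exactly once.
names=$(grep -rl 'shen' "$dir" | sed 's|.*/||' | sort | paste -sd ' ' -)

echo "$names"   # other.txt sed.txt
rm -rf "$dir"
```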