Text-Processing Tools for Shell Scripts

lambda-小张

Article link: https://blog.csdn.net/m0_55834564/article/details/126445700

I. cut

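The job of cut is to "cut": it extracts pieces of data from a file. For each line of input, the cut command selects bytes, characters, or fields and writes the selected parts to standard output.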

1) Basic usage

cut [options] filename

Note: the default delimiter is the tab character.

2) Options

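Option	Function
-f	Field number: selects which column(s) to extract
-d	Delimiter: splits columns on the given delimiter; the default is the tab character "\t"
-c	Cuts by character; follow it with n to select the n-th character, for example -c 1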
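The three options can be tried on inline data without creating a file. The following is a minimal sketch; the sample strings and the variable names are made up for illustration.

```shell
# Hypothetical sample: one line with three tab-separated words.
line=$(printf 'dong\tshen\tguan')

# No -d given: cut splits on the tab character by default.
second=$(printf '%s\n' "$line" | cut -f 2)

# -d changes the delimiter; here it is a comma.
third=$(printf 'a,b,c\n' | cut -d ',' -f 3)

# -c selects by character position; 1-4 is a range of characters.
chars=$(printf 'dong shen\n' | cut -c 1-4)

printf '%s\n' "$second" "$third" "$chars"
```

The last line prints shen, c, and dong, one per line.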

3) Examples

(1) Prepare the data

[root@hadoop scripts]# vim cut.txt
dong shen
guan zhen
wo wo
lai lai
le le

(2) Cut the first column of cut.txt

[root@hadoop scripts]# cut -d " " -f 1 cut.txt
dong
guan
wo
lai
le

(3) Cut the second and third columns of cut.txt (the file has only two columns, so only the second is printed)

[root@hadoop scripts]# cut -d " " -f 2,3 cut.txt
shen
zhen
wo
lai
le
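Asking for a field that does not exist is not an error: cut prints whichever of the requested fields are present. A related detail, added here as an aside, is that a line containing no delimiter at all is printed unchanged unless -s is given. The sketch below uses made-up sample data and variable names.

```shell
# Hypothetical two-line sample: the second line contains no space at all.
sample=$(printf 'dong shen\nnodelimiter')

# Field 3 does not exist, so only field 2 is printed for the first line;
# the delimiter-free second line is passed through whole.
with_passthrough=$(printf '%s\n' "$sample" | cut -d ' ' -f 2,3)

# -s suppresses lines that do not contain the delimiter.
only_delimited=$(printf '%s\n' "$sample" | cut -d ' ' -s -f 2,3)

printf '%s\n' "$with_passthrough" "---" "$only_delimited"
```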

(4) From the lines of /etc/passwd that end in bash, cut out selected fields

[root@hadoop scripts]# cat /etc/passwd | grep bash$ | cut -d ":" -f 1,6,7    # 1,6,7 selects fields 1, 6, and 7
root:/root:/bin/bash
mysql:/var/lib/mysql:/bin/bash
[root@hadoop scripts]# cat /etc/passwd | grep bash$ | cut -d ":" -f 4-    # 4- selects field 4 through the last field
0:root:/root:/bin/bash
27:MySQL Server:/var/lib/mysql:/bin/bash
[root@hadoop scripts]# cat /etc/passwd | grep bash$ | cut -d ":" -f -4    # -4 selects fields 1 through 4
root:x:0:0
mysql:x:27:27
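The field-list forms used above (a comma list, an open-ended range, and a range from the start) can be compared side by side on a single record. This is a small sketch; the record is a sample string typed inline rather than read from /etc/passwd, and the variable names are illustrative.

```shell
# A sample passwd-style record (typed inline, not read from /etc/passwd).
record='mysql:x:27:27:MySQL Server:/var/lib/mysql:/bin/bash'

picked=$(printf '%s\n' "$record" | cut -d ':' -f 1,6,7)     # a list of fields
tail_part=$(printf '%s\n' "$record" | cut -d ':' -f 4-)     # field 4 to the end
head_part=$(printf '%s\n' "$record" | cut -d ':' -f -4)     # fields 1 to 4
middle=$(printf '%s\n' "$record" | cut -d ':' -f 3-5)       # a closed range

printf '%s\n' "$picked" "$tail_part" "$head_part" "$middle"
```

The closed range 3-5 is not used in the examples above but follows the same pattern.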

(5) Cut entries out of the system PATH variable: first the second entry, then everything from the tenth entry onward

[root@hadoop scripts]# echo $PATH | cut -d ":" -f 2
/usr/local/bin
[root@hadoop scripts]# echo $PATH | cut -d ":" -f 10-    # the tenth path and everything after it
/usr/soft/hive/bin:/usr/soft/jdk/bin:/usr/soft/phoenix/bin:/usr/soft/scala/bin:/usr/soft/spark/bin:/usr/soft/spark/sbin:/usr/soft/sqoop/bin:/root/bin

(6) Cut the IP address out of the ifconfig output

[root@hadoop scripts]# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:0C:29:02:DA:76  
          inet addr:192.168.17.151  Bcast:192.168.17.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe02:da76/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1187588 errors:0 dropped:0 overruns:0 frame:0
          TX packets:214810 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1297311120 (1.2 GiB)  TX bytes:45156985 (43.0 MiB)
[root@hadoop scripts]# ifconfig eth0 | grep Bcast | cut -d " " -f 12 |cut -d ":" -f 2
192.168.17.151
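The field number 12 in the pipeline above may look arbitrary. It comes from the fact that cut treats every single space as a delimiter, so the leading indentation produces empty fields. The sketch below reproduces this on a copy of the "inet addr" line; the variable names are illustrative, and the line is rebuilt with printf so that the ten leading spaces are explicit.

```shell
# Rebuild the "inet addr" line: %10s with an empty argument yields 10 spaces.
line=$(printf '%10sinet addr:192.168.17.151  Bcast:192.168.17.255  Mask:255.255.255.0' '')

# The 10 leading spaces give 10 empty fields, so "inet" is field 11
# and "addr:192.168.17.151" is field 12.
f11=$(printf '%s\n' "$line" | cut -d ' ' -f 11)
f12=$(printf '%s\n' "$line" | cut -d ' ' -f 12)

# A second cut on ":" keeps only the address.
ip=$(printf '%s\n' "$f12" | cut -d ':' -f 2)

printf '%s\n' "$f11" "$f12" "$ip"
```

Because the count depends on the exact indentation, this approach breaks if the output format changes.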

II. awk

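awk is a powerful text-analysis tool. It reads a file line by line, splits each line into fields using whitespace (spaces and tabs) as the default separator, and then analyzes and processes the resulting pieces.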

1) Basic usage

awk [options] '/pattern1/{action1} /pattern2/{action2}...' filename

pattern: what awk looks for in the data, that is, the match pattern

action: the series of commands executed when a line matches the pattern
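As a minimal sketch of the pattern/action form, the program below holds two pairs, and every input line is tested against both patterns. The sample data, the labels "admin" and "db", and the variable names are made up for illustration.

```shell
# Hypothetical colon-separated sample data.
data=$(printf 'root:0\nmysql:27\nntp:38')

# Two pattern/action pairs: a line runs the action of each pattern it matches.
# The "ntp" line matches neither pattern, so it produces no output.
result=$(printf '%s\n' "$data" |
    awk -F ':' '/^root/ {print "admin " $1} /^mysql/ {print "db " $2}')

printf '%s\n' "$result"
```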

2) Options

Option	Function
-F	Specifies the input field separator
-v	Assigns a value to a user-defined variable
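A short sketch of both options, along with the default splitting behavior for comparison. The input strings and the variable name prefix are illustrative.

```shell
# Default splitting: a run of spaces or tabs counts as one separator.
second=$(printf 'a    b\tc\n' | awk '{print $2}')

# -F sets the input field separator explicitly.
shell_field=$(printf 'root:x:0:0:root:/root:/bin/bash\n' | awk -F ':' '{print $7}')

# -v assigns an awk variable before the program starts running.
greeting=$(printf 'world\n' | awk -v prefix='hello' '{print prefix, $1}')

printf '%s\n' "$second" "$shell_field" "$greeting"
```

Unlike cut -d " ", the default awk splitting does not produce empty fields for repeated spaces.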

3) Examples

1. Find all lines in the passwd file that begin with the keyword root, and print the 7th column of each line

[root@hadoop scripts]# cat /etc/passwd | grep ^root | cut -d ":" -f 7
/bin/bash
[root@hadoop scripts]# cat /etc/passwd | awk -F ":" '/^root/ {print $7}'
/bin/bash

2. Find all lines in the passwd file that begin with the keyword root, and print columns 1, 6, and 7 of each line, separated by commas.

[root@hadoop scripts]# cat /etc/passwd | awk -F ":" '/^root/ {print $1","$6","$7}'
root,/root,/bin/bash

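3. Print only the first and seventh columns of /etc/passwd, separated by a comma, with the column names "user, shell" added before the first line and the text "end of file" added after the last line.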

[root@hadoop scripts]# cat /etc/passwd | awk -F ":" 'BEGIN{print "user, shell"}{print $1","$7} END{print "end of file"}'
user, shell
root,/bin/bash
bin,/sbin/nologin
daemon,/sbin/nologin
adm,/sbin/nologin
lp,/sbin/nologin
sync,/bin/sync
shutdown,/sbin/shutdown
halt,/sbin/halt
mail,/sbin/nologin
uucp,/sbin/nologin
operator,/sbin/nologin
games,/sbin/nologin
gopher,/sbin/nologin
ftp,/sbin/nologin
nobody,/sbin/nologin
vcsa,/sbin/nologin
saslauth,/sbin/nologin
postfix,/sbin/nologin
sshd,/sbin/nologin
mysql,/bin/bash
ntp,/sbin/nologin
end of file
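The three parts of such a program run at different times: BEGIN once before any input is read, the unlabeled block once per line, and END once after the last line. The sketch below shows this on two sample records typed inline; the "lines:" summary is an illustrative addition.

```shell
# Two sample passwd-style records (typed inline for illustration).
data=$(printf 'root:x:0:0:root:/root:/bin/bash\nbin:x:1:1:bin:/bin:/sbin/nologin')

report=$(printf '%s\n' "$data" | awk -F ':' '
    BEGIN { print "user,shell" }      # runs once, before any input is read
          { print $1 "," $7 }         # runs for every input line
    END   { print "lines: " NR }      # runs once, after the last line
')

printf '%s\n' "$report"
```

In the END block, NR still holds the number of the last record read, which makes it a convenient line count.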

4. Add 1 to each user ID in the passwd file and print the result

[root@hadoop scripts]# cat /etc/passwd | awk -v i=1 -F ":" '{print $3+i}'    # -v defines a variable
1
2
3
4
5
6
7
8
9
11
12
13
14
15
100
70
500
90
75
28
39
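The -v option is also the usual way to hand a shell variable to awk, because the shell does not expand variables inside the single-quoted program. A minimal sketch, with a made-up increment and made-up sample records:

```shell
step=10   # a shell variable (hypothetical value)

# Inside single quotes the shell leaves $3 alone, which is what awk needs;
# the shell value is passed in separately through -v.
bumped=$(printf 'root:x:0\nbin:x:1\n' | awk -v i="$step" -F ':' '{print $3 + i}')

printf '%s\n' "$bumped"
```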

4) awk built-in variables

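Variable	Description
FILENAME	Name of the current input file
NR	Number of records read so far (the line number)
NF	Number of fields in the current record (the number of columns after splitting)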
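These variables can be used in patterns as well as in actions. The sketch below writes a small temporary file (the contents and variable names are made up) and uses NR as a pattern, NF as a count, and $NF, which is not listed in the table, to refer to the last field of a line.

```shell
# Create a small hypothetical input file.
tmp=$(mktemp)
printf 'a b c\nd e\n' > "$tmp"

# NR is the line number, NF the field count, $NF the last field.
summary=$(awk '{print NR ": " NF " fields, last=" $NF}' "$tmp")

# NR used as a pattern: act only on the second line.
second_only=$(awk 'NR == 2 {print $1}' "$tmp")

# FILENAME holds the name of the file being read.
name_check=$(awk -v f="$tmp" 'NR == 1 { if (FILENAME == f) print "match" }' "$tmp")

rm -f "$tmp"
printf '%s\n' "$summary" "$second_only" "$name_check"
```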

5) Examples

1. Print the name of the passwd file, the line number of each line, and the number of columns in each line

[root@hadoop scripts]# awk -F ":" '{print "file: " FILENAME "  line: " NR "  columns: " NF}' /etc/passwd
file: /etc/passwd  line: 1  columns: 7
file: /etc/passwd  line: 2  columns: 7
file: /etc/passwd  line: 3  columns: 7
file: /etc/passwd  line: 4  columns: 7
file: /etc/passwd  line: 5  columns: 7
file: /etc/passwd  line: 6  columns: 7
file: /etc/passwd  line: 7  columns: 7
file: /etc/passwd  line: 8  columns: 7
file: /etc/passwd  line: 9  columns: 7
file: /etc/passwd  line: 10  columns: 7
file: /etc/passwd  line: 11  columns: 7
file: /etc/passwd  line: 12  columns: 7
file: /etc/passwd  line: 13  columns: 7
file: /etc/passwd  line: 14  columns: 7
file: /etc/passwd  line: 15  columns: 7
file: /etc/passwd  line: 16  columns: 7
file: /etc/passwd  line: 17  columns: 7
file: /etc/passwd  line: 18  columns: 7
file: /etc/passwd  line: 19  columns: 7
file: /etc/passwd  line: 20  columns: 7
file: /etc/passwd  line: 21  columns: 7

2. Find the line numbers of the blank lines in the output of the ifconfig command

[root@hadoop scripts]# ifconfig | grep -n ^$    # grep -n cannot add custom text
9:
18:
[root@hadoop scripts]# ifconfig | awk '/^$/ {print NR}'
9
18
[root@hadoop scripts]# ifconfig | awk '/^$/ {print "blank line: " NR}'    # adds a label to each line number
blank line: 9
blank line: 18

3. Cut out the IP address

[root@hadoop scripts]# ifconfig eth0 | grep Bcast
          inet addr:192.168.17.151  Bcast:192.168.17.255  Mask:255.255.255.0
[root@hadoop scripts]# ifconfig | awk '/Bcast/ {print $2}' |cut -d ":" -f 2
192.168.17.151
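The second stage of that pipeline can also be folded into awk itself with the built-in split() function, which breaks a string into an array on a given separator. This is a sketch of one alternative, run on a copy of the "inet addr" line rather than on live ifconfig output; the names line, parts, and ip are illustrative.

```shell
# Rebuild the "inet addr" line: %10s with an empty argument yields 10 spaces.
line=$(printf '%10sinet addr:192.168.17.151  Bcast:192.168.17.255  Mask:255.255.255.0' '')

# $2 is "addr:192.168.17.151"; split() puts "addr" in parts[1]
# and the address in parts[2].
ip=$(printf '%s\n' "$line" | awk '/Bcast/ {split($2, parts, ":"); print parts[2]}')

printf '%s\n' "$ip"
```

Since awk ignores leading whitespace when splitting on the default separator, no field counting is needed here.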