Shell Programming: Eight File-Processing Tools

摩羯居士

Last modified: 2023-05-04 19:46:45

Category: Linux shell scripting (Linux-shell脚本). Tags: bash, linux, unix

First published: 2023-05-04 19:43:43

Article link: https://blog.csdn.net/weixin_46362974/article/details/130493774

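1. File-processing tools

1.1 grep

Line filtering: grep reads text line by line and prints the lines that match a pattern.

grep [options] 'pattern' filename
grep: filter lines by pattern

Options:
-i: ignore case
-n: show line numbers
-c: print the number of matching lines (not the number of individual matches)
-v: invert the match, printing the lines that do not match
-w: match whole words only
-o: print only the matched part of each line
-e PATTERN: give the pattern explicitly (useful for several patterns or for a pattern that begins with -)
-E: interpret the pattern as an extended regular expression
-A NUM: also print NUM lines after each matching line
-B NUM: also print NUM lines before each matching line
-C NUM: also print NUM lines before and after each matching line
--color[=WHEN]: highlight the matched text in the terminal; WHEN is never, always, or auto

Pattern anchors (part of the regular expression, not options):
^key: lines that begin with key
key$: lines that end with key

Example: match empty lines and print their line numbers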

[root@localhost shelltest]# grep -n ^$ passwd
49:

Example: ignoring case, count the lines that contain the keyword root

[root@localhost shelltest]# grep -nic 'root' passwd
2
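
Because -c counts matching lines rather than individual matches, a line that contains root several times still adds only one to the total. The following is a minimal sketch of the difference, assuming a small made-up sample file (sample_passwd, line_count and match_count are names chosen for this illustration, not part of the original example):

```shell
# Build a small sample file (hypothetical data, not the real passwd).
sample_passwd=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'operator:x:11:0:operator:/root:/sbin/nologin' \
  'mail:x:8:12:mail:/var/spool/mail:/sbin/nologin' > "$sample_passwd"

# -c counts matching lines: two of the three lines contain "root".
line_count=$(grep -ic 'root' "$sample_passwd")

# -o prints every match on its own line, so wc -l counts occurrences.
match_count=$(grep -io 'root' "$sample_passwd" | wc -l)

echo "lines: $line_count, matches: $match_count"
rm -f "$sample_passwd"
```

Piping grep -o into wc -l is a common way to count occurrences when a keyword can appear more than once per line.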

Example: ignoring case, match the lines that begin with root

[root@localhost shelltest]# grep -i ^root passwd
root:x:0:0:root:/root:/bin/bash

Example: match the lines that end with bash

[root@localhost shelltest]# grep bash$ passwd
root:x:0:0:root:/root:/bin/bash
amandabackup:x:33:6:Amanda user:/var/lib/amanda:/bin/bash
user:x:1000:1000:user:/home/user:/bin/bash
mysql:x:27:27:MySQL Server:/var/lib/mysql:/bin/bash
lisa:x:1001:1001::/home/lisa:/bin/bash

Example: print only the matched keyword ftp, together with the line number of each match

[root@localhost shelltest]# grep -on ftp passwd
12:ftp
12:ftp

Example: match the keyword mail and print the 5 lines after it

[root@localhost shelltest]# grep -A 5 mail passwd
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin

games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin
Note: the empty line also counts as one of the 5 lines.

Example: match the keyword mail and print the 5 lines before it

[root@localhost shelltest]# grep -B 5 mail passwd
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin

Example: match the keyword mail and print the 5 lines before and after it

[root@localhost shelltest]# grep -C 5 mail passwd
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
operator:x:11:0:operator:/root:/sbin/nologin

games:x:12:100:games:/usr/games:/sbin/nologin
ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
nobody:x:99:99:Nobody:/:/sbin/nologin

Example: match the keyword root and highlight each match in color

[root@localhost shelltest]# grep --color=auto root passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin

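# Typing --color=auto on every grep call is tedious; an alias makes grep use the option by default.
1) Temporary setting (lost when the current shell session ends): enter it directly at the prompt
[root@localhost shelltest]# alias grep='grep --color=auto'
  //only affects the current user in the current terminal
2) Permanent setting (survives logout and reboot): two ways
① Global setting: applies to all users
# vim /etc/bashrc
alias grep='grep --color=auto'
# source /etc/bashrc   //reload the file so that the alias takes effect immediately
② Per-user setting: applies to the currently logged-in user
# vim ~/.bashrc
alias grep='grep --color=auto'
# source ~/.bashrc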
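
The grep options -w, -v and -E are listed in this section without a worked example. The sketch below tries them on a small made-up file; the account chroot_user and the variable names are assumptions for illustration only:

```shell
# Hypothetical sample data in passwd format.
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'chroot_user:x:1002:1002::/home/chroot_user:/bin/bash' \
  'mail:x:8:12:mail:/var/spool/mail:/sbin/nologin' \
  'ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin' > "$sample"

# -w: "root" must be a whole word, so the chroot_user line does not match.
whole_word=$(grep -cw 'root' "$sample")

# -v: keep only the accounts whose shell is not nologin, then take the names.
login_users=$(grep -v 'nologin$' "$sample" | cut -d: -f1)

# -E: extended regex alternation, lines whose first field is mail or ftp.
mail_or_ftp=$(grep -cE '^(mail|ftp):' "$sample")

echo "whole-word matches: $whole_word"
echo "login users: $login_users"
echo "mail or ftp lines: $mail_or_ftp"
rm -f "$sample"
```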

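1.2 cut

cut extracts columns from each line.

cut [options] filename
Options:
-c, --characters=LIST: select by character position
-d, --delimiter=DELIM: set the field delimiter (between columns); the default is the tab character \t
-f, --fields=LIST: select the listed fields (for example -f1,6,7 or -f1-3); usually combined with -d

Example: using the colon : as the delimiter, extract the first column of passwd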

[root@localhost shelltest]# cut -d: -f1 passwd
root
bin
daemon
adm
lp
sync
···

Example: using the colon as the delimiter, extract columns 1, 6 and 7 of passwd

[root@localhost shelltest]# cut -d: -f1,6,7 passwd
root:/root:/bin/bash
bin:/bin:/sbin/nologin
daemon:/sbin:/sbin/nologin
adm:/var/adm:/sbin/nologin
lp:/var/spool/lpd:/sbin/nologin
sync:/sbin:/bin/sync
···

Example: extract the fourth character of every line of passwd

[root@localhost shelltest]# cut -c4 passwd
t
:
m
:
x
···

Example: extract characters 1 to 4 of every line of passwd

[root@localhost shelltest]# cut -c1-4 passwd
root
bin:
daem
adm:
lp:x
sync
···

Example: extract everything from the fifth character to the end of each line of passwd

[root@localhost shelltest]# cut -c5- passwd
:x:0:0:root:/root:/bin/bash
x:1:1:bin:/bin:/sbin/nologin
on:x:2:2:daemon:/sbin:/sbin/nologin
x:3:4:adm:/var/adm:/sbin/nologin
:4:7:lp:/var/spool/lpd:/sbin/nologin
:x:5:0:sync:/sbin:/bin/sync
···
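
Field lists accept ranges as well as single numbers, and cut never merges adjacent delimiters, so an empty field stays empty. A minimal sketch on a single passwd-style line (the variable names are chosen for this illustration):

```shell
line='lisa:x:1001:1001::/home/lisa:/bin/bash'

# Fields 1 and 3 through 4; the output keeps the input delimiter.
name_ids=$(printf '%s\n' "$line" | cut -d: -f1,3-4)

# Field 5 is empty here because cut does not merge the two adjacent colons.
comment=$(printf '%s\n' "$line" | cut -d: -f5)

# From field 6 to the end of the line.
home_shell=$(printf '%s\n' "$line" | cut -d: -f6-)

echo "name and ids: $name_ids"
echo "comment: [$comment]"
echo "home and shell: $home_shell"
```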

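1.3 sort

Sorting: sort treats each line as a unit, compares lines character by character starting from the first character (in the collation order of the current locale; plain ASCII order under LC_ALL=C), and prints them in ascending order.

sort [options] filename
Options:
-u: remove duplicate lines
-r: sort in descending order (the default is ascending)
-o FILE: write the result to FILE, similar to output redirection; unlike >, FILE may safely be the input file itself
-n: compare numerically (the default is character comparison)
-t SEP: set the field separator
-k N: sort on a key that starts at field N (use -k N,N to limit the key to that field)
-b: ignore leading blanks
-R: random order; the result differs from run to run

Example: sort passwd by the third column (the UID) in descending numeric order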

[root@localhost shelltest]# sort -nr -t: -k3 passwd
nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
lisa:x:1001:1001::/home/lisa:/bin/bash
user:x:1000:1000:user:/home/user:/bin/bash
polkitd:x:999:998:User for polkitd:/:/sbin/nologin
···

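Example: sort passwd with -n. No line starts with a number, so every line compares as 0 and the ties fall back to ordinary character order; the two empty lines come first.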

[root@localhost shelltest]# sort -n passwd


abrt:x:173:173::/etc/abrt:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
amandabackup:x:33:6:Amanda user:/var/lib/amanda:/bin/bash
apache:x:48:48:Apache:/usr/share/httpd:/sbin/nologin
···

Example: sort passwd numerically and write the result to the file 1.txt

[root@localhost shelltest]# sort -n passwd -o 1.txt
[root@localhost shelltest]# cat 1.txt


abrt:x:173:173::/etc/abrt:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
amandabackup:x:33:6:Amanda user:/var/lib/amanda:/bin/bash
apache:x:48:48:Apache:/usr/share/httpd:/sbin/nologin
···

Example: sort numerically and remove duplicates (aaaa is not a number, so it counts as 0 and comes first)

[root@localhost shelltest]# cat 2.txt
aaaa
2222
aaaa
1111
2222
3333
aaaa
[root@localhost shelltest]# sort -nu 2.txt
aaaa
1111
2222
3333
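
The difference between character order and numeric order is easiest to see on a single field. The sketch below sorts a made-up four-line file on field 3 both ways; the file contents and variable names are assumptions for illustration, and LC_ALL=C is set only to make the character order predictable:

```shell
# Hypothetical sample: name, password placeholder, UID, GID.
sample=$(mktemp)
printf '%s\n' \
  'user:x:1000:1000' \
  'root:x:0:0' \
  'mail:x:8:12' \
  'lp:x:4:7' > "$sample"

# Character order on field 3: "1000" sorts before "4" because '1' < '4'.
by_text=$(LC_ALL=C sort -t: -k3,3 "$sample" | cut -d: -f3 | paste -sd' ' -)

# Numeric order on field 3 only (-k3,3 ends the key at field 3).
by_number=$(sort -t: -k3,3n "$sample" | cut -d: -f3 | paste -sd' ' -)

echo "text order:    $by_text"
echo "numeric order: $by_number"
rm -f "$sample"
```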

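1.4 uniq

uniq removes adjacent duplicate lines; duplicates that are not next to each other are kept, so the input is usually sorted first.

uniq [options] filename
Options:
-i: ignore case
-c: prefix each line with the number of times it occurred
-d: print only the duplicated lines, one per group

Example:

Original content of testfile: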

[root@localhost shelltest]# cat testfile
aaaa
aaaa
1111
1111
1111
bbbb
bbbb
bbbb
bbbb

Running uniq to remove the duplicate lines gives the following output:

[root@localhost shelltest]# uniq testfile
aaaa
1111
bbbb

Show only the lines that are duplicated:

[root@localhost shelltest]# uniq -d testfile
aaaa
1111
bbbb

Show only the duplicated lines, each prefixed with the number of times it occurs:

[root@localhost shelltest]# uniq -dc testfile
      2 aaaa
      3 1111
      4 bbbb
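
Since uniq only compares neighbouring lines, unsorted input has to go through sort first. A minimal sketch of the usual frequency-count pipeline, reusing the data of the earlier 2.txt example (the variable names and the awk formatting step are additions for illustration):

```shell
sample=$(mktemp)
printf '%s\n' aaaa 2222 aaaa 1111 2222 3333 aaaa > "$sample"

# Without sorting, no two equal lines are adjacent, so uniq removes nothing.
unsorted_lines=$(uniq "$sample" | wc -l)

# Sort first, count each group, order by count (highest first),
# and print the most frequent line as "text:count".
top=$(sort "$sample" | uniq -c | sort -k1,1nr | head -n 1 | awk '{print $2 ":" $1}')

echo "lines left without sorting: $unsorted_lines"
echo "most frequent: $top"
rm -f "$sample"
```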

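1.5 tee

tee reads from standard input and writes to both standard output and a file, overwriting the file by default; the data appears on screen and is saved at the same time.

tee [-a] filename
-a: append to the file instead of overwriting it

Example: print "hello world" and also write it to the file file1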

[root@localhost shelltest]# echo 'hello world' | tee file1
hello world
[root@localhost shelltest]# cat file1
hello world

Example: print "hi world" and also append it to the file file1

[root@localhost shelltest]# echo 'hi world' | tee -a file1
hi world
[root@localhost shelltest]# cat file1
hello world
hi world

Example: strip the comment lines (starting with #) and the empty lines from /etc/vsftpd/vsftpd.conf

[root@localhost shelltest]# yum install vsftpd -y
[root@localhost shelltest]# cd /etc/vsftpd
[root@localhost shelltest]# cat vsftpd.conf
# Example config file /etc/vsftpd/vsftpd.conf
#
# The default compiled in settings are fairly paranoid. This sample file
# loosens things up a bit, to make the ftp daemon more usable.
# Please see vsftpd.conf.5 for all compiled in defaults.
#
# READ THIS: This example file is NOT an exhaustive list of vsftpd options.
# Please read the vsftpd.conf.5 manual page to get a full idea of vsftpd's
# capabilities.
···
[root@localhost vsftpd]# grep -v '^#' vsftpd.conf |grep -v '^$'|tee vsftpd.conf.bak
anonymous_enable=YES
local_enable=YES
write_enable=YES
local_umask=022
dirmessage_enable=YES
xferlog_enable=YES
connect_from_port_20=YES
xferlog_std_format=YES
listen=NO
listen_ipv6=YES
pam_service_name=vsftpd
userlist_enable=YES
tcp_wrappers=YES
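
The two grep -v calls can be folded into one extended regular expression, and tee accepts more than one output file. The sketch below assumes a tiny made-up config file (demo.conf, copy1 and copy2 are hypothetical names) rather than the real vsftpd.conf:

```shell
workdir=$(mktemp -d)
printf '%s\n' '# comment' '' 'listen=NO' '#another' 'local_enable=YES' > "$workdir/demo.conf"

# One grep with an extended regex drops comment lines and empty lines;
# tee passes the result on and writes it to two files at once.
shown=$(grep -Ev '^(#|$)' "$workdir/demo.conf" | tee "$workdir/copy1" "$workdir/copy2")

# Check that both copies received the same content.
if cmp -s "$workdir/copy1" "$workdir/copy2"; then
  copies_match=yes
else
  copies_match=no
fi

echo "$shown"
echo "copies identical: $copies_match"
rm -rf "$workdir"
```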

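1.6 paste

paste merges lines from files.

paste [options] file1 file2 ...
-d: set the delimiter; the default is \t
-s: serial mode; join all lines of each file into one line, one file at a time, instead of merging the files side by side

Example:

The original files are: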

[root@localhost shelltest]# cat a.txt
hello
[root@localhost shelltest]# cat b.txt
hello world
888
999

Using paste:

[root@localhost shelltest]# paste a.txt b.txt
hello   hello world
        888
        999
[root@localhost shelltest]# paste b.txt a.txt
hello world     hello
888
999

With the custom delimiter '@', the result is:

[root@localhost shelltest]# paste -d @ a.txt b.txt
hello@hello world
@888
@999

Serial mode:

[root@localhost shelltest]# paste -s a.txt b.txt
hello
hello world     888     999
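
paste also reads standard input when a file name is given as -, which makes two short idioms possible. This is a sketch with made-up input; the variable names are chosen for illustration:

```shell
# -s with -d joins all lines of one input into a single delimited line.
joined=$(printf '%s\n' 888 999 1000 | paste -sd, -)

# Each "-" takes the next line of standard input, so two of them
# lay consecutive lines out as two tab-separated columns.
pairs=$(printf '%s\n' a 1 b 2 | paste - -)

echo "$joined"
echo "$pairs"
```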

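1.7 diff

diff compares two files line by line.

Note: diff describes the differences as the changes needed to make the first file match the second.

diff [options] filename1 filename2
Options:
-b: ignore changes in the amount of whitespace
-B: ignore blank lines
-i: ignore case
-w: ignore all whitespace
--normal: normal format (the default)
-c: context format
-u: unified format

Example:

Original files: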

[root@localhost shelltest]# vim file1
[root@localhost shelltest]# cat file1
aaaa
111
hello world
222
333
bbb
[root@localhost shelltest]# vim file2
[root@localhost shelltest]# cat file2
aaa
hello
111
222
bbb
333
world

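How file1 must change to become identical to file2:

[root@localhost shelltest]# diff file1 file2
1c1,2	//line 1 of the first file must change (c=change) to match lines 1 to 2 of the second file
< aaaa	//< marks content from the left (first) file
---	//separator
> aaa	//> marks content from the right (second) file
> hello	//> marks content from the right (second) file
3d3	//line 3 of the first file must be deleted (d=delete); it would have followed line 3 of the second file
< hello world
5d4	//line 5 of the first file must be deleted (d=delete); it would have followed line 4 of the second file
< 333
6a6,7	//after line 6 of the first file, add (a=add) lines 6 to 7 of the second file
> 333	//the content to add is 333 from the second file
> world	//the content to add is world from the second file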

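Context format:

[root@localhost shelltest]# diff -c file1 file2
//the first two lines give the names and timestamps of the files being compared; *** marks file1 and --- marks file2
*** file1       2023-05-04 23:41:37.534155828 +0800
--- file2       2023-05-04 23:38:49.754161802 +0800
***************	//separator
*** 1,6 ****	//a section starting with *** belongs to file1; 1,6 means lines 1 to 6
! aaaa	//! marks a line that must change to match the second file
  111
- hello world	//- marks a line that must be deleted to match the second file
  222
- 333	//- marks a line that must be deleted to match the second file
  bbb
--- 1,7 ----	//a section starting with --- belongs to file2; 1,7 means lines 1 to 7
! aaa	//! marks the changed lines as they appear in the second file
! hello	//! marks the changed lines as they appear in the second file
  111
  222
  bbb
+ 333	//+ marks a line that must be added to the first file to match the second
+ world	//+ marks a line that must be added to the first file to match the second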

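Unified format:

[root@localhost shelltest]# diff -u file1 file2
//the first two lines give the names and timestamps of the files being compared; --- precedes file 1 and +++ precedes file 2
--- file1       2023-05-04 23:41:37.534155828 +0800
+++ file2       2023-05-04 23:38:49.754161802 +0800
@@ -1,6 +1,7 @@	//lines 1 to 6 of the first file and lines 1 to 7 of the second; below, - marks lines only in the first file (e.g. -aaaa) and + marks lines only in the second (e.g. +aaa)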
-aaaa
+aaa
+hello
 111
-hello world
 222
-333
 bbb
+333
+world

diff can also compare two directories.

By default it compares the contents of the files that have the same name in both directories:

[root@localhost shelltest]# mkdir dir1 dir2
[root@localhost shelltest]# vim dir1/file1
[root@localhost shelltest]# vim dir2/file1
[root@localhost shelltest]# cat dir1/file1
hello
[root@localhost shelltest]# cat dir2/file1
hello world
[root@localhost shelltest]# diff dir1 dir2
diff dir1/file1 dir2/file1
1c1
< hello
---
> hello world

To report only whether the files differ, without showing the differences:

[root@localhost shelltest]# diff -q dir1 dir2
Files dir1/file1 and dir2/file1 differ
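
In scripts, the exit status of diff is often more useful than its output: 0 means no differences, 1 means the files differ, and 2 means something went wrong (for example a missing file). A minimal sketch, where compare_files is a hypothetical helper written for this illustration:

```shell
workdir=$(mktemp -d)
printf 'hello\n' > "$workdir/a"
printf 'hello\n' > "$workdir/b"
printf 'hello world\n' > "$workdir/c"

# Print a word describing how two files compare, based on diff's exit status.
compare_files() {
  status=0
  diff -q "$1" "$2" > /dev/null 2>&1 || status=$?
  case $status in
    0) echo same ;;
    1) echo different ;;
    *) echo error ;;
  esac
}

same_result=$(compare_files "$workdir/a" "$workdir/b")
diff_result=$(compare_files "$workdir/a" "$workdir/c")
missing_result=$(compare_files "$workdir/a" "$workdir/no_such_file")

echo "$same_result $diff_result $missing_result"
rm -rf "$workdir"
```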

Tip: a quick way to empty a file

[root@localhost shelltest]# >dir1/file1
[root@localhost shelltest]# cat dir1/file1
[root@localhost shelltest]#

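1.8 tr

Character translation: replace or delete

tr reads standard input and translates characters by replacing or deleting them; it is mainly used to delete control characters from a file or to convert characters.

tr usually takes two strings: string 1 lists the characters to look for, and string 2 lists the characters they are converted to.

commands | tr 'string1' 'string2'
tr 'string1' 'string2' <filename
tr options 'string1' <filename
-d: delete every input character that appears in string 1
-s: squeeze each run of a repeated character listed in string 1 into a single occurrence
a-z: any lowercase letter (tr needs no surrounding brackets; in '[a-z]' the [ and ] are treated as ordinary characters)
A-Z: any uppercase letter
0-9: any digit
[:alnum:]: all letters and digits
[:alpha:]: all letters
[:blank:]: all horizontal whitespace
[:cntrl:]: all control characters
\b: backspace
\f: form feed
\n: newline
\r: carriage return
\t: tab
[:digit:]: all digits
[:lower:]: all lowercase letters
[:punct:]: all punctuation characters
[:space:]: all horizontal or vertical whitespace
[:upper:]: all uppercase letters
[:xdigit:]: all hexadecimal digits
[=CHAR=]: all characters equivalent to CHAR

Example:

[root@server shell01]# cat 3.txt    # create this file yourself for testing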
ROOT:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
boss02:x:516:511::/home/boss02:/bin/bash
vip:x:517:517::/home/vip:/bin/bash
stu1:x:518:518::/home/stu1:/bin/bash
mailnull:x:47:47::/var/spool/mqueue:/sbin/nologin
smmsp:x:51:51::/var/spool/mqueue:/sbin/nologin
aaaaaaaaaaaaaaaaaaaa
bbbbbb111111122222222222233333333cccccccc
hello world 888
666
777
999
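# tr -d ':/' < 3.txt    # delete every : and / from the file
# cat 3.txt | tr -d ':/'    # same result, reading through a pipe
# tr '0-9' '@' < 3.txt    # replace every digit with @
# tr 'a-z' 'A-Z' < 3.txt    # convert lowercase letters to uppercase
# tr -s 'a-z' < 3.txt    # squeeze repeated lowercase letters into one
# tr -s 'a-z0-9' < 3.txt    # squeeze repeated lowercase letters and digits into one
# tr -d '[:digit:]' < 3.txt    # delete the digits
# tr -d '[:blank:]' < 3.txt    # delete horizontal whitespace
# tr -d '[:space:]' < 3.txt    # delete all horizontal and vertical whitespace, including newlines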
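
The same operations can be tried without a test file by feeding short strings through a pipe. A minimal sketch with made-up input strings; the variable names are chosen for illustration:

```shell
# Ranges need no brackets in tr.
upper=$(printf 'hello world 888\n' | tr 'a-z' 'A-Z')

# -s squeezes runs of the listed characters down to one; digits are untouched.
squeezed=$(printf 'aaaabbbb1111\n' | tr -s 'a-z')

# -d deletes; here it strips the carriage returns of Windows line endings.
no_cr=$(printf 'line1\r\nline2\r\n' | tr -d '\r')

# Character classes work the same way as ranges.
digits_gone=$(printf 'stu1:x:518\n' | tr -d '[:digit:]')

echo "$upper"
echo "$squeezed"
echo "$no_cr"
echo "$digits_gone"
```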
