Text Processing Tools

C。L.

Last modified 2023-08-01 09:24:31

Tags: vim, editor, linux

First published 2022-09-25 17:17:19

Article link: https://blog.csdn.net/weixin_43332972/article/details/127039765

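1 The Vim text editor

vim
Vi IMproved: used the same way as vi but considerably more powerful. It is not a required package, so it may not be installed by default
Official site: www.vim.org

Related editor: gvim, a graphical version of Vim

Reference:
https://www.w3cschool.cn/vim/

1.1 vim command format

Notes:
If the file exists, it is opened and its contents are displayed
If the file does not exist, it is created the first time you save after editing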

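#Format:
vim [OPTION]... FILE...

#Common options:
+# after opening, place the cursor at the start of line #; a bare + jumps to the last line
+/PATTERN place the cursor at the start of the first line matching PATTERN
-b file open the file in binary mode
-d file1 file2… compare several files, equivalent to vimdiff
-m file   open with writing disabled: the buffer can be changed but not saved (use -R for read-only mode)
-e file   start directly in Ex mode, equivalent to running ex file
-y file  Easy mode (like "evim", modeless): edit the file directly; Ctrl+o followed by :wq or :q! saves and quits, or quits without saving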

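1.2 The three common modes

Command or Normal mode: the default mode; move the cursor, cut and paste text
Insert or edit mode: used to modify text
Extended command or command-line (last-line) mode: save, quit, and so on

Normal mode --> Insert mode

i insert, type at the cursor position
I type at the start of the current line
a append, type after the cursor position
A type at the end of the current line
o open a new line below the current line
O open a new line above the current line

Insert mode --- ESC -----> Normal mode
Normal mode ---- : ----> Extended command mode
Extended command mode ---- ESC, Enter ----> Normal mode

Example: inserting color escape characters
1 Switch to Insert mode
2 Press Ctrl+v and then Ctrl+[ (hold Ctrl, press v, then [); ^[ is displayed
3 Type the color codes that follow, e.g. ^[[32mhello^[[0m
4 Switch to extended command mode, save and quit
5 cat the file in a terminal; hello is shown in green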
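
The same bytes can be produced from the shell without opening Vim, which makes it easier to see what the ^[ character really is. This is only a minimal sketch: the scratch file from mktemp and the variable names are illustrative assumptions, not part of the original steps.

```shell
# Hypothetical scratch file; any writable path works.
colored_file=$(mktemp)

# \033 is the ESC byte that Vim displays as ^[
printf '\033[32mhello\033[0m\n' > "$colored_file"

# cat -v makes the control byte visible again as ^[
visible=$(cat -v "$colored_file")
echo "$visible"

# Plain cat lets the terminal interpret the color codes
cat "$colored_file"
```

In a terminal that supports ANSI colors, the last command should show hello in green, while cat -v shows the raw sequence.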

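1.3 Extended command mode

Press ":" to enter Ex mode; a ":" command prompt appears at the bottom left of the screen

w     write (save) the file to disk
wq     write and quit
x     write and quit
X      encrypt
q     quit
q!     quit without saving, discarding any changes
r filename     read the contents of a file into the current file
w filename     write the current file contents to another file
!command     run a shell command
r!command     read in the output of a command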

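*Address ranges

#Format:
:start_pos,end_pos CMD

#     #a specific line #, e.g. 2 means line 2
#,#     #from the line given by the left # to the line given by the right #
#,+#     #from the line given by the left #, plus the number of lines given by the right #; e.g. 2,+3 means lines 2 to 5
.       #the current line
$     #the last line
.,$-1     #from the current line to the second-to-last line
%     #the whole file, equivalent to 1,$
/pattern/       #the first line below the current line that matches pattern (a regular expression)
/pat1/,/pat2/     #from the first line matching pat1 through the first following line matching pat2
#,/pat/       #from the given line through the first line matching pat
/pat/,$        #from the first line below that matches pat through the end of the file


#An address range is followed by an editing command
d     #delete
y     #yank (copy)
w file     #save the lines in the range to the given file
r file     #insert the full contents of the given file at the given position
t#     #copy the addressed lines to after line #
m#     #move the addressed lines to after line #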
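*Search and replace

#Format:
s/text to find/replacement text/flags


#Notes:
Text to find: may be a basic regular expression pattern
Replacement text: cannot be a pattern, but may use back-references such as \1, \2, ...; "&" refers to the whole text matched by the search


#Flags:
i #ignore case
g #global: replace every match on the line; by default only the first match on each line is replaced
gc #global, asking for confirmation before each replacement


#The / delimiter in a substitution can be replaced by another character, e.g. # or @
#Examples: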
%s@/etc@/var@g
%s#/boot#/#i

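1.4 Vim working settings

Settings made in extended command mode only affect the current vim process; put them in a configuration file to make them persistent

#Configuration files:
/etc/vimrc #global
~/.vimrc #per-user

#Line numbers:
Show: set number, short form set nu
Hide: set nonumber, short form set nonu

#Ignore case:
Enable: set ignorecase, short form set ic
Disable: set noic


#Auto-indent:
Enable: set autoindent, short form set ai
Disable: set noai


#Preserve formatting when pasting:
Enable: set paste
Disable: set nopaste


#Show Tab as ^I and end of line as $:
Enable: set list
Disable: set nolist


#Search highlighting:
Enable: set hlsearch
Disable: set nohlsearch, short form set nohls (the separate :nohl command only clears the current highlighting)


#Syntax highlighting:
Enable: syntax on
Disable: syntax off


#File format:
Windows format: set fileformat=dos
Unix format: set fileformat=unix
Short form: set ff=dos|unix


#Replace Tab with spaces:
Enable: set expandtab  (by default 8 spaces replace a Tab)
Disable: set noexpandtab
Short form: set et


#Replace Tab with a given number of spaces
Enable: set tabstop=#  (# spaces per Tab)
Short form: set ts=4


#Indent width
#Indent right     >> in Normal mode
#Indent left     << in Normal mode
#The default indent is 8 characters; it can be set to 4
set shiftwidth=4


#Text width:
set textwidth=65 (vim only) #counted from the left
set wrapmargin=15      #counted from the right

#Highlight the cursor line:
Enable: set cursorline, short form set cul
Disable: set nocursorline


#Encryption:
Enable: set key=password
Disable: set key=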

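1.5 Command mode

Command mode, also called Normal mode, is powerful, but the keys you type are not echoed on screen, so you need to memorize a fair number of shortcut keys to use it well

#Quit Vim
ZZ save and quit
ZQ quit without saving

#Cursor movement, by character:
h: left
l: right
j: down
k: up
#COMMAND: move by the number of characters given by #


#Word movement:
w: start of the next word
e: end of the current or next word
b: start of the current or previous word
#COMMAND: # gives the number of words to move at once


#Movement within the screen:
H: top of the screen
M: middle line of the screen
L: bottom of the screen
zt: scroll so the current line is at the top of the screen
zz: scroll so the current line is in the middle of the screen
zb: scroll so the current line is at the bottom of the screen


#Start and end of line:
^ jump to the first non-blank character of the line
0 jump to the start of the line
$ jump to the end of the line


#Moving between lines:
#G  jump to line #; the same can be done in extended command mode with
:#
G last line
1G, gg first line


#Moving between sentences:
) next sentence
( previous sentence

#Moving between paragraphs:
} next paragraph
{ previous paragraph


#Scrolling in Normal mode:
Ctrl+f     scroll one screen toward the end of the file, same as PageDown
Ctrl+b     scroll one screen toward the start of the file, same as PageUp
Ctrl+d     scroll half a screen toward the end of the file
Ctrl+u     scroll half a screen toward the start of the file



#Character editing:
x     cut the character under the cursor
#x     cut # characters starting at the cursor
xp     swap the character under the cursor with the one after it
~     toggle case
J     join the next line to the current one (removes the line break)



#Replace commands:
r     replace only the character under the cursor
R     switch to REPLACE mode (-- REPLACE -- appears on the last line); press ESC to return to Normal mode


#Delete commands:
d     delete; combine with a cursor motion to delete a range
d$     delete to the end of the line
d^     delete back to the first non-blank character of the line
d0     delete back to the start of the line
dw
de
db
#COMMAND
dd:      cut the current line
#dd     delete # lines
D:    delete from the cursor to the end of the line, same as d$



#Copy commands (yank):
y     copy; behaves like the d command
y$
y0
y^
ye
yw
yb
#COMMAND
yy:    copy the line
#yy     copy # lines
Y:    copy the whole line



#Paste commands
p if the buffer holds whole lines, paste below the current line; otherwise paste after the cursor
P if the buffer holds whole lines, paste above the current line; otherwise paste before the cursor


#Change commands
c deletes and then switches to Insert mode
c$
c^
c0
cb
ce
cw
#COMMAND
cc  #delete the current line and type new content, same as S
#cc
C  #delete from the cursor to the end of the line and switch to Insert mode, same as c$


#Search:
/PATTERN: search from the cursor toward the end of the file
?PATTERN: search from the cursor toward the start of the file
n: repeat in the same direction as the search command
N: repeat in the opposite direction


#Undoing changes:
u     undo the most recent change, like Ctrl+z on Windows
#u     undo the last # changes
U     undo all changes made to the line since the cursor moved onto it
Ctrl-r     redo the last undone change, like Ctrl+y on Windows
.     repeat the previous operation
#.     repeat the previous operation # times



#Advanced usage:
<start position><command><end position>
Common commands: y copy, d delete, gU uppercase, gu lowercase
#Example:
the 0y$ command
0 → go to the start of the line first
y → start copying from here
$ → copy up to the last character of the line


#Example: insert "wen" 100 times
100iwen [ESC]
di"  with the cursor between " ", delete the text between them
yi(  with the cursor between (), copy the text between them
vi[  with the cursor between [], select the text between them
dtx delete up to (not including) the first x after the cursor
ytx copy up to (not including) the first x after the cursor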

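1.6 Visual mode

"-- VISUAL --" on the last line indicates Visual mode

Kinds of selection:
        v character-wise, -- VISUAL --
        V line-wise, -- VISUAL LINE --
        ctrl-v block-wise, -- VISUAL BLOCK --
The visual keys can be combined with movement keys:
        w ) } arrow keys, etc.
The highlighted text can be deleted, copied, changed, filtered, searched, replaced, and so on

#Example: insert # at the start of selected lines
1. Move the cursor to the start of the first line to change
2. Press ctrl+v to enter Visual Block mode
3. Move the cursor down to select the first character of each line to change
4. Type uppercase I to switch to Insert mode
5. Type #
6. Press ESC


#Example: insert the same text at a block position
1. Place the cursor where the change should start
2. Press CTRL+v to enter Visual Block mode and select the column over the desired number of lines
3. Press SHIFT+i (I)
4. Type the text to insert
5. Press ESC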

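1.7 Multi-window mode

#Splitting across several files:
vim -o|-O FILE1 FILE2 ...
-o: horizontal split, windows stacked top and bottom
-O: vertical split, windows side by side (vim only)
Switch between windows: Ctrl+w, Arrow


#Splitting a single file:
Ctrl+w,s: split, horizontal split, top and bottom
Ctrl+w,v: vertical split, left and right
ctrl+w,q: close the current window
ctrl+w,o: close all windows except the current one
:wqall save all files and quit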

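2 Text processing with grep, sed, and awk

grep: filters lines of text against a pattern (regular expression)
sed: stream editor, a text editing tool
awk: implemented on Linux as gawk, a text report generator

2.1 Text processing with grep

Purpose: a text search tool that checks the target text line by line against a user-specified "pattern" and prints the lines that match
Pattern: a filter condition written with regular expression metacharacters and literal characters

#Help:
https://man7.org/linux/man-pages/man1/grep.1.html

#Format:
grep [OPTIONS] PATTERN [FILE...]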

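#Common options
--color=auto highlight the matched text in color
-m # stop after # matching lines
-v show lines NOT matched by the pattern, i.e. invert the match
-i ignore case
-n show the line numbers of matching lines
-c count matching lines
-o show only the matched part of each line
-q quiet mode, print nothing
-A # after, the # lines after the match
-B # before, the # lines before the match
-C # context, # lines before and after the match
-e combine several patterns with logical OR, e.g. grep -e 'cat ' -e 'dog' file
-w match whole words
-E use ERE, equivalent to egrep
-F treat the pattern as a fixed string, not a regular expression; equivalent to fgrep
-P use Perl-compatible regular expressions
-f file read the patterns from a file
-r   recurse into directories without following symbolic links
-R   recurse into directories and follow symbolic links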
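
Several of these options are easier to compare side by side on a small file. The following is a minimal sketch, assuming a hypothetical four-line sample file created with mktemp; the file contents and variable names are made up for illustration.

```shell
# Hypothetical sample data in a scratch file.
sample=$(mktemp)
printf '%s\n' 'root:x:0' 'Root cause' 'operator:root' 'nobody' > "$sample"

count_ci=$(grep -ci 'root' "$sample")          # number of matching lines, ignoring case
words=$(grep -ow 'root' "$sample" | wc -l)     # whole-word matches, one per output line
not_root=$(grep -v -i 'root' "$sample")        # lines that do not match
first_hit=$(grep -n -m 1 'root' "$sample")     # first match only, with its line number

# -q prints nothing; only the exit status is used
if grep -q '^nobody$' "$sample"; then
    found=yes
else
    found=no
fi

echo "$count_ci $words $not_root $first_hit $found"
```

The -q form is the one usually placed in an if test inside scripts, since the exit status alone carries the answer.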




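#Example:
grep root /etc/passwd
grep "$USER" /etc/passwd      # double quotes: the shell expands $USER first
grep '$USER' /etc/passwd      # single quotes: the literal text $USER is the pattern
grep "$(whoami)" /etc/passwd  # command substitution supplies the pattern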



#Example: lines common to two files
[root@centos8 ~]#cat /data/f1.txt
a
b
1
c
[root@centos8 ~]#cat /data/f2.txt
b
e
f
c
1
2
[root@centos8 ~]#grep -f /data/f1.txt /data/f2.txt
b
c
1
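
Note that grep -f treats each line of the pattern file as a regular expression and matches substrings, so a pattern line of 1 would also select a line such as 21. A hedged sketch of a stricter variant, assuming scratch copies of the two files with one extra line (21) added to the second file purely to show the difference:

```shell
workdir=$(mktemp -d)
printf '%s\n' a b 1 c      > "$workdir/f1.txt"
printf '%s\n' b e f c 1 21 > "$workdir/f2.txt"   # 21 is an added, hypothetical line

# Patterns used as regular expressions, matched anywhere in the line
loose=$(grep -f "$workdir/f1.txt" "$workdir/f2.txt")

# -F: fixed strings, -x: the whole line must match
exact=$(grep -Fxf "$workdir/f1.txt" "$workdir/f2.txt")

echo "$loose"
echo "---"
echo "$exact"
```

With -F and -x together, only lines that are identical in both files are printed.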


#Example: the highest partition usage percentage
[root@centos8 ~]#df | grep '^/dev/sd' |tr -s ' ' %|cut -d% -f5|sort -n|tail -1
[root@centos8 ~]#df |grep '^/dev/sd' |grep -oE '\<[0-9]{,3}%'|tr -d '%'|sort -nr|head -n1
[root@centos8 ~]#df |grep '^/dev/sd' |grep -oE '\<[0-9]{,3}%'|grep -Eo '[0-9]+' |sort -nr|head -n1
13


#Example: the three remote IPs with the most connections to this host
[root@centos8 ~]#ss -nt | grep "^ESTAB" |tr -s ' ' : |cut -d: -f6|sort |uniq -c|sort -nr|head -n3
   3 10.0.0.1
   1 172.16.4.100
   1 172.16.31.188


#Example: count connections by state
[root@centos8 -liyun-pc ~]# ss -nta | grep -v '^State' |cut -d" " -f1|sort |uniq -c
   7 ESTAB
   4 LISTEN
   7 TIME-WAIT
[root@centos8 -liyun-pc ~]# ss -nta | tail -n +2 |cut -d" " -f1|sort |uniq -c
   3 ESTAB
   4 LISTEN
  12 TIME-WAIT
[root@centos8 -liyun-pc ~]#


#Example:
[root@centos8 ~]#grep -v "^#" /etc/profile | grep -v '^$'
[root@centos8 ~]#grep -v "^#\|^$" /etc/profile
[root@centos8 ~]#grep -v "^\(#\|$\)" /etc/profile
[root@centos8 ~]#grep -Ev "^(#|$)" /etc/profile
[root@centos8 ~]#egrep -v "^(#|$)" /etc/profile
[root@centos6 ~]#egrep -v '^(#|$)' /etc/httpd/conf/httpd.conf



#Example:
[root@centos8 ~]#grep -o 'r..t' /etc/passwd
root
root
root
root
r/ft
rypt



#Example:
[root@centos8 ~]#ifconfig | grep -E '[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}'
   inet 10.0.0.8 netmask 255.255.255.0 broadcast 10.0.0.255
   inet 172.16.0.123 netmask 255.255.0.0 broadcast 172.16.255.255
   inet6 fe80::c11e:4792:7e77:12a4 prefixlen 64 scopeid 0x20<link>
   inet 127.0.0.1 netmask 255.0.0.0
[root@centos8 ~]#ifconfig | grep -E '([0-9]{1,3}.){3}[0-9]{1,3}'
   inet 10.0.0.8 netmask 255.255.255.0 broadcast 10.0.0.255
   inet 172.16.0.123 netmask 255.255.0.0 broadcast 172.16.255.255
   inet6 fe80::c11e:4792:7e77:12a4 prefixlen 64 scopeid 0x20<link>
   inet 127.0.0.1 netmask 255.0.0.0
[root@centos8 ~]#ifconfig eth0 | grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}'|head -1
10.0.0.8
[root@centos8 ~]#cat regex.txt
([0-9]{1,3}\.){3}[0-9]{1,3}
[root@centos8 ~]#ifconfig | grep -oEf regex.txt
10.0.0.8
255.255.255.0
10.0.0.255
127.0.0.1
255.0.0.0
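
The inet6 line shows up in the first two commands because an unescaped . matches any character. A small sketch of the difference, assuming two sample lines copied from the output above into hypothetical shell variables:

```shell
ipv6_line='inet6 fe80::c11e:4792:7e77:12a4 prefixlen 64'
ipv4_line='inet 10.0.0.8 netmask 255.255.255.0'

loose='([0-9]{1,3}.){3}[0-9]{1,3}'     # . matches any character
strict='([0-9]{1,3}\.){3}[0-9]{1,3}'   # \. matches a literal dot

loose_hits=$(printf '%s\n' "$ipv6_line" "$ipv4_line" | grep -cE "$loose")
strict_hits=$(printf '%s\n' "$ipv6_line" "$ipv4_line" | grep -cE "$strict")
strict_text=$(printf '%s\n' "$ipv6_line" "$ipv4_line" | grep -oE "$strict")

echo "loose=$loose_hits strict=$strict_hits"
echo "$strict_text"
```

Neither pattern checks that each number is at most 255; both only describe the shape of a dotted quad.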



#Example:
[root@centos8 ~]#grep -w root /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin
[root@centos8 ~]#grep '\<root\>' /etc/passwd
root:x:0:0:root:/root:/bin/bash
operator:x:11:0:operator:/root:/sbin/nologin


#Example: filter out comments (any line containing #) and blank lines
[root@centos8 ~]#grep -Ev '^$|#' /etc/fstab
UUID=01f1068e-6937-4fb2-b64b-0d7d6b85ad08 /            xfs  
defaults     0 0
UUID=cb21e5ce-edf6-4ed1-8df9-ba98520a68dc /boot          xfs  
defaults     0 0
UUID=9ea3524a-7cff-49a0-951e-8429a30bd0a0 /data          xfs  
defaults     0 0
UUID=42174d44-41aa-448b-88bc-fd36d6a49e39 swap          swap 
defaults     0 0



#Example: add up everyone's age
[root@centos8 ~]#cat /data/age.txt
xiaoming=20
xiaohong=18
xiaoqiang=22
[root@centos8 ~]#cut -d"=" -f2 /data/age.txt|tr '\n' + | grep -Eo ".*[0-9]"|bc
60
[root@centos8 ~]#grep -Eo "[0-9]+" /data/age.txt | tr '\n' + | grep -Eo ".*[0-9]"|bc
60
[root@centos8 ~]#grep -oE '[0-9]+' /data/age.txt| paste -s -d+|bc
60

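2.2 Text processing with sed

sed stands for Stream EDitor; unlike vi, sed is a line-oriented editor

sed reads one line from a file or pipe, processes it, and outputs it, then reads, processes, and outputs the next line, and so on until the last line. Each line being processed is held in a temporary buffer called the pattern space; the sed commands are applied to the contents of that buffer, and when they finish the buffer is sent to the screen. Then the next line is handled, and this repeats until the end of the file. Because only one line is held at a time, sed stays fast and does not stall on large files. Opening a file of tens or hundreds of MB with vi is noticeably sluggish because the editor loads the file into memory before displaying it; sed avoids this by working through the input line by line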

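#Official site: http://sed.sourceforge.net/


#Help: https://man7.org/linux/man-pages/man1/sed.1.html


#Format:
sed [option]... 'script;script;...' [inputfile...]


#Common options:
-n do not print the pattern space automatically
-e multiple editing commands
-f FILE read the editing script from the given file
-r, -E use extended regular expressions
-i.bak edit the file in place, keeping a backup with the given suffix
-s       treat multiple files as separate files rather than one continuous stream
#Notes:
-ir  does not work as intended: r is taken as the backup suffix
-i -r works
-ri  works
-ni  dangerous: with -n nothing is printed automatically, so the file can be emptied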
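
The warning about -ni can be reproduced safely on scratch files. This is a sketch under the assumption that GNU sed is in use (where the -i suffix is optional and attached to the option); the directory and file names are hypothetical.

```shell
# Assumes GNU sed, where -i takes an optional suffix glued to the option.
workdir=$(mktemp -d)
seq 5 > "$workdir/a.txt"
seq 5 > "$workdir/b.txt"
seq 5 > "$workdir/c.txt"

sed -i.bak '2d' "$workdir/a.txt"   # in-place edit, original kept as a.txt.bak
sed -ni '2d' "$workdir/b.txt"      # -n suppresses all output, so b.txt ends up empty
sed -ni '2!p' "$workdir/c.txt"     # with -n an explicit p is needed to keep lines

a_lines=$(wc -l < "$workdir/a.txt")
a_backup_lines=$(wc -l < "$workdir/a.txt.bak")
b_bytes=$(wc -c < "$workdir/b.txt")
c_content=$(cat "$workdir/c.txt")

echo "a=$a_lines backup=$a_backup_lines b_bytes=$b_bytes"
```

Keeping a backup suffix on -i while experimenting is a cheap safeguard against this kind of accident.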


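#script format:
'address command'

#Address formats:
1. No address: process every line
2. Single address:
 #: the given line; $: the last line
 /pattern/: every line matched by the pattern
3. Address range:
 #,#   #from line # to line #; 3,6 means lines 3 to 6
 #,+#  #from line # through the next # lines; 3,+4 means lines 3 to 7
 /pat1/,/pat2/
 #,/pat/
 /pat/,#
4. Step: ~
  1~2 odd lines
  2~2 even lines


#Commands:
p print the current pattern space, in addition to the default output
Ip print case-insensitively: I is a modifier on the /pattern/ address, as in /pattern/Ip
d delete the pattern space and start the next cycle immediately
a [\]text append text after the addressed line; \n can be used to append several lines
i [\]text insert text before the line
c [\]text replace the line with one or more lines of text
w file save the matched lines to the given file
r file read the text of the given file and append it after the matched line
= print the line number of the line in the pattern space
! negate the address: apply the command to lines that do NOT match
q      quit sed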



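#Search and replace:
s/pattern/string/flags search and replace; other delimiters are allowed, e.g. s@@@, s###
Substitution flags:
g replace every match on the line
p print lines where a substitution was made
w  /PATH/FILE save lines where a substitution was made to a file
I,i  ignore case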
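
The flags combine freely. A brief sketch on a made-up input line, assuming GNU sed for the I flag (the variable names are illustrative only):

```shell
line='Root and root and ROOT'

first_only=$(printf '%s\n' "$line" | sed 's/root/X/')    # first case-sensitive match only
all_ci=$(printf '%s\n' "$line" | sed 's/root/X/gI')      # every match, any case (GNU I flag)
wrapped=$(printf '%s\n' "$line" | sed 's/root/[&]/')     # & stands for the matched text
# alternate delimiter @, and -n with p prints only lines that were changed
changed_only=$(printf '%s\n' 'no match' "$line" | sed -n 's@root@X@p')

printf '%s\n' "$first_only" "$all_ci" "$wrapped" "$changed_only"
```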


#Example: by default sed prints its input unchanged
[root@centos8 ~]#sed ''
welcome
welcome
to
to
makejon
makejon
[root@centos8 ~]#sed '' /etc/issue
\S
Kernel \r on an \m
[root@centos8 ~]#sed 'p' /etc/issue
\S
\S
Kernel \r on an \m
Kernel \r on an \m
[root@centos8 ~]#sed -n '' /etc/issue
[root@centos8 ~]#sed -n 'p' /etc/issue
\S
Kernel \r on an \m
[root@centos8 ~]#sed -n '1p' /etc/passwd
root:x:0:0:root:/root:/bin/bash
[root@centos8 ~]#ifconfig eth0 | sed '2p'
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
   inet 10.0.0.8 netmask 255.255.255.0 broadcast 10.0.0.255
   inet 10.0.0.8 netmask 255.255.255.0 broadcast 10.0.0.255
   inet6 fe80::20c:29ff:fe45:a8a1 prefixlen 64 scopeid 0x20<link>
   ether 00:0c:29:45:a8:a1 txqueuelen 1000 (Ethernet)
   RX packets 89815 bytes 69267453 (66.0 MiB)
   RX errors 0 dropped 0 overruns 0 frame 0
   TX packets 115634 bytes 79827662 (76.1 MiB)
   TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
[root@centos8 ~]#ifconfig eth0 | sed -n '2p'
   inet 10.0.0.8 netmask 255.255.255.0 broadcast 10.0.0.255
   
[root@centos8 ~]#sed -n '$p' /etc/passwd
postfix:x:89:89::/var/spool/postfix:/sbin/nologin
#The second-to-last line
[root@ubuntu1804 ~]#sed -n "$(echo $[`cat /etc/passwd|wc -l`-1])p" /etc/passwd
[root@centos8 ~]#ifconfig eth0 |sed -n '/netmask/p'
   inet 10.0.0.8 netmask 255.255.255.0 broadcast 10.0.0.255
[root@centos8 ~]#df | sed -n '/^\/dev\/sd/p'
/dev/sda2    104806400 4872956  99933444  5% /
/dev/sda3    52403200  398860  52004340  1% /data
/dev/sda1     999320  848568   81940  92% /boot
[root@centos8 ~]#seq 10 | sed -n '3,6p'
3
4
5
6
[root@centos8 ~]#seq 10 | sed -n '3,+4p'
3
4
5
6
7
[root@centos8 ~]#seq 10 | sed -n '3,$p'
3
4
5
6
7
8
9
10
[root@centos8 ~]#seq 10 |sed -n '1~2p'
1
3
5
7
9
[root@centos8 ~]#seq 10 |sed -n '2~2p'
2
4
6
8
10
[root@centos8 ~]#seq 10 |sed  '1~2d'
2
4
6
8
10
[root@centos8 ~]#seq 10 |sed  '2~2d'
1
3
5
7
9
[root@centos8 ~]#sed -e '2d' -e '4d' seq.log
1
3
5
6
7
8
9
10
[root@centos8 ~]#sed '2d;4d' seq.log
1
3
5
6
7
8
9
10
#Hide comment lines and blank lines
[root@centos6 ~]#sed '/^#/d;/^$/d' /etc/httpd/conf/httpd.conf
[root@centos6 ~]#grep -Ev '^#|^$' /etc/httpd/conf/httpd.conf
[root@centos8 ~]#sed -i.orig '2d;4d' seq.log
[root@centos8 ~]#cat seq.log.orig
1
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#cat seq.log
1
3
5
6
7
8
9
10
[root@centos8 ~]#seq 10 > seq.log
[root@centos8 ~]#sed -i.orig '2d;4d' seq.log
[root@centos8 ~]#sed -i  '/^listen 9527/a listen 80 \nlisten 8080' /etc/httpd/conf/httpd.conf

#Delete all lines starting with #
[root@centos8 ~]#sed -i '/^#/d' fstab

#Show only lines that do not start with #
[root@centos8 ~]#sed -n '/^#/!p' fstab

#Change the network interface naming setting
[root@centos8 ~]#sed -Ei.bak '/^GRUB_CMDLINE_LINUX/s/(.*)(")$/\1 net.ifnames=0\2/' /etc/default/grub



#Example: search and replace with &
[root@centos8 ~]#sed -nr 's/r..t/&er/gp' /etc/passwd
rooter:x:0:0:rooter:/rooter:/bin/bash
operator:x:11:0:operator:/rooter:/sbin/nologin
ftp:x:14:50:FTP User:/var/fterp:/sbin/nologin


#Example: delete everything except the specified files
[root@rocky8 ~]#rm -f `ls | grep -Ev '(3|5|7)\.txt'`
[root@rocky8 ~]#ls | sed -n '/[^357].txt/p'|xargs rm
[root@rocky8 ~]#ls | grep -Ev '(3|5|7)\.txt' | sed -n 's/.*/rm &/p'|bash
[root@rocky8 ~]#ls | grep -Ev '(3|5|7)\.txt' | sed -En 's/(.*)/rm \1/p'|bash


#Example: get partition usage
[root@centos8 ~]#df | sed -En '/^\/dev\/sd/s@.* ([0-9]+)%.*@\1@p'
3
1
13


#Examples:
sed '2p' /etc/passwd
sed  -n '2p' /etc/passwd
sed  -n '1,4p' /etc/passwd
sed  -n '/root/p' /etc/passwd
sed  -n '2,/root/p' /etc/passwd   #starting from line 2
sed  -n '/^$/=' file   #print the line numbers of blank lines
sed  -n  -e '/^$/p' -e '/^$/=' file
sed '/root/a\superman' /etc/passwd   #after the line
sed '/root/i\superman' /etc/passwd   #before the line
sed '/root/c\superman' /etc/passwd   #replace the line
sed '/^$/d' file
sed '1,10d'  file
nl  /etc/passwd | sed '2,5d'
nl  /etc/passwd | sed '2a tea'
sed 's/test/mytest/g' example
sed -n 's/root/&superman/p' /etc/passwd   #after the word
sed -n 's/root/superman&/p' /etc/passwd   #before the word
sed -e 's/dog/cat/' -e 's/hi/lo/' pets
sed -i.bak  's/dog/cat/g' pets


#Example: extract the IP address
[root@centos8 ~]#ifconfig eth0 |sed -nr "2s/[^0-9]+([0-9.]+).*/\1/p" 
10.0.0.8
[root@centos6 ~]#ifconfig eth0 | sed -En '2s/^[^0-9]+([0-9.]{7,15}).*/\1/p'
10.0.0.6
[root@centos8 ~]#ifconfig eth0 | sed -rn '2s/^[^0-9]+([0-9.]+) .*$/\1/p'
10.0.0.8
[root@centos8 ~]#ifconfig eth0 | sed -n '2s/^.*inet //p' | sed -n 's/ netmask.*//p'
10.0.0.8
[root@centos8 ~]#ifconfig eth0 | sed -n '2s/^.*inet //;s/ netmask.*//p'
10.0.0.8
[root@centos8 ~]#ifconfig eth0 | sed -rn '2s/(.*inet )([0-9].*)( netmask.*)/\2/p'
10.0.0.8


#Example: get the base name and the directory name
echo "/etc/sysconfig/network-scripts/" |sed -r 's#(^/.*/)([^/]+/?)#\2#'   #base name
echo "/etc/sysconfig/network-scripts/" |sed -r 's#(^/.*/)([^/]+/?)#\1#'   #directory
#Directory name
[root@centos8 ~]#echo /etc/sysconfig/ | sed -rn 's#(.*)/([^/]+)/?#\1#p'
/etc
#Base name
[root@centos8 ~]#echo /etc/sysconfig/ | sed -rn 's#(.*)/([^/]+)/?#\2#p'
sysconfig


#Example: get a file name's prefix and suffix
[root@centos8 data]#echo a.b.c.gz |sed -En 's/(.*)\.([^.]+)$/\1/p'
a.b.c
[root@centos8 data]#echo a.b.c.gz |sed -En 's/(.*)\.([^.]+)$/\2/p'
gz
[root@centos8 data]#echo a.b.c.gz |grep -Eo '.*\.'
a.b.c.
[root@centos8 data]#echo a.b.c.gz |grep -Eo '[^.]+$'
gz
[root@centos8 ~]#echo a.b.tar.gz | sed -rn 's@.*\.([^.]+)\.([^.]+)$@\1.\2@p'
tar.gz


#Example: add # to lines that do not start with #
[root@centos8 ~]#sed -rn "s/^[^#]/#&/p"  /etc/fstab
#UUID=1b950ef9-7142-46bd-975c-c4ac1e0d47e8 /            xfs  
defaults    0 0
#UUID=667a4c81-8b4b-4a39-a111-b11cb6d09309 /boot          ext4 
defaults    1 2
#UUID=38d14714-c018-41d5-922c-49e415decbca /data          xfs  
defaults    0 0
#UUID=a0efb2bb-8227-4317-a79d-0a70d515046c swap          swap 
defaults    0 0
[root@centos8 ~]#sed -rn 's/^[^#](.*)/#\1/p' /etc/fstab
#UID=1b950ef9-7142-46bd-975c-c4ac1e0d47e8 /            xfs  
defaults    0 0
#UID=667a4c81-8b4b-4a39-a111-b11cb6d09309 /boot          ext4 
defaults    1 2
#UID=38d14714-c018-41d5-922c-49e415decbca /data          xfs  
defaults    0 0
#UID=a0efb2bb-8227-4317-a79d-0a70d515046c swap          swap 
defaults    0 0
[root@centos8 ~]#sed -rn '/^#/!s@^@#@p' /etc/fstab
#
#UUID=1b950ef9-7142-46bd-975c-c4ac1e0d47e8 /            xfs  
defaults    0 0
#UUID=667a4c81-8b4b-4a39-a111-b11cb6d09309 /boot          ext4 
defaults    1 2
#UUID=38d14714-c018-41d5-922c-49e415decbca /data          xfs  
defaults    0 0
#UUID=a0efb2bb-8227-4317-a79d-0a70d515046c swap          swap 
defaults    0 0


#Example: remove the leading # from lines that start with #
[root@centos8 ~]#sed -ri.bak '/^#/s/^#//' /etc/fstab


#Example: get partition usage
[root@centos8 ~]#df | sed -nr '/^\/dev\/sd/s# .* ([0-9]+)%.*# \1#p'
/dev/sda2 3
/dev/sda5 1
/dev/sda1 14
[root@centos8 ~]#df | sed -rn '/^\/dev\/sd/ s#([^[:space:]]+[[:space:]]+){4}(.*)%.*#\2#p'
3
1
19
[root@centos8 ~]#df | sed -rn '/^\/dev\/sd/ s#(\S+\s+){4}(.*)%.*#\2#p'
3
1
19


#Example: modify kernel parameters
[root@centos8 ~]#sed -nr '/^GRUB_CMDLINE_LINUX/s/"$/ net.ifnames=0"/p' /etc/default/grub
GRUB_CMDLINE_LINUX="crashkernel=auto resume=UUID=8363289d-138e-4e4a-abaf-6e028babc924 rhgb quiet net.ifnames=0"
[root@centos8 ~]#sed -rn '/^GRUB_CMDLINE_LINUX=/s@(.*)"$@\1 net.ifnames=0"@p' /etc/default/grub
GRUB_CMDLINE_LINUX="crashkernel=auto resume=UUID=a0efb2bb-8227-4317-a79d-0a70d515046c rhgb quiet net.ifnames=0"
[root@centos8 ~]#sed -rn '/^GRUB_CMDLINE_LINUX=/s@"$@ net.ifnames=0"@p' /etc/default/grub
GRUB_CMDLINE_LINUX="crashkernel=auto resume=UUID=a0efb2bb-8227-4317-a79d-0a70d515046c rhgb quiet net.ifnames=0 net.ifnames=0"


#Example: change the network interface naming
#centos7,8
[root@centos8 ~]#sed -i '/GRUB_CMDLINE_LINUX=/s#quiet#& net.ifnames=0#' /etc/default/grub
[root@centos8 ~]#sed -ri '/^GRUB_CMDLINE_LINUX=/s@"$@ net.ifnames=0"@' /etc/default/grub
[root@centos8 ~]#grub2-mkconfig -o /boot/grub2/grub.cfg
#ubuntu
[root@ubuntu ~]#grub-mkconfig -o /boot/grub/grub.cfg


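#Example: viewing a configuration file
#Filter out blank lines and lines starting with #
sed  -r  '/^(#|$)/d' /etc/httpd/conf/httpd.conf
sed  -r  '/^#|^$/d' /etc/httpd/conf/httpd.conf
#Also excludes lines where # follows leading whitespace
sed  -n '/^$/d;/^[[:space:]]*#/!p' /etc/httpd/conf/httpd.conf
sed  -n  -e '/^$/d' -e '/^[[:space:]]*#/!p' /etc/httpd/conf/httpd.conf
#Note: the order of the commands matters; the following behave differently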
sed  -n '/^[[:space:]]*#/!p;/^$/d' /etc/httpd/conf/httpd.conf
sed  -n  -e '/^[[:space:]]*#/!p' -e '/^$/d' /etc/httpd/conf/httpd.conf

#Example: referencing shell variables
[root@centos8 ~]#echo|sed "s/^/$RANDOM.rmvb/"
5242.rmvb
[root@centos8 ~]#echo|sed 's/^/$RANDOM.rmvb/'
$RANDOM.rmvb
[root@centos8 ~]#echo|sed 's/^/'$RANDOM'.rmvb/'
13849.rmvb
[root@centos8 ~]#echo|sed 's/^/'''$RANDOM'''.rmvb/'
28767.rmvb


#Example: edit a configuration file
[root@centos6 ~]#sed  -e '/^#<VirtualHost/,/^#<\/VirtualHost>/s@#@@' -e '/^#NameVirtualHost/s@#@@' /etc/httpd/conf/httpd.conf


#Example: multiple edits to a configuration file using a variable
[root@centos8 ~]#port=8080
[root@centos8 ~]#sed -ri.bak -e 's/^Listen 80/Listen '$port'/' -e "/ServerName/c ServerName `hostname`:$port" /etc/httpd/conf/httpd.conf


#Example: show the first 10 lines
[root@centos8 ~]#seq 100 > test.txt
[root@centos8 ~]#sed 10q test.txt
1
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#

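2.2.1 Advanced sed usage

Besides the pattern space, sed also provides a hold space. Data in the pattern space can be saved there temporarily and picked up again later, which makes more powerful processing possible.

#Common advanced commands:
P     print the pattern space from its start up to the first \n, ahead of the default output
h     copy the pattern space to the hold space, overwriting it
H     append the pattern space to the hold space
g     copy the hold space to the pattern space, overwriting it
G     append the hold space to the pattern space
x     exchange the contents of the pattern space and the hold space
n     read the next input line into the pattern space, replacing its contents
N     append the next input line to the pattern space
d     delete the pattern space
D     if the pattern space contains a newline, delete the text up to and including the first newline and restart the cycle with the resulting pattern space without reading a new input line; if it contains no newline, start a normal new cycle as if d had been issued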




#Examples:
sed -n 'n;p' FILE
seq 10 | sed 'N;s/\n//'
sed '1!G;h;$!d' FILE
seq 10 | sed -n '/3/{g;1!p;};h'  #the previous line
seq 10 | sed -nr '/3/{n;p}'    #the next line
sed  'N;D' FILE
seq 10 |sed  '3h;9G;9!d'
sed '$!N;$!D' FILE
sed '$!d' FILE
sed 'G' FILE
sed 'g' FILE
sed '/^$/d;G' FILE
sed 'n;d' FILE
sed -n '1!G;h;$p' FILE
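
What several of these one-liners do is easiest to see on short numeric input. The sketch below runs a few of them against seq output instead of FILE; it assumes GNU sed, and the variable names and the space used as the join character are illustrative choices.

```shell
reversed=$(seq 3 | sed '1!G;h;$!d')          # reverse the line order, like tac
last_two=$(seq 5 | sed '$!N;$!D')            # keep the last two lines, like tail -n 2
line9_then_3=$(seq 10 | sed '3h;9G;9!d')     # line 9 followed by the saved line 3
before_3=$(seq 10 | sed -n '/3/{g;1!p;};h')  # the line before the one matching 3
joined=$(seq 4 | sed 'N;s/\n/ /')            # join each pair of lines with a space

printf '%s\n' "$reversed" "--" "$last_two" "--" "$line9_then_3" "--" "$before_3" "--" "$joined"
```

In the first command, G appends the hold space (all earlier lines, already reversed) after the current line, h saves the result, and $!d discards everything until the last line is reached.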


#Example: print even lines
[root@centos8 ~]#seq 10 | sed -n 'n;p'
2
4
6
8
10
[root@centos8 ~]#seq 10 | sed -n '2~2p'
2
4
6
8
10
[root@centos8 ~]#seq 10 | sed '1~2d'
2
4
6
8
10
[root@centos8 ~]#seq 10 | sed -n '1~2!p'
2
4
6
8
10

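2.3 Text processing with awk

There are several versions:
AWK: the original AWK from AT&T Bell Labs
NAWK: New awk, the upgraded AWK from AT&T Bell Labs
GAWK: GNU AWK. Most GNU/Linux distributions ship GAWK, and it is compatible with AWK and NAWK

gawk: a pattern scanning and processing language that can be used for
text processing
producing formatted text reports
arithmetic
string operations

#Format:
awk [options]  'program' var=value  file…
awk [options]  -f programfile   var=value file…


#Common options:
-F "separator" the field separator for input; the default is any run of whitespace
-v var=value assign a variable

*Program format:

#Format:
pattern{action statements;..}


pattern: decides when the action statements are triggered, e.g. BEGIN, END, a regular expression
action statements: process the data, written inside {}; print and printf are the most common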

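How awk works:

Step 1: run the statements in the BEGIN{action;… } block
Step 2: read a line from the file or standard input (stdin) and run the pattern{ action;… } block on it; this repeats line by line, from the first line to the last, until the whole input has been read
Step 3: when the end of the input stream is reached, run the END{action;…} block
The BEGIN block runs before awk starts reading lines from the input stream. It is optional; statements such as variable initialization or printing a table header usually go here
The END block runs after awk has read all lines from the input stream. Summary output, such as the results of analysing all lines, is produced here. It is also optional
The general statements in the pattern block are the most important part, and they are optional too. If no action is given, { print } is run by default, i.e. each line read is printed; the block is evaluated for every line awk reads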
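
The three phases can be seen in one short program. A minimal sketch on seq output; the header text and the variable name sum are made up for illustration.

```shell
report=$(seq 5 | awk '
    BEGIN { print "n  running_sum" }          # runs once, before any input is read
    $1 % 2 { sum += $1; print $1, sum }       # runs for each line whose value is odd
    END   { print "lines=" NR, "sum=" sum }   # runs once, after the last line
')
echo "$report"
```

The middle rule uses the expression $1 % 2 as its pattern, so even lines are read and counted in NR but produce no output.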

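Separators, fields, and records

Fields (columns) split by the separator are referred to as $1,$2...$n, called field identifiers; $0 is the whole record. Note: this $ does not mean the same as $ on a shell variable
Each line of the file is called a record
If the action is omitted, print $0 is run by default

Common kinds of action

output statements: print, printf
Expressions: arithmetic, comparison expressions, etc.
Compound statements: grouped statements
Control statements: if, while, etc.
input statements

awk control statements

{ statements;… } compound statement
if(condition) {statements;…}
if(condition) {statements;…} else {statements;…}
while(condition) {statements;…}
do {statements;…} while(condition)
for(expr1;expr2;expr3) {statements;…}
break
continue
exit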

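2.3.1 The print action

#Format:
print item1, item2, ...

#Notes:
Items are separated by commas
An item can be a string or a number: a field of the current record, a variable, or an awk expression
If no item is given, it is equivalent to print $0
Literal strings must be enclosed in " "; variables and numbers need no quotes


#Example: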
[root@centos8 ~]#awk '{print "hello,awk"}'
[root@centos8 ~]#seq 10 | awk '{print "hello,awk"}'
hello,awk
hello,awk
hello,awk
hello,awk
hello,awk
hello,awk
hello,awk
hello,awk
hello,awk
hello,awk
[root@centos8 ~]#seq 3 | awk '{print 2*3}'
6
6
6
[root@centos8 ~]#awk -F: '{print "wen"}' /etc/passwd
[root@centos8 ~]#awk -F: '{print}' /etc/passwd
[root@centos8 ~]#awk -F: '{print $0}' /etc/passwd
[root@centos8 ~]#awk -F: '{print $1,$3}' /etc/passwd
[root@centos8 ~]#awk -F: '{print $1"\t"$3}' /etc/passwd
[root@centos8 ~]#grep "^UUID" /etc/fstab |awk '{print $2,$3}'
/ xfs
/boot ext4
/data xfs
swap swap



#Example: the 3 IPs with the most requests to a web site
[root@VM_0_10_centos logs]# awk '{print $1}' nginx.access.log-20200428|sort | uniq -c |sort -nr|head -3
 5498 122.51.38.20
 2161 117.157.173.214
  953 211.159.177.120


[root@centos8 ~]#awk '{print $1}' access_log |sort |uniq -c|sort -nr|head
 4870 172.20.116.228
 3429 172.20.116.208
 2834 172.20.0.222
 2613 172.20.112.14
 2267 172.20.0.227
 2262 172.20.116.179
 2259 172.20.65.65
 1565 172.20.0.76
 1482 172.20.0.200
 1110 172.20.28.145


#Example: get partition usage
[root@centos8 ~]#df | awk '{print $1,$5}'
Filesystem Use%
devtmpfs 0%
tmpfs 0%
tmpfs 2%
tmpfs 0%
/dev/sda2 3%
/dev/sda3 1%
/dev/sda1 15%
tmpfs 0%
#Using an extended regular expression as the separator
[root@centos8 ~]#df | awk -F"[[:space:]]+|%" '{print $5}'
Use
0
0
1
0
5
1
92
1
[root@centos8 ~]#df | awk -F"[ %]+" '{print $5}'
Use
0
0
1
0
3
1
19
0
[root@centos8 ~]#df | awk -F'[[:space:]]+|%' '{print $1,$5}' 
Filesystem Use
devtmpfs 0
tmpfs 0
tmpfs 2
tmpfs 0
/dev/sda2 3
/dev/sda3 1
/dev/sda1 15
tmpfs 0
[root@rocky8 ~]#df | awk -F" +|%" '{print $5}'
Use
0
0
1
0
3
1
17
0
[root@centos8 ~]#df | grep "^/dev/sd" | awk -F"[[:space:]]+|%" '{print $5}'
5
1
92
[root@centos8 ~]#df | grep '^/dev/sd'| awk -F'[[:space:]]+|%' '{print $1,$5}' 
/dev/sda2 3
/dev/sda3 1
/dev/sda1 15
[root@centos8 ~]#df | awk -F"[[:space:]]+|%" '/^\/dev\/sd/{print $5}'
5
1
92
[root@centos8 ~]#df | awk -F'[[:space:]]+|%' '/^\/dev\/sd/{print $1,$5}' 
/dev/sda2 3
/dev/sda3 1
/dev/sda1 15
[root@centos8 ~]#df|awk -F' +|%' '/^\/dev\/sd/{print $1,$5}'
/dev/sda2 3
/dev/sda3 2
/dev/sda1 100



#Example: extract the IP and time from an nginx access log
[root@VM_0_10_centos ~]# head -n 3 /apps/nginx/logs/nginx.access.log
58.87.87.99 - - [09/Jun/2020:03:42:43 +0800] "POST /wp-cron.php?doing_wp_cron=1591645363.2316548824310302734375 HTTP/1.1" ""sendfileon
128.14.209.154 - - [09/Jun/2020:03:42:43 +0800] "GET / HTTP/1.1" ""sendfileon
64.90.40.100 - - [09/Jun/2020:03:43:11 +0800] "GET /wp-login.php HTTP/1.1" ""sendfileon
[root@VM_0_10_centos ~]# awk -F'[[ ]' '{print $1,$5}' /apps/nginx/logs/nginx.access.log|head -3
58.87.87.99 09/Jun/2020:03:42:43
128.14.209.154 09/Jun/2020:03:42:43
64.90.40.100 09/Jun/2020:03:43:11


#Example: extract the IP address from ifconfig output
[root@centos8 ~]#hostname -I | cat -A
10.0.0.8 $
[root@centos8 ~]#ifconfig eth0|sed -n '2p' |awk '{print $2}'|cat -A
10.0.0.8$
[root@centos8 ~]#ifconfig eth0 | awk '/netmask/{print $2}'
10.0.0.8
[root@centos6 ~]#ifconfig eth0 |awk -F " +|:" '/Mask/{print $4}'
10.0.0.6
[root@centos6 ~]#ip a show eth0 |awk -F' +|\/' '/\<inet\>/{print $3}' 2> /dev/null
10.0.0.6
[root@centos8 ~]#ifconfig eth0| sed -rn '2s/^[^0-9]+([0-9.]+) .*$/\1/p'
10.0.0.8
[root@centos6 ~]#ifconfig eth0| sed -rn '2s/^[^0-9]+([0-9.]+) .*$/\1/p'
10.0.0.6

2.3.2 awk variables

awk variables are either built-in or user-defined

*Built-in variables

#FS: input field separator, whitespace by default; does the same job as -F

#Examples:
awk -v FS=':'  '{print $1,FS,$3}' /etc/passwd
awk -v FS=":" '{print $1FS$3}' /etc/passwd
awk -F:  '{print $1,$3,$7}'  /etc/passwd
S=:;awk -v FS=$S '{print $1FS$3}' /etc/passwd
[root@centos8 ~]#awk -v FS=":" '{print $1FS$3}' /etc/passwd |head -n3
root:0
bin:1
daemon:2
[root@centos8 ~]#S=:;awk -F$S  '{print $1,$3}' /etc/passwd|head -n3
root 0
bin 1
daemon 2
[root@centos8 ~]#
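#-F and the FS variable do the same job; if both are given they conflict, and in the runs below the one given later takes effect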
[root@centos8 ~]#awk -v FS=":" -F";" '{print $1FS$3}' /etc/passwd |head -n3
root:x:0:0:root:/root:/bin/bash;
bin:x:1:1:bin:/bin:/sbin/nologin;
daemon:x:2:2:daemon:/sbin:/sbin/nologin;
[root@centos8 ~]#awk -F";" -v FS=":" '{print $1FS$3}' /etc/passwd |head -n3
root:0
bin:1
daemon:2


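#-F and the FS variable do the same job; when both are given, the outputs here show that the one appearing later on the command line wins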
[root@centos8 ~]#awk -v FS=":" -F";" '{print $1}' /etc/passwd |head -n3
root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
[root@centos8 ~]#awk -v FS=";" -F":" '{print $1}' /etc/passwd |head -n3
root
bin
daemon





#OFS: output field separator, a single space by default
[root@centos8 ~]#awk -v FS=':'  '{print $1,$3,$7}'  /etc/passwd|head -n1
root 0 /bin/bash
[root@centos8 ~]#awk -v FS=':' -v OFS=':' '{print $1,$3,$7}' /etc/passwd|head -n1
root:0:/bin/bash
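
OFS is only inserted where print items are separated by commas, or when awk rebuilds the record. A small sketch on a single passwd-style line held in a hypothetical shell variable:

```shell
line='root:x:0:0:root:/root:/bin/bash'

with_commas=$(printf '%s\n' "$line" | awk -F: -v OFS='|' '{print $1, $3}')   # OFS goes between items
concatenated=$(printf '%s\n' "$line" | awk -F: -v OFS='|' '{print $1 $3}')   # no comma, so no OFS
untouched=$(printf '%s\n' "$line" | awk -F: -v OFS='|' '{print $0}')         # $0 keeps the input separators
rebuilt=$(printf '%s\n' "$line" | awk -F: -v OFS='|' '{$1 = $1; print $0}')  # assigning a field rebuilds $0 with OFS

printf '%s\n' "$with_commas" "$concatenated" "$untouched" "$rebuilt"
```

The $1 = $1 assignment changes nothing in the field itself; it is a common way to force awk to reassemble $0 using OFS.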



#RS: input record separator, the character that ends a record on input (newline by default)
awk -v RS=' ' '{print }' /etc/passwd



#ORS: output record separator, written after each record on output in place of the newline
awk -v RS=' ' -v ORS='###'  '{print $0}' /etc/passwd
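
On /etc/passwd the effect is hard to read, so here is a sketch on tiny made-up input; the strings and variable names are assumptions for illustration.

```shell
# RS=' ' makes every space-separated word its own record
records=$(printf 'a b c' | awk -v RS=' ' '{print NR ": " $0}')

# ORS=',' writes a comma after each record instead of a newline
joined=$(printf '1\n2\n3\n' | awk -v ORS=',' '{print $0}')

echo "$records"
echo "$joined"
```

Note the trailing comma in the second result: ORS is written after every record, including the last one.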


#NF: number of fields
#When referring to a variable, do not put $ in front of it
[root@centos8 ~]#awk -F: '{print NF}' /etc/fstab
[root@centos8 ~]#awk -F: '{print $(NF-1)}' /etc/passwd
[root@centos8 ~]#ls /misc/cd/BaseOS/Packages/*.rpm |awk -F"." '{print $(NF-1)}'|sort |uniq -c
  389 i686
  208 noarch
 1060 x86_64




#Example: every ten minutes, blacklist any IP with more than 100 connections
[root@centos8 ~]#cat deny_dos.sh
#!/bin/bash
# cron starts this script every 10 minutes, so it makes a single pass and exits
LINK=100
ss -nt | awk -F"[[:space:]]+|:" '/^ESTAB/{print $(NF-2)}'|sort |uniq -c|while read count ip;do
if [ "$count" -gt "$LINK" ];then
 iptables -A INPUT -s "$ip" -j REJECT
fi
done
[root@centos8 ~]#chmod +x /root/deny_dos.sh
[root@centos8 ~]#crontab -e
[root@centos8 ~]#crontab -l
*/10 * * * *  /root/deny_dos.sh



#Example:
[root@centos8 ~]#cat deny_dos.sh
IPLIST=`awk -F" +|:" '/^ESTAB/{print $(NF-2)}' ss.log |sort |uniq -c|sort -nr|head -3|awk '{print $2}'`
for ip in $IPLIST;do
 iptables -A INPUT -s  $ip -j REJECT
done


#NR: record number
[root@centos8 ~]#awk '{print NR,$0}' /etc/issue /etc/centos-release
1 \S
2 Kernel \r on an \m
3
4 CentOS Linux release 8.1.1911 (Core)


#Example: extract the IP address from ifconfig output
[root@centos8 ~]#ifconfig eth0 | awk '/netmask/{print $2}'
10.0.0.8
[root@centos8 ~]#ifconfig eth0 | awk 'NR==2{print $2}'
10.0.0.8


#Example:
[root@centos8 ~]#awk -F: '{print NR}' /etc/passwd
1
2
3
.......
[root@centos8 ~]#awk -F: 'END{print NR}' /etc/passwd
57
[root@centos8 ~]#awk -F: 'BEGIN{print NR}' /etc/passwd
0



#FNR: record number, counted separately for each file
awk '{print FNR}' /etc/fstab /etc/inittab
[root@centos8 ~]#awk '{print NR,$0}' /etc/issue /etc/redhat-release
1 \S
2 Kernel \r on an \m
3
4 CentOS Linux release 8.0.1905 (Core)
[root@centos8 script40]#awk '{print FNR,$0}' /etc/issue /etc/redhat-release
1 \S
2 Kernel \r on an \m
3
1 CentOS Linux release 8.0.1905 (Core)
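
Printing both counters next to each other makes the difference explicit. A sketch using two hypothetical scratch files (one.txt and two.txt) rather than system files:

```shell
workdir=$(mktemp -d)
printf '%s\n' alpha beta > "$workdir/one.txt"
printf '%s\n' gamma      > "$workdir/two.txt"

# FNR restarts at 1 for each file, NR keeps counting across files
table=$(cd "$workdir" && awk '{print FILENAME, FNR, NR, $0}' one.txt two.txt)

# NR==FNR is true only while the first file is being read
first_file_only=$(cd "$workdir" && awk 'NR==FNR' one.txt two.txt)

echo "$table"
```

The NR==FNR comparison is a common way to treat the first input file differently from the rest.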


#FILENAME: name of the current file
[root@centos8 ~]#awk '{print FILENAME}' /etc/fstab
[root@centos8 ~]#awk '{print FNR,FILENAME,$0}' /etc/issue /etc/redhat-release
1 /etc/issue \S
2 /etc/issue Kernel \r on an \m
3 /etc/issue
1 /etc/redhat-release CentOS Linux release 8.0.1905 (Core)


#ARGC: number of command-line arguments
[root@centos8 ~]#awk '{print ARGC}' /etc/issue /etc/redhat-release
3
3
3
3
[root@centos8 ~]#awk 'BEGIN{print ARGC}' /etc/issue /etc/redhat-release
3



#ARGV: an array holding the command-line arguments, one per element: ARGV[0], ......
[root@centos8 ~]#awk 'BEGIN{print ARGV[0]}' /etc/issue /etc/redhat-release
awk
[root@centos8 ~]#awk 'BEGIN{print ARGV[1]}' /etc/issue /etc/redhat-release
/etc/issue
[root@centos8 ~]#awk 'BEGIN{print ARGV[2]}' /etc/issue /etc/redhat-release
/etc/redhat-release
[root@centos8 ~]#awk 'BEGIN{print ARGV[3]}' /etc/issue /etc/redhat-release
[root@centos8 ~]#

*User-defined variables

User-defined variable names are case-sensitive; they can be assigned in these ways
-v var=value
defined directly in the program

#Example:
[root@centos8 ~]#awk -v test1=test2="hello,gawk" 'BEGIN{print test1,test2}' 
test2=hello,gawk
[root@centos8 ~]#awk -v test1=test2="hello1,gawk" 'BEGIN{test1=test2="hello2,gawk";print test1,test2}' 
hello2,gawk hello2,gawk


#Example:
awk  -v test='hello gawk' '{print test}' /etc/fstab
awk  -v test='hello gawk' 'BEGIN{print test}'
awk  'BEGIN{test="hello,gawk";print test}'
awk  -F: '{sex="male";print $1,sex,age;age=18}' /etc/passwd
cat awkscript
{print script,$1,$2}
awk  -F: -f awkscript script="awk" /etc/passwd

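*Operators

#Arithmetic operators:
x+y, x-y, x*y, x/y, x^y, x%y
-x: negate
+x: convert a string to a number


#String operator: concatenation, written with no operator symbol
#Assignment operators:
=, +=, -=, *=, /=, %=, ^=, ++, --


#Example: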
[root@centos8 ~]#awk 'BEGIN{i=0;print i++,i}'
0 1
[root@centos8 ~]#awk 'BEGIN{i=0;print ++i,i}'
1 1


#Example:
[root@centos8 ~]#seq 10 | awk 'n++'
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#awk -v n=0 '!n++' /etc/passwd
root:x:0:0:root:/root:/bin/bash
[root@centos8 ~]#awk -v n=0 '!n++{print n}' /etc/passwd
1
[root@centos8 ~]#awk -v n=1 '!n++{print n}' /etc/passwd
[root@centos8 ~]#awk -v n=0 '!++n{print n}' /etc/passwd
[root@centos8 ~]#awk -v n=0 '!++n' /etc/passwd
[root@centos8 ~]#awk -v n=-1 '!++n' /etc/passwd
root:x:0:0:root:/root:/bin/bash




#Comparison operators:
==, !=, >, >=, <, <=
[root@centos8 ~]#awk 'NR==2' /etc/issue
Kernel \r on an \m
[root@centos8 ~]#awk -F: '$3>=1000' /etc/passwd
nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin
wen:x:1000:1000:wen:/home/wen:/bin/bash
make:x:1001:1001::/home/make:/bin/bash


#Example: select odd or even lines
[root@centos8 ~]#seq 10 | awk 'NR%2==0'
2
4
6
8
10
[root@centos8 ~]#seq 10 | awk 'NR%2==1'
1
3
5
7
9
[root@centos8 ~]#seq 10 | awk 'NR%2!=0'
1
3
5
7
9



#Pattern matching operators:
~ true if the left side matches the right side (contains a match)
!~ true if it does not match


#Example:
[root@centos8 ~]#awk -F: '$0 ~ /root/{print $1}' /etc/passwd
[root@centos8 ~]#awk -F: '$0 ~ "^root"{print $1}' /etc/passwd
[root@centos8 ~]#awk '$0 !~ /root/'  /etc/passwd
[root@centos8 ~]#awk '/root/'  /etc/passwd
[root@centos8 ~]#awk -F: '/r/' /etc/passwd
[root@centos8 ~]#awk -F: '$3==0'   /etc/passwd
[root@centos8 ~]#df | awk -F"[[:space:]]+|%" '$0 ~ /^\/dev\/sd/{print $5}'
5
1
92
[root@centos8 ~]#ifconfig eth0 | awk 'NR==2{print $2}'
10.0.0.8


#Logical operators:
AND: &&
OR: ||
NOT: !


#Example: negation with !
[root@centos8 ~]#awk 'BEGIN{print i}'
[root@centos8 ~]#awk 'BEGIN{print !i}'
1
[root@centos8 ~]#awk -v i=10 'BEGIN{print !i}'
0
[root@centos8 ~]#awk -v i=-3 'BEGIN{print !i}'
0
[root@centos8 ~]#awk -v i=0 'BEGIN{print !i}'
1
[root@centos8 ~]#awk -v i=abc 'BEGIN{print !i}'
0
[root@centos8 ~]#awk -v i='' 'BEGIN{print !i}'
1


#Example:
awk -F:  '$3>=0 && $3<=1000 {print $1,$3}' /etc/passwd
awk -F:  '$3==0 || $3>=1000 {print $1,$3}' /etc/passwd
awk -F:  '!($3==0) {print $1,$3}'   /etc/passwd
awk -F:  '!($3>=500) {print $1,$3}' /etc/passwd


#Conditional (ternary) expression:
selector?if-true-expression:if-false-expression

#Example:
awk -F: '{$3>=1000?usertype="Common User":usertype="SysUser";printf "%-20s:%12s\n",$1,usertype}'  /etc/passwd
[root@centos8 ~]#df | awk -F"[ %]+" '/^\/dev\/sd/{$(NF-1)>10?disk="full":disk="OK";print $(NF-1),disk}'
3 OK
1 OK
13 full
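
The same ternary can be written with a single assignment on the left, which some readers find easier to follow. A sketch on three made-up passwd-style lines; the sample data and the narrower column width are illustrative assumptions.

```shell
# Hypothetical passwd-style sample lines.
labels=$(printf '%s\n' 'root:x:0' 'wen:x:1000' 'daemon:x:2' |
    awk -F: '{
        usertype = ($3 >= 1000) ? "Common User" : "SysUser"
        printf "%-8s:%12s\n", $1, usertype
    }')
echo "$labels"
```

Here the condition picks the value and the assignment happens once, instead of placing an assignment in each branch.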

*PATTERN模式

#PATTERN:根据pattern条件，过滤匹配的行，再做处理
如果未指定：空模式，匹配每一行

#例：
[root@centos8 ~]#awk -F: '{print $1,$3}' /etc/passwd



#/regular expression/: only lines matching the regular expression are processed; the expression must be enclosed in / /

#Example:
[root@centos8 ~]#awk  '/^UUID/{print $1}'   /etc/fstab
[root@centos8 ~]#awk  '!/^UUID/{print $1}'  /etc/fstab
[root@centos8 ~]#df | awk '/^\/dev\/sd/'
/dev/sda2    104806400 4935924  99870476  5% /
/dev/sda3    52403200  398876  52004324  1% /data
/dev/sda1     999320  848572   81936  92% /boot


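#relational expression: the line is processed only when the expression evaluates to true
True: a non-zero number or a non-empty string
False: the empty string or the number 0

#Example: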
[root@centos8 ~]#seq 10 | awk '1'
1
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#seq 10 | awk '0'
[root@centos8 ~]#seq 10 | awk '"false"'
1
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#seq 10 | awk '""'
[root@centos8 ~]#seq 10 | awk '"0"'
1
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#seq 10 | awk 'true'
[root@centos8 ~]#seq 10 | awk 'false'
[root@centos8 ~]#seq 10 | awk 'wen'
[root@centos8 ~]#seq 10 | awk 'makejon'
[root@centos8 ~]#seq 10 | awk '0'
[root@centos8 ~]#seq 10 | awk '" "'
1
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#seq 10 | awk 'makejon'
[root@centos8 ~]#seq 10 | awk -v makejon=0 'makejon'
[root@centos8 ~]#seq 10 | awk -v makejon="" 'makejon'
[root@centos8 ~]#seq 10 | awk -v makejon="0" 'makejon'
[root@centos8 ~]#seq 10 | awk -v makejon="abc" 'makejon'
1
2
3
4
5
6
7
8
9
10


#Example:
seq 10 | awk  'i=0'
seq 10 | awk  'i=1'
seq 10 | awk  'i=!i'
seq 10 | awk  '{i=!i;print i}'
seq 10 | awk  '!(i=!i)'       
seq 10 | awk  -v  i=1 'i=!i'
[root@centos8 ~]#seq 10 | awk  'i=0'
[root@centos8 ~]#seq 10 | awk  'i=1'
1
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#seq 10 | awk  'i=1'
1
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#seq 10 | awk  'i=0'
[root@centos8 ~]#seq 10 | awk  'i=!i'
1
3
5
7
9
[root@centos8 ~]#seq 10 | awk  '!(i=!i)'
2
4
6
8
10
[root@centos8 ~]#seq 10 | awk -v i=1 'i=!i'
2
4
6
8
10
[root@centos8 ~]#seq 10 | awk -v i=0 'i=!i'
1
3
5
7
9
[root@centos8 ~]#seq 10 | awk  '{i=!i;print i}'
1
0
1
0
1
0
1
0
1
0
[root@centos8 ~]#



#Example:
awk  -F:  'i=1;j=1{print i,j}' /etc/passwd
awk  -F: '$3>=1000{print $1,$3}' /etc/passwd
awk  -F: '$3<1000{print $1,$3}' /etc/passwd
awk  -F: '$NF=="/bin/bash"{print $1,$NF}' /etc/passwd
[root@centos8 ~]#awk -F: '$NF=="/bin/bash"{print $1,$NF}' /etc/passwd
root /bin/bash
wen  /bin/bash
make /bin/bash
[root@centos8 ~]#awk -F: '$NF ~ /bash$/{print $1,$NF}' /etc/passwd
root /bin/bash
wen  /bin/bash
make /bin/bash




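#line ranges
A bare line number cannot be used as an address, but the NR variable can select lines by number
/pat1/,/pat2/ takes patterns only; plain numbers are not accepted as its endpoints

#Example: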
[root@centos8 ~]#seq 10 | awk 'NR>=3 && NR<=6'
3
4
5
6
[root@centos8 ~]#awk 'NR>=3 && NR<=6{print NR,$0}' /etc/passwd
3 daemon:x:2:2:daemon:/sbin:/sbin/nologin
4 adm:x:3:4:adm:/var/adm:/sbin/nologin
5 lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
6 sync:x:5:0:sync:/sbin:/bin/sync
[root@centos8 ~]#sed -n '3,6p' /etc/passwd
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
sync:x:5:0:sync:/sbin:/bin/sync
[root@centos8 ~]#awk '/^bin/,/^adm/' /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
[root@centos8 ~]#sed -n '/^bin/,/^adm/p' /etc/passwd
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
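
Although a bare number is not accepted as an address, a range pattern can be built from two NR comparisons, which behaves much like `sed -n '3,6p'`. A minimal sketch, assuming any POSIX awk; the variable name `picked` is made up for illustration:

```shell
# Range pattern built from NR tests: starts at line 3 and stops after line 6.
picked=$(seq 10 | awk 'NR==3,NR==6' | paste -sd' ' -)
echo "$picked"
```

This prints the four selected lines joined by spaces.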



#BEGIN/END patterns
BEGIN{}: runs once, before any input is read
END{}: runs once, after all input has been processed

#Example:
awk -F : 'BEGIN {print "USER USERID"} {print $1":"$3} END{print "END FILE"}' /etc/passwd
awk -F: '{print "USER USERID";print $1":"$3} END{print "END FILE"}' /etc/passwd
awk -F: 'BEGIN{print "USER UID \n--------------- "}{print $1,$3}' /etc/passwd
awk -F: 'BEGIN{print "USER UID \n--------"}{print $1,$3}END{print "=========="}' /etc/passwd
[root@centos8 ~]#awk -F:  'BEGIN{printf "--------------------------------\n%-20s|%10s|\n--------------------------------\n","username","uid"}{printf "%-20s|%10d|\n--------------------------------\n",$1,$3}' /etc/passwd
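--------------------------------
username            |       uid|
--------------------------------
root                |         0|
--------------------------------
bin                 |         1|
--------------------------------
daemon              |         2|
--------------------------------
adm                 |         3|
--------------------------------
lp                  |         4|
--------------------------------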

2.3.3 Conditionals: if-else

#Syntax:
if(condition){statement;…}[else statement]
if(condition1){statement1}else if(condition2){statement2}else if(condition3){statement3}...... else {statementN}

#Use case: test the whole line or a single field that awk has read
#Example:
awk -F: '{if($3>=1000)print $1,$3}' /etc/passwd
awk -F: '{if($3<=100){print "<=100",$3}else if ($3<=1000) {print "<=1000",$3}else{print ">1000",$3}}' /etc/passwd
awk -F: '{if($NF=="/bin/bash") print $1}' /etc/passwd
awk '{if(NF>5) print $0}' /etc/fstab
awk -F: '{if($3>=1000) {printf "Common user: %s\n",$1} else {printf "root or Sysuser: %s\n",$1}}' /etc/passwd
awk -F: '{if($3>=1000) printf "Common user: %s\n",$1; else printf "root or Sysuser: %s\n",$1}' /etc/passwd
df -h|awk -F% '/^\/dev\/sd/{print $1}'| awk '$NF>=80{print $1,$5}'
df | awk -F"[[:space:]]+|%" '/^\/dev\/sd/{if($5>80)print $1,$5}'
[root@centos8 ~]#df | awk -F' +|%' '/^\/dev\/sd/{if($5>=10)print $1,$5}'
/dev/sda1 15
awk 'BEGIN{ test=100;if(test>90){print "very good"}
else if(test>60){ print "good"}else{print "no pass"}}'


#Example:
root@ubuntu2004:~# df | awk -F'[ %]+' '/\/dev\/.d./{if($5>=10){print $1,$5}}'
/dev/sda3 22
root@ubuntu2004:~# df | awk -F' +|%' '/^\/dev\/sd/{if($5>=10)print $1,$5}'
/dev/sda3 22
root@ubuntu2004:~# df | awk -F'[[:space:]]+|%' '/^\/dev\/sd/{if($5>=10)print $1,$5}'
/dev/sda3 22
[root@centos8 ~]#df | awk -F"[ %]+" '/^\/dev\/sd/{if($(NF-1)>10)print $(NF-1)" full";else {print $(NF-1)" OK"}}'
3 OK
1 OK
13 full
[root@centos8 ~]#df | awk -F"[ %]+" '/^\/dev\/sd/{if($(NF-1)>10){print $(NF-1)" full"}else {print $(NF-1)" OK"}}'
3 OK
1 OK
13 full

2.3.4 Conditionals: switch

#Syntax:
switch(expression) {case VALUE1 or /REGEXP/: statement1; case VALUE2 or /REGEXP2/: statement2; ...; default: statementn}
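
The syntax above can be tried with a small sketch. switch is a gawk extension, so the demo checks for gawk first; the function name `classify_shells` and the three sample records are made up for illustration:

```shell
# switch/case is a gawk extension; other awks (mawk, busybox awk) reject it.
classify_shells() {
  gawk -F: '{
    switch ($2) {
      case "/bin/bash": kind = "login shell"; break   # string constant: exact match
      case /nologin$/:  kind = "no login";    break   # regexp constant: pattern match
      default:          kind = "other"
    }
    print $1, kind
  }'
}

if command -v gawk >/dev/null 2>&1; then
  printf '%s\n' 'root:/bin/bash' 'bin:/sbin/nologin' 'sync:/bin/sync' | classify_shells
else
  echo "gawk is not installed; skipping the switch demo"
fi
```

Without the `break` statements, execution would fall through into the following case, as in C.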

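2.3.5 Loops: while

#Syntax:
while (condition) {statement;…}

#The loop body runs while the condition is true and exits once it becomes false
Use cases:
  processing several fields of one line in the same way
  processing each element of an array in turn


#Example:
root@ubuntu2004:~# awk -v i=1 -v sum=0 'BEGIN{while(i<=100){sum+=i;i++};print sum}'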
5050



#Example:
#The built-in length() returns the number of characters, not bytes
[root@centos8 ~]#awk 'BEGIN{print length("hello")}'
5
[root@centos8 ~]#awk 'BEGIN{print length("马哥教育")}'
4
[root@centos7 ~]#awk '/^[[:space:]]*linux16/{i=1;while(i<=NF){print $i,length($i); i++}}' /etc/grub2.cfg
linux16 7
/vmlinuz-3.10.0-1062.el7.x86_64 31
root=UUID=bebb9244-bbb8-4c69-9249-54a36c75155e 46
ro 2
crashkernel=auto 16
rhgb 4
quiet 5
net.ifnames=0 13
linux16 7
/vmlinuz-0-rescue-b12558570741487c9328c996e3265b09 50
root=UUID=bebb9244-bbb8-4c69-9249-54a36c75155e 46
ro 2
crashkernel=auto 16
rhgb 4
quiet 5
net.ifnames=0 13
[root@centos7 ~]#awk '/^[[:space:]]*linux16/{i=1;while(i<=NF){if(length($i)>=10){print $i,length($i)}; i++}}' /etc/grub2.cfg
/vmlinuz-3.10.0-1062.el7.x86_64 31
root=UUID=bebb9244-bbb8-4c69-9249-54a36c75155e 46
crashkernel=auto 16
net.ifnames=0 13
/vmlinuz-0-rescue-b12558570741487c9328c996e3265b09 50
root=UUID=bebb9244-bbb8-4c69-9249-54a36c75155e 46
crashkernel=auto 16
net.ifnames=0 13
[root@centos8 ~]#awk 'BEGIN{ total=0;i=1;while(i<=100){total+=i;i++};print total}'
5050

2.3.6 Loops: do-while

#Syntax:
do {statement;…}while(condition)


#Meaning: the loop body runs at least once, whether or not the condition is true


#Example:
[root@centos8 ~]#awk 'BEGIN{ total=0;i=1;do{ total+=i;i++;}while(i<=100);print total}'
5050
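
The "at least once" behaviour is easier to see when the condition is false from the start. A minimal sketch, assuming any POSIX awk; the counters `d` and `w` and the shell variable `runs` are illustrative names:

```shell
# The condition i<5 is false from the start, yet the do-while body runs once.
runs=$(awk 'BEGIN{
  i = 10
  do    { d++ } while (i < 5)
  while (i < 5) { w++ }
  print "do-while:" d+0, "while:" w+0
}')
echo "$runs"
```

The do-while counter ends at 1 and the plain while counter stays at 0.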

2.3.7 Loops: for

#Syntax:
for(expr1;expr2;expr3) {statement;…}

#Usage:
for(variable assignment;condition;iteration process) {for-body}

#Special form: iterate over the indices of an array
for(var in array) {for-body}


#Example:
root@ubuntu2004:~# awk 'BEGIN{sum=0;for(i=1;i<=100;i++){sum+=i};print sum}'
5050
root@ubuntu2004:~# for((i=1,sum=0;i<=100;i++));do let sum+=i;done;echo $sum
5050


#Example:
[root@centos8 ~]#awk 'BEGIN{total=0;for(i=1;i<=100;i++){total+=i};print total}'
5050


#Example:
The file abc.txt holds a single line of numbers; compute their sum
[root@centos8 ~]#cat abc.txt
1 2 3 4 5
[root@centos8 ~]#cat abc.txt |awk '{for(i=1;i<=NF;i++){sum+=$i};print sum}'
15
[root@centos8 ~]#cat abc.txt|tr ' ' + |bc
15
[root@centos8 ~]#sum=0;for i in `cat abc.txt`;do let sum+=i;done;echo $sum
15


#Example:
[root@centos7 ~]#awk '/^[[:space:]]*linux16/{for(i=1;i<=NF;i++) {print $i,length($i)}}' /etc/grub2.cfg
linux16 7
/vmlinuz-3.10.0-1062.el7.x86_64 31
root=UUID=bebb9244-bbb8-4c69-9249-54a36c75155e 46
ro 2
crashkernel=auto 16
rhgb 4
quiet 5
net.ifnames=0 13
linux16 7
/vmlinuz-0-rescue-b12558570741487c9328c996e3265b09 50
root=UUID=bebb9244-bbb8-4c69-9249-54a36c75155e 46
ro 2
crashkernel=auto 16
rhgb 4
quiet 5
net.ifnames=0 13


#Performance comparison
time (awk 'BEGIN{ total=0;for(i=0;i<=10000;i++){total+=i;};print total;}')
time (total=0;for i in {1..10000};do total=$(($total+i));done;echo $total)
time (for ((i=0;i<=10000;i++));do let total+=i;done;echo $total)
time (seq -s "+" 10000|bc)


#Example: extract the digits from a string
echo 'dsFUs34tg*fs5a%8ar%$#@' |awk -F "" '
{
 for(i=1;i<=NF;i++)
{ 
  if ($i ~ /[0-9]/)      
 {
   str=(str $i)
 } 
}
print str
}'

2.3.8 continue and break

# continue skips the rest of the current iteration
# break leaves the loop entirely


#Format (unlike the shell built-ins, awk's continue and break take no numeric argument):
continue
break


#Example:
[root@centos8 ~]#awk 'BEGIN{for(i=1;i<=100;i++){if(i==50)continue;sum+=i};print sum}'
5000
[root@centos8 ~]#awk 'BEGIN{for(i=1;i<=100;i++){if(i==50)break;sum+=i};print sum}'
1225
[root@centos8 ~]#awk 'BEGIN{sum=0;for(i=1;i<=100;i++){if(i%2==0)continue;sum+=i}print sum}'
2500
[root@centos8 ~]#awk 'BEGIN{sum=0;for(i=1;i<=100;i++){if(i==50)break;sum+=i}print sum}'
1225

2.3.9 next

#next stops processing the current line and moves straight on to the next one (awk's implicit per-line loop)


#Example:
[root@centos8 ~]#awk -F: '{if($3%2!=0) next; print $1,$3}' /etc/passwd
root 0
daemon 2
lp 4
shutdown 6
mail 8
games 12
ftp 14
nobody 65534
polkitd 998
gluster 996
rtkit 172
rpc 32
chrony 994
saslauth 992
clevis 984
pegasus 66
colord 982
setroubleshoot 980
gdm 42
gnome-initial-setup 978
sshd 74
avahi 70
tcpdump 72
wen 1000

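2.3.10 Arrays

#awk arrays are associative arrays


#Format:
array_name[index-expression]

#Example:
weekdays["mon"]="Monday"


#index-expression
    Arrays provide key/value storage
    Any string may serve as an index; string literals must be enclosed in double quotes
    Referencing an element that does not exist makes awk create it and initialize it to the empty string
    To test whether an element exists, use the "index in array" form, which does not create it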
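
The difference between referencing an element and testing it with `in` can be checked directly. A minimal sketch, assuming any POSIX awk; the array `a`, the key `"k"` and the variable `probe` are made up for illustration:

```shell
# Merely reading a["k"] creates the element, so test with "in" first.
probe=$(awk 'BEGIN{
  before = ("k" in a) ? "yes" : "no"
  tmp = a["k"]                      # reference: silently creates a["k"] = ""
  after  = ("k" in a) ? "yes" : "no"
  print "before=" before, "after=" after
}')
echo "$probe"
```

The key is absent before the reference and present afterwards, which is why existence checks should not be written as `if (a["k"] != "")`.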


#Example:
[root@centos8 ~]#awk 'BEGIN{weekdays["mon"]="Monday";weekdays["tue"]="Tuesday";print weekdays["mon"]}'
Monday



#Example:
awk '!line[$0]++' dupfile
awk '{print !line[$0]++, $0, line[$0]}' dupfile
awk '{!line[$0]++;print $0, line[$0]}' dupfile
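
The first of these commands is the usual order-preserving deduplication idiom. A minimal sketch on inline sample data (the five input lines and the variable `uniq_lines` are illustrative), assuming any POSIX awk:

```shell
# line[$0]++ yields the old count: 0 (false) on first sight, so ! makes it true.
uniq_lines=$(printf '%s\n' a b a c b | awk '!line[$0]++' | paste -sd' ' -)
echo "$uniq_lines"
```

Each line is printed only the first time it appears, and unlike `sort -u` the original order is kept.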


#Example: test whether an array index exists
[root@centos8 ~]# awk 'BEGIN{array["i"]="x"; array["j"]="y" ; print "i" in array, "y" in array }'
1 0
[root@centos8 ~]#awk 'BEGIN{array["i"]="x"; array["j"]="y" ;if ("i" in array ){print "exists"}else{print "does not exist"}}'
exists
[root@centos8 ~]#awk 'BEGIN{array["i"]="x"; array["j"]="y" ;if ("abc" in array ){print "exists"}else{print "does not exist"}}'
does not exist


#To visit every element of an array, use a for loop
for(var in array) {for-body}


#Note: var takes each index of array in turn, in no guaranteed order
#Example: iterate over an array
[root@centos8 ~]#awk 'BEGIN{weekdays["mon"]="Monday";weekdays["tue"]="Tuesday";for(i in weekdays){print i,weekdays[i]}}'
tue Tuesday
mon Monday
[root@centos8 ~]#awk 'BEGIN{students[1]="daizong";students[2]="junzong";students[3]="kunzong";for(x in students){print x":"students[x]}}'
1:daizong
2:junzong
3:kunzong
[root@centos8 ~]#awk 'BEGIN {
a["x"] = "welcome"
a["y"] = "to"
a["z"] = "Makejon"
for (i in a) {
  print i,a[i]
}
}'
x welcome
y to
z Makejon
[root@centos8 ~]#awk -F: '{user[$1]=$3}END{for(i in user){print "username: "i,"uid: "user[i]}}' /etc/passwd
username: adm uid: 3
username: rpc uid: 32
username: dnsmasq uid: 985
username: radvd uid: 75
username: sync uid: 5
username: mail uid: 8
username: exim uid: 93
username: tss uid: 59
username: gluster uid: 996
username: unbound uid: 995
username: halt uid: 7


#Example: count how many connections are in each state
[root@centos8 ~]#awk 'NR!=1{print $1}' ss.log |sort |uniq -c
  118 ESTAB
   1 FIN-WAIT-1
  11 LAST-ACK
[root@centos8 ~]#cat ss.log | sed -nr '1!s/^([^0-9]+) .*/\1/p'|sort |uniq -c
  529 ESTAB  
   9 LISTEN 
  128 SYN-RECV
  95 TIME-WAIT
 
[root@centos8 ~]#ss -ant | awk 'NR!=1{state[$1]++}END{for(i in state){print
i,state[i]}}'
SYN-RECV 128
LISTEN 9
ESTAB 529
TIME-WAIT 95
[root@centos8 ~]#netstat -tan | awk '/^tcp/{state[$NF]++}END{for(i in state)
{print i,state[i]}}'
LISTEN 9
SYN_RECV 126
ESTABLISHED 523
FIN_WAIT2 40


#Example:
[root@centos8 ~]#awk '{ip[$1]++}END{for(i in ip){print i,ip[i]}}' /var/log/httpd/access_log
172.20.0.200 1482
172.20.21.121 2
172.20.30.91 29
172.16.102.29 864
172.20.0.76 1565
172.20.9.9 15
172.20.1.125 463
172.20.61.11 2
172.20.73.73 198
[root@centos8 ~]#awk '{ip[$1]++}END{for(i in ip){print ip[i],i}}' access_log |sort -nr| head -3
4870 172.20.116.228
3429 172.20.116.208
2834 172.20.0.222
[root@centos8 ~]#awk '{ip[$1]++}END{for(i in ip){print i,ip[i]}}' access_log |sort -k2 -nr|head -3
172.20.116.228 4870
172.20.116.208 3429
172.20.0.222 2834


#Example: block every IP that appears in the access log 1000 times or more
[root@centos8 ~]#awk '{ip[$1]++}END{for(i in ip){if(ip[i]>=1000){system("iptables -A INPUT -s "i" -j REJECT")}}}' nginx.access.log-20200428
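
Because system() runs the firewall command immediately, it can be worth printing the commands first and reviewing them. A cautious sketch, assuming any POSIX awk; the threshold variable `limit`, the variable `plan` and the sample log lines are made up for illustration:

```shell
# Dry run: print the iptables commands instead of passing them to system().
# The threshold and the sample log lines are invented for this sketch.
plan=$(printf '%s\n' '10.0.0.7 GET /' '10.0.0.7 GET /a' '10.0.0.7 GET /b' '10.0.0.1 GET /' |
  awk -v limit=3 '{ip[$1]++}
    END{for(i in ip) if(ip[i]>=limit) print "iptables -A INPUT -s " i " -j REJECT"}')
echo "$plan"
```

Once the printed list looks right, the same output can be piped to `sh`, or `print` can be swapped back to `system()`.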


#Example: multidimensional arrays (arrays of arrays, a gawk extension)
[root@centos8 ~]#awk 'BEGIN{
> array[1][1]=11
> array[1][2]=12
> array[1][3]=13
> array[2][1]=21
> array[2][2]=22
> array[2][3]=23
> for (i in array)
>   for (j in array[i])
>     print array[i][j]
> }'
11
12
13
21
22
23


#Example:
root@ubuntu2004:~# cat score.txt
name sex score
alice f  100
bob  m  90
ming m  95
hong f  90
root@ubuntu2004:~# awk 'NR!=1{if($2=="m"){m_sum+=$3;m_num++}else{f_sum+=$3;f_num++}}END{print "male average="m_sum/m_num,"female average="f_sum/f_num}' score.txt
male average=92.5 female average=95
root@ubuntu2004:~# awk 'NR!=1{score[$2]+=$3;num[$2]++}END{for(i in score){print i,score[i]/num[i]}}' score.txt
m 92.5
f 95
root@ubuntu2004:~# awk 'NR!=1{score[$2]+=$3;num[$2]++}END{for(i in score){if(i=="m"){print "male average=",score[i]/num[i]}else{print "female average=",score[i]/num[i]}}}' score.txt
male average= 92.5
female average= 95

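2.3.11 awk functions

awk functions are either built-in or user-defined

Built-in functions:

#Official documentation:
https://www.gnu.org/software/gawk/manual/gawk.html#Functions


#Common built-in functions
Numeric:
rand(): returns a random number N with 0 <= N < 1
srand(): seeds the generator used by rand()
int(): truncates to an integer


#Example: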
[root@centos8 ~]#awk 'BEGIN{srand();print rand()}'
0.790437
[root@centos8 ~]#awk 'BEGIN{srand();print rand()}'
0.283736
[root@centos8 ~]#awk 'BEGIN{srand();print rand()}'
0.948082
[root@centos8 ~]#awk 'BEGIN{srand();print rand()}'
0.371798
[root@centos8 ~]#awk 'BEGIN{srand(); for (i=1;i<=10;i++)print int(rand()*100) }'
35
17
35
95
19
15
70
54
46
93


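#String functions:
length([s]): returns the length of string s ($0 if omitted)
sub(r,s,[t]): searches t for the pattern r and replaces the first match with s
gsub(r,s,[t]): searches t for the pattern r and replaces every match with s
split(s,array,[r]): splits string s on the separator r and stores the pieces in array; the first index is 1, the second is 2, …


#Example: length of each user name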
root@ubuntu2004:~# cut -d: -f1 /etc/passwd | awk '{print length()}'
root@ubuntu2004:~# awk -F: '{print length($1)}' /etc/passwd


#Example:
[root@centos8 ~]#echo "2008:08:08 08:08:08" | awk 'sub(/:/,"-",$1)'
2008-08:08 08:08:08
[root@centos8 ~]#echo "2008:08:08 08:08:08" | awk '{sub(/:/,"-",$1);print $0}'
2008-08:08 08:08:08

#Example:
[root@centos8 ~]#echo "2008:08:08 08:08:08" | awk 'gsub(/:/,"-",$0)'
2008-08-08 08-08-08
[root@centos8 ~]#echo "2008:08:08 08:08:08" | awk '{gsub(/:/,"-",$0);print $0}'
2008-08-08 08-08-08

#Example:
[root@centos8 ~]#netstat -tn | awk '/^tcp/{split($5,ip,":");count[ip[1]]++}END{for(i in count){print i,count[i]}}'
10.0.0.1 1
10.0.0.6 1
10.0.0.7 673    



#awk can run shell commands
    system("cmd")
#A space concatenates strings in awk. To use an awk variable inside system(), keep the variable outside the quotes and put everything else inside double quotes



#Example:
awk 'BEGIN{system("hostname")}'
awk 'BEGIN{score=100; system("echo your score is " score) }'
[root@centos8 ~]#netstat -tn | awk '/^tcp/{split($5,ip,":");count[ip[1]]++}END{for(i in count){if(count[i]>=10){system("iptables -A INPUT -s "i" -j REJECT")}}}'

#Time functions:
systime() seconds elapsed since 1970-01-01 (the epoch)
strftime() formats a timestamp according to a format string

#Official documentation for the time functions:
https://www.gnu.org/software/gawk/manual/gawk.html#Time-Functions

User-defined functions:

#Format of a user-defined function:
function name ( parameter, parameter, ... ) {
 statements
 return expression
}


#Example:
[root@centos8 ~]#cat func.awk
function max(x,y) {
x>y?var=x:var=y
return var
}
BEGIN{print max(a,b)}
[root@centos8 ~]#awk -v a=30 -v b=20 -f func.awk
30
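
In max() above, var is a global variable. awk has no local declarations, so the usual convention is to list extra parameters that callers never pass. A minimal sketch, assuming any POSIX awk; `sum_to` and its parameters are hypothetical names:

```shell
# Extra parameters (i, total) act as local variables; callers pass only n.
result=$(awk 'function sum_to(n,   i, total) {
  for (i = 1; i <= n; i++) total += i
  return total
}
BEGIN { i = "untouched"; print sum_to(100), i }')
echo "$result"
```

The global `i` keeps its value because the function works on its own `i`.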

2.3.12 awk scripts

#An awk program can be saved as a script and then called or executed directly

#Example:
[root@centos8 ~]#cat passwd.awk
{if($3>=1000)print $1,$3}
[root@centos8 ~]#awk -F: -f passwd.awk /etc/passwd
nobody 65534
wen 1000
make 1001


#Example:
[root@centos8 ~]#cat test.awk
#!/bin/awk -f
#this is an awk script
{if($3>=1000)print $1,$3}
[root@centos8 ~]#chmod +x test.awk
[root@centos8 ~]#./test.awk -F: /etc/passwd
nobody 65534
wen 1000
make 1001


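#Passing arguments to an awk script
#Format:
awkfile  var=value  var2=value2... Inputfile     

#Note:
Variables passed this way are not available in BEGIN; each one is assigned only when awk reaches it in the argument list, just before the input file that follows is read
The -v option gives awk the value before BEGIN runs
Each variable given with -v needs its own -v option    


#Example:
[root@rocky8 ~]#awk -v x=100 'BEGIN{print x}{print x+100}' /etc/hosts
100
200
200
#Here x is still unset in BEGIN, so the first line of output is empty
[root@rocky8 ~]#awk 'BEGIN{print x}{print x+100}' x=200 /etc/hosts

300
300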
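
Since each var=value operand is processed in argument order, the same variable can take a different value for each input file. A minimal sketch, assuming any POSIX awk; the variable `tag` and the temporary files f1 and f2 are made up for illustration:

```shell
# Each var=value operand takes effect when awk reaches it in the argument list.
tmpdir=$(mktemp -d)
printf 'one\n' > "$tmpdir/f1"
printf 'two\n' > "$tmpdir/f2"
tagged=$(awk '{print tag, $0}' tag=A "$tmpdir/f1" tag=B "$tmpdir/f2" | paste -sd, -)
echo "$tagged"
rm -rf "$tmpdir"
```

Lines from the first file are labelled A and lines from the second are labelled B.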


#Example:
[root@centos8 ~]#cat test2.awk
#!/bin/awk -f
{if($3 >=min && $3<=max)print $1,$3}
[root@centos8 ~]#chmod +x test2.awk
[root@centos8 ~]#./test2.awk -F: min=100 max=200 /etc/passwd
systemd-resolve 193
rtkit 172
pulse 171
qemu 107
usbmuxd 113
abrt 173


#Example: find client IPs that accessed nginx more than 3 times within the last hour
[root@VM_0_10_centos ~]# cat check_nginx_log.awk
#!/usr/bin/awk -f
BEGIN {
beg=strftime("%Y-%m-%dT%H:%M",systime()-3600) ;
#start of the window: one hour ago, formatted like the log timestamp
end=strftime( "%Y-%m-%dT%H:%M",systime()-60) ;
#end of the window: one minute ago
#print beg;
#print end;
}
$4 > beg && $4 < end {#keep only log entries inside the time window
count[$12]+=1;#the IP is the array index, the hit count is the value
}
END {
for(i in count){#i is the array index, i.e. the IP
if(count[i]>3) { #act when an IP has more than 3 hits
print count [i]" "i;
#system("iptables -I INPUT -s " i " -j DROP")
}
}
}
#awk -F'"' -f check_nginx_log.awk /apps/nginx/logs/access.log
[root@VM_0_10_centos ~]# head /apps/nginx/logs/access_json.log -n3
{"@timestamp":"2020-06-09T17:12:13+08:00","host":"172.21.0.10","clientip":"58.87.87.99","size":0,"responsetime":0.001,"upstreamtime":"0.001","upstreamhost":"127.0.0.1:9000","http_host":"www.wen.com","uri":"/wp-cron.php","domain":"www.wen.com","xff":"-","referer":"-","tcp_xff":"","http_user_agent":"WordPress/5.3.2; http://www.wen.com","status":"499"}
{"@timestamp":"2020-06-09T17:12:13+08:00","host":"127.0.0.1","clientip":"127.0.0.1","size":0,"responsetime":0.060,"upstreamtime":"0.060","upstreamhost":"127.0.0.1:9000","http_host":"127.0.0.1","uri":"/index.php","domain":"127.0.0.1","xff":"-","referer":"-","tcp_xff":"","http_user_agent":"curl/7.29.0","status":"200"}
{"@timestamp":"2020-06-09T17:12:14+08:00","host":"127.0.0.1","clientip":"127.0.0.1","size":0,"responsetime":0.022,"upstreamtime":"0.022","upstreamhost":"127.0.0.1:9000","http_host":"127.0.0.1","uri":"/index.php","domain":"127.0.0.1","xff":"-","referer":"-","tcp_xff":"","http_user_agent":"curl/7.29.0","status":"200"}
[root@VM_0_10_centos ~]# awk -F'"' -f check_nginx_log.awk /apps/nginx/logs/access_json.log
4 127.0.0.1
56 172.105.120.92
5 58.87.87.99
11 111.199.184.16

#Example:
Count how often each filesystem type appears in /etc/fstab
[root@ubuntu1804 ~]#awk -F' +' '/^UUID/{fs[$3]++}END{for(i in fs){print i,fs[i]}}' /etc/fstab
swap 1
ext4 3
[root@ubuntu1804 ~]#awk -F' +' '/^UUID/{print $3}' /etc/fstab |sort |uniq -c
   3 ext4
   1 swap


#Example:
Count how often each word appears in /etc/fstab
[root@ubuntu1804 ~]#awk -F"[^[:alpha:]]" '{for(i=1;i<=NF;i++)word[$i]++}END{for(a in word)if(a !="") print a,word[a]}'  /etc/fstab

#Example:
Extract all digits from the string Yd$C@M05MB%9&Bdh7dq+YVixp3vpw
[root@ubuntu1804 ~]#echo 'Yd$C@M05MB%9&Bdh7dq+YVixp3vpw' | awk '{gsub(/[^0-9]/,"");print $0}'
05973
[root@ubuntu1804 ~]#echo 'Yd$C@M05MB%9&Bdh7dq+YVixp3vpw' |awk -F "" '{for(i=1;i<=NF;i++){if ($i ~ /[[:digit:]]/){str=$i; str1=(str1 str)}};print str1}'
[root@ubuntu1804 ~]#echo 'Yd$C@M05MB%9&Bdh7dq+YVixp3vpw' | awk -F'[^0-9]' '{for(i=1;i<=NF;i++){printf "%s",$i }}'
05973

#Example:
The file random.txt holds 5000 random integers stored as 100,50,35,89…; find the largest and the smallest
[root@ubuntu1804 ~]#str="";for((i=1;i<=5000;i++));do if [ $i -ne 5000 ];then str+="$RANDOM,";else str+=$RANDOM;fi;done;echo $str > random.txt
[root@ubuntu1804 ~]#awk -F, '{max=$1;min=$1;for(i=1;i<=NF;i++){if($i>max){max=$i}else{if($i<min){min=$i}}}}END{print "max: "max,"min: "min}' random.txt


#Example:
Production case, mitigating a DoS attack: when one IP holds more than 100 concurrent connections, call the firewall to block it; check every 5 minutes. The firewall command is: iptables -A INPUT -s IP -j REJECT

[root@ubuntu1804 ~]#ss -nt | awk -F " +|:" 'NR!=1{ip[$(NF-2)]++}END{for(i in ip){if(ip[i]>100){system("iptables -A INPUT -s "i" -j REJECT")}}}'


#Example:
In the text file awktest.txt below, group the rows by the inode column, sum the counts column for each inode, and report the smallest beginnumber and the largest endnumber within each inode
inode|beginnumber|endnumber|counts|
106|3363120000|3363129999|10000|
106|3368560000|3368579999|20000|
310|3337000000|3337000100|101|
310|3342950000|3342959999|10000|
310|3362120960|3362120961|2|
311|3313460102|3313469999|9898|
311|3313470000|3313499999|30000|
311|3362120962|3362120963|2|

The expected output format is:
106|3363120000|3368579999|30000|
310|3337000000|3362120961|10103|
311|3313460102|3362120963|39900|


[root@centos8 ~]#cat awktest.txt
inode|beginnumber|endnumber|counts|
106|3363120000|3363129999|10000|
106|3368560000|3368579999|20000|
310|3337000000|3337000100|101|
310|3342950000|3342959999|10000|
310|3362120960|3362120961|2|
311|3313460102|3313469999|9898|
311|3313470000|3313499999|30000|
311|3362120962|3362120963|2|
[root@centos8 ~]#awk -F '|' '!/^inode/{sum[$1]+=$4;
if(!begin[$1])begin[$1]=$2;else if(begin[$1]>$2)begin[$1]=$2;
if(!end[$1])end[$1]=$3;else if(end[$1]<$3)end[$1]=$3}
END{for(i in sum)print i"|"begin[i]"|"end[i]"|"sum[i]"|"}' awktest.txt



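2.4 Extracting columns with cut

The cut command extracts selected columns from text files or from STDIN

#Format:
cut [OPTION]... [FILE]...


#Common options
-d DELIMITER: field delimiter, TAB by default
-f FIELDS:
  #: field number #, e.g. 3
  #,#[,#]: several separate fields, e.g. 1,3,6
  #-#: a range of consecutive fields, e.g. 1-6
  Combined: 1-3,7
-c cut by character position
--output-delimiter=STRING sets the output delimiter



#Example:
[root@centos8 ~]#cut -d: -f1,3-4,7 /etc/passwd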
[root@centos8 ~]#ifconfig |head -n2 |tail -n1|cut -d" " -f10
10.0.0.8
[root@centos8 ~]#ifconfig |head -n2 |tail -n1|tr -s " " |cut -d " " -f3
10.0.0.8
[root@rocky8 ~]#df|tail -n +2 |cut -d% -f1|rev|cut -d" " -f1 |rev
[root@rocky8 ~]#df|tail -n +2| cut -c44-46|tr -d ' '
[root@centos8 ~]#df | tr -s ' '|cut -d' ' -f5 |tr -dc "[0-9\n]"
0
0
1
0
5
1
15
1
[root@centos8 ~]#df | tr -s ' ' % |cut -d% -f5 |tr -d '[:alpha:]'
0
0
1
0
5
1
15
1
[root@centos8 ~]#df | cut -c44-46 |tr -d '[:alpha:]'
 0
 0
 1
 0
 5
 1
15
 1
[root@centos8 ~]#cut -d: -f1,3,7 --output-delimiter="---" /etc/passwd
root---0---/bin/bash
bin---1---/sbin/nologin
daemon---2---/sbin/nologin
cat /etc/passwd | cut -d: -f7
cut -c2-5 /usr/share/dict/words
[root@centos8 ~]#echo {1..10}| cut -d ' ' -f1-10 --output-delimiter="+" |bc
55


#Example: get partition usage
[root@centos8 ~]#df|tr -s ' ' |cut -d' ' -f5 |tr -d %
[root@centos8 ~]#df|tr -s ' ' '%'|cut -d% -f5
Use
0
0
2
0
3
1
15
0
100
[root@centos8 ~]#df |cut -c 44-46|tail -n +2
 0
 0
 3
 0
 3
 1
13
0
[root@centos8 ~]#df | tail -n +2|tr -s ' ' % |cut -d% -f5
0
0
1
0
3
1
19
0
100
[root@centos8 ~]#df | tail -n +2|tr -s ' ' |cut -d' ' -f5 |tr -d %
0
0
1
0
3
1
19
0
100

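2.5 Merging files with paste

paste joins the lines with the same line number from several files into one line

#Format:
paste [OPTION]... [FILE]...


#Common options:
-d  #delimiter to use, TAB by default
-s  #merge all lines of each file into a single line


#Example: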
[root@centos8 ~]#cat alpha.log
a
b
c
d
e
f
g
h
[root@centos8 ~]#cat seq.log
1
2
3
4
5
[root@centos8 ~]#cat alpha.log seq.log
a
b
c
d
e
f
g
h
1
2
3
4
5
[root@centos8 ~]#paste alpha.log seq.log
a 1
b 2
c 3
d 4
e 5
f
g
h
[root@centos8 ~]#paste -d":" alpha.log seq.log
a:1
b:2
c:3
d:4
e:5
f:
g:
h:
[root@centos8 ~]#paste -s seq.log
1 2 3 4 5
[root@centos8 ~]#paste -s alpha.log
a b c d e f g h
[root@centos8 ~]#paste -s alpha.log seq.log
a b c d e f g h
1 2 3 4 5
[root@centos8 ~]#cat title.txt
ceo
coo
cto
[root@centos8 ~]#cat emp.txt
xin 
zhang
liang 
liu
[root@centos8 ~]#paste title.txt emp.txt
ceo xin 
coo zhang
cto liang 
liu
[root@centos8 ~]#paste -s title.txt emp.txt
ceo coo cto
xin zhang liang liu


[root@centos8 ~]#paste -s -d: f1.log f2.log
1:2:3:4:5:6:7:8:9:10
a:b:c:d:e:f:g:h:i:j
[root@centos8 ~]#seq 10
1
2
3
4
5
6
7
8
9
10
[root@centos8 ~]#seq 10 |paste -s -d+|bc
55



#Example: change passwords in bulk
[root@centos8 ~]#cat user.txt
xin
liang
[root@centos8 ~]#cat pass.txt
123456
liangjia
[root@centos8 ~]#paste -d: user.txt pass.txt
xin:123456
liang:liangjia
[root@centos8 ~]#paste -d: user.txt pass.txt|chpasswd

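2.6 Tools for analyzing text

Text statistics: wc
Sorting text: sort
Comparing files: diff and patch

2.6.1 Collecting text statistics with wc

The wc command counts the lines, words, bytes, and characters in a file
It works on files or on data from STDIN

#Common options
-l count lines only
-w count words only
-c count bytes only
-m count characters only
-L print the length of the longest line


#Example:
wc story.txt
39   237   1901 story.txt
lines  words  bytes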



#Example:
[root@centos8 ~]#ll title.txt
-rw-r--r-- 1 root root 30 Dec 20 11:05 title.txt
[root@centos8 ~]#ll title1.txt
-rw-r--r-- 1 root root 28 Dec 20 11:06 title1.txt
[root@centos8 ~]#cat title.txt
ceo mage
coo zhang
cto 老王
[root@centos8 ~]#cat title1.txt
ceo mage
coo zhang
cto wang
[root@centos8 ~]#wc title.txt
3  6 30 title.txt
[root@centos8 ~]#wc title1.txt
3  6 28 title1.txt


[root@centos8 ~]#wc -l title.txt
3 title.txt
[root@centos8 ~]#cat title.txt | wc -l
3


[root@centos8 ~]#df | tail -n $(echo `df | wc -l`-1|bc)
devtmpfs      910220    0   910220  0% /dev
tmpfs       924728    0   924728  0% /dev/shm
tmpfs       924728   9224   915504  1% /run
tmpfs       924728    0   924728  0% /sys/fs/cgroup
/dev/sda2    104806400 4836160  99970240  5% /
/dev/sda3    52403200  398580  52004620  1% /data
/dev/sda1     999320  131764   798744  15% /boot
tmpfs       184944    4   184940  1% /run/user/0


#Example: the word list file
[root@centos8 ~]#yum -y install words
[root@centos8 ~]#wc -l /usr/share/dict/linux.words
479829 /usr/share/dict/linux.words

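2.6.2 Sorting text with sort

The sorted text is written to STDOUT; the original file is not changed

#Format:
sort [options] file(s)



#Common options
-r reverse the sort order (descending)
-R random order
-n sort by numeric value
-h human-readable sort, e.g. 2K 1G
-f fold case: ignore the case of letters
-u unique: merge repeated lines, i.e. deduplicate
-t c use c as the field delimiter
-k # sort on field # as delimited by c; may be given several times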
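
Giving -k more than once sorts on several keys in turn. A minimal sketch on made-up records of the form name:team:score (the data and the variable `ranked` are illustrative):

```shell
# Sort by field 2 (text), then by field 3 numerically in descending order.
ranked=$(printf '%s\n' 'amy:dev:90' 'bob:ops:70' 'cat:dev:100' 'dan:ops:85' |
  sort -t: -k2,2 -k3,3nr | paste -sd' ' -)
echo "$ranked"
```

Writing `-k2,2` rather than `-k2` limits the first key to field 2 alone; a plain `-k2` would run from field 2 to the end of the line and the second key would rarely come into play.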



#Example:
[root@centos8 data]#cut -d: -f1,3 /etc/passwd|sort -t: -k2 -nr |head -n3
nobody:65534
xiaoming:1002
mage:1001


#Count the distinct client addresses in the access log
[root@centos8 data]#cut -d" " -f1 /var/log/nginx/access_log |sort -u|wc -l
201


#Example: partition usage
[root@centos8 ~]#df
Filesystem   1K-blocks  Used Available Use% Mounted on
devtmpfs      391676    0   391676  0% /dev
tmpfs       408092    0   408092  0% /dev/shm
tmpfs       408092   5816   402276  2% /run
tmpfs       408092    0   408092  0% /sys/fs/cgroup
/dev/sda2    104806400 2259416 102546984  3% /
/dev/sda3    52403200  398608  52004592  1% /data
/dev/sda1     999320  130848   799660  15% /boot
tmpfs        81616    0   81616  0% /run/user/0
/dev/sr0     7377866 7377866     0 100% /misc/cd


##Show the highest partition usage
[root@centos8 ~]#df| tr -s ' ' '%'|cut -d% -f5|sort -nr|head -1
100


[root@centos8 ~]#df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort
0
0
0
1
1
1
15
5
[root@centos8 ~]#df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -n
0
0
0
1
1
1
5
15
[root@centos8 ~]#df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -n |tail -n1
15
[root@centos8 ~]#df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -nr
15
5
1
1
1
0
0
0
[root@centos8 ~]#df | tr -s " " %|cut -d% -f5|tr -d '[:alpha:]' | sort -nr|head -n1
15



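#Example:
Given two files, a.txt and b.txt, merge them and output only the numbers that occur once
#every number in a.txt is unique within that file
200
100
34556
23
...
#every number in b.txt is unique within that file
123
43
200
3321
...
#that is, lines duplicated after merging are removed entirely, not kept once
100
34556
23
123
43
3321
...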
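
One possible solution to this exercise, offered as a sketch rather than the author's answer: sort both files together and keep only the lines that are not repeated, using `uniq -u`. It relies on the stated condition that each number is unique within its own file; the temporary directory and the variable `only_once` are illustrative:

```shell
# Lines occurring in both files appear twice after sorting; uniq -u drops them.
workdir=$(mktemp -d)
printf '%s\n' 200 100 34556 23 > "$workdir/a.txt"
printf '%s\n' 123 43 200 3321  > "$workdir/b.txt"
only_once=$(sort -n "$workdir/a.txt" "$workdir/b.txt" | uniq -u | paste -sd' ' -)
echo "$only_once"
rm -rf "$workdir"
```

The result comes out in sorted order, not in the order shown in the task; 200 is gone because it occurs in both files.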

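2.6.3 Removing duplicates with uniq

The uniq command removes adjacent duplicate lines from its input, so it is usually combined with sort:

#Format:
uniq [OPTION]... [FILE]...


#Common options:
-c: prefix each line with the number of times it occurred
-d: print only lines that are repeated
-u: print only lines that are not repeated


#uniq is usually combined with sort:
#Example:
sort userlist.txt | uniq -c


#Example: find the clients with the most requests in the log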
[root@centos8 data]#cut -d" " -f1 access_log |sort |uniq -c|sort -nr |head -3
 4870 172.20.116.228
 3429 172.20.116.208
 2834 172.20.0.222
[root@centos8 data]#lastb -f btmp-34 | tr -s ' ' |cut -d ' ' -f3|sort |uniq -c |sort -nr | head -3
 86294 58.218.92.37
 43148 58.218.92.26
 18036 112.85.42.201


#Example: remote hosts with the most concurrent connections
[root@centos8 ~]#ss -nt|tail -n+2 |tr -s ' ' : |cut -d: -f6|sort|uniq -c|sort -nr |head -n2
   7 10.0.0.1
   2 10.0.0.7


#Example: find the lines two files have in common and the lines that differ
[root@centos8 data]#cat test1.txt
a
b
1
c
[root@centos8 data]#cat test2.txt
b
e
f
c
1
2
#Lines common to both files
[root@centos8 data]#cat test1.txt test2.txt | sort |uniq -d
1
b
c
#Lines that appear in only one file
[root@centos8 data]#cat test1.txt test2.txt | sort |uniq -u
2
a
e
f

2.6.4 Comparing files

2.6.4.1 diff

The diff command shows the differences between two files

-u prints the diff in "unified" format, the format best suited to patch files

#Example:
[root@centos8 ~]#cat f1.txt
xin
zhang
liang
xu
[root@centos8 ~]#cat f2.txt
xinxin
zhangmiao
liang
xu
shi


[root@centos8 ~]#diff f1.txt f2.txt
1,2c1,2
< xin
< zhang
---
> xinxin
> zhangmiao
4a5
> shi
[root@centos8 ~]#diff -u f1.txt f2.txt
--- f1.txt 2019-12-13 21:31:30.892775671 +0800
+++ f2.txt 2019-12-13 22:00:14.373677728 +0800
@@ -1,4 +1,5 @@
-xin
-zhang
+xinxin
+zhangmiao
 liang
 xu
+shi
[root@centos8 ~]#diff -u f1.txt f2.txt > f.patch
[root@centos8 ~]#rm -f f2.txt
[root@centos8 ~]#patch -b f1.txt f.patch
patching file f1.txt
[root@centos8 ~]#cat f1.txt
xinxin
zhangmiao
liang
xu
shi
[root@centos8 ~]#cat f1.txt.orig
xin
zhang
liang
xu

2.6.4.2 patch

patch applies changes recorded in another file, typically diff output (use with care)

-b backs up each changed file automatically
#Example:
diff -u foo.conf foo2.conf > foo.patch
patch -b foo.conf foo.patch

2.6.4.3 vimdiff

Equivalent to vim -d

[root@centos8 ~]#cat f1.txt
xin
zhangmiao
liang
lilaoshi
zhao
[root@centos8 ~]#cat f2.txt
xin
zhang
liang
li
zhao
[root@centos8 ~]#which vimdiff
/usr/bin/vimdiff
[root@centos8 ~]#ll /usr/bin/vimdiff
lrwxrwxrwx. 1 root root 3 Nov 12  2019 /usr/bin/vimdiff -> vim
[root@centos8 ~]#vimdiff f1.txt f2.txt

2.6.4.4 cmp

#Example: compare two binary files
[root@centos8 data]#ll /usr/bin/dir /usr/bin/ls
-rwxr-xr-x. 1 root root 166448 May 12  2019 /usr/bin/dir
-rwxr-xr-x. 1 root root 166448 May 12  2019 /usr/bin/ls
[root@centos8 data]#ll /usr/bin/dir /usr/bin/ls -i
201839444 -rwxr-xr-x. 1 root root 166448 May 12  2019 /usr/bin/dir
201839465 -rwxr-xr-x. 1 root root 166448 May 12  2019 /usr/bin/ls
[root@centos8 data]#diff /usr/bin/dir /usr/bin/ls
Binary files /usr/bin/dir and /usr/bin/ls differ

[root@centos8 ~]#cmp /bin/dir /bin/ls
/bin/dir /bin/ls differ: byte 737, line 2

#Skip the first 735 bytes and inspect the following 30 bytes
[root@centos8 ~]#hexdump -s 735 -Cn 30 /bin/ls
000002df  00 05 6d da 3f 1b 77 91  91 63 a7 de 55 63 a2 b9 |..m.?.w..c..Uc..|
000002ef d9 d2 45 55 4c 00 00 00  00 03 00 00 00 7d    |..EUL........}|
000002fd
[root@centos8 ~]#hexdump -s 735 -Cn 30 /bin/dir
000002df  00 f1 21 4e f2 19 7e ef  38 0d 9b 3e d7 54 08 39 |..!N..~.8..>.T.9|
000002ef e4 74 4d 69 25 00 00 00  00 03 00 00 00 7d    |.tMi%........}|
000002fd
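
In scripts, cmp is often used only for its exit status. A minimal sketch, assuming a POSIX cmp; the helper `same_or_not` and the three scratch files are made up for illustration:

```shell
# cmp -s prints nothing; the exit status alone says whether the bytes match.
scratch=$(mktemp -d)
printf 'abc\n' > "$scratch/one"
printf 'abc\n' > "$scratch/two"
printf 'abd\n' > "$scratch/three"
same_or_not() { if cmp -s "$1" "$2"; then echo same; else echo differ; fi; }
r1=$(same_or_not "$scratch/one" "$scratch/two")
r2=$(same_or_not "$scratch/one" "$scratch/three")
echo "$r1 $r2"
rm -rf "$scratch"
```

Exit status 0 means the files are identical and 1 means they differ.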