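A Linux Command a Day: awk, One of the Three Swordsmen
awk: a pattern scanning and data processing language
Description: awk is a programming language for scanning and processing text and data on Linux/Unix. Its input can come from standard input, files, or pipes. It reads the input line by line and compares each line against the given pattern: if the line matches, awk runs the corresponding action; otherwise it leaves the line alone. If no action is given, matching lines are printed to standard output (the default action is print). If no pattern is given, every line matches.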
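The pattern/action defaults described above can be sketched as follows. The three-line sample input is made up for illustration and is not from the original post.

```shell
# Hypothetical sample input: three lines, two fields each.
input=$(printf 'alpha 1\nbeta 2\ngamma 3\n')

# Pattern only: the default action prints every matching line.
pattern_only=$(printf '%s\n' "$input" | awk '/^b/')

# Action only: with no pattern, the action runs for every line.
action_only=$(printf '%s\n' "$input" | awk '{print $2}')

# Pattern and action together: print field 1 where field 2 is at least 2.
both=$(printf '%s\n' "$input" | awk '$2 >= 2 {print $1}')

printf '%s\n' "$pattern_only" "$action_only" "$both"
```

Here pattern_only holds the single line "beta 2", action_only holds the second field of every line, and both holds "beta" and "gamma".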
Syntax:
awk [ POSIX or GNU style options ] -f program-file [ -- ] file ...
awk [ POSIX or GNU style options ] [ -- ] program-text file ...
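Options:
-F fs Use fs as the input field separator (by default, runs of spaces and tabs)
-v var=val Assign the value val to the variable var before the program starts
-f program-file Read the awk program from a file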
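Built-in variables:
ARGC Number of command-line arguments
ARGV Array of command-line arguments, indexed from 0 to ARGC-1
ARGIND Index in ARGV of the file currently being processed (gawk extension)
FILENAME Name of the current input file
FNR Record number within the current input file
NR Number of the current record in the whole input stream (the line number)
FS Input field separator
NF Number of fields in the current record
OFS Output field separator, a space by default
RS Input record separator, a newline (\n) by default
ORS Output record separator, a newline (\n) by default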
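A minimal sketch of how a few of these variables behave. The file names one.txt and two.txt and their contents are invented for the example.

```shell
# Two small throwaway files (hypothetical names and contents).
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\n' > "$tmpdir/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
counters=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR}' one.txt two.txt)

# RS=";" makes each semicolon-separated chunk one record;
# ORS="\n" ends every printed record with a newline.
records=$(printf 'x;y;z' | awk 'BEGIN{RS=";"; ORS="\n"} {print NR ": " $0}')

printf '%s\n' "$counters" "$records"
rm -rf "$tmpdir"
```

The first command shows NR reaching 3 on the line from two.txt while FNR is back to 1. The second shows that changing RS changes what awk treats as a "line".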
AWK patterns may be one of the following:
BEGIN
END
/regular expression/
relational expression
pattern && pattern
pattern || pattern
pattern ? pattern : pattern
(pattern)
! pattern
pattern1, pattern2
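A few of the pattern forms listed above, tried on a made-up four-line input. This is only a sketch; the sample text is not from the original post.

```shell
# Hypothetical sample input.
lines=$(printf 'one\ntwo words\nthree little words\nfour\n')

# Relational expression: records with at least two fields.
relational=$(printf '%s\n' "$lines" | awk 'NF >= 2')

# Negated regular expression: records that do not contain "w".
negated=$(printf '%s\n' "$lines" | awk '!/w/')

# Range pattern1, pattern2: from record 2 through record 3.
ranged=$(printf '%s\n' "$lines" | awk 'NR == 2, NR == 3')

# END runs once, after all input has been read.
total=$(printf '%s\n' "$lines" | awk 'END {print NR}')

printf '%s\n' "$relational" "$negated" "$ranged" "$total"
```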
Examples:
-F: specify the field separator
[root@python ~]# cat test.txt
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
[root@python ~]# awk '{print $2}' test.txt
48049/tcp
48128/udp
48129/tcp
48129/udp
48556/udp
48619/tcp
[root@python ~]# awk -F'/' '{print $2}' test.txt
tcp # 3GPP Cell Broadcast Service Protocol
udp # Image Systems Network Services
tcp # Bloomberg locator
udp # Bloomberg locator
udp # com-bardac-dw
tcp # iqobject
# Use a space or / as the field separator; + means the preceding bracket expression repeats one or more times
[root@python ~]# awk -F'[ /]+' '{print $2}' test.txt
48049
48128
48129
48129
48556
48619
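-v: assign a variable
# $a means "field number a", so with a=2 this prints the second field
[root@python ~]# awk -v a=2 '{print $a}' test.txt
48049/tcp
48128/udp
48129/tcp
48129/udp
48556/udp
48619/tcp
# Without the $, a is the value itself
[root@python ~]# awk -v a=342 'BEGIN{print a}'
342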
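A common use of -v is passing a shell variable into the awk program. The sketch below assumes a shell variable named threshold and a cut-down version of the sample data; both are illustrative choices, not part of the original post.

```shell
# Shell variable to hand over to awk (hypothetical name and value).
threshold=48129

# -v copies the shell value into the awk variable "min" before input is read.
services=$(printf 'blp5 48129\niqobject 48619\n3gpp-cbsp 48049\n' |
  awk -v min="$threshold" '$2 >= min {print $1}')

# $a is the field whose number is stored in a; a alone is the stored value.
field_vs_value=$(echo "x y z" | awk -v a=2 '{print $a, a}')

printf '%s\n' "$services" "$field_vs_value"
```

Using -v avoids splicing shell text into the awk program itself, so the program can stay inside single quotes.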
-f: read the awk program from a file
# The awk script file
[root@python ~]# cat a.txt
/^$/ {print "BLANK LINE"}
# Prints BLANK LINE once for every blank line in the input
[root@python ~]# awk -f a.txt /etc/ssh/ssh_config
BLANK LINE
BLANK LINE
BLANK LINE
BLANK LINE
Records and fields
# $0 is the whole current record
[root@python ~]# echo "I am a bird"|awk '{print $0}'
I am a bird
[root@python ~]# echo "I am a bird"|awk '{print $1,$2}'
I am
# NF is the total number of fields (think of it as the number of columns)
[root@python ~]# echo "I am a bird"|awk '{print NF}'
4
# $NF is the last field
[root@python ~]# echo "I am a bird"|awk '{print $NF}'
bird
# NR is the number of the current record in the input stream (think of it as the line number)
[root@python ~]# echo "I am a bird"|awk '{print NR}'
1
[root@python ~]# echo -e "I am a bird\nhello"|awk '{print NR}'
1
2
OFS: set the output field separator
[root@python ~]# echo "I am a bird"|awk 'BEGIN{OFS="#"}{print $1,$2,$3,$4}'
I#am#a#bird
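One detail worth knowing: OFS is applied when print joins comma-separated arguments, or when awk rebuilds $0 after a field has been assigned. Printing an untouched $0 keeps the original spacing. A small sketch:

```shell
# $0 has not been modified, so OFS is not applied.
unchanged=$(echo "I am a bird" | awk 'BEGIN{OFS="#"} {print $0}')

# Assigning any field (even to itself) makes awk rebuild $0 using OFS.
rebuilt=$(echo "I am a bird" | awk 'BEGIN{OFS="#"} {$1 = $1; print $0}')

printf '%s\n' "$unchanged" "$rebuilt"
```

The $1 = $1 idiom is handy when a record has many fields and listing them all in print is impractical.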
Regular expression matching
# Lines containing tcp
[root@python ~]# awk '/tcp/{print $0}' test.txt
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
blp5 48129/tcp # Bloomberg locator
iqobject 48619/tcp # iqobject
# Lines starting with blp5
[root@python ~]# awk '/^blp5/{print $0}' test.txt
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
# Lines ending with tor
[root@python ~]# awk '/tor$/{print $0}' test.txt
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
# Logical OR: ||
[root@python ~]# awk '/blp5/||/3gpp/{print $0}' test.txt
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
# Logical AND: &&
[root@python ~]# awk '/blp5/&&/tcp/{print $0}' test.txt
blp5 48129/tcp # Bloomberg locator
# Logical NOT: !
[root@python ~]# awk '!/blp5/&& !/3gpp/{print $0}' test.txt
isnetserv 48128/udp # Image Systems Network Services
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
Range matching
# Print from the first line matching 3gpp through the next line matching blp5
[root@python ~]# awk '/3gpp/,/blp5/{print $0}' test.txt
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
BEGIN and END blocks (printing a header and a footer)
# Print a header
[root@python ~]# awk 'BEGIN{print "SERVICE\t\tPORT\t\t\tDESCRIPTION"}{print $0}' test.txt
SERVICE PORT DESCRIPTION
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
# Print a footer
[root@python ~]# awk '{print $0}END{print "The ending..."}' test.txt
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
The ending...
# Print both a header and a footer
[root@python ~]# awk 'BEGIN{print "SERVICE\t\tPORT\t\t\tDESCRIPTION"}{print $0}END{print "The ending..."}' test.txt
SERVICE PORT DESCRIPTION
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
The ending...
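Expressions and operators: a variable that has not been initialized in awk starts out as the empty string or 0, depending on context. String values must always be quoted (a="How are you").
# Count the blank lines, printing a running total
[root@python ~]# awk '/^$/{print x+=1}' /etc/ssh/ssh_config
1
2
3
4
# Print only the total number of blank lines
[root@python ~]# awk '/^$/{x+=1}END{print x}' /etc/ssh/ssh_config
4
# ~ (matches) and !~ (does not match). Print root's UID: select lines whose first field matches the regex root, then print the third field
[root@python ~]# awk -F':' '$1~/root/{print $3}' /etc/passwd
0
# Print the users whose UID is greater than 400
[root@python ~]# awk -F':' '$3>400{print $1}' /etc/passwd
saslauth
mysql
dianel
# Count those users
[root@python ~]# awk -F':' '$3>400{x+=1}END{print x}' /etc/passwd
3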
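Note that $1 ~ /root/ matches any first field that contains "root", not only the field that equals it. The sketch below uses invented passwd-style lines (the user rootless is hypothetical) to compare the regex match with an exact comparison.

```shell
# Hypothetical passwd-style sample; "rootless" is a made-up account.
sample=$(printf 'root:x:0:0\nrootless:x:1001:1001\ndaemon:x:2:2\n')

# Regex match: any first field containing "root".
contains=$(printf '%s\n' "$sample" | awk -F':' '$1 ~ /root/ {print $3}')

# String equality: only the field that is exactly "root".
exact=$(printf '%s\n' "$sample" | awk -F':' '$1 == "root" {print $3}')

# Anchored regex: equivalent to the exact comparison here.
anchored=$(printf '%s\n' "$sample" | awk -F':' '$1 ~ /^root$/ {print $3}')

printf '%s\n' "$contains" "$exact" "$anchored"
```

On a system where no other account name contains "root", all three forms print the same result, which is why the simpler regex form usually appears to work.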
Advanced awk usage
if conditionals
# Raise an alert when the free space on the boot partition drops below about 20 MB (assuming df reports 1K blocks); otherwise print ok
[root@python ~]# df|grep 'boot'|awk '{if($4<20000)print "alert";else print "ok"}'
ok
[root@python ~]# seq 5|awk '{if($0==3)print $0}'
3
[root@python ~]# seq 5|awk '{if($0==3)print $0;else print "no"}'
no
no
3
no
no
while loops: syntax
1. while (condition) statement
2. do statement while (condition)
Both forms in use:
# i and total are never initialized, so both start at 0
[root@python ~]# awk 'BEGIN{do {i++;total+=i}while(i<100);print total}'
5050
[root@python ~]# awk 'BEGIN{while(i<100){i++;total+=i}print total}'
5050
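The two forms differ when the condition is false from the start: while tests before the first pass, whereas do ... while always runs the body once. A small sketch, with the starting value 10 picked arbitrarily:

```shell
# while: the condition is checked first, so the body never runs.
while_runs=$(awk 'BEGIN {
  i = 10
  while (i < 5) { n++; i++ }
  print n + 0   # n + 0 forces the unset n to print as 0
}')

# do ... while: the body runs once before the condition is checked.
do_runs=$(awk 'BEGIN {
  i = 10
  do { n++; i++ } while (i < 5)
  print n + 0
}')

printf '%s\n' "$while_runs" "$do_runs"
```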
for loops: syntax
1. for (expr1; expr2; expr3) statement
2. for (var in array) statement
[root@python ~]# awk 'BEGIN{for(i=0;i<101;i++){total+=i} print total}'
5050
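The second form, for (var in array), walks over the keys of an associative array. As a sketch, the command below counts tcp and udp entries in data shaped like test.txt (with the description column left out). The array name count is an arbitrary choice, and the output goes through sort because awk does not guarantee the order of for-in iteration.

```shell
counts=$(printf '%s\n' \
  '3gpp-cbsp 48049/tcp' \
  'isnetserv 48128/udp' \
  'blp5 48129/tcp' \
  'blp5 48129/udp' \
  'com-bardac-dw 48556/udp' \
  'iqobject 48619/tcp' |
  awk -F'[ /]+' '
    { count[$3]++ }                  # $3 is the protocol: tcp or udp
    END {
      for (proto in count)           # iterate over the array keys
        print proto, count[proto]
    }' | sort)

printf '%s\n' "$counts"
```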
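break and continue
break: exits the enclosing loop immediately
continue: skips the rest of the current iteration and moves on to the next one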
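A minimal sketch showing both statements in one loop. The bounds 10 and 7 are arbitrary values chosen for illustration.

```shell
odd_up_to_seven=$(awk 'BEGIN {
  for (i = 1; i <= 10; i++) {
    if (i % 2 == 0) continue   # even number: skip to the next iteration
    if (i > 7) break           # past 7: leave the loop entirely
    out = out (out == "" ? "" : " ") i   # append i, space-separated
  }
  print out
}')

printf '%s\n' "$odd_up_to_seven"
```

The loop collects 1, 3, 5 and 7: continue drops the even numbers, and break ends the loop when i reaches 9.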
Printing the IP address:
[root@python ~]# ifconfig eth0|awk '/Bcast/'|awk -F'[ :]+' '{print $4}'
192.168.1.13
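The two awk calls can be merged into one, since a single awk program can both select the line and split it. The sketch below runs on a hard-coded line in the older net-tools ifconfig format rather than on real ifconfig output; the line itself is an assumption, reconstructed to be consistent with the result shown above.

```shell
# Assumed shape of the relevant ifconfig line (older net-tools format).
sample_line='          inet addr:192.168.1.13  Bcast:192.168.1.255  Mask:255.255.255.0'

# The leading blanks produce an empty $1, so the address lands in $4.
ip=$(printf '%s\n' "$sample_line" | awk -F'[ :]+' '/Bcast/ {print $4}')

printf '%s\n' "$ip"
```

Newer ifconfig versions print this line in a different layout, so the field number would need adjusting there.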
2017/4/24 21:13:01
Reposted from: https://blog.51cto.com/dianel/1950132