awk, one of the Linux "three swordsmen" (grep, sed, awk)
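Print the whole file: awk '{print}' test.txt
Print the first column: awk '{print $1}' test.txt
Print the first and second columns separated by a space: awk '{print $1" "$2}' test.txt
Print the first and second columns separated by a tab: awk '{print $1"\t"$2}' test.txt
Print every line that contains "m": awk '/m/ {print $0}' test.txt
Count the lines that contain "m" and print the total: awk '/m/{count++} END{print count}' test.txt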
Print lines whose second column is greater than 300: awk '$2>300' test.txt
Print lines whose fourth field is longer than 8 characters: awk 'length($4)>8' test.txt
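The one-liners above can be tried end to end with a small sketch like the following. The contents of test.txt are not shown in this article, so the sample data here is made up purely for illustration.

```shell
# Hypothetical sample data; the real test.txt is not shown in the article.
tmp=$(mktemp -d)
printf 'bob 150 a alphabet\nmary 320 b longer-than-8\njim 410 c short\n' > "$tmp/test.txt"

# Second column of every line
col=$(awk '{print $2}' "$tmp/test.txt")

# Number of lines containing "m" (lines, not individual characters);
# "+ 0" makes the result 0 instead of an empty string when nothing matches
count=$(awk '/m/ {count++} END {print count + 0}' "$tmp/test.txt")

# Lines whose second field exceeds 300 (a pattern without an action prints the line)
big=$(awk '$2 > 300' "$tmp/test.txt")

# Lines whose fourth field is longer than 8 characters
long=$(awk 'length($4) > 8' "$tmp/test.txt")

printf '%s\n' "$col" "$count" "$big" "$long"
rm -rf "$tmp"
```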
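ARGC:
Count the files passed on the command line: awk 'BEGIN{print ARGC-1}' file1 file2 file3
Set the input file name inside the awk script (ARGC must be raised as well, otherwise awk ignores the new entry and reads standard input): awk 'BEGIN{ARGV[1]="file1.txt"; ARGC=2} {print}'
Access a command-line argument (printed once per input line): awk '{print "The argument is:", ARGV[1]}' file.txt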
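ARGV:
Operate on each file name passed on the command line (the loop belongs in BEGIN so that it runs once rather than once per input line):
awk 'BEGIN {
    for (i = 1; i < ARGC; i++) {
        print "Processing file: " ARGV[i]
        # do something with each file
    }
}' file1.txt file2.txt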
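A minimal sketch of how ARGC and ARGV work together. The file names and the variable f are made up for the example. Two things are worth noticing: a program that consists only of BEGIN never opens its operands, and an entry added to ARGV is only seen if ARGC covers it.

```shell
tmp=$(mktemp -d)
printf 'one\ntwo\n' > "$tmp/file1.txt"

# List the operands in BEGIN: runs once, before any input is read.
# a.txt and b.txt do not need to exist because no input is ever opened.
args=$(awk 'BEGIN { for (i = 1; i < ARGC; i++) print i ": " ARGV[i] }' a.txt b.txt)

# Add an input file from inside the script; ARGC is raised to cover the new slot.
added=$(awk -v f="$tmp/file1.txt" 'BEGIN { ARGV[1] = f; ARGC = 2 } { print }')

printf '%s\n' "$args" "$added"
rm -rf "$tmp"
```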
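FNR:
Print each line with its line number, restarting the count for each new file: awk '{print $0,FNR}' test.txt test2.txt
NR:
Print each line with its line number, counting continuously across files: awk '{print $0,NR}' test.txt test2.txt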
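Printing NR and FNR side by side makes the difference easy to see. This is a small sketch with made-up file contents: NR keeps counting across files, while FNR restarts at 1 for each file.

```shell
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/test.txt"
printf 'c\nd\n' > "$tmp/test2.txt"

# Columns: NR (global record number), FNR (per-file record number), the line itself
out=$(awk '{ print NR, FNR, $0 }' "$tmp/test.txt" "$tmp/test2.txt")

printf '%s\n' "$out"
# 1 1 a
# 2 2 b
# 3 1 c
# 4 2 d
rm -rf "$tmp"
```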
NF:
Print each line followed by its number of fields (columns): awk '{print $0,NF}' test.txt
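Because NF holds the number of fields, $NF refers to the last field of the current line. A short sketch with made-up input:

```shell
# Print the field count and the last field of each line
out=$(printf 'a b c\nd e\n' | awk '{ print NF, $NF }')

printf '%s\n' "$out"
# 3 c
# 2 e
```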
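RS:
Use a tab as the input record separator, so that every tab starts a new record: awk -v RS='\t' '{print $0,NR}' test.txt
ORS:
Change the output record separator from a newline to a tab (or to whatever you need): awk -v ORS='\t' '{print $0,NR}' test.txt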
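A sketch of both separators on made-up input. Note that once RS is a tab, a newline no longer ends a record, so the sample input for RS deliberately has no trailing newline. With ORS set to a tab, every print ends with a tab, including the last one.

```shell
# RS: each tab-separated chunk becomes its own record
in_records=$(printf 'a\tb\tc' | awk -v RS='\t' '{ print NR ": " $0 }')

# ORS: each output record is terminated by a tab instead of a newline
joined=$(printf 'a\nb\nc\n' | awk -v ORS='\t' '{ print }')

printf '%s\n' "$in_records"
printf '[%s]\n' "$joined"
```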
FILENAME:
Show the name of the file currently being processed: awk '/m/ { print FILENAME, $0 }' test.txt test2.txt
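CONVFMT:
Set the precision used when a number is converted to a string to two decimal places (print formats numbers with OFMT, not CONVFMT, so the conversion is forced here by concatenating with an empty string):
awk 'BEGIN {
    CONVFMT = "%.2f"
    x = 123456789.1234567
    y = 1.23456789
    a = x ""    # concatenation converts the number to a string using CONVFMT
    b = y ""
    print a, b
}'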
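CONVFMT and OFMT are easy to mix up: CONVFMT applies when a number is turned into a string (for example by concatenation), while print formats numeric values with OFMT. Integer values are converted as integers in both cases. A minimal sketch, with variable names chosen only for illustration:

```shell
out=$(awk 'BEGIN {
    x = 3.14159265
    CONVFMT = "%.2f"
    OFMT = "%.3f"
    s = x ""        # number-to-string conversion uses CONVFMT
    print s         # s is already a string, printed as is
    print x         # print formats a numeric value with OFMT
    n = 42
    print n ""      # integer values ignore CONVFMT
}')

printf '%s\n' "$out"
# 3.14
# 3.142
# 42
```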
ENVIRON:
Print the current user's home directory and the value of the PATH environment variable: awk 'BEGIN { print "HOME=" ENVIRON["HOME"]; print "PATH=" ENVIRON["PATH"] }'
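ENVIRON also sees variables set only for the awk invocation itself. In this sketch GREETING is a made-up variable name used purely for illustration.

```shell
# Set a variable for this one command and read it back through ENVIRON
out=$(GREETING=hello awk 'BEGIN { print "GREETING=" ENVIRON["GREETING"] }')

printf '%s\n' "$out"
# GREETING=hello
```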
FS:
Split each line on ",": awk 'BEGIN{FS=","}{print $1}' test3.txt
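The field separator can be set either in BEGIN or with the -F option; both forms should give the same result. A sketch on made-up comma-separated input (the real test3.txt is not shown in this article):

```shell
# Setting FS in BEGIN
a=$(printf 'id,name,city\n1,Ann,Oslo\n' | awk 'BEGIN { FS = "," } { print $2 }')

# Equivalent form with the -F command-line option
b=$(printf 'id,name,city\n1,Ann,Oslo\n' | awk -F',' '{ print $2 }')

printf '%s\n' "$a" "$b"
```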
RLENGTH, RSTART:
After a regular-expression match with the match() function, RSTART holds the position of the first character of the matched substring, and RLENGTH is set to the length of the most recent match:
awk 'BEGIN {
    str = "hello world"
    if (match(str, /world/)) {
        print "Matched string: " substr(str, RSTART, RLENGTH)
        print "Matched length: " RLENGTH
    }
}'
Output:
Matched string: world
Matched length: 5
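It also helps to see what happens when nothing matches: match() returns 0, RSTART is set to 0, and RLENGTH is set to -1. A sketch with a made-up string:

```shell
out=$(awk 'BEGIN {
    str = "order-2024-ok"
    # Successful match: the first run of digits starts at position 7 and is 4 long
    if (match(str, /[0-9]+/))
        print RSTART, RLENGTH, substr(str, RSTART, RLENGTH)
    # Failed match: return value 0, RSTART 0, RLENGTH -1
    r = match(str, /xyz/)
    print r, RSTART, RLENGTH
}')

printf '%s\n' "$out"
# 7 4 2024
# 0 0 -1
```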
ARGIND (gawk extension):
The index in ARGV of the file currently being processed:
awk '{ print FNR, ARGIND, $0 }' test.txt test2.txt
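ARGIND is only available in gawk. On other awk implementations, a rough substitute is to bump a counter whenever FNR returns to 1. This is only a sketch: the counter name idx is made up, and because empty files produce no records it counts non-empty files seen so far rather than the true ARGV position.

```shell
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/test.txt"
printf 'c\n' > "$tmp/test2.txt"

# idx (hypothetical name) increases each time a new file starts
out=$(awk 'FNR == 1 { idx++ } { print idx, FNR, $0 }' "$tmp/test.txt" "$tmp/test2.txt")

printf '%s\n' "$out"
# 1 1 a
# 1 2 b
# 2 1 c
rm -rf "$tmp"
```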
Practical examples of working with real files will follow later.