格式
awk ' BEGIN{ print "start" } pattern { commands } END{ print "end" } file
特殊变量
[clz@localhost ~]$ echo -e "line1 f2 f3\nline2 f4 f5\nline3 f6 f7" | awk '{
print "Line no:"NR,"No of fields:"NF , "$0="$0, "$1="$1, "$2="$2, "$3="$3
}'
Line no:1 No of fields:3 $0=line1 f2 f3 $1=line1 $2=f2 $3=f3
Line no:2 No of fields:3 $0=line2 f4 f5 $1=line2 $2=f4 $3=f5
Line no:3 No of fields:3 $0=line3 f6 f7 $1=line3 $2=f6 $3=f7
外部值传递给awk
# -v将外部数值传入
[clz@localhost ~]$ echo | awk -v VARIABLE=$VAR '{ print VARIABLE }'
10000
# 多个外部变量传递给awk
[clz@localhost ~]$ var1="Variable1";var2="Variable2"
[clz@localhost ~]$ echo | awk '{ print v1, v2 }' v1=$var1 v2=$var2
Variable1 Variable2
上面的方法中,变量之间用空格分割,以键—值对的形式作为awk的命名行参数紧随在BEGIN,{},和END语句块之后
用getline读取行
[clz@localhost ~]$ seq 5 | awk 'BEGIN { getline; print "Read ahead first line", $0} { print $0 }'
Read ahead first line 1
2
3
4
5
使用过滤对awk处理的行进行过滤
字段定界符
awk -F: '{ print $NF }' /etc/passwd
awk 'BEGIN { FS=":" }{ print $NF }' /etc/passwd
从awk中读取命令输出
"command" | getline output;
ex:
[clz@localhost ~]$ echo | awk '{ "grep root /etc/passwd" | getline cmdout; print cmdout }'
root:x:0:0:root:/root:/bin/bash
通过使用getline , 将外部命令输出读入变量cmdout
在awk中使用循环
awk 内建字符串控制函数
统计词频
#!/bin/bash
# Filename: word_freq.sh
if [ $# -ne 1 ]
then
echo "Usage: $0 filename";
exit -1
fi
filename=$1
# 正则出单词安行输出
egrep -o "\b[[:alpha:]]+\b" $filename | \
awk '{ count[$0]++}
END{ printf("%-14s%s\n", "Word","count") ;
for ( ind in count)
{
printf("%-14s%s\n",ind,count[ind]); }
}'