[Linux] Understanding Awk in Depth

Introduction

man awk

awk

NAME
       awk - pattern-directed scanning and processing language

SYNOPSIS
       awk [ -F fs ] [ -v var=value ] [ 'prog' | -f progfile ] [ file ...  ]

DESCRIPTION
       Awk  scans  each  input file for lines that match any of a set of patterns specified literally in prog or in one or more
       files specified as -f progfile.  With each pattern there can be an associated action that will be performed when a  line
       of  a file matches the pattern.  Each line is matched against the pattern portion of every pattern-action statement; the
       associated action is performed for each matched pattern. 
  • awk was originally written in 1977 by Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan.
  • awk is a programming language.

Benchmarks

bbs_list

Each row describes one computer bulletin board:

  • board name
  • phone number
  • baud rate(s)
  • a code
 aardvark    555-5553    1200/300          B
 alpo-net    555-3412    2400/1200/300     A
 barfly      555-7685    1200/300          A
 bites       555-1675    2400/1200/300     A
 camelot     555-0542    300               C
 core        555-2912    1200/300          C
 fooey       555-1234    2400/1200/300     B
 foot        555-6699    1200/300          B
 macfoo      555-6480    1200/300          A
 sdace       555-3430    2400/1200/300     A
 sabafoo     555-2127    1200/300          C

inventory-shipped

Each row records one month of shipments:

  • month
  • green crates shipped
  • red boxes shipped
  • orange bags shipped
  • blue packages shipped
 Jan  13  25  15 115
 Feb  15  32  24 226
 Mar  15  24  34 228
 Apr  31  52  63 420
 May  16  34  29 208
 Jun  31  42  75 492
 Jul  24  34  67 436
 Aug  15  34  47 316
 Sep  13  55  37 277
 Oct  29  54  68 525
 Nov  20  87  82 577
 Dec  17  35  61 401
 Jan  21  36  64 620
 Feb  26  58  80 652
 Mar  24  75  70 495
 Apr  21  70  74 514

e1. Find lines that contain foo

MacBook-Pro-3:benchmarks sunquan$ awk '/foo/{print $0}' bbs_list 
fooey       555-1234    2400/1200/300     B
foot        555-6699    1200/300          B
macfoo      555-6480    1200/300          A
sabafoo     555-2127    1200/300          C
  • /foo/ :pattern
  • {print $0} :action

e2. Find lines that contain the string 12 or 21

MacBook-Pro-3:benchmarks sunquan$ awk '/12/{print $0}/21/{print $0}' bbs_list inventory-shipped 
aardvark    555-5553    1200/300          B
alpo-net    555-3412    2400/1200/300     A
barfly      555-7685    1200/300          A
bites       555-1675    2400/1200/300     A
core        555-2912    1200/300          C
fooey       555-1234    2400/1200/300     B
foot        555-6699    1200/300          B
macfoo      555-6480    1200/300          A
sdace       555-3430    2400/1200/300     A
sabafoo     555-2127    1200/300          C
sabafoo     555-2127    1200/300          C
Jan  21  36  64 620
Apr  21  70  74 514
  • If a line contains both strings, it is printed twice, once by each rule.

e3. Sum the sizes of the files in the current directory that were last modified in July

MacBook-Pro-3:benchmarks sunquan$ ls -l
total 16
-rw-r--r--  1 sunquan  staff  484  7 13 11:05 bbs_list
-rw-r--r--  1 sunquan  staff  320  7 13 11:22 inventory-shipped
MacBook-Pro-3:benchmarks sunquan$ ls -l | awk '$6 == "7" {sum += $5} END {print sum}'
804
MacBook-Pro-3:benchmarks sunquan$

Program

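You write an awk program as a series of rules that tell awk what to do.

  • awk reads the input files and checks, line by line, whether each line matches a rule in the program.
  • If a line matches a rule, awk runs that rule's action. This repeats until the last line has been processed.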
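
The rule-matching loop described above can be sketched with a two-rule program. This is only an illustration: the helper name demo_rules and the three input lines are made up, not taken from the benchmark files.

```shell
# demo_rules: hypothetical helper; the input lines are made up for illustration.
demo_rules() {
    printf 'foo 1\nbar 2\nfoobar 3\n' |
    awk '
        /foo/ { print "rule 1 matched: " $0 }   # runs for every line containing foo
        /bar/ { print "rule 2 matched: " $0 }   # runs for every line containing bar
    '
}
demo_rules
```

Every line is tested against every rule, so the line foobar 3 triggers both actions.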

Program format

  1. awk 'program' input-file1 input-file2

  2. awk -f program-file input-file1 input-file2

  3. A program looks like: pattern { action } pattern { action } …

Record

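awk splits its input into records. The built-in variable RS sets the record separator; its default value is \n, so by default each line is one record.

awk '{ print $0 }' RS="/" bbs_list
or
awk 'BEGIN { RS = "/" } ; { print $0 }' bbs_list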
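
To see the effect of RS without depending on bbs_list, here is a small sketch. The helper name demo_records and the input string a/b/c are assumptions made for illustration.

```shell
# demo_records: hypothetical helper; the input string is made up for illustration.
demo_records() {
    printf 'a/b/c' |
    awk 'BEGIN { RS = "/" } { print NR ": " $0 }'   # NR counts records, not lines
}
demo_records
```

With RS set to "/", the single input line is split into three records, and NR numbers them 1 to 3.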

Field

Each record is automatically split into fields. You refer to a field with $ followed by its number. For example, given the record:

This seems like a pretty nice example
  • $1 : This
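
The same sample record can be fed to awk directly. This is a minimal sketch; the helper name demo_fields is hypothetical.

```shell
# demo_fields: hypothetical helper that prints the first field, the last field,
# and the field count of the sample record.
demo_fields() {
    echo 'This seems like a pretty nice example' |
    awk '{ print $1; print $NF; print NF }'
}
demo_fields
```

The record has seven whitespace-separated fields, so $NF is the same as $7 here.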

e4. Arithmetic: total items shipped per month

awk '{ total = ($5 + $4 + $3 + $2) ; print total }' inventory-shipped

Built-in variables

Term   Description                                          Usage
RS     record separator                                     awk '{ print $0 }' RS="/" bbs_list
$NF    the last field                                       awk '$1 ~ /foo/ { print $1, $NF }' bbs_list
$0     the current record                                   awk '{ print $0 }' bbs_list
NF     the number of fields                                 awk 'END { print "the num of fields is " NF }' bbs_list
NR     record number (line number with the default RS)      awk '{ print NR, $0 }' bbs_list
FS     field separator                                      awk 'BEGIN { FS = "/" } ; { print $1 }' bbs_list
-F     sets FS on the command line (here to -)              awk -F- '{ print $1 }' bbs_list
OFS    output field separator                               (see the ORS example)
ORS    output record separator                              awk 'BEGIN { OFS = ";" ; ORS = "\n\n" } ; { print $1, $2 }' bbs_list
>      redirects output to a file                           awk 'BEGIN { OFS = ";" } ; { print $1, $2 > "result" }' bbs_list
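
As a rough sketch of how FS and OFS work together, the snippet below reads colon-separated input and writes semicolon-separated output. The helper name demo_separators and the colon-separated sample lines are assumptions; bbs_list itself is whitespace-separated.

```shell
# demo_separators: hypothetical helper; the colon-separated input is made up.
demo_separators() {
    printf 'aardvark:555-5553\nalpo-net:555-3412\n' |
    awk 'BEGIN { FS = ":"; OFS = ";" } { print NR, $1, $2 }'   # commas in print insert OFS
}
demo_separators
```

OFS only appears between items separated by commas in print; items placed side by side without a comma are concatenated with nothing between them.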

Patterns

Term                    Description                                 Usage
exp ~ /regexp/          true if exp matches regexp                  awk '$3 ~ /200/ { print $1, $3 }' bbs_list
exp !~ /regexp/         true if exp does not match regexp           awk '$3 !~ /200/ { print $1, $3 }' bbs_list
< <= > >= == != ~ !~    comparison expressions                      awk '$1 ~ /foo/ { print $1 }' bbs_list
&& || !                 Boolean operators                           awk '$1 ~ /foo/ && $3 ~ /300/ { print $1, $3 }' bbs_list
BEGIN                   runs once, before any input is read
END                     runs once, after all input has been read
empty pattern           matches every record                        awk '{ print $1 }' bbs_list
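
The pattern kinds above can be combined in one program. The sketch below is illustrative only: the helper name demo_patterns is hypothetical, and the input is just the first two columns of the first four inventory-shipped rows.

```shell
# demo_patterns: hypothetical helper; input is a two-column excerpt of inventory-shipped.
demo_patterns() {
    printf 'Jan 13\nFeb 15\nMar 15\nApr 31\n' |
    awk '
        BEGIN                 { print "month crates" }       # header, printed before any input
        $2 > 14 && $1 !~ /^A/ { print $1, $2 }               # comparison combined with a negated match
        END                   { print NR " records read" }   # summary, printed after all input
    '
}
demo_patterns
```

Jan is dropped by the numeric comparison and Apr by the negated regexp, so only Feb and Mar are printed between the header and the summary.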

Action

Expressions as Action

Term                    Description                                                    Usage
\\ \a \b \f \n \r \t    constant expressions (escape sequences in string constants)    awk 'BEGIN { print "hello \n world" }' bbs_list
var=text                variable assigned on the command line                          awk '{ print $var }' var=1 bbs_list
x op y                  operators                                                      awk '{ total = $2 + $3 / $4 ; print total }' inventory-shipped
fun(args)               function call                                                  awk 'BEGIN { result = rand() ; print result }' bbs_list

Control Statement in Action

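Term                                               Description    Usage
if (condition) then-body else else-body            if/else        awk '{ if ($2 % 2 == 0) print $2 " is even" ; else print $2 " is odd" }' inventory-shipped
while (condition) body                             while loop     (see the script after this table)
for (initialization; condition; increment) body    for loop       awk '{ for (i = 1; i <= 3; i++) print $i }' bbs_list
break                                              leave the innermost loop
continue                                           skip to the next iteration of the innermost loop
next                                               stop processing the current record and go on to the next record

#! /usr/bin/awk -f
# while example: print the first three fields of each record, one per line
{
    i = 1
    while (i <= 3) {
        print $i
        i++
    }
}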
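
The table gives no usage for break, continue, and next, so here is a small sketch of all three. The helper name demo_control and the input lines are assumptions made for illustration.

```shell
# demo_control: hypothetical helper; the input lines are made up for illustration.
demo_control() {
    printf '# comment\n1 2 3 4\n5 -1 6 7\n' |
    awk '
        /^#/ { next }                      # skip comment lines; later rules never see them
        {
            sum = 0
            for (i = 1; i <= NF; i++) {
                if ($i < 0) break          # stop at the first negative field
                if ($i % 2 == 0) continue  # ignore even fields
                sum += $i
            }
            print sum                      # sum of the odd fields seen before any negative one
        }
    '
}
demo_control
```

The first data line yields 1 + 3 = 4. The second yields 5, because the loop breaks at -1 before reaching 7.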

Arrays in Awk

Arrays in awk are associative, so they behave like maps.

  • array[index] reads a value
  • array[subscript] = value stores a value
  • you don't need to declare a size for an array
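#! /usr/bin/awk -f
# store each record under the number in its first field and track the largest number
{
    if ($1 > max)
        max = $1
    arr[$1] = $0
}
# print the records in numeric order
END {
    for (x = 1; x <= max; x++)
        print arr[x]
}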

## input data 
5 I am the Five man
2 Who are you? The new number two!
4 ... And four on the floor
1 Who is number one?
3 I three you.

## output result
1 Who is number one?
2 Who are you? The new number two!
3 I three you.
4 ... And four on the floor
5 I am the Five man
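
Because subscripts are strings, an array can also be keyed by text rather than by number. The sketch below counts boards per code. The helper name demo_arrays is hypothetical, and the two-column input is a trimmed excerpt of bbs_list, not the full file.

```shell
# demo_arrays: hypothetical helper; input keeps only the name and code columns of bbs_list.
demo_arrays() {
    printf 'aardvark B\nalpo-net A\nbarfly A\ncamelot C\n' |
    awk '
        { count[$2]++ }                 # string subscripts; elements are created on first use
        END {
            for (code in count)         # iteration order is unspecified
                print code, count[code]
        }
    ' | sort                            # sort so the output order is stable
}
demo_arrays
```

The for (code in count) loop visits every subscript once but in no guaranteed order, which is why the result is piped through sort.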

e5. awk scripts: hello world

MacBook-Pro-3:shell sunquan$ which awk
/usr/bin/awk
#! /usr/bin/awk -f
# demo: awkHelloWorld
BEGIN { print "hello, world" }

MacBook-Pro-3:shell sunquan$ awkHelloWorld
hello, world

Functions in Awk

 #! /usr/bin/awk -f
 #file name is awkHelloWorld
 function myFunc (win) {
     print "the value is " , win
 }
 function doBefore (){
     print "hello world"
 }
 function doLast (){
     print "the num of lines is ", NR
 }
 
 BEGIN {doBefore()} ; {myFunc($1)} ; END {doLast()}
 
 # result 
awkHelloWorld benchmarks/bbs_list 
hello world
the value is  aardvark
the value is  alpo-net
the value is  barfly
the value is  bites
the value is  camelot
the value is  core
the value is  fooey
the value is  foot
the value is  macfoo
the value is  sdace
the value is  sabafoo
the num of lines is  11
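
The functions above only print. A function can also return a value and keep local variables, which awk declares as extra parameters. The sketch below is an illustration under assumptions: the names demo_functions and max3 are hypothetical, and the input is the crate, box, and bag counts from the first two inventory-shipped rows.

```shell
# demo_functions: hypothetical helper; input is three columns from inventory-shipped.
demo_functions() {
    printf '13 25 15\n15 32 24\n' |
    awk '
        # max3: hypothetical function; m is a local variable declared as an extra parameter
        function max3(a, b, c,    m) {
            m = a
            if (b > m) m = b
            if (c > m) m = c
            return m
        }
        { print max3($1 + 0, $2 + 0, $3 + 0) }   # + 0 forces numeric comparison
    '
}
demo_functions
```

The extra spaces before m in the parameter list are only a convention that separates real parameters from locals.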

About the author

Work: Senior Engineer, Alibaba
email: sunquan9301@163.com
WX: sunquan97
HomePage: qsun97.com
