AWK
Syntax
awk [options] 'BEGIN{action} pattern{action} … END{action}' file
awk [options] -f program.awk file
options
-F fs: use fs as the input field separator
-v var=val: assign the value val to the variable var before execution of the program begins
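As a minimal sketch of the two options listed above, the following filters colon-separated records by a threshold passed in from the shell. The sample data and the names data, min_uid and min are hypothetical, chosen only for illustration.

```shell
# Hypothetical sample data: colon-separated name:uid pairs.
data='root:0
alice:1000
bob:1001'

# -F: sets the field separator to a colon.
# -v min=... assigns the awk variable min before the program starts.
min_uid=1000
result=$(printf '%s\n' "$data" | awk -F: -v min="$min_uid" '$2 >= min {print $1}')
echo "$result"
```

Passing the value with -v avoids splicing shell variables into the quoted awk program text.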
pattern
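/regex/: extended regular expression, matched against the current record
relational expression: a comparison such as $3 > 100 or $1 ~ /regex/; the action runs when it is true
pattern1, pattern2: range pattern; matches from a record matching pattern1 through the next record matching pattern2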
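The three pattern forms above can be sketched on a small input. The sample data and variable names are made up for illustration, and the NF == 2 guard is an added assumption so that one-field lines are never compared against a number.

```shell
# Hypothetical sample input.
data='start
alpha 5
beta 12
end
gamma 20'

# /regex/ pattern: records beginning with "a" or "b".
regex_out=$(printf '%s\n' "$data" | awk '/^[ab]/ {print $1}')

# Relational expression: two-field records whose second field exceeds 10.
rel_out=$(printf '%s\n' "$data" | awk 'NF == 2 && $2 > 10 {print $1}')

# Range pattern: from the record matching /^start$/ through the one matching /^end$/.
range_out=$(printf '%s\n' "$data" | awk '/^start$/,/^end$/ {print NR ": " $0}')

printf '%s\n---\n%s\n---\n%s\n' "$regex_out" "$rel_out" "$range_out"
```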
Built-in Variables
1. NF, NR
NF: the number of fields in the current record
NR: the total number of input records seen so far
2. FS, RS, OFS, ORS
FS: the input field separator; by default a single space, which splits on runs of blanks and tabs
RS: the input record separator; by default a newline
OFS: the output field separator; by default a space
ORS: the output record separator; by default a newline
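A short sketch of how the separators interact. The input strings are hypothetical; the second command relies on standard awk behavior not mentioned above, namely that an empty RS switches to paragraph mode.

```shell
# FS=":" splits on colons. Assigning $1=$1 forces awk to rebuild $0,
# joining the fields with OFS.
sep_out=$(printf 'root:x:0\nbin:x:1\n' | awk 'BEGIN{FS=":"; OFS="-"} {$1=$1; print}')
echo "$sep_out"

# RS="" selects paragraph mode: records are separated by blank lines.
# With FS="\n", each line of a paragraph becomes one field.
para_out=$(printf 'a\nb\n\nc\nd\n' | awk 'BEGIN{RS=""; FS="\n"} {print NR ": " $1 "+" $2}')
echo "$para_out"
```

Without the $1=$1 assignment, print would emit the original record unchanged, because OFS is only applied when $0 is rebuilt or when print receives comma-separated arguments.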
3. IGNORECASE
IGNORECASE: when set to a non-zero value, regex matching and string comparisons ignore case (gawk extension)
4. ENVIRON
awk 'BEGIN{for(i in ENVIRON) print i, ENVIRON[i]}'
awk 'BEGIN{print ENVIRON["JAVA_HOME"]}'
5. ARGC, ARGV, ARGIND
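ARGC: the number of command-line arguments (the program text and options are not counted)
ARGIND: the index in ARGV of the current file being processed (gawk extension)
ARGV: array of command-line arguments, indexed from 0 to ARGC - 1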
awk 'BEGIN{print "ARGC="ARGC; for(i in ARGV) print i"="ARGV[i]}' /etc/passwd
ARGC=2, 0=awk, 1=/etc/passwd
6. FILENAME
the name of the current input file; empty in BEGIN, because no file has been opened yet
awk 'BEGIN{print FILENAME}{print FILENAME; exit}' /etc/passwd
7. OFMT
the output format for numbers printed with print, "%.6g" by default
awk 'BEGIN{printf("%.2f %.2f\n", 1/6, 3.1415926)}'
awk 'BEGIN{OFMT="%.2f"; print 1/6, 3.1415926}'
8. FIELDWIDTHS
a space-separated list of column widths; fields are split by fixed width instead of by FS (gawk extension)
date +"%Y%m%d%H%M%S" | awk 'BEGIN{FIELDWIDTHS="4 2 2 2 2 2"}{print $1"-"$2"-"$3, $4":"$5":"$6}'
9. RSTART, RLENGTH
RSTART: the index of the first character matched by match(), or 0 if there is no match
RLENGTH: the length of the string matched by match(), or -1 if there is no match
awk 'BEGIN{start=match("this is match test", /m[a-z]+/); print start, RSTART, RLENGTH}'
9 9 5
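A common use of the two variables is to cut the matched text out with substr(). The sketch below assumes a made-up log string s; the second command shows the values left behind when nothing matches.

```shell
# Hypothetical input string.
s='error code E1234 at line 7'

# match() returns the start position (non-zero on success), so it can be
# used directly as a condition; RSTART and RLENGTH then delimit the match.
code=$(awk -v s="$s" 'BEGIN{
    if (match(s, /E[0-9]+/))
        print substr(s, RSTART, RLENGTH), RSTART, RLENGTH
}')
echo "$code"

# No match: RSTART is 0 and RLENGTH is -1.
nomatch=$(awk 'BEGIN{match("abc", /z/); print RSTART, RLENGTH}')
echo "$nomatch"
```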
Built-in Functions:
1. Numeric
int(x)
sqrt(x)
rand(): return a random number in the range [0, 1)
srand([expr]): seed the random number generator with expr; if expr is omitted, use the current time
awk 'BEGIN{print int(2.3), int(012), int(0xFF), int("3a"), int("a3")}'
2 10 255 3 0
(in gawk, 012 and 0xFF in program text are octal and hexadecimal constants)
awk 'BEGIN{print rand(), 10*rand()}'
awk 'BEGIN{srand(); print rand(), 10*rand()}'
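Because rand() returns a value in [0, 1), scaling and truncating it gives a random integer in a range. This is a small sketch; the die-roll range 1 to 6 and the seed 42 are arbitrary choices for illustration.

```shell
# int(6 * rand()) is an integer from 0 to 5, so adding 1 gives 1 to 6.
roll=$(awk 'BEGIN{srand(); print int(6 * rand()) + 1}')
echo "$roll"

# With a fixed seed the sequence is repeatable across runs of the same awk.
first=$(awk 'BEGIN{srand(42); print rand()}')
second=$(awk 'BEGIN{srand(42); print rand()}')
[ "$first" = "$second" ] && echo "same seed, same value"
```

Calling srand() without an argument seeds from the clock, so two runs within the same second may produce the same numbers.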
2. String
sub(regex, replacement[, target]): replace the first match of regex in target (by default $0) with replacement; returns the number of substitutions made
gsub(regex, replacement[, target]): like sub, but replaces every match
gensub(regex, replacement, how[, target]): gawk only; returns the modified string and leaves target unchanged
gensub(regex, replacement, "g" or "G", target): replace every match, like gsub
gensub(regex, replacement, 0, target): 0 is treated as 1, and gawk prints a warning
gensub(regex, replacement, N, target): N is a number selecting which match to replace (the Nth)
awk 'BEGIN{info="this is a test2010test!"; gsub(/[0-9]+/, "!", info); print info}'
gawk 'BEGIN{a="abc def"; b=gensub(/(.+) (.+)/, "\\2 \\1", "g", a); print b}'
def abc
echo "a b c a b c" | gawk '{print gensub(/a/,"AA",2)}'
a b c AA b c
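For comparison with gensub, here is a sketch of sub and gsub using only POSIX awk. It shows two details that are easy to miss: both functions modify the target in place and return the number of replacements, and & in the replacement stands for the matched text. The input string is the same one used above.

```shell
# sub replaces only the first match and returns 1.
sub_out=$(awk 'BEGIN{s="a b c a b c"; n=sub(/a/, "[&]", s); print n, s}')
echo "$sub_out"

# gsub replaces every match and returns how many were replaced.
gsub_out=$(awk 'BEGIN{s="a b c a b c"; n=gsub(/a/, "[&]", s); print n, s}')
echo "$gsub_out"
```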
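index(string, find): position of find in string, or 0 if not present
match(string, regex[, array]): position where regex first matches in string, or 0 if there is no match; sets RSTART and RLENGTH; the array argument is a gawk extension
length([string]): length of string (by default $0)
substr(string, position[, len]): substring starting at position (1-based), at most len characters long
split(string, array[, regexp]): split string into array on regexp (by default FS); returns the number of elements
awk 'BEGIN{s="this is a test"; print index(s, "a")}'
awk 'BEGIN{s="this is a test"; print index(s, "a") ? "ok" : "not found"}'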
awk 'BEGIN{s="this is a match test"; pos=match(s, /m[a-z]+/, array); print pos; for(i in array) print i, array[i]}'
11
0start 11
0length 5
0 match
awk 'BEGIN{s="this is a test"; print substr(s, 9, 1)}'
a
awk 'BEGIN{s="this is a test"; print substr(s, 9)}'
a test
awk 'BEGIN{s="this is a split test"; len=split(s,array); print len; for(i in array) print i, array[i]}'
5
4 split
5 test
1 this
2 is
3 a
awk 'BEGIN{FS=":"}/^root/{split($0,array); for(i in array) print i, array[i]}' /etc/passwd
awk '/^root/{split($0,array,/:/); for(i in array) print i, array[i]}' /etc/passwd
4 0
5 root
6 /root
7 /bin/bash
1 root
2 x
3 0
Associative Arrays
a. sorting by values (asort and asorti are gawk extensions)
len = asort(s) # s is changed: its values are sorted and its indexes are replaced with sequential integers
len = asort(s, d) # s is unchanged; d receives a sorted copy of the values of s
b. sorting by indexes
len = asorti(s) # s is changed: its values are replaced with its sorted indexes
len = asorti(s, d) # s is unchanged; d receives the sorted indexes of s
awk '{a[$1]=$2}END{for(i in a) print i, a[i]}' abc.txt
10 35
12 30
22 13
24 20
awk '{a[$1]=$2}END{n=asort(a,b); for(i=1;i<=n;i++) print i, b[i]}' abc.txt
1 13
2 20
3 30
4 35
awk '{a[$1]=$2}END{n=asorti(a,b); for(i=1;i<=n;i++) print i, b[i]}' abc.txt
1 10
2 12
3 22
4 24
sprintf(format, expr-list): return the string that printf would print for the same arguments
tolower(string)
toupper(string)
awk 'BEGIN{s=sprintf("%.2g %s %d", 3.1415926, 3.1415926, 3.1415926); print s}'
3.1 3.14159 3
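A further sketch of sprintf with width and padding flags, plus the two case functions. The format string and values are arbitrary examples.

```shell
# %-6s left-justifies in 6 columns, %5.2f right-justifies in 5 columns
# with 2 decimals, %03d pads with zeros to 3 digits.
fmt_out=$(awk 'BEGIN{s=sprintf("%-6s|%5.2f|%03d", "pi", 3.14159, 7); print s}')
echo "$fmt_out"

# toupper and tolower return a converted copy; the argument is unchanged.
case_out=$(awk 'BEGIN{print toupper("awk"), tolower("GAWK")}')
echo "$case_out"
```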
3. Time
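mktime("YYYY MM DD HH MM SS[ DST]"): return the time stamp (seconds since the epoch) for the given date, or -1 on error
systime(): return the current time stamp
strftime([format[, ts]]): format the time stamp ts (by default the current time) according to format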
awk 'BEGIN{print mktime("2014 12 20 14 25 32")}'
awk 'BEGIN{print systime()}'
awk 'BEGIN{print strftime()}'
awk 'BEGIN{print strftime("%c", systime())}' # date +%c
4. IO
close(file[, how]): close a file, pipe or co-process; how is either "from" or "to" and closes one end of a two-way pipe (gawk)
getline: set $0 from the next input record; sets NF, NR, FNR
getline <file: set $0 from the next record of file; sets NF
getline var: set var from the next input record; sets NR, FNR
getline var <file: set var from the next record of file
command | getline [var]: run command and read one record of its output into $0 or var
command |& getline [var]: run command as a co-process and read one record of its output into $0 or var (co-processes are a gawk extension)
getline returns 1 on success, 0 at end of file, and -1 on error
next: stop processing the current input record, read the next one, and restart from the first pattern
print [expr-list] [>file]
printf format[, expr-list] [>file]
system("cmd"): execute the command and return its exit status
fflush([file]): flush any buffered output
print … | command: write to a pipe
print … |& command: write to a co-process (gawk)
awk 'BEGIN{cmd="cat /etc/passwd"; while((cmd | getline) > 0) print; close(cmd)}'
awk 'BEGIN{file="/etc/passwd"; while((getline line < file) > 0) print line; close(file)}'
awk 'BEGIN{"date" | getline d; print d}'
awk 'BEGIN{"date" | getline d; split(d,mon); print mon[2]}'
awk 'BEGIN{while(("ls" | getline) > 0) print}'
awk 'BEGIN{printf("Enter your account: "); getline name; print name}'
awk 'BEGIN{l=system("ls -l"); print l}'
# prompt and wait for input
awk 'BEGIN{printf "What is your name? "; getline name <"/dev/tty"} $1~name {print "Found " name " on line " NR "."} END{print "See you, " name "."}' /etc/passwd
# count the number of lines in a file
awk 'BEGIN{while((getline line <"/etc/passwd") > 0) lc++; print lc}'
# sort
awk '{print $1, $2 | "sort"} END{close("sort")}' abc.txt
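The getline loops above depend on checking the return value and on close(). The following sketch makes both visible using a temporary file; the file contents and the path /nonexistent/hypothetical/path are placeholders for illustration only.

```shell
# Create a small scratch file to read from.
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

count=$(awk -v f="$tmp" 'BEGIN{
    # Read until getline stops returning 1 (0 = end of file, -1 = error).
    while ((getline line < f) > 0) n++
    # close() resets the file, so the next getline starts from the top again.
    close(f)
    if ((getline line < f) > 0) first = line
    close(f)
    print n, first
}')
echo "$count"

# Reading a file that cannot be opened returns -1, which is why a bare
# while (getline < file) would loop forever on a missing file.
missing=$(awk 'BEGIN{print (getline line < "/nonexistent/hypothetical/path")}')
echo "$missing"

rm -f "$tmp"
```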