awk Command Examples Explained

awk options program file

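awk is a programming-language tool for text processing.

The options argument usually takes the following:

  • -F fs: specifies the field separator
  • -f file: specifies the awk script file
  • -v var=value: defines a variable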
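
As a minimal sketch of how these options combine, the following reads colon-separated input with -F and passes a value in with -v. The sample record and the variable name `label` are made up for illustration.

```shell
# Hypothetical sample record; -F sets the field separator, -v defines an awk variable.
options_demo=$(printf 'alice:x:1000\n' | awk -F: -v label="user=" '{print label $1}')
echo "$options_demo"   # user=alice
```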

Using variables

  • $0: the whole line
  • $1: the first data field
  • $2: the second data field
  • $n: the nth data field

Suppose we have a file myfile with the following contents:

icbc@ubuntu:~$ cat myfile
this is a test
this is the second test.
icbc@ubuntu:~$ 

Run the command:

awk '{print $1}' myfile

This produces the following output:

icbc@ubuntu:~$ awk '{print $1}' myfile
this
this
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 

If the fields in a file are separated by something other than spaces or tabs, we can specify the field separator with the -F option:

awk -F: '{print $1}' /etc/passwd

This gives the following result:

icbc@ubuntu:~$ awk -F: '{print $1}' /etc/passwd
root
daemon
bin
sys
sync
games
man
lp
mail
news
uucp
proxy
www-data
backup
list
irc
gnats
nobody
systemd-timesync
systemd-network
systemd-resolve
syslog
_apt
messagebus
uuidd
lightdm
whoopsie
avahi-autoipd
avahi
dnsmasq
colord
speech-dispatcher
hplip
kernoops
pulse
rtkit
saned
usbmux
icbc
mysql
cups-pk-helper
geoclue
gdm
gnome-initial-setup
redis
jenkins
sshd
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 

Using multiple commands

echo "Hello Tom" | awk '{$2="Adam"; print $0}'

The command above sets $2 to Adam and then prints the whole line, giving the following result:

icbc@ubuntu:~$ echo "Hello Tom" | awk '{$2="Adam"; print $0}'
Hello Adam
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 
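
One detail worth knowing here: assigning to a field makes awk rebuild $0, joining the fields with OFS. A small sketch with made-up input, where OFS is set to "-" only to make the effect visible:

```shell
# Assigning to $2 rebuilds $0 using OFS (set to "-" here for visibility).
rebuilt=$(echo "Hello Tom Smith" | awk 'BEGIN{OFS="-"} {$2="Adam"; print $0}')
echo "$rebuilt"   # Hello-Adam-Smith
```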

Reading the script from a file

We have a file testfile with the following contents:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ cat testfile
{print $1 " home at " $6}

icbc@ubuntu:~$ 
icbc@ubuntu:~$ 

Running the script file gives the following:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk -F: -f testfile /etc/passwd
root home at /root
daemon home at /usr/sbin
bin home at /bin
sys home at /dev
sync home at /bin
games home at /usr/games
man home at /var/cache/man
lp home at /var/spool/lpd
mail home at /var/mail
news home at /var/spool/news
uucp home at /var/spool/uucp
proxy home at /bin
www-data home at /var/www
backup home at /var/backups
list home at /var/list
irc home at /var/run/ircd
gnats home at /var/lib/gnats
nobody home at /nonexistent
systemd-timesync home at /run/systemd
systemd-network home at /run/systemd/netif
systemd-resolve home at /run/systemd/resolve
syslog home at /home/syslog
_apt home at /nonexistent
messagebus home at /var/run/dbus
uuidd home at /run/uuidd
lightdm home at /var/lib/lightdm
whoopsie home at /nonexistent
avahi-autoipd home at /var/lib/avahi-autoipd
avahi home at /var/run/avahi-daemon
dnsmasq home at /var/lib/misc
colord home at /var/lib/colord
speech-dispatcher home at /var/run/speech-dispatcher
hplip home at /var/run/hplip
kernoops home at /
pulse home at /var/run/pulse
rtkit home at /proc
saned home at /var/lib/saned
usbmux home at /var/lib/usbmux
icbc home at /home/icbc
mysql home at /nonexistent
cups-pk-helper home at /home/cups-pk-helper
geoclue home at /var/lib/geoclue
gdm home at /var/lib/gdm3
gnome-initial-setup home at /run/gnome-initial-setup/
redis home at /var/lib/redis
jenkins home at /var/lib/jenkins
sshd home at /run/sshd
icbc@ubuntu:~$ 

We modify the script file as follows:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ cat testfile2
{
text = " home at "
print $1 $6
}
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 

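Running it again gives the following result. The script assigns the variable text but its print statement never uses it, so the user name and home path are printed run together: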

icbc@ubuntu:~$ awk -F: -f testfile2 /etc/passwd
root/root
daemon/usr/sbin
bin/bin
sys/dev
sync/bin
games/usr/games
man/var/cache/man
lp/var/spool/lpd
mail/var/mail
news/var/spool/news
uucp/var/spool/uucp
proxy/bin
www-data/var/www
backup/var/backups
list/var/list
irc/var/run/ircd
gnats/var/lib/gnats
nobody/nonexistent
systemd-timesync/run/systemd
systemd-network/run/systemd/netif
systemd-resolve/run/systemd/resolve
syslog/home/syslog
_apt/nonexistent
messagebus/var/run/dbus
uuidd/run/uuidd
lightdm/var/lib/lightdm
whoopsie/nonexistent
avahi-autoipd/var/lib/avahi-autoipd
avahi/var/run/avahi-daemon
dnsmasq/var/lib/misc
colord/var/lib/colord
speech-dispatcher/var/run/speech-dispatcher
hplip/var/run/hplip
kernoops/
pulse/var/run/pulse
rtkit/proc
saned/var/lib/saned
usbmux/var/lib/usbmux
icbc/home/icbc
mysql/nonexistent
cups-pk-helper/home/cups-pk-helper
geoclue/var/lib/geoclue
gdm/var/lib/gdm3
gnome-initial-setup/run/gnome-initial-setup/
redis/var/lib/redis
jenkins/var/lib/jenkins
sshd/run/sshd
icbc@ubuntu:~$ 
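
For comparison, here is a sketch of the same script with the text variable actually used in print. It writes the script to a temporary file so it can be run with -f. The path comes from mktemp, and the two passwd-style records are made-up sample data rather than the real /etc/passwd.

```shell
# Write the awk script to a temporary file (hypothetical location from mktemp).
script=$(mktemp)
cat > "$script" <<'EOF'
{
text = " home at "
print $1 text $6
}
EOF

# Feed two made-up passwd-style records through the script.
with_text=$(printf 'root:x:0:0:root:/root:/bin/bash\nicbc:x:1000:1000::/home/icbc:/bin/bash\n' \
    | awk -F: -f "$script")
echo "$with_text"
rm -f "$script"
```

With text placed between $1 and $6, each line reads like the output of the first testfile.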

awk preprocessing

To add a title or header to the output, use the BEGIN keyword. A BEGIN block is guaranteed to run before any input is processed.

awk 'BEGIN {print "The File Contents:"}
 
{print $0}' myfile

The result is as follows:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk 'BEGIN {print "The File Contents:"}
>  
> {print $0}' myfile
The File Contents:
this is a test
this is the second test.
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 

awk postprocessing

The END keyword runs a block after all input has been processed.

awk 'BEGIN {print "The File Contents:"}
 
{print $0}
 
END {print "File footer"}' myfile

The result is as follows:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk 'BEGIN {print "The File Contents:"}
>  
> {print $0}
>  
> END {print "File footer"}' myfile
The File Contents:
this is a test
this is the second test.
File footer
icbc@ubuntu:~$ 
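
END is also the usual place to print values accumulated while reading the input. A minimal sketch, with made-up numbers and a hypothetical variable name `sum`:

```shell
# The main block runs once per line; END runs once after the last line.
total_line=$(printf '3\n4\n5\n' | awk '{sum += $1} END {print "Total:", sum}')
echo "$total_line"   # Total: 12
```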

Combining them

icbc@ubuntu:~$ cat testfile3
BEGIN{
print "USERS and their corresponding home"
print "UserName \t HomePath"
print "---\t---"
FS=":"
}
{
print $1 "\t" $6
}
END {
print "The end"
}
icbc@ubuntu:~$ 

Running it gives the following result:

icbc@ubuntu:~$ awk -f testfile3  /etc/passwd
USERS and their corresponding home
UserName 	 HomePath
---	---
root	/root
daemon	/usr/sbin
bin	/bin
sys	/dev
sync	/bin
games	/usr/games
man	/var/cache/man
lp	/var/spool/lpd
mail	/var/mail
news	/var/spool/news
uucp	/var/spool/uucp
proxy	/bin
www-data	/var/www
backup	/var/backups
list	/var/list
irc	/var/run/ircd
gnats	/var/lib/gnats
nobody	/nonexistent
systemd-timesync	/run/systemd
systemd-network	/run/systemd/netif
systemd-resolve	/run/systemd/resolve
syslog	/home/syslog
_apt	/nonexistent
messagebus	/var/run/dbus
uuidd	/run/uuidd
lightdm	/var/lib/lightdm
whoopsie	/nonexistent
avahi-autoipd	/var/lib/avahi-autoipd
avahi	/var/run/avahi-daemon
dnsmasq	/var/lib/misc
colord	/var/lib/colord
speech-dispatcher	/var/run/speech-dispatcher
hplip	/var/run/hplip
kernoops	/
pulse	/var/run/pulse
rtkit	/proc
saned	/var/lib/saned
usbmux	/var/lib/usbmux
icbc	/home/icbc
mysql	/nonexistent
cups-pk-helper	/home/cups-pk-helper
geoclue	/var/lib/geoclue
gdm	/var/lib/gdm3
gnome-initial-setup	/run/gnome-initial-setup/
redis	/var/lib/redis
jenkins	/var/lib/jenkins
sshd	/run/sshd
The end
icbc@ubuntu:~$ 

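Using built-in variables

Besides the field variables $1, $2, and so on mentioned earlier, awk supports other built-in variables:

  • FIELDWIDTHS: specifies fixed field widths (a gawk extension)
  • RS: the input record separator
  • FS: the input field separator
  • OFS: the output field separator
  • ORS: the output record separator

OFS defaults to a space, but it can be set to any other string.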

icbc@ubuntu:~$ awk 'BEGIN{FS=":"; OFS="-"} {print $1,$6,$7}' /etc/passwd
root-/root-/bin/bash
daemon-/usr/sbin-/usr/sbin/nologin
bin-/bin-/usr/sbin/nologin
sys-/dev-/usr/sbin/nologin
sync-/bin-/bin/sync
games-/usr/games-/usr/sbin/nologin
man-/var/cache/man-/usr/sbin/nologin
lp-/var/spool/lpd-/usr/sbin/nologin
mail-/var/mail-/usr/sbin/nologin
news-/var/spool/news-/usr/sbin/nologin
uucp-/var/spool/uucp-/usr/sbin/nologin
proxy-/bin-/usr/sbin/nologin
www-data-/var/www-/usr/sbin/nologin
backup-/var/backups-/usr/sbin/nologin
list-/var/list-/usr/sbin/nologin
irc-/var/run/ircd-/usr/sbin/nologin
gnats-/var/lib/gnats-/usr/sbin/nologin
nobody-/nonexistent-/usr/sbin/nologin
systemd-timesync-/run/systemd-/bin/false
systemd-network-/run/systemd/netif-/bin/false
systemd-resolve-/run/systemd/resolve-/bin/false
syslog-/home/syslog-/bin/false
_apt-/nonexistent-/bin/false
messagebus-/var/run/dbus-/bin/false
uuidd-/run/uuidd-/bin/false
lightdm-/var/lib/lightdm-/bin/false
whoopsie-/nonexistent-/bin/false
avahi-autoipd-/var/lib/avahi-autoipd-/bin/false
avahi-/var/run/avahi-daemon-/bin/false
dnsmasq-/var/lib/misc-/bin/false
colord-/var/lib/colord-/bin/false
speech-dispatcher-/var/run/speech-dispatcher-/bin/false
hplip-/var/run/hplip-/bin/false
kernoops-/-/bin/false
pulse-/var/run/pulse-/bin/false
rtkit-/proc-/bin/false
saned-/var/lib/saned-/bin/false
usbmux-/var/lib/usbmux-/bin/false
icbc-/home/icbc-/bin/bash
mysql-/nonexistent-/bin/false
cups-pk-helper-/home/cups-pk-helper-/usr/sbin/nologin
geoclue-/var/lib/geoclue-/usr/sbin/nologin
gdm-/var/lib/gdm3-/bin/false
gnome-initial-setup-/run/gnome-initial-setup/-/bin/false
redis-/var/lib/redis-/usr/sbin/nologin
jenkins-/var/lib/jenkins-/bin/bash
sshd-/run/sshd-/usr/sbin/nologin
icbc@ubuntu:~$ 
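
ORS works the same way for the end of each output record. A small sketch with made-up input, where OFS joins the fields and ORS replaces the newline that print normally appends:

```shell
# OFS separates the printed fields; ORS terminates each printed record.
joined=$(printf 'a b\nc d\n' | awk 'BEGIN{OFS=","; ORS=";"} {print $1,$2}')
echo "$joined"   # a,b;c,d;
```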

Suppose we have a file with the following contents:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ cat cash
1235.96521
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 

Using the FIELDWIDTHS variable gives the following result:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk 'BEGIN{FIELDWIDTHS="3 4 3"}{print $1,$2,$3}' cash
123 5.96 521
icbc@ubuntu:~$ 
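
FIELDWIDTHS is specific to gawk. On an awk without it, one possible alternative (a sketch, not part of the original article) is to cut the same fixed-width pieces out of $0 with substr:

```shell
# substr(string, start, length): positions 1-3, 4-7 and 8-10 of the line.
pieces=$(echo "1235.96521" | awk '{print substr($0,1,3), substr($0,4,4), substr($0,8,3)}')
echo "$pieces"   # 123 5.96 521
```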

Suppose we have the following contents:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ cat person
Person Name
123 High Street
(222)466-1234

Another person
487 High Street
(523)643-8754

icbc@ubuntu:~$ 

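Each record spans several lines, and records are separated by a blank line. Here we set FS to a newline so that each line becomes a field, and set RS to an empty string so that blank lines separate records: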

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk 'BEGIN{FS="\n"; RS=""} {print $1,$3}' person
Person Name (222)466-1234
Another person (523)643-8754
icbc@ubuntu:~$ 
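
RS can also be a single character other than a newline. As a small sketch with made-up input, the following treats a comma as the record separator, so one line of text is read as three records:

```shell
# With RS=",", each comma-separated piece becomes its own record.
records=$(printf 'a,b,c' | awk 'BEGIN{RS=","} {print NR ": " $0}')
echo "$records"
```

This prints the lines 1: a, 2: b and 3: c.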

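Other variables

  • ARGC: the number of command-line arguments

  • ARGV: the array of command-line arguments

  • ENVIRON: an array of the environment variables

  • FILENAME: the name of the file awk is currently processing

  • NF: the number of fields in the current record

  • NR: the number of records processed so far

  • FNR: the record number within the current file

  • IGNORECASE: when nonzero, matching ignores case (a gawk extension)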
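
To make NR and NF from the list above concrete, here is a short sketch on made-up input. It prints the record number, the field count, and the last field of each line:

```shell
# NR counts records, NF counts fields, and $NF is the last field.
counts=$(printf 'a b c\nd e\n' | awk '{print NR, NF, $NF}')
echo "$counts"
```

The first line yields 1 3 c and the second 2 2 e.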

A quick test:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk 'BEGIN{print ARGC,ARGV[1]}' myfile
2 myfile
icbc@ubuntu:~$ 
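
ARGV is an array, so every argument can be listed with a loop. In this sketch the arguments `one` and `two` are placeholders. Because the program has only a BEGIN block, awk never tries to open them as files.

```shell
# Loop over the command-line arguments; ARGV[0] (the command name) is skipped.
args=$(awk 'BEGIN{for (i = 1; i < ARGC; i++) print i, ARGV[i]}' one two)
echo "$args"
```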

Using environment variables

icbc@ubuntu:~$  awk '
>  
> BEGIN{
>  
> print ENVIRON["PATH"]
>  
> }'
/usr/lib/jvm/java-8-openjdk-amd64/bin:/usr/lib/jvm/java-8-openjdk-amd64/jre/bin:/home/icbc/software/go/bin:/home/icbc/software/node-v9.4.0-linux-x64/bin:/home/icbc/bin:/home/icbc/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
icbc@ubuntu:~$ 

A shell variable can also be passed in with -v:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ echo | awk -v home=$HOME '{print "My home is " home}'
My home is /home/icbc
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 
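
When the shell value may contain spaces, it is safer to quote the expansion. The sketch below passes the same made-up value in two ways: through -v and through the environment. The names `greeting`, `g` and `GREETING` are hypothetical.

```shell
greeting="hello   world"

# Quoting "$greeting" keeps the value as a single argument to -v.
via_v=$(awk -v g="$greeting" 'BEGIN{print g}')

# The same value exported only for this one awk invocation.
via_env=$(GREETING="$greeting" awk 'BEGIN{print ENVIRON["GREETING"]}')

echo "$via_v"
echo "$via_env"
```

One difference between the two: awk interprets backslash escape sequences in a value passed with -v, while values read from ENVIRON are used as they are.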

NF holds the number of fields in the record, so $NF refers to the last field:

icbc@ubuntu:~$ awk 'BEGIN{FS=":"; OFS=":"} {print $1,$NF}' /etc/passwd
root:/bin/bash
daemon:/usr/sbin/nologin
bin:/usr/sbin/nologin
sys:/usr/sbin/nologin
sync:/bin/sync
games:/usr/sbin/nologin
man:/usr/sbin/nologin
lp:/usr/sbin/nologin
mail:/usr/sbin/nologin
news:/usr/sbin/nologin
uucp:/usr/sbin/nologin
proxy:/usr/sbin/nologin
www-data:/usr/sbin/nologin
backup:/usr/sbin/nologin
list:/usr/sbin/nologin
irc:/usr/sbin/nologin
gnats:/usr/sbin/nologin
nobody:/usr/sbin/nologin
systemd-timesync:/bin/false
systemd-network:/bin/false
systemd-resolve:/bin/false
syslog:/bin/false
_apt:/bin/false
messagebus:/bin/false
uuidd:/bin/false
lightdm:/bin/false
whoopsie:/bin/false
avahi-autoipd:/bin/false
avahi:/bin/false
dnsmasq:/bin/false
colord:/bin/false
speech-dispatcher:/bin/false
hplip:/bin/false
kernoops:/bin/false
pulse:/bin/false
rtkit:/bin/false
saned:/bin/false
usbmux:/bin/false
icbc:/bin/bash
mysql:/bin/false
cups-pk-helper:/us

Let's look at the difference between NR and FNR:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk 'BEGIN{FS=","}{print $1,"FNR="FNR}' myfile myfile
this is a test FNR=1
this is the second test. FNR=2
this is a test FNR=1
this is the second test. FNR=2
icbc@ubuntu:~$ 

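In the example above we give awk two input files (the same file, processed twice) and print the first field and FNR. Because FS is a comma and the lines contain none, $1 is the whole line. Now compare NR with FNR: when awk moves on to a new file, FNR resets to 1, while NR keeps counting.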

icbc@ubuntu:~$ awk '
>  
> BEGIN {FS=","}
>  
> {print $1,"FNR="FNR,"NR="NR}
>  
> END{print "Total",NR,"processed lines"}' myfile myfile
this is a test FNR=1 NR=1
this is the second test. FNR=2 NR=2
this is a test FNR=1 NR=3
this is the second test. FNR=2 NR=4
Total 4 processed lines
icbc@ubuntu:~$ 
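
A common use of this difference, shown here as a sketch that goes beyond the original examples, is the NR == FNR test. It is true only while the first file is being read, which lets one file act as a lookup table for the next. The file names and contents below are made up.

```shell
tmp=$(mktemp -d)
printf 'alice\ncarol\n' > "$tmp/allow"
printf 'alice 1\nbob 2\ncarol 3\n' > "$tmp/data"

# First file: remember each name. Second file: print lines whose $1 was seen.
matched=$(awk 'NR == FNR { seen[$1] = 1; next } $1 in seen' "$tmp/allow" "$tmp/data")
echo "$matched"
rm -rf "$tmp"
```

Only the alice and carol lines of the second file are printed.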

Using user-defined variables

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk '
>  
> BEGIN{
>  
> test="Welcome to LikeGeeks website"
>  
> print test
>  
> }'
Welcome to LikeGeeks website
icbc@ubuntu:~$ 

Structured commands

Suppose we have a file with the following contents:

icbc@ubuntu:~$ cat numbers
10
15
6
33
45
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 

Running an if condition:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk '{if ($1 > 30) print $1}' numbers
33
45
icbc@ubuntu:~$ 
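
The same filter can also be written as a pattern in front of the action, without an explicit if. When the action is omitted, awk prints the matching line. A sketch using the same numbers as inline input:

```shell
# A bare pattern with no action prints every record for which it is true.
big=$(printf '10\n15\n6\n33\n45\n' | awk '$1 > 30')
echo "$big"
```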

Braces let the if run several statements:

icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk '{
>  
> if ($1 > 30)
>  
> {
>  
> x = $1 * 3
>  
> print x
>  
> }
>  
> }' numbers
99
135
icbc@ubuntu:~$ 

An else clause can be used as well:

icbc@ubuntu:~$ awk '{
>  
> if ($1 > 30)
>  
> {
>  
> x = $1 * 3
>  
> print x
>  
> } else
>  
> {
>  
> x = $1 / 2
>  
> print x
>  
> }}' numbers
5
7.5
3
99
135
icbc@ubuntu:~$ 

With a semicolon, the else can be written on the same line:

icbc@ubuntu:~$ awk '{if ($1 > 20) print $1 *2;else print $1 /2 }' numbers
5
7.5
3
66
90
icbc@ubuntu:~$ 

While loop

We have a file with the following contents:

icbc@ubuntu:~$ cat numbers2
124 127 130
112 142 135
175 158 245
118 231 147
icbc@ubuntu:~$ 
icbc@ubuntu:~$ 

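Computing the average in a loop. Note that the condition i < 5 also reads $4, which is empty on these three-field lines and adds 0, so the result is unaffected: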

icbc@ubuntu:~$ awk '{
>  
> sum = 0
>  
> i = 1
>  
> while (i < 5)
>  
> {
>  
> sum += $i
>  
> i++
>  
> }
>  
> average = sum / 3
>  
> print "Average:",average
>  
> }' numbers2
Average: 127
Average: 129.667
Average: 192.667
Average: 165.333
icbc@ubuntu:~$ 
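
A variant that does not hard-code the number of columns could bound the loop by NF and divide by NF as well. This is a sketch on the first two lines of the sample data, not the article's own command:

```shell
averages=$(printf '124 127 130\n112 142 135\n' | awk '{
    sum = 0
    i = 1
    # Visit exactly the fields present on this line.
    while (i <= NF) {
        sum += $i
        i++
    }
    print "Average:", sum / NF
}')
echo "$averages"
```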

Using break to exit the loop (here the loop stops once the third field has been added)

icbc@ubuntu:~$ awk '{
 
tot = 0
 
i = 1
 
while (i < 5)
 
{
 
tot += $i
 
if (i == 3)
 
break
 
i++
 
}
 
average = tot / 3
 
print "Average is:",average
 
}' numbers2
Average is: 127
Average is: 129.667
Average is: 192.667
Average is: 165.333
icbc@ubuntu:~$ 
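
To see break change the result, here is a sketch that stops summing at the first value above a threshold. The threshold of 140 and the message text are arbitrary choices for illustration.

```shell
partial=$(printf '124 127 130\n112 142 135\n' | awk '{
    tot = 0
    for (i = 1; i <= NF; i++) {
        # Leave the loop as soon as a field exceeds 140.
        if ($i > 140)
            break
        tot += $i
    }
    print "Sum before first value over 140:", tot
}')
echo "$partial"
```

The first line has no value over 140, so all three fields are summed (381). On the second line the loop stops at 142, leaving only 112.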

For loop

icbc@ubuntu:~$ awk '{
 
total = 0
 
for (var = 1; var < 5; var++)
 
{
 
total += $var
 
}
 
avg = total / 3
 
print "Average:",avg
 
}' numbers2
Average: 127
Average: 129.667
Average: 192.667
Average: 165.333
icbc@ubuntu:~$ 

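Formatted printing

%[modifier]control-letter

  • c: prints a number as the character with that code
  • d: prints an integer
  • e: scientific notation
  • f: floating-point number
  • o: octal
  • s: string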
icbc@ubuntu:~$  awk 'BEGIN{
>  
> x = 100 * 100
>  
> printf "The result is: %e\n", x
>  
> }'
The result is: 1.000000e+04
icbc@ubuntu:~$ 
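
The modifier part controls width, alignment and precision. A sketch with made-up values, exercising several of the control letters listed above:

```shell
# %-8s: left-justified in 8 columns; %5d: right-justified in 5 columns;
# %.2f: two decimal places; %o: octal; %c: character for the given code.
formatted=$(awk 'BEGIN{printf "%-8s|%5d|%.2f|%o|%c\n", "abc", 42, 3.14159, 8, 65}')
echo "$formatted"   # abc     |   42|3.14|10|A
```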

Built-in functions

Math functions

sin(x) | cos(x) | sqrt(x) | exp(x) | log(x) | rand()

icbc@ubuntu:~$ awk 'BEGIN{x=exp(5); print x}'
148.413
icbc@ubuntu:~$ 
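
A few more of the math functions in one call, as a quick sketch. int(), which truncates toward zero, is not in the list above but is also a standard awk function.

```shell
math=$(awk 'BEGIN{print sqrt(16), int(7 / 2), exp(0), log(1)}')
echo "$math"   # 4 3 1 0
```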

String functions
icbc@ubuntu:~$ 
icbc@ubuntu:~$ awk 'BEGIN{x = "likegeeks"; print toupper(x)}'
LIKEGEEKS
icbc@ubuntu:~$ 
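
Other standard string functions include length, substr, index and tolower. A short sketch on the same sample string:

```shell
# length: character count; substr(s, 1, 4): first four characters;
# index: 1-based position of a substring; tolower: lowercase copy.
strings=$(awk 'BEGIN{s = "likegeeks"; print length(s), substr(s, 1, 4), index(s, "geeks"), tolower("AWK")}')
echo "$strings"   # 9 like 5 awk
```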

Using user-defined functions

icbc@ubuntu:~$ awk '
>  
> function myfunc()
>  
> {
>  
> printf "The user %s has home path at %s\n", $1,$6
>  
> }
>  
> BEGIN{FS=":"}
>  
> {
>  
> myfunc()
>  
> }' /etc/passwd
The user root has home path at /root
The user daemon has home path at /usr/sbin
The user bin has home path at /bin
The user sys has home path at /dev
The user sync has home path at /bin
The user games has home path at /usr/games
The user man has home path at /var/cache/man
The user lp has home path at /var/spool/lpd
The user mail has home path at /var/mail
The user news has home path at /var/spool/news
The user uucp has home path at /var/spool/uucp
The user proxy has home path at /bin
The user www-data has home path at /var/www
The user backup has home path at /var/backups
The user list has home path at /var/list
The user irc has home path at /var/run/ircd
The user gnats has home path at /var/lib/gnats
The user nobody has home path at /nonexistent
The user systemd-timesync has home path at /run/systemd
The user systemd-network has home path at /run/systemd/netif
The user systemd-resolve has home path at /run/systemd/resolve
The user syslog has home path at /home/syslog
The user _apt has home path at /nonexistent
The user messagebus has home path at /var/run/dbus
The user uuidd has home path at /run/uuidd
The user lightdm has home path at /var/lib/lightdm
The user whoopsie has home path at /nonexistent
The user avahi-autoipd has home path at /var/lib/avahi-autoipd
The user avahi has home path at /var/run/avahi-daemon
The user dnsmasq has home path at /var/lib/misc
The user colord has home path at /var/lib/colord
The user speech-dispatcher has home path at /var/run/speech-dispatcher
The user hplip has home path at /var/run/hplip
The user kernoops has home path at /
The user pulse has home path at /var/run/pulse
The user rtkit has home path at /proc
The user saned has home path at /var/lib/saned
The user usbmux has home path at /var/lib/usbmux
The user icbc has home path at /home/icbc
The user mysql has home path at /nonexistent
The user cups-pk-helper has home path at /home/cups-pk-helper
The user geoclue has home path at /var/lib/geoclue
The user gdm has home path at /var/lib/gdm3
The user gnome-initial-setup has home path at /run/gnome-initial-setup/
The user redis has home path at /var/lib/redis
The user jenkins has home path at /var/lib/jenkins
The user sshd has home path at /run/sshd
icbc@ubuntu:~$ 
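
Functions can also take parameters and return a value, which keeps them independent of the current record. A sketch with a hypothetical function name `describe` and one made-up passwd-style record:

```shell
described=$(printf 'root:x:0:0:root:/root:/bin/bash\n' | awk '
# Build the message from the arguments instead of reading $1 and $6 directly.
function describe(user, home) {
    return "The user " user " has home path at " home
}
BEGIN { FS = ":" }
{ print describe($1, $6) }')
echo "$described"   # The user root has home path at /root
```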

Original article
https://likegeeks.com/awk-command/
