I. Records and Fields
1,
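Record separator: by default, both the input and the output record separator are the newline character; they are stored in the built-in variables RS and ORS.
Variable $0: awk reads one record (by default, one line) at a time. $0 holds the whole record and changes with each record, and the built-in variable NF (the number of fields) changes with it.
Variable NR: the number of the current record. It increases by 1 for each record read, so once all input has been processed it equals the total number of records (lines).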
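As a quick check of the NR remark, here is a minimal sketch that counts records in the END block. The three input lines are made-up sample data, and plain `awk` is used (gawk should behave the same here).

```shell
# NR still holds the number of the last record when the END block runs.
total=$(printf 'one\ntwo\nthree\n' | awk 'END { print NR }')
echo "total records: $total"
# prints: total records: 3
```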
2,
FS: Field Separator
OFS: Output Field Separator
RS: Record Separator
ORS: Output Record Separator
NF: Number of Fields (in the current record)
NR: Number of Records (read so far)
The line currently being processed becomes $0.
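The variables above can be seen working together in a small sketch. The two colon-separated input lines are hypothetical sample data; FS is set to ":" for splitting and OFS to "-" for joining the printed values.

```shell
# FS splits each record on ":"; OFS is placed between the arguments of print.
out=$(printf 'a:b:c\nd:e\n' |
  awk 'BEGIN { FS = ":"; OFS = "-" } { print NR, NF, $1, $NF }')
printf '%s\n' "$out"
# prints:
# 1-3-a-c
# 2-2-d-e
```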
3,
[root@rhel helinbash]# gawk '{print NR,"--> ",$0}' /etc/passwd
1 --> root:x:0:0:root:/root:/bin/bash
2 --> bin:x:1:1:bin:/bin:/sbin/nologin
3 --> daemon:x:2:2:daemon:/sbin:/sbin/nologin
4 --> adm:x:3:4:adm:/var/adm:/sbin/nologin
5 --> lp:x:4:7:lp:/var/spool/lpd:/sbin/nologin
6 --> sync:x:5:0:sync:/sbin:/bin/sync
7 --> shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
8 --> halt:x:7:0:halt:/sbin:/sbin/halt
9 --> mail:x:8:12:mail:/var/spool/mail:/sbin/nologin
10 --> news:x:9:13:news:/etc/news:
11 --> uucp:x:10:14:uucp:/var/spool/uucp:/sbin/nologin
12 --> operator:x:11:0:operator:/root:/sbin/nologin
13 --> games:x:12:100:games:/usr/games:/sbin/nologin
14 --> gopher:x:13:30:gopher:/var/gopher:/sbin/nologin
15 --> ftp:x:14:50:FTP User:/var/ftp:/sbin/nologin
16 --> nobody:x:99:99:Nobody:/:/sbin/nologin
17 --> nscd:x:28:28:NSCD Daemon:/:/sbin/nologin
18 --> vcsa:x:69:69:virtual console memory owner:/dev:/sbin/nologin
19 --> rpc:x:32:32:Portmapper RPC user:/:/sbin/nologin
20 --> mailnull:x:47:47::/var/spool/mqueue:/sbin/nologin
21 --> smmsp:x:51:51::/var/spool/mqueue:/sbin/nologin
22 --> pcap:x:77:77::/var/arpwatch:/sbin/nologin
23 --> ntp:x:38:38::/etc/ntp:/sbin/nologin
24 --> dbus:x:81:81:System message bus:/:/sbin/nologin
25 --> avahi:x:70:70:Avahi daemon:/:/sbin/nologin
26 --> sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
27 --> rpcuser:x:29:29:RPC Service User:/var/lib/nfs:/sbin/nologin
28 --> nfsnobody:x:65534:65534:Anonymous NFS User:/var/lib/nfs:/sbin/nologin
29 --> haldaemon:x:68:68:HAL daemon:/:/sbin/nologin
30 --> avahi-autoipd:x:100:101:avahi-autoipd:/var/lib/avahi-autoipd:/sbin/nologin
31 --> xfs:x:43:43:X Font Server:/etc/X11/fs:/sbin/nologin
32 --> gdm:x:42:42::/var/gdm:/sbin/nologin
33 --> sabayon:x:86:86:Sabayon user:/home/sabayon:/sbin/nologin
34 --> oracle:x:500:500::/home/oracle:/bin/bash
35 --> named:x:25:25:Named:/var/named:/sbin/nologin
[root@rhel helinbash]#
4,
[root@rhel helinbash]# gawk '{print NR,"--> ",$1}' names.txt
1 --> Tom
2 --> Molly
3 --> John
4 --> yang
[root@rhel helinbash]# gawk '{print NR,":",NF,"--> ",$1}' names.txt
1 : 3 --> Tom
2 : 3 --> Molly
3 : 3 --> John
4 : 3 --> yang
[root@rhel helinbash]#
5,
[root@rhel helinbash]# vim names.txt
[root@rhel helinbash]# gawk '{print NR,":",NF,"--> ",$1}' names.txt
1 : 3 --> Tom
2 : 3 --> Molly
3 : 3 --> John
4 : 4 --> yang
[root@rhel helinbash]# cat names.txt
Tom Savage 100
Molly Lee 200
John Doe 300
yang wawa asdfas -121212
[root@rhel helinbash]#
6,
[root@rhel helinbash]# gawk '{print $1,$NF}' names.txt
Tom 100
Molly 200
John 300
yang -121212
[root@rhel helinbash]#
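Because NF is a number, $NF is simply "the field whose index is NF", and the same trick works with arithmetic. A small sketch using two lines copied from names.txt above:

```shell
# $NF is the last field; $(NF-1) is the one before it.
out=$(printf 'Tom Savage 100\nyang wawa asdfas -121212\n' |
  awk '{ print NF, $NF, $(NF-1) }')
printf '%s\n' "$out"
# prints:
# 3 100 Savage
# 4 -121212 asdfas
```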
II. Converting Text Formats Across Platforms
1,
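Convert Linux text to Windows text (the separators are set in a BEGIN block so they are in effect before the first record is read)
[root@rhel helinbash]# gawk 'BEGIN{ORS="\r\n";RS="\n"}{print $0}' names.txt > names.win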
[root@rhel helinbash]# cat names.win
Tom Savage 100
Molly Lee 200
John Doe 300
yang wawa asdfas -121212
2,
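Convert Windows text to Linux text (RS must be set in a BEGIN block; if it is set inside the main action, the first record has already been read with the default RS and keeps its trailing \r)
[root@rhel helinbash]# gawk 'BEGIN{ORS="\n";RS="\r\n"}{print $0}' names.win > names.linux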
[root@rhel helinbash]# cat names.linux
Tom Savage 100
Molly Lee 200
John Doe 300
yang wawa asdfas -121212
[root@rhel helinbash]#
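Since `cat` does not show carriage returns, the result of a conversion is hard to verify by eye. The sketch below counts the CR characters instead. For the Windows-to-Linux direction it strips a trailing CR with `sub()` rather than relying on a multi-character RS, which is an assumption made for portability to awks other than gawk; the input lines are taken from names.txt.

```shell
# Linux -> Windows: ORS is set in BEGIN, before the first record is read.
cr_count=$(printf 'Tom Savage 100\nMolly Lee 200\n' |
  awk 'BEGIN { ORS = "\r\n" } { print }' | tr -cd '\r' | wc -c | tr -d ' ')
echo "carriage returns added: $cr_count"
# prints: carriage returns added: 2

# Windows -> Linux: remove a trailing CR from every record, then print it.
unix=$(printf 'Tom Savage 100\r\nMolly Lee 200\r\n' |
  awk '{ sub(/\r$/, ""); print }')
printf '%s\n' "$unix"
```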
III. Patterns
1,
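An awk pattern decides which input lines an action is applied to.
A pattern is usually a regular expression (it can also be an expression such as $7 ~ /bash$/).
A pattern behaves like an implicit if statement.
A regular-expression pattern is written between two slashes: /regex/.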
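To make the "implicit if" concrete, the sketch below runs the same filter twice, once as a pattern and once as an explicit if inside an action. The `sample` helper and its two lines are hypothetical test data.

```shell
# Hypothetical sample input with one matching and one non-matching line.
sample() { printf 'root:x:0\nbin:x:1\n'; }

# Pattern form: a pattern with no action prints every matching record.
as_pattern=$(sample | awk '/^root/')

# Equivalent explicit form: test $0 against the regex inside the action.
as_if=$(sample | awk '{ if ($0 ~ /^root/) print $0 }')

printf '%s\n%s\n' "$as_pattern" "$as_if"
# prints:
# root:x:0
# root:x:0
```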
2,
[root@rhel helinbash]# gawk '/^root/' /etc/passwd
root:x:0:0:root:/root:/bin/bash
[root@rhel helinbash]#
3,
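Actions
Format:
pattern { statement1; statement2; ...; }
or
pattern {
statement1
statement2
...
}
Note: the opening brace must be on the same line as the pattern. A pattern alone on a line is already a complete rule whose default action is to print the record.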
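A minimal sketch of the multi-line form, with several statements in one action. The input lines and the variable names `name` and `uid` are made up for illustration.

```shell
# The opening brace sits on the same line as the pattern.
out=$(printf 'root:x:0\nbin:x:1\n' | awk -F: '
/^root/ {
    name = $1
    uid = $3
    print name " has uid " uid
}')
printf '%s\n' "$out"
# prints: root has uid 0
```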
4,
Applying a pattern to a single column
[root@rhel helinbash]# gawk -F: '/bash$/{ print $1}' /etc/passwd
root
oracle
[root@rhel helinbash]# gawk -F: '$7 ~ /bash$/{ print $1}' /etc/passwd
root
oracle
[root@rhel helinbash]#
[root@rhel helinbash]# gawk -F: '$7 !~ /bash$/{ print $1}' /etc/passwd
bin
daemon
adm
lp
sync
shutdown
halt
mail
news
uucp
operator
games
gopher
ftp
nobody
nscd
vcsa
rpc
mailnull
smmsp
pcap
ntp
dbus
avahi
sshd
rpcuser
nfsnobody
haldaemon
avahi-autoipd
xfs
gdm
sabayon
named
[root@rhel helinbash]#
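In /etc/passwd the whole-line pattern and the $7 pattern happen to give the same result. The sketch below uses two hypothetical lines where they differ, to show that `~` and `!~` really test only the named field.

```shell
# Hypothetical data: "bash" appears in field 3 of one line and field 2 of the other.
sample() { printf 'root:x:/bin/bash\nodd:/bin/bash:/sbin/nologin\n'; }

whole=$(sample | awk -F: '/bash/ { print $1 }')         # matches anywhere in $0
field=$(sample | awk -F: '$3 ~ /bash/ { print $1 }')    # matches only in field 3
negate=$(sample | awk -F: '$3 !~ /bash/ { print $1 }')  # field 3 does not match

printf 'whole: %s\nfield: %s\nnegate: %s\n' "$(echo $whole)" "$field" "$negate"
# prints:
# whole: root odd
# field: root
# negate: odd
```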
IV. Regular Expressions
1,
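Regular expressions are used in many places to match specific text.
^ start of the string
$ end of the string
. matches any single character
* matches zero or more of the preceding character or group
+ matches one or more of the preceding character or group
? matches zero or one of the preceding character or group
[ABC] matches any one of the characters listed in the brackets
[A-Z] matches any one character in the range A to Z
A|B matches either A or B (alternation); for single characters this is the same as [AB]
(AB)+ matches one or more repetitions of the group in parentheses
\* a literal asterisk (the backslash escapes it)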
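Two entries in the table are easy to misread, so here is a small sketch with made-up input: alternation between whole words (which a bracket expression cannot express) and an escaped asterisk.

```shell
# Alternation works on whole alternatives, not only on single characters.
alt=$(printf 'cat\ndog\nbird\n' | awk '/^(cat|dog)$/')

# \* matches a literal asterisk, so only the line containing "a*b" is printed.
star=$(printf 'a*b\nab\naab\n' | awk '/a\*b/')

printf '%s\n%s\n' "$alt" "$star"
# prints:
# cat
# dog
# a*b
```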
2,
[root@rhel helinbash]# gawk '/^a/' test.txt
a
ab
abab
abbb
ababab
aaab
[root@rhel helinbash]# gawk '/^a$/' test.txt
a
[root@rhel helinbash]# gawk '/a$/' test.txt
a
baaa
[root@rhel helinbash]# gawk '/..../' test.txt
abab
abbb
ababab
aaab
baaa
[root@rhel helinbash]# gawk '/^....&/' test.txt # typo: & should be $, so nothing matches
[root@rhel helinbash]# gawk '/^....$/' test.txt
abab
abbb
aaab
baaa
[root@rhel helinbash]# cat test.txt
a
ab
abab
abbb
ababab
aaab
baaa
[root@rhel helinbash]#
3,
[root@rhel helinbash]# vim test.txt
[root@rhel helinbash]# cat test.txt
a
ab
abab
abbb
ababab
aaab
baaa
c.tx
[root@rhel helinbash]# gawk '/^....$/' test.txt
abab
abbb
aaab
baaa
c.tx
[root@rhel helinbash]# gawk '/^.\...$/' test.txt
c.tx
[root@rhel helinbash]#
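The difference between an unescaped and an escaped dot can be reduced to two lines of input. This is a minimal sketch; the `sample` helper is hypothetical and reuses two lines from test.txt.

```shell
sample() { printf 'abab\nc.tx\n'; }

# Four arbitrary characters: both lines match.
any=$(sample | awk '/^....$/')

# Second character must be a literal dot: only c.tx matches.
literal=$(sample | awk '/^.\...$/')

printf '%s\n--\n%s\n' "$any" "$literal"
# prints:
# abab
# c.tx
# --
# c.tx
```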
4,
[root@rhel helinbash]# gawk '/^a*$/' test.txt
a
[root@rhel helinbash]# cat test.txt
a
ab
abab
abbb
ababab
aaab
baaa
c.tx
5,
[root@rhel helinbash]# vim test.txt
[root@rhel helinbash]# gawk '/^a*$/' test.txt
a
aaaaa
aaaa
[root@rhel helinbash]# cat test.txt
a
ab
abab
abbb
ababab
aaab
baaa
c.tx
aaaaa
aaaa
[root@rhel helinbash]#
6,
[root@rhel helinbash]# gawk '/^a*$/' test.txt
a
aaaaa
aaaa
[root@rhel helinbash]# vim test.txt
[root@rhel helinbash]# cat test.txt
a
ab
abab
abbb
ababab
aaab
baaa
c.tx
aaaaa
aaaa
[root@rhel helinbash]# gawk '/^a*$/' test.txt # note: * also matches an empty line (zero occurrences)
a
aaaaa
aaaa
[root@rhel helinbash]# gawk '/^a+$/' test.txt # note: + cannot match an empty line; at least one occurrence is required
a
aaaaa
aaaa
[root@rhel helinbash]#
[root@rhel helinbash]# gawk '/^a?$/' test.txt
a
[root@rhel helinbash]#
[root@rhel helinbash]# gawk '/^[ab]*/' ./test.txt
a
ab
abab
abbb
ababab
aaab
baaa
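c.tx # why does this line appear? Without a trailing $, [ab]* may match zero characters at the start of any line, which is what * allows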
aaaaa
aaaa
[root@rhel helinbash]# gawk '/^[ab]*$/' ./test.txt
a
ab
abab
abbb
ababab
aaab
baaa
aaaaa
aaaa
[root@rhel helinbash]#
[root@rhel helinbash]# gawk '/^[^ab]*$/' ./test.txt
c.tx
[root@rhel helinbash]#
[root@rhel helinbash]# cat test.txt
a
ab
abab
abbb
ababab
aaab
baaa
c.tx
aaaaa
aaaa
[root@rhel helinbash]#
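The effect of the trailing $ and of * versus + is easiest to see by counting matches. The sketch below uses three hypothetical lines, "ab", "c.tx" and an empty line (an assumption; the test.txt listings above show no empty line).

```shell
# Three records: "ab", "c.tx" and an empty line.
sample() { printf 'ab\nc.tx\n\n'; }

loose=$(sample | awk '/^[ab]*/ { n++ } END { print n }')    # no $: every line matches
strict=$(sample | awk '/^[ab]*$/ { n++ } END { print n }')  # "ab" and the empty line
plus=$(sample | awk '/^[ab]+$/ { n++ } END { print n }')    # only "ab"

echo "$loose $strict $plus"
# prints: 3 2 1
```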
7,
[root@rhel helinbash]# cat test.txt
a
ab
abab
abbb
ababab
aaab
baaa
c.tx
aaaaa
aaaa
dddddddddddddd
ffffffffff
faaaaaaaaaa
kjjjjjjj
[root@rhel helinbash]# gawk '/^[^ab]*$/' ./test.txt
c.tx
dddddddddddddd
ffffffffff
kjjjjjjj
[root@rhel helinbash]#
8,
[root@rhel helinbash]# gawk '/(ab)*/' test.txt
a
ab
abab
abbb
ababab
aaab
baaa
c.tx
aaaaa
aaaa
dddddddddddddd
ffffffffff
faaaaaaaaaa
kjjjjjjj
[root@rhel helinbash]#
[root@rhel helinbash]# gawk '/^(ab)*$/' test.txt
ab
abab
ababab
[root@rhel helinbash]#
9,
[root@rhel helinbash]# gawk '/^(ab)\{3\}*$/' test.txt # no match: interval expressions such as {3} are not supported here
[root@rhel helinbash]#
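Where interval expressions are unavailable, a fixed repetition can simply be written out, as in the sketch below (input lines taken from test.txt). As far as I know, gawk also has a `--re-interval` option, and newer gawk releases accept the form `(ab){3}` by default, but that depends on the installed version and is not tested here.

```shell
# Exactly three repetitions of "ab", written without an interval expression.
out=$(printf 'ab\nabab\nababab\n' | awk '/^(ab)(ab)(ab)$/')
printf '%s\n' "$out"
# prints: ababab
```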
Source: ITPUB Blog, http://blog.itpub.net/29611940/viewspace-1174232/