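Awk is a powerful language for manipulating and processing text files. It is most helpful when the lines of a text file are in record format, that is, when each record contains fields separated by a delimiter. Even when the input file is not in record format, you can still use awk for basic data processing. You can also write program logic in awk with no input file to process at all.
In short, awk is a powerful language that comes in handy for everyday tasks.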
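As a minimal sketch of the record-and-field model described above, the commands below feed awk three colon-separated records and print selected fields. The sample data and the `:` delimiter are assumptions chosen for illustration. The last command runs program logic with no input file at all, using only a BEGIN block.

```shell
# Hypothetical sample data: one record per line, fields separated by ":".
records='alice:1001:developer
bob:1002:tester
carol:1003:manager'

# -F sets the field separator; $1 and $3 are the first and third fields.
fields_out=$(printf '%s\n' "$records" | awk -F: '{ print $1, $3 }')
printf '%s\n' "$fields_out"

# No input file: a BEGIN block runs before any input is read,
# so awk can act as a small calculator or scripting language.
sum_out=$(awk 'BEGIN { for (i = 1; i <= 5; i++) total += i; print total }')
printf '%s\n' "$sum_out"
```

The first command prints one name and role per record; the second prints the sum of 1 through 5.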
If you are new to awk, start with this awk introduction tutorial, which is part of the awk tutorial series.
The learning curve for awk is much gentler than for most other languages. If you already know C, you will appreciate how simple and easy awk is to learn.
AWK was originally written by three developers: A. Aho, B. W. Kernighan and P. Weinberger. The name AWK comes from the initials of their surnames (Aho, Weinberger, Kernighan).
The following are the three variations of AWK:
1. Awk
The original awk, written by A. Aho, B. W. Kernighan and P. Weinberger.
2. Nawk
NAWK stands for "New AWK". This is the AT&T version of awk.
3. Gawk
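GAWK stands for "GNU AWK". Most Linux distributions ship with gawk or offer it as a package (some, such as Debian-based systems, default to mawk instead). It is fully compatible with awk and nawk.
On Linux systems where gawk is the default, typing awk or gawk runs the same program, because awk is a symbolic link to gawk, as the listing below shows.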
    # ls -l /bin/awk /usr/bin/awk
    lrwxrwxrwx 1 root root  4 Jan  5 23:13 /bin/awk -> gawk
    lrwxrwxrwx 1 root root 14 Jan  5 23:13 /usr/bin/awk -> ../../bin/gawk
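The same check can be sketched without hard-coding paths. The snippet below assumes `readlink -f` is available (it is part of GNU coreutils and BusyBox) and resolves whichever `awk` is first on the PATH. On some systems the result will be mawk or busybox rather than gawk.

```shell
# Find the awk on PATH, then follow every symbolic link in the chain
# to the binary that actually runs.
awk_path=$(command -v awk)
awk_target=$(readlink -f "$awk_path")
printf 'awk      -> %s\nresolves -> %s\n' "$awk_path" "$awk_target"
```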
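The following tables summarize the features available in each version. As you can see, gawk is a superset that includes the features of the original awk and of nawk.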
Awk Vs Nawk Vs Gawk
The differences between awk, nawk and gawk can also be downloaded as a separate document.
The following basic built-in variables, FS, OFS, RS, ORS, NR, NF and FILENAME, are available in all versions of awk.
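Feature | Description | AWK | NAWK | GAWK |
---|---|---|---|---|
FS | Input field separator | Yes | Yes | Yes |
OFS | Output field separator | Yes | Yes | Yes |
RS | Input record separator | Yes | Yes | Yes |
ORS | Output record separator | Yes | Yes | Yes |
NR | Number of records read so far | Yes | Yes | Yes |
NF | Number of fields in the current record | Yes | Yes | Yes |
FILENAME | Name of the input file currently being processed | Yes | Yes | Yes |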
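To make the variables in the table above concrete, here is a small sketch that exercises each of them. The temporary file, its contents, and the chosen separators are assumptions for illustration only.

```shell
# Hypothetical sample file with comma-separated fields.
tmpfile=$(mktemp)
printf 'apple,3,red\nbanana,12,yellow\n' > "$tmpfile"

# FS/OFS: input and output field separators.
# NR: records read so far; NF: fields in the current record; $NF: last field.
# FILENAME: name of the file being read (still set inside END).
vars_out=$(awk 'BEGIN { FS = ","; OFS = " | " }
     { print NR, NF, $1, $NF }
     END { print NR " records from " FILENAME }' "$tmpfile")
printf '%s\n' "$vars_out"

# RS/ORS: split input records on ";" and end each output record with ",".
rs_out=$(printf 'a b;c d;e f' | awk 'BEGIN { RS = ";"; ORS = "," } { print $2 }')
printf '%s\n' "$rs_out"

rm -f "$tmpfile"
```

Because `print` joins its comma-separated arguments with OFS and terminates the line with ORS, changing either variable reshapes the output without touching the print statement itself.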
The following features are not available in the original awk. They are available in nawk and/or gawk, as shown below:
Feature | Description | NAWK | GAWK |
---|---|---|---|
FNR | Record number within the current input file | Yes | Yes |
ARGC | Total number of arguments passed to the awk script | Yes | Yes |
ARGV | Array containing all awk script arguments | Yes | Yes |
ARGIND | Index to ARGV to retrieve the current file name | | Yes |
SUBSEP | Subscript separator for array indexes | Yes | Yes |
RSTART | The match function sets RSTART to the position where the matched substring starts (0 if there is no match) | Yes | Yes |
RLENGTH | The match function sets RLENGTH to the length of the matched substring (-1 if there is no match) | Yes | Yes |
OFMT | Awk uses this to decide how to print numeric values. Default is "%.6g" | Yes | Yes |
ENVIRON | Array containing all environment variables and values | | Yes |
IGNORECASE | Default is 0. When set to 1, string and regular expression comparisons are case insensitive. | | Yes |
ERRNO | Contains the error message of an I/O operation, e.g. while using the getline function. | | Yes |
BINMODE n | Set binary mode for I/O. n can be 1 (input files), 2 (output files), or 3 (all files) | | Yes |
CONVFMT | The format used when converting a number to a string. | | Yes |
FIELDWIDTHS n | n is a space-delimited list of numbers that gives the column widths. If this is set, gawk uses it instead of FS. | | Yes |
LINT n | n can be a number. When n is nonzero (indicating true), gawk displays fatal, invalid, or warning lint messages (same as the --lint command-line option) | | Yes |
TEXTDOMAIN | This is used for internationalization. | | Yes |
sub(str1,str2,var) | In the input string (var), str1 is replaced with str2, and output is stored back in var | Yes | Yes |
gsub(str1,str2,var) | Same as sub, but global. It does multiple substitutions on the same input string (var). | Yes | Yes |
match(str1,str2) | Returns the position at which the regular expression str2 first matches in str1, or 0 if there is no match. | Yes | Yes |
getline < file | Read next line from another input-file. Sets $0, NF | Yes | Yes |
getline var < file | Read next line from another input-file and store it in variable (var) | Yes | Yes |
toupper(str) | Converts the string to upper case | | Yes |
tolower(str) | Converts the string to lower case | | Yes |
\|& | Two-way communication between awk and an external process | | Yes |
systime() | Current time in epoch time. Combine with strftime, e.g. print strftime("%c",systime()) | | Yes |
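A few of the features listed in the table above can be sketched as follows. The file names, strings, and array keys are hypothetical. The sketch sticks to features the table marks as available in both nawk and gawk, so it does not rely on gawk-only items such as IGNORECASE, `|&`, or systime().

```shell
# Two hypothetical input files in a scratch directory.
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/a.txt"
printf 'three\n'    > "$dir/b.txt"

# FNR restarts at 1 for each file, while NR keeps counting across files.
fnr_out=$(cd "$dir" && awk '{ print FILENAME, NR, FNR }' a.txt b.txt)

# match() searches its first argument for the regular expression given
# as the second, and sets RSTART and RLENGTH as a side effect.
match_out=$(awk 'BEGIN {
    s = "error code 4042 seen"
    if (match(s, /[0-9]+/))
        print RSTART, RLENGTH, substr(s, RSTART, RLENGTH)
}')

# sub() replaces the first match; gsub() replaces every match and
# returns the number of replacements made.
sub_out=$(awk 'BEGIN {
    a = "a-b-c"; b = "a-b-c"
    sub(/-/, "+", a)
    n = gsub(/-/, "+", b)
    print a, b, n
}')

# ARGC counts the command-line arguments, including the program name
# in ARGV[0]; with only a BEGIN block, x and y are never opened as files.
argc_out=$(awk 'BEGIN { print ARGC, ARGV[1], ARGV[2] }' x y)

# SUBSEP joins the parts of a multi-dimensional array subscript,
# so splitting a key on SUBSEP recovers the original parts.
subsep_out=$(awk 'BEGIN {
    grid["x", "y"] = 1
    for (key in grid) {
        split(key, parts, SUBSEP)
        print parts[1], parts[2]
    }
}')

printf '%s\n' "$fnr_out" "$match_out" "$sub_out" "$argc_out" "$subsep_out"
rm -rf "$dir"
```

Each result is captured in a variable before printing so that the individual behaviors are easy to inspect one at a time.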