awk in the Linux shell: a text-processing power tool for Bash scripts

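awk is genuinely complex, and day to day I only use a small part of it. I look things up as I go, so these are notes on the parts I use regularly, kept so I can find them again.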

*Invocation

awk [-F field-separator] 'commands' input-file(s)

By default, fields are separated by whitespace (runs of spaces and tabs).
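As a quick illustration of the separator option, here is a minimal sketch. The sample lines are made up for the example (they only imitate the /etc/passwd layout), and the variable names are arbitrary.

```shell
# Hypothetical colon-separated sample data, laid out like /etc/passwd.
sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/zsh'

# -F: splits each line on ":"; $1 is the login name, $7 the shell.
login_shells=$(printf '%s\n' "$sample" | awk -F: '{ print $1, $7 }')
echo "$login_shells"

# Without -F, runs of spaces and tabs separate the fields.
second_word=$(printf 'one   two\tthree\n' | awk '{ print $2 }')
echo "$second_word"
```

The first command prints the login name and shell of each line; the second prints "two", because the three spaces count as a single separator.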

*Patterns

awk 'BEGIN{} {command} END{}' input.txt
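A small sketch of that three-part layout, assuming three numeric input lines invented for the example:

```shell
# BEGIN runs before any input is read, the middle block runs once per
# input line, and END runs after the last line has been read.
report=$(printf '3\n4\n5\n' | awk '
BEGIN { print "start" }
      { sum += $1 }
END   { print "sum=" sum, "lines=" NR }')
echo "$report"
```

This prints "start" first and then "sum=12 lines=3", since NR still holds the total record count inside END.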

*Regular expressions

\ ^ $ . [] | () * + ?

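Note that + (one or more) and ? (zero or one) do not work unescaped in grep and sed by default, because both use basic regular expressions unless they are given -E.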
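To make the difference concrete, here is a hedged sketch with a made-up word list. It only shows awk and grep -E; the patterns and variable names are assumptions for the example.

```shell
# Hypothetical word list for the example.
words='color
colour
colouur
clr'

# awk uses extended regular expressions, so ? and + work unescaped.
# "colou?r" allows zero or one "u"; "colou+r" needs at least one "u".
optional_u=$(printf '%s\n' "$words" | awk '/^colou?r$/')
one_or_more_u=$(printf '%s\n' "$words" | awk '/^colou+r$/')
echo "$optional_u"
echo "$one_or_more_u"

# grep needs -E to accept the same syntax.
grep_ere=$(printf '%s\n' "$words" | grep -E '^colou+r$')
echo "$grep_ere"
```

The ? pattern selects color and colour, while the + pattern selects colour and colouur in both awk and grep -E.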

*Matching and not matching

awk '{ if ($3 ~ /pattern/) actions }' input.txt

awk '{ if ($3 !~ /pattern/) actions }' input.txt

awk '{ if ($3 == "abc") actions }' input.txt
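The same three tests can also be written as patterns in front of the action, which is the more common awk idiom. A minimal sketch on invented three-column data:

```shell
# Hypothetical three-column data for the example.
data='a1 x abc
a2 x abcd
a3 x xyz'

# ~ tests for a regex match anywhere in the field, !~ negates it,
# and == compares the whole field as a string.
matched=$(printf '%s\n' "$data" | awk '$3 ~ /abc/ { print $1 }')
unmatched=$(printf '%s\n' "$data" | awk '$3 !~ /abc/ { print $1 }')
exact=$(printf '%s\n' "$data" | awk '$3 == "abc" { print $1 }')
echo "$matched"; echo "$unmatched"; echo "$exact"
```

Note that the regex test also accepts "abcd", while the == comparison only accepts the exact string "abc".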

*awk built-in variables

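-----------------------------------------------------
ARGC       number of command-line arguments
ARGV       array of command-line arguments
ENVIRON    associative array of the environment variables
FILENAME   name of the file awk is currently reading
FNR        number of records read so far from the current file
FS         input field separator, equivalent to the -F command-line option
NF         number of fields in the current record
NR         number of records read so far, across all files
OFS        output field separator
ORS        output record separator
RS         input record separator
-----------------------------------------------------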
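A short sketch that uses several of these variables together. The CSV content and the GREETING environment variable are made up for the example.

```shell
# Hypothetical CSV input with a header line.
csv='id,name,score
1,ann,90
2,bob,85'

# FS sets the input separator and OFS the output separator.
# NR is the running record number, NF the field count of the record.
listing=$(printf '%s\n' "$csv" | awk 'BEGIN { FS = ","; OFS = "|" }
NR > 1 { print NR, NF, $2, $3 }')
echo "$listing"

# ENVIRON gives read access to environment variables by name.
greeting=$(GREETING=hello awk 'BEGIN { print ENVIRON["GREETING"] }')
echo "$greeting"
```

The NR > 1 pattern skips the header line, so the listing starts at record 2.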

*awk built-in string functions

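-----------------------------------------------------
gsub(r, s)        replace every match of r in $0 with s
gsub(r, s, t)     replace every match of r in t with s
index(s, t)       return the position of the first occurrence of string t in s (0 if absent)
length(s)         return the length of s
match(s, r)       return the position in s where r first matches (0 if no match)
split(s, a, fs)   split s into array a on the separator fs
sprintf(fmt, exp) return exp formatted according to fmt
sub(r, s)         replace the leftmost longest match of r in $0 with s
substr(s, p)      return the suffix of s starting at position p
substr(s, p, n)   return the substring of s starting at position p with length n
-----------------------------------------------------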
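The functions above can be tried out in a single BEGIN block, which needs no input file. This is only a sketch; the strings and the names s, t, n and parts are arbitrary choices for the example.

```shell
result=$(awk 'BEGIN {
    s = "hello world"
    print length(s)                        # 11
    print index(s, "world")                # 7
    print substr(s, 7)                     # world
    print substr(s, 1, 5)                  # hello
    print match(s, /o+/), RSTART, RLENGTH  # 5 5 1
    n = split("a:b:c", parts, ":")         # split returns the element count
    print n, parts[2]                      # 3 b
    t = s; sub(/o/, "0", t); print t       # hell0 world (first match only)
    t = s; gsub(/o/, "0", t); print t      # hell0 w0rld (every match)
    print sprintf("%05.1f", 3.14159)       # 003.1
}')
echo "$result"
```

sub and gsub modify their target in place and return the number of replacements, which is why the example copies s into t before calling them.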

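$1, $2, ... are built-in variables that hold the first field, the second field, and so on; $0 holds the entire record.

BEGIN is executed first; once awk has read all input lines, END is executed (if present).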

And now for a grand example:

# This awk program collects statistics on two
# "random variables" and the relationships
# between them. It looks only at fields 1 and
# 2 by default. Define the variables F and G
# on the command line to force it to look at
# different fields. For example:
#   awk -f stat_2o1.awk F=2 G=3 stuff.dat \
#       F=3 G=5 otherstuff.dat
# or, from standard input:
#   awk -f stat_2o1.awk F=1 G=3
# It ignores blank lines, lines where either
# one of the requested fields is empty, and
# lines whose first field starts with a number
# sign. It requires only one pass through the
# data. This script works with vanilla awk
# under SunOS 4.1.3.

BEGIN {
    F = 1;
    G = 2;
}

length($F) > 0 && \
length($G) > 0 && \
$1 !~ /^#/ {
    # Running sums: first and second moments plus the cross term.
    sx1 += $F; sx2 += $F*$F;
    sy1 += $G; sy2 += $G*$G;
    sxy1 += $F*$G;
    if (N == 0) xmax = xmin = $F;
    if (xmin > $F) xmin = $F;
    if (xmax < $F) xmax = $F;
    if (N == 0) ymax = ymin = $G;
    if (ymin > $G) ymin = $G;
    if (ymax < $G) ymax = $G;
    N++;
}

END {
    printf("%d # N\n", N);
    if (N <= 1) {
        printf("What's the point?\n");
        exit 1;
    }
    printf("%g # xmin\n", xmin);
    printf("%g # xmax\n", xmax);
    printf("%g # xmean\n", xmean = sx1/N);
    xSigma = sx2 - 2 * xmean * sx1 + N*xmean*xmean;
    printf("%g # xvar\n", xvar = xSigma/N);
    printf("%g # xvar unbiased\n", xvaru = xSigma/(N-1));
    printf("%g # xstddev\n", sqrt(xvar));
    printf("%g # xstddev unbiased\n", sqrt(xvaru));
    printf("%g # ymin\n", ymin);
    printf("%g # ymax\n", ymax);
    printf("%g # ymean\n", ymean = sy1/N);
    ySigma = sy2 - 2 * ymean * sy1 + N*ymean*ymean;
    printf("%g # yvar\n", yvar = ySigma/N);
    printf("%g # yvar unbiased\n", yvaru = ySigma/(N-1));
    printf("%g # ystddev\n", sqrt(yvar));
    printf("%g # ystddev unbiased\n", sqrt(yvaru));
    if (xSigma * ySigma <= 0)
        r = 0;
    else
        r = (sxy1 - xmean*sy1 - ymean * sx1 + N * xmean * ymean) \
            / sqrt(xSigma * ySigma);
    printf("%g # correlation coefficient\n", r);
    if (r > 1 || r < -1) {
        printf("SERIOUS ERROR! CORRELATION COEFFICIENT");
        printf(" OUTSIDE RANGE -1..1\n");
    }
    if (1 - r*r != 0) {
        printf("%g # Student's T (use with N-2 degfreed)\n",
               t = r * sqrt((N-2) / (1 - r*r)));
    } else {
        printf("0 # Correlation is perfect,");
        printf(" Student's T is plus infinity\n");
    }
    # Least-squares line y = a + b*x.
    b = (sxy1 - ymean * sx1) / (sx2 - xmean * sx1);
    a = ymean - b * xmean;
    ss = sy2 - 2*a*sy1 - 2*b*sxy1 + N*a*a + 2*a*b*sx1 + b*b*sx2;
    ss /= N-2;
    printf("%g # a = y-intercept\n", a);
    printf("%g # b = slope\n", b);
    printf("%g # s^2 = unbiased estimator for sigsq\n", ss);
    printf("%g + %g * x # equation ready for cut-and-paste\n", a, b);
    ra = sqrt(ss * sx2 / (N * xSigma));
    rb = sqrt(ss / xSigma);
    printf("%g # radius of confidence interval for a, multiply by t\n", ra);
    printf("%g # radius of confidence interval for b, multiply by t\n", rb);
}
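As a sanity check on the regression part, the same running sums can be applied to a tiny data set whose answer is known in advance. This is a reduced sketch, not the full script: the four points are invented so that they lie exactly on y = 1 + 2x, and only the slope and intercept formulas are reproduced.

```shell
# Four made-up points on the line y = 1 + 2x.
fit=$(printf '1 3\n2 5\n3 7\n4 9\n' | awk '
{ sx1 += $1; sx2 += $1*$1; sy1 += $2; sxy1 += $1*$2; N++ }
END {
    xmean = sx1 / N; ymean = sy1 / N
    b = (sxy1 - ymean * sx1) / (sx2 - xmean * sx1)   # slope
    a = ymean - b * xmean                            # y-intercept
    printf("%g + %g * x\n", a, b)
}')
echo "$fit"
```

Here sx1 = 10, sx2 = 30, sy1 = 24 and sxy1 = 70, so b = (70 - 60) / (30 - 25) = 2 and a = 6 - 2 * 2.5 = 1, and the sketch prints "1 + 2 * x".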
