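awk [-F field-separator] 'commands' input-file(s)
Here, commands is the actual awk program, [-F field-separator] is optional, and input-file(s) are the files to process. In awk, each item on a line that is delimited by the field separator is called a field. When -F is not specified, the default field separator is whitespace: any run of spaces or tabs.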
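A minimal sketch of the difference between the default separator and an explicit -F, using made-up input lines (the names and values are illustrative only):

```shell
# Default separator: any run of spaces or tabs delimits fields.
default_out=$(printf 'alice  30\tparis\n' | awk '{print $2}')
echo "$default_out"

# Explicit separator: -F: splits on colons instead of whitespace.
colon_out=$(printf 'alice:30:paris\n' | awk -F: '{print $3}')
echo "$colon_out"
```

The first command picks the second whitespace-delimited field; the second picks the third colon-delimited field.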
scan 'shortUrl',{COLUMN=>['su:customerId','su:postId'], LIMIT=>10}
echo "scan 'foo'" | ./hbase shell > myText
echo "scan 'registration',{COLUMNS=>'registration:status'}" | hbase shell | grep "^ " > registration.txt
echo "scan 'shortUrl',{COLUMN=>['su:customerId','su:postId'], LIMIT=>10}" | ./hbase shell > myText
echo "scan 'shortUrl',{COLUMN=>['su:customerId','su:postId']}" | ./hbase shell > myText
awk '{print $1,substr($4,7)}' file | awk '{if (NR%2==0){print " "$2} else {printf "%s", $0}}'
awk '{print $1,substr($4,7)}' myText | awk '{if (NR%2==0){print " "$2} else {printf "%s", $0}}' > myText2
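To see what the two-stage pipeline does, here is a sketch run on hypothetical data. It assumes the scan output has already been reduced to cell lines of the form " row column=..., timestamp=..., value=...", with exactly two cells per row; the row keys and values below are invented for illustration.

```shell
# Hypothetical sample: two cells per row, shaped like hbase shell scan
# output after header and footer lines have been filtered out.
sample=' row1 column=su:customerId, timestamp=1421990000000, value=c100
 row1 column=su:postId, timestamp=1421990000000, value=p200
 row2 column=su:customerId, timestamp=1421990000001, value=c101
 row2 column=su:postId, timestamp=1421990000001, value=p201'

# Stage 1: keep the row key and strip the 6-character "value=" prefix from field 4.
# Stage 2: join each pair of lines; odd lines are printed without a newline,
# even lines append only their value. printf "%s" keeps a literal % in the
# data from being interpreted as a format specifier.
merged=$(printf '%s\n' "$sample" |
  awk '{print $1, substr($4, 7)}' |
  awk '{if (NR % 2 == 0) {print " " $2} else {printf "%s", $0}}')
echo "$merged"
```

Each output line then holds the row key followed by the two column values. The pairing relies on line parity, so any extra line in the input (a banner or a row with a missing cell) would shift the pairs.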
Find the maximum length of a given field across all lines of a file:
awk '{ if (length($1) > maxlength) maxlength = length($1); print NR, $1, maxlength, length($1) } END { print maxlength }' GIWEB_20150123131134_046_001150url4.dat
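If only the final maximum is needed, the per-line trace can be dropped. A sketch on a small made-up sample (the variable names and input lines are assumptions, not taken from the .dat file):

```shell
# Hypothetical three-line input; field 1 has lengths 2, 5 and 3.
sample='ab 1
abcde 2
abc 3'

# Track the longest first field seen so far; "+ 0" makes the result
# print as 0 rather than an empty string when the input is empty.
max=$(printf '%s\n' "$sample" |
  awk 'length($1) > max { max = length($1) } END { print max + 0 }')
echo "$max"
```

To measure a different column, replace $1 with the wanted field number, and add -F if the file is not whitespace-delimited.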