我有点像shell脚本和awk的新手.任何人都可以建议一个更有效和优雅的解决方案,我正在做什么下面执行两个文件之间的密钥查找?
两个输入文件:
文件1 – 包含单个列键字段(server-metricname-minute):
key_column
server026-AckDelayAverage-00:01:00
server026-AckDelayMax-00:01:00
server026-AckSent-00:01:00
server026-DigEnvValidationLatestTime-00:01:00
server026-DigEnvValidationTimeAverage-00:01:00
文件2 – 以逗号分隔,包含关键字段和其他字段的数量
key_column,host,date,minute,metricname, metric value
server026-AckDelayAverage-00:01:00,server026,May 24 2016,00:01:00,AckDelayAverage,942
server026-AckDelayMax-00:01:00,server026,May 24 2016,00:01:00,AckDelayMax,5855
server026-AckSent-00:01:00,server026,May 24 2016,00:01:00,AckSent,49038
我的逻辑是:
Loop through file1
If key found in File2
print file1.key , file2.field3 , file2.field6 to file3
else
print file1.key + 'KEY_NOT_FOUND' text to file3
fi
因此file3输出应该为file1中的每个记录都有一行.
下面的代码似乎有效,但任何人都可以建议一种更有效和更优雅的方法来实现这一目标吗?
while read key ;
do
metric_found=`grep $key file2`
if [[ ! -z $metric_found ]]
then
echo ${metric_found} | awk -F "," '{print $1",$3,"$6}'
else
echo ${key},KEY_NOT_FOUND
fi
done < file1
基于示例数据从现有脚本输出的示例:
server026-AckDelayAverage-00:01:00,May 24 2016,942
server026-AckDelayMax-00:01:00,May 24 2016,5855
server026-AckSent-00:01:00,May 24 2016,49038
server026-DigEnvValidationLatestTime-23:59:00,KEY_NOT_FOUND
server026-DigEnvValidationTimeAverage-23:59:00,KEY_NOT_FOUND
谢谢..
解决方法:
试试这个:
awk 'BEGIN{FS=OFS=","}NR==FNR{a[$1]=1;b[$1]=$3;c[$1]=$6;}NR>FNR{if (a[$1]) print $1,b[$1],c[$1]; else print $1,"KEY_NOT_FOUND";}' file2 file1 > file3
标签:linux,shell,awk
来源: https://codeday.me/bug/20190824/1704510.html