bash脚本实例-linux性能数据清洗-1

1. 目标

性能采集脚本将用uptime、vmstat、free、iostat命令采集到的数据记录到了一个日志文件中。本脚本就是将按照不同的标签把数据提取出来,加上标题栏保存为csv文件,方便进一步的分析。

性能数据记录文件。文件的第一列文件时间,一分钟一组数据,第二列为标签,grep这个标签就可以分离数据了。

oliver@bigdatadev:~/_src/testing/analyse/data-20160723$ head os_20160723.log
20160723-00:00:01 uptime	 00:00:01 up 3 days, 22:26,  0 users,  load average: 0.31, 0.42, 0.43
20160723-00:00:01 vmstat	 3  0      0 5608676 4083744 13327436    0    0     6    45   15   13  1  4 95  0  0
20160723-00:00:01 free 32186 26709 5477 0 3988 13015
20160723-00:00:01 iostat-cpu	           1.47    0.01    3.93    0.05    0.00   94.55
20160723-00:00:01 iostat-io	hda               0.95     5.34  0.15  1.33    10.56    53.38    43.19     0.00    3.20   0.46   0.07
20160723-00:00:01 iostat-io	hda1              0.95     5.34  0.15  1.33    10.56    53.38    43.19     0.00    3.20   0.46   0.07
20160723-00:00:01 iostat-io	hdb               0.00     0.00  0.00  0.00     0.01     0.00    27.94     0.00    2.35   2.35   0.00
20160723-00:00:01 iostat-io	hdc               0.06    29.25  0.37  8.49    34.00   302.00    37.92     0.14   15.97   0.52   0.46
20160723-00:01:01 uptime	 00:01:01 up 3 days, 22:27,  0 users,  load average: 0.29, 0.40, 0.43
20160723-00:01:01 vmstat	 3  0      0 5605176 4084260 13328336    0    0     6    44   15   13  1  4 95  0  0
oliver@bigdatadev:~/_src/testing/analyse/data-20160723$ 

2.数据清洗脚本

用grep匹配数据实现抽取;用awk实现删除列、增加标题行等操作;用case实现选择,每一类数据对应一个实际的抽取函数。

#!/bin/bash

label=$1
infile=$2
outfile=$3

function etl_os_uptime(){
    grep "$label" $infile | awk '{ if(NR==1){print "datetime,loadavg_1,loadavg_5,loadavg_15" };\
                                   if(NF==14){print $1 "," $12 $13 $14}else if(NF==15){ print $1 "," $13 $14 $15} \
                                 }'  > $outfile
}

function etl_os_vmstat(){
    grep "$label" $infile | awk '{ if(NR==1){print "datetime,r,b,swpd,free,buff,cache,si,so,bi,bo,in,cs,us,sy,id,wa,st" };\
                                   print $1 "," $3 "," $4 "," $5 "," $6 "," $7 "," $8 "," $9 "," $10 "," \
                                         $11 "," $12 "," $13 "," $14 "," $15 "," $16 "," $17 "," $18 "," $19 ","
                                 }'  > $outfile
}
function etl_os_free(){
    grep "$label" $infile | awk '{ if(NR==1){print "datetime,total,used,free,shared,buffers,cached" };\
                                   print $1 "," $3 "," $4 "," $5 "," $6 "," $7 "," $8\
                                 }'  > $outfile
}

function etl_os_iostatcpu(){
    grep "$label" $infile | awk '{ if(NR==1){print "datetime,total,user,nice,system,iowait,steal,idle" };\
                                   print $1 "," $3 "," $4 "," $5 "," $6 "," $7 "," $8\
                                 }'  > $outfile
}

function etl_os_iostatio(){
    grep "$label" $infile | awk '{ if(NR==1){print "datetime,Device,rrqm/s,wrqm/s,r/s,w/s,rsec/s,wsec/s,avgrq-sz,avgqu-sz,await,svctm,util" };\
                                   print $1 "," $3 "," $4 "," $5 "," $6 "," $7 "," $8 "," \
                                         $9 "," $10 "," $11 "," $12 "," $13 "," $14
                                 }'  > $outfile
}

case $1 in
"uptime")
    etl_os_uptime
    ;;
"vmstat")
    etl_os_vmstat
    ;;
"free")
    etl_os_free
    ;;
"iostat-cpu")
    etl_os_iostatcpu
    ;;
"iostat-io")
    etl_os_iostatio
    ;;
*)
    echo "It's not be supported."
    ;;
esac
2.运行效果

../etl.sh iostat-io os_20160723.log  os_iostatio_20160723.log 
oliver@bigdatadev:~/_src/testing/analyse/data-20160723$ head os_iostatio_20160723.log 
datetime,Device,rrqm/s,wrqm/s,r/s,w/s,rsec/s,wsec/s,avgrq-sz,avgqu-sz,await,svctm,util
20160723-00:00:01,hda,0.95,5.34,0.15,1.33,10.56,53.38,43.19,0.00,3.20,0.46,0.07
20160723-00:00:01,hda1,0.95,5.34,0.15,1.33,10.56,53.38,43.19,0.00,3.20,0.46,0.07
20160723-00:00:01,hdb,0.00,0.00,0.00,0.00,0.01,0.00,27.94,0.00,2.35,2.35,0.00
20160723-00:00:01,hdc,0.06,29.25,0.37,8.49,34.00,302.00,37.92,0.14,15.97,0.52,0.46
20160723-00:01:01,hda,0.95,5.34,0.15,1.33,10.56,53.38,43.19,0.00,3.20,0.46,0.07
20160723-00:01:01,hda1,0.95,5.34,0.15,1.33,10.55,53.38,43.19,0.00,3.20,0.46,0.07
20160723-00:01:01,hdb,0.00,0.00,0.00,0.00,0.01,0.00,27.94,0.00,2.35,2.35,0.00
20160723-00:01:01,hdc,0.06,29.25,0.37,8.49,33.99,301.99,37.92,0.14,15.98,0.52,0.46
20160723-00:02:01,hda,0.95,5.34,0.15,1.33,10.55,53.38,43.18,0.00,3.20,0.46,0.07
oliver@bigdatadev:~/_src/testing/analyse/data-20160723$ 




  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值