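When testing server storage, you will see many people write fio scripts that use the minimal (terse) output format, because its one-line, semicolon-separated result is very convenient for scripts (shell, Python) to parse. This article shows how to parse such logs with Python.
1 Customers usually ask for the following values in the results: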
bandwidth
iops
clat_p99
clat_p999
clat_p9999
lat_max
lat_avg
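So we need to know where each value sits in the output line. The numbers below are 1-based field positions (read fields first, then write fields):
7 read_bandwidth_kb read bandwidth
8 read_iops read IOPS
30 read_clat_pct13 read p99
32 read_clat_pct15 read p99.9
34 read_clat_pct17 read p99.99
39 read_lat_max_us read max latency
40 read_lat_mean_us read mean latency
48 write_bandwidth_kb write bandwidth
49 write_iops write IOPS
71 write_clat_pct13 write p99
73 write_clat_pct15 write p99.9
75 write_clat_pct17 write p99.99
80 write_lat_max_us write max latency
81 write_lat_mean_us write mean latency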
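Because a Python list produced by `split(";")` is 0-based, every index used in the scripts later on is the field number minus one. A minimal sketch of that conversion follows; the names `READ_FIELDS` and `to_index` are only illustrative and are not part of fio:

```python
# 1-based field numbers copied from the list above (read side only).
READ_FIELDS = {
    "read_bandwidth_kb": 7,
    "read_iops": 8,
    "read_clat_pct13": 30,   # p99
    "read_clat_pct15": 32,   # p99.9
    "read_clat_pct17": 34,   # p99.99
    "read_lat_max_us": 39,
    "read_lat_mean_us": 40,
}

def to_index(field_number):
    """Convert a 1-based field number to a 0-based list index."""
    return field_number - 1

indexes = {name: to_index(n) for name, n in READ_FIELDS.items()}
print(indexes["read_bandwidth_kb"], indexes["read_lat_mean_us"])
```

This is why the parsing code reads bandwidth from index 6 and mean latency from index 39.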
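2 First write a script that runs fio, taking an Intel P5520 NVMe drive as the example.
Because we only run sequential workloads, no preconditioning is needed.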
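#!/bin/bash
# Two arguments are required: the NVMe device name and the runtime per test case.
if [ $# -ne 2 ]; then
  echo "Usage: $0 <nvme device, e.g. nvme2n1> <runtime of each test case in seconds>"
  exit 1
fi
nvme=$1
time=$2
date
echo " sequential write test start"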
for ioengine in libaio; do
  for rw in write; do
    for bs in 128k; do
      for iodepth in 16; do
        for jobs in 1; do
          date
          echo "$nvme $rw $bs iodepth=$iodepth numjobs=$jobs test start"
          job_name="${ioengine}_${bs}B_${rw}_${jobs}job_${iodepth}QD"
          fio --name=$job_name --filename=/dev/$nvme --ioengine=${ioengine} --direct=1 --thread=1 --numjobs=${jobs} --iodepth=${iodepth} --rw=${rw} --bs=${bs} --runtime=$time --time_based=1 --size=100% --group_reporting --minimal >> "${nvme}_write.log"
        done
      done
    done
  done
done
echo " sequential read test start"
for ioengine in libaio; do
  for rw in read; do
    for bs in 128k; do
      for iodepth in 16; do
        for jobs in 1; do
          date
          echo "$nvme $rw $bs iodepth=$iodepth numjobs=$jobs test start"
          job_name="${ioengine}_${bs}B_${rw}_${jobs}job_${iodepth}QD"
          fio --name=$job_name --filename=/dev/$nvme --ioengine=${ioengine} --direct=1 --thread=1 --numjobs=${jobs} --iodepth=${iodepth} --rw=${rw} --bs=${bs} --runtime=$time --time_based=1 --size=100% --group_reporting --minimal >> "${nvme}_read.log"
        done
      done
    done
  done
done
echo " sequential mix test"
date
for ioengine in libaio; do
  for rw in rw; do
    for mixread in 80; do
      for blk_size in 1024k; do
        for jobs in 1; do
          for queue_depth in 16; do
            job_name="${ioengine}_${rw}_${blk_size}B_mix_read${mixread}_${jobs}job_QD${queue_depth}"
            echo "$nvme $job_name test start"
            fio --name=${job_name} --filename=/dev/$nvme --ioengine=libaio --direct=1 --thread=1 --numjobs=${jobs} --iodepth=${queue_depth} --rw=${rw} --bs=${blk_size} --rwmixread=$mixread --runtime=$time --time_based=1 --size=100% --group_reporting --minimal >> "${nvme}_mix_data.log"
          done
        done
      done
    done
  done
done
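mkdir -p "$nvme/test_data" "$nvme/test_log"
mv "$nvme"*.log "$nvme/test_log"
# No CSV files exist until the Python parser is called from this script (step 4),
# so until then this mv only prints a "No such file" error and does no harm.
mv "$nvme"*.csv "$nvme/test_data"
echo "test has been finished"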
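Each nested loop above iterates over a list of values (a single value for now), so together they describe one parameter grid. As a rough illustration only, written in Python rather than bash, the same sweep can be generated with `itertools.product`. The helper `build_fio_cmd` is hypothetical; the flags are the ones already used in the script above, and the sketch only builds the command lines without running fio:

```python
import itertools

def build_fio_cmd(nvme, runtime, ioengine, rw, bs, iodepth, jobs):
    """Return (job_name, argv) mirroring the fio call in the shell script."""
    job_name = f"{ioengine}_{bs}B_{rw}_{jobs}job_{iodepth}QD"
    argv = [
        "fio", f"--name={job_name}", f"--filename=/dev/{nvme}",
        f"--ioengine={ioengine}", "--direct=1", "--thread=1",
        f"--numjobs={jobs}", f"--iodepth={iodepth}", f"--rw={rw}",
        f"--bs={bs}", f"--runtime={runtime}", "--time_based=1",
        "--size=100%", "--group_reporting", "--minimal",
    ]
    return job_name, argv

# One list per loop variable, in the same order as the nested loops.
grid = itertools.product(["libaio"], ["write"], ["128k"], [16], [1])
commands = [build_fio_cmd("nvme2n1", 10, *combo) for combo in grid]
for name, argv in commands:
    print(name, " ".join(argv))
```

Adding a value to any list (for example a second block size) multiplies the number of generated commands, exactly as adding a value to one of the shell loops does.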
Run the script as follows:
bash fio.sh nvme2n1 10
Take the mixed read/write log as an example and look at it first:
root@bytedance:/home/csdn/nvme2n1/test_log# cat nvme2n1_mix_data.log
3;fio-3.33;libaio_rw_1024kB_mix_read80_1job_QD16;0;0;2363392;235163;229;10050;13;67;30.704325;9.349805;40273;89971;53092.300995;4951.654762;1.000000%=43778;5.000000%=45875;10.000000%=47448;20.000000%=49020;30.000000%=50069;40.000000%=51642;50.000000%=52690;60.000000%=53739;70.000000%=55312;80.000000%=56885;90.000000%=59506;95.000000%=61603;99.000000%=64225;99.500000%=65798;99.900000%=86507;99.950000%=87556;99.990000%=89653;0%=0;0%=0;0%=0;40309;89994;53123.005321;4951.633129;186368;272384;99.934088%;235008.000000;20316.314295;609280;60624;59;10050;20;87;51.591739;12.444604;50703;113342;63317.704111;5518.856446;1.000000%=53215;5.000000%=54788;10.000000%=56360;20.000000%=58982;30.000000%=60555;40.000000%=61603;50.000000%=63176;60.000000%=64749;70.000000%=65798;80.000000%=67633;90.000000%=70778;95.000000%=71827;99.000000%=74973;99.500000%=76021;99.900000%=113770;99.950000%=113770;99.990000%=113770;0%=0;0%=0;0%=0;50745;113381;63369.295850;5518.110925;49152;75776;100.000000%;60723.200000;6977.609858;0.308488%;1.194149%;2846;0;1;0.1%;0.1%;0.1%;0.3%;99.5%;0.0%;0.0%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;0.00%;22.74%;77.23%;0.03%;0.00%;0.00%;0.00%;0.00%;0.00%;nvme2n1;2599;21180;0;0;137065;1316093;1441336;100.00%
root@bytedance:/home/csdn/nvme2n1/test_log#
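Note that the percentile fields in this line are not plain numbers: each one has the form `percentile%=value`, so the latency has to be taken from the right-hand side of the `=`. Splitting such a field on `.` would return the percentile label (`99`) instead of the latency. A small sketch, where `parse_percentile` is an illustrative helper name:

```python
def parse_percentile(field):
    """Split a percentile field such as '99.000000%=64225' into (percentile, value)."""
    label, value = field.split("=")
    return float(label.rstrip("%")), int(value)

# Field copied from the sample log line above (read completion latency, p99).
print(parse_percentile("99.000000%=64225"))
```

The unused percentile slots (`0%=0`) parse the same way and simply yield zeros.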
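3 Now parse this log with Python. We need the built-in open() function for file access, and the script imports the sys and argparse modules (only sys.argv is actually used below):
#!/usr/bin/env python3
import argparse
import sys
inputfile = sys.argv[1]
resultfile = sys.argv[2]
rwm = sys.argv[3]
sys.argv[n] is the n-th command-line argument passed to the Python script. Three arguments are used here:
inputfile: the fio log file to open
resultfile: the file the parsed results are written to
rwm: the read/write mode, one of read, write, or mix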
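Since argparse is imported anyway, the same three arguments could also be declared with it, which gives a usage message and rejects an unknown mode. This is only a sketch of an alternative, not what the scripts in this article do; the function name `parse_args` and the help strings are my own:

```python
import argparse

def parse_args(argv=None):
    """Parse the three positional arguments; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Convert fio --minimal logs to CSV")
    parser.add_argument("inputfile", help="fio log file to read")
    parser.add_argument("resultfile", help="CSV file to append to")
    parser.add_argument("rwm", choices=["read", "write", "mix"], help="read/write mode")
    return parser.parse_args(argv)

# Explicit list used here so the example runs without real command-line arguments.
args = parse_args(["nvme2n1_mix_data.log", "nvme2n1_mix.csv", "mix"])
print(args.inputfile, args.resultfile, args.rwm)
```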
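(1) First process nvme2n1_mix_data.log with open().
Split the content on newlines. Every workload ends with a newline, so a log with 4 workloads contains 4 newlines and splitting it yields a list of 5 elements, the last of which is an empty string. pop(-1) removes that empty last element.
#!/usr/bin/env python3
import argparse
import sys
inputfile = "nvme2n1_mix_data.log"
resultfile = "nvme2n1_mix.csv"
rwm = "read"
datastr = open(inputfile).read()
data = datastr.split('\n')
data.pop(-1)  # drop the empty string left after the final newline
print(len(data))  # number of workloads found in the log
Execution result: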
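The split-then-pop behaviour can be checked on a tiny string without any fio log. The sketch below uses a made-up two-line string as a stand-in for a log and also shows `splitlines()`, which avoids the empty tail in the first place:

```python
text = "line1\nline2\n"      # stand-in for a log holding two workloads
parts = text.split("\n")
print(parts)                  # the last element is an empty string
parts.pop(-1)                 # same cleanup as in the script above

# splitlines() produces no empty tail; the filter also skips blank lines.
lines = [ln for ln in text.splitlines() if ln.strip()]
print(parts == lines)
```

Note that pop(-1) is only correct when the file really ends with a newline; filtering blank lines does not depend on that.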
(2) The second step is to use open() to write the header row of the output table:
#!/usr/bin/env python3
import argparse
import sys
inputfile = "nvme2n1_mix_data.log"
resultfile = "nvme2n1_mix.csv"
rwm = "read"
datastr = open(inputfile).read()
data = datastr.split('\n')
data.pop(-1)
if rwm == "read":
    # "a" appends, so running the script again adds another header row.
    with open(resultfile, "a") as f:
        f.write("workload,bandwidth(kb/s),IOPS,lat-avg(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
The generated table looks like this:
(3) The third step is to append the data from the log to the CSV:
#!/usr/bin/env python3
import argparse
import sys
inputfile = "nvme2n1_mix_data.log"
resultfile = "nvme2n1_mix.csv"
rwm = "read"
datastr = open(inputfile).read()
data = datastr.split('\n')
data.pop(-1)
if rwm == "read":
    with open(resultfile, "a") as f:
        f.write("workload,bandwidth(kb/s),IOPS,lat-avg(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
        for i in data:
            oneCaseList = i.split(";")
            caseName = oneCaseList[2]
            # List indices are 0-based, i.e. field number minus one.
            oneCaseRes = ",".join([
                caseName,
                oneCaseList[6], oneCaseList[7],   # bandwidth, IOPS
                oneCaseList[39].split(".")[0],    # lat mean (integer part)
                oneCaseList[29].split("=")[1],    # clat p99
                oneCaseList[31].split("=")[1],    # clat p99.9
                oneCaseList[33].split("=")[1],    # clat p99.99
                oneCaseList[38].split(".")[0],    # lat max
            ]) + "\n"
            f.write(oneCaseRes)
The result is shown in the figure below:
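Building rows by joining strings with commas works here because none of the values contains a comma. As an alternative sketch (not what the script above does), Python's standard `csv` module handles the separators and any quoting itself. The example writes to an in-memory buffer instead of a real file and uses a shortened header with three values taken from the sample log:

```python
import csv
import io

header = ["workload", "bandwidth(kb/s)", "IOPS"]
# Job name, read bandwidth and read IOPS from the sample log line.
row = ["libaio_rw_1024kB_mix_read80_1job_QD16", "235163", "229"]

buf = io.StringIO()   # stand-in for open(resultfile, "a", newline="")
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(header)
writer.writerow(row)
print(buf.getvalue())
```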
(4) That was the read data. Now look at the write data. The write fields sit 41 positions after the corresponding read fields:
#!/usr/bin/env python3
import argparse
import sys
inputfile = "nvme2n1_mix_data.log"
resultfile = "nvme2n1_mix.csv"
rwm = "write"
datastr = open(inputfile).read()
data = datastr.split('\n')
data.pop(-1)
if rwm == "read":
    with open(resultfile, "a") as f:
        f.write("workload,bandwidth(kb/s),IOPS,lat-avg(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
        for i in data:
            oneCaseList = i.split(";")
            caseName = oneCaseList[2]
            oneCaseRes = ",".join([
                caseName,
                oneCaseList[6], oneCaseList[7],   # bandwidth, IOPS
                oneCaseList[39].split(".")[0],    # lat mean (integer part)
                oneCaseList[29].split("=")[1],    # clat p99
                oneCaseList[31].split("=")[1],    # clat p99.9
                oneCaseList[33].split("=")[1],    # clat p99.99
                oneCaseList[38].split(".")[0],    # lat max
            ]) + "\n"
            f.write(oneCaseRes)
if rwm == "write":
    with open(resultfile, "a") as f:
        f.write("workload,bandwidth(kb/s),IOPS,lat-avg(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
        for i in data:
            oneCaseList = i.split(";")
            caseName = oneCaseList[2]
            oneCaseRes = ",".join([
                caseName,
                oneCaseList[47], oneCaseList[48],  # bandwidth, IOPS
                oneCaseList[80].split(".")[0],     # lat mean (integer part)
                oneCaseList[70].split("=")[1],     # clat p99 (value after "=")
                oneCaseList[72].split("=")[1],     # clat p99.9
                oneCaseList[74].split("=")[1],     # clat p99.99
                oneCaseList[79].split(".")[0],     # lat max
            ]) + "\n"
            f.write(oneCaseRes)
The result is as follows:
(5) Because this is mixed read/write data, the read and write values have to go into the same CSV row:
#!/usr/bin/env python3
import argparse
import sys
inputfile = "nvme2n1_mix_data.log"
resultfile = "nvme2n1_mix.csv"
rwm = "mix"
datastr = open(inputfile).read()
data = datastr.split('\n')
data.pop(-1)
if rwm == "read":
    with open(resultfile, "a") as f:
        f.write("workload,bandwidth(kb/s),IOPS,lat-avg(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
        for i in data:
            oneCaseList = i.split(";")
            caseName = oneCaseList[2]
            oneCaseRes = ",".join([
                caseName,
                oneCaseList[6], oneCaseList[7],   # bandwidth, IOPS
                oneCaseList[39].split(".")[0],    # lat mean (integer part)
                oneCaseList[29].split("=")[1],    # clat p99
                oneCaseList[31].split("=")[1],    # clat p99.9
                oneCaseList[33].split("=")[1],    # clat p99.99
                oneCaseList[38].split(".")[0],    # lat max
            ]) + "\n"
            f.write(oneCaseRes)
if rwm == "write":
    with open(resultfile, "a") as f:
        f.write("workload,bandwidth(kb/s),IOPS,lat-avg(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
        for i in data:
            oneCaseList = i.split(";")
            caseName = oneCaseList[2]
            oneCaseRes = ",".join([
                caseName,
                oneCaseList[47], oneCaseList[48],  # bandwidth, IOPS
                oneCaseList[80].split(".")[0],     # lat mean (integer part)
                oneCaseList[70].split("=")[1],     # clat p99 (value after "=")
                oneCaseList[72].split("=")[1],     # clat p99.9
                oneCaseList[74].split("=")[1],     # clat p99.99
                oneCaseList[79].split(".")[0],     # lat max
            ]) + "\n"
            f.write(oneCaseRes)
if rwm == "mix":
    with open(resultfile, "a") as f:
        f.write("workload,read,bandwidth(kb/s),IOPS,Latency(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us),write,bandwidth(kb/s),IOPS,Latency(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
        for i in data:
            oneCaseList = i.split(";")
            caseName = oneCaseList[2]
            oneCaseRes = ",".join([
                caseName,
                "read",
                oneCaseList[6], oneCaseList[7],    # read bandwidth, IOPS
                oneCaseList[39].split(".")[0],     # read lat mean
                oneCaseList[29].split("=")[1],     # read clat p99 (value after "=")
                oneCaseList[31].split("=")[1],     # read clat p99.9
                oneCaseList[33].split("=")[1],     # read clat p99.99
                oneCaseList[38].split(".")[0],     # read lat max
                "write",
                oneCaseList[47], oneCaseList[48],  # write bandwidth, IOPS
                oneCaseList[80].split(".")[0],     # write lat mean
                oneCaseList[70].split("=")[1],     # write clat p99 (value after "=")
                oneCaseList[72].split("=")[1],     # write clat p99.9
                oneCaseList[74].split("=")[1],     # write clat p99.99
                oneCaseList[79].split(".")[0],     # write lat max
            ]) + "\n"
            f.write(oneCaseRes)
The result is shown in the figure below:
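Comparing the indices used for read and write shows a constant shift: 47 - 6, 48 - 7, 70 - 29, 79 - 38 and 80 - 39 are all 41. That makes it possible to describe the read-side fields once and reuse them for write. The following is a sketch of that idea, not the article's script; `READ_IDX`, `WRITE_OFFSET` and `extract` are illustrative names, and the input is a synthetic list in which every field carries its own index so the mapping is visible:

```python
# 0-based indices of the read-side fields used in this article.
READ_IDX = {"bw": 6, "iops": 7, "p99": 29, "p999": 31, "p9999": 33,
            "lat_max": 38, "lat_avg": 39}
WRITE_OFFSET = 41   # write index = read index + 41

def extract(fields, side):
    """Return the selected metrics for side 'read' or 'write' as strings."""
    shift = WRITE_OFFSET if side == "write" else 0
    out = {}
    for name, idx in READ_IDX.items():
        raw = fields[idx + shift]
        if name.startswith("p"):
            out[name] = raw.split("=")[1]   # 'pct%=value' -> value
        else:
            out[name] = raw.split(".")[0]   # keep the integer part
    return out

# Synthetic stand-in for one split log line (not real fio output).
PCT = (29, 31, 33, 70, 72, 74)
fields = [f"x%={i}" if i in PCT else f"{i}.5" for i in range(130)]
print(extract(fields, "read"))
print(extract(fields, "write"))
```

With this shape, a mixed row is just the read dictionary followed by the write dictionary, and a wrong split on one percentile field cannot slip into only one of the branches.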
(6) Wrap the logic in a function that takes the three parameters (inputfile, resultfile, rwm). The final Python script is:
#!/usr/bin/env python3
import argparse
import sys
def fiodata(inputfile, resultfile, rwm):
    # Parameter names match the names used in the body.
    datastr = open(inputfile).read()
    data = datastr.split('\n')
    data.pop(-1)  # drop the empty string left after the final newline
    if rwm == "read":
        with open(resultfile, "a") as f:
            f.write("workload,bandwidth(kb/s),IOPS,lat-avg(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
            for i in data:
                oneCaseList = i.split(";")
                caseName = oneCaseList[2]
                oneCaseRes = ",".join([
                    caseName,
                    oneCaseList[6], oneCaseList[7],   # bandwidth, IOPS
                    oneCaseList[39].split(".")[0],    # lat mean (integer part)
                    oneCaseList[29].split("=")[1],    # clat p99
                    oneCaseList[31].split("=")[1],    # clat p99.9
                    oneCaseList[33].split("=")[1],    # clat p99.99
                    oneCaseList[38].split(".")[0],    # lat max
                ]) + "\n"
                f.write(oneCaseRes)
    if rwm == "write":
        with open(resultfile, "a") as f:
            f.write("workload,bandwidth(kb/s),IOPS,lat-avg(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
            for i in data:
                oneCaseList = i.split(";")
                caseName = oneCaseList[2]
                oneCaseRes = ",".join([
                    caseName,
                    oneCaseList[47], oneCaseList[48],  # bandwidth, IOPS
                    oneCaseList[80].split(".")[0],     # lat mean (integer part)
                    oneCaseList[70].split("=")[1],     # clat p99 (value after "=")
                    oneCaseList[72].split("=")[1],     # clat p99.9
                    oneCaseList[74].split("=")[1],     # clat p99.99
                    oneCaseList[79].split(".")[0],     # lat max
                ]) + "\n"
                f.write(oneCaseRes)
    if rwm == "mix":
        with open(resultfile, "a") as f:
            f.write("workload,read,bandwidth(kb/s),IOPS,Latency(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us),write,bandwidth(kb/s),IOPS,Latency(us),Qos(us) 99%,Qos(us) 99_9%,Qos(us) 99_99%,lat-max(us)\n")
            for i in data:
                oneCaseList = i.split(";")
                caseName = oneCaseList[2]
                oneCaseRes = ",".join([
                    caseName,
                    "read",
                    oneCaseList[6], oneCaseList[7],    # read bandwidth, IOPS
                    oneCaseList[39].split(".")[0],     # read lat mean
                    oneCaseList[29].split("=")[1],     # read clat p99 (value after "=")
                    oneCaseList[31].split("=")[1],     # read clat p99.9
                    oneCaseList[33].split("=")[1],     # read clat p99.99
                    oneCaseList[38].split(".")[0],     # read lat max
                    "write",
                    oneCaseList[47], oneCaseList[48],  # write bandwidth, IOPS
                    oneCaseList[80].split(".")[0],     # write lat mean
                    oneCaseList[70].split("=")[1],     # write clat p99 (value after "=")
                    oneCaseList[72].split("=")[1],     # write clat p99.9
                    oneCaseList[74].split("=")[1],     # write clat p99.99
                    oneCaseList[79].split(".")[0],     # write lat max
                ]) + "\n"
                f.write(oneCaseRes)
inputfile = sys.argv[1]
resultfile = sys.argv[2]
rwm = sys.argv[3]
fiodata(inputfile, resultfile, rwm)
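The script indexes up to position 80 of every line, so a line that is not a complete record (for example a stray message that ended up in the log) would raise an IndexError. One possible guard, sketched under the assumption that every valid record starts with the format version `3` as in the sample log, is to filter lines before indexing them. The name `valid_terse_lines` and the threshold of 81 fields are my own choices:

```python
def valid_terse_lines(text, min_fields=81):
    """Yield split field lists for lines that look like complete records."""
    for line in text.splitlines():
        fields = line.split(";")
        # Field 0 is the format version; the article's indices go up to 80.
        if fields[0] == "3" and len(fields) >= min_fields:
            yield fields

# Synthetic input: one well-formed line, one junk line, one blank line.
sample = "3;" + ";".join(str(i) for i in range(1, 90)) + "\nnot a record\n\n"
good = list(valid_terse_lines(sample))
print(len(good), len(good[0]))
```

Looping over `valid_terse_lines(datastr)` instead of `data` would also make the pop(-1) step unnecessary.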
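4 Next, call the Python script (saved as min.py) from the shell script:
#!/bin/bash
# Two arguments are required: the NVMe device name and the runtime per test case.
if [ $# -ne 2 ]; then
  echo "Usage: $0 <nvme device, e.g. nvme2n1> <runtime of each test case in seconds>"
  exit 1
fi
nvme=$1
time=$2
date
echo " sequential write test start"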
for ioengine in libaio; do
  for rw in write; do
    for bs in 128k 256k; do
      for iodepth in 16; do
        for jobs in 1; do
          date
          echo "$nvme $rw $bs iodepth=$iodepth numjobs=$jobs test start"
          job_name="${ioengine}_${bs}B_${rw}_${jobs}job_${iodepth}QD"
          fio --name=$job_name --filename=/dev/$nvme --ioengine=${ioengine} --direct=1 --thread=1 --numjobs=${jobs} --iodepth=${iodepth} --rw=${rw} --bs=${bs} --runtime=$time --time_based=1 --size=100% --group_reporting --minimal >> "${nvme}_write.log"
        done
      done
    done
  done
done
# Convert the write log to CSV.
python3 min.py "${nvme}_write.log" "${nvme}_rw_write.csv" write
echo " sequential read test start"
for ioengine in libaio; do
  for rw in read; do
    for bs in 128k 256k; do
      for iodepth in 16; do
        for jobs in 1; do
          date
          echo "$nvme $rw $bs iodepth=$iodepth numjobs=$jobs test start"
          job_name="${ioengine}_${bs}B_${rw}_${jobs}job_${iodepth}QD"
          fio --name=$job_name --filename=/dev/$nvme --ioengine=${ioengine} --direct=1 --thread=1 --numjobs=${jobs} --iodepth=${iodepth} --rw=${rw} --bs=${bs} --runtime=$time --time_based=1 --size=100% --group_reporting --minimal >> "${nvme}_read.log"
        done
      done
    done
  done
done
# Convert the read log to CSV.
python3 min.py "${nvme}_read.log" "${nvme}_rw_read.csv" read
echo " sequential mix test"
date
for ioengine in libaio; do
  for rw in rw; do
    for mixread in 80 20; do
      for blk_size in 1024k; do
        for jobs in 1; do
          for queue_depth in 16; do
            job_name="${ioengine}_${rw}_${blk_size}B_mix_read${mixread}_${jobs}job_QD${queue_depth}"
            echo "$nvme $job_name test start"
            fio --name=${job_name} --filename=/dev/$nvme --ioengine=libaio --direct=1 --thread=1 --numjobs=${jobs} --iodepth=${queue_depth} --rw=${rw} --bs=${blk_size} --rwmixread=$mixread --runtime=$time --time_based=1 --size=100% --group_reporting --minimal >> "${nvme}_mix_data.log"
          done
        done
      done
    done
  done
done
# Convert the mixed log to CSV (read and write values in one row).
python3 min.py "${nvme}_mix_data.log" "${nvme}_rw_mix.csv" mix
mkdir -p "$nvme/test_data" "$nvme/test_log"
mv "$nvme"*.log "$nvme/test_log"
mv "$nvme"*.csv "$nvme/test_data"
echo "test has been finished"
5 OK, the test script is now complete. Try it out:
bash fio.sh nvme2n1 10 (each workload runs for 10 seconds)
The results are as follows:
Read results
Write results
Mixed results