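When stress-testing a complete server, we usually need to monitor how component temperatures change, and sometimes we have to judge against a spec whether a temperature is over the limit. That calls for a small temperature-monitoring script.
For example, consider this test case:
During the test, check the component temperatures reported in the SDR. The spec requires every chip to stay below Tjmax - 10°C. The spec values are: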
Component   Spec Tmax (°C)
FPGA        100
NIC         105
CPU0        105
DIMMG0      85
DiskG0      70
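The limits checked later follow directly from this table: each one is the spec Tmax minus the 10°C margin. Here is a minimal sketch of that arithmetic. The names tmax_for, limit_for, and MARGIN are illustrative only and do not appear in the original script.

```shell
# tmax_for NAME: print the spec Tmax (in degrees C) for a component,
# using the values from the table above.
tmax_for() {
    case "$1" in
        FPGA)   echo 100 ;;
        NIC)    echo 105 ;;
        CPU0)   echo 105 ;;
        DIMMG0) echo 85 ;;
        DiskG0) echo 70 ;;
        *)      return 1 ;;   # unknown component
    esac
}

MARGIN=10   # required headroom below Tmax, in degrees C

# limit_for NAME: print the value the reading must stay below.
limit_for() {
    tmax=$(tmax_for "$1") || return 1
    echo $(( tmax - MARGIN ))
}

for name in FPGA NIC CPU0 DIMMG0 DiskG0; do
    printf '%-8s must stay below %s\n' "$name" "$(limit_for "$name")"
done
```

Deriving the limits this way keeps them in step with the spec if a Tmax value changes.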
Below, we write this test script using ipmitool.
1. Filter out the sensor temperatures we want to check
ipmitool sdr list | grep -E "FPGA|DIMM|NIC_Temp|CPU0_Temp|DiskG0_Temp"
FPGA_Temp | 78 degrees C | ok
NIC_Temp | 86 degrees C | ok
CPU0_Temp | 78 degrees C | ok
DIMMG0_Temp | 55 degrees C | ok
DiskG0_Temp | 46 degrees C | ok
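Each output line has three columns separated by "|": sensor name, reading, and status. As a small sketch of how one reading can be pulled out of such lines by exact sensor name, the helper below splits on "|". The name reading_of is hypothetical, and the sample input is taken from the output above.

```shell
# reading_of NAME: read `ipmitool sdr list`-style lines on stdin and print the
# first word of the reading column for the first sensor named exactly NAME
# (the number, for a line such as "NIC_Temp | 86 degrees C | ok").
reading_of() {
    awk -F'|' -v name="$1" '
        {
            sensor = $1
            gsub(/^[ \t]+|[ \t]+$/, "", sensor)   # trim padding around the name
            if (sensor == name) {
                split($2, parts, " ")             # " 86 degrees C " -> parts[1] = "86"
                print parts[1]
                exit
            }
        }'
}

sample='FPGA_Temp | 78 degrees C | ok
NIC_Temp | 86 degrees C | ok'

printf '%s\n' "$sample" | reading_of NIC_Temp   # prints 86
```

Matching the full sensor name avoids picking up several lines at once, which a loose pattern such as "DIMM" could do on a machine with more than one DIMM sensor.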
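2. This step needs a loop for continuous monitoring. A full-system stress test usually runs for no more than 48 hours, so a for loop is enough. We sample once every 15 seconds: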
for i in {1..10000}; do
    ipmitool sdr list | grep -E "FPGA|DIMM|NIC_Temp|CPU0_Temp|DiskG0_Temp" >temp.txt
    sleep 15
done
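With a 15-second sleep, 10000 iterations cover roughly 41.7 hours of sleep time, plus however long each ipmitool call takes. If a run has to span a specific duration, the iteration count can be computed rather than hard-coded. A rough sketch follows. The name iterations_for is illustrative, and the calculation ignores the time spent inside the loop body.

```shell
# iterations_for HOURS INTERVAL_SECONDS: print the number of sleeps needed to
# span HOURS, rounded up. The run time of the loop body is not counted.
iterations_for() {
    hours=$1
    interval=$2
    echo $(( (hours * 3600 + interval - 1) / interval ))
}

iterations_for 48 15    # prints 11520
```

Bash brace expansion such as {1..N} does not expand variables, so a computed count would be used with seq or a C-style for (( ... )) loop instead.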
3. Next, decide whether each component is over its limit. Extract each temperature first, then compare it:
# The third whitespace-separated field of each line is the numeric reading.
fpga=$(grep -i fpga temp.txt | awk '{print $3}')
nic=$(grep -i nic temp.txt | awk '{print $3}')
cpu=$(grep -i cpu temp.txt | awk '{print $3}')
dimm=$(grep -i dimm temp.txt | awk '{print $3}')
disk=$(grep -i disk temp.txt | awk '{print $3}')
# Each limit is the spec Tmax minus the 10 °C margin.
if [ "$fpga" -lt 90 ]; then
    echo "pass" >>result.txt
else
    echo "failed" >>result.txt
fi
if [ "$nic" -lt 95 ]; then
    echo "pass" >>result.txt
else
    echo "failed" >>result.txt
fi
if [ "$cpu" -lt 95 ]; then
    echo "pass" >>result.txt
else
    echo "failed" >>result.txt
fi
if [ "$dimm" -lt 75 ]; then
    echo "pass" >>result.txt
else
    echo "failed" >>result.txt
fi
if [ "$disk" -lt 60 ]; then
    echo "pass" >>result.txt
else
    echo "failed" >>result.txt
fi
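The five if blocks differ only in the variable and the limit, so they can be folded into one function. The sketch below is one possible refactoring, not the original script. The name check_temp and the no_reading label are invented here. It assumes readings are non-negative integers and reports anything else separately, since a non-numeric field would otherwise make [ ... -lt ... ] print an error.

```shell
# check_temp VALUE LIMIT: print "pass" if VALUE is a non-negative integer
# below LIMIT, "failed" if it is at or above LIMIT, and "no_reading" if VALUE
# is empty or not a plain non-negative integer.
check_temp() {
    case "$1" in
        ''|*[!0-9]*)
            echo "no_reading"
            ;;
        *)
            if [ "$1" -lt "$2" ]; then
                echo "pass"
            else
                echo "failed"
            fi
            ;;
    esac
}

check_temp 78 90     # prints pass
check_temp 96 95     # prints failed
check_temp no 90     # prints no_reading
```

In the script, each block would then shrink to a single line such as check_temp "$fpga" 90 >>result.txt.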
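4. Finally, save the verdicts to the test log together with the current iteration number and time. We use paste to append each verdict to its sensor line: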
echo "===============$i==============="|tee -a temp_result.txt
date |tee -a temp_result.txt
paste temp.txt result.txt |tee -a temp_result.txt
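paste joins line N of the first file with line N of the second, separated by a tab. The verdicts therefore line up with the right sensors only if the checks write to result.txt in the same order as the sensors appear in temp.txt. A small self-contained demonstration follows. It uses throwaway files in a temporary directory, and the sample lines come from the output in step 1.

```shell
# Show how paste merges two files line by line (tab-separated by default).
workdir=$(mktemp -d)

printf 'FPGA_Temp | 78 degrees C | ok\nNIC_Temp | 86 degrees C | ok\n' >"$workdir/temp.txt"
printf 'pass\npass\n' >"$workdir/result.txt"

merged=$(paste "$workdir/temp.txt" "$workdir/result.txt")
printf '%s\n' "$merged"

rm -rf "$workdir"
```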
5. The complete script:
#!/bin/bash
# Monitor component temperatures with ipmitool and log a pass/failed verdict
# for each sensor. Each limit is the spec Tmax minus the 10 °C margin.
rm -f result.txt temp_result.txt
for i in {1..10000}; do
    ipmitool sdr list | grep -E "FPGA|DIMM|NIC_Temp|CPU0_Temp|DiskG0_Temp" >temp.txt
    # The third whitespace-separated field of each line is the numeric reading.
    fpga=$(grep -i fpga temp.txt | awk '{print $3}')
    nic=$(grep -i nic temp.txt | awk '{print $3}')
    cpu=$(grep -i cpu temp.txt | awk '{print $3}')
    dimm=$(grep -i dimm temp.txt | awk '{print $3}')
    disk=$(grep -i disk temp.txt | awk '{print $3}')
    # Verdicts are written in the same order as the lines in temp.txt,
    # because paste joins the two files line by line.
    if [ "$fpga" -lt 90 ]; then
        echo "pass" >>result.txt
    else
        echo "failed" >>result.txt
    fi
    if [ "$nic" -lt 95 ]; then
        echo "pass" >>result.txt
    else
        echo "failed" >>result.txt
    fi
    if [ "$cpu" -lt 95 ]; then
        echo "pass" >>result.txt
    else
        echo "failed" >>result.txt
    fi
    if [ "$dimm" -lt 75 ]; then
        echo "pass" >>result.txt
    else
        echo "failed" >>result.txt
    fi
    if [ "$disk" -lt 60 ]; then
        echo "pass" >>result.txt
    else
        echo "failed" >>result.txt
    fi
    echo "===============$i===============" | tee -a temp_result.txt
    date | tee -a temp_result.txt
    paste temp.txt result.txt | tee -a temp_result.txt
    rm -f result.txt
    sleep 15
done
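As a variation, the extraction and comparison can be done in a single awk pass keyed on the sensor name, which removes the dependence on sensor order and on result.txt. This is a sketch under assumptions, not a drop-in replacement. The name annotate is hypothetical, the limits are the spec Tmax values minus 10, and sensors without a listed limit are passed through unmarked.

```shell
# annotate: read `ipmitool sdr list`-style lines on stdin and append a verdict
# to every line whose sensor name has a limit defined below.
annotate() {
    awk -F'|' '
        BEGIN {
            # Limits in degrees C: spec Tmax minus the 10-degree margin.
            limit["FPGA_Temp"]   = 90
            limit["NIC_Temp"]    = 95
            limit["CPU0_Temp"]   = 95
            limit["DIMMG0_Temp"] = 75
            limit["DiskG0_Temp"] = 60
        }
        {
            name = $1
            gsub(/^[ \t]+|[ \t]+$/, "", name)     # trim padding around the name
            if (!(name in limit)) { print; next } # no limit known: pass through
            split($2, parts, " ")                 # first word of the reading column
            if (parts[1] !~ /^[0-9]+$/)          verdict = "no_reading"
            else if (parts[1] + 0 < limit[name]) verdict = "pass"
            else                                 verdict = "failed"
            print $0 "\t" verdict
        }'
}

printf 'NIC_Temp | 96 degrees C | ok\nDiskG0_Temp | 46 degrees C | ok\n' | annotate
```

Inside the monitoring loop this would be used as ipmitool sdr list | annotate | tee -a temp_result.txt, replacing the temp.txt, result.txt, and paste steps.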
Running the complete script in step 5 produces output like this:
===============1===============
Wed Sep 29 02:17:43 CST 2021
FPGA_Temp | 79 degrees C | ok pass
NIC_Temp | 87 degrees C | ok pass
CPU0_Temp | 78 degrees C | ok pass
DIMMG0_Temp | 56 degrees C | ok pass
DiskG0_Temp | 46 degrees C | ok pass
===============2===============
Wed Sep 29 02:18:01 CST 2021
FPGA_Temp | 79 degrees C | ok pass
NIC_Temp | 87 degrees C | ok pass
CPU0_Temp | 79 degrees C | ok pass
DIMMG0_Temp | 56 degrees C | ok pass
DiskG0_Temp | 46 degrees C | ok pass
===============3===============
Wed Sep 29 02:18:18 CST 2021
FPGA_Temp | 79 degrees C | ok pass
NIC_Temp | 87 degrees C | ok pass
CPU0_Temp | 79 degrees C | ok pass
DIMMG0_Temp | 56 degrees C | ok pass
DiskG0_Temp | 47 degrees C | ok pass
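After a run of many hours the log is too long to read by eye. One way to summarize it is to count the lines that end in "failed". The sketch below assumes the log format shown above, where paste puts the verdict last on each sensor line. The name count_failures is illustrative, and the demo log is made-up sample data rather than real measurements.

```shell
# count_failures FILE: print how many lines in FILE end with "failed".
count_failures() {
    # grep exits non-zero when nothing matches; the count (0) is still printed.
    grep -c 'failed$' "$1" || true
}

# Demo on a small made-up log in a temporary file.
log=$(mktemp)
printf '===============1===============\n' >"$log"
printf 'FPGA_Temp | 79 degrees C | ok\tpass\n' >>"$log"
printf 'NIC_Temp | 96 degrees C | ok\tfailed\n' >>"$log"

failures=$(count_failures "$log")
echo "failed readings: $failures"

rm -f "$log"
```

Against the real log this would be called as count_failures temp_result.txt.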