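1. Background
- The newly added code needs a performance analysis, part of which is measuring its time cost.
- The new functionality is concentrated in a few places, so the timing can be collected by adding timestamps around those code blocks.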
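2. Timing approach
- In principle, timing should be measured with a Linux tool such as gprof (see "Timing analysis: getting started with gprof"), which reports the call count and the share of run time for each new function. That approach, however, is better suited to locating performance bottlenecks.
- The goal here is to measure how much time the new functionality adds, so the following method is used:
(1) Add timestamps around the new code blocks
(2) Run the program for a while (replaying packets continuously, long enough for the results to be statistically meaningful) and save the log
(3) Analyze the log with the script in section 4 to get the mean/max/min time cost introduced by the new functionality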
3. Adding timestamps
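#include <ctime>     // clock(), clock_t, CLOCKS_PER_SEC
#include <iostream>
// Code block 1
clock_t time1 = clock();
UpdateItems(); // new code 1
clock_t time2 = clock();
// clock() returns processor-time ticks, so convert to milliseconds before printing
std::cout << "\n UpdateItems time: " << (time2 - time1) * 1000.0 / CLOCKS_PER_SEC << "ms" << std::endl;
// Code block 2
clock_t time3 = clock();
ItemsProcess(); // new code 2
clock_t time4 = clock();
// Same conversion; this line is printed without a unit suffix
std::cout << "\n ItemsProcess time: " << (time4 - time3) * 1000.0 / CLOCKS_PER_SEC << std::endl;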
- As shown above, a timing printout was added to each of the two code blocks
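To make the log format concrete, the sketch below shows the shape of the lines those two print statements produce and how a regular expression can pull the number out of each. It is written in Python because the analysis script is; the sample values are invented for illustration, and `parse_delay` is a hypothetical helper, not part of the original script.

```python
import re

# Hypothetical log lines in the format printed by the two code blocks
# above; the numeric values are invented for illustration.
sample_log = [
    " UpdateItems time: 12ms",
    " ItemsProcess time: 7",
    "some unrelated log line",
]

def parse_delay(line):
    """Return (label, value) for a timing line, or None for any other line."""
    match = re.search(r'(UpdateItems|ItemsProcess) time:\s*([0-9.]+)', line)
    if match is None:
        return None
    return match.group(1), float(match.group(2))

parsed = [parse_delay(line) for line in sample_log]
print(parsed)
```

Note that only the first print statement appends "ms", so any parser has to treat the unit suffix as optional or handle the two formats separately.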
4. Implementation
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# @Time : 2022/01/18
# @Author : shuaixio
# @FileName : delay_analysis.py
import re  # for regular expression
import numpy as np  # for mean value calculate
def ExtractTimeDelayData():
    '''
    Extract items update and items process time delay
    '''
    # The with statement closes all three files, even if an error occurs
    with open('./output_log.txt', 'r', encoding='utf-8') as fp, \
            open('./update.txt', 'w') as fp1, \
            open('./process.txt', 'w') as fp2:
        for raw in fp:
            if not raw.split():  # skip blank lines
                continue
            line_update = re.findall(r'UpdateItems time:.*ms', raw)
            line_process = re.findall(r'ItemsProcess time:.*', raw)
            if line_update:
                fp1.write(line_update[0] + '\n')  # [0] for string, \n for line feed
            if line_process:
                fp2.write(line_process[0] + '\n')
def CalculateTimeDelayOfBoundaryUpdate():
    '''
    calculate time delay, items update
    '''
    time_list = []
    with open('./update.txt', 'r') as fp:
        for raw in fp:
            # The capture group keeps only the text between "time:" and "ms"
            line = re.findall(r'UpdateItems time:(.*)ms', raw)
            if not line:
                continue
            time_list.append(line[0])
    time_list = list(map(float, time_list))  # python3.x: map returns an iterator
    mean_time = np.mean(time_list)
    max_time = max(time_list)
    min_time = min(time_list)
    return [mean_time, max_time, min_time]
def CalculateTimeDelayOfTrajectoryProcess():
    '''
    calculate time delay, items process
    '''
    time_list = []
    with open('./process.txt', 'r') as fp:
        for raw in fp:
            line = re.findall(r'ItemsProcess time:(.*)', raw)
            if not line:
                continue
            # Filter 1: skip captures that contain other text ('time' or 'mecs')
            if re.findall(r'time|mecs', line[0]):
                continue
            # Filter 2: skip captures that are empty or longer than 8 characters
            if len(line[0]) == 0 or len(line[0]) > 8:
                continue
            time_list.append(line[0])
    time_list = list(map(float, time_list))  # python3.x: map returns an iterator
    mean_time = np.mean(time_list)
    max_time = max(time_list)
    min_time = min(time_list)
    return [mean_time, max_time, min_time]
if __name__ == '__main__':
    ExtractTimeDelayData()
    timeDelayOfItemsUpdate = CalculateTimeDelayOfBoundaryUpdate()
    print("\n【Items Update】\nmean time delay = %.4f ms\nmax time delay = %.4f ms\nmin time delay = %.4f ms"
          % (timeDelayOfItemsUpdate[0], timeDelayOfItemsUpdate[1], timeDelayOfItemsUpdate[2]))
    timeDelayOfItemsProcess = CalculateTimeDelayOfTrajectoryProcess()
    print("\n【Items Process】\nmean time delay = %.4f ms\nmax time delay = %.4f ms\nmin time delay = %.4f ms"
          % (timeDelayOfItemsProcess[0], timeDelayOfItemsProcess[1], timeDelayOfItemsProcess[2]))
    print("\ntotal mean time = %.4f ms" % (timeDelayOfItemsUpdate[0] + timeDelayOfItemsProcess[0]))
    print("total max time = %.4f ms" % (timeDelayOfItemsUpdate[1] + timeDelayOfItemsProcess[1]))
    print("total min time = %.4f ms" % (timeDelayOfItemsUpdate[2] + timeDelayOfItemsProcess[2]))
- The code is fairly straightforward overall; adjust it to match the format of your actual log output
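As a possible variation, the same statistics can be gathered in a single pass without the intermediate update.txt and process.txt files. The following is a minimal sketch under stated assumptions rather than a replacement for the script above: `PATTERN`, `summarize`, and the sample lines are hypothetical, the pattern only accepts plain decimal numbers, and the standard-library `statistics` module stands in for numpy.

```python
import re
import statistics

# One pattern for both labels. The trailing "ms" is optional because only
# the UpdateItems print statement emits it. Anchoring at the end of the
# line rejects lines where other output got mixed in after the number.
PATTERN = re.compile(
    r'(UpdateItems|ItemsProcess) time:\s*([0-9]+(?:\.[0-9]+)?)\s*(?:ms)?\s*$'
)

def summarize(lines):
    """Group delays by label and return {label: (mean, max, min)}."""
    delays = {}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            delays.setdefault(match.group(1), []).append(float(match.group(2)))
    return {
        label: (statistics.mean(values), max(values), min(values))
        for label, values in delays.items()
    }

# Hypothetical log content; the values are invented for illustration.
log_lines = [
    " UpdateItems time: 10ms",
    " ItemsProcess time: 4",
    " UpdateItems time: 14ms",
    " ItemsProcess time: 6",
    " ItemsProcess time: 6 time: garbled",  # rejected by the end anchor
]
result = summarize(log_lines)
for label, (mean_t, max_t, min_t) in result.items():
    print("%s: mean = %.4f ms, max = %.4f ms, min = %.4f ms"
          % (label, mean_t, max_t, min_t))
```

To run it on a real log, an open file object can be passed directly, for example `with open('./output_log.txt', encoding='utf-8') as fp: summarize(fp)`, since the function only iterates over lines.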
5. Output
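6. Python methods in detail
- re module
The re module was added in Python 1.5 and provides full regular expression support; for usage details, see the documentation of the Python regular expression module
- findall method
A search function of the re module: it takes a regular expression and returns a list of all matches. If the pattern contains one capture group, the list holds the captured text instead of the whole match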
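The difference between a pattern with and without a capture group is easiest to see side by side. The short example below reuses the two UpdateItems patterns from the script; the sample line and its value are made up for illustration.

```python
import re

line = "UpdateItems time: 12ms"  # hypothetical log line

# No capture group: each list element is the whole matched text.
whole = re.findall(r'UpdateItems time:.*ms', line)

# One capture group: each element is only the text inside the parentheses.
captured = re.findall(r'UpdateItems time:(.*)ms', line)

# No match: findall returns an empty list, which is falsy,
# so a check such as "if not line: continue" skips the line.
missing = re.findall(r'ItemsProcess time:(.*)', line)

print(whole)     # ['UpdateItems time: 12ms']
print(captured)  # [' 12']
print(missing)   # []
```

The captured text keeps its leading space, which is harmless here because `float(' 12')` ignores surrounding whitespace.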
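- write and writelines methods
write(str) takes a single string; pass a string, not a list
writelines(sequence) takes a sequence; every element must be a string, not a number, and no line breaks are added between elements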
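A small in-memory example can show the difference without touching the file system. It uses `io.StringIO` as a stand-in for a text file, which is an assumption made for illustration; a file opened in text mode behaves the same way for these calls.

```python
import io

buf = io.StringIO()
buf.write("first\n")                   # write takes exactly one string
buf.writelines(["second\n", "third"])  # writelines takes an iterable of strings
text = buf.getvalue()
print(repr(text))  # 'first\nsecond\nthird' (no newline is added for you)

# A plain string is itself an iterable of one-character strings, so
# writelines accepts it and the result is the same as calling write.
buf2 = io.StringIO()
buf2.writelines("ab\n")
same_as_write = buf2.getvalue()

# Numbers are rejected: every element must already be a string.
try:
    io.StringIO().writelines([1, 2])
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```

Because of the second case, calling writelines with a single string works, although write states the intent more directly.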
- Mean and median with numpy
The numpy package can compute sums, means, medians, and more
Sum: np.sum(list)
Mean: np.mean(list)
Median: np.median(list)
- Maximum and minimum of a list with max/min
Maximum: max(list)
Minimum: min(list)
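For a quick check of these five statistics without installing numpy, the standard library offers counterparts. This sketch assumes the built-in `sum` and the `statistics` module as stand-ins for `np.sum`, `np.mean`, and `np.median`; the delay values are made up for illustration.

```python
import statistics

delays = [1.5, 2.5, 4.0, 8.0]  # hypothetical delays in ms

total = sum(delays)                  # counterpart of np.sum(delays)
mean = statistics.mean(delays)       # counterpart of np.mean(delays)
median = statistics.median(delays)   # counterpart of np.median(delays)
largest = max(delays)
smallest = min(delays)

print(total, mean, median, largest, smallest)  # 16.0 4.0 3.25 8.0 1.5
```

With an even number of values, the median is the average of the two middle ones, here (2.5 + 4.0) / 2.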
created by shuaixio, 2022.01.18