Scenario
When comparing models, gains in accuracy metrics are not the whole story; a model's efficiency and speed also have to be considered.
Inference speed, measured in FPS (frames per second), is therefore an important metric. The measurement procedure is recorded below:
```python
import argparse
import time

import torch


def parse_args():
    parser = argparse.ArgumentParser(description='benchmark inference FPS')
    parser.add_argument(
        '--max-iter', type=int, default=2000, help='num of max iter')
    parser.add_argument(
        '--log-interval', type=int, default=50, help='interval of logging')
    return parser.parse_args()


def benchmark(model, data_loader, max_iter, log_interval):
    """Measure pure inference FPS; `model` and `data_loader` are built elsewhere."""
    # the first several iterations may be very slow so skip them
    num_warmup = 5
    pure_inf_time = 0
    fps = 0

    # benchmark with `max_iter` images and take the average
    for i, data in enumerate(data_loader):
        # wait for all pending CUDA kernels so only the forward pass is timed
        torch.cuda.synchronize()
        start_time = time.perf_counter()

        with torch.no_grad():
            model(return_loss=False, rescale=True, **data)

        torch.cuda.synchronize()
        elapsed = time.perf_counter() - start_time

        # accumulate time only after the warmup iterations
        if i >= num_warmup:
            pure_inf_time += elapsed

            if (i + 1) % log_interval == 0:
                fps = (i + 1 - num_warmup) / pure_inf_time
                print(
                    f'Done image [{i + 1:<3}/ {max_iter}], '
                    f'fps: {fps:.1f} img / s, '
                    f'time per image: {1000 / fps:.1f} ms / img',
                    flush=True)

            if (i + 1) == max_iter:
                fps = (i + 1 - num_warmup) / pure_inf_time
                print(
                    f'Overall fps: {fps:.1f} img / s, '
                    f'time per image: {1000 / fps:.1f} ms / img',
                    flush=True)
                break

    return fps
```
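
To sanity-check the measurement loop on its own, the `benchmark` function above can be driven with a stand-in model and a handful of random batches. Everything below (`DummyDetector`, the random 224×224 tensors, batch size 1, 200 iterations) is a made-up placeholder for illustration only; in real use `model` would be an actual detector and `data_loader` a real test-set loader. A CUDA device is assumed, since the loop calls `torch.cuda.synchronize()`.

```python
import torch
import torch.nn as nn


class DummyDetector(nn.Module):
    """Placeholder whose forward() accepts the same call pattern:
    model(return_loss=False, rescale=True, **data)."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 8, 3, padding=1)

    def forward(self, img, return_loss=False, rescale=True):
        return self.backbone(img)


model = DummyDetector().cuda().eval()
# each batch is a dict so it can be unpacked via **data inside the loop
data_loader = [{'img': torch.randn(1, 3, 224, 224, device='cuda')}
               for _ in range(200)]

fps = benchmark(model, data_loader, max_iter=200, log_interval=50)
print(f'measured {fps:.1f} img / s')
```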
Approach
FPS is computed from the average time of the model's forward pass alone; data loading and metric computation are not included.
- Time each inference step, i.e. the forward pass for each image
- Accumulate those times and divide by the number of timed iterations to get the average (see the formula below)
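
Written out, with $N$ the total number of processed images, $N_{\text{warmup}}$ the discarded warmup iterations, and $t_i$ the forward-pass time of the $i$-th image, this mirrors the `fps = (i + 1 - num_warmup) / pure_inf_time` line in the code:

$$
\mathrm{FPS} = \frac{N - N_{\text{warmup}}}{\sum_{i=N_{\text{warmup}}+1}^{N} t_i},
\qquad
\text{latency} = \frac{1000}{\mathrm{FPS}}\ \text{ms/img}
$$

For example, with the defaults above ($N = 2000$, $N_{\text{warmup}} = 5$), if the 1995 timed forward passes take 39.9 s in total, then FPS $= 1995 / 39.9 = 50$ img/s, i.e. 20 ms per image.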