Measuring the inference time and FPS of a deep learning model

Alocus_

Article link: https://blog.csdn.net/crystal_remember/article/details/124422588

Background

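A few details need care when measuring a model's inference time. One of them is torch.cuda.synchronize(): PyTorch launches GPU operations asynchronously, so a call can return to Python before the GPU has finished the work. torch.cuda.synchronize() blocks until all queued GPU operations have completed, so the timer should be started and stopped only after it returns; otherwise the measurement covers the launch, not the computation 【1】.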
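
To make the effect of asynchronous execution concrete, here is a small analogy in plain Python that needs no GPU. It is only a sketch: a thread pool stands in for the GPU queue, and the names `slow_kernel` and `time_launch` are made up for this illustration. Submitting work returns immediately, just like a kernel launch, and `future.result()` plays the role of torch.cuda.synchronize().

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_kernel(duration_s: float) -> str:
    """Hypothetical stand-in for GPU work that takes `duration_s` seconds."""
    time.sleep(duration_s)
    return "done"


def time_launch(duration_s: float, wait: bool) -> float:
    """Time one asynchronous launch, optionally waiting for it to finish."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        start = time.perf_counter()
        future = pool.submit(slow_kernel, duration_s)  # returns immediately
        if wait:
            future.result()  # analogous to torch.cuda.synchronize()
        # read the clock before the pool shuts down and joins the worker
        elapsed = time.perf_counter() - start
    return elapsed


if __name__ == "__main__":
    print(f"without waiting: {1000 * time_launch(0.1, wait=False):.2f} ms")
    print(f"with waiting:    {1000 * time_launch(0.1, wait=True):.2f} ms")
```

Without the wait, the measured time reflects only the cost of handing the work off, which is the same mistake as timing a GPU forward pass without synchronizing.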

Code

Function 【2】:


import time
import torch
def measure_inference_speed(model, data, max_iter=200, log_interval=50):
    model.eval()

    # the first several iterations may be very slow so skip them
    num_warmup = 5
    pure_inf_time = 0
    fps = 0

    # time max_iter forward passes and average over the iterations after warmup
    for i in range(max_iter):

        torch.cuda.synchronize()
        start_time = time.perf_counter()

        with torch.no_grad():
            model(*data)

        torch.cuda.synchronize()
        elapsed = time.perf_counter() - start_time

        if i >= num_warmup:
            pure_inf_time += elapsed
            if (i + 1) % log_interval == 0:
                fps = (i + 1 - num_warmup) / pure_inf_time
                print(
                    f'Done image [{i + 1:<3}/ {max_iter}], '
                    f'fps: {fps:.1f} img / s, '
                    f'times per image: {1000 / fps:.1f} ms / img',
                    flush=True)

        if (i + 1) == max_iter:
            fps = (i + 1 - num_warmup) / pure_inf_time
            print(
                f'Overall fps: {fps:.1f} img / s, '
                f'times per image: {1000 / fps:.1f} ms / img',
                flush=True)
            break
    return fps
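
The function above reports the number of timed iterations divided by their total time, which equals images per second only when each forward pass handles one image. As a hedged extension, the sketch below summarizes a list of per-iteration timings instead of a running sum, adds the median (less sensitive to occasional slow iterations), and scales throughput by batch size. The helper `summarize_latencies` and its parameters are hypothetical and not part of 【2】.

```python
import statistics


def summarize_latencies(latencies_s, batch_size=1, num_warmup=5):
    """Summarize per-iteration latencies (in seconds), skipping warmup runs."""
    measured = list(latencies_s)[num_warmup:]
    if not measured:
        raise ValueError("need more iterations than num_warmup")
    mean_s = statistics.fmean(measured)
    return {
        "mean_ms": 1000 * mean_s,
        "median_ms": 1000 * statistics.median(measured),
        "max_ms": 1000 * max(measured),
        # each iteration processes batch_size samples
        "fps": batch_size / mean_s,
    }


if __name__ == "__main__":
    # five slow warmup iterations followed by three timed ones (made-up numbers)
    timings = [1.0] * 5 + [0.01, 0.02, 0.03]
    print(summarize_latencies(timings))
```

With batch size 1, the "fps" value matches what the function above computes from pure_inf_time.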

Usage 【2】:

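import torch

# measure_inference_speed is the function defined above, not a module;
# net is assumed to be an already constructed torch.nn.Module
net = net.cuda()
data = torch.randn((1, 6, 128, 128)).cuda()
measure_inference_speed(net, (data,))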
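
The call above requires a CUDA device. If the same timing loop should also run on a CPU, one option is to pass the synchronization step in as a parameter. The following is a minimal sketch under that assumption; `collect_latencies` and its `sync` argument are hypothetical names, and the workload in the demo is a plain Python expression standing in for a model call.

```python
import time


def _no_sync():
    """Default synchronization step: nothing to wait for on the CPU."""


def collect_latencies(fn, sync=None, max_iter=50):
    """Call `fn` max_iter times and return each call's wall-clock time in seconds.

    `sync` is an optional zero-argument callable invoked before the clock
    starts and again before it stops, e.g. torch.cuda.synchronize on a GPU.
    """
    if sync is None:
        sync = _no_sync
    latencies = []
    for _ in range(max_iter):
        sync()
        start = time.perf_counter()
        fn()
        sync()
        latencies.append(time.perf_counter() - start)
    return latencies


if __name__ == "__main__":
    # stand-in workload; with PyTorch one might instead pass, inside
    # torch.no_grad(), fn=lambda: net(data) and sync=torch.cuda.synchronize
    # (assuming net and data exist as in the usage above)
    timings = collect_latencies(lambda: sum(range(100_000)), max_iter=10)
    print(f"mean: {1000 * sum(timings) / len(timings):.3f} ms")
```

The first few entries of the returned list are the warmup iterations and can be dropped before averaging, as the function from 【2】 does.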

References:

【1】

"Correct code for measuring time in PyTorch: torch.cuda.synchronize()", 枯叶蝶KYD's blog, CSDN

【2】

NAFNet/NAFSSR_arch.py at main · megvii-research/NAFNet · GitHub
