Computer Vision Model Performance Testing: A Summary

oliveray

Tags: computer vision, artificial intelligence

Article link: https://blog.csdn.net/cshsjdh/article/details/133002031

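Summary (generated automatically by CSDN): This article introduces the key metrics for evaluating the performance of deep learning models, namely parameter count, FLOPs (floating-point operations), and throughput. It also shows how to compute them, with FLOPs examples for a CNN and a Transformer, and how to check model performance against experimental data.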

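When judging how good a deep learning model is, we usually look at a few key metrics: input image size (Imgsize), parameter count (param), FLOPs (floating-point operations), and throughput, as shown in Figure 1.

So how is such an evaluation table produced?

I. How the experimental numbers are computed

1. param: the number of learnable parameters in the model. More parameters give the model more capacity, so it can fit the training data better. However, more parameters also increase the model's size and computational cost.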

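def count_parameters(model):
    # Count only trainable parameters (those with requires_grad=True)
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

param = count_parameters(your_model)  # your_model: any torch.nn.Module instance
print(f"Parameters: {param}")

# Note: parameter counts are usually reported in millions (M), so convert the unit
param_in_M = param / 1e6

print(f"Param in M: {param_in_M} M")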
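As a sanity check on count_parameters, the same number can be worked out by hand from the layer shapes. The sketch below does this for the standard VGG-16 layout (thirteen 3×3 convolutions followed by three fully connected layers, all with bias, 224×224 input). The helper names conv2d_params and linear_params and the two shape lists are illustrative assumptions, not part of the original post.

```python
# Hand calculation of learnable parameters from layer shapes (no PyTorch needed).

def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Weights: out * in * k * k, plus one bias per output channel."""
    weights = out_channels * in_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)

def linear_params(in_features, out_features, bias=True):
    """Weights: in * out, plus one bias per output feature."""
    weights = in_features * out_features
    return weights + (out_features if bias else 0)

# Output channels of the 13 conv layers in the standard VGG-16 layout (assumed here)
VGG16_CONV_CHANNELS = [64, 64, 128, 128, 256, 256, 256,
                       512, 512, 512, 512, 512, 512]
# (in_features, out_features); 512 channels * 7 * 7 spatial positions for a 224x224 input
VGG16_FC_SHAPES = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]

def vgg16_param_count():
    total = 0
    in_channels = 3  # RGB input
    for out_channels in VGG16_CONV_CHANNELS:
        total += conv2d_params(in_channels, out_channels, 3)
        in_channels = out_channels
    for in_features, out_features in VGG16_FC_SHAPES:
        total += linear_params(in_features, out_features)
    return total

total_params = vgg16_param_count()
print(f"VGG-16 parameters: {total_params} ({total_params / 1e6:.2f} M)")
```

If the hook-free count above and count_parameters disagree for the same architecture, the usual suspects are frozen layers (requires_grad=False) or layers without bias.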

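2. FLOPs: the number of floating-point operations needed for one inference pass, a measure of the model's computational complexity. A lower FLOPs value usually means faster inference. FLOPs are computed slightly differently for CNNs and Transformers: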

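For a CNN, FLOPs come mainly from the convolutional and fully connected layers (pooling and BN layers are generally not counted), so we walk through every layer and accumulate its FLOPs.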

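# Conv layer: FLOPs = in_channels × out_channels × kernel_h × kernel_w × out_h × out_w
# Fully connected layer: FLOPs = in_features × out_features
# (Both formulas count one multiply-accumulate as one operation;
#  the conv hook below additionally counts one bias add per output element.)
# Example: VGG-16

import torch
import torch.nn as nn
from torchvision.models import vgg16

def count_flops(model, input_size):
    flops = 0
    dummy_input = torch.randn(1, *input_size)

    def conv_hook(module, inputs, output):
        nonlocal flops
        batch_size, input_channels, _, _ = inputs[0].size()
        _, output_channels, output_height, output_width = output.size()
        kernel_height, kernel_width = module.kernel_size
        bias_ops = 1 if module.bias is not None else 0
        # Each output element sees (in_channels / groups) * kh * kw inputs
        flops += batch_size * output_channels * output_height * output_width * (
                    input_channels // module.groups * kernel_height * kernel_width + bias_ops)

    def fc_hook(module, inputs, output):
        nonlocal flops
        # Number of input rows, which also covers inputs with extra leading dimensions
        rows = inputs[0].numel() // module.in_features
        flops += rows * module.in_features * module.out_features

    # Attach a counting hook to every conv and linear layer
    hooks = []
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            hooks.append(module.register_forward_hook(conv_hook))
        elif isinstance(module, nn.Linear):
            hooks.append(module.register_forward_hook(fc_hook))

    # One forward pass in eval mode triggers the hooks
    model.eval()
    with torch.no_grad():
        model(dummy_input)

    for hook in hooks:
        hook.remove()

    return flops

model = vgg16()
flops = count_flops(model, (3, 224, 224))
print(f"FLOPs: {flops / 1e9:.2f} G")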
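The two formulas can also be applied by hand, which gives an independent check on the hook-based counter. The following is a minimal sketch that assumes the standard VGG-16 layout at 224×224 input (3×3 convolutions with padding 1, so each convolution keeps its spatial size, and 2×2 pooling between stages). The names conv_ops, fc_ops, VGG16_CONVS and VGG16_FCS are illustrative, and each multiply-accumulate is counted as one operation, as in the formulas.

```python
# Analytical operation count for VGG-16 at 224x224 input (no PyTorch needed).

def conv_ops(in_ch, out_ch, kernel, out_h, out_w, bias=True):
    """in_ch * k * k multiply-accumulates (plus an optional bias add) per output element."""
    per_output = in_ch * kernel * kernel + (1 if bias else 0)
    return out_ch * out_h * out_w * per_output

def fc_ops(in_features, out_features):
    """One multiply-accumulate per weight."""
    return in_features * out_features

# (out_channels, output side length) for the 13 conv layers (assumed standard layout)
VGG16_CONVS = [
    (64, 224), (64, 224),
    (128, 112), (128, 112),
    (256, 56), (256, 56), (256, 56),
    (512, 28), (512, 28), (512, 28),
    (512, 14), (512, 14), (512, 14),
]
VGG16_FCS = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]

def vgg16_ops():
    total, in_ch = 0, 3
    for out_ch, side in VGG16_CONVS:
        total += conv_ops(in_ch, out_ch, 3, side, side)
        in_ch = out_ch
    for in_features, out_features in VGG16_FCS:
        total += fc_ops(in_features, out_features)
    return total

# First conv layer using the bare formula (no bias term): 3 * 64 * 3 * 3 * 224 * 224
first_layer = conv_ops(3, 64, 3, 224, 224, bias=False)
total_ops = vgg16_ops()
print(f"First conv layer: {first_layer}")
print(f"VGG-16 total: {total_ops / 1e9:.2f} G")
```

The total lands at roughly 15.5 G, in line with the figure commonly quoted for VGG-16. If multiplications and additions are counted separately, the number roughly doubles, so it is worth stating which convention a reported FLOPs value uses.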

For Transformers, FLOPs come mainly from the self-attention layers and the feed-forward network layers.

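# Here a lightweight ViT (DeiT-Tiny) serves as the example
import torch
from torchsummary import summary
# models.vision_transformer is imported from the local project, not from PyTorch
from models.vision_transformer import deit_tiny_patch16_224

# Build the Vision Transformer (pretrained=False, so the weights are randomly initialized)
model = deit_tiny_patch16_224(pretrained=False).cuda()
# Input size
input_height = 224
input_width = 224

# Use torchsummary to print detailed information about the model
summary(model, input_size=(3, input_height, input_width))

# NOTE: this expression sums the element counts of the trainable parameters,
# so the value printed below is a parameter count, not a true FLOPs figure.
# A real FLOPs number needs a profiler or an analytical estimate.
flops = torch.sum(torch.Tensor([param.numel() for param in model.parameters() if param.requires_grad])).item()

print("Vision Transformer Model FLOPs:", flops)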

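The output is as follows (the last line is really the trainable-parameter count, about 5.7 M, because the code sums parameter sizes rather than operations):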

================================================================
Total params: 5,679,400
Trainable params: 5,679,400
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 102.14
Params size (MB): 21.67
Estimated Total Size (MB): 124.38
----------------------------------------------------------------
Vision Transformer Model FLOPs: 5717416.0
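Because the snippet above sums parameter sizes, it does not say how much computation the self-attention and feed-forward layers actually need. A rough analytical estimate is sketched below. It assumes the usual DeiT-Tiny configuration (embedding dimension 192, 12 encoder blocks, MLP ratio 4, 16×16 patches, 1000 classes), counts only matrix multiplications with one multiply-accumulate as one operation, and ignores softmax, LayerNorm, GELU and bias terms. The function name vit_macs is hypothetical.

```python
# Analytical multiply-accumulate estimate for a plain ViT encoder (no PyTorch needed).

def vit_macs(img_size=224, patch_size=16, in_chans=3, embed_dim=192,
             depth=12, mlp_ratio=4, num_classes=1000):
    num_patches = (img_size // patch_size) ** 2
    tokens = num_patches + 1  # patch tokens plus the class token

    # Patch embedding: a convolution with kernel == stride == patch_size
    patch_embed = num_patches * (in_chans * patch_size * patch_size) * embed_dim

    # Self-attention (the head count cancels out when summed over heads)
    qkv = tokens * embed_dim * 3 * embed_dim      # Q, K, V projections
    attn_scores = tokens * tokens * embed_dim     # Q @ K^T
    attn_values = tokens * tokens * embed_dim     # attention weights @ V
    proj = tokens * embed_dim * embed_dim         # output projection

    # Feed-forward network: two linear layers with hidden size mlp_ratio * embed_dim
    mlp = 2 * tokens * embed_dim * (mlp_ratio * embed_dim)

    block = qkv + attn_scores + attn_values + proj + mlp
    head = embed_dim * num_classes                # classifier on the class token
    return patch_embed + depth * block + head

deit_tiny_macs = vit_macs()
print(f"DeiT-Tiny estimate: {deit_tiny_macs / 1e9:.2f} G multiply-accumulates")
```

Under these assumptions the estimate comes out at about 1.25 G, several hundred times larger than the 5.7 M parameter count printed above, which shows why the two quantities should not be confused.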

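3. Throughput: the number of images the model can process per unit of time. Higher throughput means the model processes images faster. It depends on factors such as the hardware configuration and the model's complexity. To compute it, measure the execution time and divide the number of processed samples by that time.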

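import torch
from torchvision import transforms

# Load the model
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# model.pth is assumed to hold the whole pickled model object, not just a state_dict;
# recent PyTorch releases may need torch.load(..., weights_only=False) for that
model = torch.load('model.pth', map_location=DEVICE)
model.eval()
model.to(DEVICE)

# Preprocessing for real test images (not applied to the random dummy input below)
transform_test = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.51819474, 0.5250407, 0.4945761], std=[0.24228974, 0.24347611, 0.2530049])
])

# Create a dummy input
optimal_batch_size = 16
dummy_input = torch.randn(optimal_batch_size, 3, 224, 224, dtype=torch.float).to(DEVICE)

# Measure throughput (timing with torch.cuda.Event requires a CUDA device)
repetitions = 100
total_time = 0
with torch.no_grad():
    # Warm-up passes so that one-time initialization is not timed
    for _ in range(10):
        _ = model(dummy_input)
    for rep in range(repetitions):
        starter, ender = torch.cuda.Event(enable_timing=True), torch.cuda.Event(enable_timing=True)
        starter.record()
        _ = model(dummy_input)
        ender.record()
        # Wait for the GPU to finish before reading the timer
        torch.cuda.synchronize()
        # elapsed_time returns milliseconds; convert to seconds
        curr_time = starter.elapsed_time(ender) / 1000
        total_time += curr_time

throughput = (repetitions * optimal_batch_size) / total_time
print('Final Throughput (images/s):', throughput)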
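The same measurement pattern (warm up, time a fixed number of batches, divide samples by seconds) can be written as a device-independent helper. This is a minimal sketch using wall-clock time; measure_throughput, run_batch and sync are hypothetical names, and fake_batch is only a stand-in workload so the example runs without a model.

```python
import time

def measure_throughput(run_batch, batch_size, repetitions=100, warmup=10, sync=None):
    """Return samples processed per second for repeated calls to run_batch()."""
    # Warm-up calls are executed but not timed
    for _ in range(warmup):
        run_batch()
    if sync is not None:
        sync()  # make sure pending asynchronous work is done before timing starts
    start = time.perf_counter()
    for _ in range(repetitions):
        run_batch()
    if sync is not None:
        sync()  # wait for the last batch to really finish
    elapsed = time.perf_counter() - start
    return repetitions * batch_size / elapsed

def fake_batch():
    # Stand-in for one forward pass over a batch
    sum(i * i for i in range(10_000))

throughput = measure_throughput(fake_batch, batch_size=16, repetitions=50, warmup=5)
print(f"Throughput: {throughput:.1f} samples/s")
```

With a PyTorch model on a GPU, run_batch could be `lambda: model(dummy_input)` called inside `torch.no_grad()`, with `sync=torch.cuda.synchronize`, since CUDA kernels run asynchronously and the timer would otherwise stop too early.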

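With that, the common performance-test figures for a model can all be computed. They help verify how the model behaves on a given task and serve as a useful reference for both development and deployment.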

That is everything!
