Computing the FLOPs of your own model in PyTorch | the thop.profile() method | parameter and compute statistics for the yolov5s network

墨理学AI

Last modified 2022-04-12 12:15:57


First published 2021-02-07 14:51:39

Article link: https://blog.csdn.net/sinat_28442665/article/details/113738818


🍖 Computing the FLOPs of your own model in PyTorch | parameter and compute statistics for the yolov5s network


📙 FLOPS: understanding the basics

Reference: this part is selected from Zhihu Q&A answers; thanks to the contributors.

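FLOPS: note the all-capital spelling. It stands for floating point operations per second, i.e. the number of floating-point operations performed per second, and can be read as computing speed. It is a measure of hardware performance.
FLOPs: note the lowercase s. It stands for floating point operations (the s marks the plural), i.e. the number of floating-point operations, and can be read as the amount of computation. It can be used to measure the complexity of an algorithm or model.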
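To make the distinction concrete, here is a minimal sketch relating the two quantities: dividing a model's FLOPs (work) by a device's FLOPS (speed) gives an idealized lower bound on runtime. The function name and both numbers are hypothetical and chosen only for illustration; real latency is usually higher because of memory traffic and imperfect hardware utilization.

```python
def ideal_runtime_seconds(model_flops: float, device_flops_per_second: float) -> float:
    """Idealized lower bound on runtime: work (FLOPs) divided by speed (FLOPS)."""
    if device_flops_per_second <= 0:
        raise ValueError("device_flops_per_second must be positive")
    return model_flops / device_flops_per_second


# Hypothetical numbers, for illustration only.
model_flops = 8e9      # a model that needs 8 GFLOPs per forward pass
device_flops = 10e12   # a device that sustains 10 TFLOPS

t = ideal_runtime_seconds(model_flops, device_flops)
print(f"{t * 1e3:.2f} ms")
```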

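Besides accuracy, a deep learning framework or model is usually characterized by its forward-inference compute and its number of parameters (#Parameters), which together describe its complexity.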


Layers that contribute to a deep learning model's FLOPs

Conv1d/2d/3d (including grouping)
ConvTranspose1d/2d/3d (including grouping)
BatchNorm1d/2d/3d, GroupNorm, InstanceNorm1d/2d/3d
Activations (ReLU, PReLU, ELU, ReLU6, LeakyReLU)
Linear
Upsample
Poolings (AvgPool1d/2d/3d, MaxPool1d/2d/3d and adaptive ones)

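Of these, Conv layers usually account for the largest share.
The FLOPs depend on the size of the input image after preprocessing,
whereas #Parameters does not depend on the image size.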
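The dependence on input size can be checked with the standard cost formulas for a single 2D convolution. The sketch below is a hand-written illustration, not code from thop; the function name conv2d_cost is hypothetical, it assumes a square kernel and batch size 1, and it ignores the small cost of adding the bias.

```python
def conv2d_cost(c_in, c_out, kernel, h_in, w_in, stride=1, padding=0, groups=1, bias=True):
    """Return (params, macs) for one Conv2d layer with a square kernel, batch size 1."""
    # Output spatial size, using the usual convolution arithmetic.
    h_out = (h_in + 2 * padding - kernel) // stride + 1
    w_out = (w_in + 2 * padding - kernel) // stride + 1
    # Number of weights: each output channel sees c_in / groups input channels.
    weights = (c_in // groups) * kernel * kernel * c_out
    params = weights + (c_out if bias else 0)
    # Every weight is used once per output position (bias additions ignored).
    macs = weights * h_out * w_out
    return params, macs


# Same layer, two input sizes: params stay fixed, MACs grow with the pixel count.
for size in (224, 448):
    params, macs = conv2d_cost(3, 64, kernel=3, h_in=size, w_in=size, padding=1)
    print(size, params, macs)
```

Doubling each side of the input leaves the parameter count unchanged and multiplies the multiply-accumulate count by four.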

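For counting a network's parameters and compute, there are currently two community projects that can be used.

This post briefly examines and tries out the second one, pytorch-OpCounter.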

📔 pytorch-OpCounter GitHub home page:

https://github.com/Lyken17/pytorch-OpCounter

🟧 How to install

pip install thop

🟨 How to use

Basic usage

import torch
from torchvision.models import resnet50
from thop import profile

model = resnet50()
input = torch.randn(1, 3, 224, 224)
macs, params = profile(model, inputs=(input, ))

Define the rule for 3rd party module.

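import torch
import torch.nn as nn
from thop import profile

class YourModule(nn.Module):
    # your definition (a placeholder identity forward so the template runs)
    def forward(self, x):
        return x

def count_your_model(model, x, y):
    # your rule here: model is the module, x its inputs (a tuple), y its output
    pass

model = YourModule()
input = torch.randn(1, 3, 224, 224)
macs, params = profile(model, inputs=(input, ),
                        custom_ops={YourModule: count_your_model})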

Improve the output readability

Set the output format of the values

from thop import clever_format
macs, params = clever_format([macs, params], "%.3f")
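For a sense of what this kind of formatting does, here is a small stand-in written from scratch. It is not thop's implementation, and the name human_readable is hypothetical; it simply scales a raw count to K/M/G/T units in powers of 1000 and appends the suffix.

```python
def human_readable(value: float, fmt: str = "%.3f") -> str:
    """Scale a raw count to T/G/M/K units (decimal, powers of 1000)."""
    for factor, suffix in ((1e12, "T"), (1e9, "G"), (1e6, "M"), (1e3, "K")):
        if value >= factor:
            return (fmt % (value / factor)) + suffix
    # Values below 1000 are printed without a suffix.
    return fmt % value


print(human_readable(4.11e9))      # 4.110G
print(human_readable(25_557_032))  # 25.557M
```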

📕 Running the project's evaluation scripts

python benchmark/evaluate_famous_models.py

The output is as follows

Model	Params(M)	FLOPs(G)
alexnet	61.10	0.71
densenet121	7.98	2.87
densenet161	28.68	7.79
densenet169	14.15	3.40
densenet201	20.01	4.34
googlenet	6.62	1.50
inception_v3	23.83	5.73
mobilenet_v2	3.50	0.31
resnet101	44.55	7.83
resnet152	60.19	11.56
resnet18	11.69	1.82
resnet34	21.80	3.67
resnet50	25.56	4.11
resnext101_32x8d	88.79	16.48
resnext50_32x4d	25.03	4.26
shufflenet_v2_x0_5	1.37	0.04
shufflenet_v2_x1_0	2.28	0.15
shufflenet_v2_x1_5	3.50	0.30
shufflenet_v2_x2_0	7.39	0.59
squeezenet1_0	1.25	0.82
squeezenet1_1	1.24	0.35
vgg11	132.86	7.62
vgg11_bn	132.87	7.63
vgg13	133.05	11.32
vgg13_bn	133.05	11.35
vgg16	138.36	15.48
vgg16_bn	138.37	15.51
vgg19	143.67	19.65
vgg19_bn	143.68	19.68

python benchmark/evaluate_rnn_models.py

The output is as follows

Model	Params(M)	FLOPs(G)
RNNCell	0.35	0.01
GRUCell	1.04	0.03
LSTMCell	1.38	0.04
RNN	0.35	1.11
GRU	1.04	3.32
LSTM	1.38	4.43
stacked-RNN	1.92	6.15
stacked-GRU	5.76	18.49
stacked-LSTM	7.68	24.64
BiRNN	0.69	2.21
BiGRU	2.07	6.65
BiLSTM	2.76	8.86
stacked-BiRNN	5.41	17.34
stacked-BiGRU	16.24	52.07
stacked-BiLSTM	21.66	69.42

📗 A snippet from my own test (it cannot be run on its own)

The statistics are collected with profile, imported via from thop import profile

input = torch.randn([1, self.img_ch, self.img_size, self.img_size]).to(self.device)
print('input size:')
print(input.size())

print(input)

macs, params = profile(self.disA, inputs=(input,))
name = 'disA'
print("%s | %s | %s" % ("Model", "Params(M)", "FLOPs(G)"))
print("---|---|---")
print("%s | %.2f | %.2f" % (name, params / (1000 ** 2), macs / (1000 ** 3)))


real_A_ae = self.disA(input)
macs, params = profile(self.gen2B, inputs=(real_A_ae,))

print()
name = 'gen2B'
print("%s | %s | %s" % ("Model", "Params(M)", "FLOPs(G)"))
print("---|---|---")
print("%s | %.2f | %.2f" % (name, params / (1000 ** 2), macs / (1000 ** 3)))

Model	Params(M)	FLOPs(G)
disA	0.17	0.73

Model	Params(M)	FLOPs(G)
gen2B	8.10	33.78

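This is a personal note and is unrelated to the rest of this section.

The example only shows that, after trimming network layers, the NiceGAN test-time discriminator layers (shared as the encoder) have very few parameters;
most of the computation happens in the generator.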

📘 Parameter and compute statistics for the yolov5s network

🟧 Code changes

If you are interested in YOLOv5, you can take a quick look at this post of mine:

YOLOv5 environment setup | coco128 training example | ❤️ detailed notes ❤️ | [YOLOv5]

The basic steps for this part are as follows

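Download the yolov5 code
https://github.com/ultralytics/yolov5
Edit detect.py (vim detect.py) to add the statistics code; the screenshot of the edit is not reproduced here, and the added code is listed later in this section.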

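The inference output is shown below. YOLOv5's own model_info method in yolov5-5.0/utils/torch_utils.py already reports parameter and compute statistics. Comparing the two sets of numbers:

The parameters values are identical
The compute figure (GFLOPS) differs [not investigated in depth; this would require comparing against the model_info implementation]
The GFLOPS value is positively correlated with the shape of the input fed to the model
The yolov5s model here handles images of dynamic size [after preprocessing, the input width and height are integer multiples of N]
Both statistics are computed with profile from thop (from thop import profile)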

python detect.py --source data/images/bus.jpg 

Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', exist_ok=False, img_size=640, iou_thres=0.45, name='exp', nosave=False, project='runs/detect', save_conf=False, save_txt=False, source='data/images/bus.jpg', update=False, view_img=False, weights='yolov5s.pt')
YOLOv5 🚀 2021-4-12 torch 1.8.1+cu111 CUDA:0 (Quadro RTX 5000, 16125.3125MB)


Fusing layers... 
# Statistics printed by the code itself; additions welcome
Model Summary: 224 layers, 7266973 parameters, 0 gradients, 17.0 GFLOPS



# Custom method: parameter and compute statistics output
 
Model | Params(M) | FLOPs(G)
---|---|---
yolov5s | 7.26697300 | 6.37915530
640x480 4 persons, 1 bus, 1 fire hydrant, Done. (0.040s)
Results saved to runs/detect/exp11
Done. (0.162s)
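One plausible way to reconcile the two compute figures above, offered as a sketch rather than a confirmed explanation: it assumes that the first value returned by thop.profile counts multiply-accumulates (so FLOPs are roughly twice that number), that the Model Summary line refers to a 640x640 input while the custom statistic was measured on the 640x480 image, and that compute scales linearly with the number of input pixels. These assumptions should be checked against the model_info implementation.

```python
# Values copied from the run shown in this section.
thop_macs_g = 6.37915530   # custom statistic (profile result / 1000**3), 640x480 input
reported_gflops = 17.0     # "Model Summary" line printed by YOLOv5

# Assumption 1: the profile result counts multiply-accumulates, so FLOPs ~= 2 * MACs.
flops_g_at_640x480 = 2 * thop_macs_g

# Assumption 2: the summary refers to a 640x640 input, and compute scales
# linearly with the number of input pixels.
flops_g_at_640x640 = flops_g_at_640x480 * (640 * 640) / (640 * 480)

print(f"{flops_g_at_640x640:.1f} GFLOPs")
```

Under these assumptions the rescaled figure rounds to the 17.0 printed in the summary line.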

The relevant code is as follows

import torch
from thop import profile

# ...

# added by moli: yolov5s parameter and compute statistics (start)
# model and img are the variables already defined in detect.py
print("yolov5s parameter and compute statistics \n")
total_ops, total_params = profile(model, inputs=(img,))
name = "yolov5s"

print("%s | %s | %s" % ("Model", "Params(M)", "FLOPs(G)"))
print("---|---|---")
print("%s | %.8f | %.8f" % (name, total_params / (1000 ** 2), total_ops / (1000 ** 3)))
# added by moli: yolov5s parameter and compute statistics (end)

Alternatively, add the code right after the # Load model comment at line 34; the effect is the same as with the approach above


🟨 Errors you may run into

Possible error 1

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

The fix is to place the input on the same device as the model: input = torch.randn(1, 3, 640, 480).to(device)

Possible error 2

RuntimeError: Sizes of tensors must match except in dimension 3. Got 15 and 16 (The offending index is 0)

Cause: with input = torch.randn(1, 3, 640, 120).to(device), the input width and height are not both divisible by N [my blind guess is N = 32; corrections welcome]
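The mismatch can be reproduced with a simplified size model. The sketch below is not YOLOv5 code: it assumes five stride-2 stages that each halve a dimension with rounding up, and 2x upsampling steps whose outputs are concatenated with the feature maps at stride 16 and stride 8. All function names are hypothetical. Under these assumptions a width of 120 gives 15 at stride 8 but 8 at stride 16, which upsamples to 16, in line with the "Got 15 and 16" in the message, and rounding the size up to a multiple of 32 avoids it.

```python
import math


def round_up_to_multiple(n: int, divisor: int = 32) -> int:
    """Smallest multiple of divisor that is >= n."""
    return math.ceil(n / divisor) * divisor


def pyramid_sizes(n: int, levels: int = 5) -> list:
    """Sizes along one axis after each of `levels` stride-2 stages (ceil halving)."""
    sizes = []
    for _ in range(levels):
        n = math.ceil(n / 2)
        sizes.append(n)
    return sizes


def upsample_concat_ok(n: int) -> bool:
    """True if 2x upsampling each deeper map matches the shallower map it joins."""
    s = pyramid_sizes(n)
    stride8, stride16, stride32 = s[2], s[3], s[4]
    return stride32 * 2 == stride16 and stride16 * 2 == stride8


for width in (120, 480, 640):
    print(width, pyramid_sizes(width), upsample_concat_ok(width), round_up_to_multiple(width))
```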
