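Understanding parameter count, computation, and memory in convolutional neural networks:
Computing the parameter count:
The parameter count of a convolutional layer does not depend on the feature map size; it depends only on the kernel size, the input and output channel counts, the bias, and BN.
Parameters per convolutional layer, where +1 accounts for the bias: Co x (Kw x Kh x Cin + 1)
That is: output channels × (kernel height × kernel width × input channels + 1)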
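As a quick sanity check of the formula above, here is a minimal sketch in plain Python. The function name `conv_params` and its arguments are illustrative choices, not part of any library.

```python
def conv_params(c_in, c_out, k_h, k_w, bias=True):
    """Parameter count of a 2D convolution: Co x (Kw x Kh x Cin + 1)."""
    # Each output channel owns one k_h x k_w x c_in filter plus an optional bias.
    per_filter = k_h * k_w * c_in + (1 if bias else 0)
    return c_out * per_filter


# The feature map size never appears, so the count is the same
# for a 32x32 input and a 224x224 input.
print(conv_params(3, 64, 3, 3))     # 3x3 conv, 3 -> 64 channels: 1792
print(conv_params(64, 128, 3, 3))   # 3x3 conv, 64 -> 128 channels: 73856
```

Passing `bias=False` drops the +1 term, which is common when the convolution is followed directly by BN.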
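Parameters of a fully connected layer: (D1 + 1) x D2, that is, (input dimension + bias) × output dimension
Parameters of a BN layer:
A BN layer learns two parameters per channel, γ and β, so its parameter count is 2 x Co.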
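The fully connected and BN formulas can be written the same way. This is a small sketch with illustrative function names; the layer sizes in the example are arbitrary.

```python
def fc_params(d_in, d_out, bias=True):
    """Parameter count of a fully connected layer: (D1 + 1) x D2."""
    return (d_in + (1 if bias else 0)) * d_out


def bn_params(channels):
    """Learnable parameters of a BN layer: one gamma and one beta per channel."""
    return 2 * channels


print(fc_params(4096, 1000))  # (4096 + 1) x 1000 = 4097000
print(bn_params(64))          # 2 x 64 = 128
```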
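Computation analysis:
For one convolutional layer, the number of multiplications in a forward pass is:
H×W×M×C×K×K
(the kernel size is K × K, there are C input feature maps, each output feature map has size H × W, and there are M output feature maps). Summing this over all layers gives the count for the whole network.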
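A minimal sketch of the multiplication count H×W×M×C×K×K, using the same symbols as the text. The function name and the example layer (a 3x3 convolution from 3 to 64 channels with a 224x224 output) are chosen only for illustration.

```python
def conv_multiplications(h_out, w_out, m_out, c_in, k):
    """Multiplications of one conv layer: H x W x M x C x K x K."""
    return h_out * w_out * m_out * c_in * k * k


# 224x224 output, 64 output maps, 3 input maps, 3x3 kernel
mults = conv_multiplications(224, 224, 64, 3, 3)
print(mults)                      # 86704128
print('%.2f M' % (mults / 1e6))
```

Unlike the parameter count, this number grows with the output feature map size, so the same layer costs four times as much when H and W both double.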
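Cost of one convolution (one kernel position on a single channel): (Kw x Kh) + (Kw x Kh - 1)
That is, kernel height x kernel width multiplications plus (kernel height x kernel width - 1) additions, since summing n products takes n - 1 additions; a bias adds one more addition.
Number of convolutions on one feature map, which equals the output feature map size: H x W
Cost of a fully connected layer: (D1 + (D1 - 1) + 1) x D2
That is, (D1 multiplications + (D1 - 1) additions + 1 bias addition) x D2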
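The operation counts above can be sketched as follows. The function names are illustrative, and the single-window count covers one kernel position on one channel only, as in the text.

```python
def window_ops(k_h, k_w, bias=False):
    """Operations for one kernel position on a single channel."""
    mults = k_h * k_w          # one multiplication per kernel element
    adds = k_h * k_w - 1       # summing n products takes n - 1 additions
    return mults + adds + (1 if bias else 0)


def fc_ops(d_in, d_out):
    """Operations of a fully connected layer: (D1 + (D1 - 1) + 1) x D2."""
    return (d_in + (d_in - 1) + 1) * d_out


print(window_ops(3, 3))       # 9 multiplications + 8 additions = 17
print(fc_ops(4096, 1000))     # 2 x 4096 x 1000 = 8192000
```

With the bias included, each output costs 2 x D1 operations, which is why counts that include additions come out at about twice the multiplication-only count.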
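The code later in this post takes an input image size and prints how the channel counts change through the network, together with the network structure and the parameter counts.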
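Memory calculation:
Memory taken by the parameters (a 32-bit float takes 4 bytes)
Memory(MB) = params x 4 /1024 /1024
For example, VGG has about 138 million parameters, so it needs about 138 x 10^6 x 4 / 1024 / 1024 ≈ 526 MB (roughly 3.8 MB per million parameters)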
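The conversion from parameter count to megabytes is a one-liner. The sketch below assumes 32-bit floats by default; the function name and the `bytes_per_param` argument are illustrative.

```python
def params_memory_mb(num_params, bytes_per_param=4):
    """Memory(MB) = params x 4 / 1024 / 1024 for 32-bit floats."""
    return num_params * bytes_per_param / 1024 / 1024


print('%.1f MB' % params_memory_mb(138_000_000))                      # VGG-sized model
print('%.1f MB' % params_memory_mb(138_000_000, bytes_per_param=2))   # same model in 16-bit floats
```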
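Memory per image
The memory taken by all feature maps produced while processing one image is the sum of Fw x Fh x C over every layer, multiplied by 4 bytes, then divided by 1024 and by 1024 again to get MB.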
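A small sketch of the per-image feature map memory. The three feature map shapes are made-up values for illustration, not taken from a specific network.

```python
def feature_map_memory_mb(shapes, bytes_per_value=4):
    """Sum Fw x Fh x C over all feature maps, then convert bytes to MB."""
    total_values = sum(f_w * f_h * c for f_w, f_h, c in shapes)
    return total_values * bytes_per_value / 1024 / 1024


# (Fw, Fh, C) for each layer output; hypothetical three-layer example
shapes = [(224, 224, 64), (112, 112, 128), (56, 56, 256)]
print('%.4f MB' % feature_map_memory_mb(shapes))   # 21.4375 MB
```

In practice the list would hold one entry per layer output of the real network, which is what tools such as torchstat report per layer.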
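GPU memory needed for training
For example:
With 5 million parameters, the parameters take about 19 MB;
With 1 million feature map values per image, one image takes about 4 MB;
With batch size = 128:
Memory for the model: 19x2 = 38MB (one copy for the params, one for the Adam state; this is a rough lower bound, since gradients and Adam's second moment buffer each add another parameter-sized copy)
Memory for the outputs: 128x4x2 = 1024MB (the factor 2 covers forward and backward)
Total memory needed: 38+1024 = 1062MB, about 1 GB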
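The worked example can be turned into a rough estimator. This is a sketch under the same simplifying assumptions as the text (two parameter-sized copies, activations counted twice for forward and backward); the function and argument names are illustrative.

```python
MB = 1024 * 1024


def training_memory_mb(num_params, values_per_image, batch_size,
                       param_copies=2, passes=2, bytes_per_value=4):
    """Rough training memory: model copies plus batch activations, in MB."""
    model_mb = num_params * bytes_per_value * param_copies / MB
    output_mb = values_per_image * bytes_per_value * batch_size * passes / MB
    return model_mb, output_mb, model_mb + output_mb


model_mb, output_mb, total_mb = training_memory_mb(5_000_000, 1_000_000, 128)
print('model: %.1f MB, outputs: %.1f MB, total: %.1f MB'
      % (model_mb, output_mb, total_mb))
```

Without rounding, the outputs come to about 977 MB rather than 1024 MB, because one million float32 values take about 3.8 MB, not 4 MB. The conclusion is the same: the activations for the batch dominate, and the total is close to 1 GB.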
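Method 1:
If you work in PyTorch with an already defined network, torchstat can report the parameter count and computation directly once you give it the network and the input image size.
The fields in its report:
network structure: the layers, such as conv2d, BN, ReLU, maxpool, etc.
input shape: size of the input
output shape: size of the output
params: number of parameters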
# pip install torchstat
from torchstat import stat
import torchvision.models as models

model1 = models.resnet18()
stat(model1, (3, 224, 224))

model2 = models.mobilenet_v2()
stat(model2, (3, 128, 128))
print(model2)

# A different input size changes the computation and the feature map memory,
# but not the parameter count:
# model3 = models.mobilenet_v2()
# stat(model3, (3, 224, 224))
Result 1:
Method 2:
Use thop (pip install thop)
It can measure the size of your own model
# Method 2
# Count model parameters and computation with thop
import torch
import torchvision
from thop import profile
# The author's own SkipNet module, used further down; replace with your own model
from skipnet import get_skipnet

# Model
print('*********** alexnet ***********')
model = torchvision.models.alexnet()  # randomly initialised, no pretrained weights
dummy_input = torch.randn(1, 3, 224, 224)
# Note: thop's documentation calls the first return value MACs
# (multiply-accumulates); the name flops is kept from the original code
flops, params = profile(model, (dummy_input,))
print('flops: ', flops, 'params: ', params)
print('flops: %.2f M, params: %.2f M' % (flops / 1000000.0, params / 1000000.0))
# Model
print('********** using your own network **********')
# Swap in your own model here; just import its class above
model1 = get_skipnet(num_classes=26, mode='large', width=1.0, skip_w=8)
dummy_input = torch.randn(1, 3, 128, 128)
flops, params = profile(model1, (dummy_input,))
# Doubled, presumably because the same network runs once on the face and once on the body
print('flops: ', flops * 2, 'params: ', params * 2)
print('face+body:')
print('flops: %.2f M, params: %.2f M' % (2 * flops / 1000000.0, 2 * params / 1000000.0))

model2 = torchvision.models.resnet18()
dummy_input = torch.randn(1, 3, 224, 224)
flops, params = profile(model2, (dummy_input,))
print('flops: ', flops, 'params: ', params)
print('flops: %.2f M, params: %.2f M' % (flops / 1000000.0, params / 1000000.0))

model3 = torchvision.models.resnet18()
dummy_input = torch.randn(1, 3, 224, 224)
flops, params = profile(model3, (dummy_input,))  # profile model3 itself, not model2
# Assumed intent: totals for two copies of the network, so both values are doubled
print('flops: ', flops * 2, 'params: ', params * 2)
print('flops: %.2f M, params: %.2f M' % (2 * flops / 1000000.0, 2 * params / 1000000.0))
Result 2:
MobileNet example:
# The layer table follows the TensorFlow (slim) style of defining MobileNet,
# but the cost calculation itself needs only the standard library
from collections import namedtuple
# Conv and DepthSepConv namedtuple define layers of the MobileNet architecture
# Conv defines 3x3 convolution layers
# DepthSepConv defines 3x3 depthwise convolution followed by 1x1 convolution.
# stride is the stride of the convolution
# depth is the number of channels or filters in a layer
Conv = namedtuple('Conv', ['kernel', 'stride', 'depth'])
DepthSepConv = namedtuple('DepthSepConv', ['kernel', 'stride', 'depth'])
# _CONV_DEFS specifies the MobileNet body
_CONV_DEFS = [
    Conv(kernel=[3, 3], stride=2, depth=32),
    DepthSepConv(kernel=[3, 3], stride=1, depth=64),
    DepthSepConv(kernel=[3, 3], stride=2, depth=128),
    DepthSepConv(kernel=[3, 3], stride=1, depth=128),
    DepthSepConv(kernel=[3, 3], stride=2, depth=256),
    DepthSepConv(kernel=[3, 3], stride=1, depth=256),
    DepthSepConv(kernel=[3, 3], stride=2, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=1, depth=512),
    DepthSepConv(kernel=[3, 3], stride=2, depth=1024),
    DepthSepConv(kernel=[3, 3], stride=1, depth=1024),
]
# Baseline: treat every layer as a standard convolution
input_size = 160
inputdepth = 3
conv_defs = _CONV_DEFS
sumcost = 0
for i, conv_def in enumerate(conv_defs):
    stride = conv_def.stride
    kernel = conv_def.kernel
    outdepth = conv_def.depth
    # Output size, computed without padding
    output_size = round((input_size - int(kernel[0] / 2) * 2) / stride)
    if isinstance(conv_def, Conv):
        sumcost += output_size * output_size * kernel[0] * kernel[0] * inputdepth * outdepth
    if isinstance(conv_def, DepthSepConv):
        sumcost += output_size * output_size * kernel[0] * kernel[0] * inputdepth * outdepth
    inputdepth = outdepth
    input_size = output_size
print("src conv: ", sumcost)
# Same network, with DepthSepConv layers costed as depthwise 3x3 + pointwise 1x1
input_size = 160
inputdepth = 3
conv_defs = _CONV_DEFS
sumcost1 = 0
for i, conv_def in enumerate(conv_defs):
    stride = conv_def.stride
    kernel = conv_def.kernel
    outdepth = conv_def.depth
    # Output size, computed without padding
    output_size = round((input_size - int(kernel[0] / 2) * 2) / stride)
    if isinstance(conv_def, Conv):
        sumcost1 += output_size * output_size * kernel[0] * kernel[0] * inputdepth * outdepth
    if isinstance(conv_def, DepthSepConv):
        # Depthwise term: inputdepth * K * K; pointwise term: inputdepth * outdepth * 1 * 1
        sumcost1 += output_size * output_size * (inputdepth * kernel[0] * kernel[0] + inputdepth * outdepth * 1 * 1)
    inputdepth = outdepth
    input_size = output_size
print("DepthSepConv:", sumcost1)
print("compare:", sumcost1 / sumcost)
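For a single layer, the ratio printed by the comparison above has a closed form: dividing the depthwise separable cost by the standard cost gives 1/outdepth + 1/K², independent of the input depth and the feature map size. The sketch below checks this for one layer; the function names are illustrative, and the example sizes are just one layer's values.

```python
def layer_costs(out_size, c_in, c_out, k=3):
    """Multiplications of a standard conv and of a depthwise separable conv."""
    standard = out_size * out_size * k * k * c_in * c_out
    separable = out_size * out_size * (c_in * k * k + c_in * c_out)
    return standard, separable


def separable_ratio(c_out, k=3):
    """Closed-form cost ratio separable / standard: 1/Cout + 1/K^2."""
    return 1.0 / c_out + 1.0 / (k * k)


standard, separable = layer_costs(out_size=79, c_in=32, c_out=64)
print(separable / standard)    # measured ratio for this layer
print(separable_ratio(64))     # same value from the closed form
```

With 3x3 kernels the 1/K² term is 1/9, so for layers with many output channels the separable version costs roughly one ninth of the standard convolution.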