vgg 16模型的内存和参数量的计算_vgg16去除全连接层有多少参数-CSDN博客

本文链接：https://blog.csdn.net/rocling/article/details/103833004

cs231n上关于VGG-16模型的内存和参数的计算过程如下。

INPUT: [224x224x3]        memory:  224*224*3=150K   weights: 0
CONV3-64: [224x224x64]  memory:  224*224*64=3.2M   weights: (3*3*3)*64 = 1,728
CONV3-64: [224x224x64]  memory:  224*224*64=3.2M   weights: (3*3*64)*64 = 36,864
POOL2: [112x112x64]  memory:  112*112*64=800K   weights: 0
CONV3-128: [112x112x128]  memory:  112*112*128=1.6M   weights: (3*3*64)*128 = 73,728
CONV3-128: [112x112x128]  memory:  112*112*128=1.6M   weights: (3*3*128)*128 = 147,456
POOL2: [56x56x128]  memory:  56*56*128=400K   weights: 0
CONV3-256: [56x56x256]  memory:  56*56*256=800K   weights: (3*3*128)*256 = 294,912
CONV3-256: [56x56x256]  memory:  56*56*256=800K   weights: (3*3*256)*256 = 589,824
CONV3-256: [56x56x256]  memory:  56*56*256=800K   weights: (3*3*256)*256 = 589,824
POOL2: [28x28x256]  memory:  28*28*256=200K   weights: 0
CONV3-512: [28x28x512]  memory:  28*28*512=400K   weights: (3*3*256)*512 = 1,179,648
CONV3-512: [28x28x512]  memory:  28*28*512=400K   weights: (3*3*512)*512 = 2,359,296
CONV3-512: [28x28x512]  memory:  28*28*512=400K   weights: (3*3*512)*512 = 2,359,296
POOL2: [14x14x512]  memory:  14*14*512=100K   weights: 0
CONV3-512: [14x14x512]  memory:  14*14*512=100K   weights: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512]  memory:  14*14*512=100K   weights: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512]  memory:  14*14*512=100K   weights: (3*3*512)*512 = 2,359,296
POOL2: [7x7x512]  memory:  7*7*512=25K  weights: 0
FC: [1x1x4096]  memory:  4096  weights: 7*7*512*4096 = 102,760,448
FC: [1x1x4096]  memory:  4096  weights: 4096*4096 = 16,777,216
FC: [1x1x1000]  memory:  1000 weights: 4096*1000 = 4,096,000

TOTAL memory: 24M * 4 bytes ~= 93MB / image (only forward! ~*2 for bwd)
TOTAL params: 138M parameters

如果动手算一下memory的和，如下。

0.15M+3.2M+3.2M+0.8M+1.6M+1.6M+0.4M+0.8M3+0.2M+0.4M3+0.1M*4+0.025M+0.004M+0.004M+0.001M = 15.184M

问题来了，计算得到的总内存为15.184*4 bytes ~=60MB / image，而教程中给的为93MB。并且教程中的数据(24M)被许多博客和文章所引用，导致经常看到的就是24M这个数据。起始该答案早在github上有针对教程中的疑问，并给出了如上手动计算的结果。所以，15M的memory才是正确的答案。另外上述针对参数(weight)的计算中并没有加入bias的数量。

延伸1：模型的组成

参数的数量约为138M(不包含bias)，此时占用的内存大小为：138M*4 bytes ~=526M，该大小约等于保存后模型占用磁盘的大小，而实际利用ImageNet训练出来的VGG-16的模型大小超过552M。那么差的20多M内存哪去了，此处不局限于该模型的探讨，而是想更一般化的探讨模型中的成分。

由上可知，模型中主要是权重和偏移单元，如vgg-16加上偏移向量后的参数计算结果如下。然后是优化器和其他特殊层中的参数，如LeakyReLU，BN，和Dropout等的参数，如keras中的model。

conv3-64  x 2       : 38,720
conv3-128 x 2       : 221,440
conv3-256 x 3       : 1,475,328
conv3-512 x 3       : 5,899,776
conv3-512 x 3       : 7,079,424
fc1                 : 102,764,544
fc2                 : 16,781,312
fc3                 : 4,097,000
TOTAL               : 138,357,544

其中全连接的计算为：

 fc1 (x): (512x7x7)x4,096 (weights) + 4,096 (biases)
 fc2    : 4,096x4,096     (weights) + 4,096 (biases)
 fc3    : 4,096x1,000     (weights) + 1,000 (biases)

延伸2：训练和测试时的内存组成