1. The relationship and differences between ResNet and DenseNet, such as parameter count and training speed
A brief introduction to ResNet and DenseNet
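ResNet's main contribution is to alleviate the vanishing-gradient problem in training deep neural networks: its shortcut connections let the network grow deeper without degradation. During backpropagation, gradients can vanish in the layers closest to the input, which makes those layers hard to train, so a shortcut gives them a direct path to layers nearer the output. A direct mapping H(x) is hard to learn, so ResNet instead learns the residual F(x) = H(x) - x, which is easier to optimize. The shortcut connects a block's input to its output and sums the two element-wise (add), so the input skips the intermediate layers through an identity mapping. An identity shortcut adds no parameters and only negligible computation, and in principle the deeper network should perform no worse than the shallower one, because the added blocks can fall back to the identity by driving the residual to zero.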
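As a rough illustration of the shortcut described above, the following PyTorch sketch builds a block whose output is F(x) + x. The class name `ResidualBlock` and its layer layout (two 3x3 convolutions with batch normalization) are assumptions chosen for illustration, not the exact block of any particular ResNet variant.

```python
import torch
from torch import nn


class ResidualBlock(nn.Module):
    """Hypothetical minimal residual block: out = relu(F(x) + x)."""

    def __init__(self, channels: int):
        super().__init__()
        # F(x): the residual branch that the block actually learns
        self.residual = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Identity shortcut: element-wise add, which introduces no parameters
        return torch.relu(self.residual(x) + x)


block = ResidualBlock(64)
x = torch.randn(2, 64, 56, 56)
y = block(x)
print(y.shape)  # same shape as x, because add does not change the channel count
```

All parameters of this block live in the residual branch; the shortcut itself contributes none, which is the point made in the paragraph above.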
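DenseNet builds on ResNet's shortcut idea but connects layers far more densely. It is a densely connected convolutional network in which, in feed-forward fashion, each layer is connected to every subsequent layer whose feature maps have the same spatial size. The goal is to maximize the flow of information between layers. Note that the features are combined by concatenation along the channel dimension (concat), not by element-wise addition as in ResNet.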
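To make the effect of concatenation concrete, the sketch below tracks channel counts through one dense block. The helper `dense_block_channels` is a hypothetical name; the values (64 input channels, 32 new channels per layer, 6 layers) are taken from `denseblock1` in the torchstat output later in this post.

```python
def dense_block_channels(in_channels: int, growth_rate: int, num_layers: int):
    """Return (input channels seen by each layer, channels leaving the block).

    Each layer appends `growth_rate` new channels to everything before it
    (concat), whereas an element-wise add would keep the count fixed.
    """
    per_layer_inputs = []
    channels = in_channels
    for _ in range(num_layers):
        per_layer_inputs.append(channels)
        channels += growth_rate  # concat: new feature maps are stacked on
    return per_layer_inputs, channels


inputs, out_channels = dense_block_channels(64, 32, 6)
print(inputs)        # [64, 96, 128, 160, 192, 224]
print(out_channels)  # 256
```

These numbers agree with the input shapes of `denselayer1.norm1` through `denselayer6.norm1` and with the 256 channels entering `transition1` in that output.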
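Compared with ResNet, DenseNet has the following notable characteristics:
(1) It further alleviates the vanishing-gradient problem.
(2) It strengthens feature propagation through the network.
(3) It enables and encourages feature reuse.
(4) It substantially reduces the number of parameters.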
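Why DenseNet has fewer parameters than ResNet:
- Each convolution has far fewer input and output channels than in ResNet; every dense layer only adds a small number of new channels (32 in DenseNet-121).
- The fully connected layer also has fewer parameters than in ResNet (for example, the classifier of DenseNet-121 takes 1024 features, versus 2048 for ResNet-50).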
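The channel argument can be checked with simple arithmetic: a bias-free convolution has in_channels x out_channels x k x k weights. The helper `conv_params` below is a hypothetical name, and the wide comparison layer (256 to 256 channels, 3x3) is an assumed example rather than a layer quoted from this post; the DenseNet numbers match the torchstat output later in the post.

```python
def conv_params(in_channels: int, out_channels: int, kernel_size: int) -> int:
    """Weight count of a 2D convolution without bias."""
    return in_channels * out_channels * kernel_size * kernel_size


# DenseNet-121, denseblock1.denselayer1: 1x1 bottleneck, then 3x3 with 32 outputs
dense_1x1 = conv_params(64, 128, 1)
dense_3x3 = conv_params(128, 32, 3)
print(dense_1x1, dense_3x3)  # 8192 36864

# Assumed comparison: a single 3x3 convolution that keeps 256 channels
wide_3x3 = conv_params(256, 256, 3)
print(wide_3x3)  # 589824
```

Under these assumptions, one wide 3x3 convolution holds roughly 13 times as many weights as a whole DenseNet-121 dense layer (8192 + 36864 = 45056).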
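Why DenseNet trains more slowly than ResNet:
- The concatenated feature maps that DenseNet layers consume are much larger than ResNet's, which makes the convolutions more expensive; in short, FLOPs tend to be higher and memory usage is larger.
- Memory is accessed far more often, and memory access is slow. Because of the dense connectivity, the input to each layer is the concatenation of the features of all preceding layers, so every new layer has to read all of those features again, which causes frequent memory reads.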
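The growing read traffic can be estimated from the channel counts alone. The sketch below assumes float32 activations (4 bytes each) and the 61x61 feature maps of `denseblock1` from the torchstat output later in this post; `activation_bytes` is a hypothetical helper, and it counts only the activations read by the first operation of each dense layer, ignoring weights.

```python
BYTES_PER_FLOAT32 = 4


def activation_bytes(channels: int, height: int, width: int) -> int:
    """Bytes needed to read one float32 feature map of the given shape."""
    return channels * height * width * BYTES_PER_FLOAT32


# Input channels of the 6 layers in denseblock1: 64, 96, ..., 224
layer_inputs = [64 + 32 * i for i in range(6)]
reads = [activation_bytes(c, 61, 61) for c in layer_inputs]

print(reads[0])   # 952576
print(reads[-1])  # 3334016

# Total bytes read with concat versus a fixed 64-channel input per layer
total_concat = sum(reads)
total_fixed = 6 * activation_bytes(64, 61, 61)
print(total_concat, total_fixed)
```

The per-layer read grows linearly with depth inside a block, so the total grows roughly quadratically with the number of layers, which is one way to read the "frequent memory reads" point above.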
Appendix: how to compute a model's parameter count
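from torchstat import stat
import torchvision

# Build DenseNet-121 and print per-layer parameters, FLOPs, and memory traffic
model = torchvision.models.densenet121()
# Input is (channels, height, width); 244 matches the output below (ImageNet models usually use 224)
stat(model, (3, 244, 244))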
Output (columns: index, module name, input shape as C H W, output shape as C H W, params, memory(MB), MAdd, Flops, MemRead(B), MemWrite(B), duration[%], MemR+W(B)):
0 features.conv0 3 244 244 64 122 122 9408.0 3.63 279,104,768.0 140,028,672.0 752064.0 3810304.0 3.20% 4562368.0
1 features.norm0 64 122 122 64 122 122 128.0 3.63 3,810,304.0 1,905,152.0 3810816.0 3810304.0 0.80% 7621120.0
2 features.relu0 64 122 122 64 122 122 0.0 3.63 952,576.0 952,576.0 3810304.0 3810304.0 0.00% 7620608.0
3 features.pool0 64 122 122 64 61 61 0.0 0.91 1,905,152.0 952,576.0 3810304.0 952576.0 5.60% 4762880.0
4 features.denseblock1.denselayer1.norm1 64 61 61 64 61 61 128.0 0.91 952,576.0 476,288.0 953088.0 952576.0 0.00% 1905664.0
5 features.denseblock1.denselayer1.relu1 64 61 61 64 61 61 0.0 0.91 238,144.0 238,144.0 952576.0 952576.0 0.00% 1905152.0
6 features.denseblock1.denselayer1.conv1 64 61 61 128 61 61 8192.0 1.82 60,488,576.0 30,482,432.0 985344.0 1905152.0 0.80% 2890496.0
7 features.denseblock1.denselayer1.norm2 128 61 61 128 61 61 256.0 1.82 1,905,152.0 952,576.0 1906176.0 1905152.0 0.80% 3811328.0
8 features.denseblock1.denselayer1.relu2 128 61 61 128 61 61 0.0 1.82 476,288.0 476,288.0 1905152.0 1905152.0 0.00% 3810304.0
9 features.denseblock1.denselayer1.conv2 128 61 61 32 61 61 36864.0 0.45 274,222,816.0 137,170,944.0 2052608.0 476288.0 0.80% 2528896.0
10 features.denseblock1.denselayer2.norm1 96 61 61 96 61 61 192.0 1.36 1,428,864.0 714,432.0 1429632.0 1428864.0 0.00% 2858496.0
11 features.denseblock1.denselayer2.relu1 96 61 61 96 61 61 0.0 1.36 357,216.0 357,216.0 1428864.0 1428864.0 0.00% 2857728.0
12 features.denseblock1.denselayer2.conv1 96 61 61 128 61 61 12288.0 1.82 90,971,008.0 45,723,648.0 1478016.0 1905152.0 1.60% 3383168.0
13 features.denseblock1.denselayer2.norm2 128 61 61 128 61 61 256.0 1.82 1,905,152.0 952,576.0 1906176.0 1905152.0 0.00% 3811328.0
14 features.denseblock1.denselayer2.relu2 128 61 61 128 61 61 0.0 1.82 476,288.0 476,288.0 1905152.0 1905152.0 0.80% 3810304.0
15 features.denseblock1.denselayer2.conv2 128 61 61 32 61 61 36864.0 0.45 274,222,816.0 137,170,944.0 2052608.0 476288.0 0.80% 2528896.0
16 features.denseblock1.denselayer3.norm1 128 61 61 128 61 61 256.0 1.82 1,905,152.0 952,576.0 1906176.0 1905152.0 0.80% 3811328.0
17 features.denseblock1.denselayer3.relu1 128 61 61 128 61 61 0.0 1.82 476,288.0 476,288.0 1905152.0 1905152.0 0.00% 3810304.0
18 features.denseblock1.denselayer3.conv1 128 61 61 128 61 61 16384.0 1.82 121,453,440.0 60,964,864.0 1970688.0 1905152.0 1.60% 3875840.0
19 features.denseblock1.denselayer3.norm2 128 61 61 128 61 61 256.0 1.82 1,905,152.0 952,576.0 1906176.0 1905152.0 0.00% 3811328.0
20 features.denseblock1.denselayer3.relu2 128 61 61 128 61 61 0.0 1.82 476,288.0 476,288.0 1905152.0 1905152.0 0.80% 3810304.0
21 features.denseblock1.denselayer3.conv2 128 61 61 32 61 61 36864.0 0.45 274,222,816.0 137,170,944.0 2052608.0 476288.0 0.80% 2528896.0
22 features.denseblock1.denselayer4.norm1 160 61 61 160 61 61 320.0 2.27 2,381,440.0 1,190,720.0 2382720.0 2381440.0 0.00% 4764160.0
23 features.denseblock1.denselayer4.relu1 160 61 61 160 61 61 0.0 2.27 595,360.0 595,360.0 2381440.0 2381440.0 0.00% 4762880.0
24 features.denseblock1.denselayer4.conv1 160 61 61 128 61 61 20480.0 1.82 151,935,872.0 76,206,080.0 2463360.0 1905152.0 0.80% 4368512.0
25 features.denseblock1.denselayer4.norm2 128 61 61 128 61 61 256.0 1.82 1,905,152.0 952,576.0 1906176.0 1905152.0 0.80% 3811328.0
26 features.denseblock1.denselayer4.relu2 128 61 61 128 61 61 0.0 1.82 476,288.0 476,288.0 1905152.0 1905152.0 0.00% 3810304.0
27 features.denseblock1.denselayer4.conv2 128 61 61 32 61 61 36864.0 0.45 274,222,816.0 137,170,944.0 2052608.0 476288.0 0.80% 2528896.0
28 features.denseblock1.denselayer5.norm1 192 61 61 192 61 61 384.0 2.73 2,857,728.0 1,428,864.0 2859264.0 2857728.0 0.80% 5716992.0
29 features.denseblock1.denselayer5.relu1 192 61 61 192 61 61 0.0 2.73 714,432.0 714,432.0 2857728.0 2857728.0 0.80% 5715456.0
30 features.denseblock1.denselayer5.conv1 192 61 61 128 61 61 24576.0 1.82 182,418,304.0 91,447,296.0 2956032.0 1905152.0 1.60% 4861184.0
31 features.denseblock1.denselayer5.norm2 128 61 61 128 61 61 256.0 1.82 1,905,152.0 952,576.0 1906176.0 1905152.0 0.00% 3811328.0
32 features.denseblock1.denselayer5.relu2 128 61 61 128 61 61 0.0 1.82 476,288.0 476,288.0 1905152.0 1905152.0 0.80% 3810304.0
33 features.denseblock1.denselayer5.conv2 128 61 61 32 61 61 36864.0 0.45 274,222,816.0 137,170,944.0 2052608.0 476288.0 0.80% 2528896.0
34 features.denseblock1.denselayer6.norm1 224 61 61 224 61 61 448.0 3.18 3,334,016.0 1,667,008.0 3335808.0 3334016.0 0.80% 6669824.0
35 features.denseblock1.denselayer6.relu1 224 61 61 224 61 61 0.0 3.18 833,504.0 833,504.0 3334016.0 3334016.0 0.80% 6668032.0
36 features.denseblock1.denselayer6.conv1 224 61 61 128 61 61 28672.0 1.82 212,900,736.0 106,688,512.0 3448704.0 1905152.0 1.60% 5353856.0
37 features.denseblock1.denselayer6.norm2 128 61 61 128 61 61 256.0 1.82 1,905,152.0 952,576.0 1906176.0 1905152.0 0.80% 3811328.0
38 features.denseblock1.denselayer6.relu2 128 61 61 128 61 61 0.0 1.82 476,288.0 476,288.0 1905152.0 1905152.0 0.00% 3810304.0
39 features.denseblock1.denselayer6.conv2 128 61 61 32 61 61 36864.0 0.45 274,222,816.0 137,170,944.0 2052608.0 476288.0 1.60% 2528896.0
40 features.transition1.norm 256 61 61 256 61 61 512.0 3.63 3,810,304.0 1,905,152.0 3812352.0 3810304.0 0.80% 7622656.0
41 features.transition1.relu 256 61 61 256 61 61 0.0 3.63 952,576.0 952,576.0 3810304.0 3810304.0 0.00% 7620608.0
42 features.transition1.conv 256 61 61 128 61 61 32768.0 1.82 243,383,168.0 121,929,728.0 3941376.0 1905152.0 2.40% 5846528.0
43 features.transition1.pool 128 61 61 128 30 30 0.0 0.44 460,800.0 476,288.0 1905152.0 460800.0 1.60% 2365952.0
44 features.denseblock2.denselayer1.norm1 128 30 30 128 30 30 256.0 0.44 460,800.0 230,400.0 461824.0 460800.0 0.00% 922624.0
45 features.denseblock2.denselayer1.relu1 128 30 30 128 30 30 0.0 0.44 115,200.0 115,200.0 460800.0 460800.0 0.00% 921600.0
46 features.denseblock2.denselayer1.conv1 128 30 30 128 30 30 16384.0 0.44 29,376,000.0 14,745,600.0 526336.0 460800.0 0.80% 987136.0
47 features.denseblock2.denselayer1.norm2 128 30 30 128 30 30 256.0 0.44 460,800.0 230,400.0 461824.0 460800.0 0.00% 922624.0
48 features.denseblock2.denselayer1.relu2 128 30 30 128 30 30 0.0 0.44 115,200.0 115,200.0 460800.0 460800.0 0.00% 921600.0
49 features.denseblock2.denselayer1.conv2 128 30 30 32 30 30 36864.0 0.11 66,326,400.0 33,177,600.0 608256.0 115200.0 0.80% 723456.0
50 features.denseblock2.denselayer2.norm1 160 30 30 160 30 30 320.0 0.55 576,000.0 288,000.0 577280.0 576000.0 0.00% 1153280.0
51 features.denseblock2.denselayer2.relu1 160 30 30 160 30 30 0.0 0.55 144,000.0 144,000.0 576000.0 576000.0 0.00% 1152000.0
52 features.denseblock2.denselayer2.conv1 160 30 30 128 30 30 20480.0 0.44 36,748,800.0 18,432,000.0 657920.0 460800.0 0.00% 1118720.0
53 features.denseblock2.denselayer2.norm2 128 30 30 128 30 30 256.0 0.44 460,800.0 230,400.0 461824.0 460800.0 0.00% 922624.0
54 features.denseblock2.denselayer2.relu2 128 30 30 128 30 30 0.0 0.44 115,200.0 115,200.0 460800.0 460800.0 0.80% 921600.0
55 features.denseblock2.denselayer2.conv2 128 30 30 32 30 30 36864.0 0.11 66,326,400.0 33,177,600.0 608256.0 115200.0 0.00% 723456.0
56 features.denseblock2.denselayer3.norm1 192 30 30 192 30 30 384.0 0.66 691,200.0 345,600.0 692736.0 691200.0 0.80% 1383936.0
57 features.denseblock2.denselayer3.relu1 192 30 30 192 30 30 0.0 0.66 172,800.0 172,800.0 691200.0 691200.0 0.00% 1382400.0
58 features.denseblock2.denselayer3.conv1 192 30 30 128 30 30 24576.0 0.44 44,121,600.0 22,118,400.0 789504.0 460800.0 0.00% 1250304.0
59 features.denseblock2.denselayer3.norm2 128 30 30 128 30 30 256.0 0.44 460,800.0 230,400.0 461824.0 460800.0 0.80% 922624.0
60 features.denseblock2.denselayer3.relu2 128 30 30 128 30 30 0.0 0.44 115,200.0 115,200.0 460800.0 460800.0 0.00% 921600.0
61 features.denseblock2.denselayer3.conv2 128 30 30 32 30 30 36864.0 0.11 66,326,400.0 33,177,600.0 608256.0 115200.0 0.00% 723456.0
62 features.denseblock2.denselayer4.norm1 224 30 30 224 30 30 448.0 0.77 806,400.0 403,200.0 808192.0 806400.0 0.80% 1614592.0
63 features.denseblock2.denselayer4.relu1 224 30 30 224 30 30 0.0 0.77 201,600.0 201,600.0 806400.0 806400.0 0.00% 1612800.0
64 features.denseblock2.denselayer4.conv1 224 30 30 128 30 30 28672.0 0.44 51,494,400.0 25,804,800.0 921088.0 460800.0 0.80% 1381888.0
65 features.denseblock2.denselayer4.norm2 128 30 30 128 30 30 256.0 0.44 460,800.0 230,400.0 461824.0 460800.0 0.00% 922624.0
66 features.denseblock2.denselayer4.relu2 128 30 30 128 30 30 0.0 0.44 115,200.0 115,200.0 460800.0 460800.0 0.00% 921600.0
67 features.denseblock2.denselayer4.conv2 128 30 30 32 30 30 36864.0 0.11 66,326,400.0 33,177,600.0 608256.0 115200.0 0.00% 723456.0
68 features.denseblock2.denselayer5.norm1 256 30 30 256 30 30 512.0 0.88 921,600.0 460,800.0 923648.0 921600.0 0.00% 1845248.0
69 features.denseblock2.denselayer5.relu1 256 30 30 256 30 30 0.0 0.88 230,400.0 230,400.0 921600.0 921600.0 0.00% 1843200.0
70 features.denseblock2.denselayer5.conv1 256 30 30 128 30 30 32768.0 0.44 58,867,200.0 29,491,200.0 1052672.0 460800.0 0.00% 1513472.0
71 features.denseblock2.denselayer5.norm2 128 30 30 128 30 30 256.0 0.44 460,800.0 230,400.0 461824.0 460800.0 0.00% 922624.0
72 features.denseblock2.denselayer5.relu2 128 30 30 128 30 30 0.0 0.44 115,200.0 115