Paper -- DenseNet: Densely Connected Convolutional Networks

Abstract:

DenseNet breaks away from the fixed pattern of improving network performance by making the network deeper (ResNet) or wider (Inception). Instead, it approaches the problem from the perspective of features: through feature reuse and bypass connections, it greatly reduces the number of network parameters and alleviates the vanishing-gradient problem to a certain extent.
[Figure]

DenseNets have several compelling advantages:

Alleviate the vanishing-gradient problem
Strengthen feature propagation
Encourage feature reuse
Substantially reduce the number of parameters
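
To make the feature-reuse idea concrete, here is a minimal sketch of a single dense layer, assuming PyTorch; the BN-ReLU-Conv ordering follows the paper's composite function, but the class name and parameters here are illustrative rather than the reference implementation:

```python
import torch
import torch.nn as nn

# Minimal sketch of one dense layer (BN-ReLU-Conv composite, as in the paper).
# The key point: the layer's output is concatenated with its input, so every
# later layer sees the feature maps of every earlier layer.
class DenseLayer(nn.Module):
    def __init__(self, in_channels: int, growth_rate: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_channels, growth_rate, kernel_size=3,
                              padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        new_features = self.conv(self.relu(self.bn(x)))
        # Feature reuse: keep all previous feature maps alongside the new ones.
        return torch.cat([x, new_features], dim=1)
```

Because each layer contributes only growth_rate new channels, the per-layer parameter count stays small, which is where the overall parameter savings come from.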

From the figure above, we can draw the following conclusions:

a) Some features extracted by earlier layers may still be used directly by much deeper layers.

b) Even the transition layers use features from all layers of the preceding dense block.

c) The layers in the second and third dense blocks make little use of the outputs of the preceding transition layer, indicating that the transition layer outputs a large number of redundant features. This also provides evidence for DenseNet-BC, i.e. the necessity of compression (see the sketch after this list).

d) Although the final classification layer uses information from many layers of the preceding dense block, it leans toward the features of the last few feature maps, suggesting that some high-level features are only produced in the last few layers of the network.
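
Point (c) is what motivates compression in DenseNet-BC. Below is a hedged sketch of a transition layer, again assuming PyTorch: a 1x1 convolution shrinks the channel count by a factor theta (0.5 in the paper) and a 2x2 average pooling halves the spatial resolution; the class name is illustrative.

```python
import torch.nn as nn

# Sketch of a DenseNet-BC transition layer: a 1x1 convolution compresses the
# channel count by a factor theta, then 2x2 average pooling downsamples.
# Compression discards part of the redundant features that point (c) above
# identifies.
class Transition(nn.Module):
    def __init__(self, in_channels: int, theta: float = 0.5):
        super().__init__()
        out_channels = int(in_channels * theta)
        self.bn = nn.BatchNorm2d(in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self.pool = nn.AvgPool2d(kernel_size=2, stride=2)

    def forward(self, x):
        return self.pool(self.conv(self.relu(self.bn(x))))
```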


Every layer receives shortcut connections from all of the layers before it, so that any two layers in the network can "communicate" directly. This is shown in the figure below:

[Figure]
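
A dense block is then just a chain of such layers: since each DenseLayer (sketched earlier) concatenates its output onto its input, layer l automatically sees the feature maps of all layers before it. The following sketch, under the same PyTorch assumption, shows the channel bookkeeping:

```python
import torch
import torch.nn as nn

# Sketch of a dense block built from the DenseLayer above: layer i receives
# in_channels + i * growth_rate input channels, i.e. the concatenation of the
# block input and the outputs of all previous layers -- the direct
# "communication" between any two layers described above.
class DenseBlock(nn.Module):
    def __init__(self, num_layers: int, in_channels: int, growth_rate: int):
        super().__init__()
        self.layers = nn.ModuleList([
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(num_layers)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)  # each call concatenates new features onto x
        return x
```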
Benefits:

From the feature perspective, every time a layer's features are reused, they can be regarded as undergoing a fresh normalization; the experimental results show that even with BN removed, a deep DenseNet still maintains a good convergence rate.
From the receptive-field perspective, shallow and deep receptive fields can be combined more freely, which makes the model's results more robust.
From the wide-network perspective, DenseNet can be regarded as a genuinely wide network; during training its gradients are more stable than ResNet's, so it naturally converges faster (the experiments in the paper corroborate this).
