Key Points
1 Vanishing/exploding gradients are largely addressed by normalized initialization and intermediate normalization layers (e.g., batch normalization, BN)
2 As depth increases, a plain network's accuracy saturates and then degrades rapidly; this is not caused by overfitting (the degradation problem)
3 ResNet hypothesizes that it is easier to optimize the residual mapping than to optimize the original, unreferenced mapping.
Denote the desired underlying mapping as H(x) and the residual mapping as F(x), where F(x) := H(x) − x; the original mapping then becomes F(x) + x.
Reasoning: The degradation problem suggests that the solvers might have difficulties in approximating identity mappings by multiple nonlinear layers. With the residual learning reformulation, if identity mappings are optimal, the solvers may simply drive the weights of the multiple nonlinear layers toward zero to approach identity mappings.
If the optimal function is closer to an identity mapping than to a zero mapping, it should be easier for the solver to find perturbations with reference to an identity mapping than to learn the function as a new one (see the sketch below).
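Below is a minimal sketch of such a residual block, assuming PyTorch; the class name BasicResidualBlock and its layer layout are illustrative, not the authors' reference implementation. The stacked convolutions learn F(x), and the shortcut adds x back so the block outputs F(x) + x.

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Illustrative residual block: output = F(x) + x."""
    def __init__(self, channels: int):
        super().__init__()
        # F(x): two 3x3 conv layers with batch normalization
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = self.relu(self.bn1(self.conv1(x)))
        residual = self.bn2(self.conv2(residual))
        # If the optimal mapping is near identity, the conv weights can be
        # driven toward zero so the output stays close to x.
        return self.relu(residual + x)  # H(x) = F(x) + x
```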
4 The Design of ResNet:
Bottleneck
Reducing the number of channels with 1 x 1 convolutions (and restoring them afterwards) cuts parameters and computation, which makes much deeper networks affordable (see the sketch after this list).
The parameter-free identity shortcuts are particularly important for the bottleneck architectures.
Identity Mapping by Shortcuts (see Figure 2)
When the dimensions increase, two methods:
A) identity mapping with zero padding
B) projection shortcut by 1 x 1 convolutions
Network Framework
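The following is a minimal sketch of a bottleneck block with the shortcut options above, again assuming PyTorch; the name BottleneckBlock and its exact arguments are illustrative rather than the paper's reference code. Option B (projection) is implemented; option A (zero padding) is noted in a comment.

```python
import torch
import torch.nn as nn

class BottleneckBlock(nn.Module):
    """Illustrative bottleneck block: 1x1 reduce -> 3x3 -> 1x1 restore, plus shortcut."""
    def __init__(self, in_channels: int, mid_channels: int, out_channels: int, stride: int = 1):
        super().__init__()
        # The 1x1 convs reduce and then restore the channel count, so the 3x3 conv
        # operates on fewer channels: fewer parameters and less computation.
        self.conv1 = nn.Conv2d(in_channels, mid_channels, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(mid_channels)
        self.conv2 = nn.Conv2d(mid_channels, mid_channels, 3, stride=stride, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(mid_channels)
        self.conv3 = nn.Conv2d(mid_channels, out_channels, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        if stride != 1 or in_channels != out_channels:
            # Option B: projection shortcut (1x1 conv) when dimensions change.
            # Option A would instead keep the shortcut parameter-free by
            # downsampling and padding the identity with zeros.
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        else:
            # Parameter-free identity shortcut, the common case.
            self.shortcut = nn.Identity()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.relu(out + self.shortcut(x))
```

For example, a stage like ResNet-50's conv2 stacks blocks of the form BottleneckBlock(256, 64, 256), where the 3x3 conv runs on 64 channels instead of 256.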
5 Experiments on ImageNet show:
- ResNets are easier to optimize; the plain nets exhibit higher training error as depth increases
- ResNets easily gain accuracy from increased depth, producing results substantially better than previous networks
6 Experiment Results: