[Paper Notes] Residual Neural Network - Kaiming He

http://arxiv.org/abs/1512.03385


Simply stacking more layers causes degradation: with the same training setup, a deeper plain network reaches lower accuracy than a shallower one, and this is not caused by overfitting or by vanishing gradients.


Let H(x) be the desired underlying mapping and F(x) the residual, so that H(x) = F(x) + x. The hypothesis is that F(x) is easier for the stacked layers to approximate than H(x) itself, and the experiments in the paper support this.
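For concreteness, a minimal sketch of such a residual block in PyTorch; the class name, channel count, and BatchNorm placement are illustrative assumptions, not the paper's exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """Two 3x3 convolutions compute the residual F(x); the block outputs F(x) + x."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))  # first half of F(x)
        out = self.bn2(self.conv2(out))        # second half of F(x)
        return F.relu(out + x)                 # H(x) = F(x) + x via the identity shortcut

# quick check with a dummy feature map
y = BasicBlock(64)(torch.randn(1, 64, 32, 32))
print(y.shape)  # torch.Size([1, 64, 32, 32])
```

Because the shortcut is an identity, the block adds no extra parameters compared with a plain two-layer stack.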

The bottleneck architecture of ResNets is more economical: each block uses 1x1 convolutions to reduce and then restore the channel dimension around a single 3x3 convolution, which keeps the computation affordable for very deep networks.
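A sketch of the bottleneck variant under the same assumptions; the channel numbers below are only examples, and the real networks also use a projection shortcut where the dimensions change:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Bottleneck(nn.Module):
    """1x1 reduce -> 3x3 conv -> 1x1 restore, wrapped by an identity shortcut."""
    def __init__(self, channels, reduced):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, reduced, kernel_size=1, bias=False)  # reduce channels
        self.bn1 = nn.BatchNorm2d(reduced)
        self.conv2 = nn.Conv2d(reduced, reduced, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(reduced)
        self.conv3 = nn.Conv2d(reduced, channels, kernel_size=1, bias=False)  # restore channels
        self.bn3 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = F.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return F.relu(out + x)  # identity shortcut works because input/output channels match

# e.g. 256 channels squeezed to 64 for the 3x3 convolution
y = Bottleneck(256, 64)(torch.randn(1, 256, 16, 16))
print(y.shape)  # torch.Size([1, 256, 16, 16])
```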




ReLU

Activation function: y = max(0, x). Its derivative is a step function, 0 for x < 0 and 1 for x > 0. Because y = 0 when x < 0, it reduces the number of active neurons in the network: roughly half of the units pass no gradient, so fewer parameters are updated during backpropagation, which speeds up training.
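A tiny NumPy illustration of ReLU and its derivative (the function names here are arbitrary):

```python
import numpy as np

def relu(x):
    """ReLU activation: max(0, x)."""
    return np.maximum(0.0, x)

def relu_grad(x):
    """Derivative of ReLU: 1 where x > 0, 0 where x < 0 (a step function)."""
    return (x > 0).astype(float)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # [0.  0.  0.  0.5 2. ]
print(relu_grad(x))  # [0. 0. 0. 1. 1.]
```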


Gradient vanishing

Saturating activation functions such as the sigmoid (output range (0, 1)) and tanh (output range (-1, 1)) have derivatives smaller than 1 almost everywhere. Backpropagation multiplies these derivatives layer by layer, so the gradient shrinks as it flows toward the input and the parameters of the shallow layers can no longer be updated effectively; this is called gradient vanishing.
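A toy illustration of why this happens, assuming sigmoid activations and ignoring the weight matrices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

# The gradient reaching an early layer contains a product of per-layer
# activation derivatives; with the sigmoid each factor is at most 0.25,
# so even in the best case the product shrinks exponentially with depth.
for depth in (5, 10, 20, 50):
    print(depth, sigmoid_grad(0.0) ** depth)
```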
