ImageNet Evolution Paper Notes (4)

Deep Residual Learning for Image Recognition

Degradation problem: with the network depth increasing, accuracy gets saturated (which might be unsurprising) and then degrades rapidly. [This work removes the side effect of adding depth (the degradation problem), so that network performance can be improved simply by making the network deeper.]

Deep Residual Learning

Identity Mapping by Shortcuts
We explicitly let these layers approximate a residual function F(x) := H(x) − x. The original function thus becomes F(x) + x.
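To make the formulation concrete, here is a minimal PyTorch sketch of a residual block, assuming the basic two-layer 3×3 design with matching dimensions; the class name and layer layout are illustrative, not the paper's released code:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Residual formulation: output = F(x) + x, where the two stacked
    3x3 conv layers approximate the residual F(x) = H(x) - x."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        residual = self.bn2(self.conv2(self.relu(self.bn1(self.conv1(x)))))  # F(x)
        return self.relu(residual + x)  # H(x) = F(x) + x
```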
In the residual network diagrams, shortcut connections with matching dimensions are drawn as solid lines, mismatched ones as dashed lines. When dimensions do not match, there are two options for the identity mapping: (A) increase the dimensionality (channels) directly by zero padding, adding no parameters; (B) multiply by a projection matrix W to map x into the new space, implemented as a 1x1 convolution whose number of filters sets the new channel count. This second option adds parameters; a sketch of it follows.
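A minimal sketch of option B, assuming the common PyTorch idiom of a 1×1 convolution followed by BN as the projection; the helper name is hypothetical:

```python
import torch.nn as nn

def projection_shortcut(in_channels, out_channels, stride=1):
    """Option B: project x with a 1x1 convolution (the projection matrix W);
    setting the number of 1x1 filters to out_channels changes the channel
    count, at the cost of extra parameters. A stride of 2 also handles
    spatial downsampling when the block reduces resolution."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=1,
                  stride=stride, bias=False),
        nn.BatchNorm2d(out_channels),
    )
```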
Network Architectures
1. All convolution kernels are 3x3; convolutions with stride 2 replace pooling for downsampling; Batch Normalization is used [an intermediate normalization layer]. A sketch of this basic unit follows the list.
2. Max pooling, fully connected layers, and Dropout are removed.
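As referenced above, a sketch of the basic conv–BN–ReLU unit with stride-2 downsampling in place of pooling; assuming PyTorch, and the helper name is illustrative:

```python
import torch.nn as nn

def conv_bn_relu(in_channels, out_channels, stride=1):
    """Basic unit of the architecture: 3x3 convolution, then BN, then ReLU.
    Passing stride=2 downsamples the feature map instead of using pooling."""
    return nn.Sequential(
        nn.Conv2d(in_channels, out_channels, kernel_size=3,
                  stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_channels),  # BN right after the convolution
        nn.ReLU(inplace=True),         # activation applied after BN
    )
```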
Deeper networks: the residual mapping is restructured with a bottleneck design [original: 3x3x256x256 -> 3x3x256x256; bottleneck: 1x1x256x64 -> 3x3x64x64 -> 1x1x64x256], reducing computation while preserving the block's input/output dimensions.
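A minimal sketch of the bottleneck block under the 256/64 channel numbers quoted above (assuming PyTorch; names are illustrative):

```python
import torch.nn as nn

class Bottleneck(nn.Module):
    """Bottleneck residual block: 1x1 reduces 256 -> 64 channels, the 3x3
    conv operates at 64 channels, and 1x1 restores 64 -> 256, which is far
    cheaper than two 3x3x256x256 layers."""
    def __init__(self, channels=256, bottleneck=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, bottleneck, kernel_size=1, bias=False),
            nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, bottleneck, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(bottleneck), nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck, channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)  # identity shortcut, dims match
```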

Training parameters

1. The image is resized with its shorter side randomly sampled in [256, 480] for scale augmentation. A 224×224 crop is randomly sampled from the image or its horizontal flip, with the per-pixel mean subtracted.
2. Standard color augmentation is used; batch normalization (BN) is adopted right after each convolution and before activation.
3. SGD with a mini-batch size of 256. The learning rate starts from 0.1 and is divided by 10 when the error plateaus; the models are trained for up to 60 × 10^4 iterations.
4. A weight decay of 0.0001 and a momentum of 0.9 are used. (A configuration sketch of these settings follows the list.)
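A sketch of these training settings assuming PyTorch/torchvision. The custom resize class is hypothetical, the model is a stand-in, and the per-pixel mean values are the usual ImageNet channel means rather than numbers from the paper:

```python
import random
import torch
import torchvision
from torchvision import transforms

class RandomShorterSideResize:
    """Scale augmentation: resize so the shorter side is random in [256, 480]."""
    def __init__(self, lo=256, hi=480):
        self.lo, self.hi = lo, hi

    def __call__(self, img):
        return transforms.functional.resize(img, random.randint(self.lo, self.hi))

train_transform = transforms.Compose([
    RandomShorterSideResize(),
    transforms.RandomCrop(224),         # random 224x224 crop
    transforms.RandomHorizontalFlip(),  # ... or its horizontal flip
    transforms.ToTensor(),
    # Per-pixel mean subtraction; these are the usual ImageNet channel means.
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[1.0, 1.0, 1.0]),
])

model = torchvision.models.resnet34()   # stand-in model for the sketch
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
# Divide the learning rate by 10 whenever the (validation) error plateaus.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.1)
```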
Testing
1. Adopt the standard 10-crop testing (a sketch follows the list).
2. Average the scores at multiple scales (images are resized such that the shorter side is in {224, 256, 384, 480, 640}).
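A sketch of 10-crop testing at a single scale, using torchvision's TenCrop and assuming a trained `model`; the multi-scale averaging above would wrap this in a loop over the listed shorter-side sizes:

```python
import torch
from torchvision import transforms

# Standard 10-crop testing: four corner crops + center crop, plus their
# horizontal flips; the class scores of all 10 crops are averaged.
ten_crop = transforms.Compose([
    transforms.Resize(256),
    transforms.TenCrop(224),
    transforms.Lambda(lambda crops: torch.stack(
        [transforms.ToTensor()(c) for c in crops])),
])

def predict_10crop(model, img):
    model.eval()
    crops = ten_crop(img)                     # (10, 3, 224, 224)
    with torch.no_grad():
        scores = model(crops).softmax(dim=1)  # (10, num_classes)
    return scores.mean(dim=0)                 # averaged prediction
```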
