Study Notes: Dropout, DropConnect, DropPath

Motivations

One of the major challenges when training a model in (deep) machine learning is co-adaptation: neurons become highly dependent on one another, influencing each other considerably rather than responding independently to their inputs. It is also common for some neurons to have far greater predictive capacity than others; in other words, the output becomes excessively dependent on a single neuron [1].

These effects must be avoided, and the weight must be distributed across the neurons to prevent overfitting. Co-adaptation and the excessive predictive capacity of individual neurons can be controlled with various regularization methods, one of the most widely used being Dropout. Yet the full range of dropout methods is rarely exploited [1].

Standard Dropout

The most well-known and most used dropout method is the Standard Dropout introduced in 2012 by Hinton et al. It is usually simply called "Dropout"; in this article we will call it Standard Dropout to distinguish it from its variants.

During training, Dropout randomly discards a fixed fraction of the neurons of the base network at each iteration; the forward pass and the backpropagation of the error are then performed on the modified network, as shown in the figure below. Note that all neurons are restored at test time. Dropout is a widely used regularization method that mitigates overfitting [2].

[Figure: Dropout applied to a network during training]

On one hand, Dropout can be viewed as a Bagging ensemble of a large number of neural networks. Bagging trains several different models on the same data and obtains the final prediction by voting or averaging over those models. During training, Dropout changes the network structure at each iteration by randomly dropping some neurons, effectively training networks of different structures; at test time, all neurons are used, which is equivalent to letting all the previously trained structures vote on the final result, thereby improving performance. In this way, Dropout provides a powerful, fast, and easy-to-implement approximation of Bagging. Note, however, that in original Bagging all models are independent, whereas in Dropout the different networks share parameters [2].

On the other hand, Dropout reduces the complex co-adaptation between neurons. Since the dropped neurons are chosen at random, each retained sub-network contains a different subset of neurons; during training, the weight updates therefore do not depend on fixed relations between hidden units (fixed relations could produce joint effects that interfere with learning). In other words, no neuron becomes overly sensitive to the activation of another specific neuron, which pushes the network to learn more generalizable features [2].
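The train-versus-test behaviour described above can be sketched with a minimal NumPy implementation of *inverted* dropout, a common variant that scales the surviving activations by 1/(1−p) during training so that no rescaling is needed at test time; the function name and interface here are illustrative, not from any particular library:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout: during training, zero each unit with probability p
    and scale survivors by 1/(1-p) so the expected activation is unchanged;
    at test time, return x untouched (all neurons active)."""
    if not training or p == 0.0:
        return x
    if rng is None:
        rng = np.random.default_rng()
    mask = rng.random(x.shape) >= p   # keep each unit with probability 1-p
    return x * mask / (1.0 - p)
```

Because of the inverted scaling, the expected value of each activation is the same in training and test mode, which is what lets the test-time network stand in for the average of the sampled sub-networks.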

DropConnect

Perhaps you are already familiar with the Standard Dropout method, but there are many variations. To regularize the forward pass of a dense network, Standard Dropout is applied to the neurons themselves. DropConnect, introduced by L. Wan et al., instead applies dropout to the weights and biases linking these neurons.

[Figure: DropConnect masking individual weights rather than neurons]
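A minimal sketch of the idea for a single dense layer, using the same inverted-scaling convention as above; note that the original DropConnect paper uses a moment-matching (Gaussian) approximation at inference time, whereas this simplified sketch just uses the full weights at test time:

```python
import numpy as np

def dropconnect_linear(x, W, b, p=0.5, training=True, rng=None):
    """DropConnect for one dense layer: drop individual *weights* and
    *biases* with probability p, rather than whole output neurons as
    Standard Dropout does; survivors are scaled by 1/(1-p)."""
    if not training or p == 0.0:
        return x @ W + b          # simplified inference: full weights
    if rng is None:
        rng = np.random.default_rng()
    W_mask = rng.random(W.shape) >= p
    b_mask = rng.random(b.shape) >= p
    return x @ (W * W_mask / (1.0 - p)) + b * b_mask / (1.0 - p)
```

Since the mask is drawn per connection rather than per neuron, DropConnect samples from a strictly larger family of sub-networks than Standard Dropout applied to the same layer.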

DropPath / Stochastic Depth

DropPath randomly removes entire branches of a multi-branch network during training. In residual networks this idea is known as Stochastic Depth: each residual branch is skipped with some probability, so every training iteration effectively runs a network of random depth.

[Figure: FractalNet, Fig. 2 — drop-path removing branches of the fractal architecture]
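A minimal sketch of per-iteration stochastic depth for one residual block; the interface is illustrative (practical implementations, e.g. the `drop_path` helper in the timm library, usually draw an independent decision per sample in the batch rather than one decision per iteration):

```python
import numpy as np

def drop_path(residual_fn, x, p=0.2, training=True, rng=None):
    """Stochastic depth for a residual block: with probability p, skip
    the residual branch entirely (identity only); otherwise scale the
    branch output by 1/(1-p). At test time the full branch is used."""
    if not training or p == 0.0:
        return x + residual_fn(x)
    if rng is None:
        rng = np.random.default_rng()
    if rng.random() < p:              # drop the whole branch this pass
        return x
    return x + residual_fn(x) / (1.0 - p)
```

Because a dropped block reduces to the identity, the gradient still flows through the skip connection, which is what makes training very deep residual networks with stochastic depth stable.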

References:

[1] 12 Main Dropout Methods: Mathematical and Visual Explanation for DNNs, CNNs, and RNNs

[2] 诸葛越, 百面深度学习: 算法工程师带你去面试
