About Dropout

1.Dropout: A simple way to prevent neural networks from overfitting

The abstract points out that overfitting is a serious problem for deep neural networks, and that large networks with huge numbers of parameters are slow, so the traditional remedy of combining the predictions of many separately trained models is impractical. Dropout is therefore proposed to address overfitting; its core idea is to randomly drop units during training.

During training, this is equivalent to randomly sampling from an exponential number of "thinned" networks; at test time, it approximates averaging the predictions of all these thinned networks (the ensemble idea). This effectively reduces overfitting and gives larger improvements than other regularization methods.
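A minimal NumPy sketch of this training-time behavior, assuming a single fully connected ReLU layer (the function name, shapes, and activation are illustrative assumptions, not from the paper):

```python
import numpy as np

def dropout_train_forward(x, W, b, p_retain=0.5, rng=np.random.default_rng(0)):
    """One training-time forward pass of a fully connected layer with dropout.

    Each hidden unit is kept with probability p_retain and zeroed otherwise,
    so every forward pass effectively samples one 'thinned' sub-network.
    """
    h = np.maximum(0.0, x @ W + b)            # ReLU activations before dropout
    mask = rng.random(h.shape) < p_retain     # Bernoulli(p_retain) keep-mask
    return h * mask                           # dropped units output exactly 0

# Usage on a hypothetical batch of 4 examples with 8 inputs and 16 hidden units:
x = np.random.randn(4, 8)
W = 0.1 * np.random.randn(8, 16)
b = np.zeros(16)
h_train = dropout_train_forward(x, W, b, p_retain=0.5)
```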

The paper also mentions several other ways to prevent overfitting:

1. early stopping (based on performance on a validation set);
2. weight penalties (L1 or L2 regularization);
3. soft weight sharing

For a neural network with n units, 2^n "thinned" networks can be sampled, but because these networks share parameters, the total number of parameters is still O(n^2). At test time, a single network without dropout is used, whose weights are scaled-down versions of the trained weights: each outgoing weight of a unit is multiplied by the probability p with which that unit was retained during training.
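A small sketch of this test-time weight scaling, assuming dropout was applied to a hidden layer with retention probability p (names and sizes are illustrative); it also checks numerically that, for a linear output layer, the single scaled-down network matches the average prediction of many sampled thinned networks:

```python
import numpy as np

def scale_outgoing_weights(W_next, p_retain):
    """Test-time network: outgoing weights of a dropped layer are multiplied
    by the probability p_retain with which its units were retained."""
    return W_next * p_retain

rng = np.random.default_rng(0)
h = np.maximum(0.0, rng.standard_normal((1, 16)))      # hidden activations
W_out = 0.1 * rng.standard_normal((16, 3))             # hidden -> output weights

# Average output over many sampled thinned networks ...
mc_avg = np.mean(
    [(h * (rng.random(h.shape) < 0.5)) @ W_out for _ in range(20000)], axis=0
)
# ... is closely matched by the single network with scaled-down weights.
test_out = h @ scale_outgoing_weights(W_out, p_retain=0.5)
print(np.allclose(mc_avg, test_out, atol=1e-2))        # expected: True
```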

Dropping out 20% of the input units and 50% of the hidden units was often found to be optimal.
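For instance, a hypothetical fully connected classifier using those rates might look like the following PyTorch sketch (the 784-1024-1024-10 architecture is an assumption for illustration; note that nn.Dropout's argument p is the probability of dropping a unit, whereas the paper's p is the probability of retaining it):

```python
import torch.nn as nn

model = nn.Sequential(
    nn.Dropout(p=0.2),        # drop 20% of the input units
    nn.Linear(784, 1024),
    nn.ReLU(),
    nn.Dropout(p=0.5),        # drop 50% of the hidden units
    nn.Linear(1024, 1024),
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(1024, 10),
)
```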

Model:

Although dropout alone gives significant improvements, using dropout along with maxnorm regularization, large decaying learning rates and high momentum provides a significant boost over just using dropout.

A possible justification is that constraining weight vectors to lie inside a ball of fixed radius makes it possible to use a huge learning rate without the possibility of weights blowing up. 

The noise provided by dropout then allows the optimization process to explore different regions of the weight space that would have otherwise been difficult to reach.

As the learning rate decays, the optimization takes shorter steps, thereby doing less exploration and eventually settles into a minimum.
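A hedged PyTorch sketch of this training recipe, combining a max-norm constraint with high momentum and a decaying learning rate on top of dropout; the specific values (max_norm=3.0, lr=0.01, momentum=0.95, gamma=0.998) are illustrative choices, not taken from this note:

```python
import torch

def max_norm_(weight, max_norm=3.0):
    """Project each unit's incoming weight vector back into a ball of radius max_norm."""
    with torch.no_grad():
        norms = weight.norm(p=2, dim=1, keepdim=True)   # one norm per output unit
        weight.mul_((max_norm / norms).clamp(max=1.0))  # shrink only when norm > max_norm

layer = torch.nn.Linear(784, 1024)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.01, momentum=0.95)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.998)

# Sketch of one training step:
#   loss.backward(); optimizer.step(); optimizer.zero_grad()
#   max_norm_(layer.weight)      # keep weight vectors inside the fixed-radius ball
# and once per epoch: scheduler.step()
```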

Dropout makes hidden-unit activations sparse:

We found that as a side-effect of doing dropout, the activations of the hidden units become sparse, even when no sparsity inducing regularizers are present. Thus, dropout automatically leads to sparse representations.

 

2.Dropout for RNN

 
