Where Should Dropout Layers Be Added?

qq_27292549

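This post looks at the role of Dropout layers in deep learning models, in particular their ability to prevent overfitting in convolutional neural networks. Drawing on an English-language blog post and the original paper, it describes how Dropout helps a network learn more independent internal representations, and points to an example of its use on the CIFAR-10 dataset.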

Are Dropout Layers Effective?

April 9, 2018, 22:00:15

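In earlier classification work, I usually added Dropout after the fully connected layers to prevent overfitting and improve generalization. Dropout after convolutional layers is much less common (mainly because convolutional layers have few parameters and are less prone to overfitting). Today I looked through a few blog posts on the topic and am recording what I found here.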

First, an English-language blog post (the author's whole series is well written): Dropout Regularization For Neural Networks
A Chinese translation is also available: 基于Keras/Python的深度学习模型Dropout正则项

You can imagine that if neurons are randomly dropped out of the network during training, that other neurons will have to step in and handle the representation required to make predictions for the missing neurons. This is believed to result in multiple independent internal representations being learned by the network.

The effect is that the network becomes less sensitive to the specific weights of neurons. This in turn results in a network that is capable of better generalization and is less likely to overfit the training data.
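The mechanism described in the quote can be sketched in a few lines. This is a minimal illustration of "inverted" dropout on a flat list of activations, not the code from the linked blog post; the function name `dropout` and its parameters are chosen here for illustration. Each unit is zeroed with probability `rate` during training, and the survivors are scaled by `1 / (1 - rate)` so the expected activation stays the same. At inference time the input passes through unchanged.

```python
import random


def dropout(activations, rate, training=True, rng=None):
    """Inverted dropout over a flat list of activations (illustrative sketch)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    # At inference time, or with rate 0, nothing is dropped.
    if not training or rate == 0.0:
        return list(activations)
    if rng is None:
        rng = random.Random()
    keep = 1.0 - rate
    # Keep each unit with probability `keep` and rescale it; zero it otherwise.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
acts = [1.0] * 10000
dropped = dropout(acts, rate=0.5, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)  # close to the drop rate
mean_after = sum(dropped) / len(dropped)           # close to the original mean of 1.0
```

Because a different random subset of units is silenced on every training step, no single unit can be relied on, which is the intuition behind the "multiple independent internal representations" in the quote.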

An example of using Dropout on the CIFAR dataset: 92.45% on CIFAR-10 in Torch
In that example, Dropout is added to both the convolutional and the fully connected layers. But dropout values are usually < 0.5, e.g. 0.1, 0.2, 0.3 for the convolutional layers.

Here is also the view from the paper that introduced Dropout:

from the Srivastava/Hinton dropout paper:

“The additional gain in performance obtained by adding dropout in the convolutional layers (3.02% to 2.55%) is worth noting. One may have presumed that since the convolutional layers don’t have a lot of parameters, overfitting is not a problem and therefore dropout would not have much effect. However, dropout in the lower layers still helps because it provides noisy inputs for the higher fully connected layers which prevents them from overfitting.”
They use 0.7 prob for conv drop out and 0.5 for fully connected.
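The numbers above are easy to misread, because the dropout paper defines p as the probability of retaining a unit, while the `rate` argument of Keras `Dropout` and the `p` argument of `torch.nn.Dropout` are the fraction of units to drop. The sketch below converts between the two conventions. It assumes that the quoted 0.7 and 0.5 are retention probabilities, which is a reading of the quote and not something the quote states; the helper name `keep_prob_to_drop_rate` is made up for this example.

```python
def keep_prob_to_drop_rate(keep_prob):
    """Convert a retention probability (paper convention) to a drop rate."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return 1.0 - keep_prob


# Assumption: the quoted values are retention probabilities.
quoted_keep_probs = {"conv": 0.7, "fully_connected": 0.5}

drop_rates = {
    name: round(keep_prob_to_drop_rate(p), 2)
    for name, p in quoted_keep_probs.items()
}
```

Under that reading, the convolutional layers drop about 30% of their units, which matches the earlier remark that convolutional dropout values are usually below 0.5.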

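In this experiment I added a Dropout layer right after the input layer. It feels similar to data augmentation, but I do not yet know how well it works.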
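The data-augmentation intuition can be made concrete with a small sketch. It is not the experiment described above: the flattened 8x8 "image", the drop rate of 0.2, and the name `input_dropout` are all assumptions picked for illustration. The point is only that Dropout on the input gives the network a differently masked copy of the same sample on every pass.

```python
import random


def input_dropout(pixels, rate, rng):
    """Zero each input value with probability `rate`, rescaling the rest."""
    keep = 1.0 - rate
    return [p / keep if rng.random() < keep else 0.0 for p in pixels]


rng = random.Random(42)
image = [0.5] * 64  # hypothetical flattened 8x8 input

# The same image as seen by the network in three different epochs.
epoch_views = [input_dropout(image, rate=0.2, rng=rng) for _ in range(3)]
```

Each entry of `epoch_views` has a different set of pixels zeroed out, so the layers above see noisy variants of one training sample, much as they would with random erasing or masking augmentations.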