# dropout in training and testing #5357

### wenouyang commented on 11 Feb 2017

In this link, devinplatt gives the following way to include dropout in training:

```python
model = Sequential()
model.add(Dropout(0.5, input_shape=(20,)))
model.add(Dense(64, init='uniform'))
```

In this post, the author mentions: "Finally, if the training has finished, you'd use the complete network for testing (or in other words, you set the dropout probability to 0)."

In terms of the Keras implementation, does that mean we have to modify the line `model.add(Dropout(0.5, input_shape=(20,)))` after loading the trained weights?

### unrealwill commented on 11 Feb 2017

Hello,

By looking at the source code (https://github.com/fchollet/keras/blob/master/keras/layers/core.py#L111):

```python
x = K.in_train_phase(dropped_inputs, lambda: x)
```

you can see that dropout is only applied in the train phase.

### radekosmulski commented on 24 Feb 2017

That is correct: dropout should be applied during training (drop inputs with probability `p`), but there also needs to be a corresponding component that scales the weights at test time, as outlined in the referenced paper.

I guess this is not happening at the moment; at least, the results I have gotten so far might indicate that there is an issue here. I will investigate this further and see if I can provide an example.

### unrealwill commented on 24 Feb 2017

Hello @radekosmulski,

This is not a problem. See issue fchollet#3305.

Keras uses inverted scaling during training (the activations that survive dropout are scaled up by the inverse of the keep probability during training, so nothing needs to be rescaled at test time). See https://github.com/fchollet/keras/blob/master/keras/layers/core.py#L110:

```python
def dropped_inputs():
    return K.dropout(x, self.p, noise_shape, seed=self.seed)
```
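A minimal NumPy sketch of that inverted scaling (an illustration, not the Keras implementation; `inverted_dropout` is a hypothetical helper):

```python
import numpy as np

def inverted_dropout(x, p=0.5, rng=np.random.default_rng(0)):
    # Keep each unit with probability 1 - p, then scale the survivors by
    # 1 / (1 - p) so the expected activation equals the original x.
    mask = rng.binomial(1, 1.0 - p, size=x.shape)
    return x * mask / (1.0 - p)

x = np.ones(8)
print(inverted_dropout(x))  # each entry is 0.0 (dropped) or 2.0 (kept, scaled)
```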

### radekosmulski commented on 24 Feb 2017

Thank you for your reply @unrealwill. I am new to Keras, so sorry if I misunderstand something. I still feel there is something unusual when running `model.predict` or `model.evaluate` when using dropout. Please see below:

```python
import keras
import numpy as np

X = np.array([[2, 1], [4, 2]])
y = np.array([[5], [10]])

# Works as expected without dropout
model = keras.models.Sequential()
model.add(keras.layers.Dense(input_dim=2, output_dim=1))
model.compile(keras.optimizers.SGD(), loss='MSE')
model.fit(X, y, nb_epoch=10000, verbose=0)
model.evaluate(X, y)  # => ~0

# With dropout
model = keras.models.Sequential()
model.add(keras.layers.Dense(input_dim=2, output_dim=1))
model.add(keras.layers.Dropout(0.5))
model.compile(keras.optimizers.SGD(), loss='MSE')
model.fit(X, y, nb_epoch=10000, verbose=0)
model.evaluate(X, y)  # => converges to MSE of 15.625
model.predict(X)
# => array([[ 2.5],
#           [ 5. ]], dtype=float32)
```

The MSE this converges to is due to the outputs being exactly half of what they should be: (2.5^2 + 5^2) / 2 = 15.625.

### unrealwill commented on 24 Feb 2017

@radekosmulski

The dropout noise introduces a bias, as it is a non-symmetric noise, and dropout shouldn't be added as the last layer (which we normally don't do). Because MSE is convex, Jensen's inequality applies, and you are training the network to learn the bias of the noise.

The bias of the dropout can subsequently be removed by using a dense layer after the first layer (=> average result = 7.5). And if you use more hidden cells (100), you average the noise out and get what you want:

```python
import keras
import numpy as np

X = np.array([[2, 1], [4, 2]])
y = np.array([[5], [10]])

# Works as expected without dropout
model = keras.models.Sequential()
model.add(keras.layers.Dense(input_dim=2, output_dim=1))
model.compile(keras.optimizers.SGD(), loss='MSE')
model.fit(X, y, nb_epoch=10000, verbose=0)
print model.evaluate(X, y)  # => ~0

# With dropout between two Dense layers
model = keras.models.Sequential()
model.add(keras.layers.Dense(input_dim=2, output_dim=100))
model.add(keras.layers.Dropout(0.5))
model.add(keras.layers.Dense(1))
model.compile(keras.optimizers.adam(), loss='MSE')
model.fit(X, y, nb_epoch=100000, verbose=0)
print model.evaluate(X, y)  # => close to 0
print model.predict(X)
# => array([[ 4.91],
#           [ 9.96]], dtype=float32)
```
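To see where the factor of one half comes from (a short derivation, not from the thread): with inverted dropout at $p = 0.5$ applied directly to the scalar output $z$, the layer emits $mz/0.5 = 2mz$, where $m \in \{0, 1\}$ is the Bernoulli mask. The expected training loss for a target $y$ is therefore

$$
\mathbb{E}\big[(2mz - y)^2\big] = \tfrac{1}{2}(2z - y)^2 + \tfrac{1}{2}y^2,
$$

which is minimized at $z = y/2$. At test time dropout is off, so the network predicts $y/2$, which is exactly the halving observed above, and the test MSE is $\big((5/2)^2 + (10/2)^2\big)/2 = 15.625$.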

### radekosmulski commented on 24 Feb 2017

 @unrealwill thank you very much for taking the time to reply, I really appreciate it. I understand now.


### stale bot commented on 26 May 2017

 This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

### spearsem commented on 8 Dec 2017

@unrealwill There is another use case for dropout at testing or inference time: in order to get a notion of the uncertainty and variability in the network's predictions, you might take a given input and run `predict` on it many times, each time with a different randomly assigned set of dropped neurons.

Say you run `predict` 100 times for a single test input. The average of these will approximate what you get with no dropout, the "expected value" over different weight schemes. And various metrics, like the standard deviation of these results, will give you a sense of the error bounds of your estimate (conditioned on assumptions about the validity of the underlying model structure).

In this sense, it would be very useful to have the ability to re-activate the Dropout settings from training, but specifically during testing or regular inference.
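In the Keras of that era, one way to get this behaviour is to build a backend function that pins the learning phase to 1, so Dropout stays active at prediction time. A sketch, assuming a compiled `model` and an input batch `x_test` (both illustrative names):

```python
import numpy as np
from keras import backend as K

# Evaluate the model with the learning phase forced to 1 (training),
# so Dropout layers keep dropping units at prediction time.
mc_predict = K.function([model.input, K.learning_phase()], [model.output])

# Sample 100 stochastic forward passes for the same input.
samples = np.stack([mc_predict([x_test, 1])[0] for _ in range(100)])

mean_prediction = samples.mean(axis=0)  # approximates the no-dropout output
uncertainty = samples.std(axis=0)       # spread across random dropout masks
```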

# Dropout layer source code

The Dropout layer lives in `core.py` under `keras/layers`:

```python
class Dropout(Layer):
    '''Applies Dropout to the input. Dropout consists in randomly setting
    a fraction `p` of input units to 0 at each update during training time,
    which helps prevent overfitting.

    # Arguments
        p: float between 0 and 1. Fraction of the input units to drop.

    # References
        - [Dropout: A Simple Way to Prevent Neural Networks from Overfitting](http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
    '''
    def __init__(self, p, **kwargs):
        self.p = p
        if 0. < self.p < 1.:
            self.uses_learning_phase = True
        super(Dropout, self).__init__(**kwargs)

    def call(self, x, mask=None):
        if 0. < self.p < 1.:
            x = K.in_train_phase(K.dropout(x, level=self.p), x)
        return x

    def get_config(self):
        config = {'p': self.p}
        base_config = super(Dropout, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
```
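As a quick usage check (a sketch, assuming this version of Keras), the layer carries its drop probability through `get_config`, which is what makes it serializable:

```python
from keras.layers import Dropout

layer = Dropout(0.5)
config = layer.get_config()
print(config['p'])                      # 0.5
restored = Dropout.from_config(config)  # rebuilds an equivalent layer
```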


# Analysis

## The Theano backend's dropout

```python
# From keras/backend/theano_backend.py; these imports sit at the top of the module.
import numpy as np
from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams


def dropout(x, level, seed=None):
    if level < 0. or level >= 1:
        raise Exception('Dropout level must be in interval [0, 1[.')
    if seed is None:
        seed = np.random.randint(10e6)
    rng = RandomStreams(seed=seed)
    retain_prob = 1. - level
    x *= rng.binomial(x.shape, p=retain_prob, dtype=x.dtype)
    x /= retain_prob
    return x
```
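A quick numeric check of that scaling (a sketch, assuming a Theano-backed Keras install):

```python
import numpy as np
from keras import backend as K

x = K.variable(np.ones(1000))
dropped = K.eval(K.dropout(x, level=0.5))

print(np.unique(np.round(dropped, 2)))  # [0. 2.]: kept units scaled by 1 / 0.5
print(dropped.mean())                   # close to 1.0, the original mean
```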

## in_train_phase

```python
def switch(condition, then_expression, else_expression):
    '''condition: scalar tensor.
    '''
    return T.switch(condition, then_expression, else_expression)


def in_train_phase(x, alt):
    x = T.switch(_LEARNING_PHASE, x, alt)
    x._uses_learning_phase = True
    return x
```
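To see the switch in action, here is a minimal standalone Theano sketch in which an integer scalar plays the role of `_LEARNING_PHASE`:

```python
import theano
import theano.tensor as T

phase = T.iscalar('phase')   # stands in for _LEARNING_PHASE
x = T.vector('x')
train_expr = x * 2.0         # stands in for the dropped-and-rescaled input
out = T.switch(phase, train_expr, x)

f = theano.function([x, phase], out)
print(f([1.0, 3.0], 1))  # [2. 6.] -- training branch taken
print(f([1.0, 3.0], 0))  # [1. 3.] -- test branch: input passes through
```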
