1.1 Setting up your Machine Learning Application
Train/Dev/Test sets
How to split depends on the amount of data: with a very large dataset (e.g. 1,000,000 examples), a 98/1/1 split is fine, since 1% is still 10,000 examples for dev and test.
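A minimal sketch of such a split (the function name and the 98/1/1 ratio are just illustrative):

```python
import numpy as np

def split_dataset(X, y, train=0.98, dev=0.01, seed=0):
    """Shuffle, then split into train/dev/test by the given fractions."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_train = int(train * len(X))
    n_dev = int(dev * len(X))
    tr, dv, te = np.split(idx, [n_train, n_train + n_dev])
    return (X[tr], y[tr]), (X[dv], y[dv]), (X[te], y[te])
```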
worst case: high bias and high variance at the same time, e.g.:
Train set error: 15% (high train error ⇒ high bias, underfitting)
Dev set error: 30% (the large train→dev gap ⇒ high variance, overfitting on top)
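A quick way to read these numbers (a sketch; assumes errors are fractions and a Bayes/human-level error near 0):

```python
def diagnose(train_err, dev_err, bayes_err=0.0):
    """Rough bias/variance read-out from train/dev errors."""
    avoidable_bias = train_err - bayes_err   # how far train error is from optimal
    variance = dev_err - train_err           # the train -> dev generalization gap
    print(f"avoidable bias: {avoidable_bias:.1%}, variance: {variance:.1%}")

diagnose(0.15, 0.30)  # the worst case above: both are high
```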
Basic recipe for ML:
high bias? → bigger network, train longer, (try another architecture)
high variance? → more data, regularization, (try another architecture)
In the deep learning era we no longer need to balance bias against variance: a bigger network fixes bias and more data fixes variance, each largely without hurting the other.
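The recipe as a decision rule (a sketch; the thresholds are made up):

```python
def basic_recipe_step(train_err, dev_err, target=0.05, gap_tol=0.02):
    """Suggest which knob to turn next, following the basic recipe."""
    if train_err > target:                # high bias: fit the training set first
        return "bigger network / train longer / new architecture"
    if dev_err - train_err > gap_tol:     # high variance: generalization gap
        return "more data / regularization / new architecture"
    return "done"

print(basic_recipe_step(0.15, 0.16))  # attack bias first
print(basic_recipe_step(0.01, 0.11))  # then attack variance
```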
1.2 Regularizing your neural network
high variance → regularization or more training data
L2 regularization: add (λ/2m)·‖w‖₂² to the cost to penalize the large weights that drive overfitting.
L2 norm penalty ⇒ "weight decay": on each iteration w is first multiplied by (1 − αλ/m), i.e. w loses a small percentage of itself before the ordinary gradient descent step.
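One update step with the decay factor made explicit (a sketch; variable names are mine):

```python
import numpy as np

def l2_gradient_step(W, dW, alpha=0.1, lam=0.7, m=1000):
    """One gradient descent step with L2 regularization.

    dJ/dW = dW (from backprop) + (lam/m) * W, so
    W := W - alpha*(dW + (lam/m)*W) = (1 - alpha*lam/m)*W - alpha*dW,
    i.e. W first decays by a small factor, then takes the usual step.
    """
    return (1 - alpha * lam / m) * W - alpha * dW
```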
So why can regularization efficiently reduce overfitting?
A large λ pushes the weights toward 0, so z = Wa + b stays small and tanh/sigmoid are used only in their roughly linear region around 0.
Each layer then behaves almost linearly, so the whole network stays close to a linear model and cannot fit overly complex decision boundaries.
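A quick numeric check of that near-linearity:

```python
import numpy as np

z_small = np.linspace(-0.1, 0.1, 5)                 # small weights -> small z
print(np.max(np.abs(np.tanh(z_small) - z_small)))   # ~3e-4: tanh(z) ≈ z
z_large = np.linspace(-3.0, 3.0, 5)                 # large z
print(np.max(np.abs(np.tanh(z_large) - z_large)))   # ~2.0: strongly nonlinear
```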
Dropout regularization: looks like a highly risky operation, since it randomly kills units on every iteration.
Inverted dropout: draw a random 0/1 mask with probability keep_prob, zero out the masked activations, then divide by keep_prob so the expected activations stay the same.
Each layer can have its own keep_prob, and a training/test flag can control whether dropout is applied at all (never apply it at test time).
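A minimal sketch of inverted dropout for one layer's activations:

```python
import numpy as np

def inverted_dropout(a, keep_prob=0.8, training=True):
    """Apply inverted dropout to the activation matrix a."""
    if not training or keep_prob >= 1.0:
        return a                                  # no dropout at test time
    mask = np.random.rand(*a.shape) < keep_prob   # the random 0/1 mask
    return a * mask / keep_prob                   # zero out, then rescale E[a]
```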
Other regularization methods
data augmentation: horizontal (sometimes vertical) flips, random crops, small rotations/distortions of existing images
early stopping: stop training when dev-set error starts to rise; this avoids overfitting, but it also stops minimizing the cost function early, which can increase bias (one knob now affects both tasks)
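A minimal early-stopping loop (a sketch; train_one_epoch and dev_error are hypothetical stand-ins for your pipeline):

```python
def train_with_early_stopping(train_one_epoch, dev_error,
                              max_epochs=100, patience=5):
    """Stop once dev error has not improved for `patience` epochs."""
    best_err, best_params, stale = float("inf"), None, 0
    for _ in range(max_epochs):
        params = train_one_epoch()     # one pass over the training set
        err = dev_error(params)        # monitor the dev set
        if err < best_err:
            best_err, best_params, stale = err, params, 0
        else:
            stale += 1
            if stale >= patience:      # dev error stopped improving
                break
    return best_params
```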
1.3 Setting up your optimization problem
normalizing inputs: subtract the mean, then divide by the standard deviation, using the same μ and σ computed on the training set for dev/test. (Lecture figure: cost contours are elongated before normalization and roughly symmetric after, so gradient descent can take a larger learning rate and converges faster.)
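A sketch of the fit-on-train, apply-everywhere pattern:

```python
import numpy as np

def fit_normalizer(X_train):
    """Compute per-feature mean and std on the training set only."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0) + 1e-8   # guard against zero variance
    return mu, sigma

X_train = np.random.randn(1000, 3) * 5.0 + 2.0
mu, sigma = fit_normalizer(X_train)
X_train_norm = (X_train - mu) / sigma
# dev/test must reuse the SAME mu and sigma:
X_dev_norm = (np.random.randn(10, 3) * 5.0 + 2.0 - mu) / sigma
```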
deep networks: vanishing/exploding gradients: with many layers, activations and gradients can shrink or grow exponentially with depth
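A toy illustration in the spirit of the lecture's example: linear units with W = c·I, so after L layers the activations scale like c^L:

```python
import numpy as np

a0 = np.ones(4)
for name, c in [("exploding", 1.5), ("vanishing", 0.5)]:
    a = a0.copy()
    for _ in range(50):          # 50 layers, W = c * I, identity activations
        a = (c * np.eye(4)) @ a
    print(name, a[0])            # 1.5**50 ≈ 6e8, 0.5**50 ≈ 9e-16
```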
weight initialization for deep networks: a partial fix for vanishing/exploding gradients
Instead of an arbitrary default, scale the initial w by the layer's fan-in so Var(z) ≈ 1: Var(w) = 2/n^[l-1] for ReLU (He initialization), or 1/n^[l-1] for tanh (Xavier-style).
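A sketch of such an initializer:

```python
import numpy as np

def init_layer(n_in, n_out, activation="relu", seed=0):
    """Scale initial weights by fan-in so Var(z) stays roughly constant."""
    rng = np.random.default_rng(seed)
    if activation == "relu":
        scale = np.sqrt(2.0 / n_in)      # He initialization
    else:
        scale = np.sqrt(1.0 / n_in)      # Xavier-style, e.g. for tanh
    W = rng.standard_normal((n_out, n_in)) * scale
    b = np.zeros((n_out, 1))             # biases can start at zero
    return W, b
```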
gradient checking: verify backprop by comparing the analytic gradient with the two-sided numerical estimate (J(θ+ε) − J(θ−ε)) / (2ε); a relative difference around 1e-7 is fine, around 1e-3 suggests a bug. Use it only for debugging (it is slow), include the regularization term, and note it does not work with dropout.
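A minimal checker (a sketch; f is the cost, grad_f the backprop gradient):

```python
import numpy as np

def grad_check(f, grad_f, theta, eps=1e-7):
    """Relative difference between analytic and numerical gradients."""
    num = np.zeros_like(theta)
    for i in range(theta.size):
        plus, minus = theta.copy(), theta.copy()
        plus.flat[i] += eps
        minus.flat[i] -= eps
        num.flat[i] = (f(plus) - f(minus)) / (2 * eps)   # two-sided difference
    ana = grad_f(theta)
    return np.linalg.norm(ana - num) / (np.linalg.norm(ana) + np.linalg.norm(num))

# ~1e-7 is great; ~1e-3 would suggest a bug in backprop
print(grad_check(lambda t: (t ** 2).sum(), lambda t: 2 * t,
                 np.array([1.0, 2.0, 3.0])))
```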
Quiz questions I got wrong:
2. The dev and test sets should come from the same distribution.
6. Increasing the regularization hyperparameter λ pushes the weights toward smaller values (closer to 0).