1. Deciding what to try next
- Debugging a learning algorithm
- Suppose you have implemented regularized linear regression to predict housing prices. When you test your hypothesis on a new set of houses, however, you find that it makes unacceptably large errors in its predictions. What should you try next?
- Get more training examples
- Try smaller sets of features
- Try getting additional features
- Try adding polynomial features
- Try decreasing the regularization parameter λ
- Try increasing the regularization parameter λ
2. Evaluating a hypothesis
- separate the data into a training set (70%) and a test set (30%)
- Training/Testing procedure for logistic regression
- learn parameters from the training data
- compute test set error
- misclassification error (0/1 misclassification error)
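The train/test procedure above can be sketched in a few lines (a minimal sketch; the split helper and the example labels are made up for illustration):

```python
import numpy as np

def train_test_split(X, y, train_frac=0.7):
    # 70/30 split as in the notes; assumes the data is already shuffled
    n_train = int(len(X) * train_frac)
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

def misclassification_error(y_pred, y_true):
    # 0/1 misclassification error: fraction of examples labeled wrongly
    y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
    return float(np.mean(y_pred != y_true))
```

For example, `misclassification_error([0, 1, 0, 0], [0, 1, 1, 0])` gives `0.25` (one wrong prediction out of four).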
3. Model selection and training/validation/test sets
- overfitting example
- the training error is likely to be lower than the actual generalization error
- model selection
- select the model with the lowest cross validation error, then report its error on the held-out test set
- training set - 60%
- cross validation set (cv) - 20%
- test set - 20%
- training error
- cross validation error
- test error
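The 60/20/20 procedure can be sketched with polynomial model selection (a minimal sketch; the quadratic data, candidate degrees, and split indices are made up):

```python
import numpy as np

# Made-up quadratic data with Gaussian noise
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 1.5 * x**2 + rng.normal(scale=0.1, size=x.shape)

# 60/20/20 split into training, cross validation (cv), and test sets
idx = rng.permutation(len(x))
tr, cv, te = idx[:30], idx[30:40], idx[40:]

def mse(coeffs, xs, ys):
    # Mean squared error of the polynomial with the given coefficients
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

# Fit each candidate degree on the training set only
fits = {d: np.polyfit(x[tr], y[tr], d) for d in range(1, 6)}
# Model selection: pick the degree with the lowest cross validation error
best_d = min(fits, key=lambda d: mse(fits[d], x[cv], y[cv]))
# Generalization estimate: error on the untouched test set
test_error = mse(fits[best_d], x[te], y[te])
```

Selecting on the cv set and reporting on the test set keeps the reported error honest: the test set never influenced any choice.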
4. Diagnosing bias vs. variance
- bias (underfit)
- variance (overfit)
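The diagnosis can be expressed as a rule of thumb over the two errors (a sketch only; the threshold value is illustrative, not from the course):

```python
def diagnose(train_err, cv_err, acceptable_err=0.1):
    # Illustrative rule of thumb (the 0.1 threshold is made up):
    # - high bias (underfit): the training error itself is already high,
    #   and the cv error is close to it
    # - high variance (overfit): training error is low, but there is a
    #   large gap between training and cross validation error
    if train_err > acceptable_err:
        return "high bias (underfit)"
    if cv_err - train_err > acceptable_err:
        return "high variance (overfit)"
    return "ok"
```

Example: `diagnose(0.3, 0.32)` flags high bias, while `diagnose(0.02, 0.25)` flags high variance.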
5. Regularization and bias/variance
- choosing the regularization parameter
- try a range of values (e.g. λ = 0, 0.01, 0.02, 0.04, …, 10), fit the model for each, and pick the λ with the lowest cross validation error
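Choosing λ by cross validation can be sketched with closed-form ridge regression (a minimal sketch; the data, the λ grid, and the 40/20 split are made up):

```python
import numpy as np

def ridge_fit(X, y, lam):
    # Closed-form regularized linear regression; for brevity the intercept
    # is regularized along with the other weights
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

def mse(theta, X, y):
    return float(np.mean((X @ theta - y) ** 2))

# Made-up data: 5 features, known weights, Gaussian noise
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.5, size=60)
X_tr, y_tr, X_cv, y_cv = X[:40], y[:40], X[40:], y[40:]

# Candidate values: λ = 0, 0.01, 0.02, 0.04, ..., doubling up to ~10
lams = [0.0] + [0.01 * 2**k for k in range(11)]
# Choose the λ whose fitted model has the lowest cross validation error
best_lam = min(lams, key=lambda l: mse(ridge_fit(X_tr, y_tr, l), X_cv, y_cv))
```

Small λ leaves the model free to overfit; very large λ drives the weights toward zero and underfits, so the cv error traces out a U-shaped curve over the grid.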
6. Learning curves
- If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.
- If a learning algorithm is suffering from high variance, getting more training data is likely to help.
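The two statements above can be checked numerically by plotting training and cv error against training set size (a sketch; the sine data and the degree-1 model are made up to exhibit high bias):

```python
import numpy as np

# Made-up data: a sine wave with Gaussian noise, split 70/30 into train/cv
rng = np.random.default_rng(2)
x = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=100)
idx = rng.permutation(100)
tr, cv = idx[:70], idx[70:]

def learning_curve(degree, sizes):
    # For each training set size m, fit on the first m training examples
    # and record (m, training error, cross validation error)
    points = []
    for m in sizes:
        c = np.polyfit(x[tr[:m]], y[tr[:m]], degree)
        train_err = float(np.mean((np.polyval(c, x[tr[:m]]) - y[tr[:m]]) ** 2))
        cv_err = float(np.mean((np.polyval(c, x[cv]) - y[cv]) ** 2))
        points.append((m, train_err, cv_err))
    return points

# A straight line badly underfits a sine wave: both errors stay high no
# matter how much data is added -- the high-bias signature from the notes
curve = learning_curve(1, [5, 20, 70])
```

In the high-variance case the signature is different: the training error stays low while the cv error starts far above it, and the two curves converge only as more data is added.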
7. Deciding what to try next (revisited)
- "small" neural network (fewer parameters, more prone to underfitting)
- computationally cheaper
- "large" neural network (more parameters, more prone to overfitting)
- computationally more expensive
- use regularization to address overfitting
The definitions:
Variance: measures the extent to which the solutions for individual data sets vary around their average; hence it measures how sensitive the learned function f(x) is to the particular choice of data set.
Bias: represents the extent to which the average prediction over all data sets differs from the desired regression function.
variance: the variance of the estimate itself.
bias: the difference between the expected value of the estimate and the regression function that the sample data are hoped to recover.
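The two quantities combine in the standard bias–variance decomposition (writing $f_D$ for the solution learned from a particular data set $D$ and $h$ for the desired regression function; the expectation is over data sets):

```latex
\mathbb{E}_D\!\left[(f_D(x) - h(x))^2\right]
  = \underbrace{\left(\mathbb{E}_D[f_D(x)] - h(x)\right)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\left(f_D(x) - \mathbb{E}_D[f_D(x)]\right)^2\right]}_{\text{variance}}
```

The cross term vanishes because $\mathbb{E}_D\!\left[f_D(x) - \mathbb{E}_D[f_D(x)]\right] = 0$, which is why the expected squared error splits cleanly into the two pieces defined above.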
From: http://blog.csdn.net/abcjennifer/article/details/7797502