10 Advice for applying machine learning
10-1 Deciding what to try next
Debugging a learning algorithm
Suppose you have implemented regularized linear regression to predict housing prices.
$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$
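As a sketch, the regularized cost above can be computed with NumPy (the function and variable names here are illustrative, not from the lecture):

```python
import numpy as np

def cost(theta, X, y, lam):
    """Regularized linear regression cost J(theta).

    X is the (m, n+1) design matrix whose first column is all ones.
    theta[0] (the bias term) is not regularized, matching the
    sum over j = 1..n in the formula above.
    """
    m = len(y)
    residuals = X @ theta - y  # h_theta(x^(i)) - y^(i) for each example
    return (residuals @ residuals + lam * (theta[1:] @ theta[1:])) / (2 * m)
```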
However, when you test your hypothesis on a new set of houses, you find that it makes unacceptably large errors in its predictions. What should you try next?
- Get more training examples
- Try smaller sets of features
- Try getting additional features
- Try adding polynomial features
- Try decreasing $\lambda$
- Try increasing $\lambda$
Machine learning diagnostic:
Diagnostic: A test that you can run to gain insight into what is/isn't working with a learning algorithm, and to gain guidance as to how best to improve its performance.
Diagnostics can take time to implement, but doing so can be a very good use of your time.
10-2 Evaluating a hypothesis
Evaluating your hypothesis
A hypothesis with low training error may still fail to generalize to new examples not in the training set.
Training/testing procedure for linear regression
- Learn parameter $\theta$ from training data (minimizing training error $J(\theta)$)
- Compute test set error: $J_{test}(\theta)$
Classification problem:
- Learn parameter θ \theta θ from training data
- Compute test set error:
$$J_{test}(\theta) = -\frac{1}{m_{test}}\sum_{i=1}^{m_{test}}\left[y_{test}^{(i)}\log h_\theta(x_{test}^{(i)}) + \left(1 - y_{test}^{(i)}\right)\log\left(1 - h_\theta(x_{test}^{(i)})\right)\right]$$
- Misclassification error (0/1 misclassification error):
$$\mathrm{err}(h_\theta(x), y) = \begin{cases} 1 & \text{if } h_\theta(x) \ge 0.5,\ y = 0 \text{ or } h_\theta(x) < 0.5,\ y = 1 \\ 0 & \text{otherwise} \end{cases}$$
$$\text{Test error} = \frac{1}{m_{test}}\sum_{i=1}^{m_{test}}\mathrm{err}\left(h_\theta(x_{test}^{(i)}),\ y_{test}^{(i)}\right)$$
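Both test metrics can be sketched in NumPy (assuming a logistic-regression hypothesis $h_\theta(x) = \sigma(\theta^T x)$; the function names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_test_cost(theta, X_test, y_test):
    """Logistic-loss test error J_test(theta) from the formula above."""
    h = sigmoid(X_test @ theta)
    return -np.mean(y_test * np.log(h) + (1 - y_test) * np.log(1 - h))

def misclassification_error(theta, X_test, y_test):
    """0/1 misclassification error: fraction of wrong predictions
    when thresholding h_theta(x) at 0.5."""
    preds = (sigmoid(X_test @ theta) >= 0.5).astype(float)
    return np.mean(preds != y_test)
```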
10-3 Model selection and training/validation/test sets
Overfitting example
Once parameters $\theta_0, \theta_1, \cdots, \theta_n$ have been fit to some set of data (the training set), the error of the parameters as measured on that data (the training error $J(\theta)$) is likely to be lower than the actual generalization error.
Model selection
Parameter: $d$ = degree of polynomial.
How well does the model generalize? Report test set error $J_{test}(\theta^{(d)})$.
Problem: $J_{test}(\theta^{(d)})$ is likely to be an optimistic estimate of the generalization error, i.e. our extra parameter ($d$, the degree of the polynomial) is fit to the test set.
Evaluating your hypothesis
- Training set: 60%
- Cross validation set: 20%
- Test set: 20%
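One way to produce such a split (a sketch; shuffling first matters when the data arrive in some meaningful order):

```python
import numpy as np

def split_60_20_20(X, y, seed=0):
    """Shuffle and split into 60% train / 20% cross validation / 20% test."""
    m = len(y)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(m)          # random order of example indices
    n_train = int(0.6 * m)
    n_cv = int(0.2 * m)
    tr, cv, te = np.split(idx, [n_train, n_train + n_cv])
    return (X[tr], y[tr]), (X[cv], y[cv]), (X[te], y[te])
```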
Training/validation/test error
Training error:
$$J_{train}(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
Cross validation error:
$$J_{cv}(\theta) = \frac{1}{2m_{cv}}\sum_{i=1}^{m_{cv}}\left(h_\theta(x_{cv}^{(i)}) - y_{cv}^{(i)}\right)^2$$
Test error:
$$J_{test}(\theta) = \frac{1}{2m_{test}}\sum_{i=1}^{m_{test}}\left(h_\theta(x_{test}^{(i)}) - y_{test}^{(i)}\right)^2$$
Pick the degree $d$ with the smallest cross validation error $J_{cv}(\theta^{(d)})$, then estimate the generalization error with the test set, e.g. $J_{test}(\theta^{(4)})$ if $d = 4$ was chosen.
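The whole selection loop can be sketched for one-dimensional inputs (using a least-squares fit; `max_d` and the helper names are my assumptions):

```python
import numpy as np

def poly_design(x, d):
    """Design matrix [1, x, x^2, ..., x^d] for a 1-D input vector x."""
    return np.vander(x, d + 1, increasing=True)

def squared_error(theta, X, y):
    r = X @ theta - y
    return (r @ r) / (2 * len(y))

def select_degree(x_train, y_train, x_cv, y_cv, max_d=10):
    """Fit theta^(d) on the training set for each degree d and
    return the degree with the smallest cross validation error."""
    best_d, best_err = None, np.inf
    for d in range(1, max_d + 1):
        theta, *_ = np.linalg.lstsq(poly_design(x_train, d), y_train, rcond=None)
        err = squared_error(theta, poly_design(x_cv, d), y_cv)
        if err < best_err:
            best_d, best_err = d, err
    return best_d, best_err
```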
10-4 Diagnosing bias vs. variance
Bias/variance
Training error: $J_{train}(\theta)$ (as defined above)
Cross validation error: $J_{cv}(\theta)$ (as defined above)
Diagnosing bias vs variance
Suppose your learning algorithm is performing less well than you were hoping.
($J_{cv}(\theta)$ or $J_{test}(\theta)$ is high.) Is it a bias problem or a variance problem?
Bias (underfit): $J_{train}(\theta)$ will be high; $J_{cv}(\theta) \approx J_{train}(\theta)$
Variance (overfit): $J_{train}(\theta)$ will be low; $J_{cv}(\theta) \gg J_{train}(\theta)$
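As a rough heuristic for these two rules (the factor-of-two thresholds are my assumption, not from the lecture):

```python
def diagnose(j_train, j_cv, target_error):
    """Sketch of the diagnosis rules above:
    - bias: training error itself is high and cv error is close to it
    - variance: training error is low but cv error is much higher"""
    if j_train > target_error and j_cv < 2 * j_train:
        return "bias"
    if j_train <= target_error and j_cv > 2 * j_train:
        return "variance"
    return "unclear"
```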
10-5 Regularization and bias/variance
Linear regression with regularization
Model:
$$h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4$$
$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$
- High bias: large $\lambda$
- Just right: intermediate $\lambda$
- High variance: small $\lambda$
Choosing the regularization parameter λ \lambda λ
Try a range of values of $\lambda$; for each, fit $\theta^{(i)}$ on the training set, and pick the $\lambda$ with the smallest cross validation error $J_{cv}(\theta^{(i)})$.
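A sketch of that procedure (the doubling grid $0, 0.01, 0.02, \ldots, 10.24$ follows the lecture; the closed-form ridge fit is just one convenient way to obtain each $\theta^{(i)}$):

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form regularized linear regression (normal equation).
    The bias column (first column of X) is not regularized."""
    n = X.shape[1]
    L = np.eye(n)
    L[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

def choose_lambda(X_train, y_train, X_cv, y_cv):
    """Try each lambda on the grid; pick the one whose theta gives
    the smallest *unregularized* cross validation error."""
    lambdas = [0.0] + [0.01 * 2 ** k for k in range(11)]  # 0, 0.01, ..., 10.24
    best = None
    for lam in lambdas:
        theta = fit_ridge(X_train, y_train, lam)
        r = X_cv @ theta - y_cv
        j_cv = (r @ r) / (2 * len(y_cv))
        if best is None or j_cv < best[1]:
            best = (lam, j_cv, theta)
    return best
```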
Bias/variance as a function of the regularization parameter λ \lambda λ
10-6 Learning curves
Learning curves
Plot $J_{train}(\theta)$ and $J_{cv}(\theta)$ as functions of the training set size $m$.
High bias: the two curves get closer and closer, and both end up high.
If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.
High variance: there is a large gap between $J_{train}(\theta)$ and $J_{cv}(\theta)$.
If a learning algorithm is suffering from high variance, getting more training data is likely to help.
10-7 Deciding what to try next (revisited)
Debugging a learning algorithm:
- Get more training examples $\longrightarrow$ fixes high variance
- Try smaller sets of features $\longrightarrow$ fixes high variance
- Try getting additional features $\longrightarrow$ fixes high bias
- Try adding polynomial features $\longrightarrow$ fixes high bias
- Try decreasing $\lambda$ $\longrightarrow$ fixes high bias
- Try increasing $\lambda$ $\longrightarrow$ fixes high variance
Neural networks and overfitting
- A small neural network has fewer parameters and is more prone to underfitting; it is also computationally cheaper.
- A large neural network has more parameters and is more prone to overfitting (and is computationally more expensive); use regularization ($\lambda$) to address the overfitting. A larger network with regularization is usually more effective than a smaller network.