Coursera ML Notes 6
Tags (space-separated): Machine Learning
Evaluating a Hypothesis
Once we have done some troubleshooting for errors in our predictions by:
- Getting more training examples
- Trying smaller sets of features
- Trying additional features
- Trying polynomial features ($x_1^2$, $x_2^2$, $x_1 x_2$, etc.)
- Increasing or decreasing $\lambda$

we can move on to evaluate our new hypothesis. To evaluate a hypothesis, given a dataset of training examples, we can split up the data into two sets: a training set and a test set. Typically, the training set consists of 70% of your data and the test set is the remaining 30%.
The new procedure using these two sets is then:
1. Learn $\Theta$ and minimize $J_{train}(\Theta)$ using the training set
2. Compute the test set error $J_{test}(\Theta)$
The test set error
1. For linear regression:
$J_{test}(\Theta) = \frac{1}{2m_{test}} \sum_{i=1}^{m_{test}} \left( h_\Theta(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2$
2. For classification ~ Misclassification error (aka 0/1 misclassification error):
$err(h_\Theta(x), y) = \begin{cases} 1 & \text{if } h_\Theta(x) \ge 0.5 \text{ and } y = 0, \text{ or } h_\Theta(x) < 0.5 \text{ and } y = 1 \\ 0 & \text{otherwise} \end{cases}$
This gives us a binary 0/1 error for each example; the average error over the test set is:
$\text{Test Error} = \frac{1}{m_{test}} \sum_{i=1}^{m_{test}} err(h_\Theta(x_{test}^{(i)}), y_{test}^{(i)})$
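The 70/30 train/test procedure for linear regression can be sketched as follows. This is a minimal illustration on synthetic data (the dataset and variable names are my own, not from the course), using the normal-equation solution via least squares:

```python
# Sketch of the 70/30 train/test evaluation procedure described above.
# Synthetic data and split sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 2x + 1 + noise
m = 100
X = rng.uniform(0, 10, size=(m, 1))
y = 2 * X[:, 0] + 1 + rng.normal(0, 0.5, size=m)

# Shuffle, then split 70% train / 30% test
idx = rng.permutation(m)
split = int(0.7 * m)
train_idx, test_idx = idx[:split], idx[split:]

# Add intercept column
Xb = np.hstack([np.ones((m, 1)), X])

# 1. Learn Theta on the training set (least squares minimizes J_train)
theta, *_ = np.linalg.lstsq(Xb[train_idx], y[train_idx], rcond=None)

# 2. Test set error: J_test = (1 / (2 * m_test)) * sum of squared errors
m_test = len(test_idx)
preds = Xb[test_idx] @ theta
J_test = np.sum((preds - y[test_idx]) ** 2) / (2 * m_test)
print(round(J_test, 3))
```

The key point is that `theta` is fit only on the training rows, so `J_test` is computed on data the model never saw.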
Model Selection and Train/Validation/Test Sets
One way to break down our dataset into the three sets is:
- Training set: 60%
- Cross validation set: 20%
- Test set: 20%
We can now calculate three separate error values for the three different sets using the following method:
- Optimize the parameters in Θ using the training set for each polynomial degree.
- Find the polynomial degree d with the least error using the cross validation set.
- Estimate the generalization error using the test set with $J_{test}(\Theta^{(d)})$, where d is the degree of the polynomial with the lowest cross validation error.
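The three steps above can be sketched as follows: fit one polynomial model per degree on the training set, pick the degree with the lowest cross validation error, then report the test error of only that chosen model. The synthetic dataset and degree grid are illustrative assumptions:

```python
# Sketch of model selection with a 60/20/20 train/CV/test split.
# The data (a noisy cubic) and the candidate degrees are assumptions.
import numpy as np

rng = np.random.default_rng(1)

m = 200
x = rng.uniform(-3, 3, size=m)
y = 0.5 * x**3 - x + rng.normal(0, 1.0, size=m)  # true signal is cubic

idx = rng.permutation(m)
train, cv, test = idx[:120], idx[120:160], idx[160:]  # 60% / 20% / 20%

def design(x, d):
    """Polynomial design matrix [1, x, x^2, ..., x^d]."""
    return np.vander(x, d + 1, increasing=True)

def cost(theta, x, y, d):
    """Unregularized squared-error cost J."""
    e = design(x, d) @ theta - y
    return e @ e / (2 * len(y))

# Step 1: optimize Theta on the training set for each degree.
# Step 2: pick the degree d with the lowest CV error.
best_d, best_theta, best_cv_err = None, None, np.inf
for d in range(1, 11):
    theta, *_ = np.linalg.lstsq(design(x[train], d), y[train], rcond=None)
    cv_err = cost(theta, x[cv], y[cv], d)
    if cv_err < best_cv_err:
        best_d, best_theta, best_cv_err = d, theta, cv_err

# Step 3: estimate generalization error with the test set, used only once.
test_err = cost(best_theta, x[test], y[test], best_d)
print(best_d, round(test_err, 3))
```

Because the degree was chosen to minimize CV error, the CV error itself is an optimistic estimate; that is why the untouched test set is needed for the final generalization estimate.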
Diagnosing Bias vs. Variance
High bias (underfitting): both $J_{train}(\Theta)$ and $J_{CV}(\Theta)$ will be high. Also, $J_{train}(\Theta) \approx J_{CV}(\Theta)$.
High variance (overfitting): $J_{train}(\Theta)$ will be low and $J_{CV}(\Theta)$ will be much greater than $J_{train}(\Theta)$.
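These two patterns can be seen concretely by fitting an underfit (degree 1) and an overfit (degree 12) polynomial to the same nonlinear data and comparing the two errors. The dataset and degrees are illustrative assumptions:

```python
# Sketch of the bias/variance diagnosis: compare J_train vs J_cv for an
# underfit and an overfit model. Data and degrees are assumptions.
import numpy as np

rng = np.random.default_rng(4)

m = 40
x = rng.uniform(-2, 2, size=m)
y = np.sin(2 * x) + rng.normal(0, 0.2, size=m)

idx = rng.permutation(m)
train, cv = idx[:28], idx[28:]

def J(theta, x, y, d):
    """Unregularized squared-error cost for a degree-d polynomial fit."""
    e = np.vander(x, d + 1, increasing=True) @ theta - y
    return e @ e / (2 * len(y))

results = {}
for d in (1, 12):
    A = np.vander(x[train], d + 1, increasing=True)
    theta, *_ = np.linalg.lstsq(A, y[train], rcond=None)
    results[d] = (J(theta, x[train], y[train], d),
                  J(theta, x[cv], y[cv], d))
    print(d, results[d])
# Typical pattern: d=1 -> both errors high and close (high bias);
# d=12 -> J_train low, J_cv noticeably higher (high variance).
```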
Regularization and Bias/Variance
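The regularization parameter $\lambda$ can be selected the same way as the polynomial degree: train with each candidate $\lambda$, then keep the one whose model has the lowest unregularized cross validation error. A minimal sketch, assuming regularized normal-equation linear regression and an illustrative $\lambda$ grid (both are my assumptions, not from the notes):

```python
# Sketch of choosing lambda with a cross validation set. The CV error is
# computed WITHOUT the regularization term. Data and grid are assumptions.
import numpy as np

rng = np.random.default_rng(2)

m, n = 60, 8
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, n))])
true_theta = np.zeros(n + 1)
true_theta[:3] = [1.0, 2.0, -1.5]
y = X @ true_theta + rng.normal(0, 0.3, size=m)

idx = rng.permutation(m)
train, cv = idx[:40], idx[40:]

def fit_ridge(X, y, lam):
    """Regularized normal equation; the intercept is not penalized."""
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

def J(theta, X, y):
    """Unregularized squared-error cost."""
    e = X @ theta - y
    return e @ e / (2 * len(y))

lambdas = [0, 0.01, 0.1, 1, 10, 100]
cv_errs = [J(fit_ridge(X[train], y[train], lam), X[cv], y[cv])
           for lam in lambdas]
best_lam = lambdas[int(np.argmin(cv_errs))]
print(best_lam)
```

A large $\lambda$ pushes the fit toward high bias, a tiny $\lambda$ allows high variance; the CV error picks the trade-off point.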
Learning Curves
If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much.
If a learning algorithm is suffering from high variance, getting more training data is likely to help.
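A learning curve plots $J_{train}$ and $J_{CV}$ against the number of training examples used. A minimal sketch on synthetic data (dataset and step sizes are illustrative assumptions): train on the first i examples for increasing i, evaluate the training error on those i examples and the CV error on the full CV set.

```python
# Sketch of computing a learning curve as described above.
# The synthetic data and subset sizes are assumptions.
import numpy as np

rng = np.random.default_rng(3)

m = 100
x = rng.uniform(0, 5, size=m)
y = 1.5 * x + 2 + rng.normal(0, 0.4, size=m)
Xb = np.column_stack([np.ones(m), x])

idx = rng.permutation(m)
train, cv = idx[:70], idx[70:]

def J(theta, X, y):
    """Unregularized squared-error cost."""
    e = X @ theta - y
    return e @ e / (2 * len(y))

sizes = list(range(2, 71, 4))
J_train, J_cv = [], []
for i in sizes:
    sub = train[:i]  # train on only the first i examples
    theta, *_ = np.linalg.lstsq(Xb[sub], y[sub], rcond=None)
    J_train.append(J(theta, Xb[sub], y[sub]))  # error on those i examples
    J_cv.append(J(theta, Xb[cv], y[cv]))       # error on the full CV set
```

Plotting `J_train` and `J_cv` against `sizes` shows the diagnostic shapes: with high bias the two curves converge to a high plateau, so more data does not help; with high variance there is a persistent gap that extra data narrows.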
Deciding What to Do Next Revisited
- Getting more training examples: Fixes high variance
- Trying smaller sets of features: Fixes high variance
- Adding features: Fixes high bias
- Adding polynomial features: Fixes high bias
- Decreasing λ: Fixes high bias
- Increasing λ: Fixes high variance
Prioritizing What to Work On
- Collect lots of data (for example, the “honeypot” project, though this doesn’t always work)
- Develop sophisticated features (for example: using email header data in spam emails)
- Develop algorithms to process your input in different ways (recognizing misspellings in spam).
It is difficult to tell which of the options will be most helpful.
Error Analysis
The recommended approach to solving machine learning problems is to:
- Start with a simple algorithm, implement it quickly, and test it early on your cross validation data.
- Plot learning curves to decide if more data, more features, etc. are likely to help.
- Manually examine the errors on examples in the cross validation set and try to spot a trend where most of the errors were made.
Data For Machine Learning
It’s not who has the best algorithm that wins; it’s who has the most data.
Training on a lot of data is likely to give good performance when the following two conditions hold true:
- The features $x \in \mathbb{R}^{n+1}$ contain sufficient information to predict $y$ accurately.
- We use a learning algorithm with many parameters.