Machine Learning Notes: Evaluations and Diagnostics on Algorithms

Improvements and Diagnostics on Algorithms

1. How to Evaluate A Hypothesis

Split the data set into 2 parts: training set + test set.
If $J_{test}(\theta)$ is high while the training error $J(\theta)$ is low, then overfitting has occurred.



Linear Regression Test Error:
Same form as $J(\theta)$, but computed on the test set.
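
Written out, and assuming the usual unregularized squared-error cost for linear regression (the symbols $m_{test}$, $x_{test}^{(i)}$ and $y_{test}^{(i)}$ are notation introduced here, not taken from the notes above), the test error would look like:

```latex
J_{test}(\theta) = \frac{1}{2 m_{test}} \sum_{i=1}^{m_{test}} \left( h_\theta(x_{test}^{(i)}) - y_{test}^{(i)} \right)^2
```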


Logistic Regression Test Error:

$$err(h_\Theta(x), y) = \begin{cases} 1, & \text{if } h_\Theta(x) \geq 0.5 \text{ and } y = 0, \text{ or } h_\Theta(x) < 0.5 \text{ and } y = 1 \\ 0, & \text{otherwise} \end{cases}$$

then

$$\text{TestError} = \frac{1}{m_{test}} \sum_{i=1}^{m_{test}} err(h_\Theta(x_{test}^{(i)}), y_{test}^{(i)})$$
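
The misclassification error above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the sample values are hypothetical, and the hypothesis outputs $h_\Theta(x)$ are assumed to be already computed.

```python
def misclassification_error(h, y, threshold=0.5):
    """Return 1 if the thresholded prediction disagrees with the label y, else 0."""
    predicted = 1 if h >= threshold else 0
    return 1 if predicted != y else 0


def classification_test_error(hypotheses, labels):
    """Average misclassification error over the test set."""
    errors = [misclassification_error(h, y) for h, y in zip(hypotheses, labels)]
    return sum(errors) / len(errors)


# Hypothetical hypothesis outputs h(x) and true labels for four test examples.
h_values = [0.9, 0.2, 0.6, 0.4]
y_values = [1, 0, 0, 1]
print(classification_test_error(h_values, y_values))  # 0.5 (two of four are wrong)
```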

2. Model Selection

Split the data set into 3 parts: training set + cross validation set (CV) + test set
1) Optimize the parameters in $\Theta$ using the training set for each polynomial degree.
2) Find the polynomial degree $d$ with the least error using the cross validation set.
3) Estimate the generalization error using the test set with $J_{test}(\Theta^{(d)})$, where $\Theta^{(d)}$ is the parameter set from the polynomial degree with the lowest CV error.
In practice, the CV set and the test set should be picked at random!
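
A rough sketch of the random three-way split and of the selection in step 2 follows. The 60/20/20 proportions, the function names and the sample CV errors are assumptions made for illustration; the notes do not specify them.

```python
import random


def split_dataset(examples, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle the examples, then cut them into training, CV and test sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # random pick, reproducible via seed
    n_train = int(len(shuffled) * train_frac)
    n_cv = int(len(shuffled) * cv_frac)
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test_set = shuffled[n_train + n_cv:]   # whatever remains
    return train, cv, test_set


def select_degree(cv_errors):
    """Return the polynomial degree d whose model has the lowest CV error."""
    return min(cv_errors, key=cv_errors.get)


train, cv, test_set = split_dataset(range(10))
print(len(train), len(cv), len(test_set))  # 6 2 2

# Hypothetical CV errors, keyed by polynomial degree d.
print(select_degree({1: 0.9, 2: 0.4, 3: 0.5}))  # 2
```

The test set is touched only once, after $d$ has been chosen, so that the reported error is not biased by the selection step.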

3. Diagnosing Bias & Variance

Training error decreases as the polynomial degree $d$ increases.
Validation error first decreases, then increases as $d$ becomes bigger.

High Bias:
$J_{CV}(\theta) \approx J_{train}(\theta)$, and both are high
High Variance:
$J_{CV}(\theta)$ high, $J_{train}(\theta)$ low
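
As a rough illustration, the two cases can be turned into a small check. The `acceptable_error` threshold is a hypothetical parameter: what counts as "high" depends on the problem and is not defined in these notes.

```python
def diagnose(j_train, j_cv, acceptable_error):
    """Classify a model from its training and cross validation errors.

    acceptable_error is a hypothetical threshold for what counts as "high".
    """
    if j_train > acceptable_error:
        return "high bias"      # the training error itself is already high
    if j_cv > acceptable_error:
        return "high variance"  # training error is low, CV error is high
    return "ok"


print(diagnose(0.9, 1.0, acceptable_error=0.3))   # high bias
print(diagnose(0.05, 0.8, acceptable_error=0.3))  # high variance
```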

4. Choosing $\lambda$ When Doing Regularization

Try a range of values, doubling each time ($\lambda := \lambda \times 2$). Pick the one with the least $J_{CV}(\theta)$ and check its test error.
High Bias:
$J_{CV}(\theta) \approx J_{train}(\theta)$, both high; $\lambda$ is too big
High Variance:
$J_{CV}(\theta)$ high, $J_{train}(\theta)$ low; $\lambda$ is too small
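
The search over $\lambda$ can be sketched as below. To keep the example self-contained it assumes a deliberately tiny model, a one-feature linear hypothesis $h(x) = \theta x$ with a closed-form regularized fit. The model, the starting value 0.01, the number of doublings and the data are all assumptions for illustration.

```python
def fit_ridge_1d(data, lam):
    """Closed-form minimizer of (1/2m) * (sum((theta*x - y)^2) + lam * theta^2)."""
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, y in data)
    return sxy / (sxx + lam)


def squared_error(theta, data):
    """Unregularized cost (1/2m) * sum((theta*x - y)^2), used for the CV and test errors."""
    return sum((theta * x - y) ** 2 for x, y in data) / (2 * len(data))


def choose_lambda(train, cv, start=0.01, steps=11):
    """Try lambda = 0, start, 2*start, 4*start, ... and keep the lowest CV error."""
    candidates = [0.0] + [start * 2 ** k for k in range(steps)]
    best = None
    for lam in candidates:
        theta = fit_ridge_1d(train, lam)   # fit on the training set only
        j_cv = squared_error(theta, cv)    # compare on the CV set
        if best is None or j_cv < best[1]:
            best = (lam, j_cv, theta)
    return best


# Hypothetical (x, y) pairs.
train = [(1, 2.2), (2, 3.9), (3, 6.3)]
cv = [(4, 8.0), (5, 10.0)]
test_set = [(6, 12.1)]

lam, j_cv, theta = choose_lambda(train, cv)
print(lam, j_cv, squared_error(theta, test_set))
```

Note that the regularization term is used only when fitting; the CV and test errors are computed without it.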

5. Learning Curves

The x-axis is $m$ (the training set size); the y-axis is the error.

High Bias:

If bias is high, adding more training data won’t help.


High Variance:

If variance is high, adding more training data may help.
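
A learning curve can be computed by training on the first $m$ examples for growing $m$ and recording both errors each time. The sketch below assumes a plain straight-line least-squares model and made-up data; the helper names are hypothetical.

```python
def fit_line(data):
    """Least-squares fit of h(x) = t0 + t1*x (a constant if all x are equal)."""
    m = len(data)
    mean_x = sum(x for x, _ in data) / m
    mean_y = sum(y for _, y in data) / m
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    if sxx == 0:
        return mean_y, 0.0
    t1 = sum((x - mean_x) * (y - mean_y) for x, y in data) / sxx
    return mean_y - t1 * mean_x, t1


def cost(theta, data):
    """Squared-error cost (1/2m) * sum((h(x) - y)^2)."""
    t0, t1 = theta
    return sum((t0 + t1 * x - y) ** 2 for x, y in data) / (2 * len(data))


def learning_curve(train, cv):
    """Return a list of (m, J_train, J_cv) for m = 1..len(train)."""
    curve = []
    for m in range(1, len(train) + 1):
        subset = train[:m]                 # train on the first m examples only
        theta = fit_line(subset)
        curve.append((m, cost(theta, subset), cost(theta, cv)))
    return curve


# Hypothetical data from y = x^2, which a straight line cannot fit well.
train = [(x, x * x) for x in [0, 1, 2, 3, 4, 5]]
cv = [(x, x * x) for x in [0.5, 2.5, 4.5]]
for m, j_train, j_cv in learning_curve(train, cv):
    print(m, round(j_train, 3), round(j_cv, 3))
```

$J_{train}$ is measured on the same $m$ examples used for fitting, while $J_{CV}$ is always measured on the full CV set.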

6. Solutions for Bias & Variance

High Bias:
-more features;
-more polynomial features;
-decreasing $\lambda$

High Variance:
-more examples;
-fewer features;
-increasing $\lambda$

7. Bias & Variance for Neural Networks

Small Network: High Bias
Big Network: High Variance; use regularization ($\lambda$) to address it

8. Error Metrics: Precision & Recall

Let y=1 denote the rare class.
- Precision: Of all y=1 predictions, how many are actually y=1?
- Recall: Of all the actual y=1 (rare) cases, how many are correctly detected?

How to compare models using both precision and recall? Use the F score.
F score = $\frac{2PR}{P + R}$
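
A minimal sketch of these three metrics, with y=1 as the rare class. The function name and the sample labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something the notes specify.

```python
def precision_recall_f(predictions, labels):
    """Compute precision, recall and F score, treating y=1 as the rare class."""
    pairs = list(zip(predictions, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)  # true positives
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)  # false positives
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)  # false negatives
    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical predictions and true labels.
predictions = [1, 1, 0, 0]
labels = [1, 0, 1, 1]
print(precision_recall_f(predictions, labels))
```

Here one of the two y=1 predictions is correct (precision 1/2) and one of the three actual positives is found (recall 1/3).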
