CalTech machine learning video 8 note (Bias-Variance Tradeoff)

8:46 2014-09-26 Friday
start CalTech machine learning video 8


Bias-Variance Tradeoff


8:46 2014-09-26
what is the "VC dimension"?


VC dimension dVC(H) is "the most points that H can shatter"


8:47 2014-09-26
this is where it applies


8:48 2014-09-26
the most important part of the application is


the disappearing blocks in the learning diagram, because they


give the VC inequality its generality: the VC bound is


valid for


1. any "learning algorithm",


2. any "input distribution", and


3. any "target function" we may be trying to learn.


8:51 2014-09-26
N: the number of examples you need


Rule of thumb: N >= 10 * dVC
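
For example (my own arithmetic, using the perceptron result dVC = d + 1 from the earlier lectures): a perceptron in d = 10 dimensions has dVC = 11, so the rule of thumb asks for roughly N >= 10 * 11 = 110 examples.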


8:53 2014-09-26
generalization bound:


Eout <= Ein + Ω
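
For reference, the explicit bound from the VC analysis lecture, as I recall it: with probability at least 1 - \delta,

E_{out}(g) \le E_{in}(g) + \Omega(N, H, \delta), \qquad \Omega = \sqrt{\frac{8}{N}\,\ln\frac{4\,m_H(2N)}{\delta}}

where m_H is the growth function of H.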


8:54 2014-09-26
it gives us a different angle of generalization


8:55 2014-09-26
* bias & variance


* learning curves


8:58 2014-09-26
small Eout: 


that's the purpose of learning,


if Eout is small, you have learned: you have a hypothesis


that approximates the target function well.


8:59 2014-09-26
Small Eout: good approximation of f out of sample


9:00 2014-09-26
More complex H => better chance of approximating f


Less complex H => better chance of generalizing out of sample.


9:01 2014-09-26
but having a bigger hypothesis set may be bad news,


because you may not be able to find the good hypothesis in it.


9:02 2014-09-26
what is the ideal hypothesis set for learning?


9:02 2014-09-26
Quantifying the tradeoff:


VC analysis was one approach: Eout <= Ein + Ω


// the generalization bound


9:06 2014-09-26
Ein accounts for the approximation, Ω is purely about generalization


9:07 2014-09-26
Bias-variance analysis is another: decomposing Eout into


1. How well H can approximate f


2. How well we can zoom in on a good h ∈ H
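
A sketch of the setup, assuming squared error and a noiseless target as in the lecture: for a hypothesis g^{(D)} learned from a data set D,

E_{out}\big(g^{(D)}\big) = \mathbb{E}_x\Big[\big(g^{(D)}(x) - f(x)\big)^2\Big],

and the quantity being decomposed is the expectation of this over data sets, \mathbb{E}_D\big[E_{out}(g^{(D)})\big].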


9:13 2014-09-26
there is a best hypothesis in H, and it has a certain


approximation ability; to pick it, I have to use the examples


to zoom in on it within the hypothesis set.


9:14 2014-09-26
decomposing into: approximation + generalization


9:22 2014-09-26
the 1st thing I'm going to do is exchange


the order of expectations
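
That is, assuming the two expectations can be swapped as in the lecture:

\mathbb{E}_D\Big[\mathbb{E}_x\big[(g^{(D)}(x) - f(x))^2\big]\Big] = \mathbb{E}_x\Big[\mathbb{E}_D\big[(g^{(D)}(x) - f(x))^2\big]\Big],

so we can analyze the inner expectation over D one point x at a time.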


9:25 2014-09-26
the average hypothesis
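
Its definition from the lecture:

\bar{g}(x) = \mathbb{E}_D\big[g^{(D)}(x)\big] \approx \frac{1}{K}\sum_{k=1}^{K} g^{(D_k)}(x)

for many data sets D_1, ..., D_K.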


9:26 2014-09-26
you learn from one data set and get a hypothesis,


someone else learns from another data set and gets another


hypothesis; so how about taking the expectation over these hypotheses?


9:28 2014-09-26
hopping from your guy to the target goes in small steps:


1. from your guy to the best hypothesis


2. another hop from the best hypothesis to the target function


your hypothesis => the best hypothesis => the target function


9:38 2014-09-26
the cross term goes away, and that's the advantage 


of the particular measure that we have.
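
Writing that step out (my own transcription of the derivation): add and subtract \bar{g}(x) inside the square,

\mathbb{E}_D\big[(g^{(D)}(x) - f(x))^2\big] = \mathbb{E}_D\big[(g^{(D)}(x) - \bar{g}(x))^2\big] + (\bar{g}(x) - f(x))^2 + 2\,(\bar{g}(x) - f(x))\,\mathbb{E}_D\big[g^{(D)}(x) - \bar{g}(x)\big],

and the last expectation is zero by the definition of \bar{g}, so the cross term vanishes.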


9:40 2014-09-26
bias(x) + var(x)
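
With the names used in the lecture:

\text{bias}(x) = (\bar{g}(x) - f(x))^2, \qquad \text{var}(x) = \mathbb{E}_D\big[(g^{(D)}(x) - \bar{g}(x))^2\big],

\mathbb{E}_D\big[E_{out}(g^{(D)})\big] = \mathbb{E}_x\big[\text{bias}(x) + \text{var}(x)\big] = \text{bias} + \text{var}.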


9:43 2014-09-26
and this is the bias + variance decomposition


9:47 2014-09-26
here I have a small hypothesis set,


and in this one, I have a huge hypothesis set.


9:50 2014-09-26
if the hypothesis set gets bigger => bias decreases, variance increases


9:56 2014-09-26
which one is better?


better for what? that's the key issue.


10:03 2014-09-26
in learning, you don't know the target function; you only see the data points


10:07 2014-09-26
let's do the bias + variance decomposition 
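
A minimal Monte Carlo sketch of this decomposition for the lecture's sinusoid example (my own code, not from the course): target f(x) = sin(pi x) on [-1, 1], two-point data sets, H0 fits the best constant and H1 fits the line through the two points.

import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(np.pi * x)        # target function
x_test = np.linspace(-1, 1, 1000)      # grid used to approximate E_x[...]
K = 10000                              # number of data sets D

g0 = np.empty((K, x_test.size))        # hypotheses learned in H0 (constants)
g1 = np.empty((K, x_test.size))        # hypotheses learned in H1 (lines)
for k in range(K):
    x = rng.uniform(-1, 1, 2)          # a two-point data set
    y = f(x)
    g0[k] = y.mean()                   # best constant fit to the two points
    a = (y[1] - y[0]) / (x[1] - x[0])  # slope of the line through the two points
    g1[k] = a * (x_test - x[0]) + y[0]

for name, g in [("H0 (constant)", g0), ("H1 (line)", g1)]:
    g_bar = g.mean(axis=0)                      # the average hypothesis g_bar(x)
    bias = np.mean((g_bar - f(x_test)) ** 2)    # E_x[(g_bar(x) - f(x))^2]
    var = np.mean((g - g_bar) ** 2)             # E_x[E_D[(g(x) - g_bar(x))^2]]
    print(f"{name}: bias ~ {bias:.2f}  var ~ {var:.2f}  total ~ {bias + var:.2f}")

As I recall, the lecture's numbers were roughly bias 0.50 / var 0.25 for H0 and bias 0.21 / var 1.69 for H1, so the simpler model wins on expected out-of-sample error; the simulated values will vary a little with the seed.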


10:10 2014-09-26
when you're in a learning situation, always remember


you're matching the "model complexity" to the "data resource"


you have, not to the "target complexity"


10:25 2014-09-26
the question is not whether the target function is in the hypothesis set,


the question is whether I can find it.


10:27 2014-09-26
expected Eout // expected out-of-sample error


expected Ein  // expected in-sample error


how do they vary with N?


10:29 2014-09-26
expected error vs Number of Data Points, N
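
The typical picture, as drawn in the lecture: the expected in-sample error starts low (few points are easy to fit) and rises with N toward an asymptote, while the expected out-of-sample error starts high and falls toward the same asymptote; the vertical gap between the two curves is the generalization error, which shrinks as N grows.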


10:30 2014-09-26
it doesn't bother me, because the in-sample error


is not the bottom line, the out-of-sample error is.


10:32 2014-09-26
VC <=> Bias + Variance 


10:37 2014-09-26
VC analysis


10:37 2014-09-26
Eout = Ein + Ω


Eout // out-of-sample error


Ein  // in-sample error


Ω   // generalization error


     // generalize from in-sample to out-of-sample error


10:37 2014-09-26
generalization: 


in-sample error => out-of-sample error


10:42 2014-09-26
linear regression


10:44 2014-09-26
noisy target: linear + noise


10:44 2014-09-26
Data set D


10:45 2014-09-26
you put in the input data points and the corresponding output values


10:45 2014-09-26
in-sample error pattern


10:46 2014-09-26
in-sample error vector


10:47 2014-09-26
error pattern


10:47 2014-09-26
out-of-sample error vector


10:47 2014-09-26
pseudo-inverse


10:48 2014-09-26
expected in-sample error,


expected out-of-sample error,


expected generalization error


10:56 2014-09-26
degree of freedom
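
As I recall the lecture's closed-form result for this case (noisy linear target with noise variance \sigma^2, linear regression with d + 1 degrees of freedom):

best approximation error = \sigma^2

\mathbb{E}_D[E_{in}] = \sigma^2\Big(1 - \frac{d+1}{N}\Big), \qquad \mathbb{E}_D[E_{out}] = \sigma^2\Big(1 + \frac{d+1}{N}\Big),

expected generalization error = 2\,\sigma^2\,\frac{d+1}{N}.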