8:46 2014-09-26 Friday
start CalTech machine learning video 8
Bias-Variance Tradeoff
8:46 2014-09-26
what is the "VC dimension"?
VC dimension dVC(H) is "the largest number of points H can shatter"
8:47 2014-09-26
this is where it applies
8:48 2014-09-26
the most important part of the application is
the disappearing blocks, because they give the
VC inequality its generality: the VC bound is
valid for
1. any "learning algorithm",
2. any "input distribution", and
3. any "target function" we may be trying to learn.
8:51 2014-09-26
N: the number of examples you need
Rule of thumb: N >= 10 * dVC
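The rule of thumb can be wrapped in a tiny helper (a sketch; the factor 10 is the lecture's heuristic, not a theorem):

```python
def min_examples(d_vc, factor=10):
    """Rule-of-thumb sample size: N >= factor * d_VC (factor 10 per the lecture)."""
    return factor * d_vc

# e.g. the 2D perceptron has d_VC = 3
print(min_examples(3))   # -> 30
```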
8:53 2014-09-26
generalization bound:
Eout <= Ein + Ω
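The Ω penalty can be computed numerically. A sketch, assuming the usual form of the VC generalization bound with the polynomial bound on the growth function (exact constants vary by textbook):

```python
import math

def omega(n, d_vc, delta=0.05):
    """VC penalty: Omega = sqrt((8/N) * ln(4 * m_H(2N) / delta)),
    using the polynomial bound on the growth function m_H(N) <= N**d_vc + 1."""
    m_h = (2 * n) ** d_vc + 1
    return math.sqrt((8.0 / n) * math.log(4.0 * m_h / delta))

# Omega shrinks as N grows: more data => tighter bound on Eout
print(omega(1000, 3), omega(10000, 3))
```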
8:54 2014-09-26
it gives us a different angle of generalization
8:55 2014-09-26
* bias & variance
* learning curves
8:58 2014-09-26
small Eout:
that's the purpose of learning;
if Eout is small, you have learned: you have a hypothesis
that approximates the target function well.
8:59 2014-09-26
Small Eout: good approximation of f out of sample
9:00 2014-09-26
More complex H => better chance of approximating f
Less complex H => better chance of generalizing out of sample.
9:01 2014-09-26
but having a bigger hypothesis set may be bad news,
because you may not be able to find the right hypothesis in it.
9:02 2014-09-26
what is the ideal hypothesis set for learning?
9:02 2014-09-26
Quantifying the tradeoff:
VC analysis was one approach: Eout <= Ein + Ω
// the generalization bound
9:06 2014-09-26
Ein captures the approximation; Ω is purely about generalization
9:07 2014-09-26
Bias-variance analysis is another: decomposing Eout into
1. How well H can approximate f
2. How well we can zoom in on a good h ∈ H
9:13 2014-09-26
this is the best hypothesis; it has a certain
ability to approximate. To pick it, I have to use the examples
to zoom in within the hypothesis set.
9:14 2014-09-26
decomposing into: approximation + generalization
9:22 2014-09-26
the 1st thing I'm going to do, is exchange
the order of expectations
9:25 2014-09-26
the average hypothesis
9:26 2014-09-26
you learn from one data set and get a hypothesis;
someone else learns from another data set and gets another
hypothesis. So how about taking the expectation of these hypotheses?
9:28 2014-09-26
hopping from your guy to the target goes in small steps:
1. from your hypothesis to the best (average) hypothesis
2. another hop from the best hypothesis to the target function
your hypothesis => the best hypothesis => the target function
9:38 2014-09-26
the cross term goes away, and that's the advantage
of the particular measure that we have.
9:40 2014-09-26
bias(x) + var(x)
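In symbols (my reconstruction of the steps sketched above): with g^(D) the hypothesis learned from data set D and ḡ(x) = E_D[g^(D)(x)] the average hypothesis, the cross term vanishes because E_D[g^(D)(x) − ḡ(x)] = 0, leaving

```latex
\mathbb{E}_D\!\left[\big(g^{(D)}(x) - f(x)\big)^2\right]
  = \underbrace{\big(\bar g(x) - f(x)\big)^2}_{\mathrm{bias}(x)}
  + \underbrace{\mathbb{E}_D\!\left[\big(g^{(D)}(x) - \bar g(x)\big)^2\right]}_{\mathrm{var}(x)}
```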
9:43 2014-09-26
and this is the bias + variance decomposition
9:47 2014-09-26
here I have a small hypothesis set;
in this one, I have a huge hypothesis set.
9:50 2014-09-26
if the hypothesis set gets bigger => bias decreases, variance increases
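A minimal simulation of the small-vs-huge comparison (my sketch, assuming the style of experiment used in the lecture: target f(x) = sin(πx), data sets of 2 points each, a constant model H0 against a linear model H1):

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(np.pi * x)
xs = np.linspace(-1, 1, 200)      # grid on which bias/var are measured
K = 5000                          # number of independent data sets

h0 = np.empty((K, xs.size))       # constant hypotheses h(x) = b
h1 = np.empty((K, xs.size))       # linear hypotheses  h(x) = a*x + b
for k in range(K):
    x = rng.uniform(-1, 1, 2)     # 2 examples per data set
    y = f(x)
    h0[k] = y.mean()              # best constant fit to the 2 points
    a, b = np.polyfit(x, y, 1)    # best line through the 2 points
    h1[k] = a * xs + b

for name, h in [("H0 (constant)", h0), ("H1 (line)", h1)]:
    g_bar = h.mean(axis=0)                    # average hypothesis
    bias = np.mean((g_bar - f(xs)) ** 2)
    var = np.mean(h.var(axis=0))
    print(f"{name}: bias={bias:.2f} var={var:.2f}")
```

The bigger set H1 wins on bias but loses badly on variance, so with only 2 points the simpler H0 has the smaller expected Eout.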
9:56 2014-09-26
which one is better?
better for what? that's the key issue.
10:03 2014-09-26
you don't know the target function, you only have the examples
10:07 2014-09-26
let's do the bias + variance decomposition
10:10 2014-09-26
when you're in a learning situation, always remember
you're matching the "model complexity" to the "data resource"
you have, not to the "target complexity"
10:25 2014-09-26
the question is not whether the target function is there;
the question is: can I find it?
10:27 2014-09-26
expected Eout // expected out-of-sample error
expected Ein // expected in-sample error
how do they vary with N?
10:29 2014-09-26
expected error vs Number of Data Points, N
10:30 2014-09-26
it doesn't bother me, because the in-sample error
is not the bottom line, the out-of-sample error is.
10:32 2014-09-26
VC <=> Bias + Variance
10:37 2014-09-26
VC analysis
10:37 2014-09-26
Eout <= Ein + Ω
Eout // out-of-sample error
Ein // in-sample error
Ω // generalization error
// generalize from in-sample to out-of-sample error
10:37 2014-09-26
generalization:
in-sample error => out-of-sample error
10:42 2014-09-26
linear regression
10:44 2014-09-26
noisy target: linear + noise
10:44 2014-09-26
Data set D
10:45 2014-09-26
you have the input data points & the corresponding outputs
10:45 2014-09-26
in-sample error pattern
10:46 2014-09-26
in-sample error vector
10:47 2014-09-26
error pattern
10:47 2014-09-26
out-of-sample error vector
10:47 2014-09-26
pseudo-inverse
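The pseudo-inverse solution mentioned here, w = X⁺y, in a short sketch (synthetic noisy linear target; the names and dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, d = 50, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, d))])  # bias column + inputs
w_true = rng.normal(size=d + 1)
y = X @ w_true + 0.1 * rng.normal(size=N)                   # noisy linear target

w = np.linalg.pinv(X) @ y       # least-squares fit via the pseudo-inverse
print(np.round(w - w_true, 2))  # small residual: w is close to w_true
```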
10:48 2014-09-26
expected in-sample error,
expected out-of-sample error,
expected generalization error
10:56 2014-09-26
degree of freedom
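The learning curves for this linear-regression example can be tabulated; a sketch assuming the closed forms derived in the lecture (noisy linear target with noise variance σ², d+1 degrees of freedom, N examples):

```python
def expected_errors(sigma2, d, n):
    """Linear-regression learning curves:
      E[Ein]  = sigma2 * (1 - (d+1)/N)
      E[Eout] = sigma2 * (1 + (d+1)/N)
    so the expected generalization error is 2 * sigma2 * (d+1) / N."""
    dof = d + 1
    e_in = sigma2 * (1 - dof / n)
    e_out = sigma2 * (1 + dof / n)
    return e_in, e_out

# both curves converge to sigma2 as N grows
for n in (20, 100, 1000):
    e_in, e_out = expected_errors(1.0, 4, n)
    print(n, round(e_in, 3), round(e_out, 3))
```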