Caltech machine learning, video 13 notes (validation)

8:58 2014-10-09
start Caltech machine learning, video 13


validation


9:57 2014-10-09
outline:


* validation set


* model selection


* cross validation


10:03 2014-10-09
Validation vs. regularization


Eout(h) = Ein(h) + overfit penalty


regularization estimates "overfit penalty"


validation estimates "Eout(h)"


10:08 2014-10-09
Eval(h) // validation error


// this will be a good estimate of the out-of-sample performance
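

The definition behind this note, in the lecture's notation: for a validation set Dval with K points,

Eval(h) = (1/K) Σ_{(xn, yn) ∈ Dval} e(h(xn), yn)

where e(·, ·) is the pointwise error (squared error, binary error, etc.).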


10:13 2014-10-09
K is taken out of N 


// the validation set is different from the training set


10:18 2014-10-09
K points => validation


N-K points => training


10:18 2014-10-09
Dval, Dtrain
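

A minimal sketch of the split (my variable names, assuming NumPy arrays; not code from the lecture):

```python
import numpy as np

def split_data(X, y, K, seed=0):
    """K points -> validation (Dval), N - K points -> training (Dtrain)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))      # shuffle indices so the split is random
    val, train = idx[:K], idx[K:]
    return X[train], y[train], X[val], y[val]
```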


10:22 2014-10-09
small K => g- stays close to g, but Eval(g-) is an unreliable (noisy) estimate


large K => Eval(g-) reliably estimates Eout(g-), but g- is trained on fewer points, so it is a worse hypothesis


10:26 2014-10-09
why not put K back into the original N?


10:26 2014-10-09
we call it validation because we use it to make choices


10:34 2014-10-09
Dval is used to make learning choices


If an estimate of Eout affects learning, the set is a validation set, not a test set


10:36 2014-10-09
early stopping


10:36 2014-10-09
this is going up, I better stop here
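

A sketch of that stopping rule (train_epoch and val_error are hypothetical callables, one optimization step and Eval on Dval respectively; the names are mine):

```python
def early_stopping(model, train_epoch, val_error, max_epochs=1000, patience=10):
    """Stop training once the validation error stops improving."""
    best_err, since_best = float("inf"), 0
    for epoch in range(max_epochs):
        train_epoch(model)        # one more step of minimizing Ein
        err = val_error(model)    # Eval: our estimate of Eout
        if err < best_err:
            best_err, since_best = err, 0
        else:
            since_best += 1
            if since_best >= patience:
                break             # "this is going up, I better stop here"
    return best_err
```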


10:37 2014-10-09
What is the difference?


* Test set is unbiased;


* validation set has optimistic bias


10:39 2014-10-09
e1 is an unbiased estimate of out-of-sample error


10:42 2014-10-09
unbiased means the expected value is what it should be


10:42 2014-10-09
Error estimates e1 & e2


Pick h ∈ {h1, h2} with e = min(e1, e2)


what is the expectation of e: E(e)?
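

A toy calculation (my example, not the lecture's) that makes the answer concrete: let e1 and e2 be independent and uniform on [0, 1], so each is an unbiased estimate with expected value 1/2. Then

E(e) = E[min(e1, e2)] = ∫₀¹ P(min(e1, e2) > t) dt = ∫₀¹ (1 - t)² dt = 1/3 < 1/2

so the minimum of two unbiased estimates is biased downward, i.e. optimistic.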


10:45 2014-10-09
now we realize that this is an optimistic bias


10:46 2014-10-09
fortunately for us, the utility of validation in machine learning is so great that we're going to swallow the bias


10:47 2014-10-09
so with this understanding, let's use validation for model selection, which is what validation sets are for


10:48 2014-10-09
the choice of λ happens to be a manifestation of this


10:48 2014-10-09
Using Dval more than once


10:49 2014-10-09
that's a choice between models


10:50 2014-10-09
they have a little minus sign (g-) because I'm training on Dtrain


10:53 2014-10-09
so these are done without any validation, just trained on a reduced set.


10:53 2014-10-09
once I get them, I'm going to evaluate the performance


10:54 2014-10-09
these are "validation errors"


10:54 2014-10-09
your model selection is to look at these errors, which are supposed to reflect the out-of-sample performance if you use this as your final product


10:57 2014-10-09
you pick the smallest of them, now you have a bias


10:57 2014-10-09
now we realize it has an optimistic bias


10:58 2014-10-09
we're now going back to our full data set


10:58 2014-10-09
restore your full D as we did before


10:59 2014-10-09
so this is the algorithm for model selection
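

A sketch of that algorithm (fit and error are hypothetical helpers: fit(m, X, y) trains model m and returns a hypothesis, error(g, X, y) returns its average error):

```python
def select_model(models, fit, error, Xtr, ytr, Xval, yval, X, y):
    """Train every model on Dtrain, pick the smallest Eval, retrain on all of D."""
    finalists = [fit(m, Xtr, ytr) for m in models]     # the gm- hypotheses
    e_val = [error(g, Xval, yval) for g in finalists]  # validation errors
    m_star = min(range(len(models)), key=e_val.__getitem__)
    return fit(models[m_star], X, y)                   # restore full D, output g
```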


10:59 2014-10-09
so I'm going to run an experiment to show you the bias


11:00 2014-10-09
not because it has inherently good performance, but because you looked for the one with a good performance


11:01 2014-10-09
validation set size


11:02 2014-10-09
and after that, I look at the actual out-of-sample error


11:03 2014-10-09 
I'd like to ask you 2 questions:


* why do the curves go up?


* why are the 2 curves getting closer together?


11:06 2014-10-09
because when I use more points for validation, I use fewer for training, so the hypothesis I end up with is worse


11:07 2014-10-09
how much bias depends on the factors involved, but the bias is there


11:11 2014-10-09
I'm using the validation set to estimate the Eout


11:12 2014-10-09
the validation set (Dval) is used for "training" on the "finalist" models


11:16 2014-10-09
if you have a decent validation set (of size K), then your estimate will not be that far from Eout (the out-of-sample error)
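

The standard bound behind that remark (my addition, not verbatim from the lecture): choosing among M finalist models with K validation points is like "learning" with a finite hypothesis set of size M, so

Eout(gm*) ≤ Eval(gm*) + O(√(ln M / K))

and a decent K keeps the penalty term small.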


11:25 2014-10-09
so I'm choosing when to stop


11:25 2014-10-09
the training of the network tries to choose the weights of the network


11:27 2014-10-09
validation error is a reasonable estimate of the 


out-of-sample error that we can rely on


11:28 2014-10-09
data contamination:


if you use the data for making choices, you're contaminating it as far as its ability to gauge the real performance is concerned


11:31 2014-10-09
contamination: optimistic (deceptive) bias


11:32 2014-10-09
you're trying to measure what is the level of contamination


11:33 2014-10-09
we have a great Ein, and we know Ein is no indication


of Eout, this has been contaminated to death


11:34 2014-10-09
when you go to the 'test set', this is totally clean,


there is no bias here


11:35 2014-10-09
Ein   // in-sample error


Etest // test set error


Eval  // validation error


11:36 2014-10-09
the validation set is in between, it's slightly 


contaminated.


11:36 2014-10-09
now we go to 'cross validation', a very sweet regime


11:38 2014-10-09
the dilemma about K


11:40 2014-10-09
the fluctuation around the estimate we want


11:39 2014-10-09
Eout(g) // g is the hypothesis we're going to report


11:42 2014-10-09
Eout(g-) 


// this is the proper out-of-sample error, but for the hypothesis trained on the reduced set


11:42 2014-10-09
Eout(g) ≈ Eout(g-) ≈ Eval(g-)


Eout(g)  // this is what we want


Eout(g-) // this is unknown to me


Eval(g-) // this is what I'm working with


11:43 2014-10-09
I want K to be small so that: Eout(g) ≈ Eout(g-)


11:45 2014-10-09
but also I want K to be large, because  Eout(g-) ≈ Eval(g-)


11:45 2014-10-09
can we have K both small & large?


11:46 2014-10-09
leave one out, leave more out


11:46 2014-10-09
I'm going to use N-1 points for training,


and 1 point for validation


11:47 2014-10-09
I'm going to create a reduced set from D, called Dn


11:48 2014-10-09
this one(the taken out) will be the one I use for validation


11:48 2014-10-09
let's look at the validation error


11:49 2014-10-09
in this case, the validation error is based on just 1 point


11:49 2014-10-09
what happens if I repeat this exercise for different


small n?


11:50 2014-10-09
so in spite of these being different hypotheses, they all come from training on N-1 points


11:53 2014-10-09
I'm going to define the cross validation error: Ecv
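

Concretely, en = e(gn-(xn), yn) is the error on the left-out point, and

Ecv = (1/N) Σ_{n=1..N} en

A minimal leave-one-out sketch (fit and err are hypothetical helpers, as before):

```python
import numpy as np

def loo_cv_error(X, y, fit, err):
    """Ecv: train on N-1 points, validate on the point left out, average over n."""
    N = len(y)
    e = np.empty(N)
    for n in range(N):
        mask = np.arange(N) != n            # Dn = D minus the point (xn, yn)
        g_minus = fit(X[mask], y[mask])     # gn-, trained on N-1 points
        e[n] = err(g_minus, X[n:n+1], y[n:n+1])  # en: error on the left-out point
    return float(e.mean())                  # Ecv
```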


11:53 2014-10-09
the catch is that these are not independent; each of them is affected by the others


11:55 2014-10-09
It's remarkably good at getting it


11:56 2014-10-09
let's just estimate the out-of-sample error 


using the cross validation method


11:57 2014-10-09
and we take an average performance of these


as an indication of what will happen out of sample


12:01 2014-10-09
we're training on only 2 points here; when we're done, we're using 3 points


12:02 2014-10-09
but think of 99/100, who cares?


12:02 2014-10-09
so let's use this for model selection


12:02 2014-10-09
model selection using CV // CV == Cross Validation
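

A sketch of that, reusing loo_cv_error from above (make_fit(m) is a hypothetical helper returning a training function for model m, e.g. for one value of λ):

```python
def select_model_cv(models, make_fit, err, X, y):
    """Pick the model whose cross validation error Ecv is smallest."""
    ecv = [loo_cv_error(X, y, make_fit(m), err) for m in models]
    m_star = min(range(len(models)), key=ecv.__getitem__)
    return make_fit(models[m_star])(X, y)   # retrain the winner on all of D
```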


12:03 2014-10-09
we'd like to find a separating surface


12:07 2014-10-09
Ecv tracks Eout very nicely


12:09 2014-10-09
if I use it as a criterion for model choice


12:10 2014-10-09
let me cut off at six, and see what the performance is like


// early stop


12:10 2014-10-09
without validation, I'm using the full model


12:11 2014-10-09
with validation, you stop at 6, because the cross validation tells you to do so; it's a nice smooth surface


12:12 2014-10-09
I don't care about driving the in-sample error to zero; that's harmful in some cases


12:12 2014-10-09
so now you can see why validation is seen in this context as similar to regularization: it does the same thing, it prevents overfitting, but it prevents overfitting by estimating the out-of-sample error (Eout) rather than estimating something else


12:16 2014-10-09
we seldom use leave-one-out in real problems


12:18 2014-10-09
take more points for validation


12:18 2014-10-09
Leave more than one out


12:18 2014-10-09
what you do is you take your data set and just break it into several folds


12:18 2014-10-09
exactly the same, except that here I'm taking out a chunk of points at a time


12:20 2014-10-09
this is what I recommend to you:


10-fold cross validation
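

A sketch of the 10-fold version (same hypothetical fit/err helpers; err returns the average error on the held-out chunk):

```python
import numpy as np

def kfold_cv_error(X, y, fit, err, k=10, seed=0):
    """Break D into k folds; each fold is the validation chunk exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)  # chunks of ~N/k points
    errs = []
    for val in folds:
        train = np.setdiff1d(np.arange(len(y)), val)    # the other k-1 folds
        g_minus = fit(X[train], y[train])
        errs.append(err(g_minus, X[val], y[val]))       # error on held-out chunk
    return float(np.mean(errs))
```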


-----------------------------------------------
13:29 2014-10-09
both validation & cross validation have bias


for the same reason