9:00 2014-09-25 Thursday
start CalTech machine learning, video 7
the VC dimension
9:00 2014-09-25
bounding the growth function, when there is a
break point, by a polynomial in N.
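a quick numeric check of that bound (my own sketch, not from the lecture):
with break point k = dVC + 1, the standard result is
mH(N) <= sum_{i=0}^{dVC} C(N, i) <= N^dVC + 1, polynomial in N.

    from math import comb

    # bound on the growth function when the break point is k = d_vc + 1:
    # m_H(N) <= sum_{i=0}^{d_vc} C(N, i), which is polynomial in N
    def growth_bound(n, d_vc):
        return sum(comb(n, i) for i in range(d_vc + 1))

    for n in (5, 10, 20):
        print(n, growth_bound(n, 3), n**3 + 1)  # bound <= N^3 + 1 for d_vc = 3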
9:01 2014-09-25
the VC dimension of the hypothesis set
9:07 2014-09-25
VC dimension of perceptrons: for the perceptron in d dimensions, dVC = d + 1
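a brute-force check of that claim for d = 2 (my own sketch, not the lecture's
code): sample random perceptrons and count the dichotomies they realize on
3 points versus 4 points.

    import numpy as np

    # randomized shatter check for 2D perceptrons h(x) = sign(w.x + b)
    def dichotomies(points, trials=20000, seed=0):
        rng = np.random.default_rng(seed)
        found = set()
        for _ in range(trials):
            w, b = rng.normal(size=2), rng.normal()
            found.add(tuple(int(np.sign(p @ w + b)) for p in points))
        return found

    tri = np.array([[0., 0.], [1., 0.], [0., 1.]])            # 3 points in general position
    xor = np.array([[0., 0.], [1., 1.], [0., 1.], [1., 0.]])  # 4 points, XOR layout
    print(len(dichotomies(tri)) == 2 ** 3)  # True : 3 points shattered
    print(len(dichotomies(xor)) == 2 ** 4)  # False: XOR dichotomy unreachable
                                            # (no 4-point set works, so dVC = 3)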
9:10 2014-09-25
the VC dimension is a quantity that is defined
for a hypothesis set.
9:11 2014-09-25
denoted by dVC(H)
9:11 2014-09-25
what is the VC dimension of H (the hypothesis set)?
the largest number of points H can shatter.
9:12 2014-09-25
N <= dVC(H) => H is guaranteed to shatter some set of N points
N > dVC(H)  => N is a break point for H
9:14 2014-09-25
* H is "positive rays"
* H is "2D perceptrons"
* H is "convex set"
9:19 2014-09-25
// dVC(H) is the VC dimension of hypothesis set H
dVC(H) is finite => g ∈ H will generalize
* independent of the "learning algorithm"
* independent of the "input distribution"
* independent of the "target function"
9:24 2014-09-25
can we shatter this data set?
9:43 2014-09-25
the VC dimension gives me the maximum number
of points I can shatter.
10:01 2014-09-25
the interpretation of the VC dimension
10:02 2014-09-25
how can I apply the definition of the VC dimension
in practice?
10:03 2014-09-25
think of the parameters as knobs
10:04 2014-09-25
parameters create degrees of freedom
10:04 2014-09-25
dVC is the equivalent number of 'binary' degrees of freedom.
10:07 2014-09-25
if I can shatter 20 points, that's good;
if somebody can shatter 30 points, they have
more degrees of freedom.
10:08 2014-09-25
it's not just the number of parameters, it's the effective degrees of freedom.
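a tiny illustration of that point (my own example, not the lecture's): write a
positive ray with two parameters, h(x) = sign(x - (a + b)); the extra
parameter is redundant, so no new dichotomies appear.

    # dichotomies generated by thresholds t: h(x) = +1 if x > t else -1
    def dichotomies(xs, thresholds):
        return {tuple(1 if x > t else -1 for x in xs) for t in thresholds}

    xs = [0.3, 1.7]
    grid = [a / 10 for a in range(-30, 50)]
    one_param = dichotomies(xs, grid)                                 # t = a
    two_param = dichotomies(xs, [a + b for a in grid for b in grid])  # t = a + b
    print(one_param == two_param)  # True: 2 parameters, 1 degree of freedom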
10:11 2014-09-25
generalization bound
10:48 2014-09-25
Eout - Ein // generalization error
10:50 2014-09-25
conclusion:
with probability >= 1-δ, Eout <= Ein + Ω
// generalization bound, with Ω = Ω(N, H, δ) = sqrt(8/N · ln(4 mH(2N)/δ))
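plugging in the polynomial bound on the growth function gives a number you
can actually compute (a sketch, using the standard bound
mH(2N) <= (2N)^dVC + 1):

    import math

    # Omega(N, H, delta) = sqrt(8/N * ln(4 * m_H(2N) / delta)),
    # with the growth function bounded by m_H(2N) <= (2N)^d_vc + 1
    def omega(n, d_vc, delta):
        m_h = (2 * n) ** d_vc + 1
        return math.sqrt(8.0 / n * math.log(4.0 * m_h / delta))

    for n in (100, 1000, 10000, 100000):
        print(n, round(omega(n, d_vc=3, delta=0.05), 3))  # shrinks as N grows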
10:53 2014-09-25
a bigger dVC is good for Ein, but bad for Ω (and so for the bound on Eout)
10:54 2014-09-25
poor generalization
10:54 2014-09-25
Eout <= Ein + Ω
one of the most important techniques based on this
is called regularization, and the idea here is that
10:56 2014-09-25
I use Ein as an approximation of Eout, but Ein alone
shouldn't drive the game: use Ein plus something else
that better reflects Eout. this is regularization.
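a minimal sketch of that idea (my illustration, not the lecture's code): for
linear regression, minimizing Ein plus a weight-decay penalty
lambda * ||w||^2 has a closed form.

    import numpy as np

    # augmented error: squared Ein + lambda * ||w||^2
    # minimizer: w = (X^T X + lambda * I)^{-1} X^T y
    def ridge_fit(X, y, lam):
        d = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 5))
    y = X @ np.array([1.0, -2.0, 0.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=20)
    print(ridge_fit(X, y, lam=0.1))  # shrunk weights: trade a bit of Ein for Eout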
10:58 2014-09-25
most of the discussion deals with the smallest
break point; the notion of a break point covers
many values (if k is a break point, so is k+1, k+2, ...).
the VC dimension is a unique value: the largest
number of points that can be shattered, which is one
less than the smallest break point (dVC = k - 1).
11:27 2014-09-25
when we talk about shattering N points, we only
mean shattering some set of N points
11:29 2014-09-25
you get the privilege of picking which points to shatter;
shattering doesn't have to work for every configuration of N points.
11:30 2014-09-25
shattering some set of N points
11:31 2014-09-25
a break point marks a failure to shatter, and
the VC dimension marks an ability to shatter.
11:31 2014-09-25
I want Ein to track Eout; the tolerance of the tracking is ε
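written out, this is the lecture's VC inequality solved for ε:
P[|Ein - Eout| > ε] <= 4 mH(2N) e^(-ε²N/8) = δ
=> ε = sqrt(8/N · ln(4 mH(2N)/δ)) = Ω(N, H, δ)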
11:33 2014-09-25
there is an upper bound on the variance
11:34 2014-09-25
the VC inequality => the VC dimension
-----------------------------------------