This lecture splits Machine Learning into two questions: 1. is Ein close to Eout? 2. can Ein be made small?
Recap and Preview
Recap: the 'Statistical' Learning Flow
if |H| = M finite and N large enough,
for whatever g picked by A, Eout(g) ≈ Ein(g)
if A finds one g with Ein(g) ≈ 0,
PAC guarantee: Eout(g) ≈ 0
test: Eout(g) ≈ Ein(g)
train: Ein(g) ≈ 0
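For reference, the finite-hypothesis bound this recap appears to rely on (union bound plus Hoeffding; ε is the tolerance between Ein and Eout), written out:

```latex
% finite-hypothesis Hoeffding/union bound; \epsilon is the tolerance between E_in and E_out
P\Big[\,\exists\, h \in \mathcal{H} \text{ s.t. } \big|E_{\mathrm{in}}(h) - E_{\mathrm{out}}(h)\big| > \epsilon \,\Big]
  \;\le\; 2\,M\,\exp\!\big(-2\,\epsilon^{2} N\big)
```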
Two Central Questions
for batch and supervised binary classification, g ≈ f <=> Eout(g) ≈ 0
achieved through Eout(g) ≈ Ein(g) and Ein(g) ≈ 0
Trade-off on M
1. can we make sure that Eout(g) is close enough to Ein(g)?
2. can we make Ein(g) small enough?
small M:
1. Yes. P[BAD] <= 2 M exp(-2 ε² N) is small
2. No. too few choices, so A may not find a g with small Ein
large M:
1. No. the bound P[BAD] <= 2 M exp(-2 ε² N) becomes loose
2. Yes. many choices, so a g with small Ein is likely to exist
----------------------------------------------------
so using the right M (equivalently, the right H) is important; the numeric sketch below illustrates the first question
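A minimal numeric sketch of that trade-off; the tolerance ε = 0.1, N = 1000, and the values of M are illustrative choices, not from the notes:

```python
import math

def bad_event_bound(M, N, eps):
    """Union bound over a finite hypothesis set: P[BAD] <= 2 * M * exp(-2 * eps^2 * N)."""
    return 2 * M * math.exp(-2 * eps ** 2 * N)

# illustrative values (not from the notes): tolerance eps = 0.1, N = 1000 examples
for M in (10, 10_000, 10 ** 8):
    print(f"M = {M:>9}: P[BAD] <= {bad_event_bound(M, 1000, 0.1):.3e}")
```

Small M keeps the bound tiny but offers few hypotheses; large M offers many hypotheses but the bound can become vacuous.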
Effective Number of Lines
Effective Number of Hypotheses
idea: many different hypotheses produce the same labelling of x1, ..., xN, so count distinct labellings instead of hypotheses; this 'effective' number is finite
Dichotomies: Mini-hypotheses
H = { hypothesis h : X -> {×, ○} }
call h(x1, x2, ..., xN) = ( h(x1), h(x2), ..., h(xN) ) ∈ {×, ○}^N a dichotomy
a dichotomy: a hypothesis 'limited' to the eyes of x1, x2, ..., xN
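A minimal sketch of the dichotomy idea, assuming linear hypotheses in 2D (with a leading bias coordinate); the sample points and weight vectors are made up for illustration, and ±1 stands in for ×/○:

```python
def h(w, x):
    """A linear hypothesis: the sign of the inner product <w, x> (+1 for ×, -1 for ○)."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1

def dichotomy(w, points):
    """The hypothesis 'in the eyes of' the sample: its labels on x1, ..., xN only."""
    return tuple(h(w, x) for x in points)

# made-up sample: three 2D points, each with a leading bias coordinate of 1
points = [(1.0, 0.0, 1.0), (1.0, 2.0, 0.5), (1.0, -1.0, 2.0)]
print(dichotomy((0.0, 1.0, -1.0), points))   # (-1, 1, -1)
print(dichotomy((0.0, 2.0, -2.0), points))   # a different hypothesis, same dichotomy
```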
Growth Function
remove the dependence on the particular sample by taking the max over all possible (x1, x2, ..., xN)
mH(N) = max over (x1, ..., xN) in X of |H(x1, x2, ..., xN)|; finite, upper-bounded by 2^N
Growth Function for Positive Rays
mH(N) = N + 1
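A brute-force check of this count, assuming positive rays of the form h(x) = sign(x − a) on the real line; the sample points below are illustrative:

```python
def positive_ray_dichotomies(xs):
    """All distinct dichotomies that positive rays h(x) = sign(x - a) produce on the points xs."""
    xs = sorted(xs)
    # one representative threshold below all points, between each pair of neighbours, and above all
    thresholds = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    return {tuple(+1 if x > t else -1 for x in xs) for t in thresholds}

xs = [0.3, 1.2, 2.5, 4.0]                      # illustrative sample, N = 4
print(len(positive_ray_dichotomies(xs)))       # 5, i.e. N + 1
```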
Growth Function for Positive Intervals
mH(N) = 1/2 * ( N*N + N) + 1
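The same kind of brute-force check for positive intervals, assuming h(x) = +1 exactly when x lies in [l, r); again the sample points are illustrative:

```python
from itertools import combinations

def positive_interval_dichotomies(xs):
    """All distinct dichotomies of h(x) = +1 iff x in [l, r) on the points xs."""
    xs = sorted(xs)
    # representative interval endpoints: below all points, between neighbours, above all
    cuts = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    dichos = {tuple(+1 if l <= x < r else -1 for x in xs) for l, r in combinations(cuts, 2)}
    dichos.add(tuple(-1 for _ in xs))          # the empty interval: all points labelled -1
    return dichos

xs = [0.3, 1.2, 2.5, 4.0]                      # illustrative sample, N = 4
print(len(positive_interval_dichotomies(xs)))  # 11, i.e. N*(N+1)/2 + 1
```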
Growth Function for Convex Sets
mH(N) = 2^N: with the N inputs placed on a circle, the convex hull of any subset contains exactly that subset, so every dichotomy is realizable (the N points are 'shattered')
Break Point
positive rays: break point at 2
positive intervals: break point at 3
convex sets: no break point
2D perceptrons: break point at 4
break point of H: the smallest k such that no k inputs can be shattered, i.e. mH(k) < 2^k; conjecture: once a break point exists, mH(N) becomes 'non-exponential' (polynomial) in N
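A small sketch that recovers these break points from the growth functions listed above, taking the break point to be the smallest k with mH(k) < 2^k:

```python
def break_point(growth_fn, max_k=20):
    """Smallest k with m_H(k) < 2^k, i.e. the first sample size that can no longer be shattered."""
    for k in range(1, max_k + 1):
        if growth_fn(k) < 2 ** k:
            return k
    return None                                    # no break point found up to max_k

print(break_point(lambda n: n + 1))                # positive rays      -> 2
print(break_point(lambda n: n * (n + 1) // 2 + 1)) # positive intervals -> 3
print(break_point(lambda n: 2 ** n))               # convex sets        -> None
```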