CalTech machine learning, video 15 notes (Kernel Methods)

18:56 2014-10-09
start CalTech machine learning, video 15


Kernel Methods


18:56 2014-10-09
if you think of the linear model as an economy car,


you can think of the SVM as a luxury car


18:56 2014-10-09
maximizing the margin


18:58 2014-10-09
Review of Lecture 14


* The margin


* quadratic programming


* support vectors


* nonlinear transform


19:01 2014-10-09
support vectors are the ones that achieve the margin,


they're used to define the plane


19:02 2014-10-09
an in-sample quantity (the number of support vectors) bounds the out-of-sample error


19:03 2014-10-09
we went to a fairly high-dimensional space


19:04 2014-10-09
outline:


* The kernel trick


* Soft-margin SVM


19:06 2014-10-09
you're going to a high-dimensional space without


paying the price for it


19:07 2014-10-09
you count the number of "support vectors"


19:09 2014-10-09
I'm the guardian of the z space, you come to me


with requests,


19:17 2014-10-09
inner product after a transformation


/
8:58 2014-10-10 Friday
start CalTech machine learning, video 15


Kernel Methods


8:58 2014-10-10
outline:


* The kernel trick


* Soft-margin SVM


9:08 2014-10-10
extend SVM from the linearly separable case to


the non-separable case by allowing yourself


to make errors


9:09 2014-10-10
shift from hard-margin to soft-margin to allow


some errors


9:10 2014-10-10
in a practical problem, you're going to use both,


you're going to a high dimensional space, probably 


an infinite dimensional space without paying the price


for it


9:11 2014-10-10
the idea of the kernel is that: I want to go to the 


z space without paying the price for it.


9:12 2014-10-10
you count the #support vectors, the dimensionality 


of the z space doesn't appear


9:13 2014-10-10
we still need to take the inner product in the z space


9:13 2014-10-10
what do I need from the z space in order to be carrying 


out the machinery that I have seen so far?


9:14 2014-10-10
in order to be able to carry out the Lagrangian, I need to get the


inner product in the z space
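
For reference, this is the dual problem from Lecture 14 (the standard hard-margin SVM dual); notice that the z's enter only through the inner product:

$$
\max_{\alpha}\ \mathcal{L}(\alpha)=\sum_{n=1}^{N}\alpha_n-\frac{1}{2}\sum_{n=1}^{N}\sum_{m=1}^{N}\alpha_n\alpha_m\,y_n y_m\,z_n^{\mathsf{T}}z_m
\quad\text{s.t.}\quad \alpha_n\ge 0,\ \ \sum_{n=1}^{N}\alpha_n y_n=0
$$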


9:16 2014-10-10
but getting the inner product in the z space is less demanding


than getting the actual vectors in the z space.


9:16 2014-10-10
I'm the guardian of the z space, I'm closing the door, nobody


has access to the z space, you come to me with requests; if you


give me an x, and ask me what is the transformation, it's a big


demand, I have to handle a big z.


9:19 2014-10-10
but all I'm willing to give you is "inner product"


9:20 2014-10-10
you give me x & x', I close the door, do my thing, and come


back with a number, which is the inner product of z & z',


without actually telling you what z & z' are


9:21 2014-10-10
that will be a simple operation,


9:21 2014-10-10
it's a good thing, because we can completely focus on the


inner product in the z space, and see if that can lead us


to simplification.


9:22 2014-10-10
so this is the first constraint, and I don't see any z in it


9:23 2014-10-10
I don't know what w is, w lives in the z space


9:23 2014-10-10
can I get away with just inner products to solve this?


9:24 2014-10-10
w is not mysterious to us, we already solved for it: w = Σn αn yn zn


9:24 2014-10-10
but the sum is really only over the support vectors, the ones with nonzero αs


9:25 2014-10-10
I solve for b by taking any support vector


9:26 2014-10-10
we only deal with z, as far as inner product is concerned


9:27 2014-10-10
if I'm able to compute the inner product in the z space


without visiting the z space, I can still carry out this machinery


9:28 2014-10-10
all we have to do is something like this:


I give you x & x', 2 points in the x space, 


you do your thing, come back with a number, 


promise that this is the inner product in the z space.


the mysterious z space.


9:30 2014-10-10
you do something; knowing that the z space exists is sufficient


9:31 2014-10-10
let's look at the idea as a generalized inner product


9:31 2014-10-10
we view it as a generalized inner product in the x space


9:32 2014-10-10
K(x, x') // the kernel


the kernel will correspond to some z space


9:33 2014-10-10
K(x, x') // generalized inner product between x & x'


9:33 2014-10-10
not an ordinary inner product, but an inner product


after a transformation


9:34 2014-10-10
z = Φ(x) // transformation


K(x, x') = zᵀz' // inner product in the z space
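
A minimal numeric check (my own NumPy sketch, not the lecture's code), using the homogeneous second-order transform Φ and the kernel K(x, x') = (xᵀx')²: the kernel, computed entirely in the x space, equals the inner product taken in the z space.

```python
import numpy as np

def phi(x):
    # z = Phi(x) for 2-dimensional x: (x1^2, sqrt(2)*x1*x2, x2^2)
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def kernel(x, xp):
    # K(x, x') = (x^T x')^2, evaluated without ever forming z
    return np.dot(x, xp) ** 2

x, xp = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(np.dot(phi(x), phi(xp)))  # inner product in the z space: 1.0
print(kernel(x, xp))            # same number, computed in the x space: 1.0
```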


9:36 2014-10-10
now we come to the trick:


Can we compute this kernel, K(x, x') without transforming


x & x'?


9:37 2014-10-10
it doesn't transform things into the z space & take the inner product,


it just tells you what the kernel is, and then I'm going to


convince you that this kernel actually corresponds to a transformation


to some z space, and taking the inner product there.


9:39 2014-10-10
the main thing you can observe is that, on its face, this is not


an inner product


9:40 2014-10-10
the polynomial kernel


9:42 2014-10-10
the equivalent kernel


9:45 2014-10-10
a valid kernel is the inner product in some space


9:46 2014-10-10
how much computations does it take you to do this?


9:46 2014-10-10
multiply them (take xᵀx'), add 1, and raise to the power Q


9:47 2014-10-10
d: dimensionality of the x space


9:48 2014-10-10
you can see that I expanded this conceptually, not


computationally.


9:48 2014-10-10
this will be an ugly beast to deal with


9:50 2014-10-10
but the bottom line is that a kernel of this form does


correspond to an inner product in a higher-dimensional space.
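
A hedged sketch of the cost asymmetry (my own example values): evaluating the polynomial kernel K(x, x') = (1 + xᵀx')^Q is one d-dimensional inner product plus a power, while the equivalent z space has C(d + Q, Q) − 1 coordinates.

```python
import numpy as np
from math import comb

def poly_kernel(x, xp, Q):
    # O(d) work no matter how large Q is
    return (1.0 + np.dot(x, xp)) ** Q

d, Q = 10, 5
x, xp = np.random.randn(d), np.random.randn(d)
print(poly_kernel(x, xp, Q))  # one inner product, one power
print(comb(d + Q, Q) - 1)     # dimension of the z space: 3002
```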


9:51 2014-10-10
by computing it in the x space, just using this formula,


9:52 2014-10-10
with this in mind, we need only z to exist;


let's get carried away, and try to just get a kernel,


and have it map us to z without even imagining what z is.


9:54 2014-10-10
we take this K(x, x') to be an inner product in "some" space z


9:55 2014-10-10
We need only Z to exist!


9:55 2014-10-10
I have no idea what it is, but I can compute it.


9:56 2014-10-10
the interesting thing is that that space is infinite dimensional


9:56 2014-10-10
you get the benefit of a horrific nonlinear transformation


9:57 2014-10-10
here we don't care, we carry out the machinery, then we


count the #support vectors,


10:00 2014-10-10
my purpose is to convince you that there is a z space,


and this is an inner product


10:02 2014-10-10
this is a very interesting kernel, called the


RBF(Radial Basis Function) kernel


10:04 2014-10-10
let's look at the kernel in action, it's a very


sophisticated kernel, it corresponds to an infinite


dimensional space,
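
A minimal sketch of the RBF kernel, K(x, x') = exp(−γ‖x − x'‖²); γ is a width parameter (assumed 1.0 here, matching the lecture's form), and the corresponding z space is infinite-dimensional even though the kernel itself is one exponential.

```python
import numpy as np

def rbf_kernel(x, xp, gamma=1.0):
    # K(x, x') = exp(-gamma * ||x - x'||^2)
    return np.exp(-gamma * np.sum((x - xp) ** 2))

x, xp = np.array([0.0, 1.0]), np.array([1.0, 1.0])
print(rbf_kernel(x, xp))  # exp(-1) ≈ 0.3679
```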


10:06 2014-10-10
so this is the data set I'm working with


10:07 2014-10-10
now I'm going to transform x into an infinite dimensional space


10:08 2014-10-10
you get the kernel, you pass it on to the quadratic programming,


and the QP gives you the "support vectors"


10:09 2014-10-10
I darken the dots that end up being support vectors


10:11 2014-10-10
I have 9 SVs altogether, out of a hundred points; can you tell me


what Eout (the out-of-sample error) is?


can you bound it from above?
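
Yes: by the bound from Lecture 14, the expected out-of-sample error is at most the expected number of support vectors over N − 1; plugging in the observed count as a rough estimate:

$$
\mathbb{E}[E_{\text{out}}] \;\le\; \frac{\mathbb{E}[\#\,\text{SVs}]}{N-1} \approx \frac{9}{99} \approx 9\%
$$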


10:12 2014-10-10
the z space is a mysterious guy


10:13 2014-10-10
but you can see why support vectors are called support vectors


10:14 2014-10-10
if you don't get linear separability in the z space, 


you're really in trouble


10:15 2014-10-10
the other thing is that when you think of the notion


of a distance: when I get linear separability there (in the z space),


I get a margin, and the margin I try to maximize is


already maximized by the machinery.


10:17 2014-10-10
I do get a small number of support vectors


10:17 2014-10-10
but again, this is not the margin; these


guys are pre-images of support vectors, and the distances


solved for happen in the z space,


10:19 2014-10-10
you may get support vectors which are far away,


it happens in the space you don't understand,


10:19 2014-10-10
so we get the solution, it's a pretty nice tool to have


10:20 2014-10-10
check the number of SVs, it will tell you the generalization property


10:21 2014-10-10
if I give you a kernel, and it's a valid kernel, how


do you formulate the problem in the z space?


10:21 2014-10-10
Kernel formulation of SVM


10:22 2014-10-10
QP == Quadratic Programming


10:22 2014-10-10
quadratic coefficient


10:22 2014-10-10
how do I construct the hypothesis in terms of the kernel?


10:24 2014-10-10
SVM is not a specific model, you choose a kernel,


it will give you a different model


10:26 2014-10-10
this is for any support vector which is defined by


αm > 0
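
A sketch of that machinery (my own Python; `kernel` is any valid kernel, and the QP solver itself is assumed, not shown): the quadratic coefficients handed to QP are yₙyₘK(xₙ, xₘ), b comes from any support vector, and the hypothesis is g(x) = sign(Σ_{αₙ>0} αₙ yₙ K(xₙ, x) + b).

```python
import numpy as np

def quadratic_coefficients(X, y, kernel):
    # Q[n, m] = y_n y_m K(x_n, x_m), the matrix passed to quadratic programming
    N = len(X)
    return np.array([[y[n] * y[m] * kernel(X[n], X[m])
                      for m in range(N)] for n in range(N)])

def solve_b(alphas, X, y, kernel):
    # any support vector m satisfies y_m (w^T z_m + b) = 1, hence
    # b = y_m - sum_n alpha_n y_n K(x_n, x_m)
    m = int(np.argmax(alphas))  # pick the SV with the largest alpha
    return y[m] - sum(a * yn * kernel(xn, X[m])
                      for a, yn, xn in zip(alphas, y, X))

def hypothesis(x, alphas, X, y, b, kernel):
    # g(x) = sign( sum over alpha_n > 0 of alpha_n y_n K(x_n, x) + b )
    s = sum(a * yn * kernel(xn, x)
            for a, yn, xn in zip(alphas, y, X) if a > 1e-8)
    return np.sign(s + b)
```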


10:28 2014-10-10
I just avoided the z space by using the kernel;


I got a solution that made me completely forget


that I did a nonlinear transformation to the z space


10:30 2014-10-10
the only thing to remember is that this transformation


depends on your data set.


10:31 2014-10-10
so the whole idea of the kernel is that you don't


visit the z space: you establish that this is a valid kernel,


namely some inner product in a space, without visiting


that space.


10:34 2014-10-10
by the way, in support vector machines, you will


come up with your own kernels


10:34 2014-10-10
to establish that a kernel is valid, there are 3 approaches:


* By construction // conceptual construction?


* Math property   // use the math properties of the kernel (Mercer's condition)


* who cares?


10:36 2014-10-10
Design your own kernel


K(x, x') is a valid kernel iff


1. it is symmetric 


2. the kernel matrix [K(xn, xm)] is positive semi-definite for any x1, ..., xN


// i.e., all of its eigenvalues are greater than or equal to zero


// Mercer's condition (a numeric sanity check is sketched below)
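
A hedged numeric sanity check of Mercer's condition (my sketch, not the lecture's): build the kernel matrix on a sample of points and test symmetry and positive semi-definiteness via eigenvalues. Passing on one sample is evidence, not proof, since the condition must hold for every choice of points.

```python
import numpy as np

def looks_like_valid_kernel(kernel, X, tol=1e-10):
    # kernel matrix on the sample: K[n, m] = K(x_n, x_m)
    N = len(X)
    K = np.array([[kernel(X[n], X[m]) for m in range(N)] for n in range(N)])
    symmetric = np.allclose(K, K.T)
    psd = np.all(np.linalg.eigvalsh(K) >= -tol)  # eigenvalues >= 0 (up to tolerance)
    return symmetric and psd

X = np.random.randn(20, 2)
rbf = lambda x, xp: np.exp(-np.sum((x - xp) ** 2))
print(looks_like_valid_kernel(rbf, X))  # True
```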


10:39 2014-10-10
but that indeed is the condition, and if you manage to


establish the condition for a kernel, then you establish


that the z space exists, even if you don't know what the z space


is.


10:43 2014-10-10
now we're going to the case where the data is not


linearly separable, and we still insist on separating it,


while allowing some errors


=> soft-margin SVM


10:45 2014-10-10
2 types of non-separable


* slightly  // soft-margin SVMs deal with this


* serious  // kernels deal with this


10:46 2014-10-10
you will be combining the kernel with the soft-margin SVM


in almost all the problems you encounter


10:47 2014-10-10
Error measure:


* Margin violation


10:52 2014-10-10
error measure based on violating the margin


10:53 2014-10-10
when this fails, the margin is violated


10:54 2014-10-10
I'm going to introduce a slack ξn for every point


10:55 2014-10-10
now I'm going to penalize you for the total violation


you made, I'm just going to add up these violations


10:56 2014-10-10
so this is the quantity that I provide to you; it


captures the violation of the margin.
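
A minimal sketch of those quantities (my own notation; C is the tradeoff parameter): the slack of point n is ξn = max(0, 1 − yn(wᵀzn + b)), and the objective adds the total violation C·Σξn to the usual ½wᵀw.

```python
import numpy as np

def slacks(w, b, Z, y):
    # xi_n = max(0, 1 - y_n (w^T z_n + b)), zero when the margin is respected
    return np.maximum(0.0, 1.0 - y * (Z @ w + b))

def soft_margin_objective(w, b, Z, y, C=1.0):
    # (1/2) w^T w + C * total margin violation
    return 0.5 * np.dot(w, w) + C * np.sum(slacks(w, b, Z, y))
```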


10:58 2014-10-10
this is no different from our notion of augmented error


10:59 2014-10-10
* margin support vectors


* non-margin support vectors


11:13 2014-10-10
the KKT conditions are necessary


11:18 2014-10-10
the major success of SVM is in classification

