SVM
Here I implement a simple SVM, C-SVC, which only supports binary classification.
The code is on GitHub.
Formulation
Linear
$$\max_{w,b}\ \gamma \quad \text{s.t.} \quad \frac{y_i (w \cdot x_i + b)}{\|w\|} \ge \gamma$$
Since $\gamma = \bar{\gamma} / \|w\|$, and rescaling $w$ and $b$ rescales $\bar{\gamma}$ without changing the classifier, we can fix $\bar{\gamma} = 1$. The formulation then becomes:
$$\max_{w,b}\ \frac{1}{\|w\|} \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1$$
This is equivalent to:
$$\min_{w,b}\ \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i (w \cdot x_i + b) \ge 1$$
Non-Linear
Often the data cannot be separated linearly, so we introduce a slack variable $\xi_i$ for each sample.
At the same time, we add a penalty term on the $\xi_i$ and use a parameter $C$ to control the balance between this penalty and $\|w\|$.
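To make the role of the slack variables and of C concrete, here is a minimal sketch. The data, `w`, and `b` are hypothetical values chosen for illustration, and each slack is set to the smallest value that satisfies $y_i (w \cdot x_i + b) + \xi_i \ge 1$ and $\xi_i \ge 0$.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def slacks(w, b, X, y):
    """Smallest slack per sample with y_i (w.x_i + b) + xi_i >= 1 and xi_i >= 0."""
    return [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]


def soft_margin_objective(w, b, X, y, C):
    """0.5 * |w|^2 + C * sum(xi_i), with every xi_i at its smallest feasible value."""
    return 0.5 * dot(w, w) + C * sum(slacks(w, b, X, y))


# Hypothetical 1-D data: the last sample lies on the wrong side of the boundary.
X = [[2.0], [-2.0], [-0.5]]
y = [1, -1, 1]
w, b = [1.0], 0.0

print(slacks(w, b, X, y))                         # [0.0, 0.0, 1.5]
print(soft_margin_objective(w, b, X, y, C=1.0))   # 2.0
print(soft_margin_objective(w, b, X, y, C=10.0))  # 15.5
```

Only the misclassified sample needs a nonzero slack, and a larger C makes that violation more expensive relative to the margin term.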
The formulation is as follows:
$$\min_{w,b,\xi}\ \frac{1}{2}\|w\|^2 + C \sum_i \xi_i \quad \text{s.t.} \quad y_i (w \cdot x_i + b) + \xi_i \ge 1,\ \xi_i \ge 0$$
Kernel
Replace every inner product $x_i \cdot x_j$, as it appears in the dual problem below, with a kernel function $K(x_i, x_j)$.
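As an illustration of this substitution, the sketch below computes the matrix of all pairwise kernel values. The function names, the bandwidth parameter `sigma`, and the sample points are assumptions made for this example and are not taken from the repository.

```python
import math


def linear_kernel(u, v):
    """Plain inner product: K(u, v) = u . v."""
    return sum(a * b for a, b in zip(u, v))


def rbf_kernel(u, v, sigma=1.0):
    """Gaussian (RBF) kernel: K(u, v) = exp(-|u - v|^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(X, kernel):
    """Matrix of K(x_i, x_j) for every pair of samples."""
    return [[kernel(xi, xj) for xj in X] for xi in X]


# Hypothetical sample points.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
print(gram_matrix(X, linear_kernel))
# [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 4.0]]
print(gram_matrix(X, rbf_kernel)[0][0])  # 1.0
```

Swapping `linear_kernel` for `rbf_kernel` changes only how similarity is measured; the rest of the optimization problem keeps the same form.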
Lagrange dual problem
Original problem
We want to solve an optimization problem {min P, with constraint C<=0}. (Here C denotes the constraint function, not the penalty parameter C used above.)
Then we can define the Lagrange function L = P + a*C, with a>=0.
Min-max problem
{min max L} <=> {min P with C<=0}: the inner maximum over a>=0 equals P when C<=0 and is unbounded when C>0, so the two problems have the same solution.
Max-min problem (dual problem)
{max min L}: first minimize L over the original variables, then maximize over a>=0.
To relate the dual problem to the original problem, we compare it with the min-max problem: in general, max min L <= min max L.
When the two values coincide, which is the case for the convex SVM problem, the solutions of the original problem and of the max-min problem (dual problem) together satisfy the KKT conditions.
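As a small worked example of these steps (a hypothetical one-variable problem, chosen only for illustration), take $P = x^2$ with the constraint $C = 1 - x \le 0$:

$$
L(x, a) = x^2 + a(1 - x), \qquad a \ge 0
$$

Min-max: $\max_{a \ge 0} L$ equals $x^2$ when $x \ge 1$ and is unbounded otherwise, so

$$
\min_x \max_{a \ge 0} L = 1 \quad \text{at } x = 1 .
$$

Max-min: setting $\partial L / \partial x = 2x - a = 0$ gives $x = a/2$, so

$$
\min_x L = a - \frac{a^2}{4}, \qquad \max_{a \ge 0} \left( a - \frac{a^2}{4} \right) = 1 \quad \text{at } a = 2 .
$$

Both values are 1, and the pair $x = 1$, $a = 2$ satisfies the KKT conditions:

$$
2x - a = 0, \qquad 1 - x \le 0, \qquad a \ge 0, \qquad a(1 - x) = 0 .
$$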
Back to the SVM problem, the Lagrange function is
$$L = \frac{1}{2}\|w\|^2 + C \sum_i \xi_i - \sum_i \alpha_i \{ y_i (w \cdot x_i + b) + \xi_i - 1 \} - \sum_i \mu_i \xi_i$$
with multipliers $\alpha_i \ge 0$ and $\mu_i \ge 0$.
Now we derive the dual problem. First, we minimize the Lagrange function: take its derivatives with respect to $w$, $b$, and $\xi_i$, and set them to 0.
Second, substitute the resulting equations back into the Lagrange function to eliminate $w$, $b$, $\xi_i$, and $\mu_i$, so that only $\alpha_i$ remains (with $C$ as its upper bound), and then maximize over $\alpha_i$.
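Written out, the first step gives the following conditions (a sketch of the intermediate algebra, using the same symbols as the Lagrange function):

$$
\frac{\partial L}{\partial w} = w - \sum_i \alpha_i y_i x_i = 0
\quad \Rightarrow \quad w = \sum_i \alpha_i y_i x_i
$$

$$
\frac{\partial L}{\partial b} = -\sum_i \alpha_i y_i = 0
$$

$$
\frac{\partial L}{\partial \xi_i} = C - \alpha_i - \mu_i = 0
$$

Because $\mu_i \ge 0$ and $\alpha_i \ge 0$, the last condition implies $0 \le \alpha_i \le C$. Substituting the first condition into the Lagrange function turns $\frac{1}{2}\|w\|^2 - \sum_i \alpha_i y_i (w \cdot x_i)$ into $-\frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)$, the second removes the term in $b$, and the third removes every term in $\xi_i$, which leaves only $\sum_i \alpha_i$ besides the quadratic term.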
We get the dual problem:
$$\max_{\alpha}\ -\frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) + \sum_i \alpha_i \quad \text{s.t.} \quad \sum_i \alpha_i y_i = 0,\ 0 \le \alpha_i \le C$$
That is,
$$\min_{\alpha}\ \frac{1}{2} \alpha^T Q \alpha - e^T \alpha \quad \text{s.t.} \quad y^T \alpha = 0,\ 0 \le \alpha_i \le C$$
where $Q_{ij} = y_i y_j K(x_i, x_j)$ and $e = [1, 1, \dots, 1]^T$.
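The matrix form can be checked numerically on a tiny example. The sketch below is only an illustration: the helper names `build_Q`, `dual_objective`, and `is_feasible` and the two-point dataset are assumptions, not part of the repository.

```python
def linear_kernel(u, v):
    """Plain inner product, used here as the kernel K."""
    return sum(a * b for a, b in zip(u, v))


def build_Q(X, y, kernel=linear_kernel):
    """Q_ij = y_i * y_j * K(x_i, x_j)."""
    n = len(X)
    return [[y[i] * y[j] * kernel(X[i], X[j]) for j in range(n)] for i in range(n)]


def dual_objective(Q, alpha):
    """0.5 * alpha^T Q alpha - e^T alpha."""
    n = len(alpha)
    quad = sum(alpha[i] * Q[i][j] * alpha[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(alpha)


def is_feasible(alpha, y, C, tol=1e-9):
    """Check y^T alpha = 0 and 0 <= alpha_i <= C, up to a small tolerance."""
    balanced = abs(sum(a * yi for a, yi in zip(alpha, y))) <= tol
    boxed = all(-tol <= a <= C + tol for a in alpha)
    return balanced and boxed


# Hypothetical two-point dataset: one sample per class.
X = [[-1.0], [1.0]]
y = [-1, 1]
Q = build_Q(X, y)
alpha = [0.5, 0.5]

print(Q)                              # [[1.0, 1.0], [1.0, 1.0]]
print(is_feasible(alpha, y, C=10.0))  # True
print(dual_objective(Q, alpha))       # -0.5
```

For this dataset, alpha = [0.5, 0.5] gives $w = \sum_i \alpha_i y_i x_i = 1$, which places both samples exactly on the margin.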
- Solver.py
We solve a problem of the following, slightly more general, form:
$$\min_{\alpha}\ \frac{1}{2} \alpha^T Q \alpha + p^T \alpha \quad \text{s.t.} \quad y^T \alpha = \Delta,\ 0 \le \alpha_i \le C$$
The remaining steps follow the LIBSVM paper [libsvm].
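For orientation, here is a simplified sketch of a decomposition-style solver for this problem. It is not the Solver.py from the repository, and it makes several assumptions: labels are +1 or -1, $\Delta = 0$ (so that $\alpha = 0$ is a feasible starting point), and the pair of variables to update is chosen with the maximal-violating-pair rule, which is simpler than the second-order selection described in the LIBSVM paper. The function name `solve_dual` and its parameters are hypothetical.

```python
def solve_dual(Q, p, y, C, eps=1e-6, max_iter=10000):
    """Minimize 0.5 a^T Q a + p^T a  s.t.  y^T a = 0, 0 <= a_i <= C.

    Simplified sketch: starts from a = 0, so it only covers Delta = 0.
    """
    n = len(y)
    alpha = [0.0] * n
    grad = list(p)  # gradient Q a + p evaluated at a = 0

    for _ in range(max_iter):
        # Indices whose alpha can still move along +y (up) or along -y (low).
        up = [t for t in range(n)
              if (y[t] == 1 and alpha[t] < C) or (y[t] == -1 and alpha[t] > 0)]
        low = [t for t in range(n)
               if (y[t] == 1 and alpha[t] > 0) or (y[t] == -1 and alpha[t] < C)]
        if not up or not low:
            break
        # Maximal violating pair.
        i = max(up, key=lambda t: -y[t] * grad[t])
        j = min(low, key=lambda t: -y[t] * grad[t])
        gap = -y[i] * grad[i] + y[j] * grad[j]
        if gap < eps:
            break  # optimality conditions hold up to eps

        # Curvature along the direction (alpha_i += y_i d, alpha_j -= y_j d),
        # which keeps y^T alpha unchanged.
        curv = Q[i][i] + Q[j][j] - 2.0 * y[i] * y[j] * Q[i][j]
        step = gap / max(curv, 1e-12)
        # Clip the step so that both variables stay inside [0, C].
        step = min(step,
                   (C - alpha[i]) if y[i] == 1 else alpha[i],
                   alpha[j] if y[j] == 1 else (C - alpha[j]))

        alpha[i] += y[i] * step
        alpha[j] -= y[j] * step
        # Keep the gradient in sync with the two changed variables.
        for t in range(n):
            grad[t] += (Q[t][i] * y[i] - Q[t][j] * y[j]) * step
    return alpha


# Hypothetical two-point problem: x = -1 with label -1, x = +1 with label +1,
# linear kernel, so Q_ij = y_i y_j x_i x_j = 1 for every pair, and p = -e.
Q = [[1.0, 1.0], [1.0, 1.0]]
p = [-1.0, -1.0]
y = [-1, 1]
print(solve_dual(Q, p, y, C=10.0))  # [0.5, 0.5]
```

Each iteration changes only two variables, so the equality constraint stays satisfied throughout and only two columns of Q are needed per update.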