第二次总结支持向量机,主要涉及到实现SVM的算法—SMO算法,分为简易版和完整版。打卡,7月1号之前完成~
文章目录
一、SMO算法的最优化问题分析
SMO算法要解决的凸二次规划的对偶问题: min α 1 2 ∑ i = 1 N ∑ j = 1 N α i α j y i y j K ( x i , x j ) − ∑ i = 1 N α i s . t . ∑ i = 1 N α i y i = 0 0 ⩽ α i ⩽ C , i = 1 , 2 , … , N \min\limits_{\alpha}\frac{1}{2}\sum\limits_{i=1}^{N}\sum\limits_{j=1}^{N}\alpha_{i}\alpha_{j}y_{i}y_{j}K(x_{i},x_{j})-\sum\limits_{i=1}^{N}\alpha_{i}\\s.t.\hspace{0.5cm}\sum\limits_{i=1}^{N}\alpha_{i}y_{i}=0\hspace{3.5cm}\\0\leqslant\alpha_{i}\leqslant C,i =1,2,\dots,N αmin21i=1∑Nj=1∑NαiαjyiyjK(xi,xj)−i=1∑Nαis.t.i=1∑Nαiyi=00⩽αi⩽C,i=1,2,…,N
注:在写数学公式的时候,光标出现了问题,之前光标是一条线,想在哪里修改就在哪里修改。不知道哪里出现了问题,光标变成了一个蓝框,公式编起来非常麻烦。
解决方法:
切换光标的模式的方法为按 insert 键。笔记本电脑有的需要同时按 Fn + insert 键能在三种状态之间切换。
SMO算法是一种启发式算法,其主要思想为:选择两个变量 α 1 , α 2 \alpha_{1},\alpha_{2} α1,α2,固定其他变量,针对这两个变量建立一个二次规划问题,形式为:
min α 1 , α 2 W ( α 1 , α 2 ) = 1 2 K 11 α 1 2 + 1 2 K 22 α 2 2 + y 1 y 2 K 12 α 1 α 2 − ( α 1 + α 2 ) + y 1 α 1 ∑ i = 3 N y i α i K i 1 + y 2 α 2 ∑ i = 3 N y i α i K i , 2 s . t . α 1 y 1 + α 2 y 2 = − ∑ i = 3 N y i α i = ς 0 ⩽ α i ⩽ C , i = 1 , 2 \min\limits_{\alpha_{1},\alpha_{2}}\ W(\alpha_{1},\alpha_{2})=\frac{1}{2}K_{11}\alpha_{1}^{2}+\frac{1}{2}K_{22}\alpha_{2}^{2}+y_{1}y_{2}K_{12}\alpha_{1}\alpha_{2}-(\alpha_{1}+\alpha_{2})\\+y_{1}\alpha_{1}\sum\limits_{i=3}^{N}y_{i}\alpha_{i}K_{i1}+y_{2}\alpha_{2}\sum\limits_{i=3}^{N}y_{i}\alpha_{i}K_{i,2}\\s.t.\hspace{1cm}\alpha_{1}y_{1}+\alpha_{2}y_{2}=-\sum\limits_{i=3}^{N}y_{i}\alpha_{i}=\varsigma\hspace{4cm}\\0\leqslant\alpha_{i}\leqslant C,i =1,2\hspace{2.5cm} α1,α2min W(α1,α2)=21K11α12+21K22α22+y1y2K12α1α2−(α1+α2)+y1α1i=3∑NyiαiKi1+y2α2i=3∑NyiαiKi,2s.t.α1y1+α2y2=−i=3∑Nyiαi=ς0⩽αi⩽C,i=1,2
1.无约束下的二次规划问题的极值
令 ν i = ∑ j = 3 N = g ( x i ) − ∑ j = 1 2 α j y j K ( x i , x j ) − b , i = 1 , 2 \nu_{i}=\sum\limits_{j=3}^{N}=g(x_{i})-\sum\limits_{j=1}^{2}\alpha_{j}y_{j}K(x_{i},x_{j})-b, \ i=1,2 νi=j=3∑N=g(xi)−j=1∑2αjyjK(xi,xj)−b, i=1,2
目标函数为:
W ( α 1 , α 2 ) = = 1 2 K 11 α 1 2 + 1 2 K 22 α 2 2 + y 1 y 2 K 12 α 1 α 2 − ( α 1 + α 2 ) + y 1 ν 1 α 1 + y 2 ν 2 α 2 W(\alpha_{1},\alpha_{2})==\frac{1}{2}K_{11}\alpha_{1}^{2}+\frac{1}{2}K_{22}\alpha_{2}^{2}+y_{1}y_{2}K_{12}\alpha_{1}\alpha_{2}-(\alpha_{1}+\alpha_{2})\\+y_{1}\nu_{1}\alpha_{1}+y_{2}\nu_{2}\alpha_{2} W(α1,α2)==21K11α12+21K22α22+y1y2K12α1α2−(α1+α2)+y1ν1α1+y2ν2α2
由 α 1 y 1 = ς − α 2 y 2 \alpha_{1}y_{1}=\varsigma-\alpha_{2}y_{2} α1y1=ς−α2y2及 y 1 2 = 1 y_{1}^{2}=1 y12=1,可将 α 1 \alpha_{1} α1表示为:
α 1 = ( ς − y 2 α 2 ) y 1 \alpha_{1}=(\varsigma-y_{2}\alpha_{2})y_{1} α1=(ς−y2α2)y1
带入到 W ( α 1 , α 2 ) W(\alpha_{1},\alpha_{2}) W(α1,α2)的表达式中,
W ( α 2 ) = 1 2 K 11 ( ς − y 2 α 2 ) 2 + 1 2 K 22 α 2 2 + y 2 K 12 ( ς − y 2 α 2 ) α 2 − ( ς − y 2 α 2 ) y 1 − α 2 + ν 1 ( ς − y 2 α 2 ) + y 2 ν 2 α 2 W(\alpha_{2})=\frac{1}{2}K_{11}(\varsigma-y_{2}\alpha_{2})^{2}+\frac{1}{2}K_{22}\alpha_{2}^{2}+y_{2}K_{12}(\varsigma-y_{2}\alpha_{2})\alpha_{2}\\-(\varsigma-y_{2}\alpha_{2})y_{1}-\alpha_{2}+\nu_{1}(\varsigma-y_{2}\alpha_{2})+y_{2}\nu_{2}\alpha_{2} W(α2)=21K11(ς−y2α<