Machine Learning - SVM (3: Kernel SVM)

The Kernel Concept

Kernel = Transform + Inner Product

In the SVM dual problem, computing $q_{n,m} = y_n y_m z_n^T z_m$ requires the inner product $z_n^T z_m = \phi(x_n)^T \phi(x_m)$, which still depends on the transformed dimension $\tilde d$. A kernel computes this inner product directly in the original space, replacing the explicit inner product in the $z$-space and noticeably improving computational efficiency. Two commonly used kernels are the polynomial kernel and the Gaussian kernel.

Polynomial Kernel

Consider the 2nd-order polynomial transform
$$\phi_2(x) = (1, x_1, \cdots, x_d,\ x_1^2, x_1 x_2, \cdots, x_1 x_d,\ x_2 x_1, x_2^2, \cdots, x_2 x_d,\ \cdots,\ x_d^2),$$
whose inner product simplifies as follows:

$$\begin{aligned}
\phi_2(x)^T \phi_2(x') &= 1 + \sum_{i=1}^d x_i x_i' + \sum_{i=1}^d \sum_{j=1}^d x_i x_j x_i' x_j' \\
&= 1 + \sum_{i=1}^d x_i x_i' + \sum_{i=1}^d x_i x_i' \sum_{j=1}^d x_j x_j' \\
&= 1 + x^T x' + (x^T x')(x^T x')
\end{aligned}$$

The kernel corresponding to the 2nd-order polynomial transform is therefore
$$K_{\Phi_2}(x, x') = 1 + (x^T x') + (x^T x')^2$$
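As a quick numerical check (my own sketch, not from the original post), we can verify that the kernel shortcut $1 + x^T x' + (x^T x')^2$ equals the explicit inner product $\phi_2(x)^T \phi_2(x')$, while doing only $O(d)$ work instead of $O(d^2)$:

```python
import numpy as np

def phi2(x):
    """Explicit 2nd-order polynomial transform: (1, x_i, x_i * x_j for all i, j)."""
    return np.concatenate(([1.0], x, np.outer(x, x).ravel()))

def k_phi2(x, xp):
    """Kernel shortcut: 1 + x.x' + (x.x')^2, no explicit transform needed."""
    s = x @ xp
    return 1.0 + s + s ** 2

rng = np.random.default_rng(0)
x, xp = rng.standard_normal(5), rng.standard_normal(5)

explicit = phi2(x) @ phi2(xp)      # inner product in the transformed space
via_kernel = k_phi2(x, xp)         # same value computed in the original space
print(abs(explicit - via_kernel))  # ~0 up to floating-point error
```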

Slightly more generally,
$$K_2(x, x') = (1 + \gamma x^T x')^2, \quad \gamma > 0$$

and, more generally still, we obtain the polynomial kernel:

$$K_Q(x, x') = (\zeta + \gamma x^T x')^Q, \quad \gamma > 0,\ \zeta \ge 0$$

The parameters $(\zeta, \gamma)$ affect the geometry of the SVM boundary.

A commonly used special case of the polynomial kernel is the linear kernel: $K_1(x, x') = x^T x'$.


Gaussian Kernel (Radial Basis Function Kernel, RBF Kernel)

The Gaussian kernel corresponds to an infinite-dimensional transform. For scalar $x$:
$$\Phi(x) = \exp(-x^2)\left(1,\ \sqrt{\tfrac{2}{1!}}\,x,\ \sqrt{\tfrac{2^2}{2!}}\,x^2,\ \cdots\right)$$
so that

$$\begin{aligned}
K(x, x') &= \exp(-(x - x')^2) \\
&= \exp(-x^2)\exp(-x'^2)\exp(2xx') \\
&\overset{\text{Taylor}}{=} \exp(-x^2)\exp(-x'^2)\sum_{i=0}^{\infty} \frac{(2xx')^i}{i!} \\
&= \sum_{i=0}^{\infty} \left(\exp(-x^2)\exp(-x'^2)\,\sqrt{\tfrac{2^i}{i!}}\sqrt{\tfrac{2^i}{i!}}\, x^i x'^i\right) \\
&= \Phi(x)^T \Phi(x')
\end{aligned}$$
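To make the infinite-dimensional transform concrete, the following sketch (my own illustration, not from the original post) truncates $\Phi(x)$ at a finite number of coordinates for scalar inputs and checks that the truncated inner product closely matches $\exp(-(x-x')^2)$:

```python
import numpy as np
from math import exp, factorial

def phi_truncated(x, n_terms=30):
    """First n_terms coordinates of the infinite feature map
    Phi(x) = exp(-x^2) * (sqrt(2^i / i!) * x^i for i = 0, 1, 2, ...)."""
    return np.array([exp(-x**2) * np.sqrt(2.0**i / factorial(i)) * x**i
                     for i in range(n_terms)])

x, xp = 0.7, -0.3
approx = phi_truncated(x) @ phi_truncated(xp)  # truncated series
exact = exp(-(x - xp)**2)                      # closed-form kernel value
print(approx, exact)
```

The Taylor series converges quickly, so even a modest truncation reproduces the kernel value to high precision.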

More generally, the Gaussian kernel takes the form

$$K(x, x') = \exp(-\gamma \left\| x - x' \right\|^2), \quad \gamma > 0$$
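A minimal vectorized implementation of this general form (my sketch; `gamma` plays the role of $\gamma$ above):

```python
import numpy as np

def rbf_kernel(X, Xp, gamma=1.0):
    """Gaussian/RBF kernel matrix: K[i, j] = exp(-gamma * ||X[i] - Xp[j]||^2)."""
    sq_dists = (np.sum(X**2, axis=1)[:, None]
                + np.sum(Xp**2, axis=1)[None, :]
                - 2.0 * X @ Xp.T)
    return np.exp(-gamma * sq_dists)

X = np.array([[0.0, 0.0], [1.0, 0.0]])
K = rbf_kernel(X, X, gamma=0.5)
print(K)  # diagonal is exactly 1; off-diagonal is exp(-0.5 * 1)
```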

Used inside the SVM, the Gaussian kernel gives

$$\begin{aligned}
g_{SVM}(x) &= \operatorname{sign}\left(\sum_{SV} \alpha_n y_n K(x_n, x) + b\right) \\
&= \operatorname{sign}\left(\sum_{SV} \alpha_n y_n \exp(-\gamma \left\| x - x_n \right\|^2) + b\right)
\end{aligned}$$

From this expression, the quantity inside $\operatorname{sign}(\cdot)$ is a linear combination of Gaussian functions, each centered at a support vector (SV). This is why the Gaussian kernel is also called the Radial Basis Function (RBF) kernel.

As for choosing the parameter $\gamma$: the larger $\gamma$ is, the more complex the resulting SVM boundary becomes, and the more prone it is to overfitting.


Kernel SVM

The kernel is used in three places in the SVM:

  • $q_{n,m} = y_n y_m z_n^T z_m = y_n y_m K(x_n, x_m)$
  • Computing $b$ from a support vector $(x_s, y_s)$: $b = y_s - w^T z_s = y_s - \left(\sum_{n=1}^N \alpha_n y_n z_n\right)^T z_s = y_s - \sum_{n=1}^N \alpha_n y_n K(x_n, x_s)$
  • The final hypothesis $g_{SVM}$: $g_{SVM}(x) = \operatorname{sign}(w^T \Phi(x) + b) = \operatorname{sign}\left(\sum_{n=1}^N \alpha_n y_n K(x_n, x) + b\right)$
Kernel Hard-Margin SVM Algorithm
1. Compute $q_{n,m} = y_n y_m K(x_n, x_m)$; $p = -1_N$; set up the constraints $(A, c)$.

2. $\alpha \leftarrow QP(Q_D, p, A, c)$

3. $b \leftarrow y_s - \sum_{SV} \alpha_n y_n K(x_n, x_s)$ for some support vector $(x_s, y_s)$.

4. Return the SVs, the $\alpha_n$, and $b$; for a new input $x$, the prediction is $g_{SVM}(x) = \operatorname{sign}\left(\sum_{SV} \alpha_n y_n K(x_n, x) + b\right)$.
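The four steps above can be sketched end-to-end. The original post leaves $QP(\cdot)$ abstract; here, as an illustrative stand-in, I solve the dual with `scipy.optimize.minimize` (SLSQP) on a tiny XOR-style dataset that is RBF-separable but not linearly separable:

```python
import numpy as np
from scipy.optimize import minimize

def rbf(x, xp, gamma=1.0):
    """Gaussian kernel K(x, x') = exp(-gamma * ||x - x'||^2)."""
    return np.exp(-gamma * np.sum((x - xp) ** 2))

# XOR-style toy data: separable with an RBF kernel, not with a line.
X = np.array([[0., 0.], [1., 1.], [0., 1.], [1., 0.]])
y = np.array([1., 1., -1., -1.])
N = len(y)

# Step 1: build Q_D with q_{n,m} = y_n y_m K(x_n, x_m).
K = np.array([[rbf(X[n], X[m]) for m in range(N)] for n in range(N)])
Q = np.outer(y, y) * K

# Step 2: solve the dual QP: min 1/2 a'Qa - 1'a  s.t.  a >= 0, y'a = 0.
res = minimize(
    fun=lambda a: 0.5 * a @ Q @ a - a.sum(),
    x0=np.zeros(N),
    jac=lambda a: Q @ a - np.ones(N),
    constraints=[{"type": "eq", "fun": lambda a: a @ y}],
    bounds=[(0.0, None)] * N,
    method="SLSQP",
)
alpha = res.x

# Step 3: compute b from a support vector (one with alpha_s > 0).
s = int(np.argmax(alpha))
b = y[s] - sum(alpha[n] * y[n] * K[n, s] for n in range(N))

# Step 4: predict via the kernel expansion over the support vectors.
def g_svm(x):
    return np.sign(sum(alpha[n] * y[n] * rbf(X[n], x) for n in range(N)) + b)

preds = np.array([g_svm(x) for x in X])
print(preds)
```

On this separable toy set, hard-margin training should reproduce the labels exactly; on all four XOR points every $\alpha_n$ ends up positive, i.e., every point is a support vector.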