SVM

Support Vector Machine
The support vector machine has three treasures: the margin, duality, and the kernel trick.
In short, an SVM is a binary classification model. It looks for a hyperplane $w^Tx+b=0$ such that positive samples satisfy $w^Tx+b>0$ and negative samples satisfy $w^Tx+b<0$. In essence, it is a maximum-margin classifier.
The training data are $\{(x_i,y_i)\}_{i=1}^N$, with $x_i \in \mathbb{R}^p$ and $y_i\in \{-1,1\}$.

We first define the distance. The distance from a sample point $(x_i,y_i)$ to the hyperplane $w^Tx+b=0$ is
$$distance(w,b,x_i) = \frac{1}{\|w\|}|w^Tx_i+b|$$
The margin is then the smallest such distance over all sample points:
$$margin(w,b) = \min_{x_i} distance(w,b,x_i)$$
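As a quick numerical illustration of the two definitions above, here is a minimal sketch in plain Python. The toy points and the hyperplane `(w, b)` are made up for illustration and are not from the original derivation.

```python
import math

def distance(w, b, x):
    """Distance from point x to the hyperplane w^T x + b = 0."""
    norm_w = math.sqrt(sum(wj * wj for wj in w))
    return abs(sum(wj * xj for wj, xj in zip(w, x)) + b) / norm_w

def margin(w, b, points):
    """Margin of (w, b): the smallest distance over all sample points."""
    return min(distance(w, b, x) for x in points)

# Hypothetical toy data and hyperplane, chosen only for illustration.
points = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
w, b = (1.0, 1.0), -2.0

print(margin(w, b, points))  # approximately sqrt(2)
```

Note that the margin depends only on the points closest to the hyperplane; here those are `(2, 2)` and `(0, 0)`.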

Hard-margin SVM

For a hyperplane that separates the data we need:
$$w^Tx_i+b>0 \quad \text{if } y_i=+1$$
$$w^Tx_i+b<0 \quad \text{if } y_i=-1$$
These two conditions can be combined into one:
$$y_i(w^Tx_i+b)>0,\quad \forall i=1,\dots,N$$
$$margin(w,b)=\min_{x_i}\frac{1}{\|w\|}|w^Tx_i+b|$$
Maximum margin:
$$\max_{w,b}\min_{x_i}\frac{1}{\|w\|}|w^Tx_i+b| \quad \text{s.t.}\quad y_i(w^Tx_i+b)>0,\ \forall i=1,\dots,N$$
Since $y_i\in\{-1,1\}$ and $y_i(w^Tx_i+b)>0$, we have $|w^Tx_i+b| = y_i(w^Tx_i+b)$, and $\frac{1}{\|w\|}$ does not depend on $x_i$, so
$$\max_{w,b}\min_{x_i}\frac{1}{\|w\|}|w^Tx_i+b| =\max_{w,b}\min_{x_i} \frac{1}{\|w\|} y_i(w^Tx_i+b)=\max_{w,b}\frac{1}{\|w\|}\min_{x_i}y_i(w^Tx_i+b)$$
Under the constraint there exists some $\gamma>0$ such that
$$\min_{x_i}y_i(w^Tx_i+b)=\gamma$$
Scaling $(w,b)$ by a positive constant does not change the hyperplane, so we can fix $\gamma = 1$. Then:
$$\max_{w,b}\frac{1}{\|w\|}\min_{x_i}y_i(w^Tx_i+b) = \max_{w,b}\frac{1}{\|w\|}$$
Maximizing $\frac{1}{\|w\|}$ is the same as minimizing $\frac{1}{2}w^Tw$, so the problem becomes a convex optimization problem:
$$\min_{w,b}\frac{1}{2}w^Tw \quad \text{s.t.}\quad y_i(w^Tx_i+b)\ge 1,\ \forall i=1,\dots,N$$
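The rescaling step ($\gamma = 1$) can be checked numerically. The sketch below, on made-up toy data, rescales a separating `(w, b)` so that the smallest value of $y_i(w^Tx_i+b)$ equals 1, and confirms that the margin is then $1/\|w\|$. The helper names `functional_margins` and `canonicalize` are hypothetical, introduced only for this example.

```python
import math

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def functional_margins(w, b, X, y):
    """y_i (w^T x_i + b) for every sample."""
    return [yi * (dot(w, xi) + b) for xi, yi in zip(X, y)]

def canonicalize(w, b, X, y):
    """Rescale (w, b) so that min_i y_i (w^T x_i + b) == 1."""
    gamma = min(functional_margins(w, b, X, y))
    if gamma <= 0:
        raise ValueError("(w, b) does not separate the data")
    return tuple(wj / gamma for wj in w), b / gamma

# Hypothetical, linearly separable toy data.
X = [(2.0, 2.0), (3.0, 3.0), (0.0, 0.0), (-1.0, 0.0)]
y = [1, 1, -1, -1]
w, b = (1.0, 1.0), -2.0

w_c, b_c = canonicalize(w, b, X, y)
geometric_margin = 1.0 / math.sqrt(dot(w_c, w_c))
print(w_c, b_c, geometric_margin)
```

The hyperplane is unchanged by the rescaling; only the representation `(w, b)` is, which is why fixing $\gamma = 1$ loses no generality.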
Lagrange multipliers
$$\mathcal{L}(w,b,\lambda) = \frac{1}{2}w^Tw+\sum_{i=1}^{N}\lambda_i[1-y_i(w^Tx_i+b)],\quad \lambda_i\ge 0$$
Why this gives a problem with no constraints on $w,b$:
If some constraint is violated, i.e. $1-y_i(w^Tx_i+b)>0$, then
$$\max_{\lambda}\mathcal{L}(w,b,\lambda)=\frac{1}{2} w^T w + \infty = \infty$$
If every constraint holds, i.e. $1-y_i(w^Tx_i+b)\le 0$ for all $i$, then
$$\max_{\lambda}\mathcal{L}(w,b,\lambda)=\frac{1}{2} w^T w + 0 = \frac{1}{2} w^T w$$
So the outer minimization never picks an infeasible $(w,b)$, and the constrained convex problem is equivalent to one with no constraints on $w,b$:
$$\min_{w,b}\max_{\lambda}\mathcal{L}(w,b,\lambda) \quad \text{s.t.}\quad \lambda_i\ge 0$$
Converting to the dual problem:
$$\max_{\lambda}\min_{w,b}\mathcal{L}(w,b,\lambda) \quad \text{s.t.}\quad \lambda_i\ge 0$$
Next we compute $\min_{w,b}\mathcal{L}(w,b,\lambda)$:
$$\frac{\partial \mathcal{L} }{\partial b} = -\sum_{i=1}^N \lambda_iy_i = 0$$
Substituting this result into $\mathcal{L}(w,b,\lambda)$:
$$\mathcal{L}(w,b,\lambda) = \frac{1}{2}w^Tw+\sum_{i=1}^N\lambda_i-\sum_{i=1}^N\lambda_iy_iw^Tx_i$$
$$\frac{\partial \mathcal{L} }{\partial w} = w-\sum_{i=1}^N\lambda_iy_ix_i = 0 \ \Rightarrow\ w = \sum_{i=1}^N\lambda_iy_ix_i$$
Substituting $w$ back in:
$$\min_{w,b}\mathcal{L}(w,b,\lambda) = -\frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \lambda_i \lambda_j y_iy_jx_i^T x_j + \sum_{i=1}^N \lambda_i$$
The original problem becomes:
$$\max_{\lambda}\ -\frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \lambda_i \lambda_j y_iy_jx_i^T x_j + \sum_{i=1}^N \lambda_i \quad \text{s.t.}\quad \lambda_i\ge 0,\ \sum_{i=1}^N\lambda_iy_i=0$$
or, equivalently, flipping the sign of the objective:
$$\min_{\lambda}\ \frac{1}{2}\sum_{i=1}^N \sum_{j=1}^N \lambda_i \lambda_j y_iy_jx_i^T x_j - \sum_{i=1}^N \lambda_i \quad \text{s.t.}\quad \lambda_i\ge 0,\ \sum_{i=1}^N\lambda_iy_i=0$$
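To make the dual concrete, the following sketch evaluates the dual objective on a hypothetical two-point dataset. With one point per class, the constraint $\sum_i \lambda_i y_i = 0$ forces $\lambda_1=\lambda_2=t$, so the dual reduces to a one-dimensional problem that a simple grid search can handle. This is only an illustration under those assumptions, not a general dual solver.

```python
def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def dual_objective(lam, X, y):
    """sum_i lam_i - 1/2 sum_ij lam_i lam_j y_i y_j x_i^T x_j."""
    n = len(X)
    quad = sum(
        lam[i] * lam[j] * y[i] * y[j] * dot(X[i], X[j])
        for i in range(n)
        for j in range(n)
    )
    return sum(lam) - 0.5 * quad

def recover_w(lam, X, y):
    """w = sum_i lam_i y_i x_i."""
    dim = len(X[0])
    return tuple(
        sum(l * yi * xi[d] for l, xi, yi in zip(lam, X, y)) for d in range(dim)
    )

# Hypothetical two-point dataset: one positive and one negative sample.
X = [(1.0, 1.0), (-1.0, -1.0)]
y = [1, -1]

# Grid search over t >= 0, with lam = [t, t] so that sum_i lam_i y_i = 0.
candidates = [k / 100 for k in range(101)]
best_t = max(candidates, key=lambda t: dual_objective([t, t], X, y))

lam = [best_t, best_t]
w = recover_w(lam, X, y)
primal_value = 0.5 * dot(w, w)
print(best_t, dual_objective(lam, X, y), w, primal_value)
```

On this example the dual optimum and the primal objective $\frac{1}{2}w^Tw$ at the recovered $w$ coincide, which is what strong duality predicts.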
When strong duality holds, the optimal solution satisfies the KKT conditions:
KKT
$$\frac{\partial \mathcal{L} }{\partial w}=0,\quad \frac{\partial \mathcal{L} }{\partial b}=0$$
$$\lambda_i(1-y_i(w^Tx_i+b))=0$$
$$\lambda_i\ge 0$$
$$1-y_i(w^Tx_i+b)\le 0$$
From the KKT conditions we obtain the following, where $(x_k,y_k)$ is any sample with $\lambda_k>0$ (a support vector, for which $y_k(w^{*T}x_k+b^*)=1$):
$$w^* = \sum_{i=1}^N\lambda_iy_ix_i$$
$$b^* = y_k-\sum_{i=1}^N\lambda_iy_ix_i^Tx_k$$
The separating hyperplane is $w^{*T}x+b^*=0$, and a new point $x$ is classified by
$$f(x)=\mathrm{sign}(w^{*T}x+b^*)$$
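The recovery of $w^*$ and $b^*$ from the multipliers can be sketched as follows. The dataset and the multipliers `lam` are assumptions made for illustration (the values satisfy the KKT conditions for this toy set), and `recover_w_b` and `predict` are hypothetical helper names.

```python
def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def recover_w_b(lam, X, y, tol=1e-12):
    """Compute w* = sum_i lam_i y_i x_i and b* from one support vector."""
    dim = len(X[0])
    w = tuple(
        sum(l * yi * xi[d] for l, xi, yi in zip(lam, X, y)) for d in range(dim)
    )
    # Any index k with lam_k > 0 is a support vector.
    k = next(i for i, l in enumerate(lam) if l > tol)
    b = y[k] - sum(l * yi * dot(xi, X[k]) for l, xi, yi in zip(lam, X, y))
    return w, b

def predict(w, b, x):
    """Sign of w^T x + b, mapped to the labels +1 / -1."""
    return 1 if dot(w, x) + b >= 0 else -1

# Hypothetical data: the third point lies strictly outside the margin.
X = [(1.0, 1.0), (-1.0, -1.0), (3.0, 3.0)]
y = [1, -1, 1]
lam = [0.25, 0.25, 0.0]  # assumed dual solution for this toy set

w, b = recover_w_b(lam, X, y)

# Complementary slackness: lam_i * (1 - y_i (w^T x_i + b)) should be 0.
comp_slack = [l * (1 - yi * (dot(w, xi) + b)) for l, xi, yi in zip(lam, X, y)]
print(w, b, comp_slack, predict(w, b, (2.0, 0.5)))
```

The third point has $\lambda_3 = 0$, so it contributes nothing to $w^*$ or $b^*$; only the support vectors determine the classifier.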

Soft-margin SVM

Hard-margin SVM assumes the data are linearly separable. In practice the data are often not separable, or contain noisy points. Soft-margin SVM handles this by adding a loss term (a distance) that penalizes margin violations:
$$\min_{w,b} \frac{1}{2}w^Tw + loss$$
$$\text{if } y_i(w^Tx_i+b)\ge 1,\quad loss=0$$
$$\text{if } y_i(w^Tx_i+b)<1,\quad loss=1-y_i(w^Tx_i+b)$$
In summary, this is the hinge loss:
$$loss = \max\{ 0,1-y_i(w^Tx_i+b)\}$$
The objective becomes (C is a hyperparameter that you need to tune yourself):
$$\min_{w,b} \frac{1}{2}w^Tw +C \sum_{i=1}^N\max\{ 0,1-y_i(w^Tx_i+b)\}$$
Let $\xi_i = \max\{0, 1-y_i(w^Tx_i+b)\}$, so that $\xi_i\ge 0$. The problem can then be written as:

$$\min_{w,b,\xi} \frac{1}{2}w^Tw +C \sum_{i=1}^N\xi_i \quad \text{s.t.}\quad y_i(w^Tx_i+b)\ge 1-\xi_i,\ \xi_i\ge 0,\ \forall i=1,\dots,N$$
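The soft-margin objective can be evaluated directly from the hinge loss. Below is a minimal sketch on made-up data in which one negative sample sits inside the margin. The values of `w`, `b`, and `C` are arbitrary choices for illustration, not an optimized solution.

```python
def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

def hinge(w, b, x, y_i):
    """Hinge loss max{0, 1 - y_i (w^T x + b)} for one sample."""
    return max(0.0, 1.0 - y_i * (dot(w, x) + b))

def soft_margin_objective(w, b, X, y, C):
    """Return 1/2 w^T w + C * sum_i xi_i together with the slacks xi_i."""
    slacks = [hinge(w, b, xi, yi) for xi, yi in zip(X, y)]
    return 0.5 * dot(w, w) + C * sum(slacks), slacks

# Hypothetical data: the last (negative) point violates the margin.
X = [(2.0, 2.0), (0.0, 0.0), (1.0, 0.5)]
y = [1, -1, -1]
w, b, C = (1.0, 1.0), -2.0, 1.0

value, slacks = soft_margin_objective(w, b, X, y, C)
print(value, slacks)
```

Samples with $y_i(w^Tx_i+b)\ge 1$ get zero slack, so only the margin violator contributes to the penalty term. A larger `C` makes each violation more expensive relative to the $\frac{1}{2}w^Tw$ term.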

Constrained optimization problems
Weak duality
Duality
KKT conditions