Zhou Zhihua's Watermelon Book: Supplementary SVM Formulas and Some Explanations of Principles

Z.-H. Zhou's Watermelon Book, SVM chapter. This article is supplementary material for the SVM sections.

Watermelon Book

6.3

Suppose the training set is linearly separable and the minimum margin is $\delta$.
$$
\begin{cases}
w^T x_i + b \ge +\delta, & y_i = +1\\
w^T x_i + b \le -\delta, & y_i = -1
\end{cases}
\;\Rightarrow\;
\begin{cases}
\frac{w^T}{\delta} x_i + \frac{b}{\delta} \ge +1, & y_i = +1\\
\frac{w^T}{\delta} x_i + \frac{b}{\delta} \le -1, & y_i = -1
\end{cases}
$$
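After this rescaling (absorbing $\delta$ into $w$ and $b$), the two hyperplanes $w^T x + b = \pm 1$ bound the margin, which is the book's eq. (6.4):

$$
\gamma = \frac{2}{\|w\|}.
$$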

6.8

Refer to the link below ([1]) about Lagrange multipliers and the KKT conditions.

6.9

$$
L = \tfrac{1}{2} w^T w + \sum_{i=1}^m a_i \big(1 - y_i (w^T x_i + b)\big)
$$

Taking the derivative with respect to $w$ and setting it to zero:

$$
0 = w - \sum_{i=1}^m a_i y_i x_i \;\Rightarrow\; w = \sum_{i=1}^m a_i y_i x_i
$$
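For completeness, the same step with respect to $b$ gives the book's eq. (6.10):

$$
0 = \frac{\partial L}{\partial b} = -\sum_{i=1}^m a_i y_i \;\Rightarrow\; \sum_{i=1}^m a_i y_i = 0.
$$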

6.11

You can substitute eq. (6.9) and compute $\frac{1}{2}\|w\|^2$ and $\sum_{i=1}^m a_i\big(1 - y_i(w^T x_i + b)\big)$ separately. I was lazy here, so the substitution is only sketched below.
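A sketch of the substitution, using $w = \sum_i a_i y_i x_i$ (6.9) and $\sum_i a_i y_i = 0$ (6.10):

$$
\tfrac{1}{2} w^T w = \tfrac{1}{2} \sum_{i=1}^m \sum_{j=1}^m a_i a_j y_i y_j x_i^T x_j,
$$

$$
\sum_{i=1}^m a_i \big(1 - y_i (w^T x_i + b)\big) = \sum_{i=1}^m a_i - \sum_{i=1}^m \sum_{j=1}^m a_i a_j y_i y_j x_j^T x_i - b \sum_{i=1}^m a_i y_i,
$$

where the last term vanishes by (6.10). Adding the two pieces gives the dual objective of eq. (6.11):

$$
\max_a \; \sum_{i=1}^m a_i - \tfrac{1}{2} \sum_{i=1}^m \sum_{j=1}^m a_i a_j y_i y_j x_i^T x_j.
$$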

6.18

Because the $a_i$ obtained by SMO are numerical approximations, they can deviate from their theoretical values, so the book computes $b$ by averaging over all support vectors instead of reading it off a single one.
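A minimal sketch of that averaging, assuming a linear kernel and that `alpha`, `X`, `y` come from an already-trained dual SVM; the function name and the `tol` threshold are my own, not from the book.

```python
import numpy as np

def bias_from_support_vectors(alpha, X, y, tol=1e-8):
    """Average the bias b over all support vectors (eq. 6.18, linear kernel).

    alpha : (m,) dual variables from an SMO solver (hypothetical input)
    X     : (m, d) training inputs
    y     : (m,) labels in {-1, +1}
    """
    sv = alpha > tol               # support vectors are the samples with a_i != 0
    w = X.T @ (alpha * y)          # w = sum_i a_i y_i x_i  (eq. 6.9)
    # eq. 6.18 averages 1/y_s - w^T x_s over support vectors; 1/y_s == y_s for +-1 labels
    return float(np.mean(y[sv] - X[sv] @ w))
```

Averaging simply smooths out the per-sample numerical error in $a_i$; any single support vector would give a slightly different $b$.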

6.41

In the discussion below eq. (6.41), the interpretation of the support vectors is based on $a_i \neq 0$; a short expansion of that reasoning follows.
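The expansion (my paraphrase of the complementary slackness condition in (6.41)):

$$
a_i \big(y_i f(x_i) - 1 + \xi_i\big) = 0 \;\Rightarrow\; \big(a_i \neq 0 \Rightarrow y_i f(x_i) = 1 - \xi_i\big),
$$

so every sample with $a_i \neq 0$ lies on or inside the margin and is a support vector, while samples with $a_i = 0$ contribute nothing to the final model.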

The following notes partially reference [1].

KKT

Part 1

[figure omitted: the KKT conditions, from [1]]
A clearer interpretation: the three equations are based on

$$
u = 0 \ \text{or} \ g = 0, \quad \text{where } u \text{ and } g \text{ are vectors (componentwise, } u_i = 0 \text{ or } g_i = 0\text{)}.
$$

Using these three equations, we can find the $x$ that minimizes $f(x)$ by solving another formulation, e.g. $\min_x \max_u L(x, u)$; the two formulations share the same $x$ and $u$.
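A short justification of that min-max formulation (standard duality material, not from the book), assuming constraints $g_i(x) \le 0$ with multipliers $u_i \ge 0$: the inner maximization enforces feasibility,

$$
\max_{u \ge 0} L(x, u) = \max_{u \ge 0}\Big(f(x) + \sum_i u_i g_i(x)\Big) =
\begin{cases}
f(x), & \text{if } g_i(x) \le 0 \text{ for all } i,\\
+\infty, & \text{otherwise},
\end{cases}
$$

so $\min_x \max_{u \ge 0} L(x, u)$ equals the constrained minimum of $f$, and at the optimum $u_i g_i(x) = 0$ for every $i$, which is exactly the "$u = 0$ or $g = 0$" pattern above.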

Part 2

[figure omitted, from [1]]
Why is $L(\hat{x}, u)$ equal to $\min_x L(x, u)$?
I think one can use proof by contradiction. Suppose $L(x', u) = \min_x L(x, u)$ with $x' \neq \hat{x}$. Then $\max_u \min_x L(x, u) = \max_u L(x', u) = f(x') \neq f(\hat{x})$, which contradicts strong duality (the max-min value should equal $f(\hat{x})$).
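A more direct argument for a convex problem (my own phrasing, standard duality reasoning): at the optimum, complementary slackness gives $u_i g_i(\hat{x}) = 0$, so

$$
L(\hat{x}, u) = f(\hat{x}) + \sum_i u_i g_i(\hat{x}) = f(\hat{x}),
$$

and stationarity ($\nabla_x L(\hat{x}, u) = 0$) together with the convexity of $L(\cdot, u)$ in $x$ means $\hat{x}$ is a global minimizer of $L(\cdot, u)$, i.e. $L(\hat{x}, u) = \min_x L(x, u)$.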

Lagrange

I think the link below ([2]) explains this in great detail.

Reference

[1] Lagrange and KKT conditions.
[2] Lagrange multipliers (拉格朗日乘数).
