SVM Optimization: The Dual Problem

The earlier post "SVM: Basic Ideas" introduced the SVM optimization objective. It is restated here:

\min_{W,b,\xi}||W||^2/2 + C \sum_i{\xi _i} \\ s.t. \;\;\; y_i (W^T X_i + b) \geq 1-\xi_i, \forall i \\ \xi _i \geq 0 , \forall i

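This is a quadratic programming problem. It could be attacked directly with the KKT conditions, but in that form it is still too complicated to solve easily.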
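To make the objective concrete, here is a minimal pure-Python sketch that evaluates it for a given W and b. It assumes each slack takes its smallest feasible value, \xi_i = \max(0, 1 - y_i (W^T X_i + b)). The function name `primal_objective` and the two-point toy data are illustrative and do not come from the original post.

```python
def primal_objective(W, b, X, y, C):
    """Return (||W||^2 / 2 + C * sum(xi_i), slacks) for the soft-margin SVM.

    Assumption: xi_i is set to its smallest feasible value,
    xi_i = max(0, 1 - y_i * (W^T X_i + b)).
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    slacks = [max(0.0, 1.0 - label * (dot(W, row) + b))
              for row, label in zip(X, y)]
    return 0.5 * dot(W, W) + C * sum(slacks), slacks


# Hypothetical toy data: two 1-D points, one per class.
X = [[1.0], [-1.0]]
y = [1, -1]
value, slacks = primal_objective([1.0], 0.0, X, y, C=1.0)
```

With W = 1 and b = 0 both points sit exactly on the margin, so both slacks are zero and the objective reduces to ||W||^2/2.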

First we transform the problem. Define the Lagrangian

L(W,b,\xi,\alpha,\beta) = f(W,b,\xi) + \sum_i{\alpha_i g(X_i)} + \sum_i{\beta_i h(\xi_i)}

where g(X_i) = 1-\xi_i - y_i (W^T X_i + b) \leq 0, \; h(\xi_i) = -\xi_i \leq 0, \; \alpha_i \geq 0, \; \beta_i \geq 0, \; \forall i, and

f(W,b,\xi) = \frac{{\left \| W \right \|}^2}{2} + C \sum_i{\xi_i}

\because \alpha_i g(X_i) \leq 0, \; \beta_i h(\xi_i) \leq 0

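\therefore \max_{\alpha,\beta}L(W,b,\xi,\alpha,\beta) = f(W,b,\xi) for every feasible (W,b,\xi), with the maximum attained when \alpha_i g(X_i) = 0 and \beta_i h(\xi_i) = 0. Hence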

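\min_{W,b,\xi}{f(W,b,\xi)} = \min_{W,b,\xi}\max_{\alpha,\beta}L(W,b,\xi,\alpha,\beta).

Following the proof of the KKT conditions, it is easy to see that when \alpha_i g(X_i) = 0, \; \beta_i h(\xi_i) = 0, \; \forall i (the KKT complementary slackness conditions) hold,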

\min_{W,b,\xi}\max_{\alpha,\beta}L(W,b,\xi,\alpha,\beta) = \max_{\alpha,\beta}\min_{W,b,\xi}L(W,b,\xi,\alpha,\beta), where the right-hand side is the dual of the left-hand side.
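The inner maximization over the multipliers used in this argument can be illustrated with a small sketch. For one fixed point, it returns the value of the maximum of f + \sum_i \alpha_i g_i over \alpha_i \geq 0, given the numeric value of f and of each constraint function. The helper name `max_over_multipliers` is hypothetical, and the sketch treats the \beta_i h(\xi_i) terms as further entries of the same constraint list.

```python
import math


def max_over_multipliers(f_value, constraint_values):
    """Value of max over alpha_i >= 0 of f + sum_i alpha_i * g_i.

    If every g_i <= 0, the best choice makes each alpha_i * g_i = 0,
    so the maximum equals f. If some g_i > 0, alpha_i can grow without
    bound, so the maximum is +infinity.
    """
    if all(g <= 0 for g in constraint_values):
        return f_value
    return math.inf


feasible = max_over_multipliers(0.5, [0.0, -2.0])   # all constraints hold
infeasible = max_over_multipliers(0.5, [0.1])       # one constraint violated
```

Because infeasible points are assigned +infinity, minimizing this inner maximum over (W,b,\xi) automatically restricts attention to feasible points, which is why the min-max form reproduces the original constrained problem.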

In the end, the SVM optimization objective is equivalent to

\max_{\alpha,\beta}\min_{W,b,\xi}L(W,b,\xi,\alpha,\beta) \\ s.t. \;\; \forall i \left\{\begin{matrix} \alpha_i g(X_i) = 0, \; \beta_i h(\xi_i) = 0 \\ g(X_i) \leq 0 , \; h(\xi_i) \leq 0 \\ \alpha_i \geq 0, \; \beta_i \geq 0 \end{matrix}\right.

First solve \min_{W,b,\xi}L(W,b,\xi,\alpha,\beta) by setting the partial derivatives to zero:

\left\{\begin{matrix} \partial L/\partial W = W - \sum_{i}\alpha_i y_i X_i = 0 \; &\rightarrow& W = \sum_{i}\alpha_i y_i X_i \\ \partial L/\partial b = - \sum_{i}\alpha_i y_i = 0 \; &\rightarrow& \sum_{i}\alpha_i y_i = 0\\ \partial L/\partial \xi_i = C - \alpha_i - \beta_i = 0 \; &\rightarrow& \alpha_i = C-\beta_i \leq C \\ \end{matrix}\right.

Substituting these into the expression above gives

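\min_{W,b,\xi}L = \sum_{i}\alpha_i - \left \| \sum_{i}\alpha_i y_i X_i \right \|^2/2, and maximizing this over \alpha is the same as minimizing its negative:

\min_{\alpha}\left \| \sum_{i}\alpha_i y_i X_i \right \|^2/2 - \sum_{i}\alpha_i \\ s.t. \;\; \forall i \left\{\begin{matrix} \alpha_i g(X_i) = 0, \; \beta_i h(\xi_i) = 0 \\ g(X_i) \leq 0 , \; h(\xi_i) \leq 0 \\ \alpha_i \geq 0, \; \beta_i \geq 0 \\ 0 \leq \alpha_i \leq C, \;\; \sum_{i}\alpha_i y_i = 0 \end{matrix}\right.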
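The substitution can be spelled out step by step. The following is a worked derivation using only the definitions of L, g and h and the three stationarity conditions given above. The `aligned` environment is just a presentation choice.

```latex
\begin{aligned}
L &= \frac{\|W\|^2}{2} + C\sum_i \xi_i
     + \sum_i \alpha_i \left(1 - \xi_i - y_i (W^T X_i + b)\right)
     - \sum_i \beta_i \xi_i \\
  &= \frac{\|W\|^2}{2} - W^T \sum_i \alpha_i y_i X_i
     - b \sum_i \alpha_i y_i + \sum_i \alpha_i
     + \sum_i (C - \alpha_i - \beta_i)\,\xi_i \\
  &= \frac{\|W\|^2}{2} - \|W\|^2 - 0 + \sum_i \alpha_i + 0 \\
  &= \sum_i \alpha_i - \frac{1}{2}\left\| \sum_i \alpha_i y_i X_i \right\|^2
\end{aligned}
```

The third line uses W = \sum_i \alpha_i y_i X_i, \sum_i \alpha_i y_i = 0 and C - \alpha_i - \beta_i = 0. Both b and \xi drop out, and \beta survives only through the bound \alpha_i \leq C.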

When \alpha_i = 0: \beta_i = C - \alpha_i = C > 0, so \beta_i h(\xi_i) = 0 \; \rightarrow \xi_i = 0, and therefore y_i(W^T X_i + b) \geq 1-\xi_i=1

When 0 < \alpha_i < C: 0 < \beta_i = C - \alpha_i < C \; \rightarrow \xi_i = 0, and \alpha_i > 0 forces g(X_i) = 0, so y_i(W^T X_i + b) = 1-\xi_i=1

When \alpha_i = C: \beta_i = C - \alpha_i = 0, so \xi_i \geq 0 is unrestricted by \beta_i h(\xi_i) = 0, and \alpha_i > 0 forces g(X_i) = 0, so y_i(W^T X_i + b) = 1-\xi_i \leq 1

That is,

\left\{\begin{matrix} y_i(W^T X_i + b) \geq 1 &,& \alpha_i = 0 \\ y_i(W^T X_i + b) = 1 &,& 0 < \alpha_i < C \\ y_i(W^T X_i + b) \leq 1 &,& \alpha_i = C \\ \end{matrix}\right.

So the final problem is

\min_{\alpha}\left \| \sum_{i}\alpha_i y_i X_i \right \|^2/2 - \sum_{i}\alpha_i \\ s.t. \;\; 0 \leq \alpha_i \leq C, \; \sum_{i}\alpha_i y_i = 0 \\ KKT: \forall i \left\{\begin{matrix} y_i(W^T X_i + b) \geq 1 &,& \alpha_i = 0 \\ y_i(W^T X_i + b) = 1 &,& 0 < \alpha_i < C \\ y_i(W^T X_i + b) \leq 1 &,& \alpha_i = C \\ \end{matrix}\right.
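As a rough illustration of the final problem, the sketch below evaluates the dual objective for a candidate \alpha and checks the three-case KKT condition for a single sample. The function names, the tolerance `tol`, and the two-point toy data are assumptions made for this example. This is not a solver, only a way to test a candidate solution.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def dual_objective(alpha, X, y):
    """||sum_i alpha_i y_i X_i||^2 / 2 - sum_i alpha_i (to be minimized)."""
    dim = len(X[0])
    W = [sum(a * label * row[k] for a, label, row in zip(alpha, y, X))
         for k in range(dim)]
    return 0.5 * dot(W, W) - sum(alpha)


def kkt_satisfied(alpha_i, margin_i, C, tol=1e-8):
    """Check the three-case KKT condition for one sample.

    margin_i stands for y_i * (W^T X_i + b); tol is a numerical
    tolerance (an assumption, not part of the derivation).
    """
    if alpha_i <= tol:                 # case alpha_i = 0
        return margin_i >= 1 - tol
    if alpha_i >= C - tol:             # case alpha_i = C
        return margin_i <= 1 + tol
    return abs(margin_i - 1) <= tol    # case 0 < alpha_i < C


# Hypothetical toy data: two 1-D points, one per class.
X = [[1.0], [-1.0]]
y = [1, -1]
alpha = [0.5, 0.5]
objective = dual_objective(alpha, X, y)
```

For this toy data \alpha = (0.5, 0.5) gives W = 1, and the dual objective is the negative of the primal value ||W||^2/2 = 0.5, as expected when the primal and dual optima coincide.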

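Once the optimal \alpha has been found, W and b can be recovered from

\left\{\begin{matrix} W = \sum_{i}\alpha_i y_i X_i \\ b = y_i - W^T X_i, \; \forall i \in \{i: 0 < \alpha_i < C\} \end{matrix}\right.

The expression for b follows from y_i(W^T X_i + b) = 1 together with y_i \in \{-1, +1\}, which gives 1/y_i = y_i.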
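The recovery step can be sketched in a few lines of pure Python. The name `recover_w_b` and the tolerance `tol` are hypothetical. Averaging b over all samples with 0 < \alpha_i < C is a common numerical safeguard and goes beyond the text, which only requires any one such sample.

```python
def recover_w_b(alpha, X, y, C, tol=1e-8):
    """Recover (W, b) from an optimal dual solution alpha.

    W = sum_i alpha_i y_i X_i, and b = y_i - W^T X_i for samples with
    0 < alpha_i < C. Averaging over those samples is an added
    assumption for numerical stability.
    """
    dim = len(X[0])
    W = [sum(a * label * row[k] for a, label, row in zip(alpha, y, X))
         for k in range(dim)]

    free = [i for i, a in enumerate(alpha) if tol < a < C - tol]
    if not free:
        raise ValueError("no sample with 0 < alpha_i < C; "
                         "b cannot be read off this way")

    b_values = [y[i] - sum(wk * xk for wk, xk in zip(W, X[i])) for i in free]
    return W, sum(b_values) / len(b_values)


# Hypothetical toy data and its dual solution for C = 1.
X = [[1.0], [-1.0]]
y = [1, -1]
W, b = recover_w_b([0.5, 0.5], X, y, C=1.0)
```

If every \alpha_i is exactly 0 or C, no sample satisfies the strict inequality. The sketch raises an error in that case rather than guessing a value for b.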

 
