[Machine Learning Notes] Chapter 10: Support Vector Machines

Keveonnn

Category: Machine Learning. Tags: machine learning, support vector machine

Article link: https://blog.csdn.net/qq_45474860/article/details/104781559

Chapter 10: Support Vector Machines

10.1 Cost Function

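A support vector machine (SVM) can be used for both classification and regression. An SVM maps the input vectors into a (possibly higher-dimensional) space and builds a maximum-margin separating hyperplane there. Two parallel hyperplanes, one on each side of the separating hyperplane, bound the two classes, and the separating hyperplane is chosen so that its distance to these two margin hyperplanes is as large as possible.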

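In logistic regression, the cost function is: $J(\theta) = -\frac{1}{m}\sum_{i=1}^m[y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))]+\frac{\lambda}{2m}\sum_{j=1}^n\theta_j^2$ For a single example, the cost reduces to $-\log(h_\theta(x))$ when $y = 1$ and to $-\log(1-h_\theta(x))$ when $y = 0$. (Figure: per-example cost curves for $y = 1$ and $y = 0$.)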
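The formula above can be evaluated directly. The following is a minimal sketch, not the author's code: the function names `sigmoid` and `logistic_cost` are illustrative, and it assumes each row of `X` starts with a constant 1.0 so that `theta[0]` acts as the bias, which the regularization sum (starting at $j=1$) leaves out.

```python
import math

def sigmoid(z):
    """Logistic function h = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_cost(theta, X, y, lam):
    """Regularized logistic-regression cost J(theta).

    theta: list of n+1 weights; theta[0] is the bias (assumed, not regularized).
    X: list of m rows, each starting with 1.0 for the bias term.
    y: list of m labels in {0, 1}.
    lam: regularization strength lambda.
    """
    m = len(y)
    data_term = 0.0
    for x_i, y_i in zip(X, y):
        h = sigmoid(sum(t * x for t, x in zip(theta, x_i)))
        data_term += y_i * math.log(h) + (1 - y_i) * math.log(1 - h)
    # The sum over j = 1..n skips theta[0].
    reg_term = lam / (2 * m) * sum(t * t for t in theta[1:])
    return -data_term / m + reg_term
```

With all weights at zero every prediction is 0.5, so the cost equals $\log 2$ regardless of the labels, which is a convenient sanity check.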

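Modifying it gives the SVM cost function: $J(\theta) =C\sum_{i=1}^m[y^{(i)}\text{cost}_1(\theta^Tx^{(i)})+(1-y^{(i)})\text{cost}_0(\theta^Tx^{(i)})]+\frac{1}{2}\sum_{j=1}^n\theta_j^2$ Here $C$ is a weighting coefficient that plays a role similar to $\frac{1}{\lambda}$, and $\text{cost}_1(\theta^Tx)$ and $\text{cost}_0(\theta^Tx)$ are piecewise-linear replacements for the two logistic curves: $\text{cost}_1(z)$ is zero for $z \geq 1$ and $\text{cost}_0(z)$ is zero for $z \leq -1$. (Figure: plots of $\text{cost}_1$ and $\text{cost}_0$.)
To drive the first term of the cost to zero, the parameters should therefore satisfy:
$\begin{cases} if \ y=1\Rightarrow \theta^Tx\geq 1 \\ if \ y=0\Rightarrow \theta^Tx\leq -1 \end{cases}$
Note: if $C$ is too large, minimizing the cost pushes the first (data-fit) term toward zero at the expense of a larger second (regularization) term, which tends to overfit (the decision boundary becomes sensitive to outliers).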
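As a rough illustration of the SVM cost, here is a small sketch. The text fixes only where $\text{cost}_1$ and $\text{cost}_0$ reach zero, so the unit-slope hinge shapes `max(0, 1 - z)` and `max(0, 1 + z)` used below are an assumption, as are the function names and the convention that each row of `X` starts with 1.0 for the bias `theta[0]`.

```python
def cost_1(z):
    """Cost for a positive example (y = 1): zero once z >= 1 (assumed hinge shape)."""
    return max(0.0, 1.0 - z)

def cost_0(z):
    """Cost for a negative example (y = 0): zero once z <= -1 (assumed hinge shape)."""
    return max(0.0, 1.0 + z)

def svm_cost(theta, X, y, C):
    """SVM cost: C * (sum of per-example costs) + 0.5 * sum_{j>=1} theta_j^2."""
    data_term = 0.0
    for x_i, y_i in zip(X, y):
        z = sum(t * x for t, x in zip(theta, x_i))
        data_term += y_i * cost_1(z) + (1 - y_i) * cost_0(z)
    reg_term = 0.5 * sum(t * t for t in theta[1:])  # theta[0] is not regularized
    return C * data_term + reg_term
```

An example with $y=1$ and $\theta^Tx = 0.5$ is classified correctly but still costs 0.5, because it falls inside the margin; only examples with $\theta^Tx \geq 1$ (or $\leq -1$ for $y=0$) contribute nothing.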

10.2 Hypothesis

The SVM hypothesis is: $h_\theta(x)=\begin{cases} 1,\ if \ \theta^Tx \geq 0 \\ 0,\ else \end{cases}$
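Unlike logistic regression, this hypothesis outputs a hard label rather than a probability. A minimal sketch, with `svm_predict` as an illustrative name and `x` assumed to carry a leading 1.0 for the bias:

```python
def svm_predict(theta, x):
    """Return 1 if theta^T x >= 0, otherwise 0."""
    z = sum(t * x_j for t, x_j in zip(theta, x))
    return 1 if z >= 0 else 0
```

Note that the training cost asks for $\theta^Tx \geq 1$ or $\theta^Tx \leq -1$, while prediction only checks the sign of $\theta^Tx$.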

10.3 Norm Representation

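Let $u=\begin{bmatrix} u_1 \\ u_2 \end{bmatrix},v=\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ be two vectors. (Figure: $u$, $v$, and the projection of $v$ onto $u$.)
$\Vert u\Vert=\sqrt{u_1^2+u_2^2}$ is the norm of $u$, that is, its length. $p$ is the signed length of the projection of $v$ onto $u$ (negative when the angle between them exceeds 90°). They satisfy $u^Tv=u_1v_1+u_2v_2=p \cdot\Vert u \Vert$ .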
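The identity $u^Tv=p \cdot\Vert u \Vert$ is easy to check numerically. The sketch below uses illustrative helper names and example vectors that are not from the original notes.

```python
import math

def norm(u):
    """Euclidean length of u."""
    return math.sqrt(sum(c * c for c in u))

def dot(u, v):
    """Inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))

def signed_projection(v, u):
    """Signed length p of the projection of v onto u."""
    return dot(u, v) / norm(u)

u = [3.0, 4.0]            # ||u|| = 5
v = [2.0, 1.0]            # u^T v = 10
p = signed_projection(v, u)
print(p, p * norm(u))     # 2.0 10.0
```

Flipping the sign of `v` gives `p = -2.0`, which is the case where the two vectors point in opposite half-planes.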

The second term of the SVM cost function can therefore be written as: $\frac{1}{2}\sum_{j=1}^n\theta_j^2=\frac{1}{2}(\theta_1^2+\theta_2^2+\cdots+\theta_n^2)=\frac{1}{2}(\sqrt{\theta_1^2+\theta_2^2+\cdots+\theta_n^2})^2=\frac{1}{2}\Vert \theta \Vert^2$

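Since $\theta^Tx=p \cdot\Vert \theta \Vert$, where $p$ is the signed projection of $x$ onto $\theta$, the requirements on the cost function become: $\begin{cases} if \ y=1\Rightarrow p \cdot\Vert \theta \Vert\geq 1 \\ if \ y=0\Rightarrow p \cdot\Vert \theta \Vert\leq -1 \end{cases}$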
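These constraints explain why minimizing $\frac{1}{2}\Vert \theta \Vert^2$ favors a large margin: the larger the projection $|p|$ of an example onto $\theta$, the smaller $\Vert \theta \Vert$ can be while still satisfying $|p| \cdot \Vert \theta \Vert \geq 1$. A small sketch of that trade-off follows; the helper name is hypothetical, and it assumes the bias $\theta_0$ is zero so that $\theta^Tx=p \cdot\Vert \theta \Vert$ holds exactly.

```python
def min_theta_norm(p):
    """Smallest ||theta|| that satisfies |p| * ||theta|| >= 1 for projection p."""
    if p == 0:
        raise ValueError("an example with zero projection cannot satisfy the constraint")
    return 1.0 / abs(p)

for p in (2.0, 0.5, 0.1):
    needed = min_theta_norm(p)
    # The regularization term paid for this norm is 0.5 * ||theta||^2.
    print(p, needed, 0.5 * needed ** 2)
```

Examples that lie far from the boundary (large $|p|$) allow a small $\Vert \theta \Vert$, so the optimizer prefers the direction of $\theta$ that keeps all projections large.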

10.4 Gaussian Kernel

A linear SVM computes $\theta^Tx$. If we instead compute $\theta^Tf$, we get an SVM with a Gaussian kernel, where the features $f$ are defined as: $f_1=\text{similarity}(x,l^{(1)})=\exp(-\frac{\Vert x-l^{(1)} \Vert^2}{2\sigma^2})$ $f_2=\text{similarity}(x,l^{(2)})=\exp(-\frac{\Vert x-l^{(2)} \Vert^2}{2\sigma^2})$ $\cdots$ Here $l^{(1)},l^{(2)},\cdots,l^{(m)}$ are called landmarks, and each landmark is placed at the location of one training example. It follows that:

If $x$ is very close to $l$ $\Rightarrow f \approx \exp(0)=1$
If $x$ is very far from $l$ $\Rightarrow f \approx \exp(-\text{large number})\approx0$
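The two limiting cases can be checked with a direct implementation of the similarity function. This is a minimal sketch; `gaussian_similarity` is an illustrative name and the sample points are made up.

```python
import math

def gaussian_similarity(x, l, sigma):
    """Gaussian kernel exp(-||x - l||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, l))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

landmark = [1.0, 2.0]
print(gaussian_similarity([1.0, 2.0], landmark, 1.0))    # identical point: 1.0
print(gaussian_similarity([1.1, 2.0], landmark, 1.0))    # nearby point: close to 1
print(gaussian_similarity([9.0, 9.0], landmark, 1.0))    # distant point: close to 0
```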

The workflow of an SVM with a Gaussian kernel is:

1. Take the training set $(x^{(1)},y^{(1)}),\cdots,(x^{(m)},y^{(m)})$ ;
2. Set the landmarks $l^{(1)}=x^{(1)},\cdots,l^{(m)}=x^{(m)}$ ;
3. For a test example $x$ , compute $f=\begin{bmatrix} f_1 \\ \vdots \\ f_m \end{bmatrix}$ ;
4. Predict $y=1$ if $\theta^Tf\geq 0$ and $y=0$ if $\theta^Tf< 0$ .
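The prediction steps above can be sketched as follows. This only illustrates the feature mapping and the decision rule: the weights `theta` are hand-picked for the example rather than learned, the function names are hypothetical, and no bias feature is added because $f$ is defined above with exactly $m$ entries.

```python
import math

def gaussian_similarity(x, l, sigma):
    """Gaussian kernel exp(-||x - l||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, l))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_features(x, landmarks, sigma):
    """Map x to f = [f_1, ..., f_m], one similarity per landmark."""
    return [gaussian_similarity(x, l, sigma) for l in landmarks]

def kernel_svm_predict(theta, x, landmarks, sigma):
    """Predict 1 if theta^T f >= 0, otherwise 0."""
    f = kernel_features(x, landmarks, sigma)
    z = sum(t * f_j for t, f_j in zip(theta, f))
    return 1 if z >= 0 else 0

# Toy training set; the landmarks are the training inputs themselves.
X_train = [[0.0, 0.0], [5.0, 5.0]]
y_train = [1, 0]
landmarks = X_train
theta = [1.0, -1.0]  # hypothetical weights: +1 for the positive landmark, -1 for the negative

print(kernel_svm_predict(theta, [0.2, 0.1], landmarks, sigma=1.0))  # near the positive landmark
print(kernel_svm_predict(theta, [4.8, 5.1], landmarks, sigma=1.0))  # near the negative landmark
```

In practice `theta` would come from minimizing the kernelized cost function, typically with an existing SVM library rather than hand-written code.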

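The cost function then becomes: $J(\theta) =C\sum_{i=1}^m[y^{(i)}\text{cost}_1(\theta^Tf^{(i)})+(1-y^{(i)})\text{cost}_0(\theta^Tf^{(i)})]+\frac{1}{2}\sum_{j=1}^n\theta_j^2$ Here $n=m$, because there is one feature per landmark.
Note: a large $\sigma$ makes the features vary smoothly and tends to underfit (high bias); a small $\sigma$ makes them vary sharply and tends to overfit (high variance).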
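The effect of $\sigma$ can be seen by holding the distance between $x$ and a landmark fixed and varying only $\sigma$. The sketch below uses made-up values and a hypothetical helper `similarity_by_sigma`.

```python
import math

def gaussian_similarity(x, l, sigma):
    """Gaussian kernel exp(-||x - l||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, l))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def similarity_by_sigma(x, l, sigmas):
    """Similarity of the same pair (x, l) under several sigma values."""
    return {s: gaussian_similarity(x, l, s) for s in sigmas}

# x is at distance 1 from the landmark in every case.
results = similarity_by_sigma([1.0, 0.0], [0.0, 0.0], [0.3, 1.0, 3.0])
for sigma, f in results.items():
    print(sigma, round(f, 4))
```

With a small $\sigma$ the feature drops to nearly zero at distance 1, so each landmark influences only its immediate neighborhood; with a large $\sigma$ the feature stays close to 1 and distant landmarks all look alike.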