Machine Learning Notes: Support Vector Machines (II) Kernel Functions

王先生的副业


Article link: https://blog.csdn.net/uncle_gy/article/details/78959421

Mapping from a low-dimensional space to a high-dimensional space

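The XOR problem is not linearly separable in its original space, but it becomes linearly separable once the samples are mapped into a higher-dimensional space.
Let $\phi(\mathbf{x})$ denote the feature vector obtained by mapping $\mathbf{x}$ into that space. The model corresponding to a separating hyperplane in the feature space can then be written as: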

$f(\mathbf{x})=\mathbf{w}^T\phi(\mathbf{x})+b$
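To make the XOR claim concrete, here is a minimal sketch. The feature map $\phi(x_1,x_2)=(x_1,x_2,x_1x_2)$ and the values of `w` and `b` are illustrative assumptions chosen by hand, not something given in these notes; other mappings would work as well.

```python
import numpy as np

# XOR data: label +1 when exactly one input is 1, otherwise -1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, 1, 1, -1])

def phi(x):
    """Illustrative feature map (an assumption): append the product x1*x2."""
    return np.array([x[0], x[1], x[0] * x[1]])

# Hand-picked hyperplane in the 3-D feature space (also an assumption).
w = np.array([1.0, 1.0, -2.0])
b = -0.5

def f(x):
    """Linear model in feature space: f(x) = w^T phi(x) + b."""
    return float(w @ phi(x) + b)

# The sign of f(x) reproduces the XOR labels for all four points.
predictions = np.array([int(np.sign(f(x))) for x in X])
print(predictions)
```

No single line in the original two-dimensional plane separates these four points, whereas one plane in the three-dimensional feature space does.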
The primal problem can then be written as:

$\begin{aligned} &\min_{\mathbf{w},b}\ \dfrac{1}{2}\|\mathbf{w}\|^2\\ &\text{s.t.}\ \ y_i\left(\mathbf{w}^T\phi(\mathbf{x}_i)+b\right)\geq1,\quad i=1,2,\dots,m. \end{aligned}$
Its dual problem is:

$\begin{aligned} \mathop{\max}_{\alpha}\ &\sum_{i=1}^{m}\alpha_i-\dfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_iy_i\alpha_jy_j\phi(\mathbf{x}_i)^T\phi(\mathbf{x}_j)\\ \text{s.t.}\ &\sum_{i=1}^{m}\alpha_iy_i=0,\\ &\alpha_i\geq0,\quad i=1,2,\dots,m. \end{aligned}$
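Solving the dual requires computing $\phi(\mathbf{x}_i)^T\phi(\mathbf{x}_j)$, the inner product of the samples $\mathbf{x}_i$ and $\mathbf{x}_j$ after they are mapped into the feature space. Because the feature space may have very high, possibly even infinite, dimension, computing $\phi(\mathbf{x}_i)^T\phi(\mathbf{x}_j)$ directly is usually very difficult.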

Kernel functions

To sidestep this obstacle, suppose there is a function such that

$\mathcal{k}(\mathbf{x}_i,\mathbf{x}_j)=\left\langle\phi(\mathbf{x}_i),\phi(\mathbf{x}_j)\right\rangle=\phi(\mathbf{x}_i)^T\phi(\mathbf{x}_j)$
That is, the inner product of $\mathbf{x}_i$ and $\mathbf{x}_j$ in the feature space equals the value computed from them in the original sample space by the function $\mathcal{k}(\cdot,\cdot)$.
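A small numerical check may help here. The sketch below uses the degree-2 polynomial kernel $(\mathbf{x}^T\mathbf{z})^2$ on two-dimensional inputs, for which an explicit feature map $\phi(\mathbf{x})=(x_1^2,\sqrt{2}x_1x_2,x_2^2)$ is known; the sample vectors are arbitrary values picked for illustration.

```python
import numpy as np

def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel on 2-D inputs."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

def k(x, z):
    """Degree-2 polynomial kernel, evaluated in the original 2-D space."""
    return float(x @ z) ** 2

# Arbitrary sample vectors (illustrative values).
x_i = np.array([1.0, 2.0])
x_j = np.array([3.0, -1.0])

explicit = float(phi(x_i) @ phi(x_j))  # inner product in the feature space
via_kernel = k(x_i, x_j)               # same quantity, without computing phi

print(explicit, via_kernel)
```

Both routes give the same number, but the kernel route never constructs $\phi$, which is what matters when the feature space is very large.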
The dual problem can then be rewritten as:

$\begin{aligned} \mathop{\max}_{\alpha}\ &\sum_{i=1}^{m}\alpha_i-\dfrac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_iy_i\alpha_jy_j\mathcal{k}(\mathbf{x}_i,\mathbf{x}_j)\\ \text{s.t.}\ &\sum_{i=1}^{m}\alpha_iy_i=0,\\ &\alpha_i\geq0,\quad i=1,2,\dots,m. \end{aligned}$
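Written in terms of the kernel matrix, the dual objective and its constraints take only a few lines to evaluate. The following is a minimal sketch that only evaluates the objective for a given $\alpha$ and does not solve the optimization; the helper names `dual_objective` and `is_feasible` and the one-dimensional toy data are assumptions made for illustration.

```python
import numpy as np

def dual_objective(alpha, y, K):
    """Value of sum_i alpha_i - 1/2 sum_ij alpha_i y_i alpha_j y_j K_ij."""
    v = alpha * y
    return float(alpha.sum() - 0.5 * v @ K @ v)

def is_feasible(alpha, y, tol=1e-9):
    """Check the constraints sum_i alpha_i y_i = 0 and alpha_i >= 0."""
    return bool(abs(alpha @ y) <= tol and np.all(alpha >= -tol))

# Toy 1-D data (illustrative): x = 0 labelled -1, x = 2 labelled +1.
X = np.array([[0.0], [2.0]])
y = np.array([-1.0, 1.0])
K = X @ X.T  # kernel matrix of the linear kernel

# For this toy problem the maximum-margin solution is w = 1, b = -1,
# which corresponds to alpha = (0.5, 0.5).
alpha = np.array([0.5, 0.5])

print(is_feasible(alpha, y), dual_objective(alpha, y, K))
```

For this toy problem the dual value equals the primal value $\frac{1}{2}\|\mathbf{w}\|^2=0.5$, as expected at the optimum.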
Solving it yields:

$\begin{aligned} f(\mathbf{x})&=\mathbf{w}^T\phi(\mathbf{x})+b\\ &=\sum_{i=1}^{m}\alpha_iy_i\phi(\mathbf{x}_i)^T\phi(\mathbf{x})+b\\ &=\sum_{i=1}^{m}\alpha_iy_i\mathcal{k}(\mathbf{x},\mathbf{x}_i)+b\\ \end{aligned}\tag{A}$

Definition:

The function $\mathcal{k}(\cdot,\cdot)$ here is the "kernel function", and the expansion (A) is the "support vector expansion".
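The support vector expansion translates directly into code: the prediction for a new point is a weighted sum of kernel evaluations against the training samples. The sketch below assumes the multipliers $\alpha_i$ and the offset $b$ are already known; the toy values and the function names are illustrative assumptions.

```python
import numpy as np

def linear_kernel(x, z):
    return float(x @ z)

def decision_function(x, X_train, y_train, alpha, b, kernel):
    """Support vector expansion: f(x) = sum_i alpha_i y_i k(x, x_i) + b."""
    total = 0.0
    for a_i, y_i, x_i in zip(alpha, y_train, X_train):
        if a_i > 0:  # samples with alpha_i = 0 contribute nothing
            total += a_i * y_i * kernel(x, x_i)
    return total + b

# Toy 1-D problem (illustrative values): x = 0 labelled -1, x = 2 labelled +1.
X_train = np.array([[0.0], [2.0]])
y_train = np.array([-1.0, 1.0])
alpha = np.array([0.5, 0.5])
b = -1.0

f_at_3 = decision_function(np.array([3.0]), X_train, y_train, alpha, b, linear_kernel)
print(f_at_3)
```

Swapping `linear_kernel` for any other kernel changes the decision boundary without changing the structure of the computation.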

Kernel function theorem

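Let $\chi$ be the input space and let $\mathcal{k}(\cdot,\cdot)$ be a symmetric function defined on $\chi\times\chi$. Then $\mathcal{k}$ is a kernel function if and only if, for any data set $D=\left\{\mathbf{x}_1,\mathbf{x}_2,\dots,\mathbf{x}_m\right\}$, the kernel matrix $\mathbf{K}$ is always positive semidefinite: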

$\mathbf{K}=\left[ \begin{matrix} \mathcal{k}(\mathbf{x}_1,\mathbf{x}_1)&\dots&\mathcal{k}(\mathbf{x}_1,\mathbf{x}_j)&\dots&\mathcal{k}(\mathbf{x}_1,\mathbf{x}_m)\\ \vdots&\ddots&\vdots&\ddots&\vdots\\ \mathcal{k}(\mathbf{x}_i,\mathbf{x}_1)&\dots&\mathcal{k}(\mathbf{x}_i,\mathbf{x}_j)&\dots&\mathcal{k}(\mathbf{x}_i,\mathbf{x}_m)\\ \vdots&\ddots&\vdots&\ddots&\vdots\\ \mathcal{k}(\mathbf{x}_m,\mathbf{x}_1)&\dots&\mathcal{k}(\mathbf{x}_m,\mathbf{x}_j)&\dots&\mathcal{k}(\mathbf{x}_m,\mathbf{x}_m)\\ \end{matrix} \right]$
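The condition in the theorem can be spot-checked numerically by building the kernel matrix on some sample and inspecting its eigenvalues. This is a sketch only: the theorem quantifies over every data set, so passing on one sample is a necessary check, not a proof. The random sample, the tolerance, and the counterexample `neg_distance` are assumptions chosen for illustration.

```python
import numpy as np

def kernel_matrix(X, kernel):
    """Build K with K[i, j] = kernel(x_i, x_j)."""
    m = len(X)
    return np.array([[kernel(X[i], X[j]) for j in range(m)] for i in range(m)])

def is_psd(K, tol=1e-10):
    """A symmetric matrix is PSD iff its smallest eigenvalue is >= 0 (up to tol)."""
    return bool(np.allclose(K, K.T) and np.linalg.eigvalsh(K).min() >= -tol)

def gaussian(x, z):
    return np.exp(-np.sum((x - z) ** 2) / 2.0)

def neg_distance(x, z):
    """Symmetric, but not a kernel: its matrix has zero trace and is nonzero."""
    return -np.linalg.norm(x - z)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))  # arbitrary sample (an assumption)

K_good = kernel_matrix(X, gaussian)
K_bad = kernel_matrix(X, neg_distance)
print(is_psd(K_good), is_psd(K_bad))
```

The second function shows why symmetry alone is not enough: a nonzero symmetric matrix with zero trace must have a negative eigenvalue.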

Common kernel functions

Name	Expression	Parameters
Linear kernel	$\mathcal{k}(\mathbf{x}_i,\mathbf{x}_j)=\mathbf{x}_i^T\mathbf{x}_j$	none
Polynomial kernel	$\mathcal{k}(\mathbf{x}_i,\mathbf{x}_j)=\left(\mathbf{x}_i^T\mathbf{x}_j\right)^d$	$d\geq1$ is the degree of the polynomial
Gaussian kernel	$\mathcal{k}(\mathbf{x}_i,\mathbf{x}_j)=\exp\left(-\dfrac{\|\mathbf{x}_i-\mathbf{x}_j\|^2}{2\delta^2}\right)$	$\delta\gt0$ is the bandwidth of the Gaussian kernel
Laplacian kernel	$\mathcal{k}(\mathbf{x}_i,\mathbf{x}_j)=\exp\left(-\dfrac{\|\mathbf{x}_i-\mathbf{x}_j\|}{\delta}\right)$	$\delta\gt0$
Sigmoid kernel	$\mathcal{k}(\mathbf{x}_i,\mathbf{x}_j)=\tanh\left(\beta\mathbf{x}_i^T\mathbf{x}_j+\theta\right)$	$\tanh$ is the hyperbolic tangent function, $\beta\gt0,\theta\lt0$
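The table entries can be written as plain functions of two vectors. The sketch below follows the expressions in the table; the function names and the default parameter values are assumptions picked only so the example runs.

```python
import numpy as np

def linear_kernel(x, z):
    return float(x @ z)

def polynomial_kernel(x, z, d=2):
    """(x^T z)^d with degree d >= 1."""
    return float(x @ z) ** d

def gaussian_kernel(x, z, delta=1.0):
    """exp(-||x - z||^2 / (2 delta^2)) with bandwidth delta > 0."""
    return float(np.exp(-np.sum((x - z) ** 2) / (2.0 * delta ** 2)))

def laplacian_kernel(x, z, delta=1.0):
    """exp(-||x - z|| / delta) with delta > 0."""
    return float(np.exp(-np.linalg.norm(x - z) / delta))

def sigmoid_kernel(x, z, beta=1.0, theta=-1.0):
    """tanh(beta x^T z + theta) with beta > 0 and theta < 0."""
    return float(np.tanh(beta * (x @ z) + theta))

# Arbitrary sample vectors (illustrative values).
x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

for kernel in (linear_kernel, polynomial_kernel, gaussian_kernel,
               laplacian_kernel, sigmoid_kernel):
    print(kernel.__name__, kernel(x, z))
```

The Gaussian and Laplacian kernels depend only on the distance between the two points, so each returns 1 when both arguments are the same vector.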

Combining kernel functions

If $\mathcal{k}_1$ and $\mathcal{k}_2$ are kernel functions, then each of the following three combinations is also a kernel function.
Linear combination, for any positive numbers $\gamma_1$ and $\gamma_2$:

$\gamma_1\mathcal{k}_1+\gamma_2\mathcal{k}_2$
Direct product of kernel functions:

$\mathcal{k}_1\otimes\mathcal{k}_2(\mathbf{x},\mathbf{z})=\mathcal{k}_1(\mathbf{x},\mathbf{z})\mathcal{k}_2(\mathbf{x},\mathbf{z})$
For any function $g(\mathbf{x})$:

$\mathcal{k}(\mathbf{x},\mathbf{z})=g(\mathbf{x})\mathcal{k}_1(\mathbf{x},\mathbf{z})g(\mathbf{z})$
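These closure properties can be spot-checked on a sample by confirming that each combined kernel matrix has no negative eigenvalue beyond rounding error. This is a sketch under stated assumptions: the random sample, the weights 0.5 and 2, and the function `g` are all arbitrary choices for illustration, and one sample cannot prove the general statement.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))  # arbitrary sample points (an assumption)

# Kernel matrices of two known kernels on X.
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K1 = np.exp(-sq_dists / 2.0)  # Gaussian kernel with delta = 1
K2 = X @ X.T                  # linear kernel

# Hypothetical function g, evaluated at every sample point.
g = 1.0 + X[:, 0] ** 2

combos = {
    "linear combination": 0.5 * K1 + 2.0 * K2,
    "direct product": K1 * K2,  # elementwise: k1(x, z) * k2(x, z)
    "g(x) k1(x, z) g(z)": g[:, None] * K1 * g[None, :],
}

def min_eigenvalue(K):
    """Smallest eigenvalue of a symmetric matrix."""
    return float(np.linalg.eigvalsh(K).min())

for name, K in combos.items():
    print(f"{name}: smallest eigenvalue = {min_eigenvalue(K):.3e}")
```

Values that are zero up to rounding can show up slightly negative in floating point, which is why a small tolerance is appropriate when interpreting the output.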
