SVR (support vector regression) is the application of SVM (support vector machine) to regression problems.
1. Principle
Given training samples $x_{i} \in \mathbb{R}^{p}, i=1, \ldots, n$ and targets $y \in \mathbb{R}^{n}$, SVR solves the primal problem:
$$\begin{array}{c}{\min _{w, b, \zeta, \zeta^{*}} \frac{1}{2} w^{T} w+C \sum_{i=1}^{n}\left(\zeta_{i}+\zeta_{i}^{*}\right)} \\ {\text { subject to } y_{i}-w^{T} \phi\left(x_{i}\right)-b \leq \varepsilon+\zeta_{i}} \\ {w^{T} \phi\left(x_{i}\right)+b-y_{i} \leq \varepsilon+\zeta_{i}^{*}} \\ {\zeta_{i}, \zeta_{i}^{*} \geq 0, i=1, \ldots, n}\end{array}$$
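The slack variables $\zeta_{i}, \zeta_{i}^{*}$ penalize points that fall outside the $\varepsilon$-tube around the prediction, which corresponds to the $\varepsilon$-insensitive loss $\max(0, |y - f(x)| - \varepsilon)$. A minimal sketch of that loss (the helper name `eps_insensitive_loss` is ours, not a library function):

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    # errors within the epsilon-tube incur no penalty;
    # outside the tube the loss grows linearly
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

# |1.0 - 1.05| = 0.05 is inside the tube -> loss 0
# |2.0 - 2.5|  = 0.5 exceeds eps by 0.4 -> loss 0.4
print(eps_insensitive_loss(np.array([1.0, 2.0]),
                           np.array([1.05, 2.5]), eps=0.1))  # → [0.  0.4]
```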
The dual form:
$$\begin{array}{c}{\min _{\alpha, \alpha^{*}} \frac{1}{2}\left(\alpha-\alpha^{*}\right)^{T} Q\left(\alpha-\alpha^{*}\right)+\varepsilon e^{T}\left(\alpha+\alpha^{*}\right)-y^{T}\left(\alpha-\alpha^{*}\right)} \\ {\text { subject to } e^{T}\left(\alpha-\alpha^{*}\right)=0} \\ {0 \leq \alpha_{i}, \alpha_{i}^{*} \leq C, i=1, \ldots, n}\end{array}$$
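Here $e$ is the all-ones vector and $Q$ is the kernel matrix built from the feature map $\phi$. Once the dual is solved, prediction needs only kernel evaluations, never $\phi$ itself:

```latex
Q_{ij} = K(x_i, x_j) = \phi(x_i)^{T} \phi(x_j), \qquad
f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^{*})\, K(x_i, x) + b
```

Only samples with $\alpha_i - \alpha_i^{*} \neq 0$ (the support vectors) contribute to the sum.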
2. SVR in practice
# -*- coding: utf-8 -*-
'''
Created on May 4, 2016
@author: Gamer Think
'''
import numpy as np
from sklearn.svm import SVR
import matplotlib.pyplot as plt
###############################################################################
# Generate sample data
X = np.sort(5 * np.random.rand(40, 1), axis=0)  # 40 samples with one feature each; axis=0 sorts along the column so the samples are ordered
y = np.sin(X).ravel()  # np.sin(X) has the same column shape as X; ravel() flattens it to a 1-D target array
###############################################################################
# Add noise to targets
y[::5] += 3 * (0.5 - np.random.rand(8))
###############################################################################
# Fit regression model
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
svr_lin = SVR(kernel='linear', C=1e3)
svr_poly = SVR(kernel='poly', C=1e3, degree=2)
y_rbf = svr_rbf.fit(X, y).predict(X)
y_lin = svr_lin.fit(X, y).predict(X)
y_poly = svr_poly.fit(X, y).predict(X)
###############################################################################
# look at the results
lw = 2
plt.scatter(X, y, color='darkorange', label='data')
plt.plot(X, y_rbf, color='navy', lw=lw, label='RBF model')
plt.plot(X, y_lin, color='c', lw=lw, label='Linear model')
plt.plot(X, y_poly, color='cornflowerblue', lw=lw, label='Polynomial model')
plt.xlabel('data')
plt.ylabel('target')
plt.title('Support Vector Regression')
plt.legend()
plt.show()
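The script above hard-codes C=1e3 and gamma=0.1; in practice these hyperparameters (together with epsilon) are usually tuned by cross-validation. A hedged sketch using scikit-learn's GridSearchCV (the grid values below are illustrative, not recommendations):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

# same kind of toy data as above, with a fixed seed for repeatability
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(40, 1), axis=0)
y = np.sin(X).ravel()

# illustrative grid; suitable ranges depend on the data scale
param_grid = {'C': [1, 100, 1000],
              'gamma': [0.01, 0.1, 1.0],
              'epsilon': [0.01, 0.1]}
search = GridSearchCV(SVR(kernel='rbf'), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

`search.best_estimator_` is then an SVR refit on all the data with the winning parameters.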
2.1 Common kernels
Linear kernel:
$$\left\langle x, x^{\prime}\right\rangle$$
Polynomial kernel:
$$\left(\gamma\left\langle x, x^{\prime}\right\rangle+r\right)^{d}$$
RBF kernel:
$$\exp \left(-\gamma\left\|x-x^{\prime}\right\|^{2}\right)$$
Sigmoid kernel:
$$\tanh \left(\gamma\left\langle x, x^{\prime}\right\rangle+r\right)$$
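As a sanity check, the RBF formula can be computed by hand with NumPy broadcasting and compared against scikit-learn's `rbf_kernel`:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

X = np.array([[0.0], [1.0], [2.0]])
gamma = 0.5

# pairwise differences via broadcasting: shape (3, 3, 1)
diff = X[:, None, :] - X[None, :, :]
# exp(-gamma * ||x - x'||^2), summing squared differences over features
manual = np.exp(-gamma * (diff ** 2).sum(axis=2))

print(np.allclose(manual, rbf_kernel(X, X, gamma=gamma)))  # → True
```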