Machine Learning 3: Support Vector Machines (SVM)
Preface:
These are beginner's notes, not an authoritative reference; corrections are welcome.
1. Theory
SVM aims to find a separating hyperplane in the sample space that divides the training samples into groups according to some rule, thereby performing classification.
In theory, SVMs come in three kinds:
- Linearly separable SVM (hard-margin SVM): the separating hyperplane is found by hard margin maximization.
- Linear SVM (soft-margin SVM): the separating hyperplane is found by soft margin maximization.
- Nonlinear SVM: obtained by introducing a kernel function.
Linearly separable SVM:
A linearly separable SVM applies when the two classes can be completely separated by a linear hyperplane with no misclassified samples, as in the figure below. In this situation many separating hyperplanes exist, and we need to select the optimal, most reasonable one among them.
- Given a training set:
$$D=\left\{\left(\boldsymbol{x}_{1}, y_{1}\right),\left(\boldsymbol{x}_{2}, y_{2}\right), \ldots,\left(\boldsymbol{x}_{m}, y_{m}\right)\right\},\quad y_{i} \in\{-1,+1\}$$
- Suppose the separating hyperplane is:
$$\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}+b=0$$
- Classification decision function:
$$y(\boldsymbol{x})=\operatorname{sign}(f(\boldsymbol{x}))=\operatorname{sign}(\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}+b)$$
- Objective function:
The hyperplane is found by maximizing the hard margin. The margin is the sum, over the two classes, of the minimum distance from a class to the hyperplane; the samples closest to the hyperplane are called support vectors. "Hard" means zero tolerance for classification errors, which corresponds to the linearly separable case.
The distance from a sample point to the separating hyperplane is:
$$r=\frac{\left|\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}+b\right|}{\|\boldsymbol{w}\|}$$
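As a quick sanity check, the distance formula can be evaluated numerically. The hyperplane and point below are made up for illustration:

```python
import numpy as np

w = np.array([3.0, 4.0])   # hypothetical hyperplane: 3*x1 + 4*x2 - 10 = 0
b = -10.0
x = np.array([4.0, 2.0])   # a sample point

r = abs(w @ x + b) / np.linalg.norm(w)  # |w^T x + b| / ||w||
print(r)  # |12 + 8 - 10| / 5 = 2.0
```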
Since the two classes are linearly separable, we can rescale $\boldsymbol{w}$ and $b$ proportionally so that the points of each class closest to the hyperplane satisfy:
$$\left\{\begin{array}{l} \boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i+}+b = +1 \\ \boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i-}+b = -1 \end{array}\right.$$
That is, for every sample of the two classes:
$$\left\{\begin{array}{ll} \boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i}+b \geqslant+1, & y_{i}=+1 \\ \boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i}+b \leqslant-1, & y_{i}=-1 \end{array}\right.$$
or equivalently:
$$y_{i}(\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i}+b) \geqslant 1$$
The margin between the two classes is:
$$\gamma=\frac{2}{\|\boldsymbol{w}\|}$$
The objective function is:
$$\max _{\boldsymbol{w}, b} \frac{2}{\|\boldsymbol{w}\|} \quad \text{s.t. } y_{i}\left(\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i}+b\right) \geqslant 1, \quad i=1,2, \ldots, m$$
which is equivalent to:
$$\min _{\boldsymbol{w}, b} \frac{1}{2}\|\boldsymbol{w}\|^{2} \quad \text{s.t. } y_{i}\left(\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i}+b\right) \geqslant 1, \quad i=1,2, \ldots, m$$
- Solving the objective:
Construct the Lagrangian:
$$L(\boldsymbol{w}, b, \boldsymbol{\alpha})=\frac{1}{2}\|\boldsymbol{w}\|^{2}+\sum_{i=1}^{m} \alpha_{i}\left(1-y_{i}\left(\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i}+b\right)\right), \quad \alpha_{i} \geqslant 0$$
The original problem is then equivalent to:
$$\min _{\boldsymbol{w}, b} \max _{\boldsymbol{\alpha}} L(\boldsymbol{w}, b, \boldsymbol{\alpha})$$
whose dual problem is:
$$\max _{\boldsymbol{\alpha}} \min _{\boldsymbol{w}, b} L(\boldsymbol{w}, b, \boldsymbol{\alpha})$$
Setting the partial derivatives of $L(\boldsymbol{w}, b, \boldsymbol{\alpha})$ with respect to $\boldsymbol{w}$ and $b$ to zero gives:
$$\boldsymbol{w}=\sum_{i=1}^{m} \alpha_{i} y_{i} \boldsymbol{x}_{i}, \qquad 0=\sum_{i=1}^{m} \alpha_{i} y_{i}$$
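The substitution step is short enough to write out:

```latex
\begin{aligned}
L &= \tfrac{1}{2}\|\boldsymbol{w}\|^{2}
   + \sum_{i}\alpha_{i}
   - \sum_{i}\alpha_{i} y_{i}\,\boldsymbol{w}^{\mathrm{T}}\boldsymbol{x}_{i}
   - b\sum_{i}\alpha_{i} y_{i} \\
  &= \tfrac{1}{2}\boldsymbol{w}^{\mathrm{T}}\boldsymbol{w}
   + \sum_{i}\alpha_{i}
   - \boldsymbol{w}^{\mathrm{T}}\boldsymbol{w} - 0
   \qquad\Bigl(\text{using } \boldsymbol{w}=\textstyle\sum_{i}\alpha_{i} y_{i}\boldsymbol{x}_{i},\ \sum_{i}\alpha_{i} y_{i}=0\Bigr) \\
  &= \sum_{i}\alpha_{i}
   - \tfrac{1}{2}\sum_{i}\sum_{j}\alpha_{i}\alpha_{j} y_{i} y_{j}\,
     \boldsymbol{x}_{i}^{\mathrm{T}}\boldsymbol{x}_{j}
\end{aligned}
```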
Substituting these back and simplifying, the problem becomes:
$$\begin{aligned} \max _{\boldsymbol{\alpha}}\ &\sum_{i=1}^{m} \alpha_{i}-\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_{i} \alpha_{j} y_{i} y_{j} \boldsymbol{x}_{i}^{\mathrm{T}} \boldsymbol{x}_{j} \\ \text{s.t. } &\sum_{i=1}^{m} \alpha_{i} y_{i}=0 \\ &\alpha_{i} \geqslant 0, \quad i=1,2, \ldots, m \end{aligned}$$
Solving for $\boldsymbol{\alpha}$ with the SMO algorithm yields the model:
$$f(\boldsymbol{x})=\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}+b=\sum_{i=1}^{m} \alpha_{i} y_{i} \boldsymbol{x}_{i}^{\mathrm{T}} \boldsymbol{x}+b$$
In the solution, the samples with $\alpha_i > 0$ are the support vectors; for every non-support vector $\alpha_i = 0$, so only the support vectors contribute to the final model.
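This can be observed with scikit-learn's `SVC`. The toy data below is made up; a very large `C` approximates the hard-margin case:

```python
import numpy as np
from sklearn.svm import SVC

# Two linearly separable clusters (toy data).
X = np.array([[1.0, 1.0], [1.5, 1.2], [1.2, 0.8],
              [4.0, 4.0], [4.5, 3.8], [3.8, 4.3]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel='linear', C=1e6)  # large C ~ hard margin
clf.fit(X, y)

print(clf.support_vectors_)  # only the boundary points appear here
print(clf.dual_coef_)        # the nonzero alpha_i * y_i values
print(clf.n_support_)        # number of support vectors per class
```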
Linear SVM:
Here the data is not linearly separable. We still separate it with a linear hyperplane, but allow some misclassification, i.e. a certain tolerance for errors. In the figure below, the dashed line clearly separates the data better than the solid one.
Since the data is not linearly separable, there are errors, and a loss function is needed to measure them:
0/1 loss:
$$\ell_{0 / 1}(z)=\left\{\begin{array}{ll} 1, & \text{if } z<0 \\ 0, & \text{otherwise} \end{array}\right.$$
hinge loss:
$$\ell_{\text{hinge}}(z)=\max (0,1-z)$$
exponential loss:
$$\ell_{\exp}(z)=\exp (-z)$$
logistic loss:
$$\ell_{\log}(z)=\log (1+\exp (-z))$$
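A minimal sketch comparing these losses at a few margin values (the sample points are arbitrary):

```python
import numpy as np

z = np.array([-1.0, 0.0, 1.0, 2.0])  # margin values y * f(x)

zero_one = (z < 0).astype(float)
hinge = np.maximum(0.0, 1.0 - z)
exp_loss = np.exp(-z)
log_loss = np.log(1.0 + np.exp(-z))

# hinge and exponential loss upper-bound the 0/1 loss everywhere.
for surrogate in (hinge, exp_loss):
    assert (surrogate >= zero_one).all()
print(hinge)  # [2. 1. 0. 0.]
```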
- Taking the hinge loss, we introduce slack variables $\xi_{i} \geq 0$ and require the functional margin plus the slack to be at least $1$. The constraint becomes:
$$y_{i}\left(\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i}+b\right) \geq 1-\xi_{i}$$
- The objective function becomes:
$$\begin{aligned} \min _{\boldsymbol{w}, b, \boldsymbol{\xi}}\ &\frac{1}{2}\|\boldsymbol{w}\|^{2}+C \sum_{i=1}^{n} \xi_{i} \\ \text{s.t. } &y_{i}\left(\boldsymbol{w}^{\mathrm{T}} \boldsymbol{x}_{i}+b\right) \geq 1-\xi_{i}, \quad i=1,2, \cdots, n \\ &\xi_{i} \geq 0, \quad i=1,2, \cdots, n \end{aligned}$$
- Constructing the Lagrangian and taking the dual:
$$\begin{aligned} \max _{\boldsymbol{\alpha}}\ &\sum_{i=1}^{n} \alpha_{i}-\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_{i} \alpha_{j} y_{i} y_{j} \boldsymbol{x}_{i}^{\mathrm{T}} \boldsymbol{x}_{j} \\ \text{s.t. } &\sum_{i=1}^{n} \alpha_{i} y_{i}=0 \\ &0 \leq \alpha_{i} \leq C, \quad i=1,2, \ldots, n \end{aligned}$$
The only change from the hard-margin dual is the upper bound $C$ on each $\alpha_i$.
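The role of $C$ can be seen empirically: a smaller $C$ tolerates more margin violations and typically leaves more samples with nonzero $\alpha_i$. A sketch on randomly generated overlapping clusters:

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Overlapping clusters, so the data is not linearly separable.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

n_sv = {}
for C in (0.01, 100.0):
    clf = SVC(kernel='linear', C=C).fit(X, y)
    n_sv[C] = clf.n_support_.sum()

# Smaller C -> softer margin -> usually more support vectors.
print(n_sv)
```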
Nonlinear SVM:
When a sample set is not separable in the original sample space (e.g. the XOR problem), we can map the samples into a higher-dimensional space and classify them there.
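For instance, the XOR points become linearly separable after adding a single product feature. The mapping $\phi(x) = (x_1, x_2, x_1 x_2)$ is hand-picked for illustration:

```python
import numpy as np
from sklearn.svm import SVC

# The XOR problem: no line in the plane separates the two labels.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, 1, 1, -1])

# Explicit feature map phi(x) = (x1, x2, x1*x2): one extra dimension.
phi = np.column_stack([X, X[:, 0] * X[:, 1]])

clf = SVC(kernel='linear', C=1e6).fit(phi, y)
print(clf.predict(phi))  # all four XOR points classified correctly
```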
- Let $\phi(\boldsymbol{x})$ denote the feature vector obtained by mapping $\boldsymbol{x}$. The separating hyperplane in the feature space corresponds to the model:
$$f(\boldsymbol{x})=\boldsymbol{w}^{\mathrm{T}} \phi(\boldsymbol{x})+b$$
- Objective function:
$$\min _{\boldsymbol{w}, b} \frac{1}{2}\|\boldsymbol{w}\|^{2} \quad \text{s.t. } y_{i}\left(\boldsymbol{w}^{\mathrm{T}} \phi\left(\boldsymbol{x}_{i}\right)+b\right) \geqslant 1, \quad i=1,2, \ldots, m$$
- Dual problem:
$$\begin{aligned} \max _{\boldsymbol{\alpha}}\ &\sum_{i=1}^{m} \alpha_{i}-\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_{i} \alpha_{j} y_{i} y_{j} \phi\left(\boldsymbol{x}_{i}\right)^{\mathrm{T}} \phi\left(\boldsymbol{x}_{j}\right) \\ \text{s.t. } &\sum_{i=1}^{m} \alpha_{i} y_{i}=0 \\ &\alpha_{i} \geqslant 0, \quad i=1,2, \ldots, m \end{aligned}$$
- Define the kernel function:
$$\kappa\left(\boldsymbol{x}_{i}, \boldsymbol{x}_{j}\right)=\left\langle\phi\left(\boldsymbol{x}_{i}\right), \phi\left(\boldsymbol{x}_{j}\right)\right\rangle=\phi\left(\boldsymbol{x}_{i}\right)^{\mathrm{T}} \phi\left(\boldsymbol{x}_{j}\right)$$
Common kernels include the linear, polynomial, Gaussian (RBF), Laplacian, and sigmoid kernels.
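The kernel trick can be checked numerically: for the homogeneous degree-2 polynomial kernel $(\boldsymbol{x}^{\mathrm{T}}\boldsymbol{z})^{2}$ in two dimensions, the explicit feature map is $\phi(\boldsymbol{x})=(x_1^2,\ \sqrt{2}\,x_1 x_2,\ x_2^2)$, and both sides agree:

```python
import numpy as np

def phi(v):
    # Explicit feature map for the 2-D homogeneous quadratic kernel.
    return np.array([v[0]**2, np.sqrt(2) * v[0] * v[1], v[1]**2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

k_direct = (x @ z) ** 2      # kernel evaluated in the input space
k_mapped = phi(x) @ phi(z)   # inner product in the feature space
print(k_direct, k_mapped)    # both equal 16.0
```

The point of the trick is that `k_direct` never constructs the feature vectors, which matters when the feature space is high- or infinite-dimensional.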
The objective can then be rewritten as:
$$\begin{aligned} \max _{\boldsymbol{\alpha}}\ &\sum_{i=1}^{m} \alpha_{i}-\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_{i} \alpha_{j} y_{i} y_{j} \kappa\left(\boldsymbol{x}_{i}, \boldsymbol{x}_{j}\right) \\ \text{s.t. } &\sum_{i=1}^{m} \alpha_{i} y_{i}=0 \\ &\alpha_{i} \geqslant 0, \quad i=1,2, \ldots, m \end{aligned}$$
Solving gives:
$$\begin{aligned} f(\boldsymbol{x}) &=\boldsymbol{w}^{\mathrm{T}} \phi(\boldsymbol{x})+b \\ &=\sum_{i=1}^{m} \alpha_{i} y_{i} \phi\left(\boldsymbol{x}_{i}\right)^{\mathrm{T}} \phi(\boldsymbol{x})+b \\ &=\sum_{i=1}^{m} \alpha_{i} y_{i} \kappa\left(\boldsymbol{x}, \boldsymbol{x}_{i}\right)+b \end{aligned}$$
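In scikit-learn's binary `SVC`, `dual_coef_` stores the $\alpha_i y_i$ of the support vectors and `intercept_` stores $b$, so the kernel expansion above can be reproduced by hand and compared with `decision_function` (a sketch on the `make_moons` toy data with an RBF kernel):

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(n_samples=100, noise=0.1, random_state=0)
gamma = 0.5
clf = SVC(kernel='rbf', C=1.0, gamma=gamma).fit(X, y)

# f(x) = sum_i alpha_i y_i k(x, x_i) + b, summed over support vectors only.
x_new = np.array([[0.5, 0.25]])
k = np.exp(-gamma * ((clf.support_vectors_ - x_new) ** 2).sum(axis=1))
f_manual = clf.dual_coef_[0] @ k + clf.intercept_[0]

print(np.isclose(f_manual, clf.decision_function(x_new)[0]))  # True
```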
2. Code implementation
Environment:
- Win7 + PyCharm 2018.3.1 Professional + Python 3.7
MNIST dataset:
# -*- coding:utf-8 -*-
import csv
import os
from time import time

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image
from sklearn import svm
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV


def save_image(im, i):
    # Invert the gray levels and save one test image as a PNG file.
    im = 255 - im
    a = im.astype(np.uint8)
    output_path = 'HandWritten'
    if not os.path.exists(output_path):
        os.mkdir(output_path)
    Image.fromarray(a).save(os.path.join(output_path, '%d.png' % i))


def save_result(model, data_test):
    # Predict labels for the test set and write them to a CSV file.
    data_test_hat = model.predict(data_test)
    with open('Prediction.csv', 'w', newline='') as f:  # text mode in Python 3, not 'wb'
        writer = csv.writer(f)
        writer.writerow(['ImageId', 'Label'])
        for i, d in enumerate(data_test_hat):
            writer.writerow([i + 1, d])  # ImageId is 1-based


if __name__ == "__main__":
    classifier_type = 'RF'  # 'SVM' or 'RF'

    print('Loading training data...')
    t = time()
    data = pd.read_csv('MNIST.train.csv', header=0, dtype=int)  # np.int was removed from NumPy
    print('Done, %.3f seconds elapsed' % (time() - t))
    y = data['label'].values
    x = data.values[:, 1:]
    print('Number of images: %d, pixels per image: %d' % x.shape)
    images = x.reshape(-1, 28, 28)
    y = y.ravel()

    print('Loading test data...')
    t = time()
    data_test = pd.read_csv('MNIST.test.csv', header=0, dtype=int).values
    images_test_result = data_test.reshape(-1, 28, 28)
    print('Done, %.3f seconds elapsed' % (time() - t))

    np.random.seed(0)
    x, x_test, y, y_test = train_test_split(x, y, train_size=0.8, random_state=1)
    images = x.reshape(-1, 28, 28)
    images_test = x_test.reshape(-1, 28, 28)
    print(x.shape, x_test.shape)

    # Show a few training and test images.
    plt.figure(figsize=(15, 9), facecolor='w')
    for index, image in enumerate(images[:16]):
        plt.subplot(4, 8, index + 1)
        plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
        plt.title('Train: %i' % y[index])
    for index, image in enumerate(images_test_result[:16]):
        plt.subplot(4, 8, index + 17)
        plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
        save_image(image.copy(), index)
        plt.title('Test')
    plt.tight_layout()
    plt.show()

    if classifier_type == 'SVM':
        # A grid search over C and gamma is possible but slow on MNIST:
        # params = {'C': np.logspace(1, 4, 4), 'gamma': np.logspace(-10, -2, 9)}
        # model = GridSearchCV(svm.SVC(kernel='rbf'), param_grid=params, cv=3)
        model = svm.SVC(C=1000, kernel='rbf', gamma=1e-10)
        print('Training SVM...')
        t = time()
        model.fit(x, y)
        t = time() - t
        print('SVM training finished in %dmin %.3fs' % (int(t / 60), t % 60))
        t = time()
        y_hat = model.predict(x)
        t = time() - t
        print('SVM train accuracy: %.3f%%, %dmin %.3fs' % (accuracy_score(y, y_hat) * 100, int(t / 60), t % 60))
        t = time()
        y_test_hat = model.predict(x_test)
        t = time() - t
        print('SVM test accuracy: %.3f%%, %dmin %.3fs' % (accuracy_score(y_test, y_test_hat) * 100, int(t / 60), t % 60))
        save_result(model, data_test)
    elif classifier_type == 'RF':
        # min_impurity_split was removed from scikit-learn; the defaults are used instead.
        rfc = RandomForestClassifier(n_estimators=100, criterion='gini',
                                     min_samples_split=2, bootstrap=True, oob_score=True)
        print('Training random forest...')
        t = time()
        rfc.fit(x, y)
        t = time() - t
        print('Random forest training finished in %dmin %.3fs' % (int(t / 60), t % 60))
        print('OOB accuracy: %.3f%%' % (rfc.oob_score_ * 100))
        t = time()
        y_hat = rfc.predict(x)
        t = time() - t
        print('RF train accuracy: %.3f%%, predicted in %ds' % (accuracy_score(y, y_hat) * 100, t))
        t = time()
        y_test_hat = rfc.predict(x_test)
        t = time() - t
        print('RF test accuracy: %.3f%%, predicted in %ds' % (accuracy_score(y_test, y_test_hat) * 100, t))
        save_result(rfc, data_test)

    # Show up to 12 misclassified test images.
    err = (y_test != y_test_hat)
    err_images = images_test[err]
    err_y_hat = y_test_hat[err]
    err_y = y_test[err]
    print(err_y_hat)
    print(err_y)
    plt.figure(figsize=(10, 8), facecolor='w')
    for index, image in enumerate(err_images):
        if index >= 12:
            break
        plt.subplot(3, 4, index + 1)
        plt.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
        plt.title('Predicted: %i, true: %i' % (err_y_hat[index], err_y[index]))
    plt.suptitle('Handwritten digit recognition, classifier: %s' % classifier_type, fontsize=18)
    plt.tight_layout(rect=(0, 0, 1, 0.95))
    plt.show()
Iris dataset:
#!/usr/bin/python
# -*- coding:utf-8 -*-
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

iris_feature = 'sepal length', 'sepal width', 'petal length', 'petal width'

if __name__ == "__main__":
    path = 'iris.data'  # path to the data file
    data = pd.read_csv(path, header=None)
    x, y = data[range(4)], data[4]
    y = pd.Categorical(y).codes
    x = x[[0, 1]]  # keep only the first two features so the boundary can be plotted
    x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=1, train_size=0.6)

    # Classifier
    clf = svm.SVC(C=0.1, kernel='linear', decision_function_shape='ovr')
    # clf = svm.SVC(C=0.8, kernel='rbf', gamma=20, decision_function_shape='ovr')
    clf.fit(x_train, y_train.ravel())

    # Accuracy
    print(clf.score(x_train, y_train))
    print('Train accuracy:', accuracy_score(y_train, clf.predict(x_train)))
    print(clf.score(x_test, y_test))
    print('Test accuracy:', accuracy_score(y_test, clf.predict(x_test)))

    # decision_function
    print('decision_function:\n', clf.decision_function(x_train))
    print('\npredict:\n', clf.predict(x_train))

    # Plot the decision regions
    x1_min, x2_min = x.min()
    x1_max, x2_max = x.max()
    x1, x2 = np.mgrid[x1_min:x1_max:500j, x2_min:x2_max:500j]  # grid of sample points
    grid_test = np.stack((x1.flat, x2.flat), axis=1)
    grid_hat = clf.predict(grid_test)      # predicted class of each grid point
    grid_hat = grid_hat.reshape(x1.shape)  # same shape as the grid

    cm_light = mpl.colors.ListedColormap(['#A0FFA0', '#FFA0A0', '#A0A0FF'])
    cm_dark = mpl.colors.ListedColormap(['g', 'r', 'b'])
    plt.figure(facecolor='w')
    plt.pcolormesh(x1, x2, grid_hat, cmap=cm_light, shading='auto')
    plt.scatter(x[0], x[1], c=y, edgecolors='k', s=50, cmap=cm_dark)        # all samples
    plt.scatter(x_test[0], x_test[1], s=120, facecolors='none', zorder=10)  # circle the test samples
    plt.xlabel(iris_feature[0], fontsize=13)
    plt.ylabel(iris_feature[1], fontsize=13)
    plt.xlim(x1_min, x1_max)
    plt.ylim(x2_min, x2_max)
    plt.title('Iris SVM classification on two features', fontsize=16)
    plt.grid(visible=True, ls=':')  # the 'b' keyword was renamed to 'visible' in Matplotlib
    plt.tight_layout(pad=1.5)
    plt.show()
References:
- Zhou Zhihua, Machine Learning (the "watermelon book")
- Li Hang, Statistical Learning Methods, 2nd ed.
- https://www.bilibili.com/video/BV1Tb411H7uC