SVM in Python

Author: 来自西伯利亚

Categories: ML, machine vision. Tags: SVM, Python, support vector machine

Article link: https://blog.csdn.net/qq_33810188/article/details/80293839

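This post uses a third-party Python library (scikit-learn) to study support vector machines (SVM).

Data source: https://www.jianshu.com/p/bfcf645bd56a

The goal is to understand SVM while getting more familiar with the matplotlib library. The code is as follows: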

---------------------------------------------------------------

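#Linear and soft-margin SVM
'''
sklearn.svm.SVC(C=1.0,kernel='rbf',degree=3,gamma='auto',coef0=0.0,
shrinking=True,probability=False,tol=0.001,cache_size=200,class_weight=None,verbose=False,max_iter=-1,
decision_function_shape=None,random_state=None)
----------------------------------------------------------------
C penalty parameter of C-SVC (capital C), default 1.0; smaller values tolerate more margin violations
kernel kernel function, default 'rbf'; one of 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed'
degree degree of the polynomial ('poly') kernel, default 3; ignored by all other kernels
gamma kernel coefficient for 'rbf', 'poly' and 'sigmoid'; 'auto' (1/n_features) in the signature above
coef0 constant term of the kernel function; only used by 'poly' and 'sigmoid'
probability whether to enable probability estimates, default False
shrinking whether to use the shrinking heuristic, default True
tol tolerance of the stopping criterion, default 1.0e-3
cache_size size of the kernel cache in MB, default 200
class_weight per-class weights, passed as a dict (or 'balanced')
verbose whether to enable verbose output, default False
max_iter maximum number of iterations; -1 means no limit
decision_function_shape 'ovo', 'ovr' or None
random_state seed used when shuffling the data for probability estimates

Note: the signature above matches older scikit-learn releases. Since version 0.22
the default gamma is 'scale', and decision_function_shape defaults to 'ovr'.
----------------------------------------------------------------
'''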
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

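plt.rcParams['font.sans-serif']=['SimHei']#use the SimHei font so that Chinese labels render correctly
plt.rcParams['axes.unicode_minus']=False#render the minus sign correctly with this font
#create 40 separable points
np.random.seed(0)#fix the seed so the random data is reproducible across runs
x=np.r_[np.random.randn(20,2)-[2,2],np.random.randn(20,2)+[2,2]]
y=[0]*20+[1]*20#1-D list of 40 labels: 20 zeros followed by 20 ones
print('Training data:\n' ,x)
print('--------------------------------------------')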
plt.figure('SVM')
plt.subplot(3,3,1)
plt.title('2-D data')
plt.plot(x[:,0],x[:,1],'ro')
plt.subplot(3,3,2)
plt.title('Class 1')
plt.plot(x[:20,0],x[:20,1],'ro')#rows 0..19 (label 0); the slice 1:20 would drop the first point
plt.axis([-5,5,-5,5])
plt.subplot(3,3,3)
plt.title('Class 2')
plt.plot(x[20:,0],x[20:,1],'ro')#rows 20..39 (label 1); the slice 21:40 would drop row 20
plt.axis([-5,5,-5,5])

#fit the model
clf=svm.SVC(kernel='linear',C=2)
clf.fit(x,y)

#get the separating hyperplane
w=clf.coef_[0]
a=-w[0]/w[1]#slope of the separating line in the 2-D plane
#numpy.linspace(start,stop,num=50,endpoint=True,retstep=False,dtype=None)
xx=np.linspace(-5,5)
yy=a*xx-(clf.intercept_[0])/w[1]
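'''
How the line above is derived:
w.x+b=f(x)
f(x)=0 means that the point x lies on the hyperplane
x=[xi;yi] is a data point; in the 2-D plane these are its horizontal and vertical coordinates
w is the normal vector of the hyperplane and determines its orientation (it is not a support vector)
[w[0],w[1]]*[xi;yi]+b=0 ==> equation of a straight line in the 2-D plane
==> w[0]*xi+w[1]*yi+b=0
==> yi=-(w[0]/w[1])*xi-b/w[1] === slope-intercept form
This slope-intercept equation is the equation of the separating hyperplane
'''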
print('Hyperplane equation: Wx+b=0')
print('Weight vector W of the hyperplane:' ,w)
print('Constant term b of the hyperplane:',clf.intercept_[0])
print('---------------------------------------------')

plt.subplot(3,3,4)
plt.title('Hyperplane')
plt.plot(xx,yy)

print('Support vector indices:\n',clf.support_)
print('Support vectors:\n',clf.support_vectors_)
print('Number of support vectors per class:\n',clf.n_support_)
support_points=clf.support_vectors_
plt.subplot(3,3,5)
plt.title('Support vectors')
for Index in range(len(support_points)):#one row per support vector
    plt.plot(support_points[Index][0],support_points[Index][1],'go')
plt.axis([-5,5 ,-5 ,5])
plt.subplot(3,3,6)
plt.title('Hyperplane + support vectors')
plt.plot(support_points[:,0],support_points[:,1],'go')
plt.plot(xx,yy,'b-')

#plot the parallels to the separating hyperplane that pass through the support vectors
b=clf.support_vectors_[0]#first support vector
yy_down=a*xx+(b[1]-a*b[0])#intercept of the parallel line through this support vector, then its y values
b=clf.support_vectors_[-1]#last support vector
yy_up=a*xx+(b[1]-a*b[0])

#plot the line, the points, and the nearest vectors to the plane
for Index in [7,8,9]:
    plt.subplot(3,3,Index)
    plt.title('Classification result')
    plt.plot(xx,yy,'k-')
    plt.plot(xx,yy_down,'k--')
    plt.plot(xx,yy_up,'k--')
    if Index==7:
        plt.title('')#panel 7 has no title: all points in one colour
        plt.plot(x[:,0],x[:,1],'bo')
    if Index==8:
        plt.plot(clf.support_vectors_[:,0],clf.support_vectors_[:,1],'ro')
        plt.scatter(x[:,0],x[:,1],c=y,cmap=plt.cm.Paired)#points coloured by label, Paired colormap
    if Index==9:
        plt.plot(clf.support_vectors_[:,0],clf.support_vectors_[:,1],'ro')
        plt.scatter(x[:,0],x[:,1],c=y)#points coloured by label, default colormap

plt.axis('tight')
plt.show()
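
The script above draws the dashed lines through the first and last support vectors, which only gives the true margin when those two vectors lie exactly on it. With a soft margin, a support vector may also fall inside the margin. As a cross-check, the following minimal sketch derives the margin lines directly from w and b by solving f(x)=-1 and f(x)=+1 for the vertical coordinate, and then confirms the result with decision_function. It assumes the same data and model as above; the names yy_neg, yy_pos, f_on and margin_width are introduced here for illustration and are not part of the original script.

```python
import numpy as np
from sklearn import svm

# Same synthetic data as in the post: two Gaussian blobs of 20 points each.
np.random.seed(0)
x = np.r_[np.random.randn(20, 2) - [2, 2], np.random.randn(20, 2) + [2, 2]]
y = [0] * 20 + [1] * 20

clf = svm.SVC(kernel='linear', C=2)
clf.fit(x, y)

w = clf.coef_[0]          # normal vector of the separating line
b = clf.intercept_[0]     # constant term
a = -w[0] / w[1]          # slope of the separating line
xx = np.linspace(-5, 5)

# Solve w[0]*x + w[1]*y + b = c for y, with c = 0, -1 and +1.
yy = a * xx - b / w[1]              # f(x) = 0, the separating line
yy_neg = a * xx - (b + 1) / w[1]    # f(x) = -1, margin on the negative side
yy_pos = a * xx - (b - 1) / w[1]    # f(x) = +1, margin on the positive side


def f_on(line_y):
    """Evaluate the decision function at the points (xx, line_y)."""
    return clf.decision_function(np.c_[xx, line_y])


print('f = 0 on the separating line:', np.allclose(f_on(yy), 0.0, atol=1e-6))
print('f = -1 on the lower margin:  ', np.allclose(f_on(yy_neg), -1.0, atol=1e-6))
print('f = +1 on the upper margin:  ', np.allclose(f_on(yy_pos), 1.0, atol=1e-6))

# Distance between the two margin lines.
margin_width = 2 / np.linalg.norm(w)
print('Margin width:', margin_width)

# Decision values of the support vectors: values close to -1 or +1 lie on the
# margin, values strictly between them lie inside it.
print('f at the support vectors:', clf.decision_function(clf.support_vectors_))
```

To use these lines in the figure, yy_neg and yy_pos can be plotted in place of yy_down and yy_up. If every support vector sits on the margin, both approaches should draw the same dashed lines.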