sklearn-datawhale Study Notes 2

膜众dalao的小仙女

Tags: sklearn, machine learning, python

Article link: https://blog.csdn.net/qq_38048065/article/details/122019551

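1. Linear SVM

Initialize the x and y values: the first column is x and the second column is y. The data contains 6 samples with label 1 followed by 6 samples with label 0.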

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
data = np.array([
    [0.1, 0.7],
    [0.3, 0.6],
    [0.4, 0.1],
    [0.5, 0.4],
    [0.8, 0.04],
    [0.42, 0.6],
    [0.9, 0.4],
    [0.6, 0.5],
    [0.7, 0.2],
    [0.7, 0.67],
    [0.27, 0.8],
    [0.5, 0.72]
])
label = [1] * 6 + [0] * 6
x_min, x_max = data[:, 0].min() - 0.2, data[:, 0].max() + 0.2
y_min, y_max = data[:, 1].min() - 0.2, data[:, 1].max() + 0.2
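# Build a grid from the x and y coordinate ranges; it is used for plotting
# np.arange() returns evenly spaced values from start up to (but excluding) stop
# np.meshgrid() expands the two 1-D ranges into 2-D coordinate matrices
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.002),
                     np.arange(y_min, y_max, 0.002))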
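To make the grid step concrete, here is a minimal sketch on a tiny grid. The small ranges are made up for illustration and are not the ones used above. It shows that np.meshgrid returns two matrices of the same shape, and that np.c_ applied to their flattened versions yields one (x, y) row per grid point, which is the layout predict() expects.

```python
import numpy as np

# Tiny illustrative ranges (assumed values, much coarser than the 0.002 step above)
xs = np.arange(0.0, 3.0, 1.0)   # three x-coordinates
ys = np.arange(0.0, 2.0, 1.0)   # two y-coordinates

xx_small, yy_small = np.meshgrid(xs, ys)
# Both outputs have shape (len(ys), len(xs))
print(xx_small.shape, yy_small.shape)

# ravel() flattens each matrix; np.c_ pairs them column-wise into (x, y) rows
grid_points = np.c_[xx_small.ravel(), yy_small.ravel()]
print(grid_points.shape)
print(grid_points)
```

With the real 0.002 step the same construction produces several hundred thousand rows, which is why prediction over the grid is the slowest part of the plotting code.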

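Train the model with sklearn's svm.SVC and fit(), then predict with predict().

Support vector machines can also be used for regression problems (support vector regression); this example is a classification task.

svm.SVC is C-Support Vector Classification; its implementation is based on libsvm.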

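# The smaller C is, the weaker the penalty on misclassified points, so more errors are tolerated
model_linear = svm.SVC(kernel='linear', C=0.001)
model_linear.fit(data, label)  # train
Z = model_linear.predict(np.c_[xx.ravel(), yy.ravel()])  # predict every grid point
Z = Z.reshape(xx.shape)  # back to the grid shape for plotting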
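To see what the comment about C means in practice, the sketch below refits the linear SVM on the same 12 points for a few values of C and reports the number of support vectors and the training accuracy. Only C=0.001 comes from these notes; the other two values are illustrative assumptions.

```python
import numpy as np
from sklearn import svm

# Same 12 training points and labels as in the listing above
data = np.array([
    [0.1, 0.7], [0.3, 0.6], [0.4, 0.1], [0.5, 0.4], [0.8, 0.04], [0.42, 0.6],
    [0.9, 0.4], [0.6, 0.5], [0.7, 0.2], [0.7, 0.67], [0.27, 0.8], [0.5, 0.72],
])
label = [1] * 6 + [0] * 6

# ASSUMPTION: 1 and 1000 are illustrative; only 0.001 appears in the notes
results = {}
for C in [0.001, 1, 1000]:
    model = svm.SVC(kernel='linear', C=C).fit(data, label)
    n_sv = int(model.n_support_.sum())     # total number of support vectors
    train_acc = model.score(data, label)   # accuracy on the training points
    results[C] = (n_sv, train_acc)
    print(f"C={C}: support vectors={n_sv}, training accuracy={train_acc:.3f}")
```

A small C typically leaves many points inside the margin, so many of them become support vectors; a large C pushes the model to fit the training points more strictly.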

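Plot the result

# Filled contour plot of the predicted class over the grid; alpha is the opacity, cmap is the colormap
plt.contourf(xx, yy, Z, cmap=plt.cm.ocean, alpha=0.6)
# s is the marker size; the first 6 points have label 1, the last 6 have label 0
plt.scatter(data[:6, 0], data[:6, 1], marker='o', color='r', s=100, lw=3)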
plt.scatter(data[6:, 0], data[6:, 1], marker='x', color='k', s=100, lw=3)
plt.title('Linear SVM')
plt.show()

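2. Polynomial SVM

Comparing the classification for different maximum degrees on this toy data set: the higher the degree, the better the points are separated, but beyond degree 7 the result barely changes.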

plt.figure(figsize=(16, 15))

for i, degree in enumerate([1, 3, 5, 7, 9, 12]):
    # C: penalty coefficient; gamma: kernel coefficient (left at its default here)
    model_poly = svm.SVC(C=0.0001, kernel='poly', degree=degree)  # polynomial kernel
    model_poly.fit(data, label)
    # ravel() flattens an array to 1-D
    # np.c_ stacks 1-D arrays as columns (like np.column_stack)
    # The two flattened grids become the features x1 and x2; predict on them,
    # then reshape the result back to the grid shape
    Z = model_poly.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    plt.subplot(3, 2, i + 1)
    plt.contourf(xx, yy, Z, cmap=plt.cm.ocean, alpha=0.6)
    # Draw the training points
    plt.scatter(data[:6, 0], data[:6, 1], marker='o', color='r', s=100, lw=3)
    plt.scatter(data[6:, 0], data[6:, 1], marker='x', color='k', s=100, lw=3)
    plt.title('Poly SVM with degree=' + str(degree))
plt.show()
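The plots are judged by eye. As a rough numerical companion, the sketch below computes the training accuracy for each degree with the same data and the same C. This is an addition to the notes, not part of the original experiment, and accuracy on 12 training points says nothing about generalization.

```python
import numpy as np
from sklearn import svm

# Same 12 training points and labels as in section 1
data = np.array([
    [0.1, 0.7], [0.3, 0.6], [0.4, 0.1], [0.5, 0.4], [0.8, 0.04], [0.42, 0.6],
    [0.9, 0.4], [0.6, 0.5], [0.7, 0.2], [0.7, 0.67], [0.27, 0.8], [0.5, 0.72],
])
label = [1] * 6 + [0] * 6

degrees = [1, 3, 5, 7, 9, 12]
train_scores = []
for degree in degrees:
    model_poly = svm.SVC(C=0.0001, kernel='poly', degree=degree).fit(data, label)
    # score() returns the fraction of training points classified correctly
    train_scores.append(model_poly.score(data, label))
    print(f"degree={degree}: training accuracy={train_scores[-1]:.3f}")
```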

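3. Gaussian-kernel (RBF) SVM

Comparing the classification for different gamma values, the behaviour matches the description quoted below. My own understanding is that a larger gamma pulls the region enclosing each class more tightly around its data points, so the radius of that region gradually shrinks.

A low gamma means a large similarity radius, which groups more points together. With a high gamma, points must be very close to each other to be treated as the same group (or class). Models with a very large gamma therefore tend to overfit.

ref: "svm核函数gamma参数_支持向量机超参数的可视化解释" (The gamma parameter of SVM kernels: a visual explanation of SVM hyperparameters), 即有大吉也有大利's blog, CSDN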
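The "similarity radius" can be made concrete with the RBF kernel itself, K(x, x') = exp(-gamma * ||x - x'||^2). The sketch below evaluates it for two of the training points from section 1 with sklearn.metrics.pairwise.rbf_kernel and checks the result against the formula. The gamma values are chosen for illustration only.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

a = np.array([[0.1, 0.7]])   # first training point from section 1 (label 1)
b = np.array([[0.9, 0.4]])   # seventh training point from section 1 (label 0)
sq_dist = float(np.sum((a - b) ** 2))   # squared Euclidean distance

similarities = []
for gamma in [0.1, 1, 10, 100]:   # illustrative values, not from the notes
    k_sklearn = rbf_kernel(a, b, gamma=gamma)[0, 0]
    k_manual = np.exp(-gamma * sq_dist)
    similarities.append(k_sklearn)
    print(f"gamma={gamma}: K(a, b)={k_sklearn:.6f} (manual: {k_manual:.6f})")
```

For a fixed pair of points the kernel value falls towards 0 as gamma grows, so each training point influences an ever smaller neighbourhood.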

# gamma is the kernel coefficient being compared
model_rbf = svm.SVC(kernel='rbf', gamma=gamma, C=0.0001)
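The notes show only the model line, not the loop around it. A minimal sketch of such a comparison follows; the list of gamma values and the coarser grid step are assumptions, since the original values are not given.

```python
import numpy as np
from sklearn import svm

# Same 12 training points and labels as in section 1
data = np.array([
    [0.1, 0.7], [0.3, 0.6], [0.4, 0.1], [0.5, 0.4], [0.8, 0.04], [0.42, 0.6],
    [0.9, 0.4], [0.6, 0.5], [0.7, 0.2], [0.7, 0.67], [0.27, 0.8], [0.5, 0.72],
])
label = [1] * 6 + [0] * 6

# ASSUMPTION: step 0.02 instead of 0.002 to keep the sketch fast
x_min, x_max = data[:, 0].min() - 0.2, data[:, 0].max() + 0.2
y_min, y_max = data[:, 1].min() - 0.2, data[:, 1].max() + 0.2
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                     np.arange(y_min, y_max, 0.02))

# ASSUMPTION: hypothetical gamma values; the original list is not shown
gammas = [1, 5, 15, 35, 45, 55]
Z_by_gamma = {}
for gamma in gammas:
    model_rbf = svm.SVC(kernel='rbf', gamma=gamma, C=0.0001).fit(data, label)
    Z = model_rbf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    Z_by_gamma[gamma] = Z
    # Labels are 0/1, so the mean is the share of the grid assigned to label 1
    print(f"gamma={gamma}: fraction of grid predicted as label 1 = {Z.mean():.3f}")
```

Each Z in Z_by_gamma can be passed to plt.contourf(xx, yy, Z, cmap=plt.cm.ocean, alpha=0.6) in its own subplot, in the same way as the polynomial comparison.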

Testing different SVMs on the MNIST dataset
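The snippet below uses X_train, X_test, y_train and y_test, which these notes never define. One way to obtain arrays of that kind is sketched here. It is an assumption, not the author's setup: load_digits (the 8x8 digit images bundled with scikit-learn) stands in for MNIST, and the split size and seed are arbitrary, so the scores it gives will not match the ones reported in these notes.

```python
from sklearn import svm
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# ASSUMPTION: load_digits is a small stand-in for MNIST; the notes do not show
# which loader or train/test split was actually used
X, y = load_digits(return_X_y=True)

# ASSUMPTION: hold out 200 samples for testing, with a fixed seed
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=200, random_state=0)
print(X_train.shape, X_test.shape)

# Same linear-kernel setting as in the experiment
model = svm.SVC(C=100, kernel='linear').fit(X_train, y_train)
score = model.score(X_test, y_test)   # accuracy on the held-out samples
print(f"Linear kernel test accuracy: {score:.3f}")
```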

# C: soft-margin penalty coefficient
C_linear = 100
model_linear = svm.SVC(C=C_linear, kernel='linear').fit(X_train, y_train)  # linear kernel
print(f"Linear Kernel 's score: {model_linear.score(X_test, y_test)}")
for degree in range(1, 10, 2):
    model_poly = svm.SVC(C=100, kernel='poly', degree=degree).fit(X_train, y_train)  # polynomial kernel
    print(f"Polynomial Kernel with Degree = {degree} 's score: {model_poly.score(X_test, y_test)}")

for step in range(1, 10, 2):
    gamma = round(0.01 * step, 3)  # gamma = 0.01, 0.03, ..., 0.09
    model_rbf = svm.SVC(C=100, kernel='rbf', gamma=gamma).fit(X_train, y_train)  # Gaussian (RBF) kernel
    print(f"RBF Kernel with Gamma = {gamma} 's score: {model_rbf.score(X_test, y_test)}")

Results:

Linear Kernel 's score: 0.955
Polynomial Kernel with Degree = 1 's score: 0.955
Polynomial Kernel with Degree = 3 's score: 0.93
Polynomial Kernel with Degree = 5 's score: 0.855
Polynomial Kernel with Degree = 7 's score: 0.735
Polynomial Kernel with Degree = 9 's score: 0.66
RBF Kernel with Gamma = 0.01 's score: 0.96
RBF Kernel with Gamma = 0.03 's score: 0.96
RBF Kernel with Gamma = 0.05 's score: 0.945
RBF Kernel with Gamma = 0.07 's score: 0.9
RBF Kernel with Gamma = 0.09 's score: 0.835

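In this experiment the test score falls as degree grows (0.955 at degree 1 down to 0.66 at degree 9) and as gamma grows beyond 0.03 (0.96 down to 0.835 at gamma 0.09), so within the tested ranges smaller values of degree and gamma work better.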

ref:

datawhale scikit-learn tutorial
