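Machine learning: the support vector machine experiments from the book Python Machine Learning (《Python机器学习》), with an analysis of the errors they raise.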

UnnamedDWO

Last modified: 2024-04-17 19:12:24

Tags: machine learning, support vector machine, artificial intelligence

First published: 2024-04-17 19:10:05

Article link: https://blog.csdn.net/m0_73979260/article/details/137879231

I. Support vector machine concepts

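A support vector machine (SVM) is a supervised learning algorithm for binary and multiclass classification. Its goal is to find an optimal hyperplane, or decision boundary, that best separates the samples of different classes. Among all the hyperplanes that separate the classes, the SVM picks the one with the largest margin, that is, the largest distance to the nearest samples of each class.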

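In an SVM, each sample is represented as a feature vector, which is a point in N-dimensional space. The SVM looks for the maximum-margin hyperplane that separates the classes. That hyperplane then serves as the classifier for new samples: a new sample is assigned a class according to which side of the hyperplane it falls on. The support vectors are the data points closest to the hyperplane. They determine where the hyperplane lies: if the support vectors move, the hyperplane moves with them. Training an SVM amounts to finding these support vectors and the hyperplane that maximizes their distance to it.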

As the figure above shows, the two points closest to the line are the support vectors.
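To make the idea of a margin concrete, here is a minimal sketch that fits a linear SVC on three toy points and reads the margin off the fitted model. The data is an assumption chosen for illustration, and the margin width is computed as 2 / ||w||, the distance between the two lines where w·x + b equals -1 and +1.

```python
import numpy as np
from sklearn import svm

# Toy data (an assumption for illustration): two samples of class 0, one of class 1
X = np.array([[2, 0], [1, 1], [2, 3]])
y = np.array([0, 0, 1])

clf = svm.SVC(kernel="linear")
clf.fit(X, y)

w = clf.coef_[0]                   # normal vector of the separating hyperplane
b = clf.intercept_[0]              # offset of the hyperplane
margin = 2 / np.linalg.norm(w)     # distance between the two margin lines

# Support vectors lie on the margin lines, where w.x + b is -1 or +1
sv_scores = clf.decision_function(clf.support_vectors_)

print("support vectors:\n", clf.support_vectors_)
print("w:", w, "b:", b)
print("margin width:", margin)
print("decision values at the support vectors:", sv_scores)
```

For these three points the support vectors are (1, 1) and (2, 3), and the margin width is the distance between them, about 2.24. The point (2, 0) lies outside the margin and does not influence the hyperplane.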

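II. Support vector machine experiments

The code below follows the experiments in Python Machine Learning, together with the results it produced.

1. Example 9.1 from Chapter 9 of Python Machine Learning: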

from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

def loaddata():
    people = pd.read_csv("credit-overdue.csv", header = 0)
    X = people[['debt', 'income']].values
    y = people['overdue'].values
    return X,y

print("Step1:read data...")
x, y = loaddata()

# Split into training data and test data
print("Step2:fit by Perceptron...")
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state = 0)

# Store the two classes separately to make plotting easier
positive_x1 = [x[i, 0] for i in range(len(y)) if y[i] == 1]
positive_x2 = [x[i, 1] for i in range(len(y)) if y[i] == 1]
negative_x1 = [x[i, 0] for i in range(len(y)) if y[i] == 0]
negative_x2 = [x[i, 1] for i in range(len(y)) if y[i] == 0]

# Define the perceptron
clf = Perceptron(max_iter=100, tol=0, penalty='l2', eta0=0.1, random_state=0)
clf.fit(x_train, y_train)
print("Step3:get the weights and bias...")

# Read the fitted parameters
weights = clf.coef_
bias = clf.intercept_
print(' Weights:', weights, '\n  Intercept:', bias)
print("Step4:compute the accuracy...")

# Validate the model on the test set
acc = clf.score(x_test, y_test)
print(' Accuracy:%.2f'%(acc * 100.0))

# Draw a scatter plot of the two classes
print("Step5:draw with the weights and bias...")
plt.scatter(positive_x1, positive_x2, marker = '^', c = 'red')
plt.scatter(negative_x1, negative_x2, c = 'blue')

# Draw the decision line found by the perceptron:
# w0*x + w1*y + b = 0, so y = -(w0*x + b) / w1
line_x = np.arange(0, 4)
line_y = -(weights[0][0] * line_x + bias[0]) / weights[0][1]
plt.plot(line_x, line_y)
plt.show()

Output:

Step1:read data...
Step2:fit by Perceptron...
Step3:get the weights and bias...
 Weights: [[ 0.20934094 -0.20843792]] 
  Intercept: [0.]
Step4:compute the accuracy...
 Accuracy:100.00
Step5:draw with the weights and bias...

Warning emitted during the run:

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\linear_model\stochastic_gradient.py:561: ConvergenceWarning: Maximum number of iteration reached before convergence. Consider increasing max_iter to improve the fit.
  ConvergenceWarning)

Resulting plot:

Analysis of the warning:

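ConvergenceWarning is raised during training by scikit-learn's linear models that are fitted with stochastic gradient descent (SGD), which include Perceptron. It means that the solver reached the maximum number of iterations, max_iter, before it satisfied the stopping criterion set by the tolerance tol. The model may therefore not have settled on a good solution yet, and more iterations might improve it.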

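There are several possible causes:

Data: the data may not be linearly separable, or the relationship between the features and the target may be too complex, so the algorithm struggles to find a good solution.

Parameter settings: max_iter is too low or tol is too strict, so the algorithm cannot converge within the allowed iterations. The code above combines max_iter=100 with tol=0, the strictest possible tolerance.

Learning rate: an unsuitable learning rate (eta0 for Perceptron) can prevent convergence. If it is too large, the model oscillates around the optimum; if it is too small, convergence is slow.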

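To resolve the warning, try the following:

        Increase max_iter: more iterations give the algorithm more chances to find a good solution.
        Adjust tol: if tol is too small, increase it so that the algorithm converges more easily.
        Adjust the learning rate: try different values and see whether convergence improves.
        Standardize the data: feature scaling often helps the algorithm converge faster.
        Simplify the model or the data: if the data set is too complex or too noisy, feature selection or data cleaning may be needed to simplify the problem.
        Try another model: if SGD still does not converge, try a different classifier, such as a ridge classifier (RidgeClassifier), logistic regression (LogisticRegression), or a support vector machine (SVM).

Finally, note that a model can still give acceptable predictions even when ConvergenceWarning appears. It is best to check its performance with cross-validation and other evaluation methods, and to make sure it meets the needs of your application.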
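The suggestions above can be sketched as follows. Since credit-overdue.csv is not available here, the sketch uses synthetic data from make_blobs as a stand-in, which is an assumption, and the particular values of max_iter and tol are illustrative rather than recommended settings.

```python
import warnings

from sklearn.datasets import make_blobs
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for credit-overdue.csv (assumption: two well-separated classes)
X, y = make_blobs(n_samples=200, centers=[[-3, -3], [3, 3]],
                  cluster_std=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Scale the features, allow more iterations, and use a non-zero tolerance
model = make_pipeline(
    StandardScaler(),
    Perceptron(max_iter=1000, tol=1e-3, eta0=0.1, random_state=0),
)

# Record any warnings so that ConvergenceWarning can be counted
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    model.fit(X_train, y_train)

n_convergence_warnings = sum(
    issubclass(w.category, ConvergenceWarning) for w in caught)
n_iter = model.named_steps["perceptron"].n_iter_   # epochs actually run
acc = model.score(X_test, y_test)

print("epochs run:", n_iter)
print("ConvergenceWarning count:", n_convergence_warnings)
print("test accuracy: %.2f" % (acc * 100.0))
```

If n_iter stays below max_iter, training stopped because the tolerance criterion was met, and no ConvergenceWarning is raised.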

2. Example 9.2 from Chapter 9 of Python Machine Learning:

from sklearn import svm
import numpy as np
from matplotlib import pyplot as plt

# Randomly generate two groups of points and shift them by [-2, 2] in opposite
# directions so that they form two clearly separated classes, 0 and 1
data = np.concatenate((np.random.randn(30,2)-[-2,2], np.random.randn(30,2)+[-2,2]))
target = [0] * 30 + [1] * 30

# Build the SVC model
clf = svm.SVC(kernel = 'linear')
clf.fit(data, target)

# Show the results
w = clf.coef_[0]
a = -w[0] / w[1]
print("Parameter w: ", w)
print("Parameter a: ", a)
print("Support vectors: ", clf.support_vectors_)
print("Parameter coef_: ", clf.coef_)

# Build the separating line from the fitted parameters
xx = np.linspace(-5, 5)
yy = a * xx - (clf.intercept_[0] / w[1])

# Dashed line through the first support vector (class 0)
b = clf.support_vectors_[0]
yy_Neg = a * xx + (b[1] - a * b[0])

# Dashed line through the last support vector (class 1)
b = clf.support_vectors_[-1]
yy_Pos = a * xx + (b[1] - a * b[0])

# Draw the separating line as a solid red line
plt.plot(xx, yy, 'r-')
# Draw the margin lines as dashed black lines
plt.plot(xx, yy_Neg, 'k--')
plt.plot(xx, yy_Pos, 'k--')

# Draw the support vectors and the scatter plot of the samples
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1])
plt.scatter(data[:, 0], data[:, 1], c = target, cmap = plt.cm.coolwarm)

plt.xlabel("X")
plt.ylabel("Y")
plt.title("Support Vector Classification")

plt.show()

Output:

Parameter w:  [-0.52166992  0.65630009]
Parameter a:  0.7948649241176944
Support vectors:  [[ 1.00320045 -0.60854632]
 [-0.48120049  1.25894197]]
Parameter coef_:  [[-0.52166992  0.65630009]]

Resulting plot:

Note: if you run the code exactly as printed in the book, it fails right away with the following error.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-10-55566be0e063> in <module>
      4 
      5 # Randomly generate two groups of points, shifted by [-2, 2] into two clear classes, 0 and 1
----> 6 data = np.concatenate(np.random.randn(30,2)-[-2,2], np.random.randn(30,2)+[-2,2])
      7 target = [0] * 30 + [1] * 30
      8 

TypeError: only integer scalar arrays can be converted to a scalar index
----------------------------------------------------------------------------

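Analysis of the error:

The message TypeError: only integer scalar arrays can be converted to a scalar index means that an array operation expected an integer scalar (a scalar index) somewhere but received a value of another type.

The offending line (line 6 of the book's example) is: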

data = np.concatenate(np.random.randn(30,2)-[-2,2], np.random.randn(30,2)+[-2,2])

The corrected code:

data = np.concatenate((np.random.randn(30,2)-[-2,2], np.random.randn(30,2)+[-2,2]))

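Spot the difference? The problem is how np.concatenate is called. Its first argument must be a single sequence, a tuple or a list, that contains all the arrays to join. Its second positional argument is axis. The book's code passes the two arrays as two separate arguments, so NumPy treats the second array as axis, fails to convert it to an integer, and raises the TypeError. Wrapping the two arrays in an extra pair of parentheses passes them as one tuple.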

If you want two groups of randomly generated data, you can also write it like this:

# First group: subtract [-2, 2]
group1 = np.random.randn(30, 2) - [-2, 2]

# Second group: add [-2, 2]
group2 = np.random.randn(30, 2) + [-2, 2]

# Join the two groups
data = np.concatenate((group1, group2))

The first group has [-2, 2] subtracted from it, the second group has [-2, 2] added to it, and the two groups are then joined.
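To see both behaviors side by side, the following sketch calls np.concatenate the way the book does and then the correct way. The seeded generator from np.random.default_rng is an assumption added for reproducibility; the original code uses np.random.randn.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator (assumption, for reproducibility)
group1 = rng.standard_normal((30, 2)) - [-2, 2]
group2 = rng.standard_normal((30, 2)) + [-2, 2]

# Wrong: the second array lands in the axis parameter
try:
    np.concatenate(group1, group2)
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__
    print("book version fails:", exc)

# Right: one tuple (or list) holding both arrays; axis defaults to 0
data = np.concatenate((group1, group2))
print("shape of the joined data:", data.shape)
```

The joined array has 60 rows and 2 columns: the 30 rows of the first group followed by the 30 rows of the second.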

3. Example 9.3 from Chapter 9 of Python Machine Learning:

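from sklearn import svm

# Sample features
x = [[2, 0], [1, 1], [2, 3]]
# Sample labels
y = [0, 0, 1]

# Build the SVC classifier
clf = svm.SVC(kernel = 'linear')
# Train the model
clf.fit(x, y)
print(clf)

# Get the support vectors
print(clf.support_vectors_)

# Get the indices of the support vectors in the original data
print(clf.support_)

# Get the number of support vectors for each class
print(clf.n_support_)

# Predict the class of (2, 0)
print(clf.predict([[2, 0]]))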

Output:

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
    decision_function_shape='ovr', degree=3, gamma='auto_deprecated',
    kernel='linear', max_iter=-1, probability=False, random_state=None,
    shrinking=True, tol=0.001, verbose=False)
[[1. 1.]
 [2. 3.]]
[1 2]
[1 1]
[0]

4. Example 9.4 from Chapter 9 of Python Machine Learning:

from sklearn import svm

# Sample features
x = [[0, 0], [0, 1], [1, 0],[1, 1]]
# Sample labels
y = [0, 1, 1, 0]

# Build the SVC classifier
clf = svm.SVC(kernel = 'rbf')
# Train the model
clf.fit(x, y)

# Predict the class of each of the 4 sample points
print('Prediction for sample [0, 0]:',clf.predict([[0, 0]]))
print('Prediction for sample [0, 1]:',clf.predict([[0, 1]]))
print('Prediction for sample [1, 0]:',clf.predict([[1, 0]]))
print('Prediction for sample [1, 1]:',clf.predict([[1, 1]]))

Output:

Prediction for sample [0, 0]: [0]
Prediction for sample [0, 1]: [1]
Prediction for sample [1, 0]: [1]
Prediction for sample [1, 1]: [0]

Warning emitted during the run:

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\svm\base.py:193: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.
  "avoid this warning.", FutureWarning)

Analysis of the warning:

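The warning says that starting with version 0.22, the default value of the gamma parameter in sklearn.svm changes from 'auto' to 'scale'. gamma is the coefficient of the RBF (radial basis function) kernel. It controls how far the influence of a single training sample reaches, and therefore the shape of the decision boundary.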

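With 'auto', gamma is set to 1 / n_features, so it depends only on the number of features. With 'scale', gamma is set to 1 / (n_features * X.var()), so it also accounts for the variance of the training data. This handles unscaled features better and avoids performance problems caused by differences in feature scale. To silence the warning, set gamma explicitly when creating the SVM model: 'auto' keeps the old behavior, and 'scale' adopts the new default ahead of time.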

# Set gamma to 'auto' explicitly to avoid the warning
clf = svm.SVC(gamma='auto')

# Or, to adopt the upcoming default, set gamma to 'scale'
clf = svm.SVC(gamma='scale')
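As a rough illustration of what the two settings mean, the sketch below computes both gamma values by hand for the four samples of Example 9.4 and then fits the model with gamma='scale'. The formulas follow the scikit-learn documentation for version 0.22 and later; the variable names gamma_auto and gamma_scale are only illustrative.

```python
import numpy as np
from sklearn import svm

# XOR-style samples from Example 9.4
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

n_features = X.shape[1]
gamma_auto = 1.0 / n_features                 # what gamma='auto' resolves to
gamma_scale = 1.0 / (n_features * X.var())    # what gamma='scale' resolves to
print("gamma for 'auto': ", gamma_auto)
print("gamma for 'scale':", gamma_scale)

# Setting gamma explicitly avoids the FutureWarning on old releases
clf = svm.SVC(kernel="rbf", gamma="scale")
clf.fit(X, y)
pred = clf.predict(X)
print("predictions:", pred)
```

For this data 'auto' gives 0.5 and 'scale' gives 2.0, because the variance of all entries of X is 0.25. Both settings classify the four points correctly here, but on data with larger feature scales the two values can differ widely.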

5. Experiment 9-1 from Chapter 9 of Python Machine Learning:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.preprocessing import PolynomialFeatures
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline

# Generate half-moon shaped data
X, y = make_moons(n_samples = 100, noise = 0.1, random_state = 1)
moonAxe = [-1.5, 2.5, -1, 1.5]

# Plot the data samples
def dispData(x, y, moonAxe):
    pos_x0 = [x[i, 0] for i in range(len(y)) if y[i] == 1]
    pos_x1 = [x[i, 1] for i in range(len(y)) if y[i] == 1]
    neg_x0 = [x[i, 0] for i in range(len(y)) if y[i] == 0]
    neg_x1 = [x[i, 1] for i in range(len(y)) if y[i] == 0]

    plt.plot(pos_x0, pos_x1, "bo")
    plt.plot(neg_x0, neg_x1, "r^")

    plt.axis(moonAxe)

    plt.xlabel("x")
    plt.ylabel("y")

# Plot the decision regions
def dispPredict(clf, moonAxe):
    # Generate a grid of points covering the plot range
    d0 = np.linspace(moonAxe[0], moonAxe[1], 200)
    d1 = np.linspace(moonAxe[2], moonAxe[3], 200)
    x0, x1 = np.meshgrid(d0, d1)
    X = np.c_[x0.ravel(), x1.ravel()]
    # Predict on the grid and draw the result
    y_pred = clf.predict(X).reshape(x0.shape)
    plt.contourf(x0, x1, y_pred, alpha = 0.8)

# 1. Plot the samples
dispData(X, y, moonAxe)
# 2. Build the pipeline that chains the three steps
#    (Pipeline expects a list of (name, estimator) tuples)
polynomial_svm_clf = Pipeline([
    ("multiFeature", PolynomialFeatures(degree = 3)),
    ("NumScale", StandardScaler()),
    ("SVC", LinearSVC(C = 100)),
])
# 3. Train the pipeline
polynomial_svm_clf.fit(X,y)
# 4. Plot the decision regions
dispPredict(polynomial_svm_clf, moonAxe)
# 5. Show the figure
plt.title('Linear SVM classifies Moons data')
plt.show()

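Result (plot):

Analysis of the warning:

The ConvergenceWarning raised by this experiment indicates that Liblinear, the solver behind LinearSVC, did not converge to the optimal solution within the allowed iterations. Liblinear is a library for linear classification with support vector machines. When it is used through scikit-learn it can run into convergence problems, especially when the data set is complex or the parameters are poorly chosen.

To resolve it, try the following: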

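Increase the number of iterations: raise max_iter to give the algorithm more time to find the optimum.
```
clf = LinearSVC(C=100, max_iter=10000)  # raise max_iter (the default is 1000)
```
Adjust the regularization parameter C: C controls the trade-off between penalizing training errors and keeping the decision boundary simple. A large C, such as the C=100 used here, regularizes weakly and can make the optimization slow to converge. Try a smaller C and see whether that helps.
Standardize the features: Liblinear is sensitive to feature scale, so make sure the data is standardized or normalized, for example with sklearn.preprocessing.StandardScaler. The pipeline above already includes this step.
Change the solver: LinearSVC always uses Liblinear and has no solver parameter. If it keeps failing to converge, try solving the primal problem with dual=False, or switch to another implementation such as SVC(kernel='linear') or SGDClassifier(loss='hinge').
Check the data set: sometimes the data itself is the problem. Outliers and noise can both hinder convergence, and cleaning and preprocessing the data may help.
Update scikit-learn: make sure you are using a recent version of scikit-learn, because newer releases may have fixed convergence problems found in older ones.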
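A few of these suggestions can be combined in one sketch. The settings below (C=10, max_iter=10000, dual=False) are assumptions chosen for illustration, not values from the book, and whether the warning disappears can depend on the scikit-learn version, so the sketch counts the warnings instead of assuming there are none.

```python
import warnings

from sklearn.datasets import make_moons
from sklearn.exceptions import ConvergenceWarning
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

X, y = make_moons(n_samples=100, noise=0.1, random_state=1)

# Assumed settings for illustration: smaller C, more iterations, primal solver
clf = Pipeline([
    ("multiFeature", PolynomialFeatures(degree=3)),
    ("NumScale", StandardScaler()),
    ("SVC", LinearSVC(C=10, max_iter=10000, dual=False)),
])

# Record any warnings so that ConvergenceWarning can be counted
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    clf.fit(X, y)

n_convergence_warnings = sum(
    issubclass(w.category, ConvergenceWarning) for w in caught)
train_acc = clf.score(X, y)

print("ConvergenceWarning count:", n_convergence_warnings)
print("training accuracy: %.2f" % train_acc)
```

Comparing the warning count and the accuracy for a few values of C and max_iter is a quick way to see which change actually matters for a given data set.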