Support Vector Machine Study Notes (1)

羊咩咩咩咩咩

Last modified on 2022-07-20 10:22:18

Categories: Machine Learning, python. Tags: support vector machine, machine learning

First published on 2022-07-15 17:22:27

Article link: https://blog.csdn.net/lovexyyforever/article/details/125808447

A support vector machine (SVM) is a generalized linear classifier that classifies data in a supervised learning setting; its decision boundary is the maximum-margin hyperplane computed from the training samples.

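An SVM uses the hinge loss to compute the empirical risk and adds a regularization term to the optimization problem to control the structural risk, which makes it a sparse and robust classifier. With the kernel method, an SVM can also perform nonlinear classification, and it is one of the most common kernel learning methods.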
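
To make the two terms concrete, here is a minimal NumPy sketch of the hinge loss and of a soft-margin objective built from it. The function names, the toy data, and the weight vector are hypothetical and chosen only for illustration; the form 0.5 * ||w||^2 + C * (sum of hinge losses) is the usual textbook formulation, assumed here rather than taken from the notes.

```python
import numpy as np

def hinge_losses(w, b, X, y):
    """Per-sample hinge loss max(0, 1 - y * (w.x + b)) for labels y in {-1, +1}."""
    scores = X @ w + b
    return np.maximum(0.0, 1.0 - y * scores)

def soft_margin_objective(w, b, X, y, C=1.0):
    """Regularization term 0.5 * ||w||^2 plus C times the summed hinge loss."""
    return 0.5 * np.dot(w, w) + C * hinge_losses(w, b, X, y).sum()

# Toy data (hypothetical values)
X_toy = np.array([[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]])
y_toy = np.array([1, 1, -1])
w_toy = np.array([1.0, 0.0])
b_toy = 0.0

losses = hinge_losses(w_toy, b_toy, X_toy, y_toy)
objective = soft_margin_objective(w_toy, b_toy, X_toy, y_toy, C=1.0)
print("per-sample hinge loss:", losses)
print("objective:", objective)
```

Only the second sample lies inside the margin (its score is 0.5, below 1), so it is the only one that contributes a nonzero loss. Samples beyond the margin contribute nothing, which is where the sparsity of the solution comes from.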

Classification with support vector machines

(1) Linear SVM classification

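A linear SVM fits a hyperplane and optimizes it so that the distance from the hyperplane to the closest training points (the margin) is as large as possible; this is called large margin classification. SVMs are very sensitive to feature scales, because a feature with a large range dominates the margin, so the features should be standardized before training.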
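
As a quick check of what standardization does, the sketch below applies StandardScaler to the two iris petal features and prints the mean and standard deviation of each column before and after. The variable names are hypothetical and kept separate from the ones used in the rest of these notes.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

iris_demo = datasets.load_iris()
X_raw = iris_demo["data"][:, (2, 3)]  # petal length, petal width (cm)

# Subtract each column's mean and divide by its standard deviation
X_scaled = StandardScaler().fit_transform(X_raw)

print("raw    mean/std:", X_raw.mean(axis=0), X_raw.std(axis=0))
print("scaled mean/std:", X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

After scaling, both features have zero mean and unit variance, so neither one dominates the distance computations behind the margin.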

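If every instance must be classified strictly, so that all instances lie off the margin and on the correct side, this is called hard margin classification. If some margin violations are tolerated in exchange for a wider margin, it is called soft margin classification.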

import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
iris = datasets.load_iris()
X = iris['data'][:, (2, 3)]  ## petal length, petal width
Y = (iris['target'] == 2).astype(np.float64)  ## 1.0 for Iris virginica, 0.0 otherwise
## Standardize the features, then fit a linear SVM with the hinge loss
svm_scaler = Pipeline([('std_scaler', StandardScaler()),
                       ('svm', LinearSVC(C=1, loss="hinge"))])
svm_scaler.fit(X, Y)

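In LinearSVC, the hyperparameter C sets the penalty on margin violations: the larger C is, the more heavily misclassified instances are penalized (and the weaker the regularization). loss selects the loss function.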
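
One way to see the effect of C is to count support vectors for a small and a large value. LinearSVC does not expose its support vectors, so this sketch swaps in SVC with a linear kernel; the two C values (0.01 and 100) and the variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris_c = datasets.load_iris()
X_c = iris_c["data"][:, (2, 3)]            # petal length, petal width
y_c = (iris_c["target"] == 2).astype(int)  # 1 for Iris virginica, 0 otherwise

n_support = {}
for C in (0.01, 100):
    clf = Pipeline([("std_scaler", StandardScaler()),
                    ("svm", SVC(kernel="linear", C=C))])
    clf.fit(X_c, y_c)
    # n_support_ holds the number of support vectors per class
    n_support[C] = int(clf.named_steps["svm"].n_support_.sum())
    print(f"C={C}: {n_support[C]} support vectors")
```

A small C tolerates many margin violations, so the margin is wide and many instances end up as support vectors. A large C penalizes violations heavily and leaves far fewer.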

Next, the same classifier is implemented in two other ways.

## Using SVC with a linear kernel
from sklearn.svm import SVC
svm_scaler = Pipeline([('std_scaler', StandardScaler()),
                       ('svm', SVC(kernel='linear', C=1))])
svm_scaler.fit(X, Y)
## Solving with stochastic gradient descent
from sklearn.linear_model import SGDClassifier
sgd_scaler = Pipeline([('std_scaler', StandardScaler()),
                       ('sgd', SGDClassifier(loss='hinge', alpha=1))])  ## alpha is the regularization strength
sgd_scaler.fit(X, Y)

The results turn out to be about the same.
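
A rough way to compare the three approaches is to fit them on the same data and look at training accuracy and at predictions for a couple of new flowers. This is only a sketch: the two test flowers are hypothetical measurements, random_state and max_iter are added for reproducibility, and the SGD variant assumes alpha = 1/(m*C), a common way to match SGDClassifier's regularization to C, which is not the setting used in the code above.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC

iris_p = datasets.load_iris()
X_p = iris_p["data"][:, (2, 3)]            # petal length, petal width
y_p = (iris_p["target"] == 2).astype(int)  # 1 for Iris virginica, 0 otherwise

C = 1
m = len(X_p)
models = {
    "LinearSVC": LinearSVC(C=C, loss="hinge", max_iter=10000, random_state=42),
    "SVC(linear)": SVC(kernel="linear", C=C),
    # Assumption: alpha = 1 / (m * C) to roughly match the C used above
    "SGDClassifier": SGDClassifier(loss="hinge", alpha=1 / (m * C), random_state=42),
}

# Hypothetical flowers: a large-petal one and a small-petal one
new_flowers = np.array([[6.5, 2.3], [1.5, 0.2]])

accs, preds = {}, {}
for name, model in models.items():
    clf = Pipeline([("std_scaler", StandardScaler()), ("clf", model)])
    clf.fit(X_p, y_p)
    accs[name] = clf.score(X_p, y_p)
    preds[name] = clf.predict(new_flowers).tolist()
    print(f"{name}: train accuracy={accs[name]:.3f}, predictions={preds[name]}")
```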

(2) Nonlinear SVM classification

Approach 1: use PolynomialFeatures to add polynomial combinations of the original features as new, higher-degree features.
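
To show what PolynomialFeatures actually produces, this small sketch expands a single two-feature sample with degree=2 (a lower degree than the one used in these notes, chosen to keep the output short). The sample values are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

sample = np.array([[2.0, 3.0]])  # one instance with features x1=2, x2=3

poly = PolynomialFeatures(degree=2)
expanded = poly.fit_transform(sample)

# Columns are: 1, x1, x2, x1^2, x1*x2, x2^2
print(expanded)
```

With degree=3, as used in these notes, the same two inputs expand to 10 columns. A linear model fitted on the expanded columns corresponds to a curved decision boundary in the original feature space.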

from sklearn.datasets import make_moons
from sklearn.preprocessing import PolynomialFeatures  ## expands the inputs into polynomial features
X, Y = make_moons(n_samples=100, noise=0.15)
polynomial_svm_scaler = Pipeline([('polynomial', PolynomialFeatures(degree=3)),
                                  ('std_scaler', StandardScaler()),
                                  ('svm', LinearSVC(C=10, loss='hinge'))])
polynomial_svm_scaler.fit(X, Y)
import matplotlib.pyplot as plt
def plot_dataset(X, y, axes):
    ## Scatter plot of the two classes: blue squares for class 0, green triangles for class 1
    plt.plot(X[:, 0][y==0], X[:, 1][y==0], "bs")
    plt.plot(X[:, 0][y==1], X[:, 1][y==1], "g^")
    plt.axis(axes)
    plt.grid(True, which='both')
    plt.xlabel(r"$x_1$", fontsize=20)
    plt.ylabel(r"$x_2$", fontsize=20, rotation=0)

plot_dataset(X, Y, [-1.5, 2.5, -1, 1.5])
plt.show()
def plot_predictions(clf, axes):
    ## Evaluate the classifier on a 100 x 100 grid covering the plotting area
    x0s = np.linspace(axes[0], axes[1], 100)
    x1s = np.linspace(axes[2], axes[3], 100)
    x0, x1 = np.meshgrid(x0s, x1s)
    X = np.c_[x0.ravel(), x1.ravel()]
    y_pred = clf.predict(X).reshape(x0.shape)
    y_decision = clf.decision_function(X).reshape(x0.shape)
    ## Shade the predicted classes and overlay the decision function values
    plt.contourf(x0, x1, y_pred, cmap=plt.cm.brg, alpha=0.2)
    plt.contourf(x0, x1, y_decision, cmap=plt.cm.brg, alpha=0.1)

plot_predictions(polynomial_svm_scaler, [-1.5, 2.5, -1, 1.5])
plot_dataset(X, Y, [-1.5, 2.5, -1, 1.5])

Resulting plot

Approach 2: use the SVM kernel trick.

## Polynomial kernel
poly_kernel_svm_clf = Pipeline([('scaler', StandardScaler()),
                                ('svm_clf', SVC(kernel="poly", degree=3, coef0=1, C=5))])  ## coef0 controls how much the model is influenced by high-degree versus low-degree terms
poly_kernel_svm_clf.fit(X, Y)
plot_predictions(poly_kernel_svm_clf, [-1.5, 2.5, -1, 1.5])
plot_dataset(X, Y, [-1.5, 2.5, -1, 1.5])
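
The idea behind the kernel trick is that the kernel value equals a dot product in an expanded feature space, without the expansion ever being computed. The sketch below checks this for a degree-2 polynomial kernel (x.y + 1)^2 on two hypothetical 2-D points. The explicit map phi is the standard one for this kernel, and gamma=1 is assumed so that the formula matches sklearn's polynomial_kernel (SVC itself defaults to gamma='scale').

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

def phi(v):
    """Explicit degree-2 feature map whose dot product equals (x.y + 1)^2."""
    x1, x2 = v
    r2 = np.sqrt(2.0)
    return np.array([1.0, r2 * x1, r2 * x2, x1 ** 2, r2 * x1 * x2, x2 ** 2])

# Two hypothetical points
a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

explicit = phi(a) @ phi(b)        # dot product in the 6-D expanded space
kernel = (a @ b + 1.0) ** 2       # same value computed in the original 2-D space
sk_kernel = polynomial_kernel(a.reshape(1, -1), b.reshape(1, -1),
                              degree=2, gamma=1.0, coef0=1.0)[0, 0]

print(explicit, kernel, sk_kernel)
```

All three numbers agree, which is why SVC(kernel="poly") can behave like a model trained on polynomial features without paying for the extra columns.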

Gaussian RBF kernel

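Another technique is to add similarity features, which measure how closely each instance resembles a given landmark, so that each sample gains several new features. The simplest choice is to place a landmark at the location of every instance, which creates many dimensions (one per instance).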
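
The similarity features can be computed by hand. This sketch assumes the Gaussian RBF exp(-gamma * ||x - l||^2) as the similarity function and uses a hypothetical 1-D dataset with two landmarks; the helper name rbf_similarity is made up for illustration. The result is cross-checked against sklearn's rbf_kernel.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def rbf_similarity(X, landmarks, gamma):
    """Return one similarity feature per landmark: exp(-gamma * squared distance)."""
    diffs = X[:, np.newaxis, :] - landmarks[np.newaxis, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)

# Hypothetical 1-D instances and landmarks
X_1d = np.array([[-1.0], [0.0], [2.0]])
landmarks = np.array([[-2.0], [1.0]])

features = rbf_similarity(X_1d, landmarks, gamma=0.3)
print(features)  # shape (3, 2): one row per instance, one column per landmark
```

If every training instance is used as a landmark, a dataset with m instances gets m similarity features. The RBF kernel gives the same effect without building those columns explicitly.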

rbf_kernel_svm_clf = Pipeline([('std_scaler', StandardScaler()),
                               ('svm_clf', SVC(kernel='rbf', gamma=5, C=0.001))])
rbf_kernel_svm_clf.fit(X, Y)
plot_predictions(rbf_kernel_svm_clf, [-1.5, 2.5, -1, 1.5])  ## plot the RBF model, not the polynomial one
plot_dataset(X, Y, [-1.5, 2.5, -1, 1.5])

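How to read the parameters: gamma controls the width of the bell-shaped curve, that is, the range of influence of each individual instance (a larger gamma gives a narrower bell and a smaller range); C is the penalty term; kernel selects the kernel function.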
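
To get a feel for how gamma and C interact, the sketch below fits the RBF model for a small grid of values and reports training accuracy and the number of support vectors. The grid (gamma in {0.1, 5}, C in {0.001, 1000}) and random_state=42 are assumptions added for illustration and reproducibility.

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_m, y_m = make_moons(n_samples=100, noise=0.15, random_state=42)

results = {}
for gamma in (0.1, 5):
    for C in (0.001, 1000):
        clf = Pipeline([("std_scaler", StandardScaler()),
                        ("svm_clf", SVC(kernel="rbf", gamma=gamma, C=C))])
        clf.fit(X_m, y_m)
        train_acc = clf.score(X_m, y_m)
        n_sv = int(clf.named_steps["svm_clf"].n_support_.sum())
        results[(gamma, C)] = (train_acc, n_sv)
        print(f"gamma={gamma}, C={C}: train accuracy={train_acc:.2f}, support vectors={n_sv}")
```

As a rule of thumb, if the model overfits, reduce gamma or C; if it underfits, increase them. Training accuracy alone cannot detect overfitting, so a held-out set or cross-validation is needed to choose values in practice.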

SVM regression

Using LinearSVR

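## SVM regression
np.random.seed(42)
m = 50
X = 2 * np.random.rand(m, 1)
Y = (4 + 3 * X + np.random.randn(m, 1)).ravel()  ## noisy linear data: y = 4 + 3x + Gaussian noise
from sklearn.svm import LinearSVR
svm_reg = LinearSVR(epsilon=1.5)  ## epsilon sets the half-width of the margin (tube) around the fit
svm_reg.fit(X, Y)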

Using SVR

from sklearn.svm import SVR
svm_poly_reg = SVR(kernel='poly', degree=2, C=100, epsilon=0.1)  ## kernelized regression with a degree-2 polynomial kernel
svm_poly_reg.fit(X, Y)
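
In SVM regression the goal is reversed: the model tries to fit as many instances as possible inside a tube of half-width epsilon around its prediction. The sketch below regenerates the same kind of noisy linear data and counts how many training points fall inside the tube for two epsilon values. The value 0.5, the use of random_state and max_iter, and the variable names are assumptions added for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVR

rng = np.random.RandomState(42)
X_r = 2 * rng.rand(50, 1)
y_r = (4 + 3 * X_r + rng.randn(50, 1)).ravel()  # y = 4 + 3x + Gaussian noise

inside = {}
for eps in (0.5, 1.5):
    reg = LinearSVR(epsilon=eps, random_state=42, max_iter=10000)
    reg.fit(X_r, y_r)
    residuals = np.abs(y_r - reg.predict(X_r))
    # Points with |residual| <= epsilon lie inside the tube and incur no loss
    inside[eps] = int((residuals <= eps).sum())
    print(f"epsilon={eps}: {inside[eps]} of {len(y_r)} points inside the tube")
```

A wider tube contains more points, and points inside the tube do not affect the fitted model.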
