【SVM】Soft-margin SVM: code and detailed walkthrough

This post shows how to implement soft-margin classification with a support vector machine (SVM) in Python, covering the linear, polynomial, and Gaussian kernels. The code includes data preprocessing, setting up the optimization problem, computing the support vectors, training the model, and making predictions, illustrating SVM on a binary classification problem.

Imports

import matplotlib.pyplot as plt
import numpy as np
import scipy as sp
import cvxopt
from cvxopt import matrix
from cvxopt import solvers
from numpy import linalg

10-------------------------------------------------

Make the function H = make_H_k(X, t, kernel_function, param) to compute the matrix $H$ for a data matrix X and an associated target vector t, given a kernel function kernel_function with parameter param.
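In the kernelized dual, each entry of $H$ combines the targets with the kernel evaluated on the corresponding pair of training points, which is exactly what the code below builds:

$$H_{nm} = t_n \, t_m \, K(x_n, x_m)$$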


def make_H_k(X, t, kernel_function, param):
    n_samples, n_feature = X.shape
    X_prime = np.zeros((n_samples, n_samples))
    for i in range(n_samples):
        for j in range(n_samples):
            X_prime[i, j] = kernel_function(X[i], X[j], param)
    H = np.outer(t, t) * X_prime
    return H

- In this step, kernel_function can be the linear kernel (matching the example in the lecture notes), the polynomial kernel (for the soft-margin SVM), or the Gaussian kernel. The kernel takes as input pairs of rows $x_i, x_j \in X$.
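A minimal sanity check (my own addition, not part of the assignment): with the linear kernel, make_H_k should reproduce $(t t^T) \circ (X X^T)$ computed directly with NumPy. It assumes linear_kernel_function from step 17 below is already defined; the toy data is hypothetical.

# Hypothetical toy data just to exercise make_H_k
X_check = np.array([[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]])
t_check = np.array([1.0, -1.0, 1.0])

H_kernel = make_H_k(X_check, t_check, linear_kernel_function, None)
H_direct = np.outer(t_check, t_check) * (X_check @ X_check.T)
print(np.allclose(H_kernel, H_direct))  # expected: True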

11-------------------------------------------------

Make the function P, q, G, h, A, b = setup_optimization_soft_k(X, t, C, kernel_function, param) to compute the necessary matrices to solve the QP problem associated to the soft margin kernelized classification task using the library cvxopt. The function receives in input a data matrix X, an associated target vector t, a regularization parameter C and a kernel function kernel_function with parameter param.
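For reference (my own summary of the standard soft-margin dual, not something stated in the assignment), the matrices returned below encode the following QP in the dual variables $\alpha$:

$$\min_{\alpha} \ \tfrac{1}{2}\alpha^T H \alpha - \mathbf{1}^T \alpha \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad t^T \alpha = 0$$

so cvxopt's solvers.qp(P, q, G, h, A, b) is called with $P = H$, $q = -\mathbf{1}$, the stacked inequality constraints $-\alpha \le 0$ and $\alpha \le C$ (through G and h), and the equality constraint $t^T \alpha = 0$ (through A and b).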

def setup_optimization_soft_k(X, t, C, kernel_function, param):
    m, n = X.shape
    t = t.reshape(-1, 1).astype(float)
    H = make_H_k(X, t, kernel_function, param)
    P = matrix(H)
    q = matrix(-np.ones((m, 1)))
    G = matrix(np.vstack((-np.eye(m), np.eye(m))))
    h = matrix(np.hstack((np.zeros(m), np.ones(m) * C)))
    A = matrix(t.reshape(1, -1))
    b = matrix(np.zeros(1))

    return P, q, G, h, A, b

- No changes made here.

12-------------------------------------------------

Make the function support_vectors, support_targets, support_alphas, support_ids = compute_support_vectors(X, t, alphas) to compute:

  • support_vectors: the support vectors
  • support_targets: the target vector associated to the support vectors
  • support_alphas: the vector of dual variables $\alpha_i$ associated to the support vectors
  • support_ids: a boolean vector that indicates if a position is associated to a support vector (w.r.t. the original data matrix) (i.e. if the first instance is a support vector and the second one is not, then support_ids = [True, False])

def compute_support_vectors(X, t, alphas):
    t = t.reshape(-1, 1).astype(float)
    support_ids = (alphas > 1e-4).flatten()
    support_vectors = X[support_ids]
    support_targets = t[support_ids]
    support_alphas = alphas[support_ids]
    return support_vectors, support_targets, support_alphas, support_ids

- No changes made here; dual variables below the numerical threshold 1e-4 are treated as zero, so only instances with effectively positive $\alpha_i$ are kept as support vectors.

13-------------------------------------------------

Make the function model = train_slmkc(X, t, C, kernel_function, param) to compute the necessary elements to represent a kernelized soft large margin classifier trained over an input data matrix X with an associated target vector t, for a regularization parameter C and a kernel function kernel_function with parameter param.

Note: you may implement model as a tuple containing information on the support vectors and the dual variables, i.e. model = (support_vectors,support_targets,support_alphas, support_ids).


def train_slmkc(X, t, C, kernel_function, param):
    P, q, G, h, A, b = setup_optimization_soft_k(X, t, C, kernel_function, param)

    # Setting solver parameters (change default to decrease tolerance)
    solvers.options['show_progress'] = False
    solvers.options['abstol'] = 1e-10
    solvers.options['reltol'] = 1e-10
    solvers.options['feastol'] = 1e-10

    # Run solver
    solution_dict = solvers.qp(P, q, G, h, A, b)
    alphas = np.array(solution_dict['x'])

    support_vectors, support_targets, support_alphas, support_ids = compute_support_vectors(X, t, alphas)

    model = (support_vectors, support_targets, support_alphas, support_ids)

    return model

- No changes made here.
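A minimal usage sketch (my own addition; the toy data and expected behaviour are hypothetical), assuming linear_kernel_function from step 17 below is already defined:

# Two linearly separable clusters with labels in {-1, +1}
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 3.0], [3.0, 4.0]])
t_toy = np.array([-1, -1, 1, 1])

model = train_slmkc(X_toy, t_toy, C=1.0, kernel_function=linear_kernel_function, param=None)
support_vectors, support_targets, support_alphas, support_ids = model
print(support_ids)  # boolean mask over the 4 training points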

14-------------------------------------------------

Given the kernelized soft large margin discriminant function of the form:

$$y(x) = \sum_{n \in SV} \alpha_n t_n K(x_n, x) + b$$

and given the encoded model to represent the necessary elements for a trained kernelized soft large margin classifier

make the function b = compute_bias(kernel_function, param, C, model) to compute the offset/bias term b, for a regularization parameter C and a kernel function kernel_function with parameter param.

def compute_bias(kernel_function, param, C, model):
    support_vectors, support_targets, support_alphas, support_ids = model
    if param is None:
        # Linear kernel: recover w explicitly, then average b over the support vectors
        w = ((support_targets * support_alphas).T @ support_vectors).reshape(-1, 1)
        b = np.mean(support_targets - np.dot(support_vectors, w))
    else:
        # Non-linear kernel: w only exists in feature space, so return w = None and
        # compute b from the dual expansion, averaged over the support vectors
        w = None
        coeffs = (support_alphas * support_targets).flatten()
        targets = support_targets.flatten()
        b = 0.0
        for n in range(len(targets)):
            k_n = np.array([kernel_function(support_vectors[n], sv, param)
                            for sv in support_vectors])
            b += targets[n] - np.sum(coeffs * k_n)
        b /= len(support_alphas)
    return w, b

- The key point here is the computation of b. For the linear kernel it is the same formula as in the lecture notes. For the polynomial or Gaussian kernel, rearrange the discriminant function above to obtain the formula below for b; in that case w does not exist in input space, so no formula for w is given and the function returns w = None.

$$b = y(x) - \sum_{n \in SV} \alpha_n t_n K(x_n, x)$$
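In practice, and this is what compute_bias above implements, $y(x_n)$ is replaced by the target $t_n$ at each support vector and the result is averaged over all support vectors for numerical stability:

$$b = \frac{1}{|SV|} \sum_{n \in SV} \Big( t_n - \sum_{m \in SV} \alpha_m t_m K(x_n, x_m) \Big)$$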

15-------------------------------------------------

Make the function scores = score_slmkc(X_test, kernel_function, param, C, model) to compute the scores for the kernelized soft large margin discriminant function over the data matrix X_test, for a regularization parameter C, a kernel function kernel_function with parameter param and an encoded model model.

def score_slmkc(X_test, kernel_function, param, C, model):
    w, b = compute_bias(kernel_function, param, C, model)
    if param is None:
        # Linear kernel: score with the explicit weight vector
        scores = np.dot(X_test, w) + b
        return scores.flatten()
    else:
        # Non-linear kernel: evaluate the dual expansion over the support vectors
        support_vectors, support_targets, support_alphas, support_ids = model
        y_predict = np.zeros(len(X_test))
        for i in range(len(X_test)):
            s = 0.0
            for a, sv_y, sv in zip(support_alphas.flatten(), support_targets.flatten(), support_vectors):
                s += a * sv_y * kernel_function(X_test[i], sv, param)
            y_predict[i] = s
        return y_predict + b

The linear and non-linear cases are handled separately: in the linear case the score is computed as in the lecture notes, while in the non-linear case it is evaluated from the dual expansion, as shown in the code above.

16-------------------------------------------------

Make the function preds = test_slmkc(X_test, kernel_function, param, C, model) to predict the class label for the kernelized soft large margin discriminant function over the data matrix X_test, for a regularization parameter C, a kernel function kernel_function with parameter param and an encoded model model.
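The predicted label is simply the sign of the discriminant function:

$$\hat{t}(x) = \operatorname{sign}\big(y(x)\big)$$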

def test_slmkc(X_test, kernel_function, param, C, model):
    preds = np.sign(score_slmkc(X_test, kernel_function, param, C, model)).astype(int)
    return preds.flatten()
# this function is provided to you, you do not need to alter it in any way.
class MySoftLargeMarginKernelClassfier(object):
    def __init__(self, C, kernel_function, param):
        self.kernel_function = kernel_function
        self.param = param
        self.C = C
        self.support_vectors = None
        self.support_targets = None
        self.support_alphas = None
        self.support_ids = None

    def fit(self, X_train, y_train):
        self.model = train_slmkc(X_train, y_train, self.C, self.kernel_function, self.param)
        self.support_vectors, self.support_targets, self.support_alphas, self.support_ids = self.model
        return self

    def predict(self, X_test):
        return test_slmkc(X_test, self.kernel_function, self.param, self.C, self.model)

    def decision_function(self, X_test):
        return score_slmkc(X_test, self.kernel_function, self.param, self.C, self.model)

    def predict_proba(self, X_test):
        scores = self.decision_function(X_test)
        probs = 1 / (1 + np.exp(-scores))
        return probs

- No changes made here.

17-------------------------------------------------

Make the function linear_kernel_function(x1, x2, param=None) corresponding to the linear kernel defined as:
$$K(x_1, x_2) = x_1^T x_2$$
where $x_1$ and $x_2$ are vectors.

def linear_kernel_function(x1, x2, param=None):
    x_prime = x1.T.dot(x2)
    return x_prime

This is the linear kernel.

18-------------------------------------------------

Make the function polynomial_kernel_function(x1, x2, param) corresponding to the polynomial kernel defined by the parameter $r =$ param:
$$K(x_1, x_2) = (1 + x_1^T x_2)^r$$
where $x_1$ and $x_2$ are vectors.

def polynomial_kernel_function(x1, x2, param):
    result = np.power(np.dot(x1.T, x2) + 1, param)
    return result
def gaussian_kernel(x, y, sigma=5.0):
    return np.exp(-linalg.norm(x - y) ** 2 / (2 * (sigma ** 2)))

These are the polynomial kernel and the Gaussian kernel; the Gaussian kernel is my own extension. Note that gaussian_kernel receives param as its third positional argument, so param plays the role of σ.
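A quick illustrative check (my own addition; the vectors are hypothetical) of how param is interpreted by each kernel:

x1 = np.array([1.0, 2.0])
x2 = np.array([0.5, -1.0])

print(linear_kernel_function(x1, x2, None))    # x1·x2 = -1.5
print(polynomial_kernel_function(x1, x2, 3))   # (1 + x1·x2)^3 = (-0.5)^3 = -0.125
print(gaussian_kernel(x1, x2, 5.0))            # exp(-||x1-x2||^2 / (2·5^2)) ≈ 0.831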

Testing

def plot(X, y, support_vectors=None, predict_func=None):
    cmap = 'bwr'
    res = 100
    x_min, x_max = X[:, 0].min(), X[:, 0].max()
    y_min, y_max = X[:, 1].min(), X[:, 1].max()
    delta = max((x_max - x_min) / 5, (y_max - y_min) / 5)
    x_min -= delta
    y_min -= delta
    x_max += delta
    y_max += delta
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, res), np.linspace(y_min, y_max, res))
    if predict_func is None:
        Z = np.ones(xx.shape) * .5
    else:
        Z = predict_func(np.hstack([xx.reshape(-1, 1), yy.reshape(-1, 1)]))
        if Z.ndim == 1:
            Z = Z.reshape(xx.shape)
        else:
            Z = Z[:, -1].reshape(xx.shape)
    if support_vectors is None:
        edge_colors = 'w'
        sizes = 50
    else:
        edge_colors = ['k' if s else 'w' for s in support_vectors]
        sizes = [100 if s else 50 for s in support_vectors]
    plt.figure(figsize=(5, 5))
    plt.contourf(xx, yy, Z, levels=20, cmap=cmap, alpha=.5)
    plt.scatter(X[:, 0], X[:, 1], c=y, s=sizes, edgecolors=edge_colors, cmap=cmap)
    plt.grid()
    plt.axis('equal')
    plt.show()


# Just run the following code, do not modify it

from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, n_repeated=0, n_classes=2,
                           n_clusters_per_class=2, weights=None, flip_y=0.0, class_sep=1.0, random_state=5)
y[y == 0] = -1

Linear kernel

est = MySoftLargeMarginKernelClassfier(C=1e2, kernel_function=linear_kernel_function, param=None).fit(X, y)
plot(X, y, support_vectors=est.support_ids, predict_func=est.predict_proba)

[Figure: decision regions and support vectors for the linear kernel]

Polynomial kernel

est = MySoftLargeMarginKernelClassfier(C=1e-2, kernel_function=polynomial_kernel_function, param=5).fit(X, y)
plot(X, y, support_vectors=est.support_ids, predict_func=est.predict_proba)

[Figure: decision regions and support vectors for the polynomial kernel]

Gaussian kernel

est = MySoftLargeMarginKernelClassfier(C=1e-2, kernel_function=gaussian_kernel, param=5).fit(X, y)
plot(X, y, support_vectors=est.support_ids, predict_func=est.predict_proba)

[Figure: decision regions and support vectors for the Gaussian kernel]

Full code

import matplotlib.pyplot as plt
import numpy as np
import scipy as sp
import cvxopt
from cvxopt import matrix
from cvxopt import solvers
from numpy import linalg


# 10-------------------------------------------------
def make_H_k(X, t, kernel_function, param):
    n_samples, n_feature = X.shape
    X_prime = np.zeros((n_samples, n_samples))
    for i in range(n_samples):
        for j in range(n_samples):
            X_prime[i, j] = kernel_function(X[i], X[j], param)
    H = np.outer(t, t) * X_prime
    return H


# 11-------------------------------------------------
def setup_optimization_soft_k(X, t, C, kernel_function, param):
    m, n = X.shape
    t = t.reshape(-1, 1).astype(float)
    H = make_H_k(X, t, kernel_function, param)
    P = matrix(H)
    q = matrix(-np.ones((m, 1)))
    G = matrix(np.vstack((-np.eye(m), np.eye(m))))
    h = matrix(np.hstack((np.zeros(m), np.ones(m) * C)))
    A = matrix(t.reshape(1, -1))
    b = matrix(np.zeros(1))

    return P, q, G, h, A, b


# 12-------------------------------------------------
def compute_support_vectors(X, t, alphas):
    t = t.reshape(-1, 1).astype(float)
    support_ids = (alphas > 1e-4).flatten()
    support_vectors = X[support_ids]
    support_targets = t[support_ids]
    support_alphas = alphas[support_ids]
    return support_vectors, support_targets, support_alphas, support_ids


# 13-------------------------------------------------
def train_slmkc(X, t, C, kernel_function, param):
    P, q, G, h, A, b = setup_optimization_soft_k(X, t, C, kernel_function, param)

    # Setting solver parameters (change default to decrease tolerance)
    solvers.options['show_progress'] = False
    solvers.options['abstol'] = 1e-10
    solvers.options['reltol'] = 1e-10
    solvers.options['feastol'] = 1e-10

    # Run solver
    solution_dict = solvers.qp(P, q, G, h, A, b)
    alphas = np.array(solution_dict['x'])

    support_vectors, support_targets, support_alphas, support_ids = compute_support_vectors(X, t, alphas)

    model = (support_vectors, support_targets, support_alphas, support_ids)

    return model


# 14-------------------------------------------------
def compute_bias(kernel_function, param, C, model):
    support_vectors, support_targets, support_alphas, support_ids = model
    if param is None:
        # Linear kernel: recover w explicitly, then average b over the support vectors
        w = ((support_targets * support_alphas).T @ support_vectors).reshape(-1, 1)
        b = np.mean(support_targets - np.dot(support_vectors, w))
    else:
        # Non-linear kernel: w only exists in feature space, so return w = None and
        # compute b from the dual expansion, averaged over the support vectors
        w = None
        coeffs = (support_alphas * support_targets).flatten()
        targets = support_targets.flatten()
        b = 0.0
        for n in range(len(targets)):
            k_n = np.array([kernel_function(support_vectors[n], sv, param)
                            for sv in support_vectors])
            b += targets[n] - np.sum(coeffs * k_n)
        b /= len(support_alphas)
    return w, b


# 15-------------------------------------------------
def score_slmkc(X_test, kernel_function, param, C, model):
    w, b = compute_bias(kernel_function, param, C, model)
    if param is None:
        # Linear kernel: score with the explicit weight vector
        scores = np.dot(X_test, w) + b
        return scores.flatten()
    else:
        # Non-linear kernel: evaluate the dual expansion over the support vectors
        support_vectors, support_targets, support_alphas, support_ids = model
        y_predict = np.zeros(len(X_test))
        for i in range(len(X_test)):
            s = 0.0
            for a, sv_y, sv in zip(support_alphas.flatten(), support_targets.flatten(), support_vectors):
                s += a * sv_y * kernel_function(X_test[i], sv, param)
            y_predict[i] = s
        return y_predict + b


# 16-------------------------------------------------
def test_slmkc(X_test, kernel_function, param, C, model):
    preds = np.sign(score_slmkc(X_test, kernel_function, param, C, model)).astype(int)
    return preds.flatten()


# this function is provided to you, you do not need to alter it in any way.

class MySoftLargeMarginKernelClassfier(object):
    def __init__(self, C, kernel_function, param):
        self.kernel_function = kernel_function
        self.param = param
        self.C = C
        self.support_vectors = None
        self.support_targets = None
        self.support_alphas = None
        self.support_ids = None

    def fit(self, X_train, y_train):
        self.model = train_slmkc(X_train, y_train, self.C, self.kernel_function, self.param)
        self.support_vectors, self.support_targets, self.support_alphas, self.support_ids = self.model
        return self

    def predict(self, X_test):
        return test_slmkc(X_test, self.kernel_function, self.param, self.C, self.model)

    def decision_function(self, X_test):
        return score_slmkc(X_test, self.kernel_function, self.param, self.C, self.model)

    def predict_proba(self, X_test):
        scores = self.decision_function(X_test)
        probs = 1 / (1 + np.exp(-scores))
        return probs


# 17-------------------------------------------------
def linear_kernel_function(x1, x2, param=None):
    x_prime = x1.T.dot(x2)
    return x_prime


# 18-------------------------------------------------
def polynomial_kernel_function(x1, x2, param):
    result = np.power(np.dot(x1.T, x2) + 1, param)
    return result


def gaussian_kernel(x, y, sigma=5.0):
    return np.exp(-linalg.norm(x - y) ** 2 / (2 * (sigma ** 2)))


# Testing
def plot(X, y, support_vectors=None, predict_func=None):
    cmap = 'bwr'
    res = 100
    x_min, x_max = X[:, 0].min(), X[:, 0].max()
    y_min, y_max = X[:, 1].min(), X[:, 1].max()
    delta = max((x_max - x_min) / 5, (y_max - y_min) / 5)
    x_min -= delta
    y_min -= delta
    x_max += delta
    y_max += delta
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, res), np.linspace(y_min, y_max, res))
    if predict_func is None:
        Z = np.ones(xx.shape) * .5
    else:
        Z = predict_func(np.hstack([xx.reshape(-1, 1), yy.reshape(-1, 1)]))
        if Z.ndim == 1:
            Z = Z.reshape(xx.shape)
        else:
            Z = Z[:, -1].reshape(xx.shape)
    if support_vectors is None:
        edge_colors = 'w'
        sizes = 50
    else:
        edge_colors = ['k' if s else 'w' for s in support_vectors]
        sizes = [100 if s else 50 for s in support_vectors]
    plt.figure(figsize=(5, 5))
    plt.contourf(xx, yy, Z, levels=20, cmap=cmap, alpha=.5)
    plt.scatter(X[:, 0], X[:, 1], c=y, s=sizes, edgecolors=edge_colors, cmap=cmap)
    plt.grid()
    plt.axis('equal')
    plt.show()


# Just run the following code, do not modify it

from sklearn.datasets import make_classification

X, y = make_classification(n_samples=100, n_features=2, n_informative=2, n_redundant=0, n_repeated=0, n_classes=2,
                           n_clusters_per_class=2, weights=None, flip_y=0.0, class_sep=1.0, random_state=5)
y[y == 0] = -1

est = MySoftLargeMarginKernelClassfier(C=1e2, kernel_function=linear_kernel_function, param=None).fit(X, y)
plot(X, y, support_vectors=est.support_ids, predict_func=est.predict_proba)

est = MySoftLargeMarginKernelClassfier(C=1e-2, kernel_function=polynomial_kernel_function, param=5).fit(X, y)
plot(X, y, support_vectors=est.support_ids, predict_func=est.predict_proba)

est = MySoftLargeMarginKernelClassfier(C=1e-2, kernel_function=gaussian_kernel, param=5).fit(X, y)
plot(X, y, support_vectors=est.support_ids, predict_func=est.predict_proba)
