Machine Learning: Support Vector Machines


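Basic Concepts

  • SVM: support vector machine, trained in this article with the sequential minimal optimization (SMO) algorithm
  • Separating hyperplane: the hyperplane that splits the dataset into two classes. Its dimension grows with that of the data (it is always one less); for two-dimensional data it is a straight line
  • Support vectors: the points closest to the separating hyperplane
  • Form of the separating hyperplane: $y=w^{T}x+b$, where the hyperplane itself is the set of points with $w^{T}x+b=0$ and $w$ is the normal vector perpendicular to it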

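Maximum Margin and Classification

Functional Margin and Geometric Margin

Once the hyperplane $w^{T}x+b=0$ is fixed, $|w^{T}x+b|$ indicates how far the point $x$ is from the hyperplane (up to the scale factor $\|w\|$). Unlike logistic regression, which labels the classes 0 and 1, here the classes are labelled -1 and +1. A prediction can therefore be checked by testing whether the class label $y$ and $w^{T}x+b$ have the same sign. Their product is defined as the functional margin.
Functional margin: $\hat{\gamma}=y(w^{T}x+b)=yf(x)$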

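The functional margin is only an artificial definition, however, and it has a flaw: if $w$ and $b$ are scaled by the same factor, say doubled, the hyperplane stays the same while $f(x)$ doubles. The quantity that truly measures the distance from a point to the hyperplane is the geometric margin, which is the functional margin divided by the norm of $w$ and is therefore unaffected by such scaling.
Geometric margin: $\tilde{\gamma}=\frac{\hat{\gamma}}{\|w\|}$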
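The scaling argument can be checked numerically. The sketch below is only an illustration: the hyperplane $3x_1+4x_2-5=0$, the sample point, and the helper names `functional_margin` and `geometric_margin` are assumptions chosen for this example and do not come from the article's code.

```python
import math


def functional_margin(w, b, x, y):
    """y * (w^T x + b): positive exactly when the point is classified correctly."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def geometric_margin(w, b, x, y):
    """Functional margin divided by the Euclidean norm of w."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return functional_margin(w, b, x, y) / norm_w


# Hypothetical hyperplane 3*x1 + 4*x2 - 5 = 0 and one positive sample.
w, b = [3.0, 4.0], -5.0
x, y = [3.0, 4.0], 1

fm = functional_margin(w, b, x, y)
gm = geometric_margin(w, b, x, y)

# Doubling w and b describes exactly the same hyperplane.
w2, b2 = [2 * wi for wi in w], 2 * b
fm2 = functional_margin(w2, b2, x, y)  # doubles
gm2 = geometric_margin(w2, b2, x, y)   # unchanged

print(fm, gm, fm2, gm2)
```

The functional margin doubles with the parameters while the geometric margin stays the same, which is why only the latter is a meaningful distance.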

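Maximizing the Margin

Starting from the geometric margin: because rescaling $w$ and $b$ changes the functional margin but leaves the geometric margin untouched, the functional margin $\hat{\gamma}$ of the closest points can be fixed to the more convenient value 1 without changing the problem. The geometric margin to be maximized then becomes $\frac{1}{\|w\|}$.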

The maximum-margin problem is defined as: $\arg\max_{w,b}\frac{1}{\|w\|}\quad \text{s.t.}\quad y_{i}(w^{T}x_{i}+b)\geqslant 1,\ i=1,2,\ldots,m.$
Maximizing $\frac{1}{\|w\|}$ is in turn equivalent to minimizing $\frac{1}{2}\|w\|^{2}$. The objective is now quadratic and the constraints are linear, so the task is a convex quadratic programming problem: find the optimum of the objective subject to the given constraints.

Lagrange Multiplier Method

Lagrangian function: the maximum-margin problem has an objective function, but its constraints are hard to handle directly. The Lagrangian attaches a Lagrange multiplier to each constraint so that the constraints are folded into the objective function; this is also where Lagrangian duality is applied:
$L(w,b,\alpha)=\frac{1}{2}\|w\|^{2}-\sum_{i=1}^{m}\alpha_{i}\left(y_{i}(w^{T}x_{i}+b)-1\right)$

Setting the partial derivatives with respect to $w$ and $b$ to zero gives:
$w=\sum_{i=1}^{m}\alpha_{i}y_{i}x_{i},\qquad \sum_{i=1}^{m}\alpha_{i}y_{i}=0$
and $b$ can then be recovered from any support vector $x_{j}$: $b=y_{j}-\sum_{i=1}^{m}\alpha_{i}y_{i}x_{i}^{T}x_{j}$

Substituting these back into the Lagrangian yields:
$\min_{\alpha}\ \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_{i}\alpha_{j}y_{i}y_{j}x_{i}^{T}x_{j}-\sum_{i=1}^{m}\alpha_{i}\quad \text{s.t.}\quad \sum_{i=1}^{m}\alpha_{i}y_{i}=0,\ \alpha_{i}\geqslant 0,\ i=1,2,\ldots,m$

This is equivalent to:
$\max_{\alpha}\ \sum_{i=1}^{m}\alpha_{i}-\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_{i}\alpha_{j}y_{i}y_{j}x_{i}^{T}x_{j}\quad \text{s.t.}\quad \sum_{i=1}^{m}\alpha_{i}y_{i}=0,\ \alpha_{i}\geqslant 0,\ i=1,2,\ldots,m$
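To make the dual problem concrete, here is a minimal sketch on a two-point toy dataset. The data, the grid search over $\alpha$, and the helper names `dual_objective` and `recover_w` are assumptions made for illustration; a real solver would not use a grid. With only two points of opposite class, the constraint $\sum_i\alpha_i y_i=0$ forces both multipliers to be equal, so a one-dimensional scan is enough.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def dual_objective(alpha, X, y):
    """sum_i alpha_i - 1/2 * sum_ij alpha_i alpha_j y_i y_j x_i^T x_j."""
    m = len(X)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * dot(X[i], X[j])
               for i in range(m) for j in range(m))
    return sum(alpha) - 0.5 * quad


def recover_w(alpha, X, y):
    """w = sum_i alpha_i y_i x_i."""
    return [sum(alpha[i] * y[i] * X[i][k] for i in range(len(X)))
            for k in range(len(X[0]))]


# Toy data (assumed for this example): one point per class.
X = [[1.0, 1.0], [-1.0, -1.0]]
y = [1, -1]

# alpha_1 = alpha_2 = a satisfies sum_i alpha_i y_i = 0; scan a in [0, 1].
candidates = [k / 100 for k in range(101)]
best_a = max(candidates, key=lambda a: dual_objective([a, a], X, y))

alpha = [best_a, best_a]
w = recover_w(alpha, X, y)
b = y[0] - dot(w, X[0])             # b from a support vector
primal = 0.5 * dot(w, w)            # 1/2 * ||w||^2
dual = dual_objective(alpha, X, y)
margins = [y[i] * (dot(w, X[i]) + b) for i in range(len(X))]

print(best_a, w, b, primal, dual, margins)
```

On this toy problem the primal and dual objective values coincide and both points sit exactly on the margin $y_i(w^{T}x_i+b)=1$, so both are support vectors.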

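This derivation assumed that the data are 100% separable, but in practice not every point can be classified correctly. Slack variables are therefore introduced so that some points are allowed to be misclassified, and the constraints become:
$0\leqslant\alpha_{i}\leqslant C,\quad \sum_{i=1}^{m}\alpha_{i}y_{i}=0$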

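Above, the dual problem was used to find the solution, but the primal and dual problems are equivalent only under certain conditions, known as the KKT conditions. Under the KKT conditions, the optimum of a problem with constraints $g_{i}(x)\leqslant 0$ must satisfy:

  1. The derivative of $L$ with respect to each $x$ is zero
  2. $g_{i}(x)\leqslant 0$
  3. $\alpha_{i}\geqslant 0$
  4. $\sum\alpha_{i}g_{i}(x)=0$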

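The two figures (not reproduced here) show that $x$ can only lie in the region where $g(x)<0$ or on the boundary $g(x)=0$:

  • When $x$ falls in the region $g(x)<0$, the constraint is inactive and it suffices to minimize $f(x)$ directly
  • When $x$ falls on the boundary $g(x)=0$, the problem is equivalent to an equality-constrained optimization problem

In summary, both cases lead to the same condition, where $\lambda$ is the multiplier attached to the constraint:
$\lambda g(x)=0$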
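Both cases can be seen in a one-dimensional example. The sketch below is an illustration under assumptions that are not in the article: the objective $f(x)=(x-2)^2$, the constraint $g(x)=x-c\leqslant 0$, the Lagrangian written as $f(x)+\lambda g(x)$, and the helper name `solve_1d`.

```python
def solve_1d(c):
    """Minimise f(x) = (x - 2)**2 subject to g(x) = x - c <= 0.

    Returns (x_star, lam), where lam is the multiplier of the constraint
    in the Lagrangian f(x) + lam * g(x).
    """
    x_free = 2.0                       # unconstrained minimiser of f
    if x_free - c <= 0:
        return x_free, 0.0             # g(x) < 0 or = 0 already holds: lam = 0
    x_star = c                         # otherwise the optimum sits on g(x) = 0
    lam = -2.0 * (x_star - 2.0)        # from f'(x) + lam * g'(x) = 0, g'(x) = 1
    return x_star, lam


for c in (3.0, 1.0):
    x_star, lam = solve_1d(c)
    g = x_star - c
    # In both cases the product lam * g is zero.
    print("c =", c, "x* =", x_star, "lambda =", lam, "lambda*g =", lam * g)
```

With `c = 3.0` the constraint is inactive and the multiplier is zero; with `c = 1.0` the optimum lies on the boundary and the multiplier is positive. Either way $\lambda g(x)=0$.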

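Solving with SMO

First, the idea. Suppose we have found an initial $\alpha$ that satisfies the two constraints of the dual problem, namely:
$0\leqslant\alpha_{i}\leqslant C,\quad \sum_{i=1}^{m}\alpha_{i}y_{i}=0$
If the hyperplane derived from it also satisfies the target conditions on g(x) (the KKT conditions), then it is already the optimal solution of the dual problem.
Otherwise, what remains is to keep improving it until the hyperplane it determines satisfies those conditions, while making sure it continues to satisfy the two constraints above throughout; this leads to the optimal solution. The basic steps, as implemented in the code later in this article, are: pick an $\alpha_{i}$ that violates the KKT conditions and a partner $\alpha_{j}$; compute the bounds L and H that keep $\alpha_{j}$ within $[0, C]$; update $\alpha_{j}$ analytically and clip it to $[L, H]$; change $\alpha_{i}$ by the matching amount so that $\sum\alpha_{i}y_{i}$ stays zero; then update $b$.
At this point we can find the separating hyperplane for linearly separable data. Most of the time, however, data are not linearly separable, and then no hyperplane satisfying these conditions exists.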
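A single pairwise update can be sketched in a few lines. This is a simplified illustration and not the article's implementation: it uses plain Python lists and a linear kernel, omits the update of $b$ and the heuristic choice of $j$, and the function name `smo_pair_update` and the toy data are assumptions made for this example. The formulas for `L`, `H`, and `eta` follow the same pattern as the `innerL` function in the full listing.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def smo_pair_update(i, j, alpha, b, X, y, C):
    """Return a new alpha list after one analytic update of (alpha_i, alpha_j)."""
    def f(k):
        # Current decision value for sample k (linear kernel).
        return sum(alpha[t] * y[t] * dot(X[t], X[k]) for t in range(len(X))) + b

    Ei, Ej = f(i) - y[i], f(j) - y[j]            # prediction errors
    if y[i] != y[j]:
        L = max(0.0, alpha[j] - alpha[i])
        H = min(C, C + alpha[j] - alpha[i])
    else:
        L = max(0.0, alpha[j] + alpha[i] - C)
        H = min(C, alpha[j] + alpha[i])
    eta = 2.0 * dot(X[i], X[j]) - dot(X[i], X[i]) - dot(X[j], X[j])
    if L == H or eta >= 0:
        return list(alpha)                        # no progress possible for this pair

    new = list(alpha)
    unclipped = alpha[j] - y[j] * (Ei - Ej) / eta
    new[j] = min(H, max(L, unclipped))            # clip alpha_j to [L, H]
    # Move alpha_i by the matching amount so sum(alpha * y) is unchanged.
    new[i] = alpha[i] + y[i] * y[j] * (alpha[j] - new[j])
    return new


# Toy data (assumed): one point per class, all multipliers start at zero.
X = [[1.0, 1.0], [-1.0, -1.0]]
y = [1, -1]
alpha = smo_pair_update(0, 1, [0.0, 0.0], 0.0, X, y, C=1.0)
print(alpha)
```

The update keeps both multipliers inside $[0, C]$ and preserves $\sum_i\alpha_i y_i=0$, which is the invariant the SMO loop relies on.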

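From Linearly Inseparable to Separable in Higher Dimensions

Kernel Functions

The sections above covered how an SVM handles linearly separable data. What about non-linear data? In that case the SVM picks a kernel function κ(⋅,⋅) and, by mapping the data into a higher-dimensional space, resolves the problem that the data are not linearly separable in the original space.
In the linearly inseparable case, the SVM carries out its computations in the low-dimensional space, uses the kernel function to map the input space implicitly into a high-dimensional feature space, and constructs the optimal separating hyperplane in that feature space. This separates non-linear data that are hard to split in the original plane.
Kernel functions come in many types. After the change of space, a linear problem is solved in the high-dimensional space, which is equivalent to solving a non-linear problem in the low-dimensional space.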
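The idea that a kernel computes an inner product in a higher-dimensional space without building that space can be verified directly for a small case. The sketch below is illustrative: the degree-2 polynomial kernel, its explicit feature map `phi`, and the sample points are assumptions added for this example. The `rbf_kernel` helper uses the same convention as the `kernelTrans` function in this article's code, $\exp(-\|x-z\|^{2}/\sigma^{2})$.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def phi(x):
    """Explicit 3-D feature map matching the kernel (x^T z)^2 for 2-D inputs."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


def poly2_kernel(x, z):
    """Degree-2 polynomial kernel, evaluated entirely in the 2-D input space."""
    return dot(x, z) ** 2


def rbf_kernel(x, z, sigma):
    """Radial basis function kernel: exp(-||x - z||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / sigma ** 2)


x, z = [1.0, 2.0], [3.0, -1.0]

in_input_space = poly2_kernel(x, z)        # cheap: one 2-D dot product
in_feature_space = dot(phi(x), phi(z))     # same value via the explicit mapping

print(in_input_space, in_feature_space)
print(rbf_kernel(x, x, 1.3), rbf_kernel(x, z, 1.3))
```

The two polynomial-kernel values agree up to floating-point rounding, so the dual problem, which only ever needs inner products $x_i^{T}x_j$, can work in the feature space by swapping in $\kappa(x_i,x_j)$.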


Classification in Practice

Testing the Kernel Function and Plotting the Data

def testRbf(k1=1.3):
    dataArr, labelArr = loadDataSet('testSetRBF.txt')
    b, alphas = smoP(dataArr, labelArr, 200, 0.0001, 10000, ('rbf', k1))  # C=200 important
    datMat = mat(dataArr)
    labelMat = mat(labelArr).transpose()
    svInd = nonzero(alphas.A > 0)[0]
    sVs = datMat[svInd]  # get matrix of only support vectors
    labelSV = labelMat[svInd]
    print("there are %d Support Vectors" % shape(sVs)[0])
    m, n = shape(datMat)
    errorCount = 0
    for i in range(m):
        kernelEval = kernelTrans(sVs, datMat[i, :], ('rbf', k1))
        predict = kernelEval.T * multiply(labelSV, alphas[svInd]) + b
        if sign(predict) != sign(labelArr[i]): errorCount += 1
    print("the training error rate is: %f" % (float(errorCount) / m))
    dataArr, labelArr = loadDataSet('testSetRBF2.txt')
    errorCount = 0
    datMat = mat(dataArr)
    labelMat = mat(labelArr).transpose()
    m, n = shape(datMat)
    for i in range(m):
        kernelEval = kernelTrans(sVs, datMat[i, :], ('rbf', k1))
        predict = kernelEval.T * multiply(labelSV, alphas[svInd]) + b
        if sign(predict) != sign(labelArr[i]): errorCount += 1
    print("the test error rate is: %f" % (float(errorCount) / m))


def showDataSet(dataSet, labelMat):
    data_plus = []
    data_minus = []

    for i in range(len(dataSet)):
        if (labelMat[i] > 0):
            data_plus.append(dataSet[i])
        else:
            data_minus.append(dataSet[i])

    data_plus_arr = array(data_plus)
    data_minus_arr = array(data_minus)

    plt.scatter(transpose(data_plus_arr)[0], transpose(data_plus_arr)[1])
    plt.scatter(transpose(data_minus_arr)[0], transpose(data_minus_arr)[1])
    plt.show()
    


SVM-Based Handwritten Digit Recognition

def smoP(dataMatIn, classLabels, C, toler, maxIter, kTup=("lin", 0)):
    oS = optStruct(mat(dataMatIn), mat(classLabels).transpose(), C, toler,kTup)
    iter = 0
    entireSet = True
    alphaPairsChanged = 0
    while (iter < maxIter) and (alphaPairsChanged > 0 or entireSet):
        alphaPairsChanged = 0
        if entireSet:
            for i in range(oS.m):
                alphaPairsChanged += innerL(i, oS)
                print("fullSet,iter: %d i:%d,pairs changed %d " % (iter, i, alphaPairsChanged))
            iter += 1
        else:
            nonBoundIs = nonzero((oS.alphas.A > 0) * (oS.alphas.A < C))[0]
            for i in nonBoundIs:
                alphaPairsChanged += innerL(i, oS)
                print("non-bound,iter: %d i:%d,pairs changed %d " % (iter, i, alphaPairsChanged))
            iter += 1
        if entireSet:
            entireSet = False
        elif alphaPairsChanged == 0:
            entireSet = True
        print("iteration number: %d " % iter)
    return oS.b, oS.alphas

def kernelTrans(X, A, kTup):
    m, n = shape(X)
    K = mat(zeros((m, 1)))
    if kTup[0] == "lin":
        K = X * A.T
    elif kTup[0] == "rbf":
        for j in range(m):
            deltaRow = X[j, :] - A
            K[j] = deltaRow * deltaRow.T
        K = exp(K / (-1 * kTup[1] ** 2))
    else:
        raise NameError("houston we have a problem--that kernel is not recognized")
    return K


class optStruct:
    def __init__(self, dataMatIn, classLabels, C, toler, kTup):
        self.X = dataMatIn
        self.labelMat = classLabels
        self.C = C
        self.tol = toler
        self.m = shape(dataMatIn)[0]
        self.alphas = mat(zeros((self.m, 1)))
        self.b = 0
        self.eCache = mat(zeros((self.m, 2)))
        self.K = mat(zeros((self.m, self.m)))
        for i in range(self.m):
            self.K[:, i] = kernelTrans(self.X, self.X[i, :], kTup)



def img2vector(filename):
    returnVect = zeros((1, 1024))
    with open(filename) as fr:  # close the file once the 32 rows are read
        for i in range(32):
            lineStr = fr.readline()
            for j in range(32):
                returnVect[0, 32 * i + j] = int(lineStr[j])
    return returnVect


def loadImages(dirName):
    from os import listdir
    hwLabels = []
    trainingFileList = listdir(dirName)  # load the training set
    m = len(trainingFileList)
    trainingMat = zeros((m, 1024))
    for i in range(m):
        fileNameStr = trainingFileList[i]
        fileStr = fileNameStr.split('.')[0]  # take off .txt
        classNumStr = int(fileStr.split('_')[0])
        if classNumStr == 9:
            hwLabels.append(-1)
        else:
            hwLabels.append(1)
        trainingMat[i, :] = img2vector('%s/%s' % (dirName, fileNameStr))
    return trainingMat, hwLabels

def testDigits(kTup=('rbf', 10)):
    dataArr, labelArr = loadImages('D:\\python\\PyCode\\machinelearninginaction\\Ch02\\trainingDigits\\')
    b, alphas = smoP(dataArr, labelArr, 200, 0.0001, 10000, kTup)
    datMat = mat(dataArr)
    labelMat = mat(labelArr).transpose()
    svInd = nonzero(alphas.A > 0)[0]
    sVs = datMat[svInd]
    labelSV = labelMat[svInd]
    print("there are %d Support Vectors" % shape(sVs)[0])
    m, n = shape(datMat)
    errorCount = 0
    for i in range(m):
        kernelEval = kernelTrans(sVs, datMat[i, :], kTup)
        predict = kernelEval.T * multiply(labelSV, alphas[svInd]) + b
        if sign(predict) != sign(labelArr[i]): errorCount += 1
    print("the training error rate is: %f" % (float(errorCount) / m))
    dataArr, labelArr = loadImages('D:\\python\\PyCode\\machinelearninginaction\\Ch02\\testDigits\\')
    errorCount = 0
    datMat = mat(dataArr)
    labelMat = mat(labelArr).transpose()
    m, n = shape(datMat)
    for i in range(m):
        kernelEval = kernelTrans(sVs, datMat[i, :], kTup)
        predict = kernelEval.T * multiply(labelSV, alphas[svInd]) + b
        if sign(predict) != sign(labelArr[i]): errorCount += 1
    print("the test error rate is: %f" % (float(errorCount) / m))

Convergence process: [figure not available]
Error rate: [figure not available]
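The data preparation behind `loadImages` can be tried without the original dataset. The sketch below is an assumption-laden illustration: it writes a synthetic 32x32 text image to a temporary directory, and the helper names `img_to_vector` and `label_from_filename` are invented for this example. It mirrors the article's rule that the digit 9 is labelled -1 and every other digit +1, which turns digit recognition into a binary problem.

```python
import os
import tempfile


def img_to_vector(path):
    """Read a 32x32 text image of '0'/'1' characters into a flat list of 1024 ints."""
    with open(path) as fr:
        rows = [fr.readline().strip() for _ in range(32)]
    return [int(ch) for row in rows for ch in row[:32]]


def label_from_filename(name):
    """File names look like '<digit>_<index>.txt'; digit 9 -> -1, anything else -> +1."""
    digit = int(name.split('.')[0].split('_')[0])
    return -1 if digit == 9 else 1


# Build one synthetic image: first row all ones, remaining rows all zeros.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "9_0.txt")
    with open(path, "w") as fw:
        for r in range(32):
            fw.write(("1" if r == 0 else "0") * 32 + "\n")
    vec = img_to_vector(path)
    label = label_from_filename(os.path.basename(path))

print(len(vec), sum(vec), label)
```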

Full Code Listing

import random
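# numpy.mat is not available in NumPy 2.x; asmatrix is the same function under its current name
from numpy import asmatrix as mat, shape, zeros, multiply, nonzero, exp, sign, array, transpose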
import matplotlib.pyplot as plt


def loadDataSet(fileName):
    dataMat = []
    labelMat = []
    with open(fileName) as fr:  # tab-separated columns: x1, x2, label
        for line in fr:
            lineArr = line.strip().split("\t")
            dataMat.append([float(lineArr[0]), float(lineArr[1])])
            labelMat.append(float(lineArr[2]))
    return dataMat, labelMat

def showDataSet(dataSet, labelMat):
    data_plus = []
    data_minus = []

    for i in range(len(dataSet)):
        if (labelMat[i] > 0):
            data_plus.append(dataSet[i])
        else:
            data_minus.append(dataSet[i])

    data_plus_arr = array(data_plus)
    data_minus_arr = array(data_minus)

    plt.scatter(transpose(data_plus_arr)[0], transpose(data_plus_arr)[1])
    plt.scatter(transpose(data_minus_arr)[0], transpose(data_minus_arr)[1])
    plt.show()


def selectJrand(i, m):
    j = i
    while j == i:
        j = int(random.uniform(0, m))
    return j


def clipAlpha(aj, H, L):
    if aj > H:
        aj = H
    if L > aj:
        aj = L
    return aj


def smoSimple(dataMatIn, classLabels, C, toler, maxIter):
    dataMatrix = mat(dataMatIn)
    labelMat = mat(classLabels).transpose()
    b = 0
    m, n = shape(dataMatrix)
    alphas = mat(zeros((m, 1)))
    iter = 0
    while iter < maxIter:
        alphaPairsChanged = 0
        for i in range(m):
            fXi = float(multiply(alphas, labelMat).T * (dataMatrix * dataMatrix[i, :].T)) + b
            Ei = fXi - float(labelMat[i])
            if ((labelMat[i] * Ei < -toler) and (alphas[i] < C)) or ((labelMat[i] * Ei > toler) and (alphas[i] > 0)):
                j = selectJrand(i, m)
                fXj = float(multiply(alphas, labelMat).T * (dataMatrix * dataMatrix[j, :].T)) + b
                Ej = fXj - float(labelMat[j])
                alphaIold = alphas[i].copy()
                alphaJold = alphas[j].copy()
                if labelMat[i] != labelMat[j]:
                    L = max(0, alphas[j] - alphas[i])
                    H = min(C, C + alphas[j] - alphas[i])
                else:
                    L = max(0, alphas[j] + alphas[i] - C)
                    H = min(C, alphas[j] + alphas[i])
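                if L == H:  # no room to move alpha_j; applies to both label cases
                    print("L==H")
                    continue
                # eta = 2*K_ij - K_ii - K_jj; it must be negative for a valid update
                eta = 2.0 * dataMatrix[i, :] * dataMatrix[j, :].T - dataMatrix[i, :] * dataMatrix[i, :].T \
                      - dataMatrix[j, :] * dataMatrix[j, :].T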
                if eta >= 0:
                    print("eta>=0")
                    continue
                alphas[j] -= labelMat[j] * (Ei - Ej) / eta
                alphas[j] = clipAlpha(alphas[j], H, L)
                if (abs(alphas[j] - alphaJold) < 0.00001):
                    print("j not moving enough")
                    continue
                alphas[i] += labelMat[j] * labelMat[i] * (alphaJold - alphas[j])
                b1 = b - Ei - labelMat[i] * (alphas[i] - alphaIold) * dataMatrix[i, :] * dataMatrix[i, :].T \
                     - labelMat[j] * (alphas[j] - alphaJold) * dataMatrix[i, :] * dataMatrix[j, :].T
                b2 = b - Ej - labelMat[i] * (alphas[i] - alphaIold) * dataMatrix[i, :] * dataMatrix[j, :].T \
                     - labelMat[j] * (alphas[j] - alphaJold) * dataMatrix[j, :] * dataMatrix[j, :].T
                if 0 < alphas[i] < C:
                    b = b1
                elif 0 < alphas[j] < C:
                    b = b2
                else:
                    b = (b1 + b2) / 2.0
                alphaPairsChanged += 1
                print("iter: %d i: %d ,pairs changed %d" % (iter, i, alphaPairsChanged))
        if alphaPairsChanged == 0:
            iter += 1
        else:
            iter = 0
        print("iteration number:%d" % iter)
    return b, alphas


# class optStruct:
#     def __init__(self, dataMatIn, classLabels, C, toler):
#         self.X = dataMatIn
#         self.labelMat = classLabels
#         self.C = C
#         self.tol = toler
#         self.m = shape(dataMatIn)[0]
#         self.alphas = mat(zeros((self.m, 1)))
#         self.b = 0
#         self.eCache = mat(zeros((self.m, 2)))


# def calcEk(oS, k):  # pre-kernel version, replaced by the kernel version below
#     fXk = float(multiply(oS.alphas, oS.labelMat).T * (oS.X * oS.X[k, :].T) + oS.b)
#     Ek = fXk - float(oS.labelMat[k])
#     return Ek
def calcEk(oS, k):
    fXk = float(multiply(oS.alphas, oS.labelMat).T * oS.K[:, k] + oS.b)
    Ek = fXk - float(oS.labelMat[k])
    return Ek


def selectJ(i, oS, Ei):
    maxK = -1
    maxDeltaE = 0
    Ej = 0
    oS.eCache[i] = [1, Ei]
    validEcacheList = nonzero(oS.eCache[:, 0].A)[0]
    if len(validEcacheList) > 1:
        for k in validEcacheList:
            if k == i:
                continue
            Ek = calcEk(oS, k)
            deltaE = abs(Ei - Ek)
            if deltaE > maxDeltaE:
                maxK = k
                maxDeltaE = deltaE
                Ej = Ek
        return maxK, Ej
    else:
        j = selectJrand(i, oS.m)
        Ej = calcEk(oS, j)
    return j, Ej


def updateEk(oS, k):
    Ek = calcEk(oS, k)
    oS.eCache[k] = [1, Ek]


def innerL(i, oS):
    Ei = calcEk(oS, i)
    if ((oS.labelMat[i] * Ei < -oS.tol) and (oS.alphas[i] < oS.C)) or (
            (oS.labelMat[i] * Ei > oS.tol) and (oS.alphas[i] > 0)):
        j, Ej = selectJ(i, oS, Ei)
        alphaIold = oS.alphas[i].copy()
        alphaJold = oS.alphas[j].copy()
        if oS.labelMat[i] != oS.labelMat[j]:
            L = max(0, oS.alphas[j] - oS.alphas[i])
            H = min(oS.C, oS.C + oS.alphas[j] - oS.alphas[i])
        else:
            L = max(0, oS.alphas[j] + oS.alphas[i] - oS.C)
            H = min(oS.C, oS.alphas[j] + oS.alphas[i])
        if L == H:
            print("L==H")
            return 0
        # eta = 2.0 * oS.X[i, :] * oS.X[j, :].T - oS.X[i, :] * oS.X[i, :].T - oS.X[j, :] * oS.X[j, :].T  # pre-kernel version
        eta = 2.0 * oS.K[i, j] - oS.K[i, i] - oS.K[j, j]
        if eta >= 0:
            print("eta>=0")
            return 0
        oS.alphas[j] -= oS.labelMat[j] * (Ei - Ej) / eta
        oS.alphas[j] = clipAlpha(oS.alphas[j], H, L)
        updateEk(oS, j)
        if abs(oS.alphas[j] - alphaJold) < 0.00001:
            print("j not moving enough")
            return 0
        oS.alphas[i] += oS.labelMat[j] * oS.labelMat[i] * (alphaJold - oS.alphas[j])
        updateEk(oS, i)
        # b1 = oS.b - Ei - oS.labelMat[i] * (oS.alphas[i] - alphaIold) * oS.X[i, :] * oS.X[i, :].T - oS.labelMat[j] * (
        #         oS.alphas[j] - alphaJold) * oS.X[i, :] * oS.X[j, :].T
        # b2 = oS.b - Ej - oS.labelMat[i] * (oS.alphas[i] - alphaIold) * oS.X[i, :] * oS.X[j, :].T - oS.labelMat[j] * (
        #         oS.alphas[j] - alphaJold) * oS.X[j, :] * oS.X[j, :].T     # pre-kernel version
        b1 = oS.b - Ei - oS.labelMat[i] * (oS.alphas[i] - alphaIold) * oS.K[i, i] - oS.labelMat[j] * (
                    oS.alphas[j] - alphaJold) * oS.K[i, j]
        # the first correction term uses the change in alphas[i], mirroring b1
        b2 = oS.b - Ej - oS.labelMat[i] * (oS.alphas[i] - alphaIold) * oS.K[i, j] - oS.labelMat[j] * (
                    oS.alphas[j] - alphaJold) * oS.K[j, j]
        if 0 < oS.alphas[i] and oS.C > oS.alphas[i]:
            oS.b = b1
        elif 0 < oS.alphas[j] and oS.C > oS.alphas[j]:
            oS.b = b2
        else:
            oS.b = (b1 + b2) / 2.0
        return 1
    else:
        return 0


def smoP(dataMatIn, classLabels, C, toler, maxIter, kTup=("lin", 0)):
    oS = optStruct(mat(dataMatIn), mat(classLabels).transpose(), C, toler,kTup)
    iter = 0
    entireSet = True
    alphaPairsChanged = 0
    while (iter < maxIter) and (alphaPairsChanged > 0 or entireSet):
        alphaPairsChanged = 0
        if entireSet:
            for i in range(oS.m):
                alphaPairsChanged += innerL(i, oS)
                print("fullSet,iter: %d i:%d,pairs changed %d " % (iter, i, alphaPairsChanged))
            iter += 1
        else:
            nonBoundIs = nonzero((oS.alphas.A > 0) * (oS.alphas.A < C))[0]
            for i in nonBoundIs:
                alphaPairsChanged += innerL(i, oS)
                print("non-bound,iter: %d i:%d,pairs changed %d " % (iter, i, alphaPairsChanged))
            iter += 1
        if entireSet:
            entireSet = False
        elif alphaPairsChanged == 0:
            entireSet = True
        print("iteration number: %d " % iter)
    return oS.b, oS.alphas


def calcWs(alphas, dataArr, classLabels):
    X = mat(dataArr)
    labelMat = mat(classLabels).transpose()
    m, n = shape(X)
    w = zeros((n, 1))
    for i in range(m):
        w += multiply(alphas[i] * labelMat[i], X[i, :].T)
    return w


def kernelTrans(X, A, kTup):
    m, n = shape(X)
    K = mat(zeros((m, 1)))
    if kTup[0] == "lin":
        K = X * A.T
    elif kTup[0] == "rbf":
        for j in range(m):
            deltaRow = X[j, :] - A
            K[j] = deltaRow * deltaRow.T
        K = exp(K / (-1 * kTup[1] ** 2))
    else:
        raise NameError("houston we have a problem--that kernel is not recognized")
    return K


class optStruct:
    def __init__(self, dataMatIn, classLabels, C, toler, kTup):
        self.X = dataMatIn
        self.labelMat = classLabels
        self.C = C
        self.tol = toler
        self.m = shape(dataMatIn)[0]
        self.alphas = mat(zeros((self.m, 1)))
        self.b = 0
        self.eCache = mat(zeros((self.m, 2)))
        self.K = mat(zeros((self.m, self.m)))
        for i in range(self.m):
            self.K[:, i] = kernelTrans(self.X, self.X[i, :], kTup)


def testRbf(k1=1.3):
    dataArr, labelArr = loadDataSet('testSetRBF.txt')
    b, alphas = smoP(dataArr, labelArr, 200, 0.0001, 10000, ('rbf', k1))  # C=200 important
    datMat = mat(dataArr)
    labelMat = mat(labelArr).transpose()
    svInd = nonzero(alphas.A > 0)[0]
    sVs = datMat[svInd]  # get matrix of only support vectors
    labelSV = labelMat[svInd]
    print("there are %d Support Vectors" % shape(sVs)[0])
    m, n = shape(datMat)
    errorCount = 0
    for i in range(m):
        kernelEval = kernelTrans(sVs, datMat[i, :], ('rbf', k1))
        predict = kernelEval.T * multiply(labelSV, alphas[svInd]) + b
        if sign(predict) != sign(labelArr[i]): errorCount += 1
    print("the training error rate is: %f" % (float(errorCount) / m))
    dataArr, labelArr = loadDataSet('testSetRBF2.txt')
    errorCount = 0
    datMat = mat(dataArr)
    labelMat = mat(labelArr).transpose()
    m, n = shape(datMat)
    for i in range(m):
        kernelEval = kernelTrans(sVs, datMat[i, :], ('rbf', k1))
        predict = kernelEval.T * multiply(labelSV, alphas[svInd]) + b
        if sign(predict) != sign(labelArr[i]): errorCount += 1
    print("the test error rate is: %f" % (float(errorCount) / m))


def img2vector(filename):
    returnVect = zeros((1, 1024))
    with open(filename) as fr:  # close the file once the 32 rows are read
        for i in range(32):
            lineStr = fr.readline()
            for j in range(32):
                returnVect[0, 32 * i + j] = int(lineStr[j])
    return returnVect


def loadImages(dirName):
    from os import listdir
    hwLabels = []
    trainingFileList = listdir(dirName)  # load the training set
    m = len(trainingFileList)
    trainingMat = zeros((m, 1024))
    for i in range(m):
        fileNameStr = trainingFileList[i]
        fileStr = fileNameStr.split('.')[0]  # take off .txt
        classNumStr = int(fileStr.split('_')[0])
        if classNumStr == 9:
            hwLabels.append(-1)
        else:
            hwLabels.append(1)
        trainingMat[i, :] = img2vector('%s/%s' % (dirName, fileNameStr))
    return trainingMat, hwLabels


def testDigits(kTup=('rbf', 10)):
    dataArr, labelArr = loadImages('D:\\python\\PyCode\\machinelearninginaction\\Ch02\\trainingDigits\\')
    b, alphas = smoP(dataArr, labelArr, 200, 0.0001, 10000, kTup)
    datMat = mat(dataArr)
    labelMat = mat(labelArr).transpose()
    svInd = nonzero(alphas.A > 0)[0]
    sVs = datMat[svInd]
    labelSV = labelMat[svInd]
    print("there are %d Support Vectors" % shape(sVs)[0])
    m, n = shape(datMat)
    errorCount = 0
    for i in range(m):
        kernelEval = kernelTrans(sVs, datMat[i, :], kTup)
        predict = kernelEval.T * multiply(labelSV, alphas[svInd]) + b
        if sign(predict) != sign(labelArr[i]): errorCount += 1
    print("the training error rate is: %f" % (float(errorCount) / m))
    dataArr, labelArr = loadImages('D:\\python\\PyCode\\machinelearninginaction\\Ch02\\testDigits\\')
    errorCount = 0
    datMat = mat(dataArr)
    labelMat = mat(labelArr).transpose()
    m, n = shape(datMat)
    for i in range(m):
        kernelEval = kernelTrans(sVs, datMat[i, :], kTup)
        predict = kernelEval.T * multiply(labelSV, alphas[svInd]) + b
        if sign(predict) != sign(labelArr[i]): errorCount += 1
    print("the test error rate is: %f" % (float(errorCount) / m))


if __name__ == '__main__':
    # Load the data and the class labels
    dataArr, labelArr = loadDataSet(".\\testSet.txt")
    print(dataArr)
    print(labelArr)

    # # Simplified SMO
    # b,alphas=smoSimple(dataArr,labelArr,0.6,0.001,40)
    # print(b)
    # print(alphas[alphas>0])

    # b, alphas = smoP(dataArr, labelArr, 0.6, 0.001, 40)
    # calcWs(alphas,dataArr,labelArr)

    # Test function that classifies with the radial basis (RBF) kernel
    # testRbf()
    # showDataSet(dataArr,labelArr)

    # Handwritten digit recognition
    testDigits()

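Summary

Advantages

  1. Non-linear mapping is the theoretical basis of the SVM method; the SVM uses an inner-product kernel function in place of an explicit non-linear mapping into a high-dimensional space.
  2. The support vectors are the result of SVM training, and they are what determines the SVM's classification decisions.
  3. The final decision function of an SVM is determined by only a small number of support vectors, so the computational complexity depends on the number of support vectors rather than on the dimensionality of the sample space. In a sense this avoids the "curse of dimensionality"; generalization tends to be good and overfitting is less likely.

Disadvantages

  1. Slow on large training sets (training involves computations on an m-by-m matrix, where m is the number of samples)
  2. The classical SVM is not suited to multi-class classification
  3. Sensitive to missing data, parameter choices, and the choice of kernel function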