Machine Learning--SVMs

Basic Introduction

linearly separable: the case where the two classes of data can be separated by a straight line (or, in higher dimensions, a flat boundary)

hyperplane: the decision boundary; for N-dimensional data it is generally (N-1)-dimensional

margin: the distance from a point to the hyperplane; the functional margin is margin = label * (w^T x + b)
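As a quick illustration of the formula above (a minimal sketch; the weights w and b here are made up for demonstration), the functional margin of a point is positive exactly when the point falls on the correct side of the hyperplane:

```python
import numpy as np

# hypothetical separating hyperplane: w^T x + b = 0
w = np.array([1.0, -1.0])
b = -0.5

def functional_margin(x, label):
    # margin = label * (w^T x + b); positive means x is classified correctly
    return label * (np.dot(w, x) + b)

print(functional_margin(np.array([2.0, 0.0]), 1))   # point on the +1 side -> 1.5
print(functional_margin(np.array([0.0, 2.0]), -1))  # point on the -1 side -> 2.5
```

A larger margin means the point sits further from the decision boundary, which is what the SVM tries to maximize for the closest points.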

support vectors: the point(s) closest to the hyperplane

An SVM looks for the hyperplane with the maximum margin.

First find the point(s) closest to the hyperplane, compute their margin, and then maximize that smallest margin.

To simplify the computation, Lagrange multipliers are used to recast the optimization above:
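The formula image that followed here did not survive extraction; the standard dual objective that the Lagrange multipliers yield (for the separable case, with dot products between training points) is:

```latex
\max_{\alpha}\;\sum_{i=1}^{m}\alpha_i
  - \frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{m}\alpha_i\,\alpha_j\,y_i\,y_j\,x_i^{\mathsf{T}}x_j
\quad \text{s.t.}\quad \sum_{i=1}^{m}\alpha_i y_i = 0,\qquad \alpha_i \ge 0
```

With slack variables (introduced next), the last constraint becomes 0 ≤ αᵢ ≤ C.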

Note that this assumes all the data are linearly separable. To make the method tolerate misclassified points, slack variables are introduced, with a constant C that we set controlling the trade-off.

The goal of training an SVM therefore becomes finding the αs.

Sequential Minimal Optimization

SMO is the optimization algorithm used to find the αs; it repeatedly picks a pair of alphas and optimizes them jointly while holding the others fixed.

simple SMO

from numpy import *
import random

# open the file and parse each line into class labels and data matrix

def loadDataSet(fileName):
    dataMat = []
    labelMat = []
    fr = open(fileName)
    for line in fr.readlines():
        lineArr = line.strip().split('\t')
        dataMat.append([float(lineArr[0]), float(lineArr[1])])
        labelMat.append(float(lineArr[2]))
    return dataMat, labelMat
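A quick check of loadDataSet (a sketch; tinySet.txt is a made-up filename, and the function is repeated here so the snippet runs on its own). It writes a small file in the expected tab-separated format, then parses it back:

```python
# loadDataSet as defined above, repeated so this sketch is self-contained
def loadDataSet(fileName):
    dataMat = []
    labelMat = []
    fr = open(fileName)
    for line in fr.readlines():
        lineArr = line.strip().split('\t')
        dataMat.append([float(lineArr[0]), float(lineArr[1])])
        labelMat.append(float(lineArr[2]))
    return dataMat, labelMat

# write a tiny tab-separated dataset: x1 <TAB> x2 <TAB> label
with open('tinySet.txt', 'w') as f:
    f.write('1.0\t2.0\t-1\n3.0\t4.0\t1\n')

dataMat, labelMat = loadDataSet('tinySet.txt')
print(dataMat)   # [[1.0, 2.0], [3.0, 4.0]]
print(labelMat)  # [-1.0, 1.0]
```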

# return a random alpha index j that differs from the given index i; m is the total number of alphas
def selectJrand(i, m):
    j=i
    while (j==i):
        j = int(random.uniform(0, m))
    return j

# clip aj so that it stays within the interval [L, H]
def clipAlpha(aj, H, L):
    if aj > H:
        aj = H
    if L > aj:
        aj = L
    return aj
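The two helpers can be exercised directly (a small sketch; both functions are repeated here so the snippet runs standalone):

```python
import random

def selectJrand(i, m):
    # pick a random index j in [0, m) that differs from i (as defined above)
    j = i
    while j == i:
        j = int(random.uniform(0, m))
    return j

def clipAlpha(aj, H, L):
    # clamp aj into the interval [L, H] (as defined above)
    if aj > H:
        aj = H
    if L > aj:
        aj = L
    return aj

print(selectJrand(0, 5))          # some index in 1..4, never 0
print(clipAlpha(1.5, 1.0, 0.0))   # 1.5 is above H, clipped down to 1.0
print(clipAlpha(-0.2, 1.0, 0.0))  # -0.2 is below L, clipped up to 0.0
print(clipAlpha(0.3, 1.0, 0.0))   # 0.3 is inside [L, H], unchanged
```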

# C, slack constant; toler, tolerance; maxIter, max number of passes without change before quitting
def smoSimple(dataMatIn, classLabels, C, toler, maxIter):
    dataMatrix = mat(dataMatIn)
    # a column matrix
    labelMat = mat(classLabels).transpose()
    b = 0
    m, n = shape(dataMatrix)
    # a column matrix initialized to zero
    alphas = mat(zeros((m, 1)))
    # each time we go through the dataset without any alphas changed, iter will increase by 1.
    iter = 0
    while (iter < maxIter):
        # alphaPairsChanged counts how many alpha pairs are optimized during this pass through the dataset
        alphaPairsChanged = 0
        for i in range(m):
            # our prediction of the class
            fXi = float(multiply(alphas, labelMat).T * (dataMatrix*dataMatrix[i, :].T)) + b
            # the error between the prediction and the real class
            Ei = fXi - float(labelMat[i])
            # if the error is large enough and alphas[i] (which will be changed later) lies strictly \
            # inside its bounds, the alpha can be optimized
            if((labelMat[i]*Ei < -toler) and (alphas[i] < C)) or ((labelMat[i]*Ei > toler) and (alphas[i] > 0)):
                # randomly select a second alpha, alphas[j] (the first is alphas[i])
                j = selectJrand(i, m)
                # calculate the prediction and error as done on alphas[i]
                fXj = float(multiply(alphas, labelMat).T * (dataMatrix*dataMatrix[j, :].T)) + b
                Ej = fXj - float(labelMat[j])
                # save the old values of alphas[i] and alphas[j]
                alphaIold = alphas[i].copy()
                alphaJold = alphas[j].copy()
                # compute L and H so that alphas[j] ends up between 0 and C
                if(labelMat[i] != labelMat[j]):
                    L = max(0, alphas[j] - alphas[i])
                    H = min(C, C + alphas[j] - alphas[i])
                else:
                    L = max(0, alphas[j] + alphas[i] - C)
                    H = min(C, alphas[i] + alphas[j])
                # if L == H, we could not change anything.
                if L == H:
                    print("L == H")
                    continue
                # calculate the optimal amount to change alpha[j]
                eta = 2.0 * dataMatrix[i, :] * dataMatrix[j, :].T - dataMatrix[i, :] * dataMatrix[i, :].T - \
                      dataMatrix[j, :] * dataMatrix[j, :].T
                # if eta >= 0, the update direction is not usable, so skip this pair
                if eta >= 0:
                    print("eta >= 0")
                    continue
                # move alphas[j] along the error difference, then clip it into [L, H]
                alphas[j] -= labelMat[j] * (Ei - Ej) / eta
                alphas[j] = clipAlpha(alphas[j], H, L)
                if abs(alphas[j] - alphaJold) < 0.00001:
                    print("j not moving enough")
                    continue
                # update alphas[i] by the same amount in the opposite direction
                alphas[i] += labelMat[j] * labelMat[i] * (alphaJold - alphas[j])
                # recompute the threshold b from the updated alphas
                b1 = b - Ei - labelMat[i] * (alphas[i] - alphaIold) * dataMatrix[i, :] * dataMatrix[i, :].T - \
                     labelMat[j] * (alphas[j] - alphaJold) * dataMatrix[i, :] * dataMatrix[j, :].T
                b2 = b - Ej - labelMat[i] * (alphas[i] - alphaIold) * dataMatrix[i, :] * dataMatrix[j, :].T - \
                     labelMat[j] * (alphas[j] - alphaJold) * dataMatrix[j, :] * dataMatrix[j, :].T
                if (0 < alphas[i]) and (alphas[i] < C):
                    b = b1
                elif (0 < alphas[j]) and (alphas[j] < C):
                    b = b2
                else:
                    b = (b1 + b2) / 2.0
                alphaPairsChanged += 1
                print("iter: %d i: %d, pairs changed %d" % (iter, i, alphaPairsChanged))
        if alphaPairsChanged == 0:
            iter += 1
        else:
            iter = 0
        print("iteration number: %d" % iter)
    return b, alphas
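Once smoSimple returns b and the alphas, the separating hyperplane can be recovered as w = Σᵢ αᵢ yᵢ xᵢ, and a new point is classified by the sign of wᵀx + b. A sketch with made-up alphas and data (not the output of an actual run; only the support vectors have non-zero alphas):

```python
import numpy as np

# hypothetical training data, labels, and learned values for illustration
X = np.array([[1.0, 1.0], [2.0, 3.0], [4.0, 1.0]])
y = np.array([1.0, -1.0, 1.0])
alphas = np.array([0.5, 0.5, 0.0])  # the third point is not a support vector
b = 0.2

# w = sum_i alpha_i * y_i * x_i
w = (alphas * y) @ X
print(w)  # [-0.5 -1. ]

def classify(x):
    # predicted label is the sign of w^T x + b
    return 1.0 if float(w @ x) + b > 0 else -1.0

print(classify(np.array([0.0, 0.0])))  # 1.0
print(classify(np.array([1.0, 1.0])))  # -1.0
```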