机器学习_SVM实现多分类_持续更新

最新推荐文章于 2024-06-16 14:11:15 发布

Longtermevolution

最新推荐文章于 2024-06-16 14:11:15 发布

阅读量1.2k

点赞数

分类专栏：机器学习

本文链接：https://blog.csdn.net/Longtermevolution/article/details/103996254

版权

机器学习专栏收录该内容

1 篇文章 0 订阅

订阅专栏

导语:

直接法

间接法

（1）一对多法（one-versus-rest,简称OVR SVMs）

（2）一对一法（one-versus-one,简称OVO SVMs或者pairwise）

导语:

直接方法尽管看起来简洁，但是在最优化问题求解过程中的变量远远多于第一类方法，训练速度不及间接方法，而且在分类精度上也不占优。当训练样本数非常大时，这一问题更加突出。正因如此，间接方法更为常用。

直接法

直接在目标函数上进行修改，将多个分类面的参数求解合并到一个最优化问题中，通过求解该最优化问题“一次性”实现多类分类。这种方法看似简单，但其计算复杂度比较高，实现起来比较困难，只适合用于小型问题中；

间接法

（1）一对多法（one-versus-rest,简称OVR SVMs）

训练时依次把某个类别的样本归为一类,其他剩余的样本归为另一类，这样k个类别的样本就构造出了k个SVM。分类时将未知样本分类为具有最大分类函数值的那类。

　　假如我有四类要划分（也就是4个Label），他们是A、B、C、D。

　　于是我在抽取训练集的时候，分别抽取

　　（1）A所对应的向量作为正集，B，C，D所对应的向量作为负集；

　　（2）B所对应的向量作为正集，A，C，D所对应的向量作为负集；

　　（3）C所对应的向量作为正集，A，B，D所对应的向量作为负集；

　　（4）D所对应的向量作为正集，A，B，C所对应的向量作为负集；

　　使用这四个训练集分别进行训练，然后的得到四个训练结果文件。

　　在测试的时候，把对应的测试向量分别利用这四个训练结果文件进行测试。

　　最后每个测试都有一个结果f1(x),f2(x),f3(x),f4(x)。

　　于是最终的结果便是这四个值中最大的一个作为分类结果。

评价

优点：训练k个分类器，个数较少，其分类速度相对较快。

缺点：

①每个分类器的训练都是将全部的样本作为训练样本，这样在求解二次规划问题时，训练速度会随着训练样本的数量的增加而急剧减慢；

②同时由于负类样本的数据要远远大于正类样本的数据，从而出现了样本不对称的情况，且这种情况随着训练数据的增加而趋向严重。解决不对称的问题可以引入不同的惩罚因子，对样本点来说较少的正类采用较大的惩罚因子C；

③还有就是当有新的类别加进来时，需要对所有的模型进行重新训练。

#-*- coding:utf-8 -*-
'''


'''
#svm 高斯核函数实现多分类
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import datasets

sess = tf.Session()

#加载数据集，并为每类分离目标值
iris = datasets.load_iris()
#提取数据的方法
x_vals = np.array([[x[0],x[3]] for x in iris.data])

y_vals1 = np.array([1 if y==0 else -1 for y in iris.target])
y_vals2 = np.array([1 if y==1 else -1 for y in iris.target])
y_vals3 = np.array([1 if y==2 else -1 for y in iris.target])
#合并数据的方法
y_vals = np.array([y_vals1,y_vals2,y_vals3])
#数据集四个特征，只是用两个特征就可以
class1_x = [x[0] for i,x in enumerate(x_vals) if iris.target[i]==0]
class1_y = [x[1] for i,x in enumerate(x_vals) if iris.target[i]==0]
class2_x = [x[0] for i,x in enumerate(x_vals) if iris.target[i]==1]
class2_y = [x[1] for i,x in enumerate(x_vals) if iris.target[i]==1]
class3_x = [x[0] for i,x in enumerate(x_vals) if iris.target[i]==2]
class3_y = [x[1] for i,x in enumerate(x_vals) if iris.target[i]==2]
#从单类目标分类到三类目标分类，利用矩阵传播和reshape技术一次性计算所有的三类SVM，一次性计算，y_target的占位符维度是[3,None]
batch_size = 50
x_data = tf.placeholder(shape = [None,2],dtype=tf.float32)
y_target = tf.placeholder(shape=[3,None],dtype=tf.float32)
#TODO
prediction_grid = tf.placeholder(shape=[None,2],dtype=tf.float32)
b = tf.Variable(tf.random_normal(shape=[3,batch_size]))
#计算高斯核函数 TODO
gamma = tf.constant(-10.0)
dist = tf.reduce_sum(tf.square(x_data),1)
dist = tf.reshape(dist,[-1,1])
sq_dists = tf.add(tf.subtract(dist,tf.multiply(2.,tf.matmul(x_data,tf.transpose(x_data)))),tf.transpose(dist))
my_kernel = tf.exp(tf.multiply(gamma,tf.abs(sq_dists)))
#扩展矩阵维度
def reshape_matmul(mat):
    v1 = tf.expand_dims(mat,1)
    v2 = tf.reshape(v1,[3,batch_size,1])
    return (tf.matmul(v2,v1))
#计算对偶损失函数
model_output = tf.matmul(b,my_kernel)
first_term = tf.reduce_sum(b)
b_vec_cross = tf.matmul(tf.transpose(b),b)
y_target_cross = reshape_matmul(y_target)
second_term = tf.reduce_sum(tf.multiply(my_kernel,tf.multiply(b_vec_cross,y_target_cross)),[1,2])
loss = tf.reduce_sum(tf.negative(tf.subtract(first_term,second_term)))
#创建预测核函数
rA = tf.reshape(tf.reduce_sum(tf.square(x_data),1),[-1,1])
rB = tf.reshape(tf.reduce_sum(tf.square(prediction_grid),1),[-1,1])
pred_sq_dist = tf.add(tf.subtract(rA,tf.multiply(2.,tf.matmul(x_data,tf.transpose(prediction_grid)))),tf.transpose(rB))
pred_kernel = tf.exp(tf.multiply(gamma,tf.abs(pred_sq_dist)))
#创建预测函数，这里实现的是一对多的方法，所以预测值是分类器有最大返回值的类别
prediction_output = tf.matmul(tf.multiply(y_target,b),pred_kernel)
prediction = tf.arg_max(prediction_output-tf.expand_dims(tf.reduce_mean(prediction_output,1),1),0)
accuracy = tf.reduce_mean(tf.cast(tf.equal(prediction,tf.arg_max(y_target,0)),tf.float32))
#准备好核函数，损失函数，预测函数以后，声明优化器函数和初始化变量
my_opt = tf.train.GradientDescentOptimizer(0.01)
train_step = my_opt.minimize(loss)
init = tf.initialize_all_variables()
sess.run(init)
#该算法收敛的相当快，所以迭代训练次数不超过100次
loss_vec = []
batch_accuracy = []
for i in range(100):
    rand_index = np.random.choice(len(x_vals),size=batch_size)
    rand_x = x_vals[rand_index]
    rand_y = y_vals[:,rand_index]
    sess.run(train_step,feed_dict={x_data:rand_x,y_target:rand_y})
    temp_loss = sess.run(loss,feed_dict={x_data:rand_x,y_target:rand_y})
    loss_vec.append(temp_loss)
    acc_temp = sess.run(accuracy,feed_dict={x_data:rand_x,y_target:rand_y,prediction_grid:rand_x})
    batch_accuracy.append(acc_temp)
    if(i+1)%25 ==0:
        print('Step #' + str(i+1))
        print('Loss #' + str(temp_loss))

x_min,x_max = x_vals[:,0].min()-1,x_vals[:,0].max()+1
y_min,y_max = x_vals[:,1].min()-1,x_vals[:,1].max()+1
xx,yy = np.meshgrid(np.arange(x_min,x_max,0.02),np.arange(y_min,y_max,0.02))
grid_points = np.c_[xx.ravel(),yy.ravel()]
grid_predictions = sess.run(prediction,feed_dict={x_data:rand_x,y_target:rand_y,prediction_grid:grid_points})
grid_predictions = grid_predictions.reshape(xx.shape)
#绘制训练结果，批量准确度和损失函数
#等高线图
plt.contourf(xx,yy,grid_predictions,cmap=plt.cm.Paired,alpha=0.8)
plt.plot(class1_x,class1_y,'ro',label = 'I.setosa')
plt.plot(class2_x,class2_y,'kx',label = 'I.versicolor')
plt.plot(class3_x,class3_y,'gv',label = 'T.virginica')
plt.title('Gaussian Svm Results on Iris Data')
plt.xlabel('Pedal Length')
plt.ylabel('Sepal Width')
plt.legend(loc='lower right')
plt.ylim([-0.5,3.0])
plt.xlim([3.5,8.5])
plt.show()

plt.plot(batch_accuracy,'k-',label='Accuracy')
plt.title('Batch Accuracy')
plt.xlabel('Generation')
plt.ylabel('Sepal Width')
plt.legend(loc = 'lower right')
plt.show()

plt.plot(loss_vec,'k--')
plt.title('Loss per Generation')
plt.xlabel('Generation')
plt.ylabel('Loss')
plt.show()

（2）一对一法（one-versus-one,简称OVO SVMs或者pairwise）

其做法是在任意两类样本之间设计一个SVM，因此k个类别的样本就需要设计k(k-1)/2个SVM。

　　当对一个未知样本进行分类时，最后得票最多的类别即为该未知样本的类别。

　　Libsvm中的多类分类就是根据这个方法实现的。

　　假设有四类A,B,C,D四类。在训练的时候我选择A,B; A,C; A,D; B,C; B,D;C,D所对应的向量作为训练集，然后得到六个训练结果，在测试的时候，把对应的向量分别对六个结果进行测试，然后采取投票形式，最后得到一组结果。

　　投票是这样的：
　　A=B=C=D=0;
　　(A,B)-classifier 如果是A win,则A=A+1;otherwise,B=B+1;
　　(A,C)-classifier 如果是A win,则A=A+1;otherwise, C=C+1;
　　...
　　(C,D)-classifier 如果是A win,则C=C+1;otherwise,D=D+1;
　　The decision is the Max(A,B,C,D)

评价：这种方法虽然好,但是当类别很多的时候,model的个数是n*(n-1)/2,代价还是相当大的。

评价：

优点：不需要重新训练所有的SVM，只需要重新训练和增加语音样本相关的分类器。在训练单个模型时，相对速度较快。

缺点：所需构造和测试的二值分类器的数量关于k成二次函数增长，总训练时间和测试时间相对较慢。

从“一对一”的方式出发，出现了有向无环图（DirectedAcyclic Graph）的分类方法。

Longtermevolution

关注

0
点赞
踩
8

收藏

觉得还不错? 一键收藏
0
评论
机器学习_SVM实现多分类_持续更新

目录导语:直接法间接法（1）一对多法（one-versus-rest,简称OVR SVMs）（2）一对一法（one-versus-one,简称OVO SVMs或者pairwise）导语:直接方法尽管看起来简洁，但是在最优化问题求解过程中的变量远远多于第一类方法，训练速度不及间接方法，而且在分类精度上也不占优。当训练样本数非常大时，这一问题更加突出。正因如此，...
复制链接

扫一扫

专栏目录