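Reposted from: "That Stuff About SVM", http://www.matlabsky.com/thread-10966-1-1.html
MATLAB technical forum: MATLABsky.com; video tutorials, downloads and discussion: http://www.matlabsky.com/forum-5-1.html
Installation (video): http://v.youku.com/v_showMini/id_XMjc2NTY3MzYw_ft_131.html
1. Set the path: use "Add with Subfolders" to add the toolbox directory, so that the subfolders of the toolbox folder are also on the MATLAB search path.
2. Select a compiler: mex -setup (note the space after mex).
3. Compile: change the MATLAB current directory to the folder of the libsvm toolbox, then run make (or double-click make.m and run it).
PS: Running help svmtrain shows the help text of MATLAB's own built-in svmtrain function, not LIBSVM's.
Running help svmpredict reports an error: svmpredict not found.
The README file in the toolbox serves as the help documentation.
Press the Tab key to auto-complete function names.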
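The three installation steps can also be run from the command line. The following is a minimal sketch, not the toolbox's official procedure: the folder name `libsvm` and the assumption that make.m sits in a `matlab` subfolder are hypothetical, so adjust both to your own copy.

```matlab
% Hypothetical install location; replace with the folder of your LIBSVM copy.
libsvm_root = fullfile(pwd, 'libsvm');
matlab_dir  = fullfile(libsvm_root, 'matlab');    % assumed location of make.m

if exist(fullfile(matlab_dir, 'make.m'), 'file') == 2
    addpath(genpath(libsvm_root));   % same effect as "Add with Subfolders"
    old_dir = cd(matlab_dir);        % make.m is run from its own folder
    % mex -setup                     % run once, interactively, to pick a compiler
    make                             % builds the MEX files
    cd(old_dir);                     % return to the previous folder
end

% exist() reports 3 for a MEX file found on the search path.
is_built = (exist('svmpredict', 'file') == 3);
fprintf('svmpredict MEX file found: %d\n', is_built);
```

If the last line prints 0, the build did not produce the MEX files or the folder is not on the search path.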
Test data:
1. Official LIBSVM test data (the raw files are in the format used by the C++ version of libsvm; in MATLAB, read them with libsvmread to convert the format)
http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
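PS: When you pass just a file name, the data file must be in the current folder, not in one of its subfolders.
The data files are written for the C++ platform and cannot be read directly with load in MATLAB; use libsvmread:
[label_vector, instance_matrix] = libsvmread('filename') (leave off the trailing ; to echo the result; a semicolon only suppresses the printout)
Example: [label_vector, instance_matrix] = libsvmread('breast_cancer.txt')
PS: The Workspace panel shows the size of the current data set and the number of attributes.
Model training: model = svmtrain(label_vector, instance_matrix)
2. UCI database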
http://archive.ics.uci.edu/ml/
Import into MATLAB and convert to a .mat file:
1) Import: File > Import Data (the data import tool in R2014a), with comma as the delimiter
2)load wine.data / load('wine.data')
3) Extract the labels and the attribute matrix, then save them as a .mat file that MATLAB can load
wine_label = wine(:,1);
wine_data = wine(:,2:end);
save winedat.mat
4) Apply the model
load winedat;
modelw = svmtrain(wine_label,wine_data);
[plabelw,accuracyw] = svmpredict(wine_label,wine_data,modelw);
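PS: File > Import Data brings an external (not built-in) data file into MATLAB; the uiimport command does the same.
Saving data: at 36 min in the video
F9: runs the lines selected in the editor (script or function file)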
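The wine example above predicts on the same rows it trained on, so the reported accuracy is optimistic. A hold-out split gives a fairer estimate. The sketch below assumes wine.data is in the current folder and that LIBSVM's MEX files come before any other svmtrain on the search path; the 70/30 ratio and the fixed seed are arbitrary choices, not something the original post specifies.

```matlab
have_data   = (exist('wine.data', 'file') == 2);    % data file in the current folder
have_libsvm = (exist('svmpredict', 'file') == 3);   % LIBSVM MEX files on the path

if have_data && have_libsvm
    wine = load('wine.data', '-ascii');   % comma-separated numeric text
    wine_label = wine(:, 1);              % first column: class label
    wine_data  = wine(:, 2:end);          % remaining columns: attributes

    rng(0);                               % fixed seed so the split is repeatable
    n       = size(wine_data, 1);
    idx     = randperm(n);
    n_train = round(0.7 * n);             % 70/30 split (arbitrary choice)
    train_idx = idx(1:n_train);
    test_idx  = idx(n_train + 1:end);

    model = svmtrain(wine_label(train_idx), wine_data(train_idx, :));
    % Three outputs are requested because some LIBSVM releases reject a two-output call.
    [plabel, accuracy, dec_values] = svmpredict(wine_label(test_idx), ...
                                                wine_data(test_idx, :), model);
    fprintf('Hold-out accuracy: %.2f%%\n', accuracy(1));
end
```

Scaling the attributes before training usually matters for this data set, but that step is left out here to keep the sketch close to the original workflow.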
LibSVM README (excerpts, with summary notes)
Returned Model Structure
========================
The 'svmtrain' function returns a model which can be used for future prediction. It is a structure and is organized as [Parameters, nr_class, totalSV, rho, Label, sv_indices, ProbA, ProbB, nSV, sv_coef, SVs]:
-Parameters: parameters
-nr_class: number of classes; = 2 for regression/one-class svm
-totalSV: total #SV
-rho: -b of the decision function(s) wx+b
-Label: label of each class; empty for regression/one-class SVM
-sv_indices: values in [1,...,num_training_data] to indicate SVs in the training set
-ProbA: pairwise probability information; empty if -b 0 or in one-class SVM
-ProbB: pairwise probability information; empty if -b 0 or in one-class SVM
-nSV: number of SVs for each class; empty for regression/one-class SVM
-sv_coef: coefficients for SVs in decision functions
-SVs: support vectors
If you do not use the option '-b 1', ProbA and ProbB are empty matrices. If the '-v' option is specified, cross validation is conducted and the returned model is just a scalar: cross-validation accuracy for classification and mean-squared error for regression.
More details about this model can be found in LIBSVM FAQ
(http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html) and LIBSVM
implementation document
(http://www.csie.ntu.edu.tw/~cjlin/papers/libsvm.pdf).
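Summary: 'svmtrain' returns a model for later prediction. The model is a structure with the fields [Parameters, nr_class, totalSV, rho, Label, sv_indices, ProbA, ProbB, nSV, sv_coef, SVs]:
-Parameters: parameters
-nr_class: number of classes; 2 for regression and one-class SVM
-totalSV: total number of SVs
-rho: -b of the decision function(s) wx+b
-Label: label of each class; empty for regression and one-class SVM
-sv_indices: values in [1,...,num_training_data] indicating which training instances are SVs
-ProbA: pairwise probability information; empty with -b 0 or for one-class SVM
-ProbB: same as ProbA
-nSV: number of SVs in each class; empty for regression and one-class SVM
-sv_coef: coefficients of the SVs in the decision functions
-SVs: support vectors
Without the option '-b 1', ProbA and ProbB are empty matrices. With the '-v' option, cross validation is run and the returned model is just a scalar:
cross-validation accuracy for classification, mean squared error for regression.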
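To see these fields on a real model, the following sketch trains on toy data and prints part of the structure. The data are synthetic and made up for illustration. The recovery of w and b applies only to a two-class model with a linear kernel, and it assumes the first training label becomes Label(1), so that positive decision values correspond to that class. The block does nothing when LIBSVM's MEX files are not on the path.

```matlab
has_libsvm = (exist('svmpredict', 'file') == 3);   % LIBSVM MEX files on the path

if has_libsvm
    rng(0);                                          % repeatable synthetic data
    data   = [randn(20, 2) + 2; randn(20, 2) - 2];   % two well-separated clusters
    labels = [ones(20, 1); -ones(20, 1)];            % +1 appears first

    model = svmtrain(labels, data, '-t 0 -c 1 -q');  % linear kernel, quiet mode
    disp(fieldnames(model));                         % lists the fields described above
    fprintf('nr_class = %d, totalSV = %d\n', model.nr_class, model.totalSV);

    % Linear kernel, two classes: the decision function is w'*x + b.
    w = full(model.SVs)' * model.sv_coef;
    b = -model.rho;                                  % rho stores -b

    % With '-v' the return value is a scalar, not a model structure.
    cv_accuracy = svmtrain(labels, data, '-t 0 -c 1 -v 5 -q');
    fprintf('5-fold cross-validation accuracy: %.1f%%\n', cv_accuracy);
end
```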
Result of Prediction
====================
The function 'svmpredict' has three outputs. The first one, predicted_label, is a vector of predicted labels. The second output, accuracy, is a vector including accuracy (for classification), mean squared error, and squared correlation coefficient (for regression). The third is a matrix containing decision values or probability estimates (if '-b 1' is specified). If k is the number of classes in training data, for decision values, each row includes results of predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a special case. Decision value +1 is returned for each testing instance, instead of an empty vector. For probabilities, each row contains k values indicating the probability that the testing instance is in each class. Note that the order of classes here is the same as 'Label' field in the model structure.
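Summary: 'svmpredict' has three outputs:
1. A vector of predicted labels.
2. Accuracy: a vector holding the accuracy (classification), the mean squared error and the squared correlation coefficient (regression).
3. A matrix of decision values, or of probability estimates when '-b 1' is set.
If k is the number of classes in the training data, each row of the decision values holds the results of the k(k-1)/2 binary-class SVMs. For classification, k = 1 is a special case: a decision value of +1 is returned for each test instance instead of an empty vector. For probabilities, each row holds k values giving the probability that the test instance belongs to each class. The class order is the same as in the 'Label' field of the model structure.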
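The k(k-1)/2 count is easier to picture with a concrete k. The sketch below enumerates the class pairs for a hypothetical Label vector of four classes. The pair order shown (first class against each later class, then the second, and so on) is my understanding of how LIBSVM arranges the columns, so treat it as an assumption and check it against the LIBSVM FAQ before relying on it.

```matlab
labels = [3; 1; 2; 5];            % hypothetical contents of model.Label
k      = numel(labels);

pairs  = nchoosek(1:k, 2);        % index pairs (i, j) with i < j, one per row
n_cols = size(pairs, 1);          % number of binary SVMs, k*(k-1)/2

for c = 1:n_cols
    fprintf('decision value column %d: class %d vs class %d\n', ...
            c, labels(pairs(c, 1)), labels(pairs(c, 2)));
end
```

For k = 4 this lists six columns, which matches 4*3/2.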
Other Utilities
===============
A matlab function libsvmread reads files in LIBSVM format:
[label_vector, instance_matrix] = libsvmread('data.txt');
Two outputs are labels and instances, which can then be used as inputs of svmtrain or svmpredict.
A matlab function libsvmwrite writes Matlab matrix to a file in LIBSVM format:
libsvmwrite('data.txt', label_vector, instance_matrix)
The instance_matrix must be a sparse matrix. (type must be double) For 32bit and 64bit MATLAB on Windows, pre-built binary files are ready in the directory `..\windows', but in future releases, we will only include 64bit MATLAB binary files.
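Summary: 'libsvmread' reads data in LIBSVM format and returns the labels and the instances, which can be passed to svmtrain and svmpredict.
'libsvmwrite' writes a MATLAB matrix to a file in LIBSVM format. The instance_matrix must be a sparse matrix. For 32-bit and 64-bit MATLAB on Windows, pre-built binary files are in the directory '..\windows', but future releases will include only the 64-bit binaries.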
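Because libsvmwrite needs a sparse matrix of type double, a dense or single-precision matrix has to be converted first. A minimal sketch with made-up toy values follows; the file name toy_libsvm.txt is hypothetical, and the write/read round trip runs only when LIBSVM's MEX files are on the path.

```matlab
X = single([0 1.5 0; 2 0 0; 0 0 3]);   % toy dense, single-precision attribute matrix
y = [1; -1; 1];                         % toy labels

% Convert to double before sparse(): libsvmwrite expects a sparse double matrix.
X_sparse = sparse(double(X));

if exist('libsvmwrite', 'file') == 3 && exist('libsvmread', 'file') == 3
    out_file = 'toy_libsvm.txt';        % hypothetical file name, written to the current folder
    libsvmwrite(out_file, y, X_sparse);
    [y2, X2] = libsvmread(out_file);    % read it back in LIBSVM format
    same = isequal(y, y2) && isequal(full(X_sparse), full(X2));
    fprintf('round trip preserved the data: %d\n', same);
end
```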