
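-s svm_type: set the type of SVM (default 0)
  0 -- C-SVC
  1 -- nu-SVC
  2 -- one-class SVM
  3 -- epsilon-SVR
  4 -- nu-SVR
-t kernel_type: set the type of kernel function (default 2)
  0 -- linear: u'*v
  1 -- polynomial: (gamma*u'*v + coef0)^degree
  2 -- RBF: exp(-gamma*|u-v|^2)
  3 -- sigmoid: tanh(gamma*u'*v + coef0)

-g gamma: set gamma in the kernel function (used by the polynomial, RBF, and sigmoid kernels)

-c cost: set the parameter C (cost) of C-SVC, epsilon-SVR, and nu-SVR (default 1)

-p epsilon: set epsilon in the loss function of epsilon-SVR (default 0.1)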
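
As a rough illustration of the kernel formulas listed under '-t', the sketch below evaluates each of them for one pair of vectors in plain MATLAB. The vectors and the values of gamma, coef0 and degree are arbitrary assumptions chosen for readability. libsvm computes these kernels internally, so this is only meant to show what the formulas mean, not how the library implements them.

```matlab
% Illustrative values only (not taken from the examples in this post)
u = [1; 2; 3];
v = [0.5; -1; 2];
gamma = 0.5;  coef0 = 1;  degree = 3;

dotUV = u' * v;                                % inner product u'*v

kLinear  = dotUV;                              % -t 0
kPoly    = (gamma * dotUV + coef0)^degree;     % -t 1
kRbf     = exp(-gamma * sum((u - v).^2));      % -t 2
kSigmoid = tanh(gamma * dotUV + coef0);        % -t 3

fprintf('linear=%g poly=%g rbf=%g sigmoid=%g\n', kLinear, kPoly, kRbf, kSigmoid);
```

Note that gamma appears in every kernel except the linear one, which is why '-g' has no effect when '-t 0' is used.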


%% a little cleanup
tic;
close all;
clear;
clc;
format compact;
%%

% generate the data to be regressed
x = (-1:0.1:1)';
y = -x.^2;

% train the regression model (epsilon-SVR with an RBF kernel)
model = svmtrain(y,x,'-s 3 -t 2 -c 2.2 -g 2.8 -p 0.01');

% check how well the model fits the training set
[py,mse] = svmpredict(y,x,model);
figure;
plot(x,y,'o');
hold on;
plot(x,py,'r*');
legend('original data','regression result');
grid on;

% predict a new point
testx = 1.1;
display('true value')
testy = -testx.^2

[ptesty,tmse] = svmpredict(testy,testx,model);
display('predicted value');
ptesty

%%
toc

Output:
Mean squared error = 9.52768e-005 (regression)
Squared correlation coefficient = 0.999184 (regression)
true value
testy =
   -1.2100
Mean squared error = 0.0102555 (regression)
Squared correlation coefficient = -1.#IND (regression)
predicted value
ptesty =
   -1.1087
Elapsed time is 0.133552 seconds.
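
The two numbers reported for the single test point can be checked by hand. The sketch below recomputes the mean squared error from the rounded values printed above, and evaluates the squared Pearson correlation written with running sums. The correlation formula is an assumption about what svmpredict reports, not something confirmed here; with only one sample it reduces to 0/0, which would explain the indeterminate value in the output (older Windows runtimes print NaN as -1.#IND).

```matlab
% Values copied from the printed output (rounded to 4 decimals)
testy  = -1.21;
ptesty = -1.1087;

% Mean squared error over the (single) test sample
mseSingle = mean((ptesty - testy).^2);

% Squared Pearson correlation written with running sums
n     = numel(testy);
sumP  = sum(ptesty);             sumT  = sum(testy);
sumPP = sum(ptesty .* ptesty);   sumTT = sum(testy .* testy);
sumPT = sum(ptesty .* testy);
r2 = (n*sumPT - sumP*sumT)^2 / ((n*sumPP - sumP*sumP) * (n*sumTT - sumT*sumT));

fprintf('MSE = %g, r^2 = %g\n', mseSingle, r2);
```

The recomputed error is close to, but not exactly, the printed 0.0102555, because ptesty is shown with only four decimals.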

Basic SVM: Linear-kernel SVM for binary classification

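Below is the first code to run. It performs binary classification with c = 1 and gamma (g) = 0.07; the option '-b 1' requests probability estimates. Note that no '-t' option is passed, so svmtrain uses its default kernel (RBF, '-t 2'), which is why gamma matters here.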

% This code simply runs the SVM on the example data set "heart_scale",
% which is already properly scaled. The code divides the data into 2 parts:
% train: 1 to 200
% test: 201 to 270
% It then plots the results against their true classes. To visualize the
% high-dimensional data, we apply MDS to the 13D data and reduce the
% dimension to 2D.

close all

% addpath to the libsvm toolbox

% addpath to the data
dirData = '../libsvm-3.12'; 

% read the data set
[heart_scale_label, heart_scale_inst] = libsvmread(fullfile(dirData,'heart_scale'));
[N D] = size(heart_scale_inst);

% Determine the train and test index
trainIndex = zeros(N,1); trainIndex(1:200) = 1;
testIndex = zeros(N,1); testIndex(201:N) = 1;
trainData = heart_scale_inst(trainIndex==1,:);
trainLabel = heart_scale_label(trainIndex==1,:);
testData = heart_scale_inst(testIndex==1,:);
testLabel = heart_scale_label(testIndex==1,:);

% Train the SVM
model = svmtrain(trainLabel, trainData, '-c 1 -g 0.07 -b 1');
% Use the SVM model to classify the data
[predict_label, accuracy, prob_values] = svmpredict(testLabel, testData, model, '-b 1'); % run the SVM model on the test data

% ================================
% ===== Showing the results ======
% ================================

% Assign color for each class
% colorList = generateColorList(2);  % This is my own way to assign the color...don't worry about it
colorList = prism(100);

% true (ground truth) class
trueClassIndex = zeros(N,1);
trueClassIndex(heart_scale_label==1) = 1; 
trueClassIndex(heart_scale_label==-1) = 2;
colorTrueClass = colorList(trueClassIndex,:);
% result Class
resultClassIndex = zeros(length(predict_label),1);
resultClassIndex(predict_label==1) = 1; 
resultClassIndex(predict_label==-1) = 2;
colorResultClass = colorList(resultClassIndex,:);

% Reduce the dimension from 13D to 2D
distanceMatrix = pdist(heart_scale_inst,'euclidean');
newCoor = mdscale(distanceMatrix,2);

% Plot the whole data set
x = newCoor(:,1);
y = newCoor(:,2);
patchSize = 30; %max(prob_values,[],2);
colorTrueClassPlot = colorTrueClass;
figure; scatter(x,y,patchSize,colorTrueClassPlot,'filled');
title('whole data set');

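% Plot the test data
x = newCoor(testIndex==1,1);
y = newCoor(testIndex==1,2);
patchSize = 80*max(prob_values,[],2);
colorTrueClassPlot = colorTrueClass(testIndex==1,:);
figure; hold on;
% larger marker underneath in the true-class color, so it shows as the edge
scatter(x,y,2*patchSize,colorTrueClassPlot,'o','filled');
% smaller marker on top in the color of the class assigned by the SVM
scatter(x,y,patchSize,colorResultClass,'o','filled');
% Plot the training set
x = newCoor(trainIndex==1,1);
y = newCoor(trainIndex==1,2);
patchSize = 30;
colorTrueClassPlot = colorTrueClass(trainIndex==1,:);
% unfilled markers in the true-class color
scatter(x,y,patchSize,colorTrueClassPlot,'o');
title('classification results');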

The result shows:
optimization finished, #iter = 137
nu = 0.457422
obj = -76.730867, rho = 0.435233
nSV = 104, nBSV = 81
Total nSV = 104
Accuracy = 81.4286% (57/70) (classification)

The whole data set is plotted:
[Figure: whole data set plotted with respect to class labels]
The classification results might look like this:
[Figure: classification results using 2-class SVM]
The unfilled markers represent data instances from the training set. The filled markers represent data instances from the test set; the fill color is the class label assigned by the SVM, whereas the edge color is the true (ground-truth) label. The marker size of a test instance reflects the probability with which it is assigned its class label: the bigger the marker, the higher the confidence.
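
To make the marker-size rule concrete, here is a small sketch of how a matrix of probability estimates turns into marker sizes. The probability values are made up for illustration; in the real script prob_values comes from svmpredict with '-b 1'. The factor 80 is the one used in the plotting code.

```matlab
% Hypothetical probability estimates for three test samples (rows sum to 1);
% the two columns correspond to the two classes.
prob_values = [0.90 0.10;
               0.55 0.45;
               0.20 0.80];

confidence = max(prob_values, [], 2);   % probability of the winning class
patchSize  = 80 * confidence;           % same scaling as the plotting code

disp([confidence patchSize]);
```

A sample classified with probability 0.55 therefore gets a visibly smaller marker than one classified with probability 0.90.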

Kernel SVM for binary classification

Now let's apply a kernel to the SVM. We use almost the same code as before; the only difference is that the training data, trainData, is replaced by its kernelized version [(1:200)' trainData*trainData'], and the test data, testData, by its kernelized version [(1:70)' testData*trainData'], as shown below. The first column holds the sample serial numbers that libsvm requires for a precomputed kernel ('-t 4').
% Train the SVM
model = svmtrain(trainLabel, [(1:200)' trainData*trainData'], '-c 1 -g 0.07 -b 1 -t 4');
% Use the SVM model to classify the data
[predict_label, accuracy, prob_values] = svmpredict(testLabel, [(1:70)' testData*trainData'], model, '-b 1');  % run the SVM model on the test data
The complete code can be found here. The output for each kernel tried is listed below, followed by the resulting figures.
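
The snippet above builds a linear kernel matrix (X*X'). As a hedged sketch of how another kernel could be precomputed in the same layout, the code below builds an RBF kernel matrix for a toy data set in plain MATLAB. The data and the gamma value are assumptions made for illustration, and this is not necessarily how the complete code constructs its kernels.

```matlab
% Toy data: 4 training samples and 2 test samples with 2 features each
trainData = [0 0; 1 0; 0 1; 1 1];
testData  = [0.5 0.5; 2 2];
gamma = 0.5;                                   % assumed value

% Squared Euclidean distances via |a-b|^2 = |a|^2 + |b|^2 - 2*a'*b
sqTrain = sum(trainData.^2, 2);
sqTest  = sum(testData.^2, 2);
d2Train = bsxfun(@plus, sqTrain, sqTrain') - 2*(trainData*trainData');
d2Test  = bsxfun(@plus, sqTest,  sqTrain') - 2*(testData*trainData');

kTrain = exp(-gamma * d2Train);                % NTrain x NTrain
kTest  = exp(-gamma * d2Test);                 % NTest  x NTrain

% Same layout as in the text: serial numbers in the first column
trainInput = [(1:size(trainData,1))' kTrain];
testInput  = [(1:size(testData,1))'  kTest];
```

trainInput and testInput would then take the place of the kernelized matrices passed to svmtrain and svmpredict with '-t 4'. The test kernel is always computed against the training samples, so it has one column per training instance.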

'Linear' kernel
optimization finished, #iter = 403796
nu = 0.335720
obj = -67.042781, rho = -1.252604
nSV = 74, nBSV = 60
Total nSV = 74
Accuracy = 85.7143% (60/70) (classification)

'polynomial' kernel
optimization finished, #iter = 102385
nu = 0.000001
obj = -0.000086, rho = -0.465342
nSV = 69, nBSV = 0
Total nSV = 69
Accuracy = 72.8571% (51/70) (classification)

'RBF' kernel
optimization finished, #iter = 372
nu = 0.890000
obj = -97.594730, rho = 0.194414
nSV = 200, nBSV = 90
Total nSV = 200
Accuracy = 57.1429% (40/70) (classification)

'Sigmoid' kernel
optimization finished, #iter = 90
nu = 0.870000
obj = -195.417169, rho = 0.999993
nSV = 174, nBSV = 174
Total nSV = 174
Accuracy = 60% (42/70) (classification)

'MLP' kernel
optimization finished, #iter = 1247
nu = 0.352616
obj = -68.842421, rho = -0.552693
nSV = 77, nBSV = 63
Total nSV = 77
Accuracy = 82.8571% (58/70) (classification)

Linear-kernel SVM: 85.71% accuracy
[Figure: linear-kernel SVM]
Polynomial-kernel SVM: 72.86% accuracy
[Figure: polynomial-kernel SVM]
RBF-kernel SVM: 57.14% accuracy
[Figure: RBF-kernel SVM]
Sigmoid-kernel SVM: 60% accuracy
[Figure: sigmoid-kernel SVM]
MLP-kernel SVM: 82.86% accuracy
[Figure: MLP-kernel SVM]

Cross validation of C and Gamma

The options of svmtrain are listed below. For n-fold cross validation, n must be >= 2.
Usage: model = svmtrain(training_label_vector, training_instance_matrix, 'libsvm_options');
-s svm_type : set type of SVM (default 0)
    0 -- C-SVC
    1 -- nu-SVC
    2 -- one-class SVM
    3 -- epsilon-SVR
    4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
    0 -- linear: u'*v
    1 -- polynomial: (gamma*u'*v + coef0)^degree
    2 -- radial basis function: exp(-gamma*|u-v|^2)
    3 -- sigmoid: tanh(gamma*u'*v + coef0)
    4 -- precomputed kernel (kernel values in training_instance_matrix)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/num_features)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
-v n : n-fold cross validation mode
-q : quiet mode (no outputs)

In this example, we use the svmtrain option that enables n-fold cross validation: simply put '-v n' in the option string, where n is the number of folds. In this mode svmtrain returns the cross-validation accuracy instead of a model. Here is an example using 3-fold cross validation:
param = ['-q -v 3 -c ', num2str(c), ' -g ', num2str(g)];
cv = svmtrain(trainLabel, trainData, param);
In the example below, I show a multi-scale (coarse-to-fine) parameter search using cross validation. First, we search for the best parameters (c and gamma) on a coarse grid; the search space is then narrowed down around the best point until we are satisfied. The results are compared with the first experiment, which does not use tuned parameters. The full code can be found here.
Big-scale parameter search:
[Figure: 3-fold cross validation, biggest scale]
Medium-scale parameter search:
[Figure: 3-fold cross validation, medium scale]
Small-scale parameter search:
[Figure: 3-fold cross validation, smallest scale]
Accuracy = 84.29%, which is better than the 81.43% obtained in the previous experiment with the untuned parameters c=1 and gamma=0.07.
[Figure: classification result after 3-fold cross validation]
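
The three search scales can be sketched as a single loop that re-centres a finer grid on the best point of the previous one. This is a minimal sketch, not the author's code: the starting centre, search radii and step sizes are assumptions, and cvScore is a made-up stand-in surface so that the block runs without libsvm. In practice it would wrap the svmtrain call with '-v 3' shown earlier.

```matlab
% Stand-in for the cross-validation accuracy (hypothetical surface).
% With libsvm on the path this handle would instead wrap, for example:
%   svmtrain(trainLabel, trainData, ['-q -v 3 -c ', num2str(2^lc), ' -g ', num2str(2^lg)])
cvScore = @(log2c, log2g) 100 - (log2c - 3.4)^2 - (log2g + 4.3)^2;

centerC = 5;  centerG = -6;      % assumed starting centre (log2 scale)
halfWidth = [10 2 0.5];          % search radius at each scale
stepSize  = [2 0.5 0.1];         % grid spacing at each scale

for s = 1:numel(stepSize)
    bestcv = -Inf;
    for log2c = (centerC - halfWidth(s)):stepSize(s):(centerC + halfWidth(s))
        for log2g = (centerG - halfWidth(s)):stepSize(s):(centerG + halfWidth(s))
            cv = cvScore(log2c, log2g);
            if cv > bestcv
                bestcv = cv;  bestLog2c = log2c;  bestLog2g = log2g;
            end
        end
    end
    % Re-centre the next, finer grid on the best point found so far
    centerC = bestLog2c;  centerG = bestLog2g;
    fprintf('scale %d: best log2c=%g, log2g=%g, score=%g\n', s, bestLog2c, bestLog2g, bestcv);
end

bestc = 2^bestLog2c;  bestg = 2^bestLog2g;   % values to pass to -c and -g
```

Searching on a log2 grid keeps the number of svmtrain calls manageable while still covering several orders of magnitude of c and gamma.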

Multi-class SVM

SVM is by nature a binary classification model, so how can we use it in a multi-class scenario? In this example, we show how to do multi-class classification using libsvm. A simple strategy is to solve one binary problem at a time; here we use the one-versus-rest approach, in which each class in turn is separated from all the others. In fact, we can use the original functions (svmtrain and svmpredict) from the libsvm package by writing a "wrapper" that calls them once per binary problem. The good news is that the libsvm tutorial page already provides such a wrapper, so we will simply use it.
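
The one-versus-rest idea can be sketched without libsvm. The block below shows the two bookkeeping steps a wrapper has to perform: building one +1/-1 label vector per class, and picking the class whose binary classifier gives the largest decision value. The labels and decision values are made-up numbers, and the sketch is not the code of the libsvm wrapper itself.

```matlab
% Toy labels for five samples and three classes (illustrative only)
labels  = [1; 2; 3; 2; 1];
classes = unique(labels);

% Step 1: one binary (+1 / -1) label vector per class, stored as columns.
% Column k is what would be used to train the "class k vs rest" SVM.
binaryLabels = zeros(numel(labels), numel(classes));
for k = 1:numel(classes)
    binaryLabels(:, k) = 2 * (labels == classes(k)) - 1;
end

% Step 2: hypothetical decision values, one column per binary classifier
decisionValues = [ 1.2 -0.8 -1.1;
                  -0.5  0.9 -0.7;
                  -1.0 -0.2  0.4;
                  -0.3  0.1 -0.9;
                   0.2  0.6 -1.5];

% Step 3: each sample goes to the class whose classifier is most confident
[~, winner] = max(decisionValues, [], 2);
predicted = classes(winner);
accuracy  = mean(predicted == labels);
fprintf('Accuracy = %g%%\n', accuracy * 100);
```

In this toy case the last sample is misclassified because the "class 2 vs rest" classifier outscores the "class 1 vs rest" one.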

Just download the demo code from the end of this URL, which reads:
[trainY trainX] = libsvmread('./dna.scale');
[testY testX] = libsvmread('./dna.scale.t');
model = ovrtrain(trainY, trainX, '-c 8 -g 4');
[pred ac decv] = ovrpredict(testY, testX, model);
fprintf('Accuracy = %g%%\n', ac * 100);
The functions ovrtrain and ovrpredict are the wrappers. You can also do cross validation with the demo code below, where get_cv_ac is again a wrapper.
bestcv = 0;
for log2c = -1:2:3,
  for log2g = -4:2:1,
    cmd = ['-q -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
    cv = get_cv_ac(trainY, trainX, cmd, 3);
    if (cv >= bestcv),
      bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
    end
    fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', log2c, log2g, cv, bestc, bestg, bestcv);
  end
end
The full implementation can be found here. The results are shown below.
[Figure: train vs test set. Rows 1-2000: training set.]
[Figure: OVR-SVM classification result]
The one-vs-rest multi-class SVM results. Here we do parameter selection on the training set, yielding the following accuracy for each class:
class1: Accuracy = 94.3508% (1119/1186) (classification)
class2: Accuracy = 95.4469% (1132/1186) (classification)
class3: Accuracy = 94.1821% (1117/1186) (classification)
overall class: Accuracy = 94.0135%

The best parameters are c=8 and gamma=0.0625. 

Note that when the parameters are not selected properly, say c=8 and gamma=4, the accuracy is as low as 60%. So parameter selection is really important!

More examples

You may find the following examples useful. Each code is built for a specific application, and you can download and tweak it to save development time.
  • Big picture: In this scenario, I compiled an easy example to illustrate how to use SVM through the full process. The code contains:
    • data generation
    • determining train and test data set
    • parameter selection using n-fold cross validation, both semi-manual and the automatic approach
    • train the svm model using one-versus-rest (OVR) approach
    • use the svm model to classify the test set in OVR mode
    • make confusion matrix to evaluate the results
    • show the results in an informative way
    • display the decision boundary on the feature space 
  • Reporting results using n-fold cross validation: In case you have only one data set (i.e., there is no explicit train or test set), n-fold cross validation is a conventional way to assess a classifier. The observations are split equally into n folds; the code uses n-1 folds to train the SVM model, which is then used to classify the remaining fold according to standard OVR. The overall accuracy is obtained by averaging the accuracy over the n folds. The code can be found here.
  • Using multiclass ovr-svm with kernel: So far I haven't shown the usage of ovr-svm with a specific kernel ('-t x'). In fact, you can add the kernel option to any of the ovr code and it will work. The complete code can be found here.
    • For parameter selection using cross validation, we use the code below to calculate the average accuracy cv. You can just add '-t x' to the command.
      cmd = ['-q -c ', num2str(2^log2c), ' -g ', num2str(2^log2g),' -t 0'];
      cv = get_cv_ac(trainLabel, [(1:NTrain)' trainData*trainData'], cmd, Ncv);

    • Training: just add '-t x' to the training code
      bestParam = ['-q -c ', num2str(bestc), ' -g ', num2str(bestg),' -t 0'];
      model = ovrtrainBot(trainLabel, [(1:NTrain)' trainData*trainData'], bestParam);

    • Classification: the '-t x' is included in the variable model already, so you don't need to specify '-t x' again when classifying. 
      [predict_label, accuracy, decis_values] = ovrpredictBot(testLabel, [(1:NTest)' testData*trainData'], model);
      [decis_value_winner, label_out] = max(decis_values,[],2);
    • However, I found that the parameter selection routine can be very slow when the number of classes and the number of cross-validation folds are large (e.g., Nclass = 10, Ncv = 3). I think the slow part might be [(1:NTrain)' trainData*trainData'], which can be huge. Personally, I like to use the default kernel (RBF), for which we don't need to build the kernel matrix X*X'; this might contribute to a considerably faster run.
  • Complete example for classification using n-fold cross validation: This code works on the single data where the train and test set are combined within one single set. More details can be found here.
  • Complete example for classification using train and test data set separately: This code works on the data set where the train and test set are separated, that is, train the model using train set and use the model to classify the test set. More details can be found here.
  • How to obtain the SVM weight vector w: Please see the example code and discussion from StackOverflow.
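
One of the steps listed under "Big picture" is building a confusion matrix. A minimal sketch of that step in plain MATLAB is given below, using made-up true and predicted labels; in the real examples predictLabel would come from the OVR prediction step. It is an illustration under those assumptions, not the code from the linked examples.

```matlab
% Toy ground-truth and predicted labels for three classes (illustrative)
trueLabel    = [1; 1; 2; 2; 2; 3; 3; 1];
predictLabel = [1; 2; 2; 2; 3; 3; 3; 1];
numClass = 3;

% Rows index the true class, columns the predicted class
confusion = accumarray([trueLabel predictLabel], 1, [numClass numClass]);

% Overall accuracy: correctly classified samples lie on the diagonal
overallAccuracy = sum(diag(confusion)) / sum(confusion(:));

% Per-class accuracy: fraction of each true class that was recovered
perClassAccuracy = diag(confusion) ./ sum(confusion, 2);

disp(confusion);
fprintf('Overall accuracy = %g%%\n', overallAccuracy * 100);
```

The off-diagonal entries show which classes get confused with each other, which a single accuracy figure hides.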


