Machine Learning: Support Vector Machines (SVM)

1. Classifying a two-dimensional dataset with an SVM

① Classification with a linear decision boundary

The visualization function for the training data:

function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure 
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.
%
% Note: This was slightly modified such that it expects y = 1 or y = 0

% Find Indices of Positive and Negative Examples
pos = find(y == 1); neg = find(y == 0);

% Plot Examples
plot(X(pos, 1), X(pos, 2), 'k+','LineWidth', 1, 'MarkerSize', 7)
hold on;
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7)
hold off;

end

Run the following at the command line to visualize the first dataset:

load('ex6data1.mat');
plotData(X, y);

The resulting plot is shown in the figure:

Training the SVM classifier

With C = 1, run the following at the command line to train the classifier and visualize the decision boundary:

C = 1;
model = svmTrain(X, y, C, @linearKernel, 0.001, 20);  % tol = 0.001, max_passes = 20
visualizeBoundaryLinear(X, y, model);
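For reference, the objective the SVM minimizes in the course's formulation is

$$\min_\theta\; C\sum_{i=1}^{m}\Big[y^{(i)}\,\mathrm{cost}_1(\theta^T x^{(i)}) + (1-y^{(i)})\,\mathrm{cost}_0(\theta^T x^{(i)})\Big] + \frac{1}{2}\sum_{j=1}^{n}\theta_j^2$$

where C plays a role similar to 1/λ in regularized logistic regression: the larger C is, the more heavily misclassified training examples are penalized.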

Because C is relatively small, the model underfits slightly: it tolerates the single outlier and leaves it misclassified, as shown in the figure:

With C = 100, every training example is classified correctly, although the boundary now bends toward the outlier rather than following the natural gap in the data:
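The C = 100 run mirrors the C = 1 call above; the only change is the value of C:

C = 100;
model = svmTrain(X, y, C, @linearKernel, 0.001, 20);
visualizeBoundaryLinear(X, y, model);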

② Classification with a non-linear decision boundary

The training data for the second dataset looks like this:

Since this dataset is not linearly separable, we train with a Gaussian kernel.

The raw features alone cannot separate the two classes, so new features are needed. High-order polynomial terms of the original features would work in principle, but they are very expensive to compute.

Instead, new features can be defined from landmark points together with a kernel function, which lets the SVM learn a complex non-linear boundary. This post uses the Gaussian kernel.
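Concretely, each new feature measures the similarity between an example x and a landmark l (in the course's setup, the landmarks are placed at the training examples):

$$K_{\text{gaussian}}(x, l) = \exp\!\left(-\frac{\lVert x - l \rVert^2}{2\sigma^2}\right)$$

This is exactly what gaussianKernel below computes.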

The implementation of the Gaussian kernel:

function sim = gaussianKernel(x1, x2, sigma)
%GAUSSIANKERNEL returns a radial basis function (RBF) kernel between x1 and x2
%   sim = gaussianKernel(x1, x2, sigma) returns a Gaussian kernel between x1
%   and x2 and returns the value in sim

% Ensure that x1 and x2 are column vectors
x1 = x1(:); x2 = x2(:);

% Gaussian (RBF) kernel: exp(-||x1 - x2||^2 / (2 * sigma^2))
sim = exp(-(x1 - x2)' * (x1 - x2) / (2 * sigma^2));

end
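A quick sanity check, using the test values from the course's ex6.m script (the expected output is about 0.324652):

x1 = [1 2 1]; x2 = [0 4 -1]; sigma = 2;
sim = gaussianKernel(x1, x2, sigma)   % exp(-9/8) ≈ 0.324652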

Training the non-linear classifier with the Gaussian kernel (σ = 0.1 gives a narrow kernel, so the learned boundary can follow the data closely):

load('ex6data2.mat');   % the second example dataset
C = 1;
sigma = 0.1;
model = svmTrain(X, y, C, @(x1, x2) gaussianKernel(x1, x2, sigma));
visualizeBoundary(X, y, model);

3. Selecting suitable values of C and σ with cross-validation

Use X, y as the training set and Xval, yval as the cross-validation set to select suitable values of C and σ:

function [C, sigma] = dataset3Params(X, y, Xval, yval)
%DATASET3PARAMS returns your choice of C and sigma for Part 3 of the exercise
%where you select the optimal (C, sigma) learning parameters to use for SVM
%with RBF kernel
%   [C, sigma] = DATASET3PARAMS(X, y, Xval, yval) returns your choice of C and 
%   sigma. You should complete this function to return the optimal C and 
%   sigma based on a cross-validation set.
%
% Default values, returned if no candidate pair does better
C = 1;
sigma = 0.3;

% Candidate values for both C and sigma: an 8 x 8 grid, 64 models in total
eg = [0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30];

wrong = 1;   % lowest cross-validation error seen so far (an error rate is at most 1)
for i = 1:length(eg)
    for j = 1:length(eg)
        % Train on (X, y) with the candidate (C, sigma) pair
        model = svmTrain(X, y, eg(i), @(x1, x2) gaussianKernel(x1, x2, eg(j)));
        % Measure the error rate on the cross-validation set
        predictions = svmPredict(model, Xval);
        twrong = mean(double(predictions ~= yval));
        % Keep the pair with the lowest cross-validation error
        if twrong < wrong
            wrong = twrong;
            C = eg(i);
            sigma = eg(j);
        end
    end
end

end

Finally, train the SVM with the C and σ values found above and plot the resulting decision boundary:

load('ex6data3.mat');   % the third example dataset: X, y, Xval, yval
[C, sigma] = dataset3Params(X, y, Xval, yval);
model = svmTrain(X, y, C, @(x1, x2)gaussianKernel(x1, x2, sigma));
visualizeBoundary(X, y, model);
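On this dataset the grid search typically returns C = 1 and σ = 0.1, though the exact pair can vary slightly from run to run because the course's svmTrain uses a randomized simplified SMO procedure.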

 
