Machine Learning: Support Vector Machines (SVM)

1. Classifying a two-dimensional dataset with an SVM

① Classification with a linear decision boundary

The visualization function for the training data:

function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure 
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.
%
% Note: This was slightly modified such that it expects y = 1 or y = 0

% Find Indices of Positive and Negative Examples
pos = find(y == 1); neg = find(y == 0);

% Plot Examples
plot(X(pos, 1), X(pos, 2), 'k+','LineWidth', 1, 'MarkerSize', 7)
hold on;
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7)
hold off;

end

Run the following at the command line to visualize the first dataset:

load('ex6data1.mat');
plotData(X, y);

The resulting plot is shown in the figure:

Training the SVM classifier

With C = 1, run the following at the command line to train the classifier and visualize the decision boundary:

C = 1;
model = svmTrain(X, y, C, @linearKernel, 0.001, 20);  % tol = 0.001, max_passes = 20
visualizeBoundaryLinear(X, y, model);
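For reference, the objective the SVM minimizes in the course's formulation is

$$\min_\theta\; C\sum_{i=1}^{m}\Big[y^{(i)}\,\mathrm{cost}_1(\theta^T x^{(i)}) + (1-y^{(i)})\,\mathrm{cost}_0(\theta^T x^{(i)})\Big] + \frac{1}{2}\sum_{j=1}^{n}\theta_j^2$$

where C plays a role similar to 1/λ in regularized logistic regression: the larger C is, the more heavily misclassified training examples are penalized.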

Because C is relatively small, the model underfits slightly: it tolerates the single outlier and leaves it misclassified, as shown in the figure:

With C = 100, every training example is classified correctly, although the boundary now bends toward the outlier rather than following the natural gap in the data:
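The C = 100 run mirrors the C = 1 call above; the only change is the value of C:

C = 100;
model = svmTrain(X, y, C, @linearKernel, 0.001, 20);
visualizeBoundaryLinear(X, y, model);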

② Classification with a non-linear decision boundary

The training data for the second dataset looks like this:

Since this dataset is not linearly separable, we train with a Gaussian kernel.

The raw features alone cannot separate the two classes, so new features are needed. High-order polynomial terms of the original features would work in principle, but they are very expensive to compute.

Instead, new features can be defined from landmark points together with a kernel function, which lets the SVM learn a complex non-linear boundary. This post uses the Gaussian kernel.
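Concretely, each new feature measures the similarity between an example x and a landmark l (in the course's setup, the landmarks are placed at the training examples):

$$K_{\text{gaussian}}(x, l) = \exp\!\left(-\frac{\lVert x - l \rVert^2}{2\sigma^2}\right)$$

This is exactly what gaussianKernel below computes.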

The implementation of the Gaussian kernel:

function sim = gaussianKernel(x1, x2, sigma)
%GAUSSIANKERNEL returns a radial basis function (RBF) kernel between x1 and x2
%   sim = gaussianKernel(x1, x2, sigma) returns a Gaussian kernel between x1
%   and x2 and returns the value in sim

% Ensure that x1 and x2 are column vectors
x1 = x1(:); x2 = x2(:);

% Gaussian (RBF) kernel: exp(-||x1 - x2||^2 / (2 * sigma^2))
sim = exp(-(x1 - x2)' * (x1 - x2) / (2 * sigma^2));

end
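A quick sanity check, using the test values from the course's ex6.m script (the expected output is about 0.324652):

x1 = [1 2 1]; x2 = [0 4 -1]; sigma = 2;
sim = gaussianKernel(x1, x2, sigma)   % exp(-9/8) ≈ 0.324652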

Training the non-linear classifier with the Gaussian kernel (σ = 0.1 gives a narrow kernel, so the learned boundary can follow the data closely):

load('ex6data2.mat');   % the second example dataset
C = 1;
sigma = 0.1;
model = svmTrain(X, y, C, @(x1, x2) gaussianKernel(x1, x2, sigma));
visualizeBoundary(X, y, model);

3. Selecting suitable values of C and σ with cross-validation

Use X, y as the training set and Xval, yval as the cross-validation set to select suitable values of C and σ:

function [C, sigma] = dataset3Params(X, y, Xval, yval)
%DATASET3PARAMS returns your choice of C and sigma for Part 3 of the exercise
%where you select the optimal (C, sigma) learning parameters to use for SVM
%with RBF kernel
%   [C, sigma] = DATASET3PARAMS(X, y, Xval, yval) returns your choice of C and 
%   sigma. You should complete this function to return the optimal C and 
%   sigma based on a cross-validation set.
%
% Default values, returned if no candidate pair does better
C = 1;
sigma = 0.3;

% Candidate values for both C and sigma: an 8 x 8 grid, 64 models in total
eg = [0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30];

wrong = 1;   % lowest cross-validation error seen so far (an error rate is at most 1)
for i = 1:length(eg)
    for j = 1:length(eg)
        % Train on (X, y) with the candidate (C, sigma) pair
        model = svmTrain(X, y, eg(i), @(x1, x2) gaussianKernel(x1, x2, eg(j)));
        % Measure the error rate on the cross-validation set
        predictions = svmPredict(model, Xval);
        twrong = mean(double(predictions ~= yval));
        % Keep the pair with the lowest cross-validation error
        if twrong < wrong
            wrong = twrong;
            C = eg(i);
            sigma = eg(j);
        end
    end
end

end

Finally, train the SVM with the C and σ values found above and plot the resulting decision boundary:

load('ex6data3.mat');   % the third example dataset: X, y, Xval, yval
[C, sigma] = dataset3Params(X, y, Xval, yval);
model = svmTrain(X, y, C, @(x1, x2)gaussianKernel(x1, x2, sigma));
visualizeBoundary(X, y, model);
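On this dataset the grid search typically returns C = 1 and σ = 0.1, though the exact pair can vary slightly from run to run because the course's svmTrain uses a randomized simplified SMO procedure.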

 
