ex2 machine learning 整作业解析-2-Regularized logistic regression

最新推荐文章于 2023-03-13 19:04:00 发布

假程序员99999

最新推荐文章于 2023-03-13 19:04:00 发布

阅读量394

点赞数

文章标签：机器学习 matlab

本文链接：https://blog.csdn.net/weixin_46304344/article/details/105611823

版权

ex2 machine learning -2-Regularized logistic regression

1.导入数据

在这里插入图片描述
和ex2-1类似的结构，共三列，前两列是自变量，最后是因变量

2.常规操作

clear ; close all; clc
data = load(‘ex2data2.txt’);
X = data(:, [1, 2]); y = data(:, 3);
plotData(X, y);
hold on;
xlabel(‘Microchip Test 1’)
ylabel(‘Microchip Test 2’)
legend(‘y = 1’, ‘y = 0’)
hold off;
这部分代码解释参见ex2 machine learning 整作业解析-1-Logistic Regression
下一句是
X = mapFeature(X(:,1), X(:,2));
在pdf中看到对mapFeature的介绍

3.mapFeature（）

在这里插入图片描述
查看mapFeature函数内部

function out = mapFeature(X1, X2)
% MAPFEATURE Feature mapping function to polynomial features
%
%   MAPFEATURE(X1, X2) maps the two input features
%   to quadratic features used in the regularization exercise.
%
%   Returns a new feature array with more features, comprising of 
%   X1, X2, X1.^2, X2.^2, X1*X2, X1*X2.^2, etc..
%
%   Inputs X1, X2 must be the same size
%

degree = 6;
out = ones(size(X1(:,1)));
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)).*(X2.^j);
    end
end

end

接受参数X1X2：X = mapFeature(X(:,1), X(:,2));
可以看到，是将X的第一列和第二列传进去，也就是两个同等维数的列向量
目标是生成所有6次方x1，x2的组合，可知这样的组合有28个，即1, x1, x2, x1^2 ,x1x2, x2^2 ,…, x1^6 , x2 ^6

在这里插入图片描述
可以看到X是一个118行28列的矩阵
怎么得到的呢？

查看y，可以看到y是118行的列向量，所以X是有118行的
那么也就是说，X的每一行，本身的两个数的（2列），现在要变成28列，也就是说，每一行的两个数，即x1，x2，要拓展成一个一行28列。
那么我们可以先建立一个118行28列的1矩阵，然后它的每一行，就等于原矩阵的一行变幻出28行后的行向量。
另一方面，也可以看成，输出的矩阵第一列是1，第二列是原先的x1，第三是x2，第四是x1.*x1(x1的各个元素逐一相乘), x1.*x2，x2.*x2，etc
out = ones(size(X1(:,1)));
size(X1(:,1))得到的是一个向量，即，（x1的行数，x1的列数）
out初始化为一个全为1的列向量，行数与x1相同

这里的 i x1x2的总次方，因为这个feature要从0次方一直到6次方；j是x2的指数。

out(:, end+1) = (X1.^(i-j)).*(X2.^j);

end+1，就是在最后增加一列。

% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1
lambda = 1;

theta的初值为维数为X的列数的列向量

4.costFunctionReg()

[cost, grad] = costFunctionReg(initial_theta, X, y, lambda);
cost是一个数，即，这个theta值的cost
grad是一个列向量，维数是theta的维数

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters. 
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
h = sigmoid(X * theta);
J = -1 * y' *log(h) - (1 - y)' * log(1 - h) + sum(theta(2:end).^2) / 2 * lambda;
J = J / m;

	grad = (1 / m * (h - y)' * X)' + lambda / m * theta;
	grad(1)=grad(1)-lambda/m*theta(1);
% =============================================================
end

参见解析ex2 machine learning 整作业解析-1-Logistic Regression中对costFunction的解释

J

值得注意的是，加入的theta各个元素的平方，不包含theta0
在ex2-1中的J = -1 * y’ *log(h) - (1 - y)’ * log(1 - h); J = J / m;
在这里，J要增多一项
(theta(2:end)：取出theta第二行到最后一行所有的元素
theta(2:end).^2：单个元素逐一平方
接着求和，etc…

grad

在这里插入图片描述
与没有regularize的相比

多了一项λ/mθj, 但是不包含θ0

也就是gradient1是不包含regularization的这一项的
那么可以先统一地加上这一项（方便计算）
grad = (1 / m * (h - y)’ * X)’ + lambda / m * theta;
最后再针对grad(1)减去那一项
grad(1)=grad(1)-lambda/mtheta(1);

5.测试不同的lambda值

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros) - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.0085\n 0.0188\n 0.0001\n 0.0503\n 0.0115\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

% Compute and display cost and gradient
% with all-ones theta and lambda = 10
test_theta = ones(size(X,2),1);
[cost, grad] = costFunctionReg(test_theta, X, y, 10);

fprintf('\nCost at test theta (with lambda = 10): %f\n', cost);
fprintf('Expected cost (approx): 3.16\n');
fprintf('Gradient at test theta - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.3460\n 0.1614\n 0.1948\n 0.2269\n 0.0922\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

在这里插入图片描述

可以看到第一次迭代的时候，cost更大，gradients也更大，因为lambda大了嘛

6.优化

initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1 (you should vary this)
lambda = 1;

% Set Options
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Optimize
[theta, J, exit_flag] = ...
	fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);

7.plotDecisionBoundary(theta, X, y);

在这里插入图片描述

function plotDecisionBoundary(theta, X, y)
%PLOTDECISIONBOUNDARY Plots the data points X and y into a new figure with
%the decision boundary defined by theta
%   PLOTDECISIONBOUNDARY(theta, X,y) plots the data points with + for the 
%   positive examples and o for the negative examples. X is assumed to be 
%   a either 
%   1) Mx3 matrix, where the first column is an all-ones column for the 
%      intercept.
%   2) MxN, N>3 matrix, where the first column is all-ones

% Plot Data
plotData(X(:,2:3), y);
hold on

if size(X, 2) <= 3
    % Only need 2 points to define a line, so choose two endpoints
    plot_x = [min(X(:,2))-2,  max(X(:,2))+2];

    % Calculate the decision boundary line
    plot_y = (-1./theta(3)).*(theta(2).*plot_x + theta(1));

    % Plot, and adjust axes for better viewing
    plot(plot_x, plot_y)
    
    % Legend, specific for the exercise
    legend('Admitted', 'Not admitted', 'Decision Boundary')
    axis([30, 100, 30, 100])
else
    % Here is the grid range
    u = linspace(-1, 1.5, 50);
    v = linspace(-1, 1.5, 50);

    z = zeros(length(u), length(v));
    % Evaluate z = theta*x over the grid
    for i = 1:length(u)
        for j = 1:length(v)
            z(i,j) = mapFeature(u(i), v(j))*theta;
        end
    end
    z = z'; % important to transpose z before calling contour

    % Plot z = 0
    % Notice you need to specify the range [0, 0]
    contour(u, v, z, [0, 0], 'LineWidth', 2)
end
hold off

end

从size(X)可以看到，这里X的列数要大于3，所以执行else
也就是说，ex2-1用的是两点确定一条直线的方法，这里得到的函数不再是直线，所以要取很多个点，一个一个画出来，看起来就是一条平滑的曲线
在这里插入图片描述

 for i = 1:length(u)
        for j = 1:length(v)
            z(i,j) = mapFeature(u(i), v(j))*theta;
        end
    end

u(i)对应x轴坐标，v(j)对应y轴坐标。mapFreature得到一个1 * 28的矩阵（因为传入的x1，x2只是标量而已），theta是28*1的列向量。所以得到一个数。
在这里插入图片描述
可以看到z是一个50 * 50的矩阵
z = z’
最后绘制等高线：contour(u, v, z, [0, 0], ‘LineWidth’, 2)
参见matlab中contour 函数的用法（绘制等高线）

假程序员99999

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
ex2 machine learning 整作业解析-2-Regularized logistic regression

ex2 machine learning -2-Regularized logistic regression1.导入数据和ex2-1类似的结构，共三列，前两列是自变量，最后是因变量2.常规操作clear ; close all; clcdata = load(‘ex2data2.txt’);X = data(:, [1, 2]); y = data(:, 3);plotData(...
复制链接

扫一扫