Andrew Ng Machine Learning logistic regression exercise: logistic regression and regularization (Octave implementation)

爱小白兔的大懒熊

Article link: https://blog.csdn.net/yu_dian931122/article/details/85772791

Exercise background: NetEase Cloud Classroom -> Andrew Ng's Machine Learning course -> logistic regression exercise

Logistic Regression

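In this part of the exercise, we build a logistic regression model that predicts whether a student will be admitted to a university.

The code skeleton in ex2.m walks us through the whole exercise: we only need to fill in the code in the corresponding function files and then run the ex2 script. The individual steps of the skeleton are not described in detail below, so it may be necessary to reread ex2.m from time to time to keep the overall flow clear.

Main steps in ex2.m: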

%% Initialization
clear ; close all; clc

%% Load data

data = load('ex2data1.txt');
X = data(:, [1, 2]); y = data(:, 3);

%% Visualization
plotData(X, y);
hold on;
xlabel('Exam 1 score')
ylabel('Exam 2 score')
legend('Admitted', 'Not admitted')
hold off;


%% ============ Compute cost and gradient ============
[m, n] = size(X);
X = [ones(m, 1) X];

% Initialize theta
initial_theta = zeros(n + 1, 1);

% Compute and display initial cost and gradient
[cost, grad] = costFunction(initial_theta, X, y);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n -0.1000\n -12.0092\n -11.2628\n');

%% ============= Solve with fminunc =============
%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

%  Run the optimization; this calls costFunction
[theta, cost] = ...
	fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

% Print theta to screen
fprintf('Cost at theta found by fminunc: %f\n', cost);
fprintf('Expected cost (approx): 0.203\n');
fprintf('theta: \n');
fprintf(' %f \n', theta);
fprintf('Expected theta (approx):\n');
fprintf(' -25.161\n 0.206\n 0.201\n');

% Plot the decision boundary
plotDecisionBoundary(theta, X, y);

% Put some labels 
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')

legend('Admitted', 'Not admitted')
hold off;

%% ============== Prediction and model accuracy ==============

prob = sigmoid([1 45 85] * theta);
fprintf(['For a student with scores 45 and 85, we predict an admission ' ...
         'probability of %f\n'], prob);
fprintf('Expected value: 0.775 +/- 0.002\n\n');

% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (approx): 89.0\n');
fprintf('\n');

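Data visualization

Before implementing a learning algorithm, it is common practice to visualize the data so that it is easier to inspect. The data used in this exercise is historical data on past applicants; each sample contains a student's scores on two exams and the final admission decision.

The exercise uses the plotData function to display the data in a 2D plane, so we first need to complete the plotData function in plotData.m.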

function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure 
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.

% Create New Figure
figure; hold on;

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.
%

% Filled in directly with the code suggested in the exercise handout

% Indices of the examples with y == 1 go into pos; indices with y == 0 go into neg
pos = find(y==1); neg = find(y == 0);
% Plot the rows of X whose label is 1
plot(X(pos, 1), X(pos,2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);
% Plot the rows of X whose label is 0
plot(X(neg, 1), X(neg,2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);


% =========================================================================

hold off;

end

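After completing plotData, run ex2 in Octave to see the data visualization:

Implementing logistic regression

Implementing the sigmoid function

Recall that the hypothesis for logistic regression is defined as:

$h_\theta(x)=g(\theta^Tx)$, where $g(z)=\frac{1}{1+e^{-z}}$; $g(z)$ is the sigmoid function.

The first step is to implement the sigmoid function in sigmoid.m. It must handle not only a single number but also vectors and matrices, element by element.

The code in sigmoid.m is: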

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly 
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).

g = 1 ./ (1 + exp(-z));  % element-wise, so z may be a scalar, vector or matrix


% =============================================================

end
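As a quick sanity check of the formula (a sketch that is not part of the course files), the same expression can be evaluated as an anonymous function. The name `sig` is made up here so that it does not clash with sigmoid.m.

```octave
% Hypothetical stand-alone check of g(z) = 1 / (1 + e^(-z)).
sig = @(z) 1 ./ (1 + exp(-z));

g_scalar = sig(0);               % midpoint of the curve
g_matrix = sig([-10 0; 0 10]);   % applied element-wise; output keeps the input size
g_sym    = sig(2) + sig(-2);     % g(z) + g(-z) = 1 for any z

disp(g_scalar);
disp(size(g_matrix));
```

Large negative inputs give values close to 0 and large positive inputs give values close to 1, which is why the output can be read as a probability.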

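Cost function and gradient

The cost function $J(\theta)$ for logistic regression is defined as:

$J(\theta)=\frac{1}{m}\sum_{i=1}^{m}[-y^{(i)}\log(h_{\theta}(x^{(i)}))-(1-y^{(i)})\log(1-h_{\theta}(x^{(i)}))]$

(Note the difference between the hypothesis for logistic regression and the one for linear regression, as well as between their cost functions.)

The gradient (that is, the partial derivatives of $J(\theta)$) is:

$\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x_j^{(i)}$

(The course does not explain in detail how the partial derivatives of the cost function are derived; see a write-up on differentiating the logistic regression cost function.)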
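For reference, here is a short sketch of that derivation for a single example, writing $h=h_\theta(x^{(i)})$, $y=y^{(i)}$ and $x_j=x_j^{(i)}$ (shorthand introduced only for this note). It relies on the sigmoid identity $g'(z)=g(z)(1-g(z))$:

$\frac{\partial h}{\partial \theta_j}=h(1-h)x_j$

$\frac{\partial}{\partial \theta_j}[-y\log(h)-(1-y)\log(1-h)]=-\frac{y}{h}h(1-h)x_j+\frac{1-y}{1-h}h(1-h)x_j$

$=[-y(1-h)+(1-y)h]x_j=(h-y)x_j$

Averaging $(h-y)x_j$ over the $m$ training examples gives the gradient formula for $\frac{\partial J(\theta)}{\partial \theta_j}$.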

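ex2 calls the costFunction function to compute the cost and the gradient, so both computations need to be filled into costFunction.m.

Unlike the linear regression exercise, where we implemented gradient descent ourselves to find the parameters of the hypothesis $h_{\theta}(x)$ and were therefore free to put the cost computation and the descent step in separate functions, the logistic regression exercise relies on the fminunc function to find the parameters. costFunction therefore has to follow the interface fminunc expects: it returns the cost and the gradient together.

costFunction.m is implemented as follows: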

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%


h = sigmoid(X * theta);
J = ((-y)' * log(h) - (1-y)' * log(1 - h)) / m;
grad = ((h - y)' * X)' ./ m;


% =============================================================

end
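One way to gain confidence in the vectorized gradient is to compare it with a finite-difference approximation of the cost. The sketch below does this on a tiny made-up data set; the anonymous functions `sig`, `cost` and `gradf` restate the formulas and are illustrative names, not part of the course files.

```octave
% Hypothetical finite-difference check of the analytic gradient.
% The data and all names below are made up for illustration.
sig   = @(z) 1 ./ (1 + exp(-z));
cost  = @(t, X, y) (-y' * log(sig(X * t)) - (1 - y)' * log(1 - sig(X * t))) / length(y);
gradf = @(t, X, y) X' * (sig(X * t) - y) / length(y);

X = [1 0.5 1.2; 1 -1.0 0.3; 1 2.0 -0.7; 1 0.1 0.1];  % first column is the intercept
y = [1; 0; 1; 0];
theta = [0.1; -0.2; 0.3];

analytic = gradf(theta, X, y);
numeric  = zeros(size(theta));
step = 1e-5;
for j = 1:numel(theta)
  e = zeros(size(theta));
  e(j) = step;
  % central difference approximates dJ/dtheta_j
  numeric(j) = (cost(theta + e, X, y) - cost(theta - e, X, y)) / (2 * step);
end

max_diff = max(abs(analytic - numeric));
fprintf('largest difference: %g\n', max_diff);
```

If the two vectors disagree noticeably, the usual suspects are a missing division by m or a transposed matrix product.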

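Learning the parameters with fminunc

fminunc is an optimization solver that finds a minimum of an unconstrained function. For logistic regression, the goal is to find the parameters $\theta$ that minimize the cost function $J(\theta)$.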
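To see the calling convention in isolation, here is a minimal sketch that minimizes a toy quadratic instead of the logistic cost. The function `toyCost` and its target point are invented for this example; the option names are the same ones ex2.m passes. The leading `1;` only tells Octave that the file is a script, so that a function can be defined inside it.

```octave
1;  % mark this file as a script so the function below can be defined inline

function [J, grad] = toyCost(t)
  % Hypothetical convex bowl with its minimum at t = [1; -2].
  target = [1; -2];
  J = sum((t - target) .^ 2);   % cost value
  grad = 2 * (t - target);      % gradient, same size as t
end

% 'GradObj' = 'on' tells fminunc that the cost function also returns the gradient.
options = optimset('GradObj', 'on', 'MaxIter', 400);
[t_opt, J_opt] = fminunc(@toyCost, zeros(2, 1), options);

disp(t_opt');
```

The call in ex2.m follows the same pattern, except that the extra arguments X and y are bound with an anonymous function, @(t)(costFunction(t, X, y)).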

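Evaluating logistic regression

Once the logistic regression model is built, we can use it to predict whether a student will be admitted. One way is to take a student's exam scores and compute the admission probability with the model; another is to plot the decision boundary.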
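As a worked example of the first approach, the sketch below plugs the theta values printed by ex2 (see the output listing further down) into the hypothesis for a student with scores 45 and 85. The name `sig` is an illustrative stand-in for sigmoid.m.

```octave
% Admission probability for one student, using the theta reported by ex2.
sig = @(z) 1 ./ (1 + exp(-z));
theta = [-25.161272; 0.206233; 0.201470];  % values copied from the ex2 output
x = [1 45 85];                             % intercept term, exam 1 score, exam 2 score

prob = sig(x * theta);      % h_theta(x), read as P(admitted | scores)
admitted = prob >= 0.5;     % same 0.5 threshold that predict uses

fprintf('admission probability: %f\n', prob);
```

The result agrees with the expected value of 0.775 +/- 0.002 that ex2 prints.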

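In ex2, the code that plots the decision boundary is provided; what we need to complete is the predict function. predict uses the learned parameters $\theta$ and the students' scores to predict whether each student is admitted (it returns 0 or 1).

predict.m code: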

function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic 
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a 
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters. 
%               You should set p to a vector of 0's and 1's
%

pos = find(sigmoid(X * theta) >= 0.5);
p(pos, 1) = 1;


% =========================================================================


end
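The find-based assignment above can also be written as a single logical comparison. The sketch below, on made-up data, shows that both forms give the same labels, and that thresholding the probability at 0.5 is mathematically the same as checking the sign of X * theta. All names in it are illustrative.

```octave
% Three equivalent ways to turn scores into 0/1 labels (made-up data).
sig = @(z) 1 ./ (1 + exp(-z));
X = [1 2 3; 1 -4 1; 1 0 0];   % intercept column first
theta = [-1; 0.5; 0.5];

% find-based version, as in predict.m
p_find = zeros(size(X, 1), 1);
p_find(find(sig(X * theta) >= 0.5)) = 1;

% logical comparison cast to double
p_vec = double(sig(X * theta) >= 0.5);

% sigmoid(z) >= 0.5 exactly when z >= 0, so the sigmoid can be skipped
p_lin = double(X * theta >= 0);

disp([p_find p_vec p_lin]);
```

The last form also explains why the decision boundary in the first part is a straight line: it is the set of points where X * theta equals zero.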

Output of ex2

Once all function files are filled in, running ex2 carries out the whole process.

The output of each step while ex2 runs is as follows:

Cost at initial theta (zeros): 0.693147
Expected cost (approx): 0.693
Gradient at initial theta (zeros):
 -0.100000
 -12.009217
 -11.262842
Expected gradients (approx):
 -0.1000
 -12.0092
 -11.2628

Cost at test theta: 0.218330
Expected cost (approx): 0.218
Gradient at test theta:
 0.042903
 2.566234
 2.646797
Expected gradients (approx):
 0.043
 2.566
 2.647

Program paused. Press enter to continue.
Cost at theta found by fminunc: 0.203498
Expected cost (approx): 0.203
theta:
 -25.161272
 0.206233
 0.201470
Expected theta (approx):
 -25.161
 0.206
 0.201

Program paused. Press enter to continue.
For a student with scores 45 and 85, we predict an admission probability of 0.776289
Expected value: 0.775 +/- 0.002

Train Accuracy: 89.000000
Expected accuracy (approx): 89.0

Two figures are plotted: the data visualization and the decision boundary.

Regularized logistic regression

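In the second part of the exercise, we use regularized logistic regression to predict whether a microchip passes QA (quality assurance).

The code skeleton in ex2_reg.m walks us through this part: again we only fill in the code in the corresponding function files and then run the ex2_reg script. ex2_reg reuses many of the functions from ex2 in the first part; they are not described again here.

Main steps in ex2_reg: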

%% Initialization
clear ; close all; clc

%% Load data
data = load('ex2data2.txt');
X = data(:, [1, 2]); y = data(:, 3);

%% Visualize data
plotData(X, y);
hold on;

xlabel('Microchip Test 1')
ylabel('Microchip Test 2')
legend('y = 1', 'y = 0')
hold off;


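%% =========== Regularized logistic regression ============
% Add polynomial features
X = mapFeature(X(:,1), X(:,2));

% Initialize theta
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1
lambda = 1;

% Compute cost and gradient at the initial theta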
[cost, grad] = costFunctionReg(initial_theta, X, y, lambda);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros) - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.0085\n 0.0188\n 0.0001\n 0.0503\n 0.0115\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

%% ============= Optimize and evaluate =============
% Initialize theta
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1
lambda = 1;

% Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Optimize
[theta, J, exit_flag] = ...
	fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);

% Plot the decision boundary
plotDecisionBoundary(theta, X, y);
hold on;
title(sprintf('lambda = %g', lambda))
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')
legend('y = 1', 'y = 0', 'Decision boundary')
hold off;

% Evaluate
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (with lambda = 1): 83.1 (approx)\n');

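Data visualization

Reusing the data visualization function from ex2, running ex2_reg produces the following figure:

Feature mapping

The visualization shows that a straight-line decision boundary cannot separate the two classes well, so this model needs polynomial features to fit the data better.

The mapFeature function maps the features to all polynomial terms of x1 and x2 up to the sixth power. We do not need to implement this function ourselves.

mapFeature.m: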

function out = mapFeature(X1, X2)
% MAPFEATURE Feature mapping function to polynomial features
%
%   MAPFEATURE(X1, X2) maps the two input features
%   to quadratic features used in the regularization exercise.
%
%   Returns a new feature array with more features, comprising of 
%   X1, X2, X1.^2, X2.^2, X1*X2, X1*X2.^2, etc..
%
%   Inputs X1, X2 must be the same size
%

degree = 6;
out = ones(size(X1(:,1)));
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)).*(X2.^j);
    end
end

end
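It helps to know how many columns this mapping produces, since that is also the length of theta. The sketch below counts the terms and then repeats the same double loop on two made-up points to confirm the shape; the variable names are illustrative.

```octave
% Count the columns mapFeature produces: one constant column plus
% (i + 1) terms for each total degree i = 1..degree.
degree = 6;
n_cols = 1 + sum((1:degree) + 1);

% Same loop as mapFeature.m, run on two made-up points to confirm the shape.
x1 = [0.5; -0.2];
x2 = [1.0; 0.7];
out = ones(size(x1));
for i = 1:degree
  for j = 0:i
    out(:, end+1) = (x1 .^ (i - j)) .* (x2 .^ j);
  end
end

fprintf('expected columns: %d\n', n_cols);
disp(size(out));
```

With degree 6 the two original features become 28 columns, which is why the fit can overfit the small data set unless it is regularized.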

Cost and gradient

Fitting with polynomial features overfits very easily, especially when there are few samples. To reduce overfitting, the cost function $J(\theta)$ from the first part is modified by adding a regularization term. This part of the exercise uses L2 regularization.

$J(\theta)=\frac{1}{m}\sum_{i=1}^{m}[-y^{(i)}\log(h_{\theta}(x^{(i)}))-(1-y^{(i)})\log(1-h_{\theta}(x^{(i)}))]+\frac{\lambda}{2m}\sum_{j=1}^n\theta_j^2$

Note that $\theta_0$ is usually not regularized, so the gradient splits into two cases:

$\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x_j^{(i)}$, for $j=0$;

$\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x_j^{(i)}+\frac{\lambda}{m}\theta_j$, for $j\geq 1$.

The following steps again use fminunc to learn the parameters, so costFunctionReg needs to compute both the cost and the gradient.

costFunctionReg.m code:

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters. 

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta

theta_r = theta;
theta_r(1, 1) = 0;    % theta0, i.e. theta(1,1), is excluded from the regularization term
h = sigmoid(X * theta);
J = ((-y)' * log(h) - (1-y)' * log(1 - h)) / m + theta_r' * theta_r * (lambda / (2 * m));

grad = ((h - y)' * X)' ./ m + theta_r .* (lambda / m);


% =============================================================

end
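Two properties of the regularized formulas are easy to verify on a small example. At theta = 0 the penalty vanishes and every prediction is 0.5, so the cost equals log(2), about 0.693, for any lambda; and changing lambda never changes the gradient component for theta0. The sketch below checks both on made-up data, with illustrative names throughout.

```octave
% Hypothetical check of two properties of the regularized cost and gradient.
sig = @(z) 1 ./ (1 + exp(-z));
X = [1 0.3 -1.1; 1 1.5 0.4; 1 -0.8 2.0];   % made-up data, intercept column first
y = [1; 0; 1];
m = length(y);

% 1) At theta = 0 the cost is log(2) whatever lambda is.
theta = zeros(3, 1);
lambdas = [0 1 100];
costs = zeros(size(lambdas));
for k = 1:numel(lambdas)
  theta_r = theta;
  theta_r(1) = 0;                          % theta0 is not penalized
  h = sig(X * theta);
  costs(k) = (-y' * log(h) - (1 - y)' * log(1 - h)) / m ...
             + lambdas(k) / (2 * m) * (theta_r' * theta_r);
end

% 2) lambda shifts every gradient component except the first.
reg_grad = @(t, lam) X' * (sig(X * t) - y) / m + (lam / m) * [0; t(2:end)];
theta2 = [0.5; -0.3; 0.2];
g0  = reg_grad(theta2, 0);
g10 = reg_grad(theta2, 10);

disp(costs);
disp((g10 - g0)');
```

The first property is why both ex2 and ex2_reg report an initial cost of 0.693147.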

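Learning the parameters with fminunc

Plotting the decision boundary

The plotDecisionBoundary function that draws the decision boundary is also provided and does not need to be implemented.

Output of ex2_reg

Output printed while the script runs: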

Cost at initial theta (zeros): 0.693147
Expected cost (approx): 0.693
Gradient at initial theta (zeros) - first five values only:
 0.008475
 0.018788
 0.000078
 0.050345
 0.011501
Expected gradients (approx) - first five values only:
 0.0085
 0.0188
 0.0001
 0.0503
 0.0115

Program paused. Press enter to continue.

Cost at test theta (with lambda = 10): 3.164509
Expected cost (approx): 3.16
Gradient at test theta - first five values only:
 0.346045
 0.161352
 0.194796
 0.226863
 0.092186
Expected gradients (approx) - first five values only:
 0.3460
 0.1614
 0.1948
 0.2269
 0.0922

Program paused. Press enter to continue.
Train Accuracy: 83.050847
Expected accuracy (with lambda = 1): 83.1 (approx)

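The run generates two figures: the data visualization and the decision boundary.

That is everything in the logistic regression exercise.

To download all exercise files with the code already filled in, click here

Python implementation of the logistic regression exercise

You can also email 294562919@qq.com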
