Andrew Ng Machine Learning Exercise: Logistic Regression and Regularization (Octave Implementation)

Exercise background: NetEase Cloud Classroom -> Andrew Ng's Machine Learning course -> Logistic Regression exercise

Contents

Logistic Regression

Data Visualization

Implementing Logistic Regression

Implementing the sigmoid Function

Cost Function and Gradient

Learning Parameters with fminunc

Evaluating Logistic Regression

Output of the ex2 Script

Regularized logistic regression

Data Visualization

Feature Mapping

Cost and Gradient

Learning Parameters with fminunc

Plotting the Decision Boundary

Output of the ex2_reg Script


Logistic Regression

In this part of the exercise, we will build a logistic regression model to predict whether a student will be admitted to a university.

The code skeleton in ex2.m guides us through the whole logistic regression exercise; we only need to fill in the corresponding function files and then run ex2. The skeleton's individual steps are not described in detail again below, so we may need to go back and re-read the code in ex2.m to keep the overall flow clear.

The main steps in ex2.m:

%% Initialization
clear ; close all; clc

%% Load data

data = load('ex2data1.txt');
X = data(:, [1, 2]); y = data(:, 3);

%% Visualization
plotData(X, y);
hold on;
xlabel('Exam 1 score')
ylabel('Exam 2 score')
legend('Admitted', 'Not admitted')
hold off;


%% ============ Compute cost and gradient ============
[m, n] = size(X);
X = [ones(m, 1) X];

% Initialize theta
initial_theta = zeros(n + 1, 1);

% Compute and display initial cost and gradient
[cost, grad] = costFunction(initial_theta, X, y);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n -0.1000\n -12.0092\n -11.2628\n');

%% ============= Solve with fminunc =============
%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

%  Run the solver; this uses costFunction
[theta, cost] = ...
	fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

% Print theta to screen
fprintf('Cost at theta found by fminunc: %f\n', cost);
fprintf('Expected cost (approx): 0.203\n');
fprintf('theta: \n');
fprintf(' %f \n', theta);
fprintf('Expected theta (approx):\n');
fprintf(' -25.161\n 0.206\n 0.201\n');

% Plot the decision boundary
plotDecisionBoundary(theta, X, y);

% Put some labels 
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')

legend('Admitted', 'Not admitted')
hold off;

%% ============== Prediction and training accuracy ==============

prob = sigmoid([1 45 85] * theta);
fprintf(['For a student with scores 45 and 85, we predict an admission ' ...
         'probability of %f\n'], prob);
fprintf('Expected value: 0.775 +/- 0.002\n\n');

% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (approx): 89.0\n');
fprintf('\n');



Data Visualization

Before implementing a learning algorithm, it is common practice to visualize the data first. The data used in this exercise are historical records of past applicants; each example contains a student's scores on two exams and the final admission decision.

This exercise uses the plotData function to display the data in a 2-D plane, so the first task is to complete the plotData function in plotData.m.

function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure 
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.

% Create New Figure
figure; hold on;

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.
%

% Fill in directly with the code suggested in the exercise handout

% Find the indices of all examples with y == 1 (pos) and with y == 0 (neg)
pos = find(y==1); neg = find(y == 0);
% Plot the examples with y == 1 in the 2-D plane
plot(X(pos, 1), X(pos,2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);
% Plot the examples with y == 0 in the 2-D plane
plot(X(neg, 1), X(neg,2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);


% =========================================================================

hold off;

end

After completing plotData, run ex2 in Octave to see the visualization of the data:

Implementing Logistic Regression

Implementing the sigmoid Function

Recall the definition of the hypothesis. For logistic regression, it is:

h_\theta(x)=g(\theta^Tx), where g(z)=\frac{1}{1+e^{-z}} is what we call the sigmoid function.

The first step is to implement the sigmoid function in sigmoid.m; it must handle not only scalar values but also matrices (element-wise).

The code in sigmoid.m:

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly 
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).

g = 1 ./ (1 + exp(-z));   % element-wise, so z may be a scalar, vector, or matrix


% =============================================================

end
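
A quick sanity check at the Octave prompt (these calls just exercise the function above; the expected values follow from the definition of g):

sigmoid(0)              % ans = 0.50000
sigmoid([-100 0 100])   % ans ≈ 0  0.5  1  (element-wise on a vector)
sigmoid(zeros(2))       % a 2x2 matrix with every entry 0.5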

Cost Function and Gradient

The logistic regression cost function J(\theta) is defined as:

J(\theta)=\frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log(h_{\theta}(x^{(i)}))-(1-y^{(i)})\log(1-h_{\theta}(x^{(i)}))\right]

(Be careful to distinguish the hypothesis of logistic regression from that of linear regression, as well as their cost functions.)

The gradient (the vector of partial derivatives of J(\theta)) is:

\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x_j^{(i)}

(The course does not walk through the derivation of these partial derivatives in detail; see the derivation of the gradient of the logistic regression cost function, or the sketch below.)
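
A compact sketch of that derivation, using the sigmoid identity g'(z)=g(z)(1-g(z)): writing h for h_{\theta}(x^{(i)}), we have \frac{\partial h}{\partial \theta_j}=h(1-h)x_j^{(i)}, so

\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum_{i=1}^{m}\left[-\frac{y^{(i)}}{h}+\frac{1-y^{(i)}}{1-h}\right]h(1-h)x_j^{(i)}=\frac{1}{m}\sum_{i=1}^{m}(h-y^{(i)})x_j^{(i)}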

In ex2, the costFunction function is called to compute the cost and the gradient, so we need to fill in both computations in costFunction.m.

One difference from the linear regression exercise is worth noting: there we implemented gradient descent ourselves, so we were free to put the cost computation and the gradient descent step in separate functions. In this exercise we use fminunc to learn the parameters, so costFunction must follow the interface fminunc expects: given theta, return both the cost and the gradient.

The implementation in costFunction.m:

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%


h = sigmoid(X * theta);                            % m x 1 vector of predictions
J = ((-y)' * log(h) - (1-y)' * log(1 - h)) / m;    % vectorized cost
grad = ((h - y)' * X)' ./ m;                       % gradient, same size as theta


% =============================================================

end
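
Before running the full script, the zero-theta case makes a handy check: with \theta=0 every prediction is h=0.5, so the cost must be -\log(0.5)=\log 2\approx 0.693, matching the expected output in ex2.m. At the Octave prompt (assuming ex2data1.txt is in the working directory):

data = load('ex2data1.txt');
X = [ones(size(data, 1), 1) data(:, 1:2)];
y = data(:, 3);
[J, grad] = costFunction(zeros(3, 1), X, y)   % J = 0.6931...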

Learning Parameters with fminunc

fminunc is an optimization solver that finds the minimum of an unconstrained function. For logistic regression, the goal is to find the parameters \theta that minimize the cost function J(\theta).
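
To make the call pattern in ex2.m concrete, here is a minimal, self-contained sketch of the fminunc interface on a toy objective (not part of the exercise): optimset('GradObj', 'on') declares that our function returns the gradient as its second output, and the anonymous wrapper @(t) ... lets fminunc vary t alone while everything else is captured from the workspace.

% Toy objective f(t) = (t - 3)^2 with gradient 2(t - 3);
% deal returns both outputs, as fminunc expects when GradObj is 'on'
f = @(t) deal((t - 3)^2, 2 * (t - 3));
options = optimset('GradObj', 'on', 'MaxIter', 400);
[t_min, f_min] = fminunc(f, 0, options)   % t_min ≈ 3, f_min ≈ 0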

Evaluating Logistic Regression

Once the logistic regression model is built, we can use it to predict whether a student will be admitted. One way is to take the student's exam scores and compute the admission probability with the model; another is to plot the decision boundary.

In ex2 the code that plots the decision boundary is provided; what we need to complete is the predict function. It uses the learned parameters \theta and a student's scores to predict whether the student will be admitted, returning 0 or 1.

The code in predict.m:

function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic 
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a 
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters. 
%               You should set p to a vector of 0's and 1's
%

% Mark as 1 every example whose predicted probability is at least 0.5
pos = find(sigmoid(X * theta) >= 0.5);
p(pos, 1) = 1;


% =========================================================================


end
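
Equivalently, since the comparison already yields a logical vector, the body can be a single line:

p = double(sigmoid(X * theta) >= 0.5);   % logical 0/1 converted to double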

Output of the ex2 Script

After filling in all the functions, simply run ex2 to execute the whole process.

The output of each step while ex2 runs:

Cost at initial theta (zeros): 0.693147
Expected cost (approx): 0.693
Gradient at initial theta (zeros):
 -0.100000
 -12.009217
 -11.262842
Expected gradients (approx):
 -0.1000
 -12.0092
 -11.2628

Cost at test theta: 0.218330
Expected cost (approx): 0.218
Gradient at test theta:
 0.042903
 2.566234
 2.646797
Expected gradients (approx):
 0.043
 2.566
 2.647

Program paused. Press enter to continue.
Cost at theta found by fminunc: 0.203498
Expected cost (approx): 0.203
theta:
 -25.161272
 0.206233
 0.201470
Expected theta (approx):
 -25.161
 0.206
 0.201

Program paused. Press enter to continue.
For a student with scores 45 and 85, we predict an admission probability of 0.776289
Expected value: 0.775 +/- 0.002

Train Accuracy: 89.000000
Expected accuracy (approx): 89.0

Two figures are plotted: the data visualization and the decision boundary.


Regularized logistic regression

In the second part of the exercise, we apply regularized logistic regression to predict whether microchips pass QA (quality assurance).

The code skeleton in ex2_reg.m guides us through this part of the exercise; again we only need to fill in the corresponding function files and then run ex2_reg. Since ex2_reg reuses many functions from ex2, those are not described again here.

The main steps in ex2_reg.m:

%% Initialization
clear ; close all; clc

%% Load data
data = load('ex2data2.txt');
X = data(:, [1, 2]); y = data(:, 3);

%% Visualize the data
plotData(X, y);
hold on;

xlabel('Microchip Test 1')
ylabel('Microchip Test 2')
legend('y = 1', 'y = 0')
hold off;


%% =========== Regularized logistic regression ============
% Add polynomial features
X = mapFeature(X(:,1), X(:,2));

% Initialize theta
initial_theta = zeros(size(X, 2), 1);

% Set the regularization parameter lambda to 1
lambda = 1;

% Compute the cost and gradient at the initial theta
[cost, grad] = costFunctionReg(initial_theta, X, y, lambda);

fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros) - first five values only:\n');
fprintf(' %f \n', grad(1:5));
fprintf('Expected gradients (approx) - first five values only:\n');
fprintf(' 0.0085\n 0.0188\n 0.0001\n 0.0503\n 0.0115\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

%% ============= Optimize and evaluate =============
% Initialize theta
initial_theta = zeros(size(X, 2), 1);

% Set the regularization parameter lambda to 1
lambda = 1;

% Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Optimize
[theta, J, exit_flag] = ...
	fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);

% Plot the decision boundary
plotDecisionBoundary(theta, X, y);
hold on;
title(sprintf('lambda = %g', lambda))
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')
legend('y = 1', 'y = 0', 'Decision boundary')
hold off;

% Evaluate
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);
fprintf('Expected accuracy (with lambda = 1): 83.1 (approx)\n');

Data Visualization

Reusing the data-visualization function from ex2, running ex2_reg produces the following figure:

Feature Mapping

The visualization above shows that a simple linear fit cannot separate this data well, so for this model we need polynomial features to obtain a better fit.

The mapFeature function maps the two features into all polynomial terms of x1 and x2 up to the sixth power. This function is provided; we do not need to implement it ourselves.

mapFeature.m:

function out = mapFeature(X1, X2)
% MAPFEATURE Feature mapping function to polynomial features
%
%   MAPFEATURE(X1, X2) maps the two input features
%   to quadratic features used in the regularization exercise.
%
%   Returns a new feature array with more features, comprising of 
%   X1, X2, X1.^2, X2.^2, X1*X2, X1*X2.^2, etc..
%
%   Inputs X1, X2 must be the same size
%

degree = 6;
out = ones(size(X1(:,1)));
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (X1.^(i-j)).*(X2.^j);
    end
end

end
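
As a quick check on the mapping's size: for degree 6 the output has 1 + 2 + ... + 7 = 28 columns (the constant term plus every monomial x1^a * x2^b with 1 <= a+b <= 6). For example:

size(mapFeature([0.5; -0.5], [1; 2]))   % ans = 2 28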

Cost and Gradient

Polynomial fits are prone to overfitting, especially when the number of training examples is small. To mitigate overfitting, we modify the cost function J(\theta) from the previous part by adding a regularization term. This part of the exercise uses L2 regularization.

J(\theta)=\frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log(h_{\theta}(x^{(i)}))-(1-y^{(i)})\log(1-h_{\theta}(x^{(i)}))\right]+\frac{\lambda}{2m}\sum_{j=1}^n\theta_j^2

Note that we normally do not regularize \theta_0, so the gradient computation splits into two cases:

\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x_j^{(i)}, for j=0;

\frac{\partial J(\theta)}{\partial \theta_j}=\frac{1}{m}\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x_j^{(i)}+\frac{\lambda}{m}\theta_j, for j\geq 1.

In the next step we again use fminunc to learn the parameters, so costFunctionReg must compute both the cost and the gradient.
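
Both cases can be folded into one vectorized expression: let \theta_r be a copy of \theta with its first element set to zero. Then the cost gains the term \frac{\lambda}{2m}\theta_r^T\theta_r, and the gradient becomes \frac{1}{m}X^T(h-y)+\frac{\lambda}{m}\theta_r, which covers j=0 and j\geq 1 at once. This is exactly the trick used in the code below.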

The code in costFunctionReg.m:

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters. 

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta

theta_r = theta;
theta_r(1, 1) = 0;    % exclude theta0 (i.e., theta(1,1)) from the regularization term
h = sigmoid(X * theta);
J = ((-y)' * log(h) - (1-y)' * log(1 - h)) / m + theta_r' * theta_r * (lambda / (2 * m));

grad = ((h - y)' * X)' ./ m + theta_r .* (lambda / m);


% =============================================================

end
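
The same zero-theta check works here: with \theta=0 the regularization term vanishes, so the cost is again \log 2\approx 0.693, as ex2_reg expects (assuming ex2data2.txt is in the working directory):

data = load('ex2data2.txt');
X = mapFeature(data(:, 1), data(:, 2));   % 28 columns, including the bias term
y = data(:, 3);
[J, grad] = costFunctionReg(zeros(size(X, 2), 1), X, y, 1)   % J = 0.6931...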

Learning Parameters with fminunc

This step works exactly as in the first part: the same optimset options are used, and the anonymous function passed to fminunc now wraps costFunctionReg, capturing the extra regularization parameter lambda (see the ex2_reg listing above).

Plotting the Decision Boundary

The plotDecisionBoundary function that draws the decision boundary is likewise provided; we do not need to implement it ourselves.
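
For the mapped (nonlinear) features, the idea is to evaluate \theta^T x for x = mapFeature(u, v) over a grid of points and draw the contour where the value is zero. The sketch below reconstructs that idea; it is not the exact course file:

% Evaluate the hypothesis over a grid and plot the zero contour
u = linspace(-1, 1.5, 50);
v = linspace(-1, 1.5, 50);
z = zeros(length(u), length(v));
for i = 1:length(u)
    for j = 1:length(v)
        z(i, j) = mapFeature(u(i), v(j)) * theta;
    end
end
contour(u, v, z', [0, 0], 'LineWidth', 2)   % theta' * x = 0 is the decision boundary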

Output of the ex2_reg Script

The output while the script runs:

Cost at initial theta (zeros): 0.693147
Expected cost (approx): 0.693
Gradient at initial theta (zeros) - first five values only:
 0.008475
 0.018788
 0.000078
 0.050345
 0.011501
Expected gradients (approx) - first five values only:
 0.0085
 0.0188
 0.0001
 0.0503
 0.0115

Program paused. Press enter to continue.

Cost at test theta (with lambda = 10): 3.164509
Expected cost (approx): 3.16
Gradient at test theta - first five values only:
 0.346045
 0.161352
 0.194796
 0.226863
 0.092186
Expected gradients (approx) - first five values only:
 0.3460
 0.1614
 0.1948
 0.2269
 0.0922

Program paused. Press enter to continue.
Train Accuracy: 83.050847
Expected accuracy (with lambda = 1): 83.1 (approx)

Two figures are generated while the script runs: the data visualization and the decision boundary.

That covers the entire logistic regression exercise.

To download all of the exercise files with the code already filled in, click here.

Python implementation of this logistic regression exercise

You can also send an email to 294562919@qq.com.
