Coursera Machine Learning Week 3 Programming Assignment: Logistic Regression

萧瑟1

Article link: https://blog.csdn.net/qq_41410799/article/details/95628871

This assignment has two parts: logistic regression and regularized logistic regression.

Logistic Regression

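In this part, a logistic regression model predicts from exam scores whether a student will be admitted to a university.
The training data gives each student's scores on two exams together with the admission decision, where 1 means admitted and 0 means not admitted.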

Visualizing the data

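Here we complete plotData.m to visualize the data; the course material (ppt) already gives the answer.
[Figure: scatter plot of the training data]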

plotData.m:

function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure 
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.

% Create New Figure
figure; hold on;

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.
%
pos = find(y==1); neg= find(y==0);
plot (X(pos,1),X(pos,2),'k+','LineWidth',2,...
'MarkerSize',7);
plot (X(neg,1),X(neg,2),'ko','MarkerFaceColor','y',...
'MarkerSize',7);

% =========================================================================

hold off;

end

sigmoid function

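Because this is a classification problem, the value predicted from the linear combination of the features has to be constrained to the range 0 to 1, so we need a function that does the squashing.
The function used here is the sigmoid function $\frac{1}{1+e^{-z}}$.
This is the graph of $\frac{1}{1+e^{-z}}$:
[Figure: plot of the sigmoid function]
The graph shows that the function value is 0.5 at $z=0$, so 0.5 can serve as the decision threshold: an output greater than or equal to 0.5 is classified as 1, and an output below 0.5 is classified as 0.
Back to the assignment, we set
$g_\theta(x) = \theta^T x$
and then pass $g_\theta(x)$ through the sigmoid function to obtain $h_\theta(x)$.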

sigmoid.m

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly 
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).
g = exp(-z);
g = 1./(1+g);

% =============================================================

end
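A few quick checks can confirm that a sigmoid implementation behaves as described. The sketch below is an illustration only: it uses an inline anonymous function named sigmoid_fn (a hypothetical name, so the snippet runs without sigmoid.m) and made-up inputs.

```octave
% Inline stand-in for sigmoid.m (hypothetical name, same formula)
sigmoid_fn = @(z) 1 ./ (1 + exp(-z));

g0 = sigmoid_fn(0);                 % value at the decision threshold

z = [-10 -1 0 1 10];
g = sigmoid_fn(z);                  % works elementwise on a vector

% The sigmoid satisfies g(z) + g(-z) = 1
sym_err = max(abs(g + sigmoid_fn(-z) - 1));

G = sigmoid_fn(zeros(2, 3));        % matrix input keeps its shape
```

Because the formula uses elementwise division (./), the same code handles scalars, vectors, and matrices, which is what the assignment instructions ask for.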

Cost function and gradient

In this part we implement costFunction.m to compute the cost function and its gradient. The cost function $J(\theta)$ is:
$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right]$
The gradient is:
$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}$
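The sums can be written in matrix form, which is what a vectorized implementation computes. As a sketch, assume $X$ stacks the training examples as rows (including the column of ones), $y$ is the column vector of labels, and $h$ is the sigmoid applied elementwise to $X\theta$:

$J(\theta) = -\frac{1}{m}\left[y^T\log(h) + (1-y)^T\log(1-h)\right]$
$\nabla_\theta J(\theta) = \frac{1}{m}X^T(h-y)$

Each inner product replaces one sum over $i$, so no explicit loop over the examples is needed.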

costFunction.m

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. to the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%
g = X*theta;
h = sigmoid(g);
J = -1/m*(y'*log(h)+(1-y')*log(1-h));
grad = 1/m*(X'*(h-y));

% =============================================================

end
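One way to gain confidence in a cost and gradient implementation is to compare the analytic gradient with a finite-difference estimate. The following is a minimal sketch under stated assumptions: the data is made up (not the assignment's data set), and cost_fn, grad_fn, and sigmoid_fn are hypothetical inline stand-ins that use the same formulas as costFunction.m.

```octave
% Made-up toy data: intercept column plus two features
X = [1 0.5 1.2; 1 1.5 0.3; 1 2.0 2.5; 1 0.1 0.4];
y = [0; 0; 1; 1];
m = length(y);

% Inline stand-ins using the same formulas as costFunction.m
sigmoid_fn = @(z) 1 ./ (1 + exp(-z));
cost_fn = @(t) -1/m * (y' * log(sigmoid_fn(X*t)) + (1 - y)' * log(1 - sigmoid_fn(X*t)));
grad_fn = @(t) 1/m * (X' * (sigmoid_fn(X*t) - y));

% With theta = 0 every prediction is 0.5, so the cost is log(2)
J0 = cost_fn(zeros(3, 1));

% Central-difference estimate of the gradient at an arbitrary theta
theta = [0.1; -0.2; 0.3];
epsilon = 1e-4;
num_grad = zeros(size(theta));
for j = 1:numel(theta)
  e = zeros(size(theta));
  e(j) = epsilon;
  num_grad(j) = (cost_fn(theta + e) - cost_fn(theta - e)) / (2 * epsilon);
end
max_diff = max(abs(num_grad - grad_fn(theta)));
```

A max_diff that is many orders of magnitude smaller than the gradient entries suggests the analytic gradient matches the cost function.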

Learning parameters using fminunc

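fminunc is Octave's function for unconstrained minimization. Calling it requires an options variable that holds the configuration. In the code below, 'GradObj', 'on' turns on the gradient-objective option, which means the cost function you pass in must also return the gradient. 'MaxIter', 400 sets the maximum number of iterations to 400. initial_theta is our initial guess for θ.

We then call fminunc with three arguments. The first is a function handle, created with @, to the costFunction we defined earlier. The other two are the initial value of θ and the options configuration.

When fminunc is called, it picks an advanced optimization algorithm on its own (you can think of it as a gradient descent that chooses a suitable learning rate α automatically).

fminunc can return three values: the θ that minimizes the cost function J(θ), the value of the cost at that θ, and an exit flag that reports the convergence status (these are named optTheta, functionVal, and exitFlag in the quoted explanation). An exit flag of 1 means the algorithm converged, and 0 means the iteration limit was reached first. The code below keeps only the first two return values, theta and cost.
Source: afunyusong's blog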

% Initialize fitting parameters
initial_theta = zeros(n + 1, 1);

% Compute and display initial cost and gradient
[cost, grad] = costFunction(initial_theta, X, y);
options = optimset('GradObj', 'on', 'MaxIter', 400);
[theta, cost] = ...
	fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
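To see the third return value in action, here is a small self-contained sketch for Octave. It assumes made-up, non-separable toy data, and toy_cost is a hypothetical stand-in for costFunction.m so that the script runs on its own.

```octave
1;  % in Octave, a leading statement lets a script file define functions

% Hypothetical stand-in for costFunction.m (same formulas)
function [J, grad] = toy_cost(theta, X, y)
  m = length(y);
  h = 1 ./ (1 + exp(-X * theta));
  J = -1/m * (y' * log(h) + (1 - y)' * log(1 - h));
  grad = 1/m * (X' * (h - y));
end

% Made-up toy data: intercept column plus one feature, not separable
X = [ones(6, 1), (1:6)'];
y = [0; 0; 1; 0; 1; 1];

initial_theta = zeros(2, 1);
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Third output is the exit flag described above
[theta_opt, cost_opt, exit_flag] = ...
    fminunc(@(t) toy_cost(t, X, y), initial_theta, options);

printf('cost at optimum: %f, exit flag: %d\n', cost_opt, exit_flag);
```

The toy labels overlap on purpose: with perfectly separable data the unregularized cost has no finite minimizer, so the optimizer would keep growing θ until it hits a stopping criterion.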

Evaluating logistic regression

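plotDecisionBoundary.m draws the following plot:
[Figure: training data with the learned decision boundary]
Next we write predict.m, which applies the trained logistic regression model to the training data so that the predictions can be compared with the true labels to compute the accuracy.
Each example is fed into $h_\theta(x)$; if the result is greater than or equal to 0.5 the prediction is 1, and if it is below 0.5 the prediction is 0.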

predict.m

function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic 
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a 
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters. 
%               You should set p to a vector of 0's and 1's
%
g = X*theta;
h = sigmoid(g);
p = round(h);

% =========================================================================

end
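The comparison with the true labels can be sketched as follows. The parameter vector and data here are made up for illustration, and the threshold is written out explicitly instead of using round.

```octave
% Made-up parameters and data: intercept column plus one feature
theta = [-3; 1];
X = [1 1; 1 2; 1 3; 1 4; 1 5];
y = [0; 0; 0; 1; 1];

h = 1 ./ (1 + exp(-X * theta));    % predicted probabilities
p = double(h >= 0.5);              % threshold at 0.5

% Percentage of examples where the prediction matches the label
accuracy = mean(double(p == y)) * 100;
printf('Train accuracy: %.1f%%\n', accuracy);
```

In this toy case the third example sits exactly on the threshold (its probability is 0.5), is predicted as 1, and is the only misclassified example.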

That completes the logistic regression part.

Regularized logistic regression

Regularization mainly addresses overfitting caused by an overly complex model. The regularized cost function is:
$J(\theta)= -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(h_\theta(x^{(i)}))+(1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right]+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$

Note: the regularization term does not include $\theta_0$, i.e. theta(1) must not be regularized.

The regularized gradient is:
$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)} \qquad j=0$
$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}(h_\theta(x^{(i)})-y^{(i)})x_j^{(i)}+\frac{\lambda}{m}\theta_j \qquad j>0$
With these formulas in hand, we can start on the exercise.

Visualizing the data

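The task is similar to the one above, but the shape of the data is different.
[Figure: scatter plot of the training data for the regularized part]
The two classes cannot be separated by a straight line, so we need to build additional features. In Octave, mapFeature.m builds polynomial features up to degree 6.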
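The post does not show mapFeature.m itself. As a sketch of what such a mapping can look like, the loop below builds every term x1^(i-j) * x2^j up to degree 6 from two made-up feature columns; the variable names are illustrative.

```octave
% Made-up inputs: two feature columns of equal length
x1 = [1; 2];
x2 = [3; 4];

degree = 6;
out = ones(size(x1));              % first column is the bias term
for i = 1:degree
  for j = 0:i
    % term of total degree i: x1^(i-j) * x2^j
    out(:, end + 1) = (x1 .^ (i - j)) .* (x2 .^ j);
  end
end

printf('number of mapped features: %d\n', size(out, 2));
```

Two input features become 28 columns, which is why the model can fit a curved boundary and also why regularization becomes important.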

Cost function and gradient

With the formulas above we can complete costFunctionReg.m.

costFunctionReg.m

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. to the parameters. 

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0;
grad = zeros(size(theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
g = X*theta;
h = sigmoid(g);
J = (-1/m)*(y'*log(h)+(1-y)'*(log(1-h)))+((lambda/(2*m))*(theta'*theta-theta(1)*theta(1)));
grad = 1/m*(X'*(h-y))+(lambda/m*theta);
grad(1) = grad(1) - lambda/m*(theta(1));

% =============================================================

end
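An equivalent way to leave theta(1) out of the penalty is to work with a copy of theta whose first entry is zero. The sketch below uses made-up data and parameters; theta_reg and sigmoid_fn are hypothetical names introduced only for this illustration.

```octave
% Made-up toy data and parameters
X = [1 0.5 1.2; 1 1.5 0.3; 1 2.0 2.5; 1 0.1 0.4];
y = [0; 0; 1; 1];
m = length(y);
theta = [0.5; -1; 2];
lambda = 1;

sigmoid_fn = @(z) 1 ./ (1 + exp(-z));
h = sigmoid_fn(X * theta);

% Unregularized cost and gradient
J_plain = -1/m * (y' * log(h) + (1 - y)' * log(1 - h));
grad_plain = 1/m * (X' * (h - y));

% Copy of theta with the intercept zeroed, so it is never penalized
theta_reg = [0; theta(2:end)];
J_reg = J_plain + lambda / (2 * m) * (theta_reg' * theta_reg);
grad_reg = grad_plain + lambda / m * theta_reg;
```

This gives the same result as adding the full penalty and then subtracting the theta(1) contribution, and the first gradient entry is identical to the unregularized one.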

Passing costFunctionReg to fminunc then gives the θ that minimizes the regularized cost.

Plotting the decision boundary

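Use the trained model to plot the decision boundary:
[Figure: training data with the decision boundary of the regularized model]
That completes the regularized logistic regression part.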
