All the code for this project is on my GitHub; anyone interested is welcome to discuss it with me~
Path: Machine-Learning/machine-learning-ex2/
1. Introduction
Logistic Regression is defined on Wikipedia as follows:
In statistics, logistic regression, or logit regression, or logit model[1] is a regression model where the dependent variable (DV) is categorical. This article covers the case of a binary dependent variable—that is, where it can take only two values, “0” and “1”, which represent outcomes such as pass/fail, win/lose, alive/dead or healthy/sick. Cases where the dependent variable has more than two outcome categories may be analysed in multinomial logistic regression, or, if the multiple categories are ordered, in ordinal logistic regression.[2] In the terminology of economics, logistic regression is an example of a qualitative response/discrete choice model.
As I understand it, logistic regression is another data-fitting tool, one used to solve classification problems, i.e. problems where the output is discrete. Linear regression usually handles such problems poorly, so for classification we should choose logistic regression over linear regression.
Logistic regression comes in two main flavors:
- Logistic regression with two outcomes, usually just called logistic regression.
- Logistic regression with more than two outcomes, usually called multinomial logistic regression.
Any discussion of logistic regression has to mention the Decision Boundary, which separates data points of different classes. It, too, comes in two types:
- Linear decision boundaries
- Non-linear decision boundaries
Usually we obtain the final theta via gradient descent or some other optimization algorithm, and then use that theta to draw the decision boundary. Note: the decision boundary is a property of the hypothesis function, determined by its theta; it is not directly a property of the training set.
The hypothesis of logistic regression is built on the sigmoid function, which has the following characteristics:
- when x = 0, its value is 0.5;
- when x < 0, its value is < 0.5; when x > 0, its value is > 0.5;
- as x approaches negative infinity, its value approaches 0;
- as x approaches positive infinity, its value approaches 1;
- its shape resembles an S, hence the name S-shaped (sigmoid) function.
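The sigmoid can be sketched in a few lines (shown here in Python/NumPy purely for illustration; the exercise itself is written in Octave):

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# The properties listed above:
print(sigmoid(0.0))          # 0.5 at z = 0
print(sigmoid(-10.0) < 0.5)  # True: values below 0.5 for z < 0
print(sigmoid(10.0) > 0.5)   # True: values above 0.5 for z > 0
```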
The value of the hypothesis represents the probability that the output is 1. We can set a threshold such as 0.5: when the value is >= 0.5, the output is 1; when the value is < 0.5, the output is 0.
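Since the sigmoid crosses 0.5 exactly at z = 0, with the hypothesis written as g applied to theta-transpose-x the threshold rule above is equivalent to predicting 1 whenever theta-transpose-x is non-negative, which is where the decision boundary comes from:

```latex
h_\theta(x) = g(\theta^T x) \ge 0.5 \iff \theta^T x \ge 0,
\qquad \text{decision boundary: } \theta^T x = 0
```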
As for the cost function: if we simply plug the hypothesis into the squared-error cost used for linear regression, the resulting function is non-convex, which severely hampers gradient descent. Hence the logarithmic form of the cost function. Because this is a classification problem, the cost has two cases (y = 1 and y = 0), but with a small trick the two can be merged into a single expression.
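The two cases and their merged form are:

```latex
\mathrm{Cost}(h_\theta(x), y) =
\begin{cases}
-\log\bigl(h_\theta(x)\bigr) & \text{if } y = 1 \\
-\log\bigl(1 - h_\theta(x)\bigr) & \text{if } y = 0
\end{cases}

J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
\Bigl[ y^{(i)} \log h_\theta(x^{(i)})
     + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr]
```

The trick is that exactly one of the two terms survives for each example, since y is always 0 or 1.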
When writing the code, it is also convenient to express these formulas in vectorized form.
The gradient is simply the partial derivative of the cost function J(theta) with respect to theta.
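The vectorized cost and gradient can be sketched as follows (Python/NumPy for illustration; the exercise's costFunction.m implements the same formulas in Octave, and the toy data here is made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_function(theta, X, y):
    """Vectorized logistic-regression cost J(theta) and gradient dJ/dtheta."""
    m = len(y)
    h = sigmoid(X @ theta)                              # hypothesis for all m examples at once
    J = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m  # log-loss, both cases merged
    grad = X.T @ (h - y) / m                            # vector of partial derivatives
    return J, grad

# At theta = 0, h = 0.5 for every example, so J = -log(0.5) = 0.693 for any data set
X = np.array([[1.0, 34.6, 78.0], [1.0, 30.3, 43.9], [1.0, 35.8, 72.9]])
y = np.array([0.0, 0.0, 1.0])
J, grad = cost_function(np.zeros(3), X, y)
print(round(J, 3))  # 0.693
```

This matches the "Expected cost (approx): 0.693" check in the main script below.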
To handle classification with more than two outcomes, we use the one-vs-all (one-vs-rest) algorithm. For example, with three classes (A, B, C) we train three binary classifiers (A vs. rest, B vs. rest, C vs. rest). We then feed x into each classifier and pick the class whose classifier reports the highest probability.
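The prediction step of one-vs-all can be sketched like this (Python/NumPy for illustration; `all_theta` and the toy parameter values are my own assumptions, standing in for classifiers that would already have been trained):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_one_vs_all(all_theta, X):
    """all_theta: (num_classes, n+1) matrix, one row of parameters per binary classifier.
    X: (m, n+1) design matrix with the intercept column already added.
    Returns, for each example, the index of the most confident classifier."""
    probs = sigmoid(X @ all_theta.T)  # (m, num_classes) matrix of probabilities
    return np.argmax(probs, axis=1)   # pick the class with the highest probability

# Toy example: three hand-picked classifiers over a single feature
all_theta = np.array([[ 2.0, -3.0],   # class 0: likely when x is small
                      [-1.0,  0.5],   # class 1: middle range
                      [-6.0,  2.0]])  # class 2: likely when x is large
X = np.array([[1.0, 0.0], [1.0, 4.0]])
print(predict_one_vs_all(all_theta, X))  # [0 2]
```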
When fitting data there are three situations:
- Underfitting: the model does not fit the data well, so prediction accuracy is low.
- Overfitting: the model fits the training data too closely and generalizes poorly, which also hurts prediction accuracy.
- Ordinary: the model fits the data reasonably well.
Underfitting, or high bias, is when the form of our hypothesis function h maps poorly to the trend of the data. It is usually caused by a function that is too simple or uses too few features. At the other extreme, overfitting, or high variance, is caused by a hypothesis function that fits the available data but does not generalize well to predict new data. It is usually caused by a complicated function that creates a lot of unnecessary curves and angles unrelated to the data.
There are two main ways to address overfitting:
- Reduce the number of features:
  - Manually select which features to keep.
  - Use a model selection algorithm (studied later in the course).
- Regularization:
  - Keep all the features, but reduce the magnitude of the parameters theta.
  - Regularization works well when we have a lot of slightly useful features.
Now a few words about Regularization:
Regularization adds an extra penalty term to the cost to shrink the magnitude of theta and thereby avoid overfitting. As noted above, having too many features tends to cause overfitting. To counter this, we reduce the influence of the features, i.e. we shrink the corresponding theta values appropriately; in the extreme case a theta of 0 removes its feature from the model entirely.
How do we control the strength of regularization? With the parameter lambda. A suitable lambda can turn an overfit model into a well-fit one, but too large a lambda may turn overfitting into underfitting. Choosing an appropriate lambda therefore matters; the project below touches on this.
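With the penalty term added, the regularized cost becomes (note that by convention the intercept term is not penalized, so the penalty sum starts at j = 1):

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
\Bigl[ y^{(i)} \log h_\theta(x^{(i)})
     + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \Bigr]
+ \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2
```

A larger lambda penalizes large theta values more heavily, which is exactly the over- vs. underfitting trade-off described above.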
What you need to master:
1. The hypothesis, cost function, and gradient-descent update rule of logistic regression;
2. The gradient computation for logistic regression;
3. The hypothesis, cost function, and gradient-descent update rule of regularized logistic regression.
2. Logistic Regression
Main script:
%% Initialization
clear ; close all; clc
%% Load Data
% The first two columns contains the exam scores and the third column
% contains the label.
data = load('ex2data1.txt');
X = data(:, [1, 2]); y = data(:, 3);
%% ==================== Part 1: Plotting ====================
% We start the exercise by first plotting the data to understand the
% problem we are working with.
fprintf(['Plotting data with + indicating (y = 1) examples and o ' ...
'indicating (y = 0) examples.\n']);
plotData(X, y);
% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score')
ylabel('Exam 2 score')
% Specified in plot order
legend('Admitted', 'Not admitted')
hold off;
fprintf('\nProgram paused. Press enter to continue.\n');
pause;
%% ============ Part 2: Compute Cost and Gradient ============
% In this part of the exercise, you will implement the cost and gradient
% for logistic regression. You need to complete the code in
% costFunction.m
% Setup the data matrix appropriately, and add ones for the intercept term
[m, n] = size(X);
% Add intercept term to x and X_test
X = [ones(m, 1) X];
% Initialize fitting parameters
initial_theta = zeros(n + 1, 1);
% Compute and display initial cost and gradient
[cost, grad] = costFunction(initial_theta, X, y);
fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n -0.1000\n -12.0092\n -11.2628\n');
% Compute and display cost and gradient with non-zero theta
test_theta = [-24; 0.2; 0.2];