Basic Concepts
Logistic regression (also called log-odds regression) can be used to solve both binary and multiclass classification problems. In classification, the output set is no longer continuous but discrete, i.e. $\mathcal{Y} \in \{0,1,2,\cdots\}$. Taking binary classification as an example, the output set is typically $\mathcal{Y} \in \{0,1\}$.
To handle binary classification, logistic regression builds on linear regression by introducing the sigmoid (logistic) function, where $\exp(\cdot)$ is the natural exponential:
$$g(z) = \dfrac{1}{1 + \exp(-z)}$$
The range of this function is $(0,1)$, as shown in the figure below:
Therefore, the hypothesis of logistic regression is defined as:
$$h_\theta(x) = g(\theta^T x)$$
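As a minimal sketch, the sigmoid and the hypothesis above can be written in a few lines of NumPy (the function names `sigmoid` and `hypothesis` are illustrative, not from the text):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid (logistic) function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x): the predicted probability that y = 1."""
    return sigmoid(np.dot(theta, x))

# The sigmoid squashes any real input into (0, 1), with g(0) = 0.5.
print(sigmoid(0.0))                            # 0.5
print(sigmoid(np.array([-10.0, 0.0, 10.0])))   # values near 0, 0.5, near 1
```

Note that `sigmoid` is vectorized for free: passing an array of scores returns an array of probabilities, which is convenient later when evaluating all samples at once.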
In fact, $h_{\theta}(x)$ gives the probability that the label $y=1$, conditioned on the parameters $\theta$ and the sample $x$.
$$\begin{aligned}& h_\theta(x) = P(y=1 | x ; \theta) = 1 - P(y=0 | x ; \theta) \\& P(y = 0 | x;\theta) + P(y = 1 | x ; \theta) = 1\end{aligned}$$
Loss Function
The loss function of logistic regression is:
$$J(\theta) = \dfrac{1}{n} \sum_{i=1}^n \mathrm{Cost}(h_\theta(x^{(i)}),y^{(i)}) \\ \mathrm{Cost}(h_\theta(x^{(i)}),y^{(i)}) =\left\{ \begin{aligned} &-\log(h_\theta(x^{(i)})) \; & \text{if }y^{(i)} = 1\\ &-\log(1-h_\theta(x^{(i)})) \; & \text{if } y^{(i)} = 0 \end{aligned} \right.$$
This loss function is derived by maximum likelihood. For a given input set $\mathcal{X}$ and output set $\mathcal{Y}$, the likelihood function is:
$$\prod _{i = 1}^n \left[h_\theta(x^{(i)})\right]^{y^{(i)}}\left[1 - h_\theta(x^{(i)})\right]^{1 - y^{(i)}}$$
Since a product is hard to optimize, we take the logarithm of the expression above to turn it into a sum, which gives the log-likelihood:
$$L(\theta)=\frac{1}{n} \sum _{i=1}^n \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1-y^{(i)})\log(1 - h_\theta(x^{(i)})) \right ]$$
Maximizing this log-likelihood yields the optimal parameters $\theta$. Since maximizing $L(\theta)$ is equivalent to minimizing $-L(\theta)$, we arrive at the following form of the loss function:
$$J(\theta) = -\frac{1}{n} \sum _{i=1}^n \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1-y^{(i)})\log(1 - h_\theta(x^{(i)})) \right ]$$
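This cross-entropy loss is a one-liner in NumPy; the sketch below (my own names, with a made-up two-sample dataset) evaluates it at $\theta = 0$, where $h_\theta(x) = 0.5$ for every sample and the loss must equal $\log 2$:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(theta, X, y):
    """Cross-entropy loss: J = -(1/n) * sum[y log h + (1-y) log(1-h)]."""
    h = sigmoid(X @ theta)  # h_theta(x^(i)) for every row of X at once
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# With theta = 0 every prediction is 0.5, so J = -log(0.5) = log 2 ≈ 0.6931.
X = np.array([[1.0, 2.0], [1.0, -1.0]])
y = np.array([1.0, 0.0])
print(loss(np.zeros(2), X, y))  # 0.6931...
```

In practice one would clip `h` away from exactly 0 and 1 before taking logarithms to avoid `-inf`; that guard is omitted here for brevity.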
Parameter Learning
With the loss function in hand, we use gradient descent to find its minimum. First, simplify the loss function:
$$\begin{aligned} J(\theta) &=-\frac{1}{n} \sum _{i=1}^n \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1-y^{(i)})\log(1 - h_\theta(x^{(i)})) \right ] \\ &=-\frac{1}{n} \sum _{i=1}^n \left[ y^{(i)}\log \frac {h_\theta(x^{(i)})} {1 - h_\theta(x^{(i)})} + \log(1 - h_\theta(x^{(i)})) \right ] \\ &=-\frac{1}{n} \sum _{i=1}^n \left[ y^{(i)} \log \frac { {\exp(\theta\cdot x^{(i)})} / (1 + \exp(\theta\cdot x^{(i)}))} {{1} /(1 + \exp(\theta\cdot x^{(i)}))} + \log(1 - h_\theta(x^{(i)})) \right ] \\ &=-\frac{1}{n} \sum _{i=1}^n \left[ y^{(i)} (\theta\cdot x^{(i)}) - \log(1 + \exp (\theta\cdot x^{(i)})) \right ] \end{aligned}$$
Next, take the partial derivative of the loss $J(\theta)$ with respect to the parameters $\theta$:
$$\begin{aligned} \frac{\partial}{\partial \theta}J(\theta) &=-\frac{1}{n} \sum _{i=1}^n \left [y^{(i)} \cdot x^{(i)} - \frac {1} {1 + \exp(\theta \cdot x^{(i)})} \cdot \exp(\theta \cdot x^{(i)}) \cdot x^{(i)}\right ] \\ &=-\frac{1}{n} \sum _{i=1}^n \left [y^{(i)} \cdot x^{(i)} - \frac {\exp(\theta \cdot x^{(i)})} {1 + \exp(\theta \cdot x^{(i)})} \cdot x^{(i)}\right ] \\ &=-\frac{1}{n} \sum _{i=1}^n \left (y^{(i)} - \frac {\exp(\theta \cdot x^{(i)})} {1 + \exp(\theta \cdot x^{(i)})} \right ) x^{(i)}\\ &=\frac{1}{n} \sum _{i=1}^n \left (h_\theta(x^{(i)})-y^{(i)} \right )x^{(i)} \end{aligned}$$
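A useful way to gain confidence in this derivation is to compare the analytic gradient $\frac{1}{n}\sum_i (h_\theta(x^{(i)}) - y^{(i)})\, x^{(i)}$ against a central finite difference of the loss. The sketch below does exactly that on small random data (all names and the dataset are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(theta, X, y):
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    """Analytic gradient: (1/n) * sum_i (h_theta(x^(i)) - y^(i)) x^(i)."""
    h = sigmoid(X @ theta)
    return X.T @ (h - y) / len(y)

# Numerically check the derivation with central differences.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = np.array([0.0, 1.0, 1.0, 0.0, 1.0])
theta = rng.normal(size=3)
eps = 1e-6
num = np.array([
    (loss(theta + eps * e, X, y) - loss(theta - eps * e, X, y)) / (2 * eps)
    for e in np.eye(3)
])
# The difference should be tiny if the analytic gradient is correct.
print(np.max(np.abs(num - gradient(theta, X, y))))
```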
Finally, use gradient descent to update each parameter:
$$\theta_j \coloneqq \theta_j - \frac{\alpha}{n} \sum_{i=1}^n \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$
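Putting the pieces together, batch gradient descent with this update rule can be sketched as follows. The synthetic dataset (labels determined by the sign of $x_1 + x_2$, no bias term) and the hyperparameters are my own choices for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy binary classification problem: y = 1 exactly when x1 + x2 > 0.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

theta = np.zeros(2)
alpha = 0.5                      # learning rate
for _ in range(1000):            # batch gradient descent
    h = sigmoid(X @ theta)       # h_theta(x^(i)) for all samples
    theta -= alpha / len(y) * (X.T @ (h - y))  # the update rule above, vectorized

pred = (sigmoid(X @ theta) >= 0.5).astype(float)
print((pred == y).mean())        # training accuracy, close to 1.0
```

The vectorized update `X.T @ (h - y)` computes all components $\theta_j$ at once rather than looping over $j$, which is the standard way to implement the per-coordinate rule above.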