Logistic Regression

元气少女wuqh

Article link: https://blog.csdn.net/tsinghuahui/article/details/80375043

Also known as Logit Regression, it is commonly used to estimate the probability that a sample belongs to a given class.

1. Estimating probabilities

[Eq. 1] Probability estimated by the Logistic Regression model (vectorized form)

$\hat{p} = h_{\theta}(\mathbf{x}) = \sigma(\theta^T\mathbf{x})$

where

$\sigma(t) = \frac{1}{1+\exp(-t)}$

Once the Logistic Regression model has estimated the probability, the class prediction for the current sample follows directly:

[Eq. 2] Logistic Regression model prediction

$\hat{y}= \begin{cases} 0 & \text{if } \hat{p} < 0.5 \\ 1 & \text{if } \hat{p} \ge 0.5 \end{cases}$

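[Note]

Because $\sigma(t)<0.5$ exactly when $t<0$, and $\sigma(t)\ge 0.5$ exactly when $t\ge 0$, Logistic Regression does not need to evaluate $\sigma(\cdot)$ when classifying: it predicts $1$ when $\theta^T\mathbf{x}\ge 0$ and $0$ otherwise.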
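The note above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names (`sigmoid`, `score`, `predict_proba`, `predict`), the parameter values, and the samples are all made up for the example, and each sample is assumed to carry a leading constant 1 for the bias term. Plain Python is used to keep the sketch dependency-free; a NumPy version would vectorize the same arithmetic.

```python
import math

def sigmoid(t):
    """Logistic function sigma(t) = 1 / (1 + exp(-t)).

    Not hardened against overflow for very large negative t.
    """
    return 1.0 / (1.0 + math.exp(-t))

def score(theta, x):
    """Linear score theta^T x; x[0] is assumed to be the constant 1 (bias)."""
    return sum(t_j * x_j for t_j, x_j in zip(theta, x))

def predict_proba(theta, x):
    """Estimated probability p_hat = sigma(theta^T x)."""
    return sigmoid(score(theta, x))

def predict(theta, x):
    """Class prediction from the sign of the score, without calling sigmoid."""
    return 1 if score(theta, x) >= 0 else 0

# Hypothetical parameters and samples, chosen only for illustration.
theta = [-1.0, 2.0]
samples = [[1.0, 0.0], [1.0, 0.5], [1.0, 2.0]]

labels = [predict(theta, x) for x in samples]
probabilities = [predict_proba(theta, x) for x in samples]
```

Thresholding `probabilities` at 0.5 gives the same labels as `predict`, which is why the sigmoid can be skipped at prediction time when only the class is needed.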

2. Model training and cost function

1) Intuitively:

[Eq. 3] Cost function for a single training sample

$c(\theta) = \begin{cases} -\log(\hat{p}) & \text{if } y=1 \\ -\log(1-\hat{p}) & \text{if } y=0 \end{cases}$

The cost is close to 0 when the model assigns a high probability to the true class, and it grows without bound as that probability approaches 0.

The cost function over the whole training set is the average of the per-sample costs:

[Eq. 4] Logistic Regression cost function (log loss)

$J(\theta) = -\frac{1}{m}\sum_{i=1}^m \left[ y^{(i)} \log\left(\hat{p}^{(i)}\right) + \left(1-y^{(i)}\right)\log\left(1-\hat{p}^{(i)}\right) \right]$
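As a rough illustration of the two formulas above, the sketch below computes the per-sample cost and the averaged log loss in plain Python. The function names and the probability and label values are hypothetical, and every estimated probability is assumed to lie strictly between 0 and 1 so that the logarithms are defined.

```python
import math

def single_cost(p_hat, y):
    """Per-sample cost: -log(p_hat) if y == 1, else -log(1 - p_hat)."""
    return -math.log(p_hat) if y == 1 else -math.log(1.0 - p_hat)

def log_loss(p_hats, ys):
    """Log loss: the average of the per-sample costs over m samples."""
    m = len(ys)
    total = sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for p, y in zip(p_hats, ys)
    )
    return -total / m

# Hypothetical estimated probabilities and true labels.
p_hats = [0.9, 0.2, 0.6]
ys = [1, 0, 1]

J = log_loss(p_hats, ys)
mean_of_single_costs = sum(single_cost(p, y) for p, y in zip(p_hats, ys)) / len(ys)
```

`J` and `mean_of_single_costs` agree up to floating-point rounding, since the single-line sum is just the case split written compactly: for each sample only one of the two terms is non-zero.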

2) From the viewpoint of probability and maximum likelihood:

The value of $y$ (0 or 1) can be modeled with Eq. 5:

[Eq. 5] Probability of the class label $y$ in Logistic Regression

$\Pr(y|x;\theta)=h_{\theta}(x)^y(1-h_{\theta}(x))^{(1-y)}$

Assuming all observed samples are independent, the likelihood function is:

$L(\theta|x) = \Pr(Y|X;\theta) = \prod_i \Pr(y_i|x_i;\theta) = \prod_i h_{\theta}(x_i)^{y_i}(1-h_{\theta}(x_i))^{(1-y_i)}$

Taking the logarithm of both sides gives the log likelihood (normalized by $\frac{1}{m}$):

$\frac{1}{m}\log L(\theta|x) = \frac{1}{m}\log \Pr(Y|X;\theta)$

The maximum likelihood estimation problem can then be formulated as:

$\max_{\theta}\ \frac{1}{m}\log L(\theta|x) = \max_{\theta}\ \frac{1}{m}\log \Pr(Y|X;\theta)$

which, writing $(x_i, y_i)$ as $(\mathbf{x}^{(i)}, y^{(i)})$, is equivalent to

$\min_{\theta}\ -\frac{1}{m}\sum_{i=1}^m \left[ y^{(i)} \log\left(h_{\theta}(\mathbf{x}^{(i)})\right) + \left(1-y^{(i)}\right)\log\left(1-h_{\theta}(\mathbf{x}^{(i)})\right) \right]$

Since $\hat{p}^{(i)} = h_{\theta}(\mathbf{x}^{(i)})$, this is exactly $\min\ J(\theta)$, the same form as Eq. 4.
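The step from the product form of the likelihood to the sum inside the cost function is only implied by the text. A short worked version, using the same symbols as the likelihood function and assuming nothing beyond $\log(ab) = \log a + \log b$ and $\log(a^k) = k\log a$, could read:

```latex
\begin{aligned}
\frac{1}{m}\log L(\theta|x)
  &= \frac{1}{m}\log \prod_{i=1}^{m} h_{\theta}(x_i)^{y_i}\,\bigl(1-h_{\theta}(x_i)\bigr)^{(1-y_i)} \\
  &= \frac{1}{m}\sum_{i=1}^{m} \Bigl[ y_i \log h_{\theta}(x_i) + (1-y_i)\log\bigl(1-h_{\theta}(x_i)\bigr) \Bigr] \\
  &= -J(\theta)
\end{aligned}
```

Maximizing the normalized log likelihood is therefore the same problem as minimizing $J(\theta)$; the factor $\frac{1}{m}$ rescales the objective without moving the optimum.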

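As noted in the linear regression section, a linear regression problem can generally be solved in two ways: 1) with a closed-form solution, or 2) with an iterative algorithm. Unfortunately, Logistic Regression has no known closed-form solution. However, because its cost function is convex, GD or another optimization algorithm can find the global optimum (given a suitable learning rate and enough iterations):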

[Eq. 6] Partial derivative of the Logistic cost function with respect to the $j$-th parameter

$\frac{\partial}{\partial \theta_j}J(\theta) = \frac{1}{m}\sum_{i=1}^m\left( \sigma\left( \theta^T\mathbf{x}^{(i)} \right) - y^{(i)}\right)x_{j}^{(i)}$

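After computing the partial derivative in Eq. 6 for every parameter, assemble them into the gradient vector, which batch GD then uses to update $\theta$.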
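A minimal batch GD sketch along these lines is shown below, assuming plain Python lists, a constant learning rate, and a fixed number of iterations. The names `gradient`, `log_loss`, and `batch_gd`, the hyperparameters `eta` and `n_iters`, and the toy dataset are all illustrative choices, not taken from the original article. Each sample is assumed to start with a constant 1 for the bias term.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def dot(theta, x):
    return sum(t_j * x_j for t_j, x_j in zip(theta, x))

def gradient(theta, X, y):
    """Gradient vector: entry j is (1/m) * sum_i (sigma(theta^T x_i) - y_i) * x_ij."""
    m = len(X)
    grad = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        error = sigmoid(dot(theta, x_i)) - y_i
        for j, x_ij in enumerate(x_i):
            grad[j] += error * x_ij / m
    return grad

def log_loss(theta, X, y):
    """Log loss J(theta), averaged over all samples."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        p = sigmoid(dot(theta, x_i))
        total += y_i * math.log(p) + (1 - y_i) * math.log(1.0 - p)
    return -total / len(X)

def batch_gd(X, y, eta=0.1, n_iters=1000):
    """Batch GD: every update uses the gradient over the full training set."""
    theta = [0.0] * len(X[0])
    for _ in range(n_iters):
        grad = gradient(theta, X, y)
        theta = [t - eta * g for t, g in zip(theta, grad)]
    return theta

# Hypothetical toy data: a leading 1 for the bias, then one feature.
# The classes overlap on purpose so that a finite optimum exists.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 3.0], [1.0, 3.5], [1.0, 4.0]]
y = [0, 0, 1, 0, 1, 1]

theta = batch_gd(X, y)
initial_loss = log_loss([0.0, 0.0], X, y)
final_loss = log_loss(theta, X, y)
```

With these settings `final_loss` ends up below `initial_loss`. The learning rate is small enough for this toy data, but other datasets may need a different `eta` or feature scaling.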

Stochastic GD uses only a single sample to compute each update; similarly, mini-batch GD uses one mini-batch per update.
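The difference comes down to how many samples feed each gradient estimate. The sketch below treats Stochastic GD as the special case `batch_size=1` of a mini-batch loop. It assumes a constant learning rate and a reshuffle at the start of every epoch; the names `batch_gradient` and `minibatch_gd`, the hyperparameters, the seed, and the toy data are all hypothetical.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def batch_gradient(theta, batch):
    """Gradient of the log loss averaged over only the samples in `batch`."""
    grad = [0.0] * len(theta)
    for x_i, y_i in batch:
        score = sum(t_j * x_j for t_j, x_j in zip(theta, x_i))
        error = sigmoid(score) - y_i
        for j, x_ij in enumerate(x_i):
            grad[j] += error * x_ij / len(batch)
    return grad

def minibatch_gd(X, y, batch_size, eta=0.1, n_epochs=200, seed=0):
    """Mini-batch GD; batch_size=1 gives Stochastic GD, batch_size=len(X) gives batch GD."""
    rng = random.Random(seed)          # fixed seed so runs are repeatable
    data = list(zip(X, y))
    theta = [0.0] * len(X[0])
    for _ in range(n_epochs):
        rng.shuffle(data)              # new sample order every epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad = batch_gradient(theta, batch)
            theta = [t - eta * g for t, g in zip(theta, grad)]
    return theta

# Hypothetical toy data: a leading 1 for the bias, then one feature.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 3.0], [1.0, 3.5], [1.0, 4.0]]
y = [0, 0, 1, 0, 1, 1]

theta_sgd = minibatch_gd(X, y, batch_size=1)   # one sample per update
theta_mb = minibatch_gd(X, y, batch_size=2)    # two samples per update
```

Smaller batches make each update cheaper but noisier, so with a constant learning rate the parameters keep fluctuating around the optimum rather than settling exactly on it.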
