Logistic Regression
1. Problems of Linear Regression When Applied to a Classification Problem
1) $h_{\theta}(x)$ may fall outside the range $[0, 1]$
2) a few unusual (outlier) feature values can shift the fitted line and break the classification threshold
2. Logistic Regression Model
1) $h_{\theta}(x)=g(\theta^{T}x) = P(y=1 \mid x ; \theta)$
where $g(z)=\frac{1}{1+e^{-z}}$ is called the Sigmoid Function / Logistic Function
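A minimal NumPy sketch of the hypothesis (the helper names `sigmoid` and `hypothesis` are illustrative, not from the notes):

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^{-z}); maps any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    # h_theta(x) = g(theta^T x), read as P(y = 1 | x; theta)
    return sigmoid(x @ theta)
```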
3. Decision Boundary
y=1 → $h_{\theta}(x)>0.5$ → $\theta^{T}x>0$
y=0 → $h_{\theta}(x)<0.5$ → $\theta^{T}x<0$
decision boundary: $h_{\theta}(x)=0.5$ → $\theta^{T}x=0$ (may be nonlinear)
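Since the predicted label depends only on the sign of $\theta^{T}x$, a predictor never needs to evaluate the sigmoid; a small sketch on top of the setup above (the name `predict` is assumed):

```python
def predict(theta, X):
    # theta^T x >= 0  <=>  h_theta(x) >= 0.5  =>  predict y = 1
    return (X @ theta >= 0).astype(int)
```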
4. Cost Function
$Cost(h(x),y)=\begin{cases} -\log(h(x)), & y=1\\ -\log(1-h(x)), & y=0 \end{cases} = -y\log(h(x))-(1-y)\log(1-h(x))$
$J(\theta)=\frac{1}{m}\sum_{i=1}^{m}Cost(h(x^{(i)}),y^{(i)}) = \frac{1}{m}\left(-y^{T}\log(h)-(1-y)^{T}\log(1-h)\right)$
where $h=g(X\theta)$ (the vectorized form already sums over the examples, so the explicit $\sum_{i}$ drops out)
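A sketch of the vectorized cost, assuming `X` already includes the column of ones and reusing the `sigmoid` helper above:

```python
def cost(theta, X, y):
    # J(theta) = (1/m) * ( -y^T log(h) - (1-y)^T log(1-h) ), with h = g(X theta)
    m = y.shape[0]
    h = sigmoid(X @ theta)
    return (-(y @ np.log(h)) - ((1 - y) @ np.log(1 - h))) / m
```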
5. Iteration Formula
$\theta_{j}:=\theta_{j}-\alpha\frac{1}{m}\sum_{i=1}^{m}(h(x^{(i)}) - y^{(i)})\,x_j^{(i)}$
vectorized formula:
$\theta:=\theta-\alpha\frac{1}{m}X^{T}(g(X\theta)-y)$
(identical in form to the linear regression update, except that $h$ uses the sigmoid)
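The vectorized update translates directly into a loop; a sketch (the learning rate and iteration count are arbitrary placeholders):

```python
def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    # theta := theta - alpha * (1/m) * X^T (g(X theta) - y)
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        theta -= (alpha / m) * (X.T @ (sigmoid(X @ theta) - y))
    return theta
```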
6. Some Optimization Algorithms
Conjugate Gradient / BFGS / L-BFGS
No need to pick $\alpha$ manually and they usually converge faster, but they are more complex
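As one possible way to use such a solver, SciPy's `minimize` can run BFGS given the cost and its gradient; this is only an illustration (not the course's Octave `fminunc` call) and reuses the `cost` and `sigmoid` sketches above:

```python
from scipy.optimize import minimize

def gradient(theta, X, y):
    # dJ/dtheta = (1/m) * X^T (g(X theta) - y)
    m = y.shape[0]
    return X.T @ (sigmoid(X @ theta) - y) / m

def fit_bfgs(X, y):
    # BFGS chooses its own step sizes, so there is no alpha to tune
    res = minimize(cost, np.zeros(X.shape[1]), args=(X, y), jac=gradient, method="BFGS")
    return res.x
```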
7. Multiclass Classification: one-vs-all
Train a classifier $h_{\theta}^{(i)}(x)$ for every class $i$.
When predicting, use $\max_{i}(h_{\theta}^{(i)}(x))$, i.e. choose the class whose classifier outputs the highest probability.
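A one-vs-all sketch built on the earlier `gradient_descent` helper, assuming the classes are labeled $0,\dots,K-1$:

```python
def one_vs_all(X, y, num_classes):
    # Column i holds the parameters of the binary classifier "class i vs the rest"
    return np.column_stack([gradient_descent(X, (y == i).astype(float))
                            for i in range(num_classes)])

def predict_one_vs_all(all_theta, X):
    # Pick the class whose classifier reports the highest probability
    return np.argmax(sigmoid(X @ all_theta), axis=1)
```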
8. Overfitting Problems
underfit: high bias, usually too few features
overfit: high variance, too many features; fits the training set well but fails to predict new examples
Two solutions:
- Reduce the number of features
- Regularization: keep all features but shrink the magnitudes of the parameters $\theta_j$
9. Regularization
Adding $\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_{j}^{2}$ to $J(\theta)$.
Note that it does not contain $\theta_{0}$!
$\lambda$: regularization parameter, making $\theta$ small
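For logistic regression the penalty is simply added to the cost from section 4; a sketch reusing the `cost` helper (the slice `theta[1:]` is what skips $\theta_0$):

```python
def cost_regularized(theta, X, y, lam):
    # J(theta) + lambda/(2m) * sum_{j>=1} theta_j^2   (theta_0 is not penalized)
    m = y.shape[0]
    return cost(theta, X, y) + lam / (2 * m) * np.sum(theta[1:] ** 2)
```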
10. Regularized Linear Regression
(1) Cost Function and Gradient Descent
$J(\theta) = \frac{1}{2m}\left(\sum_{i=1}^{m}(h(x^{(i)})-y^{(i)})^{2}+\lambda\sum_{j=1}^{n}\theta_{j}^{2}\right)$
Note that it does not contain $\theta_{0}$!
$\theta_{j}:=\theta_{j}-\alpha\left(\frac{1}{m}\sum_{i=1}^{m}(h(x^{(i)}) - y^{(i)})\,x_j^{(i)}+\frac{\lambda}{m}\theta_{j}\right)$ for $j\neq 0$
which is also
$\theta_{j}:=\theta_{j}(1-\alpha\frac{\lambda}{m})-\alpha\frac{1}{m}\sum_{i=1}^{m}(h(x^{(i)}) - y^{(i)})\,x_j^{(i)}$ for $j\neq 0$
($\theta_{0}$ keeps the ordinary, unregularized update)
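A sketch of one regularized update for linear regression (here $h(x)=\theta^{T}x$, so no sigmoid; `gd_step_regularized` is an assumed helper name):

```python
def gd_step_regularized(theta, X, y, alpha, lam):
    # grad_j = (1/m) * sum_i (h(x_i) - y_i) * x_ij, plus (lambda/m) * theta_j for j >= 1
    m = y.shape[0]
    grad = X.T @ (X @ theta - y) / m
    grad[1:] += (lam / m) * theta[1:]   # theta_0 is left unregularized
    return theta - alpha * grad
```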
(2) Normal Equation
$\theta=(X^{T}X+\lambda\,\mathrm{diag}(0,1,1,\dots,1,1))^{-1}X^{T}y$ where the size of $\mathrm{diag}(\cdot)$ is $(n+1)\times(n+1)$
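A direct translation of the regularized normal equation, using `np.linalg.solve` instead of an explicit inverse (a common numerical choice, not something the notes prescribe):

```python
def normal_equation_regularized(X, y, lam):
    # theta = (X^T X + lambda * diag(0, 1, ..., 1))^{-1} X^T y
    L = np.eye(X.shape[1])   # (n+1) x (n+1) identity
    L[0, 0] = 0.0            # do not regularize theta_0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```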