Logistic Regression

Latest recommended article published on 2022-09-14 13:08:28

奋斗啊哈

Article link: https://blog.csdn.net/foolsnowman/article/details/51399504


Classification

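Logistic regression uses the logistic function to model the probability of the output class given the features. A plain linear model cannot output this probability directly, because its output does not satisfy the properties of a probability (for example, lying between 0 and 1). Writing $f$ for the feature vector of $x$ and $w$ for the weight vector, the following does not work:

$P(y=true|x)=w\cdot f$

A linear output can, however, be used to represent the odds of the two classes: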

$\frac {P(y=true|x)}{P(y=false|x)}=\frac {P(y=true|x)}{1-P(y=true|x)}=w\cdot f$

The two sides of this equation still have different ranges: the odds lie in $[0,\infty)$, while $w\cdot f$ can be any real number. Taking the natural logarithm of the odds resolves this:

$\ln\frac {P(y=true|x)}{1-P(y=true|x)}=w\cdot f$

Solving this equation for the probabilities gives

$P(y=true|x)=\frac {e^{w\cdot f}}{1+e^{w\cdot f}}=\frac {1}{1+e^{-w\cdot f}}$

$P(y=false|x)=\frac {1}{1+e^{w\cdot f}}$
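The algebra behind this step can be spelled out as follows. The abbreviations $p$ for $P(y=true|x)$ and $z$ for $w\cdot f$ are introduced here only for brevity and are not part of the original notation.

```latex
\begin{align*}
\ln\frac{p}{1-p} &= z \\
\frac{p}{1-p} &= e^{z} \\
p &= e^{z}\,(1-p) \\
p\,(1+e^{z}) &= e^{z} \\
p &= \frac{e^{z}}{1+e^{z}} = \frac{1}{1+e^{-z}},
\qquad 1-p = \frac{1}{1+e^{z}}
\end{align*}
```

The last equality for $p$ follows from dividing numerator and denominator by $e^{z}$.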

$\frac {1}{1+e^{-w\cdot f}}$ is the logistic function, whose general form is $\frac {1}{1+e^{-x}}$.
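As a quick numerical check, the sketch below evaluates the logistic function at a few points and confirms that it undoes the log-odds transform. The helper names `logistic` and `log_odds` are illustrative and not from the original post; `plogis()` and `qlogis()` are base R's built-in versions of the same pair.

```r
# Logistic function: maps any real number into (0, 1)
logistic <- function(x) 1 / (1 + exp(-x))

# Log-odds (logit): maps a probability in (0, 1) back to the real line
log_odds <- function(p) log(p / (1 - p))

z <- c(-5, -1, 0, 1, 5)   # arbitrary linear scores w . f
p <- logistic(z)
print(round(p, 4))

# The two closed forms of P(y = true | x) agree
print(all.equal(p, exp(z) / (1 + exp(z))))

# Taking the log-odds of the output recovers the linear score
print(all.equal(log_odds(p), z))

# Base R provides the same functions as plogis() and qlogis()
print(all.equal(p, plogis(z)))
print(all.equal(log_odds(p), qlogis(p)))
```

Every output stays strictly between 0 and 1 for these inputs, which is the property the plain linear model lacked.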
A logistic regression model performs classification (also called inference) by comparing the conditional probabilities of the classes. For example, it predicts $y=true$ when

$\frac {P(y=true|x)}{P(y=false|x)}\gt 1$

that is, when

$e^{w\cdot f} \gt 1$

which is equivalent to

$w\cdot f \gt 0$

The boundary $w\cdot f = 0$ is a hyperplane in the $|w|$-dimensional feature space, so logistic regression recasts discrimination by a hyperplane as a comparison of conditional probabilities.
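The equivalence of the three conditions can be checked numerically. The weights and feature vectors below are made-up values chosen for illustration, and the variable names are hypothetical.

```r
# Hypothetical weights (first entry acts as the intercept) and feature vectors
w <- c(-1, 2, 0.5)
feats <- rbind(c(1, 0.2, 0.4),   # w . f = -0.4
               c(1, 1.0, 0.0),   # w . f =  1.0
               c(1, 0.0, 3.0))   # w . f =  0.5

score  <- as.vector(feats %*% w)      # linear score w . f
p_true <- 1 / (1 + exp(-score))       # P(y = true | x)
odds   <- p_true / (1 - p_true)       # equals exp(w . f)

# Three equivalent ways of deciding y = true
by_odds  <- odds > 1
by_prob  <- p_true > 0.5
by_score <- score > 0

print(data.frame(score, p_true, odds, by_odds, by_prob, by_score))
```

In practice only the sign of `score` needs to be computed; the probability matters when a confidence value is wanted alongside the label.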

Objective function

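The parameters of a linear discriminant model are learned by minimizing the sum of squared errors on the training set, whereas the parameters of a logistic model are learned by maximizing the conditional probability of the training labels, that is,

$w^*=\mathop{ argmax }_{w}\ \displaystyle \prod_{i}P(y_i|x_i)$

which is equivalent to

$w^*=\mathop{ argmax }_{w}\ \displaystyle \sum_{i}\log P(y_i|x_i)$

Methods for finding this maximum include quasi-Newton methods and gradient methods (gradient ascent on the log-likelihood, or equivalently gradient descent on its negative).
Note: logistic regression as described here handles binary classification, while the MEMM model introduced earlier is a multinomial logistic model and can be used for K-class problems.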
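A minimal sketch of the gradient approach, assuming a synthetic data set with one feature plus an intercept, is shown below. The data, the fixed step size and the iteration count are all assumptions picked for illustration. The gradient used is $\sum_i (y_i - P(y_i=true|x_i))\,x_i$, which follows from differentiating the log-likelihood above.

```r
set.seed(1)
# Synthetic data (assumed for illustration): intercept plus one feature
n <- 200
x <- rnorm(n)
X <- cbind(1, x)
y <- rbinom(n, 1, 1 / (1 + exp(-(0.5 + 1.5 * x))))

sigmoid <- function(z) 1 / (1 + exp(-z))

# Log-likelihood: sum_i log P(y_i | x_i)
log_lik <- function(w) {
  p <- sigmoid(X %*% w)
  sum(y * log(p) + (1 - y) * log(1 - p))
}

# Gradient of the log-likelihood: sum_i (y_i - P(y_i = true | x_i)) * x_i
grad <- function(w) as.vector(t(X) %*% (y - sigmoid(X %*% w)))

w <- c(0, 0)
step <- 0.5 / n                 # fixed step size, chosen by hand
for (iter in 1:2000) {
  w <- w + step * grad(w)       # ascent: move along the gradient
}
print(w)
print(log_lik(w))
```

A fixed step size is the simplest choice rather than the most efficient one; quasi-Newton methods typically need far fewer iterations.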

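Application scenario: given historical data on students' scores in two courses and whether each student was admitted, infer from a new student's two scores whether that student will be admitted. An R implementation of the logistic classifier follows.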

#Load data
data <- read.csv("data.csv")
#Create plot
plot(data$score.1,data$score.2,col=as.factor(data$label),xlab="Score-1",ylab="Score-2")


#Predictor variables
X <- as.matrix(data[,c(1,2)])
#Add ones to X
X <- cbind(rep(1,nrow(X)),X)
#Response variable
Y <- as.matrix(data$label)

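Define $P(y_i=true|\overrightarrow x_i)=\frac 1{1+e^{-\theta\cdot\overrightarrow {x_i}}}$, where $\theta$ is the parameter vector (playing the role of $w$ above) and $\overrightarrow {x_i}$ is the $i$-th row of X, including the leading 1.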

#Sigmoid function
sigmoid <- function(z)
{
g <- 1/(1+exp(-z))
return(g)
}

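The loss function is the negative likelihood, $-\prod P(y_i|\overrightarrow x_i)$; minimizing it is equivalent to minimizing the negative log-likelihood $-\sum \log P(y_i|\overrightarrow x_i)$. Writing $h_i=P(y_i=true|\overrightarrow x_i)$, coding the label as $y_i\in\{0,1\}$, and averaging over the $m$ training examples as the cost function in the code does, this becomes:

$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y_i\log h_i+(1-y_i)\log(1-h_i)\right]$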

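#Cost function: average negative log-likelihood over the m training examples
cost <- function(theta)
{
  m <- nrow(X)
  g <- sigmoid(X %*% theta)   #predicted P(y = true | x) for every row of X
  J <- (1/m) * sum((-Y * log(g)) - ((1 - Y) * log(1 - g)))
  return(J)
}
#Initial theta
initial_theta <- rep(0, ncol(X))
#Cost at initial theta (equals log(2), about 0.693, because every prediction is 0.5)
cost(initial_theta)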

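# Estimate theta by minimizing the cost with optim
# (with no method argument optim uses Nelder-Mead, a derivative-free simplex search, not gradient descent)
theta_optim <- optim(par=initial_theta,fn=cost)
# number of calls made to the cost function
theta_optim$counts
print(theta_optim)
#set theta
theta <- theta_optim$par
theta
#cost at optimal value of the theta
theta_optim$value
# probability of admission for a student with scores 45 and 85
prob <- sigmoid(t(c(1,45,85))%*%theta)
prob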
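The text earlier mentions quasi-Newton methods. The sketch below shows one way this could look with `optim`, passing an analytic gradient and `method = "BFGS"`. Because `data.csv` is not available here, it uses synthetic scores, and it standardizes them so the optimizer sees well-scaled features; both choices, and the name `cost_grad`, are assumptions made for this example.

```r
set.seed(42)
# Synthetic stand-in for data.csv: two scores and a 0/1 label
m <- 100
score1 <- runif(m, 30, 100)
score2 <- runif(m, 30, 100)
Z <- scale(cbind(score1, score2))            # standardized scores
label <- rbinom(m, 1, plogis(0.5 + 1.5 * Z[, 1] + 1.0 * Z[, 2]))
X <- cbind(1, Z)                             # add the column of ones
Y <- label

sigmoid <- function(z) 1 / (1 + exp(-z))

# Same cost as above: average negative log-likelihood
cost <- function(theta) {
  g <- sigmoid(X %*% theta)
  mean(-Y * log(g) - (1 - Y) * log(1 - g))
}

# Analytic gradient of the cost: (1/m) * t(X) %*% (g - Y)
cost_grad <- function(theta) {
  g <- sigmoid(X %*% theta)
  as.vector(t(X) %*% (g - Y)) / m
}

fit <- optim(par = rep(0, ncol(X)), fn = cost, gr = cost_grad,
             method = "BFGS")
print(fit$par)
print(fit$value)
print(fit$counts)        # function and gradient evaluations
print(fit$convergence)   # 0 indicates successful convergence
```

To score a new student with this variant, the same centering and scaling would have to be applied to the new scores before multiplying by `fit$par`.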

Note: logistic regression is a special case of the generalized linear model.
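Because of this, the same kind of model can be fitted with R's built-in `glm()` using the binomial family, which gives a convenient cross-check for a hand-written implementation. The sketch below uses synthetic data in place of `data.csv`; the data frame name `admissions` and the generating coefficients are assumptions, while the column names mirror those used in the plotting code.

```r
set.seed(7)
# Synthetic stand-in for the admission data
n <- 200
admissions <- data.frame(score.1 = runif(n, 30, 100),
                         score.2 = runif(n, 30, 100))
lin <- -8 + 0.06 * admissions$score.1 + 0.06 * admissions$score.2
admissions$label <- rbinom(n, 1, plogis(lin))

# Logistic regression as a GLM: binomial family with the default logit link
fit <- glm(label ~ score.1 + score.2, data = admissions, family = binomial)
print(coef(fit))

# Predicted probability of admission for scores 45 and 85
new_student <- data.frame(score.1 = 45, score.2 = 85)
prob_glm <- predict(fit, newdata = new_student, type = "response")

# The same probability from the coefficients and the logistic function
prob_manual <- plogis(sum(coef(fit) * c(1, 45, 85)))
print(c(glm = unname(prob_glm), manual = prob_manual))
```

`glm()` fits by iteratively reweighted least squares, so its estimates may differ slightly from those of a general-purpose optimizer that stops at a looser tolerance.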
Reference:
http://www.r-bloggers.com/logistic-regression-with-r-step-by-step-implementation-part-2/

Generalized linear models
1.Agresti A. Foundations of linear and generalized linear models[M]. John Wiley & Sons, 2015.
2.Agresti A, Kateri M. Categorical data analysis[M]. Springer Berlin Heidelberg, 2011.
3.Gelman A, Hill J. Data analysis using regression and multilevel/hierarchical models[M]. Cambridge University Press, 2006.