Logistic Regression: Principles and Implementation

Latest recommended article published on 2024-05-28 17:00:30

想要快乐的小张


Category: Machine Learning. Tags: python, logistic regression, machine learning

Article link: https://blog.csdn.net/m0_46480988/article/details/118696884


Machine Learning Basics (7)

Logistic Regression
- Principles
- Code Implementation

Logistic Regression

Principles

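Logistic regression is a classic classification method in statistical learning and belongs to the family of generalized linear models. It is most often used for binary classification, where it tends to perform well.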

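Logistic regression builds on linear regression by passing the linear output through the sigmoid function. The sigmoid maps any real value into the interval $(0, 1)$, so the output can be read as a class probability and the model can be used for classification.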

The logistic regression hypothesis is:
$h_{\theta}(x)=g(\theta^{T}x)=\frac{1}{1+e^{-\theta^{T}x}}$
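
To make the hypothesis concrete, here is a minimal pure-Python sketch of the sigmoid and of $h_{\theta}(x)$. The function names `sigmoid` and `hypothesis` and the sample values are illustrative assumptions, not part of the original post.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + e^(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z)
    # so that exp() never receives a large positive argument.
    ez = math.exp(z)
    return ez / (1.0 + ez)

def hypothesis(theta, x):
    """h_theta(x) = sigmoid(theta^T x) for two equal-length sequences."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

# Illustrative values only: theta^T x = 0.5*2 + (-1)*1 = 0, so h = 0.5.
print(hypothesis([0.5, -1.0], [2.0, 1.0]))
```

Splitting the computation on the sign of `z` is a design choice for numerical safety; the mathematical function is the same in both branches.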

It follows that
$P(Y=1|x)=h_{\theta}(x)\qquad P(Y=0|x)=1-h_{\theta}(x)$
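
The two class probabilities always sum to one, and a class label can be read off by comparing $P(Y=1|x)$ with a threshold. The sketch below assumes the common threshold of 0.5, which the original post does not state; `class_probabilities` and `predict_label` are hypothetical helper names.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def class_probabilities(theta, x):
    """Return {1: P(Y=1|x), 0: P(Y=0|x)} for one sample."""
    p1 = sigmoid(sum(t * xi for t, xi in zip(theta, x)))
    return {1: p1, 0: 1.0 - p1}

def predict_label(theta, x, threshold=0.5):
    """Predict 1 when P(Y=1|x) reaches the (assumed) threshold, else 0."""
    return 1 if class_probabilities(theta, x)[1] >= threshold else 0

theta = [1.0, -2.0]                           # illustrative parameters
probs = class_probabilities(theta, [3.0, 1.0])  # theta^T x = 1
print(probs)
print(predict_label(theta, [3.0, 1.0]))  # theta^T x = 1  -> class 1
print(predict_label(theta, [1.0, 1.0]))  # theta^T x = -1 -> class 0
```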

The likelihood function of the model over $n$ samples is then
$L(\theta)=\prod_{i=1}^{n}(h_{\theta}(x_{i}))^{y_{i}}(1-h_{\theta}(x_{i}))^{1-y_{i}}$

That is, when $y_{i}=1$ the factor for sample $i$ is $h_{\theta}(x_{i})$, and when $y_{i}=0$ the factor is $1-h_{\theta}(x_{i})$.
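
The exponents $y_{i}$ and $1-y_{i}$ act as a selector: exactly one of the two factors survives for each sample. A small sketch with made-up predicted probabilities (the values and the helper names `likelihood_term` and `likelihood` are assumptions for illustration):

```python
def likelihood_term(p, y):
    """One factor of L(theta): p**y * (1 - p)**(1 - y), with y in {0, 1}."""
    return (p ** y) * ((1.0 - p) ** (1 - y))

def likelihood(probs, labels):
    """Product of the per-sample factors."""
    result = 1.0
    for p, y in zip(probs, labels):
        result *= likelihood_term(p, y)
    return result

probs = [0.9, 0.2, 0.7]   # hypothetical values of h_theta(x_i)
labels = [1, 0, 1]        # corresponding y_i

print(likelihood_term(0.9, 1))    # y = 1 keeps p
print(likelihood_term(0.2, 0))    # y = 0 keeps 1 - p
print(likelihood(probs, labels))  # 0.9 * 0.8 * 0.7, about 0.504
```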

Taking the logarithm of the likelihood gives
$\ln(L(\theta))=\sum_{i=1}^{n}\left(y_{i}\ln(h_{\theta}(x_{i}))+(1-y_{i})\ln(1-h_{\theta}(x_{i}))\right)$
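
One practical reason for working with the log-likelihood is that a product of many probabilities underflows in floating point, while a sum of logarithms does not. The sketch below checks that the two forms agree on a small sample and then shows the underflow; all data values are made up for illustration.

```python
import math

def likelihood(probs, labels):
    """Product form of the likelihood."""
    result = 1.0
    for p, y in zip(probs, labels):
        result *= (p ** y) * ((1.0 - p) ** (1 - y))
    return result

def log_likelihood(probs, labels):
    """Sum form: sum of y*ln(p) + (1-y)*ln(1-p)."""
    return sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for p, y in zip(probs, labels)
    )

probs = [0.9, 0.2, 0.7]
labels = [1, 0, 1]
# On a small sample, ln(product) and the sum of logs agree.
print(math.log(likelihood(probs, labels)), log_likelihood(probs, labels))

# 2000 samples, each with predicted probability 0.5 for its true class:
many_probs = [0.5] * 2000
many_labels = [1] * 2000
print(likelihood(many_probs, many_labels))      # underflows to 0.0
print(log_likelihood(many_probs, many_labels))  # 2000 * ln(0.5), about -1386.29
```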

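Maximizing the log-likelihood yields the optimal $\theta$. The same expression can serve as the basis of a loss function for logistic regression; as written, it is a total over all samples. Because the likelihood is maximized at the optimal $\theta$ while a loss function should be minimized, we negate the log-likelihood. We then divide by $n$ to obtain the average loss per sample, which keeps the scale of the loss independent of the dataset size.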
$cost(h_{\theta}(x),y)=-\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}\ln(h_{\theta}(x_{i}))+(1-y_{i})\ln(1-h_{\theta}(x_{i}))\right)$
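
One common way to minimize this average loss is batch gradient descent. The original post does not derive the gradient; the sketch below relies on the standard result $\frac{\partial\, cost}{\partial \theta_{j}}=\frac{1}{n}\sum_{i=1}^{n}(h_{\theta}(x_{i})-y_{i})x_{ij}$. The toy dataset, learning rate, iteration count, and function names are all assumptions chosen for illustration, and the first feature is a constant 1 that plays the role of the intercept.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(theta, x):
    """h_theta(x) for a single sample."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def average_cost(theta, X, y, eps=1e-12):
    """Negative log-likelihood averaged over the samples."""
    total = 0.0
    for xi, yi in zip(X, y):
        # Clamp away from 0 and 1 so log() is always defined.
        p = min(max(predict_proba(theta, xi), eps), 1.0 - eps)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return -total / len(y)

def gradient_descent(X, y, lr=0.1, n_iter=5000):
    """Batch gradient descent starting from theta = 0."""
    n, d = len(X), len(X[0])
    theta = [0.0] * d
    for _ in range(n_iter):
        errors = [predict_proba(theta, xi) - yi for xi, yi in zip(X, y)]
        grad = [sum(e * xi[j] for e, xi in zip(errors, X)) / n for j in range(d)]
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta

# Toy data: column 0 is the constant intercept feature, column 1 is the input.
X = [[1.0, 0.5], [1.0, 1.0], [1.0, 1.5], [1.0, 3.0], [1.0, 3.5], [1.0, 4.0]]
y = [0, 0, 0, 1, 1, 1]

initial_cost = average_cost([0.0, 0.0], X, y)  # ln(2), since every h is 0.5
theta = gradient_descent(X, y)
final_cost = average_cost(theta, X, y)
predictions = [1 if predict_proba(theta, xi) >= 0.5 else 0 for xi in X]
print(initial_cost, final_cost, predictions)
```

In practice a library solver such as the one in scikit-learn is preferable; this loop is only meant to show how the loss and its gradient fit together.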

Code Implementation

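# Import the required modules
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report

# Load the breast cancer dataset
data = load_breast_cancer()
x = pd.DataFrame(data.data, columns=data["feature_names"])
y = data.target
# Split into training and test sets
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=1234)

# Standardize the features (fit the scaler on the training set only)
std = StandardScaler()
x_train = std.fit_transform(x_train)
x_test = std.transform(x_test)

# Build the logistic regression model
# L2 regularization is used by default; C is the inverse of the regularization strength
LR = LogisticRegression(C=60)
LR.fit(x_train, y_train)
# Check the accuracy on the test set
print(LR.score(x_test, y_test))
y_pre = LR.predict(x_test)
# Check precision, recall and F1 (true labels first, then predictions)
print(classification_report(y_test, y_pre, target_names=data.target_names))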
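
To connect the fitted model back to the formula for $h_{\theta}(x)$, the following sketch recomputes the predicted probabilities by hand from the learned weights and compares them with `predict_proba`. It repeats the data preparation so that it runs on its own. The `max_iter=1000` setting and the variable names are assumptions added here (a larger iteration cap simply gives the solver more room to converge); they are not part of the original code.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
x_train, x_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=1234
)
std = StandardScaler()
x_train = std.fit_transform(x_train)
x_test = std.transform(x_test)

model = LogisticRegression(C=60, max_iter=1000)
model.fit(x_train, y_train)

# theta^T x for every test sample: weights live in coef_, the bias in intercept_.
z = x_test @ model.coef_.ravel() + model.intercept_[0]
manual_proba = 1.0 / (1.0 + np.exp(-z))  # sigmoid applied by hand

# Column 1 of predict_proba is P(Y=1|x) for this binary problem.
sk_proba = model.predict_proba(x_test)[:, 1]
print(np.allclose(manual_proba, sk_proba))
```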
