Linear Regression and Logistic Regression in Python

章逸佳

Article link: https://blog.csdn.net/weixin_43161647/article/details/93600067

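This article introduces the linear regression and logistic regression models commonly used for machine learning in Python. It covers LinearRegression and LogisticRegression from the sklearn library and shows how to train them on the built-in datasets. It also mentions GridSearchCV for parameter tuning and briefly describes the evaluation metrics in the sklearn.metrics module.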

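1. Built-in datasets in Python
sklearn provides several kinds of datasets:
Packaged small datasets (packaged dataset): sklearn.datasets.load_*
Datasets downloaded on demand (downloaded dataset): sklearn.datasets.fetch_*
Computer-generated datasets (generated dataset): sklearn.datasets.make_*
The most commonly used packaged datasets are:
(1) Iris dataset: load_iris(), for classification tasks
(2) Handwritten digits dataset: load_digits(), for classification or dimensionality reduction tasks
(3) Breast cancer dataset: load_breast_cancer(), a simple, classic dataset for binary classification
(4) Diabetes dataset: load_diabetes(), a classic dataset for regression tasks. Note that each of its 10 features has already been mean-centered and scaled so that the sum of squares of each column equals 1.
(5) Boston housing dataset: load_boston(), a classic dataset for regression tasks. Note that load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so it is only available in older versions.
(6) Linnerud physical exercise dataset: load_linnerud(), a classic dataset for multivariate regression tasks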
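As a rough illustration of two of the loader families listed above, the sketch below calls one load_ function and one make_ function. The sample count, feature count, noise level, and seed passed to make_regression are arbitrary values chosen for this example, not values from the article. The fetch_ family is left out because it needs a network connection.

```python
from sklearn.datasets import load_iris, make_regression

# Packaged dataset: load_* returns a Bunch with .data, .target and metadata
iris = load_iris()
print(iris.data.shape)      # (150, 4)
print(iris.target_names)    # ['setosa' 'versicolor' 'virginica']

# Generated dataset: make_* builds synthetic data on the fly.
# n_samples, n_features, noise and random_state are illustrative choices.
X_gen, y_gen = make_regression(
    n_samples=100, n_features=5, noise=10.0, random_state=0
)
print(X_gen.shape, y_gen.shape)  # (100, 5) (100,)
```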
How to load a dataset (using the diabetes dataset from the packaged datasets as an example):

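from sklearn.datasets import load_diabetes  # for other datasets, only the loader name changes
diabetes = load_diabetes()
y = diabetes.target  # dependent variable (target)
X = diabetes.data  # independent variables (features)
# The features in this dataset are already mean-centered and scaled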
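The note that the diabetes features are already scaled can be checked directly. The following is a small sketch, assuming the default load_diabetes() output; the variable names col_means and col_sq_sums are introduced here for illustration only. It uses return_X_y=True, which returns the features and target as a tuple instead of a Bunch.

```python
import numpy as np
from sklearn.datasets import load_diabetes

# return_X_y=True gives (data, target) directly
X, y = load_diabetes(return_X_y=True)
print(X.shape, y.shape)  # (442, 10) (442,)

# Each feature column should have mean ~0 ...
col_means = X.mean(axis=0)
print(np.allclose(col_means, 0.0, atol=1e-8))

# ... and a sum of squares of ~1 (unit l2-norm, not unit variance)
col_sq_sums = (X ** 2).sum(axis=0)
print(np.allclose(col_sq_sums, 1.0, atol=1e-6))
```

Because the columns have unit l2-norm rather than unit variance, their standard deviations are much smaller than 1, which is worth keeping in mind when reading the fitted coefficients.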

2. Linear regression
The LinearRegression class in sklearn.linear_model implements linear regression. A linear model is fitted as follows.
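A minimal sketch of such a fit, assuming the diabetes dataset from section 1 as input and scoring on the training data only (no train/test split, which a real evaluation would want). The second model, which standardizes the features with StandardScaler inside a pipeline, is one possible way to scale inputs and is an addition for illustration, not the article's own code.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_diabetes(return_X_y=True)

# Ordinary least squares with an intercept (fit_intercept=True is the default)
model = LinearRegression(fit_intercept=True)
model.fit(X, y)
print(model.coef_.shape)   # (10,) -> one coefficient per feature
print(model.intercept_)    # fitted intercept
r2 = model.score(X, y)     # R^2, computed here on the training data
print(r2)

# Illustrative alternative: standardize the features first, then fit
scaled_model = make_pipeline(StandardScaler(), LinearRegression())
scaled_model.fit(X, y)

# For plain least squares with an intercept, rescaling features changes
# the coefficients but not the fitted predictions
same_predictions = np.allclose(
    model.predict(X), scaled_model.predict(X), atol=1e-6
)
print(same_predictions)
```

Scaling matters more for penalized or gradient-based models; for plain least squares it mainly changes how the coefficients are read.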
Parameter descriptions:
copy_X : boolean, optional, default True. If True, X will be copied; else, it may be overwritten.
fit_intercept : boolean, optional, default True. Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations.
normalize : boolean, optional, default False. Whether to normalize the features before fitting. This is feature normalization, not regularization. The parameter was deprecated in scikit-learn 1.0 and removed in 1.2.
This parameter is ignored when fit_intercept is set to False.
If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm.
If you wish to standardize, please use sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False. (That is, standardize with StandardScaler first and leave normalize off.)
n_jobs