Machine Learning: Logistic Regression

小浩子7号

Article link: https://blog.csdn.net/qq_41782791/article/details/116084465

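Logistic regression takes the value of a linear regression expression as its input and is used to solve binary classification problems.

To turn that linear output into a binary classifier, it is passed through the sigmoid function, which maps any real number to a probability between 0 and 1.

A probability below 0.5 is assigned to class 0, and a probability above 0.5 is assigned to class 1.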
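
As a minimal sketch of the idea above, the snippet below computes a linear score w·x + b, passes it through the sigmoid, and thresholds the result at 0.5. The weights `w`, the bias `b`, and the sample inputs are made-up values for illustration, and the helper names `sigmoid` and `predict_label` are hypothetical rather than part of any library. Sending a probability of exactly 0.5 to class 1 is also an assumed convention, since the text only covers the cases below and above 0.5.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1).

    Plain sketch: not hardened against overflow for very negative z.
    """
    return 1.0 / (1.0 + math.exp(-z))


def predict_label(x, w, b, threshold=0.5):
    """Return (probability of class 1, predicted label) for one sample."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b   # linear-regression expression
    p = sigmoid(z)                                 # squash the score into (0, 1)
    return p, int(p >= threshold)                  # assumed: ties go to class 1


# Illustrative (assumed) parameters and inputs
w = [1.0, -2.0]
b = 0.5
samples = [[2.0, 0.5],   # z = 2.0 - 1.0 + 0.5 = 1.5
           [0.0, 1.0]]   # z = 0.0 - 2.0 + 0.5 = -1.5

results = [predict_label(x, w, b) for x in samples]
for x, (p, label) in zip(samples, results):
    print(x, round(p, 3), label)
# [2.0, 0.5] 0.818 1
# [0.0, 1.0] 0.182 0
```

A positive linear score gives a probability above 0.5 and therefore class 1; a negative score gives class 0.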

2. Binary cancer classification with logistic regression

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report    # precision / recall / F1 report


def logistic():
    """
    Binary classification of breast cancer cases (benign vs. malignant) with logistic regression
    :return: None
    """
    # Column names for the data file (the file has no header row)
    column = ['Sample code number', 'Clump Thickness',
              'Uniformity of Cell Size',
              'Uniformity of Cell Shape',
              'Marginal Adhesion',
              'Single Epithelial Cell Size',
              'Bare Nuclei',
              'Bland Chromatin',
              'Normal Nucleoli',
              'Mitoses',
              'Class:']
    data = pd.read_csv("./breast-cancer-wisconsin.data", names=column)
    print(data)

    # Handle missing values, which the file marks with '?'
    data = data.replace(to_replace='?', value=np.nan)
    data = data.dropna()    # drop every row that contains NaN

    # Split the data: columns 1-9 are the features, column 10 is the label
    x_train, x_test, y_train, y_test = train_test_split(data[column[1:10]], data[column[10]], test_size=0.25)  # the slice excludes index 10
    print(x_train)

    # Standardize: fit the scaler on the training set only,
    # then reuse the same mean and standard deviation on the test set
    std = StandardScaler()
    x_train = std.fit_transform(x_train)
    x_test = std.transform(x_test)

    # Fit the logistic regression model and predict
    LR = LogisticRegression(C=1.0)
    LR.fit(x_train, y_train)
    print(LR.coef_)

    y_predict = LR.predict(x_test)

    print("Accuracy", LR.score(x_test, y_test))

    # Labels in the data: 2 = benign, 4 = malignant
    print("Classification report", classification_report(y_test, y_predict, labels=[2, 4], target_names=['benign', 'malignant']))


if __name__ == "__main__":
    logistic()

Output

   Sample code number  Clump Thickness  ...  Mitoses  Class:
0               1000025                5  ...        1       2
1               1002945                5  ...        1       2
2               1015425                3  ...        1       2
3               1016277                6  ...        1       2
4               1017023                4  ...        1       2
..                  ...              ...  ...      ...     ...
694              776715                3  ...        1       2
695              841769                2  ...        1       2
696              888820                5  ...        2       4
697              897471                4  ...        1       4
698              897471                4  ...        1       4

[699 rows x 11 columns]
     Clump Thickness  Uniformity of Cell Size  ...  Normal Nucleoli  Mitoses
146                3                        4  ...                1        1
426                5                        3  ...                1        1
620                3                        1  ...                1        1
214               10                       10  ...                6        1
336                6                        5  ...                4        1
..               ...                      ...  ...              ...      ...
91                 3                        1  ...                1        1
121                4                        2  ...                1        1
421               10                       10  ...                2        1
311                1                        1  ...                1        1
344                7                        6  ...                5        3

[512 rows x 9 columns]
[[ 1.49940035 -0.03587566  1.04958732  0.50454901  0.58707274  1.36663014
   0.78557885  0.41617397  0.65672664]]
Accuracy 0.9532163742690059
Classification report               precision    recall  f1-score   support

      benign       0.97      0.96      0.97       116
   malignant       0.91      0.95      0.93        55

    accuracy                           0.95       171
   macro avg       0.94      0.95      0.95       171
weighted avg       0.95      0.95      0.95       171

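The metric to focus on here is recall. For the malignant class it is 0.95, which means that out of every 100 patients who actually have a malignant tumor, about 5 are not detected by the model.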
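
To make the recall figure concrete, here is a small sketch that computes recall for the malignant class as TP / (TP + FN). The label lists are made up for illustration (20 malignant cases, one of them missed) and are not taken from the run above. The codes 2 and 4 follow the dataset's labels for benign and malignant.

```python
BENIGN, MALIGNANT = 2, 4

# Hypothetical ground truth: 20 malignant cases followed by 30 benign ones
y_true = [MALIGNANT] * 20 + [BENIGN] * 30

# Hypothetical predictions: identical, except one malignant case is missed
y_pred = list(y_true)
y_pred[0] = BENIGN

# True positives: malignant cases the model caught
tp = sum(1 for t, p in zip(y_true, y_pred) if t == MALIGNANT and p == MALIGNANT)
# False negatives: malignant cases the model predicted as benign
fn = sum(1 for t, p in zip(y_true, y_pred) if t == MALIGNANT and p != MALIGNANT)

recall = tp / (tp + fn)
print(tp, fn, recall)   # 19 1 0.95
```

With scikit-learn available, `recall_score(y_true, y_pred, pos_label=4)` from `sklearn.metrics` should give the same value for these labels.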
