ROC Curve

天下第一小白

Category: Machine Learning Notes. Tags: ROC curve

Article link: https://blog.csdn.net/sinat_36899414/article/details/103299125

Precision and recall come first, along with the confusion-matrix counts they are built from:
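For reference, the ratios used throughout this note can be written in terms of the four confusion-matrix counts. These are the standard definitions, with TP, FP, FN, TN denoting true positives, false positives, false negatives and true negatives:

```latex
\[
\text{precision} = \frac{TP}{TP + FP}, \qquad
\text{recall} = \text{TPR} = \frac{TP}{TP + FN}, \qquad
\text{FPR} = \frac{FP}{FP + TN}
\]
```

Recall and the true positive rate are the same quantity; the ROC curve plots TPR against FPR.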


import numpy as np

# Confusion-matrix counts for binary labels (0 = negative, 1 = positive).
# y_true and y_predict are NumPy arrays of the same length.

def TN(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 0) & (y_predict == 0))

def FP(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 0) & (y_predict == 1))

def FN(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 1) & (y_predict == 0))

def TP(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 1) & (y_predict == 1))

def confusion_matrix(y_true, y_predict):
    # Rows are true classes, columns are predicted classes.
    return np.array([
        [TN(y_true, y_predict), FP(y_true, y_predict)],
        [FN(y_true, y_predict), TP(y_true, y_predict)]
    ])
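To make the layout of the 2x2 matrix concrete, here is a minimal sketch on made-up labels. The arrays `y_true` and `y_pred` are illustrative toy data, not taken from the digits experiment.

```python
import numpy as np

# Toy labels: 1 is the positive class, 0 the negative class (made-up data).
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 0])
y_pred = np.array([0, 1, 0, 0, 1, 1, 0, 1, 1, 0])

tn = int(np.sum((y_true == 0) & (y_pred == 0)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))
tp = int(np.sum((y_true == 1) & (y_pred == 1)))

# Rows are true classes, columns are predicted classes: [[TN, FP], [FN, TP]].
toy_matrix = np.array([[tn, fp], [fn, tp]])
print(toy_matrix)
```

The four counts always add up to the number of samples, which is a quick sanity check on any confusion matrix.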
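# Precision: of the samples predicted positive, the fraction that are truly positive.

def precision_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fp = FP(y_true, y_predict)
    # NumPy integer division by zero yields nan with a warning instead of
    # raising an exception, so check the denominator explicitly.
    if tp + fp == 0:
        return 0.0
    return tp / (tp + fp)

# Recall: of the truly positive samples, the fraction that are predicted positive.

def recall_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fn = FN(y_true, y_predict)
    if tp + fn == 0:
        return 0.0
    return tp / (tp + fn)

# True positive rate: identical to recall.

def TPR(y_true, y_predict):
    return recall_score(y_true, y_predict)

# False positive rate: of the truly negative samples, the fraction predicted positive.

def FPR(y_true, y_predict):
    fp = FP(y_true, y_predict)
    tn = TN(y_true, y_predict)
    if fp + tn == 0:
        return 0.0
    return fp / (fp + tn)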
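One detail worth checking when these ratios are computed from NumPy counts: dividing a NumPy integer by zero does not raise `ZeroDivisionError`. With NumPy's default error settings it returns `nan` and emits a `RuntimeWarning`. The sketch below demonstrates this and shows an explicit guard; `safe_ratio` is a hypothetical helper name introduced only for illustration.

```python
import warnings
import numpy as np

def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is 0."""
    if denominator == 0:
        return 0.0
    return float(numerator / denominator)

zero = np.sum(np.array([False, False]))  # a NumPy integer equal to 0

with warnings.catch_warnings():
    warnings.simplefilter("ignore")      # silence the RuntimeWarning for the demo
    raw = zero / zero                    # no exception is raised here

print(np.isnan(raw))
print(safe_ratio(zero, zero))
```

Returning 0.0 for an empty denominator is a convention, not the only reasonable choice; some code prefers to return `nan` so that the undefined case stays visible.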

Implementing the ROC curve by hand:

import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

digits = datasets.load_digits()
X = digits.data
y = digits.target.copy()
y[digits.target == 9] = 1
y[digits.target != 9] = 0
X_train, X_test,y_train, y_test = train_test_split(X,y,random_state=666)

from sklearn.linear_model import LogisticRegression
log_reg = LogisticRegression()
log_reg.fit(X_train,y_train)
decision_score = log_reg.decision_function(X_test)

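fprs = []
tprs = []
# Sweep the decision threshold across the range of scores in steps of 0.1;
# each threshold yields one (FPR, TPR) point on the curve.
thresholds = np.arange(np.min(decision_score), np.max(decision_score), 0.1)
for threshold in thresholds:
    y_predict = np.array(decision_score >= threshold, dtype='int')
    fprs.append(FPR(y_test, y_predict))
    tprs.append(TPR(y_test, y_predict))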
plt.plot(fprs, tprs)
plt.show()
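The threshold sweep is easier to follow on a handful of scores. The sketch below is a variation on the loop above: instead of a fixed step it uses every distinct score as a threshold, so no change in the predictions is skipped. The function name `roc_points` and the toy data are assumptions for illustration, and the function assumes both classes are present.

```python
import numpy as np

def roc_points(y_true, scores):
    """Return (fprs, tprs), one point per distinct score used as a threshold."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    fprs, tprs = [0.0], [0.0]                     # threshold above every score
    for t in np.sort(np.unique(scores))[::-1]:    # from high to low
        y_pred = scores >= t
        tprs.append(float(np.sum(y_pred & (y_true == 1)) / n_pos))
        fprs.append(float(np.sum(y_pred & (y_true == 0)) / n_neg))
    return fprs, tprs

y_toy = [0, 0, 1, 1]
s_toy = [0.1, 0.4, 0.35, 0.8]
fprs_toy, tprs_toy = roc_points(y_toy, s_toy)
print(list(zip(fprs_toy, tprs_toy)))
```

Lowering the threshold can only turn predictions from negative to positive, so both FPR and TPR are non-decreasing along the sweep and the curve runs from (0, 0) to (1, 1).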

The same curve using roc_curve from sklearn:

from sklearn.metrics import roc_curve
fprs, tprs, thresholds = roc_curve(y_test, decision_score)
plt.plot(fprs, tprs)
plt.show()

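The larger the area under the ROC curve (AUC), the better the model ranks positive samples above negative ones. The area can be computed with roc_auc_score: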

from sklearn.metrics import roc_auc_score
roc_auc_score(y_test,decision_score)
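The area can also be worked out by hand, which helps show what the number means. The sketch below computes it two ways on toy data: with the trapezoid rule over ROC points, and as the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counted as one half. The function names and the toy values are assumptions for illustration; this is not how sklearn implements `roc_auc_score` internally.

```python
import numpy as np

def auc_trapezoid(fprs, tprs):
    """Area under a piecewise-linear curve given points sorted by FPR."""
    fprs = np.asarray(fprs, dtype=float)
    tprs = np.asarray(tprs, dtype=float)
    return float(np.sum(np.diff(fprs) * (tprs[1:] + tprs[:-1]) / 2.0))

def auc_pairwise(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]            # every positive vs every negative
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)

# ROC points for labels [0, 0, 1, 1] and scores [0.1, 0.4, 0.35, 0.8].
roc_fprs = [0.0, 0.0, 0.5, 0.5, 1.0]
roc_tprs = [0.0, 0.5, 0.5, 1.0, 1.0]
area_a = auc_trapezoid(roc_fprs, roc_tprs)
area_b = auc_pairwise([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(area_a, area_b)   # 0.75 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.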

Confusion matrix for a multiclass problem

digits = datasets.load_digits()
X = digits.data
y = digits.target.copy()
X_train, X_test,y_train, y_test = train_test_split(X,y,random_state=666)
log_reg = LogisticRegression()
log_reg.fit(X_train,y_train)
y_predict = log_reg.predict(X_test)


from sklearn.metrics import precision_score
precision_score(y_test, y_predict,average="micro")   # 0.9555555555555556
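With `average="micro"`, true positives and false positives are summed over all classes before the ratio is taken. When every sample has exactly one label, each wrong prediction is one false positive for the predicted class, so micro-averaged precision equals plain accuracy. A minimal NumPy sketch on made-up labels:

```python
import numpy as np

# Made-up three-class labels for illustration.
y_true_mc = np.array([0, 1, 2, 2, 1, 0, 2, 1])
y_pred_mc = np.array([0, 2, 2, 2, 1, 0, 1, 1])

classes = np.unique(np.concatenate([y_true_mc, y_pred_mc]))
# Sum per-class true positives and false positives over all classes.
tp_total = sum(int(np.sum((y_pred_mc == c) & (y_true_mc == c))) for c in classes)
fp_total = sum(int(np.sum((y_pred_mc == c) & (y_true_mc != c))) for c in classes)

micro_precision = tp_total / (tp_total + fp_total)
accuracy = float(np.mean(y_true_mc == y_pred_mc))
print(micro_precision, accuracy)   # 0.75 0.75
```

The other averaging modes, such as `"macro"`, compute precision per class first and can therefore differ from accuracy.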

from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_predict)

cfm = confusion_matrix(y_test, y_predict)
plt.matshow(cfm, cmap=plt.cm.gray)
plt.show()
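In a multiclass confusion matrix, entry (i, j) counts the samples whose true class is i and whose predicted class is j, so the diagonal holds the correct predictions. The sketch below builds such a matrix by hand; `multiclass_confusion` is a hypothetical helper that assumes integer labels in the range 0 to n_classes - 1.

```python
import numpy as np

def multiclass_confusion(y_true, y_pred, n_classes):
    """Count matrix with true classes as rows and predicted classes as columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when an (i, j) pair repeats.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

cm_toy = multiclass_confusion([0, 1, 2, 2, 1, 0, 2, 1],
                              [0, 2, 2, 2, 1, 0, 1, 1], n_classes=3)
print(cm_toy)
```

In the grayscale plot, a bright diagonal therefore means most samples are classified correctly, and any bright off-diagonal cell marks a pair of classes that the model confuses.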

Error matrix:

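# keepdims=True gives shape (n_classes, 1), so each row is divided by its own total.
# A 1-D row_sums would broadcast across columns instead.
row_sums = np.sum(cfm, axis=1, keepdims=True)
err_matrix = cfm / row_sums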
np.fill_diagonal(err_matrix,0)
err_matrix

plt.matshow(err_matrix, cmap=plt.cm.gray)
plt.show()
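The effect of the normalisation is easiest to see on a small matrix with unequal class sizes. The sketch below uses a made-up 3x3 count matrix and assumes that the per-row error rate is what is wanted, that is, each entry divided by the number of samples in its true class.

```python
import numpy as np

# Made-up counts: rows are true classes, columns are predicted classes.
cm_demo = np.array([[8, 2, 0],
                    [1, 3, 0],
                    [0, 5, 5]])

row_totals = cm_demo.sum(axis=1, keepdims=True)   # shape (3, 1)
err_demo = cm_demo / row_totals                   # each row now sums to 1
np.fill_diagonal(err_demo, 0)                     # keep only the mistakes
print(err_demo)
```

Without the normalisation, a large class would look error-prone simply because it has more samples. Zeroing the diagonal leaves only the misclassification rates, so the brightest cell in the plot points at the most frequent confusion.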
