sklearn.metrics User Guide

Latest recommended article published on 2024-08-08 23:17:59

学渣渣渣渣渣

Tags: python, machine learning, deep learning, artificial intelligence, supervised learning

Article link: https://blog.csdn.net/weixin_42468475/article/details/106037298

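Several functions in sklearn.metrics measure a machine learning model's precision, recall, accuracy, ROC curve, and related quantities.
For definitions and explanations of these concepts, see: https://blog.csdn.net/weixin_41770169/article/details/80362646
https://blog.csdn.net/program_developer/article/details/79946787 (this one is very detailed)
Notes:
1. The ROC curve is defined for binary classification and has no direct meaning in a multiclass problem. It is a suitable metric when, in a binary problem, the positive and negative classes are equally important. If you do need ROC curves for a multiclass problem, convert it into several one-vs-rest problems: treat one class as positive and all the others as negative, and draw one ROC curve per class.
2. AUC is the area under the ROC curve and serves as a performance measure for a learner. Its computation takes into account the learner's ability to classify both positive and negative samples, so it still gives a reasonable assessment when the classes are imbalanced. AUC is insensitive to class balance, which is one reason it is commonly used to evaluate learners on imbalanced data. For example, in cancer prediction, suppose samples without cancer are positive and samples with cancer are negative, and negatives make up only about 0.1% of the data. Measured by accuracy, predicting every sample as positive scores 99.9%. Measured by AUC, predicting every sample as positive gives TPR = 1 and FPR = 1, so the learner's AUC is 0.5, which exposes the problem that class imbalance hides from accuracy.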
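
The cancer scenario in note 2 can be reproduced in a few lines. This is a minimal sketch on made-up data (999 positives, 1 negative, chosen only to match the 0.1% figure above); the variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Hypothetical data set: 999 positives (no cancer) and 1 negative (cancer).
y_true = np.array([1] * 999 + [0])

# A degenerate classifier that gives every sample the same score,
# and therefore labels every sample as positive.
constant_scores = np.ones(len(y_true))
constant_labels = np.ones(len(y_true), dtype=int)

acc = accuracy_score(y_true, constant_labels)       # fraction of correct labels
auc_value = roc_auc_score(y_true, constant_scores)  # ranking quality of the scores

print(acc, auc_value)
```

Accuracy rewards the constant prediction because almost every sample is positive, while AUC only credits a model for ranking positives above negatives, which a constant score cannot do.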

Classification metrics

Accuracy

accuracy_score(y_true, y_pred)
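
A short usage sketch for accuracy_score, on made-up labels. The normalize argument is part of the scikit-learn signature; with normalize=False the function returns the number of correct predictions instead of the fraction.

```python
from sklearn.metrics import accuracy_score

# Made-up labels: two of the four predictions are correct.
y_true = [0, 1, 2, 3]
y_pred = [0, 2, 1, 3]

fraction_correct = accuracy_score(y_true, y_pred)
n_correct = accuracy_score(y_true, y_pred, normalize=False)

print(fraction_correct, n_correct)
```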

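AUC (area under the curve)

auc(x, y)

Note: older releases also accepted a reorder argument; it was deprecated in scikit-learn 0.20 and removed in 0.22.

Example: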

>>> import numpy as np
>>> from sklearn import metrics
>>> y = np.array([1, 1, 2, 2])
>>> pred = np.array([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=2)
>>> metrics.auc(fpr, tpr)
0.75

This is the area under the ROC curve; a larger AUC indicates better performance.
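
As a cross-check on the example above, the same value can be obtained in two other ways: with roc_auc_score, which skips the explicit curve, and by counting the fraction of (positive, negative) pairs in which the positive sample gets the higher score. This is a sketch on the same four samples; the pairwise count below ignores ties, which do not occur in this data.

```python
import numpy as np
from sklearn import metrics

y = np.array([1, 1, 2, 2])
pred = np.array([0.1, 0.4, 0.35, 0.8])

# Route 1: build the ROC curve, then integrate it.
fpr, tpr, _ = metrics.roc_curve(y, pred, pos_label=2)
area_from_curve = metrics.auc(fpr, tpr)

# Route 2: roc_auc_score on 0/1 labels (label 2 mapped to 1).
area_direct = metrics.roc_auc_score((y == 2).astype(int), pred)

# Route 3: fraction of (positive, negative) pairs ranked correctly.
pos_scores = pred[y == 2]
neg_scores = pred[y == 1]
area_pairwise = np.mean(pos_scores[:, None] > neg_scores[None, :])

print(area_from_curve, area_direct, area_pairwise)
```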

F1 score

f1_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)

F1 = 2 * (precision * recall) / (precision + recall)
precision = TP / (TP + FP)
recall = TP / (TP + FN)

Precision

precision_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)

precision = TP / (TP + FP)

Recall

recall_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)

recall = TP / (TP + FN)
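
The formulas for precision, recall, and F1 can be checked against the library functions by counting TP, FP, and FN by hand. This is a minimal sketch on made-up binary labels; the counts in the comments follow from those labels.

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# Made-up binary labels, with 1 as the positive class.
y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])

tp = np.sum((y_true == 1) & (y_pred == 1))  # 3 true positives
fp = np.sum((y_true == 0) & (y_pred == 1))  # 1 false positive
fn = np.sum((y_true == 1) & (y_pred == 0))  # 1 false negative

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(precision, precision_score(y_true, y_pred))
print(recall, recall_score(y_true, y_pred))
print(f1, f1_score(y_true, y_pred))
```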

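Precision-recall curve

Computes precision-recall pairs for different probability thresholds; the thresholds are the independent variable. Unlike roc_curve, which returns thresholds in decreasing order with an extra leading entry, precision_recall_curve returns thresholds in increasing order, taken from the observed scores (here probabilities between 0 and 1). The precision and recall arrays have one more element than thresholds: the final pair is precision = 1, recall = 0 and has no corresponding threshold.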

sklearn.metrics.precision_recall_curve(y_true, probas_pred, pos_label=None, sample_weight=None)

Example:

>>> import numpy as np
>>> from sklearn.metrics import precision_recall_curve
>>> y_true = np.array([0, 0, 1, 1])
>>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> precision, recall, thresholds = precision_recall_curve(
...     y_true, y_scores)
>>> precision
array([0.66666667, 0.5       , 1.        , 1.        ])
>>> recall
array([1. , 0.5, 0.5, 0. ])
>>> thresholds
array([0.35, 0.4 , 0.8 ])
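
The shape of the returned arrays is easy to misread, so the sketch below checks the structural properties on the same data and summarizes the curve with average_precision_score. The exact number of points returned depends on the scikit-learn version, which is why only version-independent properties are checked here.

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

# precision and recall carry one extra trailing point (precision=1, recall=0)
# that has no threshold of its own.
has_extra_point = len(precision) == len(thresholds) + 1

# Thresholds are increasing and are drawn from the observed scores.
thresholds_increasing = bool(np.all(np.diff(thresholds) > 0))
thresholds_from_scores = bool(np.all(np.isin(thresholds, y_scores)))

# Average precision: recall-weighted mean of precision along the curve.
ap = average_precision_score(y_true, y_scores)

print(has_extra_point, thresholds_increasing, thresholds_from_scores, ap)
```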

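ROC curve

roc_curve(y_true, y_score, pos_label=None, sample_weight=None, drop_intermediate=True)

Parameters:
y_score: array, shape = [n_samples]
Target scores: probability estimates of the positive class or confidence values. In short, the column of model.predict_proba() that corresponds to the positive class.
pos_label: int or str, the label of the positive class. The default is None, in which case 1 is treated as positive when the labels are {0, 1} or {-1, 1} (in multiclass tasks ROC can only be computed one-vs-rest).
sample_weight: optional per-sample weights.
drop_intermediate: boolean, optional (default=True). Whether to drop suboptimal thresholds that would not appear on a plotted ROC curve. This helps produce lighter ROC curves.

Returns:
fpr: array, shape = [>2]. Increasing false positive rates; element i is the false positive rate of predictions with score >= thresholds[i].
tpr: array, shape = [>2]. Increasing true positive rates; element i is the true positive rate of predictions with score >= thresholds[i].
thresholds: array, shape = [n_thresholds]. Decreasing thresholds on the score, used to compute fpr and tpr.
True positive rate = TPR = TP / (TP + FN)
False positive rate = FPR = FP / (FP + TN)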

Example:

import numpy as np
from sklearn import metrics
y = np.array([1, 1, 0, 0])  # ground-truth labels
scores = np.array([0.1, 0.4, 0.35, 0.8])  # predicted probability of the positive class

fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=1)  # pos_label=1 makes y == 1 the positive class
print(fpr, tpr, thresholds)

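Output:

[0.  0.5 0.5 1.  1. ] [0.  0.  0.5 0.5 1. ] [1.8  0.8  0.4  0.35 0.1 ]

The first threshold, 1.8, may look odd. It is max(scores) + 1, an artificial entry that stands for "no sample predicted positive" and puts the start of the curve at (0, 0). Recent scikit-learn releases report it as inf instead. It can be ignored.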
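
To see where each (fpr, tpr) pair comes from, the rates can be recomputed by hand for every real threshold. This is a minimal sketch; rates_at is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
import numpy as np
from sklearn import metrics

y = np.array([1, 1, 0, 0])
scores = np.array([0.1, 0.4, 0.35, 0.8])


def rates_at(threshold):
    """Return (FPR, TPR) when samples with score >= threshold are called positive."""
    predicted_positive = scores >= threshold
    tp = np.sum(predicted_positive & (y == 1))
    fp = np.sum(predicted_positive & (y == 0))
    fn = np.sum(~predicted_positive & (y == 1))
    tn = np.sum(~predicted_positive & (y == 0))
    return fp / (fp + tn), tp / (tp + fn)


fpr, tpr, thresholds = metrics.roc_curve(y, scores, pos_label=1)

# Skip thresholds[0], the artificial "predict nothing positive" entry.
manual = [rates_at(t) for t in thresholds[1:]]
for t, (manual_fpr, manual_tpr) in zip(thresholds[1:], manual):
    print(t, manual_fpr, manual_tpr)
```

At threshold 0.4, for instance, the samples scored 0.4 and 0.8 are called positive; one is a true positive and one a false positive, giving TPR = 0.5 and FPR = 0.5.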

Advanced example: use the fpr and tpr computed by roc_curve as the x and y coordinates to plot ROC curves.

import numpy as np
np.random.seed(10)

import matplotlib.pyplot as plt

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import (RandomTreesEmbedding, RandomForestClassifier,
                              GradientBoostingClassifier)
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve
from sklearn.pipeline import make_pipeline

n_estimator = 10
X, y = make_classification(n_samples=80000)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)

# It is important to train the ensemble of trees on a different subset
# of the training data than the logistic regression model to avoid
# overfitting, in particular if the total number of leaves is
# similar to the number of training samples
X_train, X_train_lr, y_train, y_train_lr = train_test_split(
    X_train, y_train, test_size=0.5)

# Unsupervised transformation based on totally random trees
rt = RandomTreesEmbedding(max_depth=3, n_estimators=n_estimator,
                          random_state=0)

rt_lm = LogisticRegression(max_iter=1000)
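pipeline = make_pipeline(rt, rt_lm)  # chain the two steps in sequence: tree embedding, then logistic regression
pipeline.fit(X_train, y_train)
y_pred_rt = pipeline.predict_proba(X_test)[:, 1]  # predicted probability of class 1
fpr_rt_lm, tpr_rt_lm, _ = roc_curve(y_test, y_pred_rt)  # with 0/1 labels, class 1 is treated as positive; returns the x and y coordinates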

# Supervised transformation based on random forests
rf = RandomForestClassifier(max_depth=3, n_estimators=n_estimator)
rf_enc = OneHotEncoder()
rf_lm = LogisticRegression(max_iter=1000)
rf.fit(X_train, y_train)
rf_enc.fit(rf.apply(X_train))
rf_lm.fit(rf_enc.transform(rf.apply(X_train_lr)), y_train_lr)

y_pred_rf_lm = rf_lm.predict_proba(rf_enc.transform(rf.apply(X_test)))[:, 1]
fpr_rf_lm, tpr_rf_lm, _ = roc_curve(y_test, y_pred_rf_lm) 

# Supervised transformation based on gradient boosted trees
grd = GradientBoostingClassifier(n_estimators=n_estimator)
grd_enc = OneHotEncoder()
grd_lm = LogisticRegression(max_iter=1000) 
grd.fit(X_train, y_train)
grd_enc.fit(grd.apply(X_train)[:, :, 0])  # one-hot encode the leaf index that each boosted tree assigns to a sample
grd_lm.fit(grd_enc.transform(grd.apply(X_train_lr)[:, :, 0]), y_train_lr)

y_pred_grd_lm = grd_lm.predict_proba(
    grd_enc.transform(grd.apply(X_test)[:, :, 0]))[:, 1]
fpr_grd_lm, tpr_grd_lm, _ = roc_curve(y_test, y_pred_grd_lm)

# The gradient boosted model by itself
y_pred_grd = grd.predict_proba(X_test)[:, 1]
fpr_grd, tpr_grd, _ = roc_curve(y_test, y_pred_grd)

# The random forest model by itself
y_pred_rf = rf.predict_proba(X_test)[:, 1]
fpr_rf, tpr_rf, _ = roc_curve(y_test, y_pred_rf)

plt.figure(1)
plt.plot([0, 1], [0, 1], 'k--', label="random guess")
plt.plot(fpr_rt_lm, tpr_rt_lm, label='RT + LR')
plt.plot(fpr_rf, tpr_rf, label='RF')
plt.plot(fpr_rf_lm, tpr_rf_lm, label='RF + LR')
plt.plot(fpr_grd, tpr_grd, label='GBT')
plt.plot(fpr_grd_lm, tpr_grd_lm, label='GBT + LR')
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.title('ROC curve')
plt.legend(loc='best')
plt.show()

plt.figure(2)
plt.xlim(0, 0.2)
plt.ylim(0.8, 1)
plt.plot([0, 1], [0, 1], 'k--')
plt.plot(fpr_rt_lm, tpr_rt_lm, label='RT + LR')
plt.plot(fpr_rf, tpr_rf, label='RF')
plt.plot(fpr_rf_lm, tpr_rf_lm, label='RF + LR')
plt.plot(fpr_grd, tpr_grd, label='GBT')
plt.plot(fpr_grd_lm, tpr_grd_lm, label='GBT + LR')
plt.xlabel('False positive rate')
plt.ylabel('True positive rate')
plt.title('ROC curve (zoomed in at top left)')
plt.legend(loc='best')
plt.show()
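
The examples above are all binary. For the multiclass case mentioned in the notes, one ROC curve per class can be computed one-vs-rest. This is a minimal sketch under assumptions of my own: a synthetic three-class data set from make_classification, a plain LogisticRegression, and label_binarize to build the per-class 0/1 targets.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic three-class problem (illustrative only).
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = clf.predict_proba(X_test)  # one probability column per class

# Column k of y_test_bin is 1 where the true class is classes[k], else 0.
classes = clf.classes_
y_test_bin = label_binarize(y_test, classes=classes)

# One-vs-rest: class k is positive, every other class is negative.
roc_per_class = {}
for k, label in enumerate(classes):
    fpr_k, tpr_k, _ = roc_curve(y_test_bin[:, k], proba[:, k])
    roc_per_class[int(label)] = (fpr_k, tpr_k, auc(fpr_k, tpr_k))

for label, (_, _, area) in roc_per_class.items():
    print(label, round(area, 3))
```

Each (fpr_k, tpr_k) pair can be passed to plt.plot in the same way as the binary curves above.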

classification_report

sklearn.metrics.classification_report(y_true, y_pred, labels=None, target_names=None, sample_weight=None, digits=2, output_dict=False, zero_division='warn')

Builds a text report showing the main classification metrics.
Example:

>>> from sklearn.metrics import classification_report
>>> y_true = [0, 1, 2, 2, 2]
>>> y_pred = [0, 0, 2, 2, 1]
>>> target_names = ['class 0', 'class 1', 'class 2']
>>> print(classification_report(y_true, y_pred, target_names=target_names))
              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5

>>> y_pred = [1, 1, 0]
>>> y_true = [1, 1, 1]
>>> print(classification_report(y_true, y_pred, labels=[1, 2, 3]))
              precision    recall  f1-score   support

           1       1.00      0.67      0.80         3
           2       0.00      0.00      0.00         0
           3       0.00      0.00      0.00         0

   micro avg       1.00      0.67      0.80         3
   macro avg       0.33      0.22      0.27         3
weighted avg       1.00      0.67      0.80         3
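
When the numbers are needed in code and not only in printed form, the output_dict parameter from the signature above returns the same report as a nested dictionary. This is a small sketch on the first example's data; the keys follow the row labels of the printed report.

```python
from sklearn.metrics import classification_report

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']

report = classification_report(y_true, y_pred,
                               target_names=target_names, output_dict=True)

class2_recall = report['class 2']['recall']   # row "class 2", column "recall"
overall_accuracy = report['accuracy']         # the "accuracy" row is a single number
macro_f1 = report['macro avg']['f1-score']    # row "macro avg", column "f1-score"

print(class2_recall, overall_accuracy, macro_f1)
```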

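Macro-averaging first computes the metric for each class and then takes the arithmetic mean over all classes. Compared with micro-averaging, it is influenced more by small classes. In other words, the n-class evaluation is split into n binary evaluations, a score is computed for each, and the mean of the n scores is the macro score.
Micro-averaging pools every instance in the data set, regardless of class, into a global confusion matrix and then computes the metric. The n-class evaluation is split into n binary evaluations, their TP, FP, and FN counts are summed, precision and recall are computed from the totals, and the F1 score computed from that precision and recall is the micro F1.
On a test set, choose micro-averaging to measure how well the classifier handles large classes, and macro-averaging to measure how well it handles small classes. In general, higher macro F1 and micro F1 indicate better classification. Macro F1 is strongly affected by classes with few samples.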
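
The difference between the two averages can be made concrete by computing both by hand from a confusion matrix and comparing them with precision_score. This is a sketch on the five-sample example used for classification_report above.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

cm = confusion_matrix(y_true, y_pred)  # rows: true class, columns: predicted class
tp = np.diag(cm)                       # per-class true positives
fp = cm.sum(axis=0) - tp               # predicted as class k but actually another class

# Macro: compute precision per class, then average the per-class values.
macro_precision = np.mean(tp / (tp + fp))

# Micro: sum the counts over all classes first, then compute one precision.
micro_precision = tp.sum() / (tp.sum() + fp.sum())

print(macro_precision, precision_score(y_true, y_pred, average='macro'))
print(micro_precision, precision_score(y_true, y_pred, average='micro'))
```

Class 1 has a single sample and a precision of 0, which pulls the macro average down to 0.5, while the micro average stays at 0.6 because that class contributes only one count to the totals.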

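Confusion matrix

By definition, a confusion matrix $C$ is such that $C_{i,j}$ equals the number of observations known to be in group $i$ and predicted to be in group $j$. Thus in binary classification the count of true negatives is $C_{0,0}$, false negatives is $C_{1,0}$, true positives is $C_{1,1}$, and false positives is $C_{0,1}$.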

sklearn.metrics.confusion_matrix(y_true, y_pred, labels=None, sample_weight=None, normalize=None)

Example:

>>> from sklearn.metrics import confusion_matrix
>>> y_true = [2, 0, 2, 2, 0, 1]
>>> y_pred = [0, 0, 2, 2, 0, 2]
>>> confusion_matrix(y_true, y_pred)
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])


>>> y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
>>> y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
>>> confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])
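
For the binary case, the four cells described in the definition can be unpacked directly with ravel(), and the normalize parameter from the signature converts counts to rates. This is a short sketch on made-up labels.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 0, 1]
y_pred = [1, 1, 1, 0]

# Row-major flattening gives C[0,0], C[0,1], C[1,0], C[1,1].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)

# normalize='true' divides each row by the number of samples in that true class,
# so each row sums to 1.
row_normalized = confusion_matrix(y_true, y_pred, normalize='true')
print(row_normalized)
```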