Machine Learning: Basic Model Evaluation Metrics (Classification)

__XX__

Category: Data Science. Tags: machine learning, classification algorithms, evaluation metrics, data science

Article link: https://blog.csdn.net/qq_29575471/article/details/109341799

Contents

Intro
Classification

Intro

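This article introduces basic model evaluation metrics, for example:
Classification:
1. accuracy
2. precision
3. recall
4. F1
5. P-R curve
6. confusion matrix
7. ROC curve
8. AUC
Regression:
1. MAE
2. MSE
3. SSE
4. RMSE
5. R-square
6. adjusted R-square
Clustering:
1. Jaccard
2. FM
3. Rand
4. DB
5. Dunn
This article only gives a brief introduction to these metrics and to how they are called in Python (mainly through sklearn). For details on each API, consult the official documentation, for example: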

# In ipython
from sklearn.metrics import accuracy_score
accuracy_score?

Signature: accuracy_score(y_true, y_pred, *, normalize=True, sample_weight=None)
Docstring:
Accuracy classification score.

In multilabel classification, this function computes subset accuracy:
the set of labels predicted for a sample must *exactly* match the
corresponding set of labels in y_true.

Read more in the :ref:`User Guide <accuracy_score>`.

Parameters
----------
y_true : 1d array-like, or label indicator array / sparse matrix
    Ground truth (correct) labels.

y_pred : 1d array-like, or label indicator array / sparse matrix
    Predicted labels, as returned by a classifier.

normalize : bool, optional (default=True)
    If ``False``, return the number of correctly classified samples.
    Otherwise, return the fraction of correctly classified samples.

sample_weight : array-like of shape (n_samples,), default=None
    Sample weights.
...

or search online.

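Classification

Only binary classification is discussed here.

Define:
1. TP (True Positive): the number of positive samples correctly predicted as positive
2. FN (False Negative): the number of positive samples wrongly predicted as negative
3. FP (False Positive): the number of negative samples wrongly predicted as positive
4. TN (True Negative): the number of negative samples correctly predicted as negative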
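
As a small illustration of these four counts, the sketch below tallies them by hand on made-up labels (the arrays `y_true` and `y_pred` are hypothetical, with 1 as the positive class) and compares the result with `confusion_matrix` from sklearn.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = positive class, 0 = negative class
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 1, 0, 0, 0, 0])

# Count each of the four outcomes directly from the definitions
tp = int(np.sum((y_true == 1) & (y_pred == 1)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))
tn = int(np.sum((y_true == 0) & (y_pred == 0)))

# confusion_matrix orders rows (true) and columns (predicted) by label
# value, here 0 then 1, so ravel() yields TN, FP, FN, TP
tn_sk, fp_sk, fn_sk, tp_sk = confusion_matrix(y_true, y_pred).ravel()

print(tp, fn, fp, tn)  # 3 1 2 4
```

Note that sklearn places the negative class first, so the flattened order is TN, FP, FN, TP rather than the order used in the list above.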

accuracy

$\frac {\text{correct predictions}}{\text{total samples}} = \frac {TP+TN}{TP+FN+FP+TN}$

from sklearn.metrics import accuracy_score

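When to use

Accuracy suits data whose classes are fairly balanced. For binary classification, the ratio of positive to negative samples should be close to 1:1. If the ratio is strongly skewed in either direction, such as 1:4 or 4:1, accuracy is not a suitable metric, because a model that always predicts the majority class already scores 80%.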
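
The pitfall with imbalanced data can be seen in a minimal sketch. The data below is made up (20 positives and 80 negatives, a 1:4 ratio), and the "model" simply predicts the negative class for every sample.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical data with a 1:4 positive-to-negative ratio
y_true = np.array([1] * 20 + [0] * 80)

# A trivial "classifier" that always predicts the negative class
y_pred = np.zeros_like(y_true)

acc = accuracy_score(y_true, y_pred)
print(acc)  # 0.8, although not a single positive sample was found
```

The score looks respectable even though the model is useless for finding positives, which is why precision and recall are introduced next.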

precision

$\frac {TP}{TP+FP}$
Precision
Among all samples predicted as positive, the fraction that are actually positive

from sklearn.metrics import precision_score

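When to use

Use precision when the reliability of positive predictions matters most. For example, let positives be stocks worth buying and negatives be stocks that are not. We want to identify valuable stocks, and among the stocks predicted as valuable, the share that truly are valuable should be as high as possible, since these are the ones we buy. If precision is too low, few of the purchased stocks are actually valuable, so we pay a lot and gain little.
In this setting it is acceptable for some truly valuable stocks to be missed (predicted negative): not buying them at least costs us nothing.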

recall

$\frac {TP}{TP+FN}$
Recall
Among all samples that are actually positive, the fraction correctly predicted as positive

from sklearn.metrics import recall_score

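When to use

Use recall when it matters most that every positive is found. For example, let positives be people infected with a virus and negatives be people who are not. We want everyone who is actually infected to be flagged, and it is acceptable if some uninfected people are flagged as well. (Those flagged are then quarantined in a reasonable way.)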
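
To make the two formulas concrete, the following sketch computes precision and recall from the raw counts and checks them against `precision_score` and `recall_score`. The label arrays are hypothetical and chosen only for illustration.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical labels: 1 = positive class, 0 = negative class
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 1, 0, 0, 0, 0])

tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # 3
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # 2
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # 1

# Formulas from the text
precision_manual = tp / (tp + fp)  # 3 / 5
recall_manual = tp / (tp + fn)     # 3 / 4

# The same quantities computed by sklearn
precision_sk = precision_score(y_true, y_pred)
recall_sk = recall_score(y_true, y_pred)

print(precision_manual, recall_manual)  # 0.6 0.75
```

Here 5 samples are predicted positive and 3 of them are correct (precision 0.6), while 3 of the 4 actual positives are found (recall 0.75).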

F1

$\frac {2}{\frac {1}{precision} + \frac {1}{recall}}$
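F1 is the harmonic mean of precision and recall. By the inequality between the harmonic and arithmetic means, $F1 \le \frac {precision + recall}{2}$, with equality if and only if precision = recall, so for a given average F1 is largest when the two are equal. As shown later, precision and recall tend to move in opposite directions, so even when precision is very high, F1 is low if recall is too low, and vice versa.
F1 is therefore a good single metric for balancing the two, since it penalizes the case where one is high and the other is low.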

from sklearn.metrics import f1_score
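
The effect of the harmonic mean can be checked numerically. In this sketch, `f1_from` is a hypothetical helper that applies the formula above; the two precision/recall pairs are made up and share the same arithmetic mean of 0.6. The labels passed to `f1_score` are also illustrative.

```python
import numpy as np
from sklearn.metrics import f1_score


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (hypothetical helper)."""
    return 2 / (1 / precision + 1 / recall)


# Two made-up cases with the same arithmetic mean (0.6)
balanced = f1_from(0.6, 0.6)     # precision == recall
lopsided = f1_from(0.95, 0.25)   # high precision, low recall

# f1_score gives the same value as the formula on actual labels
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 1, 0, 0, 0, 0])
f1_sk = f1_score(y_true, y_pred)  # precision 0.6, recall 0.75
f1_manual = f1_from(0.6, 0.75)

print(round(balanced, 3), round(lopsided, 3))  # 0.6 0.396
```

The lopsided pair scores much lower than the balanced one even though their arithmetic means match, which is the behaviour the text describes.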

confusion matrix

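from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# classifier, X_test and y_test are defined in the precision-recall curve section below.
# plot_confusion_matrix was removed in scikit-learn 1.2;
# ConfusionMatrixDisplay.from_estimator (available since 1.0) replaces it.
disp1 = ConfusionMatrixDisplay.from_estimator(classifier, X_test, y_test)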


precision-recall curve

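A perceptron or SVM model is essentially a hyperplane: $w \cdot x + b = 0$
Taking a single-layer perceptron as an example, if $\frac {w \cdot x_i + b}{||w||_2} < 0$, i.e. $w \cdot x_i + b < 0$, sample $x_i$ is predicted negative; otherwise it is predicted positive.
The 0 here acts as the threshold.
Similarly, in a logistic model the threshold is usually 0.5.
Changing the threshold changes the resulting precision and recall.
- Concretely, for a logistic model, samples scoring at least 0.5 were originally predicted positive. Raising the threshold to 0.8 means FP decreases and FN increases, so precision usually increases and recall decreases.
- Conversely, lowering the threshold to 0.3 means FN decreases and FP increases, so precision usually decreases and recall increases.
- Precision and recall thus trade off against each other.
- We therefore want to see how precision and recall change as the threshold varies.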
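
The threshold effect can be reproduced on a toy example. The sketch below assumes a model that outputs a probability per sample (the arrays `y_true` and `y_prob` are made up) and applies the three thresholds mentioned above: 0.3, 0.5 and 0.8.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical true labels and predicted probabilities of the positive class
y_true = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 1])
y_prob = np.array([0.05, 0.15, 0.25, 0.35, 0.45,
                   0.55, 0.65, 0.75, 0.85, 0.95])

results = {}
for threshold in (0.3, 0.5, 0.8):
    # A sample is predicted positive when its score reaches the threshold
    y_pred = (y_prob >= threshold).astype(int)
    results[threshold] = (precision_score(y_true, y_pred),
                          recall_score(y_true, y_pred))

for threshold, (p, r) in results.items():
    print(f"threshold={threshold}: precision={p:.3f}, recall={r:.3f}")
# threshold=0.3: precision=0.714, recall=1.000
# threshold=0.5: precision=0.800, recall=0.800
# threshold=0.8: precision=1.000, recall=0.400
```

On this data precision rises and recall falls as the threshold goes up. Recall can never increase when the threshold is raised, whereas precision is not guaranteed to rise on every dataset.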

# From sklearn document
from sklearn import svm, datasets
from sklearn.model_selection import train_test_split
import numpy as np

# Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target

# Add noisy features
random_state = np.random.RandomState(0)
n_samples, n_features = X.shape
X = np.c_[X, random_state.randn(n_samples, 200 * n_features)]

# Limit to the two first classes, and split into training and test
X_train, X_test, y_train, y_test = train_test_split(X[y < 2], y[y < 2],
                                                    test_size=.5,
                                                    random_state=random_state)

# Create a simple classifier
classifier = svm.LinearSVC(random_state=random_state)
classifier.fit(X_train, y_train)
y_score = classifier.decision_function(X_test)


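from sklearn.metrics import precision_recall_curve
from sklearn.metrics import PrecisionRecallDisplay
import matplotlib.pyplot as plt

# plot_precision_recall_curve was removed in scikit-learn 1.2;
# PrecisionRecallDisplay.from_estimator (available since 1.0) replaces it.
disp = PrecisionRecallDisplay.from_estimator(classifier, X_test, y_test)
plt.show()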


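The larger the area under the precision-recall curve, the better the model generally is: intuitively, precision and recall can then both be high at the same time.
The closer the curve gets to the upper-right corner, the better.
The threshold corresponding to the point nearest the upper-right corner gives the best trade-off between the two.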
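
One possible way to turn these observations into numbers is sketched below. It uses `precision_recall_curve` to list the precision and recall at every threshold, picks the threshold with the highest F1 (an assumption made here as a stand-in for "nearest the upper-right corner", not the only reasonable criterion), and summarizes the curve with `average_precision_score`. The data is made up.

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

# Hypothetical true labels and classifier scores
y_true = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 1])
y_score = np.array([0.05, 0.15, 0.25, 0.35, 0.45,
                    0.55, 0.65, 0.75, 0.85, 0.95])

precision, recall, thresholds = precision_recall_curve(y_true, y_score)

# precision and recall have one more entry than thresholds; the final
# point (precision 1, recall 0) has no threshold, so it is dropped here
p, r = precision[:-1], recall[:-1]

# F1 at every threshold, guarding against division by zero
f1 = 2 * p * r / np.maximum(p + r, 1e-12)

best = int(np.argmax(f1))
best_threshold = float(thresholds[best])
best_f1 = float(f1[best])

# Average precision is a single-number summary of the P-R curve
ap = average_precision_score(y_true, y_score)

print(best_threshold, round(best_f1, 3))  # 0.35 0.833
```

On this data the best F1 is reached at threshold 0.35, where recall is 1.0 and precision is 5/7.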

ROC curve

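from sklearn.metrics import RocCurveDisplay

# plot_roc_curve was removed in scikit-learn 1.2;
# RocCurveDisplay.from_estimator (available since 1.0) replaces it.
disp2 = RocCurveDisplay.from_estimator(classifier, X_test, y_test)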


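where
TPR, true positive rate, sensitivity, recall: $\frac {TP}{TP+FN}$
- the fraction of all positive samples that are correctly predicted positive
FPR, false positive rate: $\frac {FP}{FP+TN}$
- the fraction of all negative samples that are wrongly predicted positive
When the threshold increases, FPR and TPR both decrease.
When the threshold decreases, FPR and TPR both increase.
- FPR and TPR rise and fall together
The closer the curve gets to the upper-left corner, the better.
The threshold corresponding to the point nearest the upper-left corner gives the best trade-off.
For random prediction, the ROC curve is the line through the origin with slope 1. On this line the fraction predicted positive is the same among actual negatives as among actual positives (TPR = FPR), and likewise for the fraction predicted negative, because a random predictor treats positive and negative samples identically.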
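
The AUC listed in the Intro is the area under this ROC curve. The sketch below, on made-up labels and scores, obtains the curve with `roc_curve`, the area with `roc_auc_score`, and checks the common interpretation of AUC as the fraction of (positive, negative) pairs in which the positive sample receives the higher score.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical true labels and classifier scores
y_true = np.array([0, 0, 0, 1, 0, 1, 0, 1, 1, 1])
y_score = np.array([0.05, 0.15, 0.25, 0.35, 0.45,
                    0.55, 0.65, 0.75, 0.85, 0.95])

# One (FPR, TPR) point per threshold; both move together along the curve
fpr, tpr, thresholds = roc_curve(y_true, y_score)

auc_value = roc_auc_score(y_true, y_score)

# Pairwise check: share of (positive, negative) pairs ranked correctly
pos = y_score[y_true == 1]
neg = y_score[y_true == 0]
pairwise = float(np.mean(pos[:, None] > neg[None, :]))

print(round(auc_value, 2), round(pairwise, 2))  # 0.88 0.88
```

A random predictor sits on the diagonal with an AUC of about 0.5, while a perfect ranking gives 1.0, so 0.88 here indicates that most positives are scored above most negatives.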