Machine Learning Evaluation Metrics

chen_c_q

Article link: https://blog.csdn.net/chen_c_q/article/details/83059606

Metrics for Regression Tasks

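Regression tasks rely on a fairly small set of metrics. The most commonly used performance measure is MSE (Mean Squared Error), the average of the squared differences between predicted and true values.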
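As a minimal sketch of the definition above, MSE = (1/n) * sum((y_i - y_hat_i)^2), the following pure-Python snippet computes it by hand. The function name and the sample values are illustrative assumptions, not taken from the original post.

```python
def mean_squared_error_sketch(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical regression targets and predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mse = mean_squared_error_sketch(y_true, y_pred)
print(mse)  # (0.25 + 0.25 + 0.0 + 1.0) / 4 = 0.375
```

Because the errors are squared, a single large miss (the last sample here) dominates the result.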

Common Metrics for Classification Tasks

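Classification tasks can be evaluated in different ways depending on the goal, and the metrics below give a good measure of how a classifier performs. In many cases, however, different kinds of misclassification must be distinguished, because their relative costs differ. The following definitions are therefore introduced: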

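-True Positive (TP): a positive sample that the model predicts as positive
-False Negative (FN): a positive sample that the model predicts as negative
-False Positive (FP): a negative sample that the model predicts as positive
-True Negative (TN): a negative sample that the model predicts as negative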
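The four definitions above can be sketched as a simple counting loop. The helper name `count_outcomes` and the sample labels are assumptions made for illustration only.

```python
def count_outcomes(y_true, y_pred, pos_label=1):
    """Return (TP, FN, FP, TN) for a binary problem."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == pos_label and predicted == pos_label:
            tp += 1  # positive sample predicted as positive
        elif actual == pos_label:
            fn += 1  # positive sample predicted as negative
        elif predicted == pos_label:
            fp += 1  # negative sample predicted as positive
        else:
            tn += 1  # negative sample predicted as negative
    return tp, fn, fp, tn


# Hypothetical labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

tp, fn, fp, tn = count_outcomes(y_true, y_pred)
print(tp, fn, fp, tn)  # 3 1 2 4
```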

Confusion Matrix

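At first glance, false positives and false negatives may look like similar errors. In practice, a false positive can usually be caught by further testing, whereas a false negative is often overlooked and can lead to serious consequences. This motivates the confusion matrix, a matrix that summarizes a classifier's results. For k-class classification it is a k * k table that records the classifier's predictions against the true classes. For the most common case, binary classification, the confusion matrix is 2 * 2, as follows: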

	Predicted = 1	Predicted = 0
Actual = 1	TP	FN
Actual = 0	FP	TN

The four values in the confusion matrix are often used to define other metrics.

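 - sklearn.metrics.confusion_matrix(y_true, y_pred, labels=None, sample_weight=None)

y_true: the true class labels of the samples
y_pred: the predicted class labels of the samples
labels: the labels used to index the confusion matrix; if not given, the values that appear in y_true or y_pred are used in sorted order
sample_weight: sample weights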
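To make the role of `labels` concrete, here is a minimal pure-Python sketch of a k * k confusion matrix in which rows are true classes and columns are predicted classes. The function name `confusion_matrix_sketch` and the data are illustrative assumptions; it is not scikit-learn's implementation and it ignores `sample_weight`.

```python
def confusion_matrix_sketch(y_true, y_pred, labels=None):
    """Rows index the true class, columns index the predicted class."""
    if labels is None:
        # Default: every value seen in either list, in sorted order.
        labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        if actual in index and predicted in index:
            matrix[index[actual]][index[predicted]] += 1
    return matrix


# Hypothetical binary labels.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

default_order = confusion_matrix_sketch(y_true, y_pred)
table_order = confusion_matrix_sketch(y_true, y_pred, labels=[1, 0])
print(default_order)  # [[4, 2], [1, 3]] -> [[TN, FP], [FN, TP]]
print(table_order)    # [[3, 1], [2, 4]] -> [[TP, FN], [FP, TN]]
```

Note the ordering: with sorted 0/1 labels the top-left cell is TN, not TP. scikit-learn's `confusion_matrix` follows the same sorted-label convention by default, so passing `labels=[1, 0]` is one way to obtain the layout of the table shown earlier.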

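Accuracy

Accuracy is one of the most important of these metrics. It is the proportion of all predictions that are correct. A correct prediction falls into one of two cases:

1. A correct (True) positive prediction (Positive), i.e. TP: predicted positive and actually positive;
2. A correct (True) negative prediction (Negative), i.e. TN: predicted negative and actually negative.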

Therefore, Accuracy = (TP + TN) / (TP + TN + FP + FN) = (TP + TN) / (ALL)

 - sklearn.metrics.accuracy_score(y_true, y_pred, normalize=True, sample_weight=None)

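normalize: defaults to True, which returns the fraction of correctly classified samples; if False, returns the number of correctly classified samples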
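The effect of `normalize` can be illustrated with a small hand-written version of the metric. This is a sketch under assumed data, and `accuracy_sketch` is a hypothetical helper rather than the library function.

```python
def accuracy_sketch(y_true, y_pred, normalize=True):
    """Fraction (normalize=True) or count (normalize=False) of correct predictions."""
    correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
    return correct / len(y_true) if normalize else correct


# Hypothetical labels giving TP=3, FN=1, FP=2, TN=4.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

fraction_correct = accuracy_sketch(y_true, y_pred)                 # (TP + TN) / ALL
number_correct = accuracy_sketch(y_true, y_pred, normalize=False)  # TP + TN
print(fraction_correct, number_correct)  # 0.7 7
```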

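Precision

Precision is defined with respect to the predictions: it is the fraction of samples predicted as positive that are actually positive. A sample predicted as positive falls into one of two cases:

1. An actual positive predicted as positive, i.e. a correct positive prediction (True Positive, TP);
2. An actual negative predicted as positive, i.e. an incorrect positive prediction (False Positive, FP).

Therefore, Precision = TP / (TP + FP)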

 - sklearn.metrics.precision_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)
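A minimal sketch of binary precision, computed directly from the formula TP / (TP + FP). The helper name, the sample labels, and the choice to return 0.0 when nothing is predicted positive are assumptions made for this example.

```python
def precision_sketch(y_true, y_pred, pos_label=1):
    """TP / (TP + FP) for the class given by pos_label."""
    tp = sum(1 for a, p in zip(y_true, y_pred) if p == pos_label and a == pos_label)
    fp = sum(1 for a, p in zip(y_true, y_pred) if p == pos_label and a != pos_label)
    if tp + fp == 0:
        return 0.0  # assumption: no positive predictions is reported as 0.0
    return tp / (tp + fp)


# Hypothetical labels: 5 samples are predicted positive, 3 of them correctly.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

precision = precision_sketch(y_true, y_pred)
print(precision)  # 3 / (3 + 2) = 0.6
```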

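Recall

Recall is defined with respect to the original samples: it is the fraction of actual positives that are predicted correctly, that is, the share of true positives among all actual positive samples. An actual positive falls into one of two cases:

1. An actual positive predicted as positive (Positive), i.e. a correct positive prediction (True Positive, TP);
2. An actual positive predicted as negative (Negative), i.e. an incorrect negative prediction (False Negative, FN).

Therefore, Recall = TP / (TP + FN)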

- sklearn.metrics.recall_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)
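Binary recall can be sketched the same way, straight from TP / (TP + FN). The helper name, the sample labels, and returning 0.0 when there are no actual positives are assumptions for illustration.

```python
def recall_sketch(y_true, y_pred, pos_label=1):
    """TP / (TP + FN) for the class given by pos_label."""
    tp = sum(1 for a, p in zip(y_true, y_pred) if a == pos_label and p == pos_label)
    fn = sum(1 for a, p in zip(y_true, y_pred) if a == pos_label and p != pos_label)
    if tp + fn == 0:
        return 0.0  # assumption: no actual positives is reported as 0.0
    return tp / (tp + fn)


# Hypothetical labels: 4 actual positives, 3 of them found by the model.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

recall = recall_sketch(y_true, y_pred)
print(recall)  # 3 / (3 + 1) = 0.75
```

On this data the two false positives do not affect recall at all, which is why recall is usually read together with precision.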

F-measure or balanced F-score

F = 2 * Precision * Recall / (Precision + Recall) (the F value is the harmonic mean of precision and recall)
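As a worked instance of the formula above, the sketch below computes the F value from raw counts. The function name and the counts (TP=3, FP=2, FN=1) are assumed for illustration.

```python
def f1_from_counts(tp, fp, fn):
    """Harmonic mean of precision and recall, computed from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0  # assumption: report 0.0 when both terms are zero
    return 2 * precision * recall / (precision + recall)


# Assumed counts: precision = 3/5 = 0.6, recall = 3/4 = 0.75.
f1 = f1_from_counts(tp=3, fp=2, fn=1)
print(round(f1, 4))  # 2 * 0.6 * 0.75 / 1.35 = 0.6667
```

The harmonic mean sits closer to the smaller of the two values, so a model cannot reach a high F value by maximizing only precision or only recall.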

- sklearn.metrics.f1_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None)

average : [None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted']. This parameter is required for multiclass/multilabel targets. If None, the score for each class is returned.

'binary':
Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.

'micro':
Calculate metrics globally by counting the total true positives, false negatives and false positives.

'macro':
Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

'weighted':
Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters 'macro' to account for label imbalance; it can result in an F-score that is not between precision and recall.

'samples':
Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score).

sample_weight : array-like of shape = [n_samples], optional
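The averaging modes described above can be compared on a small multiclass example. This is a hand-written sketch, not scikit-learn's code: the helper names and data are assumptions, only None, 'micro', 'macro' and 'weighted' are covered, and `sample_weight` is ignored. It uses the identity F1 = 2*TP / (2*TP + FP + FN), which is algebraically equal to the harmonic mean of precision and recall.

```python
def f1_counts(tp, fp, fn):
    """F1 written as 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def averaged_f1_sketch(y_true, y_pred, average="macro"):
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for label in labels:
        tp = sum(1 for a, p in pairs if a == label and p == label)
        fp = sum(1 for a, p in pairs if a != label and p == label)
        fn = sum(1 for a, p in pairs if a == label and p != label)
        counts[label] = (tp, fp, fn)
    if average == "micro":
        # Pool the counts over all labels, then compute a single score.
        totals = [sum(c[i] for c in counts.values()) for i in range(3)]
        return f1_counts(*totals)
    scores = [f1_counts(*counts[label]) for label in labels]
    if average is None:
        return scores  # one score per class
    if average == "macro":
        return sum(scores) / len(scores)  # unweighted mean
    if average == "weighted":
        support = [counts[label][0] + counts[label][2] for label in labels]
        return sum(s * w for s, w in zip(scores, support)) / sum(support)
    raise ValueError("unsupported average: %r" % (average,))


# Hypothetical imbalanced 3-class labels (class 0 has 6 of the 10 samples).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 2, 2, 2]

per_class = averaged_f1_sketch(y_true, y_pred, average=None)
micro = averaged_f1_sketch(y_true, y_pred, average="micro")
macro = averaged_f1_sketch(y_true, y_pred, average="macro")
weighted = averaged_f1_sketch(y_true, y_pred, average="weighted")
print([round(s, 3) for s in per_class])  # [0.909, 0.5, 0.8]
print(round(micro, 3), round(macro, 3), round(weighted, 3))  # 0.8 0.736 0.805
```

Here 'macro' is pulled down by the weak minority class 1, while 'weighted' is dominated by the large class 0. For single-label multiclass data such as this, the 'micro' score equals plain accuracy (8 of 10 correct).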
