Machine Learning - Performance Metrics

There are various metrics that we can use to evaluate the performance of ML algorithms, for classification as well as regression. We must carefully choose the metrics for evaluating ML performance because −

  • How the performance of ML algorithms is measured and compared will be dependent entirely on the metric you choose.

  • How you weight the importance of various characteristics in the result will be influenced completely by the metric you choose.

Performance Metrics for Classification Problems

We have discussed classification and its algorithms in the previous chapters. Here, we are going to discuss various performance metrics that can be used to evaluate predictions for classification problems.

Confusion Matrix

It is the easiest way to measure the performance of a classification problem where the output can be of two or more types of classes. A confusion matrix is nothing but a table with two dimensions, viz. “Actual” and “Predicted”; furthermore, both dimensions have “True Positives (TP)”, “True Negatives (TN)”, “False Positives (FP)”, and “False Negatives (FN)”, as shown below −

                     Predicted: 1            Predicted: 0
   Actual: 1     True Positives (TP)     False Negatives (FN)
   Actual: 0     False Positives (FP)    True Negatives (TN)

The terms associated with the confusion matrix are explained as follows −

  • True Positives (TP) − It is the case when both the actual class and the predicted class of the data point are 1.

  • True Negatives (TN) − It is the case when both the actual class and the predicted class of the data point are 0.

  • False Positives (FP) − It is the case when the actual class of the data point is 0 and the predicted class is 1.

  • False Negatives (FN) − It is the case when the actual class of the data point is 1 and the predicted class is 0.

We can use the confusion_matrix function of sklearn.metrics to compute the confusion matrix of our classification model.
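
As a minimal sketch (the labels here are made up for illustration), the four counts can also be unpacked directly from the matrix with ravel(), since for binary 0/1 labels sklearn lays the cells out as [[TN, FP], [FN, TP]]:

from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]   # hypothetical actual labels
y_pred = [1, 0, 1, 1, 1, 0, 1, 1, 0, 0]   # hypothetical predicted labels

# For binary 0/1 labels the matrix is [[TN, FP], [FN, TP]],
# so ravel() yields the four counts in that order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print('TP =', tp, 'TN =', tn, 'FP =', fp, 'FN =', fn)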

Classification Accuracy

It is the most common performance metric for classification algorithms. It may be defined as the number of correct predictions made as a ratio of all predictions made. We can easily calculate it from the confusion matrix with the help of the following formula −

$$Accuracy =\frac{TP+TN}{TP+FP+FN+TN}$$

We can use the accuracy_score function of sklearn.metrics to compute the accuracy of our classification model.
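
As a minimal sketch (reusing the hypothetical labels from above), accuracy computed by hand from the confusion-matrix counts agrees with accuracy_score:

from sklearn.metrics import accuracy_score, confusion_matrix

y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]   # hypothetical actual labels
y_pred = [1, 0, 1, 1, 1, 0, 1, 1, 0, 0]   # hypothetical predicted labels

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
manual = (tp + tn) / (tp + fp + fn + tn)   # the formula above
print(manual, accuracy_score(y_true, y_pred))   # both print 0.6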

Classification Report

This report consists of the scores of Precision, Recall, F1 and Support. They are explained as follows −

Precision

Precision, often used in document retrieval, may be defined as the fraction of the results returned by our ML model that are actually correct, i.e. the proportion of predicted positives that are true positives. We can easily calculate it from the confusion matrix with the help of the following formula −

$$Precision=\frac{TP}{TP+FP}$$

Recall or Sensitivity

Recall may be defined as the fraction of actual positives that our ML model correctly identifies. We can easily calculate it from the confusion matrix with the help of the following formula −

$$Recall =\frac{TP}{TP+FN}$$

Specificity

Specificity, in contrast to recall, may be defined as the fraction of actual negatives that our ML model correctly identifies. We can easily calculate it from the confusion matrix with the help of the following formula −

$$Specificity =\frac{TN}{TN+FP}$$
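
sklearn.metrics has no dedicated specificity function, but as a minimal sketch (hypothetical labels again) it can be computed directly from the confusion-matrix counts:

from sklearn.metrics import confusion_matrix

y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]   # hypothetical actual labels
y_pred = [1, 0, 1, 1, 1, 0, 1, 1, 0, 0]   # hypothetical predicted labels

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)   # TN / (TN + FP) = 3/6 = 0.5
print('Specificity =', specificity)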

Support

Support may be defined as the number of samples of the true response that lie in each class of target values.

F1 Score

This score gives us the harmonic mean of precision and recall, which weights both equally. The best value of F1 would be 1 and the worst would be 0. We can calculate the F1 score with the help of the following formula −

$$F1 = 2\times\frac{precision\times recall}{precision+recall}$$

The F1 score has an equal relative contribution from precision and recall.

We can use the classification_report function of sklearn.metrics to get the classification report of our classification model.
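
Each score in the report is also available as an individual function; as a minimal sketch (hypothetical labels again), precision_score, recall_score and f1_score match the formulas above:

from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]   # hypothetical actual labels
y_pred = [1, 0, 1, 1, 1, 0, 1, 1, 0, 0]   # hypothetical predicted labels

p = precision_score(y_true, y_pred)   # TP / (TP + FP) = 3/6 = 0.50
r = recall_score(y_true, y_pred)      # TP / (TP + FN) = 3/4 = 0.75
f = f1_score(y_true, y_pred)          # harmonic mean of p and r
print(p, r, f, 2 * p * r / (p + r))   # the last two values agree: 0.6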

AUC (Area Under ROC Curve)

AUC (Area Under Curve)-ROC (Receiver Operating Characteristic) is a performance metric for classification problems, based on varying threshold values. As the name suggests, ROC is a probability curve and AUC measures separability. In simple words, the AUC-ROC metric tells us about the capability of the model in distinguishing between the classes. The higher the AUC, the better the model.

Mathematically, it can be created by plotting TPR (True Positive Rate), i.e. sensitivity or recall, vs. FPR (False Positive Rate), i.e. 1 − specificity, at various threshold values. The following graph shows the ROC curve and AUC, with TPR on the y-axis and FPR on the x-axis −

[Figure: ROC curve with AUC as the area under it; TPR on the y-axis, FPR on the x-axis]

We can use the roc_auc_score function of sklearn.metrics to compute AUC-ROC.
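
To trace the curve itself rather than just the score, sklearn.metrics also provides roc_curve, which returns the FPR and TPR at each threshold; a minimal sketch with hypothetical probability scores:

from sklearn.metrics import roc_curve, roc_auc_score

y_true  = [0, 0, 1, 1]            # hypothetical actual labels
y_score = [0.1, 0.4, 0.35, 0.8]   # hypothetical predicted probabilities

# fpr and tpr trace the ROC curve as the decision threshold varies
fpr, tpr, thresholds = roc_curve(y_true, y_score)
print('FPR:', fpr)
print('TPR:', tpr)
print('AUC:', roc_auc_score(y_true, y_score))   # area under that curve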

LOGLOSS (Logarithmic Loss)

It is also called logistic regression loss or cross-entropy loss. It is basically defined on probability estimates and measures the performance of a classification model where the input is a probability value between 0 and 1. It can be understood more clearly by contrasting it with accuracy. As we know, accuracy is the count of correct predictions (predicted value = actual value) in our model, whereas log loss is the amount of uncertainty in our prediction based on how much it varies from the actual label. With the help of the log loss value, we can have a more accurate view of the performance of our model. We can use the log_loss function of sklearn.metrics to compute log loss.
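
As a minimal sketch (hypothetical probabilities), log loss rewards confident correct predictions and penalizes confident wrong ones, a distinction accuracy alone cannot make:

from sklearn.metrics import log_loss

y_true = [0, 0, 1, 1]               # hypothetical actual labels
confident = [0.1, 0.2, 0.8, 0.9]    # predicted probabilities of class 1
hesitant  = [0.4, 0.45, 0.55, 0.6]  # same hard predictions, less certain

# Both probability sets round to the same (correct) labels,
# yet the hesitant model shows a noticeably higher log loss.
print(log_loss(y_true, confident))   # ~0.16
print(log_loss(y_true, hesitant))    # ~0.55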

Example

The following is a simple recipe in Python that gives us an insight into how we can use the above explained performance metrics on a binary classification model −


# import the required metric functions from sklearn
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
from sklearn.metrics import roc_auc_score
from sklearn.metrics import log_loss

X_actual = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]   # ground-truth class labels
Y_predic = [1, 0, 1, 1, 1, 0, 1, 1, 0, 0]   # predicted class labels

# confusion matrix, accuracy, and the full per-class report
results = confusion_matrix(X_actual, Y_predic)
print('Confusion Matrix :')
print(results)
print('Accuracy Score is', accuracy_score(X_actual, Y_predic))
print('Classification Report : ')
print(classification_report(X_actual, Y_predic))
print('AUC-ROC:', roc_auc_score(X_actual, Y_predic))
print('LOGLOSS Value is', log_loss(X_actual, Y_predic))

Output


Confusion Matrix :
[[3 3]
 [1 3]]
Accuracy Score is 0.6
Classification Report :
              precision    recall  f1-score   support

           0       0.75      0.50      0.60         6
           1       0.50      0.75      0.60         4

   micro avg       0.60      0.60      0.60        10
   macro avg       0.62      0.62      0.60        10
weighted avg       0.65      0.60      0.60        10

AUC-ROC: 0.625
LOGLOSS Value is 13.815750437193334

Performance Metrics for Regression Problems

We have discussed regression and its algorithms in previous chapters. Here, we are going to discuss various performance metrics that can be used to evaluate predictions for regression problems.

Mean Absolute Error (MAE)

It is the simplest error metric used in regression problems. It is basically the average of the absolute differences between the predicted and actual values. In simple words, with MAE, we can get an idea of how wrong the predictions were. MAE does not indicate the direction of the error, i.e. it gives no indication about underperformance or overperformance of the model. The following is the formula to calculate MAE −

$$MAE = \frac{1}{n}\sum|Y -\hat{Y}|$$

Here, $Y$ = Actual Output Values

And $\hat{Y}$= Predicted Output Values.

We can use the mean_absolute_error function of sklearn.metrics to compute MAE.
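
As a minimal sketch (made-up values), the formula above can be verified by hand against mean_absolute_error:

import numpy as np
from sklearn.metrics import mean_absolute_error

Y_actual = np.array([5, -1, 2, 10])        # hypothetical actual values
Y_predic = np.array([3.5, -0.9, 2, 9.9])   # hypothetical predictions

manual = np.mean(np.abs(Y_actual - Y_predic))   # (1/n) * sum |Y - Y_hat|
print(manual, mean_absolute_error(Y_actual, Y_predic))   # both 0.425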

Mean Square Error (MSE)

MSE is like the MAE, but the only difference is that it squares the differences between the actual and predicted output values before summing them, instead of using the absolute value. The difference can be noticed in the following equation −

$$MSE = \frac{1}{n}\sum(Y -\hat{Y})^2$$

Here, $Y$ = Actual Output Values

And $\hat{Y}$ = Predicted Output Values.

We can use the mean_squared_error function of sklearn.metrics to compute MSE.

R Squared (R2)

The R Squared metric is generally used for explanatory purposes and provides an indication of the goodness of fit of a set of predicted output values to the actual output values. The following formula will help us understand it −

$$R^{2} = 1 -\frac{\frac{1}{n}\sum_{i=1}^{n}(Y_{i}-\hat{Y_{i}})^2}{\frac{1}{n}\sum_{i=1}^{n}(Y_{i}-\bar{Y})^2}$$

In the above equation, the numerator is the MSE and the denominator is the variance in the $Y$ values.

We can use the r2_score function of sklearn.metrics to compute the R squared value.
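
As a minimal sketch (same made-up values as in the MAE example), R squared can be computed by hand from the MSE of the predictions and the variance of the actual values, matching r2_score:

import numpy as np
from sklearn.metrics import r2_score

Y_actual = np.array([5, -1, 2, 10])        # hypothetical actual values
Y_predic = np.array([3.5, -0.9, 2, 9.9])   # hypothetical predictions

mse = np.mean((Y_actual - Y_predic) ** 2)           # numerator: MSE
var = np.mean((Y_actual - Y_actual.mean()) ** 2)    # denominator: variance of Y
print(1 - mse / var, r2_score(Y_actual, Y_predic))  # both ~0.9656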

Example

The following is a simple recipe in Python that gives us an insight into how we can use the above explained performance metrics on a regression model −


# import the required regression metric functions from sklearn
from sklearn.metrics import r2_score
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_squared_error

X_actual = [5, -1, 2, 10]        # actual output values
Y_predic = [3.5, -0.9, 2, 9.9]   # predicted output values

print('R Squared =', r2_score(X_actual, Y_predic))
print('MAE =', mean_absolute_error(X_actual, Y_predic))
print('MSE =', mean_squared_error(X_actual, Y_predic))

Output


R Squared = 0.9656060606060606
MAE = 0.42499999999999993
MSE = 0.5674999999999999

Translated from: https://www.tutorialspoint.com/machine_learning_with_python/machine_learning_algorithms_performance_metrics.htm
