Understanding Performance Metrics for Machine Learning Algorithms

weixin_26752765

Original article: https://medium.com/analytics-vidhya/understanding-performance-metrics-for-machine-learning-algorithms-996dd7efde1e

Performance metrics are used to evaluate machine learning algorithms and to understand how well a model performs on given data under different scenarios. Choosing the right metric is essential for understanding a model's behavior and deciding what to change to improve it. There are many types of performance metrics; this article looks at some of the most commonly used ones.

Confusion Matrix

A confusion matrix is used to evaluate the performance of classification algorithms by comparing predicted classes with actual classes.

For binary classification, a confusion matrix has two rows and two columns. In general, the number of rows and columns equals the number of classes. Here, columns are the predicted classes and rows are the actual classes.

Now let's look at each cell of the confusion matrix:

1) True Positives (TP): the actual value is 1 and the value predicted by the classifier is also 1.

2) True Negatives (TN): the actual value is 0 and the value predicted by the classifier is also 0.

3) False Positives (FP) (Type 1 error): the actual value is 0 but the value predicted by the classifier is 1.

4) False Negatives (FN) (Type 2 error): the actual value is 1 but the value predicted by the classifier is 0.
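
The four cells described above can be counted directly from a list of actual labels and a list of predicted labels. The following is a minimal sketch in plain Python, not the original author's code; the function name `confusion_counts` and the sample labels are made up for illustration, and it assumes every label is either 0 or 1.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels, where 1 is the positive class.

    Assumes every label is 0 or 1 (an assumption of this sketch).
    """
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # true positive
        elif actual == 0 and predicted == 0:
            tn += 1  # true negative
        elif actual == 0 and predicted == 1:
            fp += 1  # false positive (Type 1 error)
        else:
            fn += 1  # false negative (Type 2 error): actual 1, predicted 0
    return tp, tn, fp, fn


# Hypothetical labels, invented for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)

# Rows are actual classes (0, then 1); columns are predicted classes (0, then 1).
matrix = [[tn, fp],
          [fn, tp]]
print(matrix)
```

With this row and column order, the correct predictions sit on the diagonal of `matrix` and the two error types sit off the diagonal.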

The goal of a classification algorithm is to maximize the true positives and true negatives (correct predictions) and to minimize the false positives and false negatives (incorrect predictions).

False negatives are especially worrying in medical applications. Consider an application that has to detect breast cancer in patients. If a patient has cancer but the model predicts that she does not, the disease goes undetected, which can be dangerous.

Accuracy

Accuracy is the most commonly used performance metric for classification algorithms. It is defined as the number of correct predictions divided by the total number of predictions, and can be calculated directly from the confusion matrix:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

Accuracy works well when the classes are balanced, i.e. each class has an equal number of samples. When the classes are imbalanced, i.e. the number of samples differs between classes, accuracy might not be the right metric.

Why is accuracy an unreliable metric for imbalanced data?

Consider a binary classification problem with two classes, cats and dogs, where cats make up 90% of the samples and dogs make up 10%. Here cat is the majority class and dog is the minority class. If the model predicts every data point as a cat, it still reaches an accuracy of 90%, even though it never identifies a single dog.

This is especially worrying when misclassifying the minority class is very costly, e.g. in credit card fraud detection, where fraudulent transactions are far fewer than non-fraudulent ones.
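
The cat-and-dog scenario above can be reproduced with a few lines of Python. This sketch assumes a hypothetical dataset of 100 samples (90 cats labeled 0, 10 dogs labeled 1) and a model that always predicts "cat"; both are invented for illustration.

```python
# Hypothetical imbalanced dataset: 90 cats (label 0) and 10 dogs (label 1).
y_true = [0] * 90 + [1] * 10

# A trivial "model" that predicts the majority class (cat) for every sample.
y_pred = [0] * len(y_true)

# Accuracy: fraction of predictions that match the actual label.
correct = sum(1 for actual, predicted in zip(y_true, y_pred) if actual == predicted)
acc = correct / len(y_true)

# Number of dogs (the minority class) that the model actually identified.
dogs_found = sum(1 for actual, predicted in zip(y_true, y_pred)
                 if actual == 1 and predicted == 1)

print(acc)         # 0.9
print(dogs_found)  # 0
```

The accuracy looks high, yet the model finds none of the minority-class samples, which is why accuracy alone can be misleading on imbalanced data.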

Recall or Sensitivity

Recall is defined as the number of correct positive predictions (true positives) divided by the total number of actual positives, i.e. the sum of true positives and false negatives: Recall = TP / (TP + FN). It is also called the true positive rate.
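
A short sketch of recall, taking the standard definition TP / (TP + FN), that is, the share of actual positives the classifier found. The function name `recall` and the counts below are hypothetical.

```python
def recall(tp, fn):
    """True positives divided by all actual positives (TP + FN)."""
    actual_positives = tp + fn
    if actual_positives == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return tp / actual_positives


# Hypothetical counts: 3 of 4 actual positives were detected.
print(recall(tp=3, fn=1))   # 0.75

# A model that never predicts the positive class misses every positive.
print(recall(tp=0, fn=10))  # 0.0
```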
