Looking at Precision, Recall, and F1-Score

Terminology in a specific domain is often difficult to get started with. Coming from a software engineering background, I find machine learning has many such terms that I need to learn in order to use the tools and read the articles.

Some basic terms are Precision, Recall, and F1-Score. These relate to getting a finer-grained idea of how well a classifier is doing, as opposed to just looking at overall accuracy. Writing an explanation forces me to think it through, and helps me remember the topic myself. That’s why I like to write these articles.


I am looking at a binary classifier in this article. The same concepts do apply more broadly, just require a bit more consideration on multi-class problems. But that is something to consider another time.


Before going into the details, an overview figure is always nice:


On first look, it is a bit of a messy web. No need to worry about the details for now; we can look back at this figure in the following sections when explaining the details from the bottom up. The metrics form a hierarchy starting with the true/false negatives/positives (at the bottom), building all the way up to the F1-score that binds them all together. Let's build up from there.

True/False Positives and Negatives

A binary classifier can be viewed as classifying instances as positive or negative:


  • Positive: The instance is classified as a member of the class the classifier is trying to identify. For example, a classifier looking for cat photos would classify photos with cats as positive (when correct).


  • Negative: The instance is classified as not being a member of the class we are trying to identify. For example, a classifier looking for cat photos should classify photos with dogs (and no cats) as negative.

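In code, these two outcomes are typically encoded as 1 (positive) and 0 (negative), as the tables later in this article show. A minimal sketch of that mapping, assuming the cat-photo example (the label names and dictionary are my own, for illustration):

```python
# Map class labels to the binary encoding most classifiers work with.
# 1 = positive (the class we are looking for), 0 = negative.
LABEL_TO_BINARY = {"cat": 1, "no cat": 0}

photos = ["cat", "no cat", "cat"]
encoded = [LABEL_TO_BINARY[p] for p in photos]
print(encoded)  # [1, 0, 1]
```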

The basis of precision, recall, and F1-Score comes from the concepts of True Positive, True Negative, False Positive, and False Negative. The following table illustrates these (consider value 1 to be a positive prediction):


Examples of True/False Positive and Negative

True Positive (TP)

The following table shows 3 examples of a True Positive (TP). The first row is a generic example, where 1 represents the Positive prediction. The following two rows are examples with labels. Internally, the algorithms would use the 1/0 representation, but I used labels here for a more intuitive understanding.


Examples of True Positive (TP) relations.

False Positive (FP)

These False Positive (FP) examples illustrate wrong predictions: predicting Positive for samples that are actually Negative. Such a failed prediction is called a False Positive.

True Negative (TN)

In the True Negative (TN) example, the cat classifier correctly identifies a photo as not having a cat in it, and the medical classifier correctly identifies an image as belonging to a patient with no cancer. The prediction is Negative and correct (True).

False Negative (FN)

In the False Negative (FN) case, the classifier has predicted a Negative result, while the actual result was Positive. For example, predicting no cat when the photo actually contains a cat. The prediction was Negative and wrong (False), thus a False Negative.
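These four outcome types can be counted directly from predicted and actual labels. A minimal Python sketch, using 1 for the positive class as in the tables above (the function name is my own, for illustration):

```python
def count_outcomes(y_true, y_pred):
    """Count true/false positives/negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

# 1 = "cat", 0 = "no cat"
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
print(count_outcomes(y_true, y_pred))  # (2, 1, 1, 1)
```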

Confusion Matrix

A confusion matrix is sometimes used to illustrate classifier performance based on the above four values (TP, FP, TN, FN). These are plotted against each other to show a confusion matrix:


Confusion Matrix.

Using the cancer prediction example, a confusion matrix for 100 patients might look something like this:


                  Predicted Negative   Predicted Positive
Actual Negative        TN = 25              FP = 18
Actual Positive        FN = 12              TP = 45

This example has:


  • TP: 45 positive cases correctly predicted
  • TN: 25 negative cases correctly predicted
  • FP: 18 negative cases are misclassified (wrong positive predictions)
  • FN: 12 positive cases are misclassified (wrong negative predictions)
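The four counts above can be arranged into the matrix programmatically. A minimal Python sketch, with rows as actual classes and columns as predicted classes, negative class first (the same layout scikit-learn's `confusion_matrix` uses, though this is plain Python rather than that library):

```python
# Rebuild the 100-patient confusion matrix from the four counts above.
tp, tn, fp, fn = 45, 25, 18, 12

# Rows = actual class, columns = predicted class (negative first).
confusion = [
    [tn, fp],  # actual negative: 25 correct, 18 wrong positive predictions
    [fn, tp],  # actual positive: 12 missed, 45 correctly identified
]

total = sum(sum(row) for row in confusion)
print(confusion, total)  # [[25, 18], [12, 45]] 100
```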

Thinking about this for a while, the different kinds of errors here have different severities. Classifying someone who has cancer as not having it (a false negative, denying treatment) is likely more severe than classifying someone who does not have it as having it (a false positive, leading to further tests and consideration of treatment).

As the severity of different kinds of mistakes varies across use cases, the metrics such as Accuracy, Precision, Recall, and F1-score can be used to balance the classifier estimates as preferred.

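All four metrics can be computed directly from the TP, TN, FP, and FN counts of the 100-patient example above. A quick sketch in plain Python:

```python
# Counts from the 100-patient example above.
tp, tn, fp, fn = 45, 25, 18, 12

accuracy = (tp + tn) / (tp + tn + fp + fn)          # fraction of all predictions that were correct
precision = tp / (tp + fp)                          # how many predicted positives are actually positive
recall = tp / (tp + fn)                             # how many actual positives were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.70 precision=0.71 recall=0.79 f1=0.75
```

Note how the cheap overall accuracy (0.70) hides the asymmetry between precision (0.71) and recall (0.79), which is exactly why the finer-grained metrics are useful.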

Accuracy
