Machine Learning Evaluation Metrics (AUC): 14 Popular Evaluation Metrics in Machine Learning

Evaluation metrics measure the performance of a machine learning model, and choosing the right metric is essential for judging a model correctly. This article covers the metrics used for classification and regression machine learning models.

Evaluation metrics discussed in the article:

[Figure: overview of the evaluation metrics discussed in the article]

Metrics used in Classification Models:

For a classification machine learning algorithm, the output of the model can be a target class label or a probability score. Different evaluation metrics are used for these two kinds of output.

Metrics used when the prediction of the ML model is a class label:

Confusion Matrix:

A confusion matrix is the easiest way to measure the performance of a classification model. It is used to visualize and inspect the model's predictions. For a k-class classification model, a k × k matrix is used; for a binary classification problem, a standard 2 × 2 matrix is used.
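To make the k × k layout concrete, here is a minimal pure-Python sketch. The function name `confusion_matrix`, the toy labels, and the orientation (rows are actual classes, columns are predicted classes) are assumptions chosen for illustration, not something the article specifies.

```python
def confusion_matrix(actual, predicted, labels):
    """Return a k x k matrix of counts.

    Assumed orientation: rows are actual classes, columns are predicted classes.
    """
    index = {label: i for i, label in enumerate(labels)}
    k = len(labels)
    matrix = [[0] * k for _ in range(k)]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical 3-class toy data (not from the article).
labels = ["cat", "dog", "bird"]
actual = ["cat", "cat", "dog", "dog", "bird", "bird", "bird"]
predicted = ["cat", "dog", "dog", "dog", "bird", "cat", "bird"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

With this orientation, correct predictions land on the principal diagonal and every misclassification lands off it.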

[Figure: Confusion matrix for binary classification]

Notation:

  • TP (True Positive): number of points that are actually positive and predicted to be positive

  • FN (False Negative): number of points that are actually positive but predicted to be negative

  • FP (False Positive): number of points that are actually negative but predicted to be positive

  • TN (True Negative): number of points that are actually negative and predicted to be negative

An ML model is considered good if the counts on the principal diagonal are as high as possible and the off-diagonal counts are as low as possible. For a binary confusion matrix, TP and TN should be high, and FN and FP should be low.
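As a rough illustration of the binary case, the sketch below counts TP, FN, FP, and TN from two label lists and compares the diagonal total with the off-diagonal total. The helper `binary_counts`, the choice of `1` as the positive class, and the toy data are all assumptions made for this example.

```python
def binary_counts(actual, predicted, positive=1):
    """Count TP, FN, FP, TN for a binary problem (positive class assumed to be 1)."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1  # actually positive, predicted positive
        elif a == positive:
            fn += 1  # actually positive, predicted negative
        elif p == positive:
            fp += 1  # actually negative, predicted positive
        else:
            tn += 1  # actually negative, predicted negative
    return tp, fn, fp, tn


# Hypothetical toy labels (not from the article).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]

tp, fn, fp, tn = binary_counts(actual, predicted)
on_diagonal = tp + tn    # correct predictions
off_diagonal = fn + fp   # incorrect predictions
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
print(f"diagonal={on_diagonal} off-diagonal={off_diagonal}")
```

A model whose diagonal total dominates the off-diagonal total is getting most points right, which is the property described above.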

Different problems have different metrics to choose from:

  • For cancer diagnosis, TP should be high and FN should be very low, close to 0. A patient who has cancer should never be predicted as not having cancer, which is exactly what an FN is.

  • For spam detection, FP should be very low. No mail that is not spam should be predicted to be spam.
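The two cases above can be sketched as a small comparison: given the same ground truth, one hypothetical model flags aggressively and another flags conservatively, and which one is preferable depends on whether FN or FP is the costlier error. The function `count_errors`, the model names, and the data are invented for illustration only.

```python
def count_errors(actual, predicted, positive=1):
    """Return (FN, FP) counts, assuming 1 marks the positive class."""
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    fp = sum(1 for a, p in zip(actual, predicted) if a != positive and p == positive)
    return fn, fp


# Hypothetical ground truth and two hypothetical models' predictions.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
model_a = [1, 1, 1, 1, 1, 0, 0, 0]  # flags aggressively: no misses, some false alarms
model_b = [1, 0, 0, 0, 0, 0, 0, 0]  # flags conservatively: no false alarms, some misses

errors = {
    "model_a": count_errors(actual, model_a),
    "model_b": count_errors(actual, model_b),
}

# Cancer-diagnosis style problem: prefer the model with the fewest FN.
fewest_fn = min(errors, key=lambda name: errors[name][0])
# Spam-detection style problem: prefer the model with the fewest FP.
fewest_fp = min(errors, key=lambda name: errors[name][1])

print(errors)
print("fewest FN:", fewest_fn)
print("fewest FP:", fewest_fp)
```

In this toy setup the aggressive model would suit the cancer-style problem and the conservative one the spam-style problem, even though neither is better on both error types.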

What are type I and type II errors?

A type I error is also known as a false positive (FP). A type II error is also known as a false negative (FN).

Accuracy:

Accuracy is the most common performance metric used for classification algorithms. A
