How to Evaluate the Performance of a Machine Learning Model

Table of contents:

  • Why is evaluation necessary?
  • Confusion Matrix
  • Accuracy
  • Precision & Recall
  • ROC-AUC
  • Log Loss
  • Coefficient of Determination (R-Squared)
  • Summary

Why is evaluation necessary?

Let me start with a very simple example.

Robin and Sam both started preparing for an engineering college entrance exam. They shared a room and put in an equal amount of hard work solving numerical problems. They studied almost the same number of hours for the entire year and appeared in the final exam. Surprisingly, Robin cleared it but Sam did not. When we asked, we learned that there was one difference in their preparation strategy: a "test series". Robin had joined a test series, and he used to test his knowledge and understanding by taking those exams and then evaluating where he was lagging. Sam was confident and just kept training himself.

In the same way, a machine learning model can be trained extensively with many parameters and new techniques, but as long as you skip its evaluation, you cannot trust it.

How to read a confusion matrix?

A confusion matrix is a table that cross-tabulates the predictions of a model against the actual class labels of the data points.

Let’s say you are building a model that detects whether a person has diabetes or not. After the train-test split, you have a test set of 100 data points, of which 70 are labeled positive (1) and 30 are labeled negative (0). Now let me draw the matrix for your test predictions:

[Figure: confusion matrix of the test predictions (actual class vs. predicted class)]

Out of 70 actual positive data points, your model predicted 64 points as positive and 6 as negative. Out of 30 actual negative points, it predicted 3 as positive and 27 as negative.
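
The counts described above can be reproduced with a short Python sketch. The label lists are hypothetical: they are built only to match the numbers in the text (the real test set is not given), and the helper name `confusion_counts` is an illustrative choice, not part of any library.

```python
# Rebuild the example's test set: 70 actual positives, then 30 actual negatives.
y_true = [1] * 70 + [0] * 30

# Hypothetical predictions arranged to match the counts in the text:
# positives -> 64 predicted 1, 6 predicted 0; negatives -> 3 predicted 1, 27 predicted 0.
y_pred = [1] * 64 + [0] * 6 + [1] * 3 + [0] * 27


def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels, where 1 is the positive class."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1  # predicted positive, and that was right
        elif actual == 1 and predicted == 0:
            fn += 1  # predicted negative, and that was wrong
        elif actual == 0 and predicted == 1:
            fp += 1  # predicted positive, and that was wrong
        else:
            tn += 1  # predicted negative, and that was right
    return tp, fn, fp, tn


tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
```

Each row of the matrix is one actual class, so TP + FN adds up to the 70 actual positives and FP + TN to the 30 actual negatives.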

Note: In the terms True Positive, True Negative, False Positive and False Negative, the second word (Positive or Negative) denotes your prediction, and the first word (True or False) denotes whether that prediction was right or wrong.

Based on the above matrix we can define some very important ratios:

  • TPR (True Positive Rate) = True Positive / Actual Positive
  • TNR (True Negative Rate) = True Negative / Actual Negative
  • FPR (False Positive Rate) = False Positive / Actual Negative
  • FNR (False Negative Rate) = False Negative / Actual Positive

For our diabetes detection model, these ratios work out as follows:

TPR = 64 / 70 ≈ 91.4%

TNR = 27 / 30 = 90%

FPR = 3 / 30 = 10%

FNR = 6 / 70 ≈ 8.6%
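
As a rough sketch, the four ratios can be computed directly from the confusion-matrix counts. The function name `rates` and its dictionary output are illustrative choices; the counts are the ones from the diabetes example.

```python
def rates(tp, fn, fp, tn):
    """Compute TPR, TNR, FPR and FNR from binary confusion-matrix counts."""
    actual_positive = tp + fn  # all data points whose true label is 1
    actual_negative = fp + tn  # all data points whose true label is 0
    return {
        "TPR": tp / actual_positive,
        "TNR": tn / actual_negative,
        "FPR": fp / actual_negative,
        "FNR": fn / actual_positive,
    }


# Counts from the diabetes example: TP=64, FN=6, FP=3, TN=27.
r = rates(tp=64, fn=6, fp=3, tn=27)
for name, value in r.items():
    print(f"{name} = {value:.1%}")
```

Because TPR and FNR share the same denominator, they always sum to 1, and so do TNR and FPR.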

If you want your model to be smart, it has to predict correctly, which means your True Positives and True Negatives should be as high as possible and at the same t
