Measuring performance of classifiers

A confusion matrix is a common way to describe the performance of a classifier. It is a simple cross-tabulation of predicted classes against observed classes.
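As a rough illustration of the cross-tabulation, the sketch below counts (predicted, observed) pairs for a two-class problem. The function name `confusion_matrix`, the label names, and the toy data are all made up for this example; the rows-are-predictions layout is one common convention, not the only one.

```python
from collections import Counter

def confusion_matrix(observed, predicted, labels):
    """Cross-tabulate predicted classes (rows) against observed classes (columns)."""
    counts = Counter(zip(predicted, observed))
    return [[counts[(p, o)] for o in labels] for p in labels]

# Hypothetical toy data, invented for illustration only.
observed = ["spam", "spam", "good", "good", "good", "spam", "good", "good"]
predicted = ["spam", "good", "good", "good", "spam", "spam", "good", "good"]
labels = ["spam", "good"]

matrix = confusion_matrix(observed, predicted, labels)
for label, row in zip(labels, matrix):
    print(f"predicted {label:>4}: {row}")
```

The diagonal cells hold the correctly classified samples; every off-diagonal cell is one kind of misclassification.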




Overall Accuracy and Kappa Statistic

The simplest measure of a model's accuracy is Overall Accuracy: the percentage of samples for which the model predicted the correct class.


Although Overall Accuracy is simple and interpretable, it has at least two major problems:

  • The Overall Accuracy measure does not take the natural frequencies of the classes into account. For example, if we build a model to classify credit card transactions as fraudulent or good, we can probably achieve very high Overall Accuracy simply by predicting that all transactions are good, because fraudulent transactions account for only a small part of all transactions.
  • The Overall Accuracy measure treats all classes the same. Consider classifying emails as Spam or Good. Classifying a good email as spam and deleting it hurts the user experience and costs more than misclassifying a spam email as good. Overall Accuracy does not distinguish between a model that misclassifies good emails and one that misclassifies spam emails.
Overall accuracy still helps us check whether a model meets the minimum requirements: it needs to be higher than the no-information rate for the model to be even considered. For example, in simple binary classification the no-information rate based on pure randomness is 50%: if we randomly assign a class to each observation, with a large enough sample we should get about 50% accuracy. So any model with overall accuracy below 50% in binary classification, or below 1/C when there are C classes, is unacceptable. When the classes are imbalanced, the proportion of the largest class is a stricter baseline, since always predicting that class already achieves it.
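To make the fraud example concrete, here is a minimal sketch that computes overall accuracy for a "model" that labels every transaction as good. The 990/10 class split and the helper name `overall_accuracy` are assumptions chosen for illustration, not figures from any real dataset.

```python
def overall_accuracy(observed, predicted):
    """Fraction of samples whose predicted class equals the observed class."""
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)

# Hypothetical imbalanced data: 990 good transactions and 10 fraudulent ones.
observed = ["good"] * 990 + ["fraud"] * 10
always_good = ["good"] * len(observed)  # never flags anything as fraud

accuracy = overall_accuracy(observed, always_good)

# Two possible baselines to compare against:
classes = set(observed)
uniform_guess_rate = 1 / len(classes)  # the 1/C rate from random guessing
majority_share = max(observed.count(c) for c in classes) / len(observed)

print(accuracy, uniform_guess_rate, majority_share)
```

Under these assumptions the do-nothing model comfortably beats the 1/C rate while catching no fraud at all, which is why the baseline has to be chosen with the class frequencies in mind.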


An alternative to the no-information rate is the Kappa statistic, which measures the agreement between two raters (here, the predicted and the observed classes) after correcting for the agreement expected by chance. It takes values between -1 and 1: 1 indicates complete agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance, down to -1 for complete agreement in the opposite direction. Kappa values above roughly 0.3 to 0.5 are often considered acceptable.
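A minimal sketch of Cohen's kappa, computed from a confusion matrix as (observed agreement - chance agreement) / (1 - chance agreement), might look like the following. The function name and both example matrices are hypothetical; the second matrix mirrors a classifier that always predicts the majority class.

```python
def cohens_kappa(matrix):
    """Cohen's kappa from a square confusion matrix (rows: predicted, columns: observed)."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of samples on the diagonal.
    p_observed = sum(matrix[i][i] for i in range(k)) / total
    # Chance agreement: product of the row and column marginals, summed over classes.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical confusion matrix with 70% raw agreement.
kappa_moderate = cohens_kappa([[20, 5], [10, 15]])

# Hypothetical majority-class predictor: 99% raw agreement, nothing flagged.
kappa_majority = cohens_kappa([[0, 0], [10, 990]])

print(kappa_moderate, kappa_majority)
```

In this sketch the majority-class predictor scores a kappa of zero despite its high raw agreement, because all of that agreement is what chance alone would produce given the marginals.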


Sensitivity and Specificity

Sensitivity (a.k.a. True Positive Rate, TPR, or Recall): measures the proportion of positives that are correctly identified as such (e.g., the percentage of sick people who are correctly identified as having the condition).

Specificity (a.k.a. True Negative Rate, TNR): measures the proportion of negatives that are correctly identified as such (e.g., the percentage of healthy people who are correctly identified as not having the condition).
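The two definitions can be sketched in code by counting true/false positives and negatives directly. The function name, the `positive` argument, and the ten-patient example are assumptions made up for illustration.

```python
def sensitivity_specificity(observed, predicted, positive):
    """Return (sensitivity, specificity), treating `positive` as the event class."""
    tp = fn = tn = fp = 0
    for o, p in zip(observed, predicted):
        if o == positive:
            if p == positive:
                tp += 1  # sick, correctly identified
            else:
                fn += 1  # sick, missed
        else:
            if p == positive:
                fp += 1  # healthy, wrongly flagged
            else:
                tn += 1  # healthy, correctly cleared
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical data: 4 sick and 6 healthy patients.
observed = ["sick"] * 4 + ["healthy"] * 6
predicted = ["sick", "sick", "sick", "healthy",
             "healthy", "healthy", "healthy", "healthy", "healthy", "sick"]

sensitivity, specificity = sensitivity_specificity(observed, predicted, "sick")
print(sensitivity, specificity)
```

Note that this sketch divides by the number of positives and negatives, so it assumes both classes appear at least once in `observed`.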


Youden's Index:

J = Sensitivity + Specificity - 1

Its value ranges from -1 to 1. It is zero when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e. the test is useless. A value of 1 indicates that there are no false positives or false negatives, i.e. the test is perfect. Negative values indicate a test that performs worse than chance. The index gives equal weight to false positive and false negative values, so all tests with the same value of the index give the same proportion of total misclassified results.
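The boundary cases described above can be checked with a few lines of code. The function name `youden_index` and the input values are illustrative assumptions only.

```python
def youden_index(sensitivity, specificity):
    """J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1

# No false positives and no false negatives.
perfect = youden_index(1.0, 1.0)

# Positive rate is 0.75 among the diseased (sensitivity) and also
# 0.75 among the healthy (1 - specificity), so the test tells us nothing.
useless = youden_index(0.75, 0.25)

# Every sample is misclassified.
inverted = youden_index(0.0, 0.0)

print(perfect, useless, inverted)
```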


This index, along with other measures such as the F-score, is used in conjunction with ROC curves to identify the best cut-off threshold on predicted probabilities for assigning classes.
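One way to sketch this threshold selection is to evaluate J at a handful of candidate cut-offs and keep the one with the largest value. The function name, the candidate cut-offs, the scores, and the rule "predict the event when score >= cut-off" are all assumptions for this example.

```python
def best_cutoff_by_youden(scores, observed, cutoffs):
    """Return (cutoff, J) for the candidate cut-off with the largest Youden's J.

    `observed` holds 1 for the event and 0 otherwise; a sample is predicted
    as the event when its score is at or above the cut-off.
    """
    positives = sum(observed)
    negatives = len(observed) - positives
    best = None
    for cutoff in cutoffs:
        tp = sum(1 for s, y in zip(scores, observed) if y == 1 and s >= cutoff)
        tn = sum(1 for s, y in zip(scores, observed) if y == 0 and s < cutoff)
        j = tp / positives + tn / negatives - 1
        if best is None or j > best[1]:  # ties keep the first cut-off seen
            best = (cutoff, j)
    return best

# Hypothetical predicted probabilities and observed outcomes.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
observed = [0, 0, 0, 1, 0, 1, 1, 1]

cutoff, j = best_cutoff_by_youden(scores, observed, cutoffs=[0.35, 0.5, 0.75])
print(cutoff, j)
```

Because J weights false positives and false negatives equally, a cost-sensitive problem such as the spam example may call for a different criterion.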


Calibration Plot:

One approach to creating a calibration plot is to partition the predicted probabilities of the test samples into bins, then calculate the ratio of observed events among the samples that fall in each bin. Finally, plot the midpoint of each bin against the ratio of events among the samples in that bin; for well-calibrated probabilities the points should lie along a 45-degree line.
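The binning procedure above can be sketched as follows. Equal-width bins, the choice of five bins, the function name, and the toy data are assumptions for illustration; plotting is left out so the example stays dependency-free.

```python
def calibration_points(probabilities, outcomes, n_bins=5):
    """Return (bin midpoint, observed event rate, sample count) per non-empty bin.

    `outcomes` holds 1 where the event occurred and 0 otherwise.
    """
    bins = [[0, 0] for _ in range(n_bins)]  # [events, samples] per bin
    for p, y in zip(probabilities, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index][0] += y
        bins[index][1] += 1
    points = []
    for i, (events, count) in enumerate(bins):
        if count:  # skip empty bins, which have no defined event rate
            midpoint = (i + 0.5) / n_bins
            points.append((midpoint, events / count, count))
    return points

# Hypothetical predicted probabilities and observed outcomes.
probabilities = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
outcomes = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1]

points = calibration_points(probabilities, outcomes)
for midpoint, rate, count in points:
    print(f"bin midpoint {midpoint:.1f}: event rate {rate:.2f} (n={count})")
```

Plotting the first element of each tuple against the second gives the calibration curve; with only a few samples per bin, as here, the event rates are noisy and should not be over-interpreted.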


Receiver Operating Characteristic (ROC) Curves:

ROC curves were designed as a general method that, given a collection of continuous data points, determines an effective threshold such that values above the threshold are indicative of a specific event.
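As a sketch of how the curve is traced, the code below sweeps the threshold over every distinct score and records the false positive rate (1 - specificity) and true positive rate (sensitivity) at each step. The function name and the six-sample dataset are hypothetical.

```python
def roc_points(scores, observed):
    """(FPR, TPR) pairs obtained by using each distinct score as the threshold.

    `observed` holds 1 for the event and 0 otherwise; a sample is predicted
    as the event when its score is at or above the threshold.
    """
    positives = sum(observed)
    negatives = len(observed) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, observed) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, observed) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical scores and observed outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
observed = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, observed)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Each point corresponds to one candidate threshold, so picking a cut-off amounts to picking a point on this curve.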

