Measuring performance of classifiers

walter1990

Article link: https://blog.csdn.net/suichen1/article/details/50718754

A confusion matrix is a common way to describe the performance of a classifier. It is a simple cross-tabulation of predicted classes against observed classes.
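As a minimal sketch of that cross-tabulation, the snippet below counts (predicted, observed) pairs using only the standard library. The function name `confusion_matrix`, the row/column orientation, and the toy spam labels are illustrative assumptions, not something the original text specifies.

```python
from collections import Counter

def confusion_matrix(observed, predicted, labels):
    """Cross-tabulate classes: rows are predicted, columns are observed."""
    counts = Counter(zip(predicted, observed))
    return [[counts[(p, o)] for o in labels] for p in labels]

# Toy data, invented for illustration only.
observed  = ["spam", "good", "good", "spam", "good", "good"]
predicted = ["spam", "good", "spam", "good", "good", "good"]

matrix = confusion_matrix(observed, predicted, labels=["spam", "good"])
for label, row in zip(["spam", "good"], matrix):
    print(f"predicted {label}: {row}")
```

The diagonal cells hold the correctly classified samples; every off-diagonal cell is one kind of misclassification.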

Overall Accuracy and Kappa Statistic

The simplest measure of a model's accuracy is Overall Accuracy: the percentage of samples for which the model predicted the correct class.

Although Overall Accuracy is simple and interpretable, there are at least two major problems with this measure:

The Overall Accuracy measure takes no account of the natural frequencies of the classes. For example, if we build a model to classify credit card transactions as fraudulent or good, we can probably achieve very high Overall Accuracy simply by predicting every transaction as good, because fraudulent transactions account for only a small part of all transactions.

The Overall Accuracy measure treats all classes, and therefore all kinds of error, the same. Consider classifying emails as Spam or Good. Classifying a good email as spam and deleting it hurts the user experience and costs more than misclassifying a spam email as good. Overall Accuracy does not distinguish between a model that misclassifies good emails and one that misclassifies spam emails.

Overall Accuracy still helps us check whether a model meets the minimum requirement: it must be higher than the no-information rate for the model to be considered at all. In simple binary classification, the no-information rate based on pure randomness is 50%: if we assign classes to observations at random, with a large enough sample we expect about 50% accuracy. So any model with overall accuracy below 50% in binary classification, or below 1/C when there are C classes, is unacceptable. When classes are imbalanced, as in the fraud example above, a stricter baseline is the proportion of the largest class, which is the accuracy obtained by always predicting that class.
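The fraud scenario can be sketched in a few lines. The 2% fraud rate, the helper names, and the majority-class baseline are assumptions chosen for illustration; the 1/C baseline is the one described in the text.

```python
from collections import Counter

def overall_accuracy(observed, predicted):
    """Fraction of samples whose predicted class matches the observed class."""
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)

def majority_class_rate(observed):
    """Accuracy of always predicting the most frequent class (assumed baseline)."""
    return max(Counter(observed).values()) / len(observed)

# Hypothetical data: 2 fraudulent transactions out of 100.
observed = ["fraud"] * 2 + ["good"] * 98
always_good = ["good"] * 100          # a "model" that never flags fraud

acc = overall_accuracy(observed, always_good)
nir_random = 1 / len(set(observed))   # 1/C baseline from pure randomness
nir_majority = majority_class_rate(observed)

print(acc, nir_random, nir_majority)  # 0.98 0.5 0.98
```

The useless model clears the 1/C baseline easily but does not beat the majority-class baseline, which is why overall accuracy alone says little on imbalanced data.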

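An alternative to comparing against the no-information rate is the Kappa statistic. It was originally designed to measure agreement between two raters; here the raters are the model's predictions and the observed classes, and the statistic corrects the observed accuracy for the accuracy expected by chance. It takes values between -1 and 1. One indicates complete agreement, zero indicates agreement no better than chance, and negative values indicate that predictions run in the opposite direction of the observed classes. Depending on the context, Kappa values of about 0.3 to 0.5 are considered to indicate reasonable agreement.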
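A minimal sketch of Cohen's Kappa, assuming the usual definition (observed accuracy minus chance-expected accuracy, divided by one minus chance-expected accuracy), where the expected accuracy comes from the marginal class frequencies. The function name and toy data are illustrative.

```python
from collections import Counter

def cohen_kappa(observed, predicted):
    """Kappa = (p_o - p_e) / (1 - p_e), with p_e from the marginal frequencies.

    Undefined (division by zero) when p_e == 1, i.e. when both sequences
    contain a single, identical class.
    """
    n = len(observed)
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical imbalanced data: 98% accuracy from always predicting "good".
observed = ["fraud"] * 2 + ["good"] * 98
always_good = ["good"] * 100
kappa_trivial = cohen_kappa(observed, always_good)

# Perfect predictions on a balanced toy set.
kappa_perfect = cohen_kappa(["a", "a", "b", "b"], ["a", "a", "b", "b"])

print(kappa_trivial, kappa_perfect)  # 0.0 1.0
```

The always-good model scores 98% accuracy yet has a Kappa of zero, because all of its agreement is what chance alone would produce.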

Sensitivity and Specificity

Sensitivity (a.k.a. True Positive Rate, TPR, or Recall): measures the proportion of positives that are correctly identified as such (e.g., the percentage of sick people who are correctly identified as having the condition).

Specificity (a.k.a. True Negative Rate, TNR): measures the proportion of negatives that are correctly identified as such (e.g., the percentage of healthy people who are correctly identified as not having the condition).
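Both rates can be computed directly from the four cells of a binary confusion matrix. The sketch below uses invented patient data; the function name and labels are assumptions for illustration.

```python
def sensitivity_specificity(observed, predicted, positive):
    """Return (sensitivity, specificity) for the given positive class."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == positive and p == positive)
    fn = sum(1 for o, p in pairs if o == positive and p != positive)
    tn = sum(1 for o, p in pairs if o != positive and p != positive)
    fp = sum(1 for o, p in pairs if o != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical test results: 4 sick and 6 healthy people.
observed  = ["sick"] * 4 + ["healthy"] * 6
predicted = ["sick", "sick", "sick", "healthy"] + ["healthy"] * 5 + ["sick"]

sens, spec = sensitivity_specificity(observed, predicted, positive="sick")
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Here 3 of the 4 sick people are detected (sensitivity 0.75) and 5 of the 6 healthy people are cleared (specificity about 0.833).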

Youden's Index:

J = Sensitivity+Specificity - 1

Its value ranges from -1 to 1, although for any test that does at least as well as chance it lies between 0 and 1. It is zero when a diagnostic test gives the same proportion of positive results for groups with and without the disease, i.e. the test is useless. A value of 1 indicates that there are no false positives or false negatives, i.e. the test is perfect. The index gives equal weight to false positive and false negative values, so all tests with the same value of the index give the same proportion of total misclassified results.

This index, like other measures such as the F-score, is used in conjunction with ROC curves to identify the best cut-off threshold on predicted probabilities for assigning classes.
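One way to pick a cut-off with Youden's index is to evaluate J at every distinct predicted score and keep the threshold with the largest value. This is a sketch under assumptions: labels are coded 1 for the event and 0 otherwise, a sample is called positive when its score is at or above the threshold, and the scores are invented.

```python
def youden_j(observed, scores, threshold):
    """J = sensitivity + specificity - 1 at a given cut-off (labels are 0/1)."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    tn = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    positives = sum(observed)
    negatives = len(observed) - positives
    return tp / positives + tn / negatives - 1

def best_threshold(observed, scores):
    """Try each distinct score as a cut-off and return the one maximizing J."""
    return max(sorted(set(scores)), key=lambda t: youden_j(observed, scores, t))

# Hypothetical labels and predicted probabilities.
observed = [0, 0, 0, 0, 1, 1, 1, 1]
scores   = [0.1, 0.2, 0.4, 0.6, 0.35, 0.7, 0.8, 0.9]

cutoff = best_threshold(observed, scores)
print(cutoff, youden_j(observed, scores, cutoff))  # 0.7 0.75
```

Because J weights false positives and false negatives equally, a different criterion would be needed when one kind of error costs more than the other, as in the spam example.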

Calibration Plot:

One way to create a calibration plot is to partition the predicted probabilities of the test samples into bins, then calculate the ratio of observed events among the samples that fall in each bin. Finally, plot the midpoint of each bin against the ratio of events in that bin; for well-calibrated probabilities the points should fall along a 45-degree line.
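The binning step can be sketched as follows. Equal-width bins, the choice of five bins, and the toy data are assumptions; the function returns the points to plot rather than drawing them.

```python
def calibration_table(observed, probabilities, n_bins=5):
    """Return (bin midpoint, observed event rate, sample count) per non-empty bin.

    `observed` holds 0/1 outcomes; bins are equal-width over [0, 1].
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(observed, probabilities):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append(y)
    table = []
    for i, members in enumerate(bins):
        if members:
            midpoint = (i + 0.5) / n_bins
            table.append((midpoint, sum(members) / len(members), len(members)))
    return table

# Hypothetical predicted probabilities and observed outcomes.
probabilities = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
observed      = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

table = calibration_table(observed, probabilities)
for midpoint, rate, count in table:
    print(f"bin midpoint {midpoint:.1f}: event rate {rate:.2f} (n={count})")
```

Plotting midpoint against event rate gives the calibration plot; the further a point sits from the diagonal, the less the predicted probability in that bin can be read as an actual event frequency.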

Receiver Operating Characteristic (ROC) Curves:

ROC curves were designed as a general method that, given a collection of continuous data points, determines an effective threshold such that values above the threshold are indicative of a specific event.
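A ROC curve is traced by sweeping the threshold over the scores and recording, at each cut-off, the false positive rate (1 - specificity) against the true positive rate (sensitivity). The sketch below assumes 0/1 labels, treats scores at or above the threshold as positive, and uses invented data.

```python
def roc_points(observed, scores):
    """Return (false positive rate, true positive rate) pairs, one per cut-off.

    Thresholds are the distinct scores, swept from highest to lowest, so the
    curve starts at (0, 0) and ends at (1, 1).
    """
    positives = sum(observed)
    negatives = len(observed) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(observed, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(observed, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical labels and predicted probabilities.
observed = [0, 0, 0, 0, 1, 1, 1, 1]
scores   = [0.1, 0.2, 0.4, 0.6, 0.35, 0.7, 0.8, 0.9]

points = roc_points(observed, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f} TPR={tpr:.2f}")
```

Each point corresponds to one candidate threshold, so choosing a point on the curve is the same as choosing a cut-off; a point near the top-left corner combines high sensitivity with high specificity.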
