Evaluating Classification Problems (Binary & Multiclass)

Recall, Precision, and F-measure

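For a binary classification problem, each example can be placed in one of four groups according to its true class and the class predicted by the classifier:

True positive (TP): the true class is positive and the predicted class is positive.
False positive (FP): the true class is negative and the predicted class is positive.
False negative (FN): the true class is positive and the predicted class is negative.
True negative (TN): the true class is negative and the predicted class is negative.
From these four counts we can build the confusion matrix shown in the table below.

                    Predicted positive    Predicted negative
Actual positive     TP                    FN
Actual negative     FP                    TN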
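
The four counts can be tallied directly from paired label lists. The following is a minimal sketch, not a reference implementation; the function name `confusion_counts`, the `positive` argument, and the sample labels are all hypothetical and chosen only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary problem (illustrative helper)."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1   # predicted positive, actually positive
            else:
                fp += 1   # predicted positive, actually negative
        else:
            if t == positive:
                fn += 1   # predicted negative, actually positive
            else:
                tn += 1   # predicted negative, actually negative
    return tp, fp, fn, tn


# Made-up labels, 1 = positive class, 0 = negative class.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)  # 3 1 1 3
```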

Precision (P):
$$P = \frac{TP}{TP + FP}$$
Recall (R):
$$R = \frac{TP}{TP + FN}$$
$F_1$ score:
$$F_1 = \frac{2 \cdot P \cdot R}{P + R}$$
General form of $F_1$:
$$F_\beta = \frac{(1 + \beta^2) \cdot P \cdot R}{(\beta^2 \cdot P) + R}$$
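
As a quick numeric check of the formulas above, here is a small sketch that computes P, R, $F_1$, and $F_\beta$ from raw counts. The function names and the counts are hypothetical, and returning 0.0 when a denominator is zero is an assumption of this sketch, not something the formulas themselves prescribe.

```python
def precision(tp, fp):
    # P = TP / (TP + FP); 0.0 on an empty denominator is an assumed convention.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    # R = TP / (TP + FN); same assumed convention for an empty denominator.
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_beta(p, r, beta=1.0):
    # F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); beta = 1 gives F1.
    b2 = beta * beta
    denom = b2 * p + r
    return (1 + b2) * p * r / denom if denom else 0.0


# Made-up counts for illustration.
tp, fp, fn = 3, 1, 2
p = precision(tp, fp)        # 3 / 4
r = recall(tp, fn)           # 3 / 5
f1 = f_beta(p, r)            # beta = 1
f2 = f_beta(p, r, beta=2.0)  # beta > 1 puts more weight on recall
print(p, r, f1, f2)
```

Because recall is lower than precision in this example, weighting recall more heavily (beta = 2) gives a lower score than $F_1$.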

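Macro-average and micro-average
With a single binary confusion matrix, the metrics above are sufficient and uncontroversial. When the metrics have to be combined across n binary confusion matrices, however, macro-averaging and micro-averaging come into play. Both are widely used to evaluate text classifiers.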

When dealing with multiple classes there are two possible ways of averaging these measures (i.e. recall, precision, F1-measure), namely, macro-average and micro-average. The macro-average weights all the classes equally, regardless of how many documents belong to each. The micro-average weights all the documents equally, thus favouring the performance on common classes. Different classifiers will perform differently on common and rare categories. Learning algorithms are trained more often on more populated classes, thus risking local over-fitting.

Macro-averaging first computes the metric for each class separately and then takes the arithmetic mean over all classes. Compared with the micro-average, the macro-average is influenced more strongly by small classes.
$$Macro\_P = \frac{1}{n} \sum_{i=1}^{n} P_i$$
$$Macro\_R = \frac{1}{n} \sum_{i=1}^{n} R_i$$
$$Macro\_F = \frac{1}{n} \sum_{i=1}^{n} F_i$$
$$Macro\_F = \frac{2 \cdot Macro\_P \cdot Macro\_R}{Macro\_P + Macro\_R}$$
where $P_i = \frac{TP_i}{TP_i + FP_i}$ and $R_i = \frac{TP_i}{TP_i + FN_i}$.
Note that two formulas are given for the macro-averaged F value: the mean of the per-class F values, and the harmonic mean of Macro_P and Macro_R. Both are used in practice, but they generally produce different numbers, so it is worth stating which one is being reported.
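
To make the difference between the two macro-F formulas concrete, the sketch below evaluates both on the same per-class counts. The class names and counts are made up, each class is treated as its own binary confusion matrix, and the helper `prf` is hypothetical.

```python
def prf(tp, fp, fn):
    """Per-class precision, recall and F1 (0.0 on empty denominators, by assumption)."""
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f


# Hypothetical per-class counts: class -> (TP_i, FP_i, FN_i).
counts = {"A": (8, 2, 2), "B": (1, 1, 3)}

per_class = [prf(*c) for c in counts.values()]
n = len(per_class)

macro_p = sum(p for p, _, _ in per_class) / n
macro_r = sum(r for _, r, _ in per_class) / n

# Formula 1: arithmetic mean of the per-class F values.
macro_f_mean = sum(f for _, _, f in per_class) / n
# Formula 2: harmonic mean of Macro_P and Macro_R.
macro_f_harm = 2 * macro_p * macro_r / (macro_p + macro_r)

print(macro_p, macro_r, macro_f_mean, macro_f_harm)
```

With these counts the first formula gives roughly 0.567 and the second roughly 0.581, so the two definitions are not interchangeable when comparing reported results.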
Micro-averaging pools every instance in the data set, regardless of class, into one global confusion matrix and then computes the metrics from it.
$$Micro\_P = \frac{\sum_{i=1}^{n} TP_i}{\sum_{i=1}^{n} TP_i + \sum_{i=1}^{n} FP_i}$$
$$Micro\_R = \frac{\sum_{i=1}^{n} TP_i}{\sum_{i=1}^{n} TP_i + \sum_{i=1}^{n} FN_i}$$
$$Micro\_F = \frac{2 \cdot Micro\_P \cdot Micro\_R}{Micro\_P + Micro\_R}$$
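
A minimal sketch of micro-averaging: sum the per-class counts first, then apply the binary formulas once to the totals. The class names and counts below are hypothetical.

```python
# Hypothetical per-class counts: class -> (TP_i, FP_i, FN_i).
counts = {"A": (8, 2, 2), "B": (1, 1, 3)}

# Pool the counts over all classes into one global tally.
total_tp = sum(tp for tp, _, _ in counts.values())
total_fp = sum(fp for _, fp, _ in counts.values())
total_fn = sum(fn for _, _, fn in counts.values())

micro_p = total_tp / (total_tp + total_fp)
micro_r = total_tp / (total_tp + total_fn)
micro_f = 2 * micro_p * micro_r / (micro_p + micro_r)

print(micro_p, micro_r, micro_f)
```

Here the large class A contributes 8 of the 9 pooled true positives, which illustrates why the micro-average tends to track performance on common classes.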

“macro” simply calculates the mean of the binary metrics, giving equal weight to each class. In problems where infrequent classes are nonetheless important, macro-averaging may be a means of highlighting their performance. On the other hand, the assumption that all classes are equally important is often untrue, such that macro-averaging will over-emphasize the typically low performance on an infrequent class.
“weighted” accounts for class imbalance by computing the average of binary metrics in which each class’s score is weighted by its presence in the true data sample.
“micro” gives each sample-class pair an equal contribution to the overall metric (except as a result of sample-weight). Rather than summing the metric per class, this sums the dividends and divisors that make up the per-class metrics to calculate an overall quotient. Micro-averaging may be preferred in multilabel settings, including multiclass classification where a majority class is to be ignored.
“samples” applies only to multilabel problems. It does not calculate a per-class measure, instead calculating the metric over the true and predicted classes for each sample in the evaluation data, and returning their (sample_weight-weighted) average.
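
The “weighted” option can be illustrated next to “macro” with a small pure-Python sketch. It assumes that a class's "presence in the true data sample" is its number of true instances, TP_i + FN_i; the class names, counts, and the helper `f1` are hypothetical, and this is not the code of any particular library.

```python
def f1(tp, fp, fn):
    # F1 written directly in counts: 2TP / (2TP + FP + FN),
    # which is algebraically equal to 2PR / (P + R).
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical per-class counts: class -> (TP_i, FP_i, FN_i).
counts = {"A": (8, 2, 2), "B": (1, 1, 3)}

scores = {c: f1(*v) for c, v in counts.items()}
# Support = number of true instances of the class (assumed weighting).
support = {c: tp + fn for c, (tp, _, fn) in counts.items()}

macro_f = sum(scores.values()) / len(scores)
weighted_f = sum(scores[c] * support[c] for c in counts) / sum(support.values())

print(macro_f, weighted_f)
```

Class A has 10 true instances and class B only 4, so the weighted score (about 0.667) sits closer to A's F1 of 0.8 than the macro score (about 0.567) does.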
