The most important step is to tell the following four quantities apart (see https://en.wikipedia.org/wiki/Sensitivity_and_specificity):
- True positive (TP): e.g., Sick people correctly identified as sick
- False positive (FP): e.g., Healthy people incorrectly identified as sick
- True negative (TN): e.g., Healthy people correctly identified as healthy
- False negative (FN): e.g., Sick people incorrectly identified as healthy
In general, Positive = identified and negative = rejected. Therefore:
- True positive = correctly identified
- False positive = incorrectly identified
- True negative = correctly rejected
- False negative = incorrectly rejected
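The four outcomes above can be counted directly from paired labels. The following is a minimal sketch, assuming labels are booleans where True means "positive" (sick); the function name `confusion_counts` and the sample lists are hypothetical and exist only for illustration.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for paired boolean labels (True = positive)."""
    counts = Counter()
    for is_positive, flagged in zip(actual, predicted):
        if flagged:
            # Identified as positive: correct (TP) or incorrect (FP).
            counts["TP" if is_positive else "FP"] += 1
        else:
            # Rejected as negative: incorrect (FN) or correct (TN).
            counts["FN" if is_positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "TN", "FN")}


# Hypothetical sample data, not taken from the notes.
actual = [True, True, False, False, True]
predicted = [True, False, False, True, True]
print(confusion_counts(actual, predicted))
# {'TP': 2, 'FP': 1, 'TN': 1, 'FN': 1}
```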
It follows that:
- condition positive (P): the number of real positive cases in the data
- condition negative (N): the number of real negative cases in the data
That is, P = TP + FN and N = TN + FP.
Accuracy = (TP + TN) / (P + N) = (TP + TN) / (TP + FN + TN + FP)
Precision = TP / (TP + FP)
Recall = TP / P = TP / (TP + FN)
F1-Score = 2 * (Precision * Recall) / (Precision + Recall) = 2 * TP / (2 * TP + FP + FN)
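The four formulas translate into a few lines of code. This is a sketch rather than a reference implementation: the function name `classification_metrics` is hypothetical, and returning 0.0 when a denominator is zero is an assumed convention, not something the definitions prescribe.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""

    def ratio(numerator, denominator):
        # Assumption: report 0.0 instead of raising when the denominator is 0.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fn + tn + fp),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        # Count-based form of F1, equal to 2 * P * R / (P + R).
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical counts chosen only to exercise the function.
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Using the count-based form of F1 avoids a second division by zero when precision and recall are both 0.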
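Example:
1000 people are tested; 400 are diagnosed as sick and 600 as healthy.
Of the 400 diagnosed as sick, 300 are actually sick (TP) and 100 are healthy people misdiagnosed as sick (FP);
of the 600 judged healthy, 500 are actually healthy (TN) and 100 are actually sick (FN).
That is: TP = 300, FP = 100, TN = 500, FN = 100.
Note: there are actually 400 sick people (P = TP + FN = 300 + 100) and 600 healthy people (N = TN + FP = 500 + 100).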
Therefore:
Accuracy = (TP + TN) / (TP + FN + TN + FP) = 800 / 1000 = 80%
Precision = TP / (TP + FP) = 300 / (300 + 100) = 75%
Recall = TP / (TP + FN) = 300 / (300 + 100) = 75%
F1-score = 2 * (Precision * Recall) / (Precision + Recall) = 2 * (0.75 * 0.75) / (0.75 + 0.75) = 75%
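As a quick sanity check on the arithmetic, the example counts (TP = 300, FP = 100, TN = 500, FN = 100) can be run through exact fractions. The variable names below are illustrative; the sketch also confirms that the rate-based and count-based forms of F1 agree for these counts.

```python
from fractions import Fraction

# Counts from the worked example.
tp, fp, tn, fn = 300, 100, 500, 100

accuracy = Fraction(tp + tn, tp + fn + tn + fp)
precision = Fraction(tp, tp + fp)
recall = Fraction(tp, tp + fn)

# Two equivalent ways to write F1.
f1_from_rates = 2 * precision * recall / (precision + recall)
f1_from_counts = Fraction(2 * tp, 2 * tp + fp + fn)
assert f1_from_rates == f1_from_counts

print(float(accuracy), float(precision), float(recall), float(f1_from_counts))
# 0.8 0.75 0.75 0.75
```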