Statistical Metrics
Cohen’s Kappa coefficient
Purpose: Cohen's Kappa coefficient measures the degree of agreement between two raters on a binary classification of the same items.
In research, the results of some binary classification tasks require human evaluation as the ground truth. In that case, two authors independently sample the experimental results and judge whether each item is classified correctly; the higher the Cohen's Kappa coefficient, the more strongly they agree on the results.
Formula for Cohen's kappa coefficient
k = (p_0 - p_e) / (1 - p_e)
where:
- p_0 is the relative observed agreement among raters: the fraction of all ratings on which the two raters give the same label.
- p_e is the hypothetical probability of chance agreement: the probability, estimated from each rater's observed label frequencies, that the two raters agree by chance.
An example:
A museum has 100 items awaiting exhibition. Two curators independently classify them: yes means exhibit, no means do not exhibit.
Their ratings are as follows:
rater1 \ rater2 | Yes | No |
---|---|---|
Yes | 30 | 20 |
No | 15 | 35 |
The calculation goes as follows:
- p_0 = (30 + 35) / 100 = 0.65 (the fraction of items on which they agree)
- p_e = 0.5 × 0.45 + 0.5 × 0.55 = 0.5 (from the marginals: rater1 says yes 50/100 of the time and rater2 says yes 45/100 of the time, so both say yes with probability 0.5 × 0.45; likewise both say no with probability 0.5 × 0.55)
The final Cohen's Kappa coefficient is
(0.65 - 0.5) / (1 - 0.5) = 0.3
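The worked example above can be sketched as a small function. `cohens_kappa` is a hypothetical helper name, not from the original note; it takes the four cell counts of the 2×2 contingency table.

```python
def cohens_kappa(yes_yes, yes_no, no_yes, no_no):
    """Cohen's kappa for two raters on a binary task, from the four cell counts.

    yes_yes: both raters said yes; yes_no: rater1 yes, rater2 no; etc.
    """
    n = yes_yes + yes_no + no_yes + no_no
    # p_0: observed agreement (both yes or both no)
    p0 = (yes_yes + no_no) / n
    # Marginal probability of each rater saying yes
    r1_yes = (yes_yes + yes_no) / n
    r2_yes = (yes_yes + no_yes) / n
    # p_e: chance agreement from the marginals
    pe = r1_yes * r2_yes + (1 - r1_yes) * (1 - r2_yes)
    return (p0 - pe) / (1 - pe)

# Museum example from the table above: p_0 = 0.65, p_e = 0.5, kappa ≈ 0.3
print(cohens_kappa(30, 20, 15, 35))
```

In practice, scikit-learn's `sklearn.metrics.cohen_kappa_score(y1, y2)` computes the same statistic directly from the two raters' label lists.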
Reference scores for Cohen's Kappa coefficient
score | interpretation |
---|---|
≤ 0 | no agreement |
(0, 0.20] | none to slight |
(0.20, 0.40] | fair |
(0.40, 0.60] | moderate |
(0.60, 0.80] | substantial |
(0.80, 1.00] | almost perfect |
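The bands in the table above can be turned into a lookup. `interpret_kappa` is a hypothetical helper name introduced here for illustration.

```python
def interpret_kappa(k):
    """Map a kappa score to the interpretation bands in the table above."""
    if k <= 0:
        return "no agreement"
    bands = [
        (0.20, "none to slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if k <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")

print(interpret_kappa(0.3))  # the museum example falls in the "fair" band
```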