Cumulative Gain (CG)
$$CG_{p} = \sum_{i=1}^{p} rel_{i}$$
where $rel_{i}$ is the graded relevance of the result at position $i$.
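As a minimal sketch of the CG formula, the snippet below sums the relevance labels of the top p results. The function name `cumulative_gain` and the sample labels are illustrative assumptions, not part of the original text.

```python
def cumulative_gain(relevances, p):
    """Sum the graded relevance of the top-p results (rank order is ignored)."""
    return sum(relevances[:p])


# Hypothetical graded relevance labels for a ranked list of six results.
ranked_relevances = [3, 2, 3, 0, 1, 2]

cg_at_6 = cumulative_gain(ranked_relevances, 6)  # 3 + 2 + 3 + 0 + 1 + 2 = 11
```

Because CG is a plain sum, reordering the top p results leaves the value unchanged, which is the motivation for the position discount introduced next.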
Discounted Cumulative Gain (DCG)
$$DCG_{p} = \sum_{i=1}^{p} \frac{rel_{i}}{\log_{2}(i+1)} = rel_{1} + \sum_{i=2}^{p} \frac{rel_{i}}{\log_{2}(i+1)}$$
The first term reduces to $rel_{1}$ because $\log_{2}(1+1) = 1$.
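The discount can be sketched in a few lines of Python. The function name `dcg` and the sample labels are assumptions chosen for illustration; positions are 1-based, as in the formula.

```python
import math


def dcg(relevances, p):
    """Discounted cumulative gain of the top-p results with a log2 position discount."""
    return sum(
        rel / math.log2(i + 1)
        for i, rel in enumerate(relevances[:p], start=1)  # i is the 1-based rank
    )


# Hypothetical graded relevance labels for a ranked list of six results.
ranked_relevances = [3, 2, 3, 0, 1, 2]
dcg_at_6 = dcg(ranked_relevances, 6)
```

Unlike CG, this value depends on the order: moving a highly relevant result to a lower rank divides its gain by a larger logarithm.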
A commonly used alternative formula gives more weight to relevance:
$$DCG_{p} = \sum_{i=1}^{p} \frac{2^{rel_{i}} - 1}{\log_{2}(i+1)}$$
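A small sketch of this exponential-gain variant follows. The name `dcg_exp` and the sample labels are hypothetical; only the numerator differs from the linear form.

```python
import math


def dcg_exp(relevances, p):
    """DCG with exponential gain (2**rel - 1), which emphasises highly relevant results."""
    return sum(
        (2 ** rel - 1) / math.log2(i + 1)
        for i, rel in enumerate(relevances[:p], start=1)  # i is the 1-based rank
    )


# Hypothetical graded relevance labels for a ranked list of six results.
ranked_relevances = [3, 2, 3, 0, 1, 2]
dcg_exp_at_6 = dcg_exp(ranked_relevances, 6)
```

With binary labels (0 or 1) the two forms coincide, since 2^0 - 1 = 0 and 2^1 - 1 = 1. The difference only shows up with graded labels, where a label of 3 contributes a gain of 7 rather than 3.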
Normalized Discounted Cumulative Gain (NDCG)
$$NDCG_{p} = \frac{DCG_{p}}{IDCG_{p}}$$
where IDCG is the ideal DCG:
$$IDCG_{p} = \sum_{i=1}^{|REL_{p}|} \frac{2^{rel_{i}} - 1}{\log_{2}(i+1)}$$
where $REL_{p}$ denotes the list of the $p$ most relevant documents in the corpus, ordered by relevance.
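The normalization can be sketched as follows. This assumes the exponential-gain DCG is used for both numerator and denominator, and that the ideal list is obtained by sorting all judged relevance labels in descending order. The names `dcg_exp`, `ndcg`, and `all_relevances` are illustrative, not taken from the original.

```python
import math


def dcg_exp(relevances, p):
    """DCG with exponential gain (2**rel - 1) over the top-p results."""
    return sum(
        (2 ** rel - 1) / math.log2(i + 1)
        for i, rel in enumerate(relevances[:p], start=1)  # i is the 1-based rank
    )


def ndcg(ranked_relevances, all_relevances, p):
    """NDCG@p: DCG of the actual ranking divided by the DCG of the ideal ranking."""
    # Assumption: the ideal ranking lists the judged documents by descending relevance.
    ideal = sorted(all_relevances, reverse=True)
    idcg = dcg_exp(ideal, p)
    if idcg == 0:
        return 0.0  # no relevant documents at all; define NDCG as 0 to avoid dividing by zero
    return dcg_exp(ranked_relevances, p) / idcg


# Hypothetical labels: the ranking returned by a system, judged on a 0-3 scale.
ranked_relevances = [3, 2, 3, 0, 1, 2]
score = ndcg(ranked_relevances, ranked_relevances, 6)
```

A ranking that already lists documents in descending relevance scores exactly 1.0, and any other ordering scores lower, which makes NDCG comparable across queries with different numbers of relevant documents.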
Precision (PRE)
The fraction of samples predicted (or retrieved) as positive that are actually positive.
$$PRE = \frac{TP}{TP+FP}$$
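A minimal sketch of computing precision from binary labels is shown below. It assumes the standard definition, true positives divided by all predicted positives (TP + FP). The function name `precision` and the sample labels are hypothetical.

```python
def precision(y_true, y_pred):
    """Precision for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted positive, truly positive
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted positive, truly negative
    if tp + fp == 0:
        return 0.0  # nothing was predicted positive; return 0 by convention (an assumption)
    return tp / (tp + fp)


# Hypothetical ground-truth labels and predictions for five samples.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 1, 0, 0]
pre = precision(y_true, y_pred)  # TP = 2, FP = 1
```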