1. Entropy
$$H(P) = \sum P \cdot \log \frac{1}{P} = -\sum P \cdot \log P$$
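As a quick numerical check of the definition, here is a minimal Python sketch. The function name `entropy`, the natural-log base (so results are in nats), and the two example distributions are illustrative assumptions, not part of the original notes.

```python
import math

def entropy(p):
    """Shannon entropy H(P) = -sum p * log(p), in nats (natural log)."""
    # Terms with p == 0 are skipped, following the convention 0 * log 0 = 0.
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

# Hypothetical example distributions over four outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]
peaked = [0.97, 0.01, 0.01, 0.01]

print(entropy(uniform))  # equals log(4), about 1.386
print(entropy(peaked))   # smaller: the outcome is almost certain
```

The uniform distribution gives the largest entropy for a fixed number of outcomes, while a sharply peaked one gives a value close to zero.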
2. Cross-entropy
Let $P$ be the true distribution and $Q$ the predicted distribution:
$$XE(P, Q) = \sum P \cdot \log \frac{1}{Q} = -\sum P \cdot \log Q$$
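The cross-entropy formula can be sketched in a few lines of Python. The name `cross_entropy`, the natural-log base, and the one-hot label with two candidate predictions are assumptions chosen for illustration. The sketch also assumes that Q is strictly positive wherever P is positive.

```python
import math

def cross_entropy(p, q):
    """Cross-entropy XE(P, Q) = -sum p * log(q), in nats."""
    # Terms with p == 0 are skipped; q must be > 0 wherever p > 0.
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical true distribution (a one-hot label) and two predictions.
p = [1.0, 0.0, 0.0]
q_good = [0.8, 0.1, 0.1]
q_bad = [0.2, 0.4, 0.4]

print(cross_entropy(p, q_good))  # -log(0.8), about 0.223
print(cross_entropy(p, q_bad))   # -log(0.2), about 1.609
```

The prediction that puts more mass on the true outcome has the lower cross-entropy, which is why this quantity is commonly used as a classification loss.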
3. Relative entropy (KL divergence)
Let $P$ be the true distribution and $Q$ the predicted distribution:
$$KL(P \| Q) = \sum P \cdot \log \frac{P}{Q} = \sum P \cdot \log \frac{1}{Q} - \sum P \cdot \log \frac{1}{P} = XE(P, Q) - H(P)$$
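The identity KL = XE - H can be checked numerically. The sketch below is self-contained, so it defines all three quantities itself. The function names and the example distributions `p` and `q` are hypothetical, and natural logarithms are assumed throughout.

```python
import math

def entropy(p):
    """H(P) = -sum p * log(p), skipping zero-probability terms."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """XE(P, Q) = -sum p * log(q); assumes q > 0 wherever p > 0."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL(P || Q) = sum p * log(p / q); assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical true and predicted distributions.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

kl = kl_divergence(p, q)
gap = cross_entropy(p, q) - entropy(p)
print(kl, gap)               # the two values agree up to rounding
print(kl_divergence(p, p))   # 0.0 when the prediction matches the truth
```

Because H(P) does not depend on Q, minimizing the cross-entropy with respect to Q is equivalent to minimizing the KL divergence.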
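- The article 《一篇文章讲清楚交叉熵和KL散度》 ("Cross-Entropy and KL Divergence Explained Clearly in One Article") explains these concepts very vividly.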