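Information content
The information content of a symbol $x$ is defined as the logarithm of the reciprocal of the probability that $x$ occurs. With a base-2 logarithm, the unit is the bit.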
$$I(x)=\log \frac{1}{P(x)}$$
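As a quick illustration of the definition, here is a minimal Python sketch. The function name `information_content` is hypothetical, and the base-2 logarithm is assumed so that the result is in bits.

```python
import math


def information_content(p: float) -> float:
    """Information content, in bits, of a symbol that occurs with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    # I(x) = log2(1 / P(x))
    return math.log2(1.0 / p)


print(information_content(0.5))    # 1.0: a fair coin flip carries one bit
print(information_content(0.125))  # 3.0: a 1-in-8 event carries three bits
```

The rarer the symbol, the larger its information content; a symbol that is certain to occur ($P(x)=1$) carries zero bits.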
Entropy
The average information content of a distribution $P$.
$$H(P)=\sum P(x)\log \frac{1}{P(x)}$$
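The entropy formula can be sketched in a few lines of Python. This assumes a discrete distribution given as a list of probabilities, a base-2 logarithm, and the usual convention that terms with $P(x)=0$ contribute nothing; the name `entropy` is illustrative.

```python
import math


def entropy(p):
    """Entropy in bits of a discrete distribution p (a list of probabilities)."""
    # Terms with P(x) = 0 are skipped: by convention 0 * log(1/0) counts as 0.
    return sum(px * math.log2(1.0 / px) for px in p if px > 0)


print(entropy([0.5, 0.5]))                # 1.0: a fair coin
print(entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0: four equally likely symbols
print(entropy([1.0, 0.0]))                # 0.0: a certain outcome
```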
Cross-entropy
$$H(P,Q)=\sum P(x)\log \frac{1}{Q(x)}$$
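A minimal sketch of the cross-entropy formula, under the same assumptions as before: discrete distributions as equal-length lists and a base-2 logarithm. The function name and the choice to return infinity when $Q(x)=0$ but $P(x)>0$ are illustrative choices, not part of the definition above.

```python
import math


def cross_entropy(p, q):
    """Cross-entropy H(P, Q) in bits for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue          # zero-probability symbols contribute nothing
        if qx == 0:
            return math.inf   # Q assigns no mass to a symbol P can produce
        total += px * math.log2(1.0 / qx)
    return total


P = [0.5, 0.5, 0.0, 0.0]
Q = [0.25, 0.25, 0.25, 0.25]
print(cross_entropy(P, Q))  # 2.0
print(cross_entropy(P, P))  # 1.0, which is the entropy H(P)
```

Note that $H(P,P)=H(P)$: the cross-entropy of a distribution with itself reduces to its entropy.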
Relative entropy (KL divergence)
$$D_{KL}(P||Q)=H(P,Q)-H(P)=\sum P(x)\log \frac{P(x)}{Q(x)}$$
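The KL divergence can be sketched the same way. The example assumes discrete distributions as lists and a base-2 logarithm, and the name `kl_divergence` is hypothetical. It also shows that the divergence is not symmetric in its arguments.

```python
import math


def kl_divergence(p, q):
    """KL divergence D_KL(P || Q) in bits for two discrete distributions."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue          # terms with P(x) = 0 contribute nothing
        if qx == 0:
            return math.inf   # P(x) > 0 where Q(x) = 0: divergence is infinite
        total += px * math.log2(px / qx)
    return total


P = [0.5, 0.5, 0.0, 0.0]
Q = [0.25, 0.25, 0.25, 0.25]
print(kl_divergence(P, Q))  # 1.0
print(kl_divergence(Q, P))  # inf: swapping the arguments changes the result
print(kl_divergence(P, P))  # 0.0
```

For this pair, $H(P,Q)=2$ bits and $H(P)=1$ bit, so the value 1.0 agrees with $D_{KL}(P||Q)=H(P,Q)-H(P)$.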
JS divergence
$$D_{JS}(P||Q)=\frac{1}{2}D_{KL}\left(P||\frac{P+Q}{2}\right)+\frac{1}{2}D_{KL}\left(Q||\frac{P+Q}{2}\right)$$
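A small sketch of the JS divergence built on top of a KL helper. As in the other examples, the function names are illustrative, distributions are plain lists of probabilities, and the logarithm is taken base 2, in which case the JS divergence lies between 0 and 1.

```python
import math


def kl_divergence(p, q):
    """KL divergence D_KL(P || Q) in bits for two discrete distributions."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue
        if qx == 0:
            return math.inf
        total += px * math.log2(px / qx)
    return total


def js_divergence(p, q):
    """JS divergence: average KL divergence of P and Q to their mixture M."""
    m = [(px + qx) / 2 for px, qx in zip(p, q)]  # M = (P + Q) / 2
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


print(js_divergence([1.0, 0.0], [0.0, 1.0]))  # 1.0: disjoint supports
print(js_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0: identical distributions
```

Unlike the KL divergence, the JS divergence is symmetric in $P$ and $Q$, and it stays finite even when the two distributions have disjoint supports, because the mixture $\frac{P+Q}{2}$ is nonzero wherever either distribution is.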