Information Entropy
Information I is a function of an event's probability of occurrence. A priori, we want independent events to satisfy:
$I(A,B) = I(A) + I(B)$
$P(A,B) = P(A) \cdot P(B)$
That is, information should be non-negative and additive over independent events. Since the probabilities multiply while the information adds, the natural choice is a logarithm, so we define $I(W) = -\log P(W)$ (non-negative because $P(W) \le 1$).
Entropy H is the expected value of the information:
$H(X) = E[I(X)] = -\sum_i P(x_i) \log P(x_i)$
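The definition above can be sketched directly in Python; the function name `entropy` and the log base (2, giving units of bits) are my choices for illustration, not from the source:

```python
import math

def entropy(probs, base=2):
    """Shannon entropy H = -sum_i p_i * log(p_i).

    Terms with p_i == 0 are skipped, following the convention
    0 * log(0) = 0 (the limit as p -> 0).
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin maximizes uncertainty: 1 bit of entropy.
print(entropy([0.5, 0.5]))   # 1.0
# A biased coin is more predictable, so its entropy is lower.
print(entropy([0.9, 0.1]))
# A certain outcome carries no information on average.
print(entropy([1.0]))        # 0.0
```

Note how entropy is largest for the uniform distribution and drops to zero as the outcome becomes certain, matching the intuition that rarer events carry more information.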