[Elements of Information Theory]

These notes cover the concept of entropy in information theory: the definition and properties of the entropy of a discrete random variable and its role as a measure of uncertainty, the binary entropy function and how entropy varies with the underlying distribution, joint and conditional entropy and how they describe the information relationship between two or more variables, and finally zero conditional entropy, the case where one variable is completely determined by another. These ideas are widely applied in data compression, communication, and statistical inference.

Entropy

Definition

  • Let $X$ be a discrete random variable with alphabet $\mathcal{X}$ and probability mass function $p(x)=\Pr(X=x),\ x\in\mathcal{X}$.

  • The entropy of $X$ is defined as
    $H(X)=-\sum_{x\in\mathcal{X}}p(x)\log p(x)$,
    a measure of the uncertainty of a random variable.

  • $H(X)$ depends only on $p(x)$; we also write $H(p)$ for $H(X)$.

  • $H(X)\ge 0$

  • When $X$ is uniform over $\mathcal{X}$, then $H(X)=\log\lvert\mathcal{X}\rvert$

  • $H_{b}(X)=(\log_{b}a)\,H_{a}(X)$ (change of logarithm base; see the sketch below)
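To make the definition concrete, here is a minimal Python sketch (my own illustration, not part of the original notes; the pmf values are assumed) that computes $H(X)$ in bits, checks that the uniform distribution attains $\log\lvert\mathcal{X}\rvert$, and verifies the change-of-base identity:

```python
import math

def entropy(pmf, base=2):
    """Shannon entropy H(X) = -sum_x p(x) log p(x); terms with p(x) = 0 contribute 0."""
    return -sum(p * math.log(p, base) for p in pmf if p > 0)

# Assumed example pmf over a 4-symbol alphabet.
p = [0.5, 0.25, 0.125, 0.125]
print(entropy(p))                                         # 1.75 bits

# The uniform distribution attains the maximum log|X|.
print(entropy([0.25] * 4), math.log2(4))                  # both 2.0

# Change of base: H_b(X) = (log_b a) * H_a(X), here with a = 2, b = e.
print(entropy(p, base=math.e), math.log(2) * entropy(p))  # agree (value in nats)
```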

Example

  • Binary entropy function $H(p)$ (see the sketch after this list):
    Let $X=\begin{cases}1 & \text{with probability } p\\ 0 & \text{with probability } 1-p\end{cases}$
    Then $H(X)=-p\log p-(1-p)\log(1-p)$

  • $H(X)=-E_{p}[\log p(X)]$

  • For a discrete random variable $X$ defined on $\mathcal{X}$,
    $0\le H(X)\le \log\lvert\mathcal{X}\rvert$,
    with equality on the right if and only if $p(x)=1/\lvert\mathcal{X}\rvert$ (the uniform distribution maximizes entropy).

  • Jensen's inequality for a concave function $f$ is widely applied:
    $\sum_{i}p_{i}f(x_{i})\le f\!\left(\sum_{i}p_{i}x_{i}\right)$
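The binary entropy function can be tabulated directly from the formula above. The short sketch below (an assumed illustration, not from the text) shows that $H(p)$ vanishes at $p=0$ and $p=1$ and peaks at 1 bit when $p=1/2$, consistent with the bound $0\le H(X)\le\log\lvert\mathcal{X}\rvert=\log 2$:

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with 0 log 0 taken as 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"H({p}) = {binary_entropy(p):.4f} bits")
# Maximum of 1 bit at p = 0.5; 0 bits when the outcome is certain (p = 0 or 1).
```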

Joint Entropy

  • Two random variables $X$ and $Y$ can be considered to be a single vector-valued random variable $(X,Y)$.
  • The joint entropy $H(X,Y)$ of a pair of discrete random variables $(X,Y)$ with joint distribution $p(x,y)$ is defined as
    $H(X,Y)=-\sum_{x\in\mathcal{X}}\sum_{y\in\mathcal{Y}}p(x,y)\log p(x,y)$
  • $H(X,Y)=-E[\log p(X,Y)]$
  • $H(X,X)=H(X)$
  • $H(X,Y)=H(Y,X)$ (see the sketch below)
  • $H(X_{1},X_{2},\dots,X_{n})=-\sum p(x_{1},x_{2},\dots,x_{n})\log p(x_{1},x_{2},\dots,x_{n})$
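Below is a small sketch of the joint-entropy formula, using an assumed joint pmf (my own example), which also checks the symmetry $H(X,Y)=H(Y,X)$:

```python
import math

def joint_entropy(pxy):
    """H(X,Y) = -sum_{x,y} p(x,y) log2 p(x,y) for a dict {(x, y): probability}."""
    return -sum(p * math.log2(p) for p in pxy.values() if p > 0)

# Assumed joint distribution of (X, Y).
pxy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

# Symmetry: relabelling (x, y) as (y, x) leaves the joint entropy unchanged.
pyx = {(y, x): p for (x, y), p in pxy.items()}
print(joint_entropy(pxy), joint_entropy(pyx))  # both 1.75 bits
```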

Conditional Entropy

  • Entropy of the conditional distribution $p(y\mid X=x)$:
    $H(Y\mid X=x)=-\sum_{y}p(y\mid X=x)\log p(y\mid X=x)=-E[\log p(Y\mid X=x)]$
  • The conditional entropy $H(Y\mid X)$ is its average over $X$:
    $H(Y\mid X)=\sum_{x}p(x)\,H(Y\mid X=x)=-\sum_{x,y}p(x,y)\log p(y\mid x)$
  • Conditioning never increases entropy: when $X$ is known, $H(Y\mid X)\le H(Y)$
  • In general, $H(X\mid Y)\ne H(Y\mid X)$
  • Chain rule: $H(X\mid Y)+H(Y)=H(Y\mid X)+H(X)=H(X,Y)$ (see the sketch below)
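The following sketch (the joint pmf is again an assumed example) computes $H(Y\mid X)$ both from its definition as an average of $H(Y\mid X=x)$ and via the chain rule as $H(X,Y)-H(X)$, and checks that conditioning does not increase entropy:

```python
import math
from collections import defaultdict

def H(pmf):
    """Entropy in bits of a sequence of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# Assumed joint pmf p(x, y).
pxy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

# Marginals p(x) and p(y).
px, py = defaultdict(float), defaultdict(float)
for (x, y), p in pxy.items():
    px[x] += p
    py[y] += p

# H(Y|X) = sum_x p(x) H(Y|X=x), with p(y|x) = p(x,y) / p(x).
h_y_given_x = sum(
    px[x] * H([pxy.get((x, y), 0.0) / px[x] for y in py]) for x in px
)

# Chain rule: H(X,Y) = H(X) + H(Y|X); the two computations agree up to rounding.
print(h_y_given_x, H(pxy.values()) - H(px.values()))
# Conditioning reduces entropy: H(Y|X) <= H(Y).
print(h_y_given_x, H(py.values()))
```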

Zero Entropy

  • If $H(Y\mid X)=0$:
    • then $Y$ is a function of $X$: for every $x$ with $p(x)>0$, the conditional distribution puts all of its mass on a single value of $y$;
    • equivalently, $H(Y\mid X=x)=0$ for every $x$ with $p(x)>0$ (see the sketch below).
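A last sketch illustrating zero conditional entropy: with an assumed deterministic relationship $Y = X \bmod 2$ (my own example), every conditional distribution $p(y\mid X=x)$ is degenerate, so $H(Y\mid X)=0$:

```python
import math
from collections import defaultdict

def H(pmf):
    """Entropy in bits of a sequence of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# X uniform on {0, 1, 2, 3}; Y = X mod 2 is completely determined by X.
px = {x: 0.25 for x in range(4)}
pxy = defaultdict(float)
for x, p in px.items():
    pxy[(x, x % 2)] += p

# H(Y|X) = H(X,Y) - H(X) = 0 because each p(y|X=x) puts all its mass on one y.
print(H(pxy.values()) - H(px.values()))  # 0.0
```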