Distributed Source Coding

Reference:

Elements of Information Theory, 2nd Edition

Slides of EE4560, TUD

Introduction

  • We know how to encode a single source $X$: a rate $R \ge H(X)$ is sufficient.

  • If there are two sources $(X, Y)$ encoded jointly, a rate $R \ge H(X, Y)$ is sufficient.

[Figure]

  • But what if $X$ and $Y$ must be described separately for a user who wishes to reconstruct both $X$ and $Y$?

  • Clearly, by separately encoding $X$ and $Y$, a rate $R = R_x + R_y \ge H(X) + H(Y)$ is sufficient.

  • However, in a surprising and fundamental paper, Slepian and Wolf showed that a total rate $R = H(X, Y)$ is sufficient even for separate encoding of correlated sources.

  • Intuitively, since $H(X, Y) = H(X) + H(Y \mid X)$, we can first encode source $X$ at a rate $R_1 \ge H(X)$ and then encode source $Y$, given $X$, at a rate $R_2 \ge H(Y \mid X)$.

    More specifically,

    • Using $nH(X)$ bits we can encode $X^n$ efficiently, so that the decoder can reconstruct $X^n$ with arbitrarily low probability of error
    • Associated with every $x^n$ is a typical "fan" of $y^n$ sequences that are jointly typical with the given $x^n$, $2^{nH(Y \mid X)}$ in total
    • The encoder can send the index of the $y^n$ within this typical fan, for which it needs $nH(Y \mid X)$ bits
    • The decoder, also knowing $x^n$, can then construct the typical fan and hence reconstruct $y^n$ (a numeric sketch of this rate accounting follows the figure below)

[Figure]
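To make the rate accounting concrete, here is a minimal numeric sketch (the $2 \times 2$ joint pmf is an arbitrary illustrative choice, not taken from the reference). It computes $H(X)$, $H(Y \mid X)$, and $H(X, Y)$, and checks that the two-step budget $H(X) + H(Y \mid X)$ equals $H(X, Y)$ and is smaller than the naive $H(X) + H(Y)$.

```python
import numpy as np

# Illustrative 2x2 joint pmf p(x, y) (an arbitrary choice, not taken from the reference).
p_xy = np.array([[0.45, 0.05],
                 [0.05, 0.45]])

def entropy(p):
    """Entropy in bits of a pmf given as an array (0 log 0 := 0)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

H_XY = entropy(p_xy)                  # joint entropy H(X, Y)
H_X = entropy(p_xy.sum(axis=1))       # marginal entropy H(X)
H_Y = entropy(p_xy.sum(axis=0))       # marginal entropy H(Y)
H_Y_given_X = H_XY - H_X              # chain rule: H(Y|X) = H(X, Y) - H(X)

print(f"H(X)          = {H_X:.3f} bits")           # 1.000
print(f"H(Y|X)        = {H_Y_given_X:.3f} bits")   # ~0.469
print(f"H(X) + H(Y|X) = {H_X + H_Y_given_X:.3f}")  # equals H(X, Y) ~ 1.469
print(f"H(X) + H(Y)   = {H_X + H_Y:.3f}")          # 2.000, the naive separate-coding rate
```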

The whole process can be depicted as follows:

[Figure]

  • But what if the $Y$ encoder does not know which sequence $x^n$ was encoded?

[Figure]

Slepian-Wolf Coding

Let $(X_1, Y_1), (X_2, Y_2), \ldots$ be a sequence of jointly distributed random variables, i.i.d. $\sim p(x, y)$.

Definition 1 (Distributed source code):

A $\left(\left(2^{nR_1}, 2^{nR_2}\right), n\right)$ distributed source code for the joint source $(X, Y)$ consists of two encoder maps
$$
f_1: \mathcal{X}^n \rightarrow \left\{1, 2, \ldots, 2^{nR_1}\right\}, \qquad f_2: \mathcal{Y}^n \rightarrow \left\{1, 2, \ldots, 2^{nR_2}\right\}
$$
and a decoder map
$$
g: \left\{1, 2, \ldots, 2^{nR_1}\right\} \times \left\{1, 2, \ldots, 2^{nR_2}\right\} \rightarrow \mathcal{X}^n \times \mathcal{Y}^n .
$$
Definition 2 (Probability of error):

The probability of error for a distributed source code is defined as
$$
P_e^{(n)} = \Pr\left( g\left(f_1\left(X^n\right), f_2\left(Y^n\right)\right) \neq \left(X^n, Y^n\right) \right).
$$
Definition 3 (Achievable):

A rate pair $(R_1, R_2)$ is said to be achievable for a distributed source if there exists a sequence of $\left(\left(2^{nR_1}, 2^{nR_2}\right), n\right)$ distributed source codes with probability of error $P_e^{(n)} \rightarrow 0$. The achievable rate region is the closure of the set of achievable rate pairs.

Theorem 1 (Slepian-Wolf):

For a distributed source coding problem with source $(X, Y)$ drawn i.i.d. $\sim p(x, y)$, the achievable rate region is given by
$$
\begin{aligned}
R_1 &\geq H(X \mid Y) \\
R_2 &\geq H(Y \mid X) \\
R_1 + R_2 &\geq H(X, Y)
\end{aligned}
$$

[Figure]
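As a concrete illustration (a standard example of my own choosing, not necessarily the one on slides 17-18): let $X \sim \mathrm{Bern}(1/2)$ and $Y = X \oplus Z$ with $Z \sim \mathrm{Bern}(p)$ independent of $X$, where $p \approx 0.11$ so that the binary entropy $H_b(p) \approx 0.5$. Then
$$
H(X) = H(Y) = 1, \qquad H(X \mid Y) = H(Y \mid X) = H_b(p) \approx 0.5, \qquad H(X, Y) = 1 + H_b(p) \approx 1.5,
$$
so naive separate encoding would use about $2$ bits per symbol pair, while any rate pair with $R_1 \ge 0.5$, $R_2 \ge 0.5$, and $R_1 + R_2 \ge 1.5$ suffices; the corner points of the region are approximately $(1, 0.5)$ and $(0.5, 1)$.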

[Example, Slides 17-18]

Random Binning

Random binning is an encoding and decoding scheme that makes a rate pair $(R_1, R_2)$ with $R_1 + R_2 = H(X, Y)$ achievable even when the $Y$ encoder does not know which sequence $x^n$ was encoded.

Encoding and Decoding Scheme

Encoding:

  • For each sequence $x^n$, draw an index at random from $\left\{1, 2, \ldots, 2^{nR}\right\}$.
  • Sequences $x^n$ having the same index are said to form a bin.

[Figure]

Decoding:

  • Given a bin index, we look for a typical $x^n$ sequence in the bin
  • If there is one and only one typical $x^n$ in the bin, we declare it to be the estimate $\hat{x}^n$; otherwise an error is declared

Error: given the received bin index, an error occurs if

  • the transmitted sequence is non-typical, or
  • there is more than one typical sequence in the bin

[Figure]

We first prove that under this scheme, if $R \ge H(X)$, the probability of error is arbitrarily small, so the code achieves the same performance as the code introduced by Shannon (typical-set coding):
$$
\begin{aligned}
\Pr\left(g\left(f\left(X^{n}\right)\right) \neq X^{n}\right)
&= \Pr\left(\bar{A}_{\epsilon}^{(n)}\right) + \sum_{x^{n} \in A_{\epsilon}^{(n)}} p\left(x^{n}\right) \Pr\left(\exists\, \tilde{x}^{n} \neq x^{n}: f\left(\tilde{x}^{n}\right) = f\left(x^{n}\right)\right) \\
&\leq \epsilon + \sum_{x^{n} \in A_{\epsilon}^{(n)}} p\left(x^{n}\right) \sum_{\tilde{x}^{n} \in A_{\epsilon}^{(n)}} 2^{-nR} \\
&\leq \epsilon + 2^{n(H(X)+\epsilon)}\, 2^{-nR} \\
&\leq \epsilon'
\end{aligned}
$$
if $R > H(X) + \epsilon$ and $n$ is sufficiently large.
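As an illustration of this single-source binning scheme, here is a minimal simulation sketch. All parameters are toy choices of mine, and the block length is far too small for the asymptotics to fully kick in, so the residual error rate is still visible.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy single-source random binning. Parameters are illustrative; n is small, so the
# typicality-based error terms are still visible (they vanish as n grows).
n = 16            # block length
p = 0.2           # i.i.d. Bernoulli(p) source
eps = 0.3         # typicality slack
H = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))   # H(X) ~ 0.72 bits
R = 1.2                                            # binning rate, R > H(X) + eps
num_bins = int(round(2 ** (n * R)))

# Represent each length-n binary sequence by an integer 0 .. 2^n - 1.
all_seqs = np.arange(2 ** n)
ones = np.array([bin(s).count("1") for s in all_seqs])            # number of 1s per sequence
emp = (ones / n) * np.log2(1 / p) + ((n - ones) / n) * np.log2(1 / (1 - p))
typical = np.abs(emp - H) <= eps                                  # weakly typical sequences

# Encoding: every sequence is assigned a bin index uniformly at random.
bins = rng.integers(0, num_bins, size=2 ** n)

def decode(bin_idx):
    """Return the unique typical sequence in the bin, or None (declare an error)."""
    candidates = all_seqs[(bins == bin_idx) & typical]
    return candidates[0] if len(candidates) == 1 else None

# Monte Carlo estimate of the probability of error.
trials, errors = 2000, 0
for _ in range(trials):
    x = int("".join(map(str, rng.binomial(1, p, size=n))), 2)     # draw one source block
    if decode(bins[x]) != x:
        errors += 1
print(f"empirical error rate ~ {errors / trials:.3f}")
# Most of the residual error is the source block itself being atypical (this vanishes
# as n grows); bin collisions are already rare because R > H(X) + eps.
```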

Remarks:

  • The binning scheme does not require an explicit characterization of the typical set at the encoder; it is needed only at the decoder
  • It is this property that enables this code to continue to work in the case of a distributed source

Outline of Proof: Achievability

  • Random code generation: Assign every $x^n \in \mathcal{X}^n$ to one of $2^{nR_1}$ bins independently, according to a uniform distribution on $\left\{1, 2, \ldots, 2^{nR_1}\right\}$. Similarly, randomly assign every $y^n \in \mathcal{Y}^n$ to one of $2^{nR_2}$ bins. Reveal the assignments $f_1$ and $f_2$ to the encoders and the decoder.
  • Encoding: Encoder 1 sends the index of the bin to which $x^n$ belongs. Encoder 2 sends the index of the bin to which $y^n$ belongs.
  • Decoding: Given the received index pair $(i, j)$, declare $(\hat{x}^n, \hat{y}^n) = (x^n, y^n)$ if there is one and only one pair of sequences $(x^n, y^n)$ such that $f_1(x^n) = i$, $f_2(y^n) = j$, and $(x^n, y^n) \in A_\epsilon^{(n)}$; otherwise, declare an error (a toy transcription of this scheme follows this list).
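The following sketch transcribes the three maps above literally: uniform random bin assignments $f_1$, $f_2$ and the joint-typicality decoder $g$. The parameters are my own illustrative choices, and at such a tiny block length the error probability is not yet small, since the theorem's guarantee is asymptotic in $n$.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(7)

# Toy transcription of the scheme above. All parameters are illustrative choices;
# the block length is tiny so the error probability is NOT yet small -- the
# Slepian-Wolf guarantee is asymptotic in n.
n = 12
p_xy = np.array([[0.49, 0.01],          # strongly correlated binary pair (X, Y)
                 [0.01, 0.49]])
H_XY = -np.sum(p_xy * np.log2(p_xy))    # H(X, Y) ~ 1.14 bits
eps = 0.6

# Bin counts: R1 = 12/12 = 1 bit/symbol (= H(X)), R2 = 7/12 ~ 0.58 > H(Y|X) ~ 0.14,
# and R1 + R2 > H(X, Y), so the rate pair lies inside the Slepian-Wolf region.
B1, B2 = 2 ** 12, 2 ** 7

# Random code generation: uniform, independent bin assignments, revealed to everyone.
f1 = {xs: int(rng.integers(B1)) for xs in product((0, 1), repeat=n)}
f2 = {ys: int(rng.integers(B2)) for ys in product((0, 1), repeat=n)}

def jointly_typical(xs, ys):
    """Weak joint typicality: -(1/n) log p(x^n, y^n) is within eps of H(X, Y)."""
    logp = sum(np.log2(p_xy[x, y]) for x, y in zip(xs, ys))
    return abs(-logp / n - H_XY) <= eps

def g(i, j):
    """Decoder: return the unique jointly typical pair with matching bins, else None."""
    xs_in_bin = [xs for xs in product((0, 1), repeat=n) if f1[xs] == i]
    ys_in_bin = [ys for ys in product((0, 1), repeat=n) if f2[ys] == j]
    matches = [(xs, ys) for xs in xs_in_bin for ys in ys_in_bin if jointly_typical(xs, ys)]
    return matches[0] if len(matches) == 1 else None

# Encode separately, decode jointly, over a few hundred independent source blocks.
symbols = [(0, 0), (0, 1), (1, 0), (1, 1)]
trials, errors = 200, 0
for _ in range(trials):
    draw = rng.choice(4, size=n, p=p_xy.flatten())
    xs = tuple(symbols[k][0] for k in draw)
    ys = tuple(symbols[k][1] for k in draw)
    if g(f1[xs], f2[ys]) != (xs, ys):
        errors += 1
print(f"empirical error rate ~ {errors / trials:.2f}")   # visibly nonzero at this toy n
```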

Let $(X_i, Y_i) \sim p(x, y)$. Define the events
$$
\begin{aligned}
E_0 &= \left\{ \left(x^{n}, y^{n}\right) \notin A_{\epsilon}^{(n)} \right\} \\
E_1 &= \left\{ \exists\, \tilde{x}^{n} \neq x^{n}: f_1\left(\tilde{x}^{n}\right) = f_1\left(x^{n}\right) \text{ and } \left(\tilde{x}^{n}, y^{n}\right) \in A_{\epsilon}^{(n)} \right\} \\
E_2 &= \left\{ \exists\, \tilde{y}^{n} \neq y^{n}: f_2\left(\tilde{y}^{n}\right) = f_2\left(y^{n}\right) \text{ and } \left(x^{n}, \tilde{y}^{n}\right) \in A_{\epsilon}^{(n)} \right\} \\
E_3 &= \left\{ \exists\, \left(\tilde{x}^{n}, \tilde{y}^{n}\right): \tilde{x}^{n} \neq x^{n},\ \tilde{y}^{n} \neq y^{n},\ f_1\left(\tilde{x}^{n}\right) = f_1\left(x^{n}\right),\ f_2\left(\tilde{y}^{n}\right) = f_2\left(y^{n}\right) \text{ and } \left(\tilde{x}^{n}, \tilde{y}^{n}\right) \in A_{\epsilon}^{(n)} \right\}
\end{aligned}
$$
Then
$$
P_e^{(n)} = \Pr\left(E_0 \cup E_1 \cup E_2 \cup E_3\right) \leq \Pr\left(E_0\right) + \Pr\left(E_1\right) + \Pr\left(E_2\right) + \Pr\left(E_3\right).
$$

$$
\begin{aligned}
\Pr\left(E_1\right)
&= \sum_{\left(x^{n}, y^{n}\right) \in A_{\epsilon}^{(n)}} p\left(x^{n}, y^{n}\right) \Pr\left(\exists\, \tilde{x}^{n} \neq x^{n}: f_1\left(\tilde{x}^{n}\right) = f_1\left(x^{n}\right) \text{ and } \left(\tilde{x}^{n}, y^{n}\right) \in A_{\epsilon}^{(n)}\right) \\
&\leq \sum_{\left(x^{n}, y^{n}\right) \in A_{\epsilon}^{(n)}} p\left(x^{n}, y^{n}\right) \sum_{\tilde{x}^{n}: \left(\tilde{x}^{n}, y^{n}\right) \in A_{\epsilon}^{(n)}} 2^{-nR_1} \\
&\leq 2^{n(H(X \mid Y)+\epsilon)}\, 2^{-nR_1} \\
&\leq \epsilon'
\end{aligned}
$$
if $R_1 > H(X \mid Y) + \epsilon$ and $n$ is sufficiently large.

Similarly, we find that for sufficiently large $n$, $\Pr(E_2) < \epsilon'$ if $R_2 > H(Y \mid X)$, and $\Pr(E_3) < \epsilon'$ if $R_1 + R_2 > H(X, Y)$. Since $\Pr(E_0) < \epsilon$ as well, we conclude that the probability of error $P_e^{(n)} \rightarrow 0$ as $n \rightarrow \infty$.

Interpretation of Slepian-Wolf Coding

Consider the corner point of the Slepian-Wolf rate region where $R_1 = H(X)$ and $R_2 = H(Y \mid X)$.

[Figure]

  • Instead of trying to determine the typical fan, the $Y$ encoder randomly assigns to every $y^n$ sequence ($2^{nH(Y)}$ in total) an index drawn at random from $\left\{1, 2, \ldots, 2^{nR_2}\right\}$

  • If the number of indices is large enough, then with high probability every element in the typical fan associated with $x^n$ will have a unique index

  • For $R_2 > H(Y \mid X)$, the number of indices is exponentially larger than the number of elements in the fan

  • The decoder, also knowing $x^n$, can construct the typical fan, and the received $Y$ index then uniquely determines the $y^n$ sequence within the $x^n$ fan

[Figure]
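To make the "unique index" argument precise (this is just the $\Pr(E_2)$ bound from the achievability proof, restated in the language of the fan): conditioned on $x^n$ and the true $y^n$, each of the at most $2^{n(H(Y \mid X)+\epsilon)}$ other sequences in the typical fan receives the same index as $y^n$ with probability $2^{-nR_2}$, so by the union bound
$$
\Pr\left\{\text{some other fan member shares the index of } y^n\right\} \le 2^{n(H(Y \mid X)+\epsilon)}\, 2^{-nR_2} \longrightarrow 0 \quad \text{whenever } R_2 > H(Y \mid X) + \epsilon .
$$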

Interpretation:

How does the random binning scheme bypass the problem that the $Y$ encoder does not know which sequence $x^n$ was encoded?

Under the random binning scheme the decoder is able to decode $x^n$, since the assignment $f_1$ is revealed to both the encoders and the decoder. Knowing $x^n$ narrows the set of candidate $y^n$ sequences from $2^{nH(Y)}$ down to the $2^{nH(Y \mid X)}$ sequences in the typical fan, so the $Y$ encoder only needs enough bits to discriminate among $2^{nH(Y \mid X)}$ sequences.

In effect, the decoder knowing which sequence $x^n$ was encoded is equivalent to the $Y$ encoder knowing it.
