References:
- Elements of Information Theory, 2nd Edition
- Slides of EE4560, TUD
Differential Entropy
We now introduce the concept of differential entropy, which is the entropy of a continuous random variable.
Definition 1 (Differential Entropy):
The differential entropy $h(X)$ of a continuous random variable $X$ with density $f(x)$ is defined as
$$h(X)=-\int_S f(x)\log f(x)\,dx \tag{1}$$
where $S$ is the support set of $X$ (i.e., where $f(x)>0$).
- $h(X)$ is sometimes denoted $h(f)$, just as $H(X)$ is sometimes denoted $H(p)$.
- $\log$ here denotes $\log_2$, so entropies are measured in bits; do not forget to convert the base when a derivation uses natural logarithms.
Examples:
- Uniform distribution $X \sim \mathrm{Unif}(0,a)$:
$$h(X)=-\int_{0}^{a}\frac{1}{a} \log \frac{1}{a}\,dx=\log a \tag{2}$$
  - larger $a \to$ larger uncertainty $\to$ larger $h(X)$
  - For $0<a<1$, the differential entropy $\log a$ is negative! This differs from $H(X)$, which is always $\ge 0$.
  - However, $2^{h(X)}=2^{\log a}=a$ is always positive.
- Normal distribution $X \sim \mathcal N(\mu,\sigma^2)$:
$$\begin{aligned} h(f)&=-\int f(x)\log f(x)\,dx=-\int \frac{f(x)}{\ln 2}\left[-\frac{(x-\mu)^2}{2\sigma^2}-\ln \sqrt{2\pi \sigma^2} \right]dx\\ &=\frac{1}{\ln 2}\left[\frac{E(X-\mu)^2}{2\sigma^2}+\frac{1}{2}\ln (2\pi \sigma^2)\right]=\frac{1}{2}\log (2\pi e \sigma^2) \end{aligned}\tag{3}$$
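As a quick numerical sanity check of Eqs. (2) and (3) (an illustrative sketch, not part of the reference), the snippet below evaluates $-\int f\log_2 f\,dx$ by numerical integration for a uniform and a normal density (with assumed parameters $a=0.5$, $\sigma=2$) and compares the results with the closed forms $\log_2 a$ and $\frac{1}{2}\log_2(2\pi e\sigma^2)$.

```python
import numpy as np

def differential_entropy(f, grid):
    """Numerically approximate h = -integral f(x) log2 f(x) dx on a grid."""
    fx = f(grid)
    mask = fx > 0                       # integrate only over the support
    integrand = -fx[mask] * np.log2(fx[mask])
    return np.trapz(integrand, grid[mask])

a, sigma = 0.5, 2.0                     # assumed example parameters

# Uniform(0, a): closed form is log2(a)  (negative for a < 1)
x_uni = np.linspace(0.0, a, 100_001)
h_uni = differential_entropy(lambda x: np.full_like(x, 1.0 / a), x_uni)
print(h_uni, np.log2(a))                # both ~ -1.0

# Normal(0, sigma^2): closed form is 0.5 * log2(2*pi*e*sigma^2)
x_norm = np.linspace(-10 * sigma, 10 * sigma, 200_001)
gauss = lambda x: np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
h_norm = differential_entropy(gauss, x_norm)
print(h_norm, 0.5 * np.log2(2 * np.pi * np.e * sigma**2))  # both ~ 3.047
```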
Definition 2 (Joint Differential Entropy):
The joint differential entropy of a set $X_1,X_2,\cdots,X_n$ of random variables with density $f(x_1,x_2,\cdots,x_n)$ is defined as
$$h(X_1,X_2,\cdots,X_n)=-\int f(x^n)\log f(x^n)\,dx^n \tag{4}$$
N.B. $x^n$ here is shorthand for $(x_1,x_2,\cdots,x_n)$.
Definition 3 (Conditional Differential Entropy):
If $X, Y$ have a joint density function $f(x, y)$, we can define the conditional differential entropy $h(X \mid Y)$ as
$$h(X\mid Y)=-\int f(x, y) \log f(x \mid y)\, dx\, dy=h(X, Y)-h(Y)\tag{5}$$
Definition 4 (Mutual Information):
The mutual information $I(X;Y)$ between two random variables $X$ and $Y$ with joint density $f(x,y)$ is defined as
$$\begin{aligned} I(X;Y)&=\iint f(x,y)\log \frac{f(x,y)}{f(x)f(y)}\,dx\, dy\\ &=h(X)-h(X\mid Y)=h(Y)-h(Y\mid X)\\ &=h(X)+h(Y)-h(X,Y) \end{aligned}\tag{6}$$
N.B. $I(X;Y)\ge 0$, with equality if and only if $X$ and $Y$ are independent.
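As a worked illustration of Eq. (6) (my own example, not from the reference): for jointly Gaussian $X,Y$ with equal variances $\sigma^2$ and correlation coefficient $\rho$, substituting $h(X)=h(Y)=\frac{1}{2}\log 2\pi e\sigma^2$ and $h(X,Y)=\frac{1}{2}\log\big((2\pi e)^2\sigma^4(1-\rho^2)\big)$ into $I(X;Y)=h(X)+h(Y)-h(X,Y)$ gives $I(X;Y)=-\frac{1}{2}\log(1-\rho^2)\ge 0$, with equality exactly when $\rho=0$. The sketch below compares this closed form with a rough histogram-based estimate from samples (bin count and sample size are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(0)
rho, sigma, n = 0.8, 1.0, 1_000_000

# Closed form from Eq. (6) for jointly Gaussian X, Y: I(X;Y) = -1/2 log2(1 - rho^2)
closed_form = -0.5 * np.log2(1 - rho**2)

# Sample a correlated Gaussian pair and estimate I(X;Y) from a 2-D histogram
x = rng.normal(0, sigma, n)
y = rho * x + np.sqrt(1 - rho**2) * rng.normal(0, sigma, n)

bins = 200
pxy, _, _ = np.histogram2d(x, y, bins=bins)
pxy /= n                                    # joint pmf of the discretized pair
px, py = pxy.sum(axis=1), pxy.sum(axis=0)   # marginal pmfs
nz = pxy > 0
mi_est = np.sum(pxy[nz] * np.log2(pxy[nz] / (px[:, None] * py[None, :])[nz]))

print(closed_form, mi_est)   # ~ 0.737 vs a rough (slightly biased) estimate
```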
Gaussian Channels
The most important continuous-alphabet channel is the Gaussian channel. This is a time-discrete channel with output $Y_i$ at time $i$, where $Y_i$ is the sum of the input $X_i$ and the noise $Z_i$. The noise $Z_i$ is drawn i.i.d. from a Gaussian distribution with variance $N$. Thus,
$$Y_i=X_i+Z_i,\quad Z_i \sim \mathcal N(0,N) \tag{7}$$
The noise $Z_i$ is assumed to be independent of the signal $X_i$.
The most common limitation on the input is an energy or power constraint. We assume an average power constraint: for any codeword $(x_1,x_2,\ldots,x_n)$ transmitted over the channel, we require that
$$\frac{1}{n}\sum_{i=1}^n x_i^2\le P \tag{8}$$
[Example-Binary input, Gaussian noise: Slide 6-7]
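The slide example itself is not reproduced here; the following is a minimal simulation sketch of my own (with assumed values of $P$ and $N$) of binary antipodal signaling over the channel of Eq. (7): the inputs $\pm\sqrt{P}$ meet the power constraint with equality, the decoder takes the sign of $Y$, and the simulated error rate agrees with the theoretical value $Q\big(\sqrt{P/N}\big)$.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(1)
P, N, n = 1.0, 0.25, 1_000_000           # assumed signal power, noise variance, #uses

bits = rng.integers(0, 2, n)
x = np.sqrt(P) * (2 * bits - 1)           # antipodal inputs +-sqrt(P), average power P
y = x + rng.normal(0.0, np.sqrt(N), n)    # Gaussian channel: Y = X + Z, Z ~ N(0, N)

bits_hat = (y > 0).astype(int)            # minimum-distance (sign) decoding
p_err_sim = np.mean(bits_hat != bits)

# Theoretical error probability Q(sqrt(P/N)), written via erfc
p_err_theory = 0.5 * erfc(sqrt(P / N) / sqrt(2))
print(p_err_sim, p_err_theory)            # both ~ 0.0228 for P/N = 4
```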
Gaussian Channel Capacity
Definition 5 (Information Capacity):
The information capacity of the Gaussian channel with power constraint $P$ is
$$C=\max _{f(x):\, E X^{2} \leq P} I(X ; Y)=\frac{1}{2} \log \left(1+\frac{P}{N}\right) \tag{9}$$
where the maximum is achieved when $X\sim \mathcal N(0,P)$.
Proof: Expanding $I(X ; Y)$, we have
$$\begin{aligned} I(X ; Y) &=h(Y)-h(Y \mid X) \\ &=h(Y)-h(X+Z \mid X) \\ &=h(Y)-h(Z\mid X) \\ &=h(Y)-h(Z) \end{aligned}$$
since $Z$ is independent of $X$. From Eq. $(3)$, $h(Z)=\frac{1}{2} \log 2 \pi e N$. Also,
$$E Y^{2}=E(X+Z)^{2}=E X^{2}+2\, E X\, E Z+E Z^{2}=P+N$$
since $X$ and $Z$ are independent and $E Z=0$. Given $E Y^{2}=P+N$, the entropy of $Y$ is bounded by $\frac{1}{2} \log 2 \pi e(P+N)$ by Theorem 8.6.5 (the normal distribution maximizes the entropy for a given variance) [book, p. 254]. Applying this result to bound the mutual information, we obtain
$$\begin{aligned} I(X ; Y) &=h(Y)-h(Z) \\ & \leq \frac{1}{2} \log 2 \pi e(P+N)-\frac{1}{2} \log 2 \pi e N \\ &=\frac{1}{2} \log \left(1+\frac{P}{N}\right) \end{aligned}$$
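To make the optimality of the Gaussian input concrete, here is an illustrative numerical sketch (not from the reference, parameters chosen arbitrarily): the mutual information $h(Y)-h(Z)$ achieved by a uniform input with the same power $P$ is computed by numerically convolving the input density with the noise density, and it comes out strictly below the capacity $\frac{1}{2}\log\big(1+\frac{P}{N}\big)$.

```python
import numpy as np

P, N = 4.0, 1.0
capacity = 0.5 * np.log2(1 + P / N)       # Eq. (9), achieved by X ~ N(0, P)

# Suboptimal input with the same power: X ~ Unif(-a, a), a chosen so that E X^2 = P
a = np.sqrt(3 * P)
dx = 0.005
x = np.arange(-30, 30, dx)
f_x = np.where(np.abs(x) <= a, 1 / (2 * a), 0.0)        # input density
f_z = np.exp(-x**2 / (2 * N)) / np.sqrt(2 * np.pi * N)  # noise density N(0, N)

# Density of Y = X + Z is the convolution of the input and noise densities
f_y = np.convolve(f_x, f_z, mode="same") * dx

nz = f_y > 0
h_y = -np.sum(f_y[nz] * np.log2(f_y[nz])) * dx          # numerical h(Y) in bits
h_z = 0.5 * np.log2(2 * np.pi * np.e * N)               # closed-form h(Z)

print(capacity)      # ~ 1.161 bits per transmission
print(h_y - h_z)     # I(X;Y) for the uniform input: strictly smaller
```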
Next, it will be shown that this capacity is also the supremum of the rates achievable for the channel, i.e., the operational capacity.
Definition 6 (Code):
An $(M, n)$ code for the Gaussian channel with power constraint $P$ consists of the following:
- An index set $\{1,2, \ldots, M\}$
- An encoding function $x:\{1,2, \ldots, M\} \rightarrow \mathcal{X}^{n}$, yielding codewords $x^{n}(1), x^{n}(2), \ldots, x^{n}(M)$, satisfying the power constraint $P$; that is, for every codeword
$$\sum_{i=1}^{n} x_{i}^{2}(w) \leq n P, \quad w=1,2, \ldots, M$$
- A decoding function $g: \mathcal{Y}^{n} \rightarrow\{1,2, \ldots, M\}$
N.B. The rate is $R=\frac{\log M}{n}$, as defined for the discrete channel.
Definition 7 (Achievable):
A rate $R$ is said to be achievable for a Gaussian channel with a power constraint $P$ if there exists
- a sequence of $\left(2^{n R}, n\right)$ codes
- with codewords satisfying the power constraint
- such that the maximal probability of error $\lambda^{(n)}$ tends to zero.
The capacity of the channel is the supremum of the achievable rates.
Theorem 1 (The capacity of a Gaussian channel):
The capacity of a Gaussian channel with power constraint $P$ and noise variance $N$ is
$$C=\frac{1}{2} \log \left(1+\frac{P}{N}\right) \quad \text{bits per transmission} \tag{10}$$
[Proof: book 266-268]
A plausibility argument (sphere packing): with high probability the received vector $Y^n$ lies in a sphere of radius $\sqrt{n(P+N)}$, while the noise confines the received values for each codeword to a small sphere of radius roughly $\sqrt{nN}$. The number of non-overlapping small spheres that fit inside the big sphere is at most $\left(\frac{n(P+N)}{nN}\right)^{n/2}$, so roughly $\frac{1}{2}\log\left(1+\frac{P}{N}\right)$ bits can be conveyed reliably per transmission.
Band-Limited Channel
A common model for communication over a radio network or a telephone line is a band-limited channel with white noise. This is a continuous-time channel. The output of such a channel can be described as the convolution
$$Y(t)=(X(t)+Z(t))*h(t)\tag{11}$$
where
- $Y(t)$ is the output signal waveform
- $X(t)$ is the input signal waveform
- $Z(t)$ is the white Gaussian noise waveform
- $h(t)$ is the impulse response of an ideal bandpass filter (which cuts out all frequencies greater than $W$).
Theorem 2 (The sampling theorem):
A function $f(t)$ that is band-limited to $W$ is completely determined by samples of the function spaced $\frac{1}{2W}$ seconds apart.
[Proof: book 271]
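An illustrative sketch of Theorem 2 (not the book's proof): a band-limited signal is rebuilt from its samples taken $\frac{1}{2W}$ seconds apart via the interpolation formula $f(t)=\sum_n f\!\left(\frac{n}{2W}\right)\operatorname{sinc}(2Wt-n)$, with $\operatorname{sinc}(x)=\frac{\sin \pi x}{\pi x}$. The band limit $W$ and the test signal below are assumed for the example.

```python
import numpy as np

W = 4.0                               # band limit in Hz (assumed for the example)
fs = 2 * W                            # Nyquist sampling rate: 2W samples per second

def f(t):
    # A test signal whose spectrum lies entirely below W = 4 Hz
    return np.sin(2 * np.pi * 1.5 * t) + 0.5 * np.cos(2 * np.pi * 3.0 * t)

n = np.arange(-200, 201)              # sample indices (truncated sum)
samples = f(n / fs)                   # f(n / 2W)

t = np.linspace(-2, 2, 1001)
# Interpolation: f(t) = sum_n f(n/2W) * sinc(2W t - n); np.sinc is sin(pi x)/(pi x)
recon = np.array([np.sum(samples * np.sinc(fs * ti - n)) for ti in t])

print(np.max(np.abs(recon - f(t))))   # small (limited by truncating the infinite sum)
```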
Now we can formulate the problem of communication over a bandlimited channel:
- Bandwidth: $W$
- Number of samples per second: $2W$
- Signal power: $P$
- Noise power: $N=N_0W$, where $N_0$ is the noise power spectral density
If the channel is used over the time interval $[0,T]$, then
- the energy per sample is $\frac{PT}{2WT}=\frac{P}{2W}$
- the noise variance per sample is $\frac{N_0 WT}{2WT}=\frac{N_0}{2}$
Using Theorem 1 (Eq. $(10)$), we obtain the capacity per sample:
$$C=\frac{1}{2} \log \left(1+\frac{P/(2 W)}{N_{0}/2}\right)=\frac{1}{2} \log \left(1+\frac{P}{N_{0} W}\right) \quad \text{bits per sample} \tag{12}$$
Since there are $2W$ samples per second, the capacity per second is
$$C=W\log \left( 1+\frac{P}{N_0W} \right) \quad \text{bits per second} \tag{13}$$
N.B. If $W\to \infty$, then using $\ln (1+x)\sim x ~(x\to 0)$, the capacity approaches $C=\frac{P \log e}{N_0}~\mathrm{bps}$.
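A small numerical sketch of Eq. (13) and of this infinite-bandwidth limit (signal power and noise PSD are assumed values): as $W$ grows, $C=W\log_2\!\big(1+\frac{P}{N_0W}\big)$ increases but saturates at $\frac{P}{N_0}\log_2 e$.

```python
import numpy as np

P, N0 = 1.0, 1e-2                     # assumed signal power (W) and noise PSD (W/Hz)

def capacity(W):
    """Eq. (13): capacity of the band-limited AWGN channel in bits per second."""
    return W * np.log2(1 + P / (N0 * W))

for W in [10, 100, 1_000, 10_000, 100_000]:
    print(W, capacity(W))             # increases with W but flattens out

# Infinite-bandwidth limit: P * log2(e) / N0
print(P * np.log2(np.e) / N0)         # ~ 144.27 bits per second
```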
Definition 8 (Bandwidth Efficiency):
Bandwidth efficiency $\eta$ is defined as the rate $R$ (in $\mathrm{bit/s}$) divided by the bandwidth $W$ (in $\mathrm{Hz}$):
$$\eta=\frac{R}{W}~\mathrm{bit/s/Hz} \tag{14}$$
From the channel capacity formula it follows that
$$R<C=W \log \left(1+\frac{P}{W N_{0}}\right)=W \log \left(1+\frac{R E_{b}}{W N_{0}}\right)$$
where $E_{b}$ is the energy per bit. Hence,
$$\eta<\log \left(1+\eta \frac{E_{b}}{N_{0}}\right), \quad \text{i.e.,} \quad \frac{E_{b}}{N_{0}}>\frac{2^{\eta}-1}{\eta} \tag{15}$$
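Eq. (15) is easy to tabulate (an illustrative sketch): the minimum required $E_b/N_0$ grows with the bandwidth efficiency $\eta$, and as $\eta\to 0$ it approaches $\ln 2\approx 0.693$, i.e. about $-1.59~\mathrm{dB}$, the Shannon limit.

```python
import numpy as np

def ebn0_min(eta):
    """Minimum Eb/N0 (linear scale) required for bandwidth efficiency eta, Eq. (15)."""
    return (2.0**eta - 1.0) / eta

for eta in [0.01, 0.5, 1.0, 2.0, 4.0, 8.0]:
    lin = ebn0_min(eta)
    print(f"eta = {eta:5}:  Eb/N0 > {lin:8.3f}  ({10 * np.log10(lin):6.2f} dB)")

# As eta -> 0, the bound tends to ln 2 ~ 0.693, i.e. about -1.59 dB (Shannon limit)
print(10 * np.log10(np.log(2)))
```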
Parallel Gaussian Channels
Consider $k$ independent Gaussian channels in parallel, with noise variances $N_1,\ldots,N_k$ and a common total power constraint $P$. To maximize the total capacity we must distribute the power among the channels; the problem to be solved is
$$\begin{aligned} &\text{minimize} && -\sum_{j=1}^{k}C_j =-\sum_{j=1}^{k} \frac{1}{2}\log \left(1+\frac{P_j}{N_j} \right)\\ &\text{subject to} && \sum_{j=1}^{k} P_j \le P \end{aligned}\tag{16}$$
Using Lagrange multipliers gives the function
$$L(P_1,\cdots,P_k,\lambda)=-\sum_{j=1}^{k} \frac{1}{2}\log \left(1+\frac{P_j}{N_j} \right)+\lambda\left(\sum_{j=1}^{k} P_j -P\right)$$
KKT conditions:
$$\sum_{j=1}^{k} P_j \le P,\quad \lambda\ge 0\\ \nabla _{P_j}L=0 \Longrightarrow P_j=\frac{1}{2\lambda}-N_j\\ \lambda\left(\sum_{j=1}^{k} P_j - P\right)=0$$
Together with the condition that the $P_j$ are nonnegative, this gives the solution
$$P_j=\max \left\{0,\frac{1}{2\lambda}-N_j\right\}\triangleq(\nu-N_j)^+ \tag{17}$$
where $\nu$ is chosen such that $\sum_j (\nu -N_j)^+=P$.
This solution is illustrated graphically in Figure 9.4 of the book. The vertical levels indicate the noise levels in the various channels. As the signal power is increased from zero, we allot the power to the channels with the lowest noise. When the available power is increased still further, some of the power is put into noisier channels.
The process by which the power is distributed among the various bins is identical to the way in which water distributes itself in a vessel, hence this process is sometimes referred to as water-filling.
[Example: Slides 23-25]
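A minimal water-filling sketch (my own implementation of Eq. (17), not code from the slides): the water level $\nu$ is found by bisection so that $\sum_j(\nu-N_j)^+=P$, after which the least-noisy channels are filled first; the noise variances below are assumed example values.

```python
import numpy as np

def water_filling(noise, P, tol=1e-12):
    """Return the allocation P_j = (nu - N_j)^+ with sum_j P_j = P (Eq. (17))."""
    noise = np.asarray(noise, dtype=float)
    lo, hi = noise.min(), noise.max() + P          # the water level nu lies in here
    while hi - lo > tol:
        nu = 0.5 * (lo + hi)
        used = np.maximum(nu - noise, 0.0).sum()
        lo, hi = (nu, hi) if used < P else (lo, nu)
    return np.maximum(0.5 * (lo + hi) - noise, 0.0)

noise = [1.0, 4.0, 9.0]        # assumed noise variances N_j of three parallel channels
for P in [2.0, 10.0, 30.0]:
    alloc = water_filling(noise, P)
    total_C = np.sum(0.5 * np.log2(1 + alloc / np.array(noise)))
    print(P, np.round(alloc, 3), round(total_C, 3))
```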
Gaussian Channels with Feedback
The feedback allows the input of the channel to depend on the past values of the output:
- Capacity without feedback:
$$\max _{\operatorname{tr}\left(K_{X}\right) \leq n P} \frac{1}{2 n} \log \frac{\left|K_{X}+K_{Z}\right|}{\left|K_{Z}\right|}\tag{18}$$
- Capacity with feedback:
$$\max _{\operatorname{tr}\left(K_{X}\right) \leq n P} \frac{1}{2 n} \log \frac{\left|K_{X+Z}\right|}{\left|K_{Z}\right|}\tag{19}$$
where $K_{(\cdot)}$ denotes an $n \times n$ covariance matrix; a numerical sanity check of Eq. (18) is sketched below.
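As a sanity check of the determinant expression in Eq. (18) (an illustrative sketch, not a computation of the actual maximum): for white noise $K_Z=NI$ and the i.i.d. input $K_X=PI$, the objective $\frac{1}{2n}\log\frac{|K_X+K_Z|}{|K_Z|}$ reduces to the memoryless capacity $\frac{1}{2}\log\big(1+\frac{P}{N}\big)$ of Eq. (10).

```python
import numpy as np

def rate(K_X, K_Z):
    """Objective of Eq. (18): (1/2n) log2 |K_X + K_Z| / |K_Z|."""
    n = K_X.shape[0]
    _, logdet_num = np.linalg.slogdet(K_X + K_Z)   # natural-log determinants
    _, logdet_den = np.linalg.slogdet(K_Z)
    return (logdet_num - logdet_den) / (2 * n * np.log(2))

n, P, N = 8, 4.0, 1.0                 # assumed block length, power, noise variance
K_X = P * np.eye(n)                   # i.i.d. input, tr(K_X) = nP
K_Z = N * np.eye(n)                   # white noise

print(rate(K_X, K_Z), 0.5 * np.log2(1 + P / N))   # both ~ 1.161
```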
Remarks:
- Memoryless channels: feedback does not increase capacity!
- Channels with memory: feedback does increase capacity!
- Feedback does not improve capacity by more than $\frac{1}{2}$ bit:
$$C_{\text{with FB}} \leq C_{\text{without FB}}+\frac{1}{2} \tag{20}$$
- Feedback does not improve capacity by more than a factor of two:
$$C_{\text{with FB}} \leq 2\, C_{\text{without FB}} \tag{21}$$
- Conclusion: feedback may help, but not much!