# Relationship Between Mutual Information and MMSE in the Linear Gaussian Model

Notation:

• Mutual information (MI)
$$\begin{aligned}
I(X;Y) &=\int p(x,y)\log \frac{p(x|y)}{p(x)}\text{d}x\text{d}y\\
&=\int p(x,y)\log \frac{p(x,y)}{p(x)p(y)}\text{d}x\text{d}y\\
&=\int p(x,y)\log \frac{p(y|x)}{p(y)}\text{d}x\text{d}y\\
&=I(Y;X)
\end{aligned}$$
• Integration by parts
$$\int u(x)v'(x)\text{d}x= \left.u(x)v(x)\right|_{x=-\infty}^{x=+\infty}-\int u'(x)v(x)\text{d}x,$$
where $v'(x)$ denotes $\frac{\text{d}v(x)}{\text{d}x}$.

Theorem [1, 2]: Consider the linear Gaussian model
$$Y=\sqrt{\gamma}X+U,\quad U\sim \mathcal{N}(0,1),$$
where the noise $U$ is independent of the input $X$ and $\gamma>0$ is the signal-to-noise ratio (SNR). With $I(X;Y)$ measured in nats, we have
$$\frac{\text{d}I(X;Y)}{\text{d}\gamma}=\frac{1}{2}\text{MMSE},$$
in which
$$\text{MMSE}=\int (x-\hat{x})^2p(x,y;\gamma)\text{d}x\text{d}y$$
and $\hat{x}=\int xp(x|y;\gamma)\text{d}x$ is the posterior mean of $X$ given $Y=y$.
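
As a quick sanity check of the statement (an illustrative special case, not part of the proof), assume a standard Gaussian input $X\sim\mathcal{N}(0,1)$ independent of $U$. The symbol $h(\cdot)$ for differential entropy is introduced here only for this example. In that case $Y\sim\mathcal{N}(0,1+\gamma)$, and every quantity in the theorem is available in closed form:

$$\begin{aligned}
I(X;Y)&=h(Y)-h(Y|X)=\frac{1}{2}\log \left(2\pi e(1+\gamma)\right)-\frac{1}{2}\log (2\pi e)=\frac{1}{2}\log (1+\gamma),\\
\frac{\text{d}I(X;Y)}{\text{d}\gamma}&=\frac{1}{2(1+\gamma)},\\
\hat{x}&=\frac{\sqrt{\gamma}}{1+\gamma}y,\\
\text{MMSE}&=\mathbb{E}\left\{(X-\hat{X})^2\right\}=1-\frac{\gamma}{1+\gamma}=\frac{1}{1+\gamma}.
\end{aligned}$$

Both sides of the theorem therefore equal $\frac{1}{2(1+\gamma)}$ for this input.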

*Proof*: Define
$$p_k(y;\gamma)=\int x^k p(y,x;\gamma)\text{d}x=\mathbb{E}_X\left\{X^kp(y|X;\gamma)\right\},$$
so that $p_0(y;\gamma)=p(y;\gamma)$ is the marginal density of $Y$. The following two conclusions hold.
1.
$$\begin{aligned}
\frac{\text{d} p_k(y;\gamma)}{\text{d}\gamma} &=\frac{1}{2\sqrt{\gamma} }yp_{k+1}(y;\gamma)-\frac{1}{2}p_{k+2}(y;\gamma)\\
&=-\frac{1}{2\sqrt{\gamma} }\frac{\text{d} }{\text{d}y}p_{k+1}(y;\gamma)
\end{aligned}$$
2.
$$\hat{x}_{\text{MMSE} }=\int xp(x|y;\gamma)\text{d}x=\frac{p_1(y;\gamma)}{p_0(y;\gamma)}$$
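
Conclusion 1 can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-point discrete input (the support `xs` and weights `ws` are made up for illustration), for which $p_k(y;\gamma)$ is a finite sum. It compares a central finite difference in $\gamma$ with the closed-form middle expression, and that expression with a central finite difference in $y$.

```python
import math

def p_k(y, gamma, k, xs, ws):
    """p_k(y; gamma) = sum_i w_i * x_i**k * N(y; sqrt(gamma)*x_i, 1) for a discrete input X."""
    s = math.sqrt(gamma)
    return sum(
        w * x ** k * math.exp(-0.5 * (y - s * x) ** 2) / math.sqrt(2.0 * math.pi)
        for x, w in zip(xs, ws)
    )

# Hypothetical three-point input distribution, chosen only for illustration.
xs = [-1.0, 0.5, 2.0]
ws = [0.3, 0.5, 0.2]

gamma, k, h = 1.7, 1, 1e-5
ys = [-6.0 + 0.1 * i for i in range(141)]

err_gamma = 0.0  # max |d p_k / d gamma - closed form|
err_y = 0.0      # max |closed form + (1 / (2 sqrt(gamma))) d p_{k+1} / dy|
for y in ys:
    # Central finite difference of p_k with respect to gamma.
    d_gamma = (p_k(y, gamma + h, k, xs, ws) - p_k(y, gamma - h, k, xs, ws)) / (2 * h)
    # Middle expression of conclusion 1.
    closed = (y * p_k(y, gamma, k + 1, xs, ws) / (2 * math.sqrt(gamma))
              - 0.5 * p_k(y, gamma, k + 2, xs, ws))
    # Central finite difference of p_{k+1} with respect to y.
    d_y = (p_k(y + h, gamma, k + 1, xs, ws) - p_k(y - h, gamma, k + 1, xs, ws)) / (2 * h)
    err_gamma = max(err_gamma, abs(d_gamma - closed))
    err_y = max(err_y, abs(closed + d_y / (2 * math.sqrt(gamma))))

print(err_gamma, err_y)
```

Both printed errors should be at the level of finite-difference noise. Changing `k`, `gamma`, or the input distribution exercises the identity for other cases.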

The mutual information can be decomposed as
$$\begin{aligned}
I(X;Y) &=\int p(y,x;\gamma)\log \frac{p(y|x;\gamma)}{p(y;\gamma)}\text{d}x\text{d}y\\
&=\underbrace{\int p(y,x;\gamma)\log p(y|x;\gamma)\text{d}x\text{d}y}_{\xi}-\underbrace{\int p(y,x;\gamma)\log p(y;\gamma)\text{d}x\text{d}y}_{\zeta}.
\end{aligned}$$
We compute $\xi$ and $\zeta$ in turn:
$$\begin{aligned}
\xi &=\int p(y|x;\gamma)p(x)\log p(y|x;\gamma)\text{d}x\text{d}y\\
&\overset{(a)}{=}-\frac{1}{2}\log (2\pi)\int p(y|x;\gamma)p(x)\text{d}x\text{d}y-\frac{1}{2}\int (y-\sqrt{\gamma}x)^2p(y|x;\gamma)p(x)\text{d}x\text{d}y\\
&=-\frac{1}{2}\log (2\pi e),
\end{aligned}$$
where $(a)$ uses the fact that $p(y|x;\gamma)=\frac{1}{\sqrt{2\pi} }\exp \left[-\frac{(y-\sqrt{\gamma}x)^2}{2}\right]$, and the last step uses $\int (y-\sqrt{\gamma}x)^2p(y|x;\gamma)\text{d}y=1$ for every $x$ (the variance of $U$). Hence $\xi$ does not depend on $\gamma$.
$$\begin{aligned}
\zeta &=\int p(y,x;\gamma)\log p(y;\gamma)\text{d}x\text{d}y\\
&=\int_y\left(\int_x p(y,x;\gamma)\text{d}x\right)\log p(y;\gamma)\text{d}y\\
&=\int_y p(y;\gamma)\log p(y;\gamma)\text{d}y.
\end{aligned}$$
Since $\xi$ is constant in $\gamma$, differentiating $I(X;Y)=\xi-\zeta$ with respect to $\gamma$ yields
$$\begin{aligned}
\frac{\text{d}I(X;Y)}{\text{d}\gamma} &=-\frac{\text{d} }{\text{d}\gamma}\int p_0(y;\gamma)\log p_0(y;\gamma)\text{d}y\\
&=-\int \left[\log p_0(y;\gamma)+1\right]\frac{\text{d}p_0(y;\gamma)}{\text{d}\gamma}\text{d}y\\
&=\frac{1}{2\sqrt{\gamma} }\int \log p_0(y;\gamma)\frac{\text{d}p_1(y;\gamma)}{\text{d}y}\text{d}y+\frac{1}{2\sqrt{\gamma} }\underbrace{\int \frac{\text{d}p_1(y;\gamma)}{\text{d}y}\text{d}y}_{\kappa}\\
&\overset{(a)}{=}\frac{1}{2\sqrt{\gamma} }\int \log p_0(y;\gamma)\frac{\text{d}p_1(y;\gamma)}{\text{d}y}\text{d}y\\
&\overset{(b)}{=}-\frac{1}{2\sqrt{\gamma} }\int \frac{p_1(y;\gamma)}{p_0(y;\gamma)}\frac{\text{d}p_0(y;\gamma)}{\text{d}y}\text{d}y\\
&\overset{(c)}{=}\frac{1}{2\sqrt{\gamma} }\int \frac{p_1(y;\gamma)}{p_0(y;\gamma)}\left[y-\sqrt{\gamma}\frac{p_1(y;\gamma)}{p_0(y;\gamma)}\right]p_0(y;\gamma)\text{d}y
\end{aligned}$$
where the third line substitutes conclusion 1 with $k=0$, and $(a)$ holds because the integral of a derivative reduces to its boundary values, which vanish:
$$\kappa=\left.p_1(y;\gamma)\right|_{y=-\infty}^{y=+\infty}=0.$$
$(b)$ follows from integration by parts, with the boundary term again vanishing:
$$\begin{aligned}
\int \log p_0(y;\gamma)\frac{\text{d}p_1(y;\gamma)}{\text{d}y}\text{d}y &=\left.{p_1(y;\gamma)\log p_0(y;\gamma)}\right|_{y=-\infty}^{y=+\infty}-\int \frac{p_1(y;\gamma)}{p_0(y;\gamma)}\frac{\text{d}p_0(y;\gamma)}{\text{d}y}\text{d}y\\
&=-\int \frac{p_1(y;\gamma)}{p_0(y;\gamma)}\frac{\text{d}p_0(y;\gamma)}{\text{d}y}\text{d}y,
\end{aligned}$$
and $(c)$ holds by the identity $\frac{\text{d}p_0(y;\gamma)}{\text{d}y}=-yp_0(y;\gamma)+\sqrt{\gamma}p_1(y;\gamma)$, which is the second equality of conclusion 1 written for $p_0$.

Based on the above, we have
$$\begin{aligned}
\frac{\text{d}I(X;Y)}{\text{d}\gamma} &=\frac{1}{2\sqrt{\gamma} }\int \left(\int xp(x|y;\gamma)\text{d}x\right)\left[y-\sqrt{\gamma}\left(\int xp(x|y;\gamma)\text{d}x\right)\right]p(y;\gamma)\text{d}y\\
&=\frac{1}{2\sqrt{\gamma} }\int_{x,y} xyp(x,y;\gamma)\text{d}x\text{d}y-\frac{1}{2}\int_y\left(\int xp(x|y;\gamma)\text{d}x\right)^2p(y;\gamma)\text{d}y\\
&=\frac{1}{2}\int x^2p(x,y;\gamma)\text{d}x\text{d}y-\frac{1}{2}\int \hat{x}^2p(x,y;\gamma)\text{d}x\text{d}y\\
&=\frac{1}{2}\mathbb{E}\left\{X^2-\hat{X}^2\right\}\\
&\overset{(d)}{=}\frac{1}{2}\text{MMSE},
\end{aligned}$$
where the expectation is taken over $p(x,y;\gamma)$. The third line uses $\mathbb{E}\{XY\}=\mathbb{E}\{X(\sqrt{\gamma}X+U)\}=\sqrt{\gamma}\mathbb{E}\{X^2\}$, since the noise $U$ is zero-mean and independent of $X$. In addition, $(d)$ holds by
$$\begin{aligned}
\int (x-\hat{x})^2p(x,y;\gamma)\text{d}x\text{d}y &=\int x^2p(x,y;\gamma)\text{d}x\text{d}y+\int \hat{x}^2 p(x,y;\gamma)\text{d}x\text{d}y-2\int x\hat{x}p(x,y;\gamma)\text{d}x\text{d}y\\
&=\int x^2p(x)\text{d}x+ \int \hat{x}^2p(y;\gamma)\text{d}y-2\int \hat{x} \left(\int x p(x|y;\gamma)\text{d}x\right) p(y;\gamma)\text{d}y\\
&=\int x^2p(x)\text{d}x+\int \hat{x}^2p(y;\gamma)\text{d}y-2\int \hat{x}^2p(y;\gamma)\text{d}y\\
&=\int x^2p(x)\text{d}x-\int \hat{x}^2p(y;\gamma)\text{d}y\\
&=\int (x^2-\hat{x}^2)p(x,y;\gamma)\text{d}x\text{d}y.
\end{aligned}$$
Note that $\hat{x}=\int xp(x|y;\gamma)\text{d}x$ is a function of $y$ only.
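
The final identity can also be checked numerically. The sketch below assumes an equiprobable binary input $X\in\{-1,+1\}$ (an illustrative choice, not taken from the original text), computes $I(X;Y)=\xi-\zeta$ and $\mathbb{E}\{X^2-\hat{X}^2\}$ by trapezoidal integration over $y$, and compares a finite-difference slope of $I(X;Y)$ with $\frac{1}{2}\text{MMSE}$. The function name `mi_and_mmse`, the integration range, and the grid size are hypothetical choices.

```python
import math

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def mi_and_mmse(gamma, xs, ws, lo=-12.0, hi=12.0, n=4801):
    """Return (I(X;Y) in nats, MMSE) for Y = sqrt(gamma) X + U with a discrete input X."""
    s = math.sqrt(gamma)
    dy = (hi - lo) / (n - 1)
    zeta = 0.0        # accumulates the integral of p0 * log(p0)
    xhat_power = 0.0  # accumulates the integral of xhat**2 * p0
    for i in range(n):
        y = lo + i * dy
        wt = 0.5 * dy if i in (0, n - 1) else dy  # trapezoidal weights
        p0 = sum(w * phi(y - s * x) for x, w in zip(xs, ws))
        p1 = sum(w * x * phi(y - s * x) for x, w in zip(xs, ws))
        if p0 > 0.0:
            xhat = p1 / p0  # conclusion 2: posterior mean
            zeta += wt * p0 * math.log(p0)
            xhat_power += wt * xhat * xhat * p0
    ex2 = sum(w * x * x for x, w in zip(xs, ws))
    xi = -0.5 * math.log(2.0 * math.pi * math.e)
    return xi - zeta, ex2 - xhat_power  # MMSE = E{X^2 - Xhat^2}

# Equiprobable binary input, assumed only for this example.
xs, ws = [-1.0, 1.0], [0.5, 0.5]
gamma, h = 1.0, 1e-4

mi_plus, _ = mi_and_mmse(gamma + h, xs, ws)
mi_minus, _ = mi_and_mmse(gamma - h, xs, ws)
_, mmse = mi_and_mmse(gamma, xs, ws)

slope = (mi_plus - mi_minus) / (2 * h)  # central finite difference of I in gamma
print(slope, 0.5 * mmse)
```

The two printed numbers should agree up to discretization error. Other discrete inputs can be tried by changing `xs` and `ws`, provided the integration range still covers the bulk of $p(y;\gamma)$.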

# References

[1] Guo D. Gaussian channels: Information, estimation and multiuser detection[D]. Princeton University, 2004.
[2] Guo D, Shamai S, Verdú S. Mutual information and minimum mean-square error in Gaussian channels[J]. IEEE Transactions on Information Theory, 2005, 51(4): 1261-1282.