Personal blog: www.qiuyun-blog.cn
Notations:
- $\text{Diag}(\boldsymbol{a})$: a diagonal matrix with $\boldsymbol{a}$ as its diagonal elements.
- $\text{diag}(\mathbf{A})$: the vector formed from the diagonal elements of $\mathbf{A}$.
- $\boldsymbol{a}\odot\boldsymbol{b}$: componentwise multiplication.
- $\boldsymbol{a}\oslash\boldsymbol{b}$: componentwise division.
Recap of Variational Inference
In [1], we introduced variational inference and its application to Bayesian linear regression. In this post, we focus on a variational inference perspective on expectation propagation (EP).
In the signal processing regime, the posterior distribution is of primary interest. However, it is often difficult to obtain owing to the many high-dimensional integrals involved. For example, consider the linear Gaussian model
$$\mathbf{y}=\mathbf{Hx}+\mathbf{w}$$
Its posterior distribution is given by
$$p(\mathbf{x}|\mathbf{y})=\frac{p(\mathbf{y}|\mathbf{x})\,p(\mathbf{x})}{\int p(\mathbf{y}|\mathbf{x})\,p(\mathbf{x})\,\text{d}\mathbf{x}}$$
where $p(\mathbf{y}|\mathbf{x})=p_{\mathbf{w}}(\mathbf{y}-\mathbf{Hx})$. Unless both $p(\mathbf{y}|\mathbf{x})$ and $p(\mathbf{x})$ are Gaussian, we cannot obtain the closed form of $p(\mathbf{x}|\mathbf{y})$ directly. Hence, approximations are necessary.
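In the all-Gaussian case the posterior is available in closed form. A minimal scalar sketch (all numeric values are assumptions chosen for illustration) of $y = h\,x + w$ with $x\sim\mathcal{N}(0,v_x)$ and $w\sim\mathcal{N}(0,v_w)$ cross-checks the standard Gaussian posterior formulas against direct numerical integration of Bayes' rule:

```python
import math

# Scalar linear Gaussian model y = h*x + w, x ~ N(0, vx), w ~ N(0, vw).
# Closed-form posterior: 1/v_post = 1/vx + h^2/vw, m_post = v_post*h*y/vw.
h, vx, vw, y = 2.0, 1.5, 0.5, 3.0   # illustrative assumptions

v_post = 1.0 / (1.0 / vx + h * h / vw)
m_post = v_post * h * y / vw

# Cross-check against Bayes' rule by numerical integration.
def unnorm_post(x):
    lik = math.exp(-(y - h * x) ** 2 / (2 * vw))   # p(y|x) up to a constant
    pri = math.exp(-x * x / (2 * vx))              # p(x) up to a constant
    return lik * pri

def integrate(f, lo=-15.0, hi=15.0, n=30000):
    """Midpoint-rule integral of f over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx

z = integrate(unnorm_post)
mean_num = integrate(lambda x: x * unnorm_post(x)) / z
var_num = integrate(lambda x: x * x * unnorm_post(x)) / z - mean_num ** 2

print(m_post, mean_num)   # closed form and numerical mean agree
print(v_post, var_num)    # closed form and numerical variance agree
```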
To this end, we use $q(\mathbf{x})$ to approximate the posterior distribution, and the KL divergence to measure the difference between $q(\mathbf{x})$ and $p(\mathbf{x}|\mathbf{y})$. For tractability, we generally restrict $q(\mathbf{x})$ to a distribution family $\mathcal{S}$, i.e.,
$$q(\mathbf{x})=\underset{q(\mathbf{x})\in\mathcal{S}}{\arg\min}\ \mathcal{D}_{\text{KL}}(p\|q)$$
Obviously, a distribution family with good analytical properties greatly reduces the amount of computation. Fortunately, the exponential family is one such family.
Exponential Family
The exponential family over $\mathbf{x}$ parameterized by $\boldsymbol{\eta}$ is defined by
$$p(\mathbf{x};\boldsymbol{\eta})=h(\mathbf{x})\,g(\boldsymbol{\eta})\exp\left(\boldsymbol{\eta}^T\boldsymbol{u}(\mathbf{x})\right)$$
where $g(\boldsymbol{\eta})$ is the normalization constant, satisfying
$$g(\boldsymbol{\eta})\int h(\mathbf{x})\exp\left(\boldsymbol{\eta}^T\boldsymbol{u}(\mathbf{x})\right)\text{d}\mathbf{x}=1$$
Taking the gradient of both sides of the above w.r.t. $\boldsymbol{\eta}$, we get
$$\nabla g(\boldsymbol{\eta})\int h(\mathbf{x})\exp\left(\boldsymbol{\eta}^T\boldsymbol{u}(\mathbf{x})\right)\text{d}\mathbf{x}+g(\boldsymbol{\eta})\int h(\mathbf{x})\exp\left(\boldsymbol{\eta}^T\boldsymbol{u}(\mathbf{x})\right)\boldsymbol{u}(\mathbf{x})\,\text{d}\mathbf{x}=\mathbf{0}$$
Rearranging yields
$$\begin{aligned}-\frac{1}{g(\boldsymbol{\eta})}\nabla g(\boldsymbol{\eta})&=g(\boldsymbol{\eta})\int\boldsymbol{u}(\mathbf{x})h(\mathbf{x})\exp\left(\boldsymbol{\eta}^T\boldsymbol{u}(\mathbf{x})\right)\text{d}\mathbf{x}\\&=\frac{\int\boldsymbol{u}(\mathbf{x})h(\mathbf{x})\exp\left(\boldsymbol{\eta}^T\boldsymbol{u}(\mathbf{x})\right)\text{d}\mathbf{x}}{\int h(\mathbf{x})\exp\left(\boldsymbol{\eta}^T\boldsymbol{u}(\mathbf{x})\right)\text{d}\mathbf{x}}\\&=\mathbb{E}[\boldsymbol{u}(\mathbf{x})]\end{aligned}$$
Using the fact $\nabla\log g(\boldsymbol{\eta})=\frac{1}{g(\boldsymbol{\eta})}\nabla g(\boldsymbol{\eta})$, we have
$$-\nabla\log g(\boldsymbol{\eta})=\mathbb{E}[\boldsymbol{u}(\mathbf{x})]\qquad\cdots\qquad(*1)$$
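The identity $(*1)$ can be checked numerically. Below is a sketch for a 1-D Gaussian written in exponential-family form with $h(x)=1$ and $\boldsymbol{u}(x)=(x,x^2)$, so $\eta_1=\mu/\sigma^2$ and $\eta_2=-1/(2\sigma^2)$; the chosen natural parameters, grid bounds, and step sizes are assumptions for illustration. We compute $-\nabla\log g(\boldsymbol{\eta})$ by finite differences and compare it with the moments $\mathbb{E}[x]$ and $\mathbb{E}[x^2]$:

```python
import math

# Check  -∇log g(η) = E[u(x)]  for p(x; η) = g(η)·exp(η1·x + η2·x²),
# i.e. a 1-D Gaussian with u(x) = (x, x²) and h(x) = 1.

def integrate(f, lo=-20.0, hi=20.0, n=40000):
    """Midpoint-rule integral of f over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx

def log_g(eta1, eta2):
    # g(η)·∫ exp(η^T u(x)) dx = 1  ⇒  log g(η) = -log ∫ exp(η1·x + η2·x²) dx
    return -math.log(integrate(lambda x: math.exp(eta1 * x + eta2 * x * x)))

def moments(eta1, eta2):
    z = integrate(lambda x: math.exp(eta1 * x + eta2 * x * x))
    e_x = integrate(lambda x: x * math.exp(eta1 * x + eta2 * x * x)) / z
    e_x2 = integrate(lambda x: x * x * math.exp(eta1 * x + eta2 * x * x)) / z
    return e_x, e_x2

eta1, eta2 = 1.0, -0.5   # corresponds to mean 1, variance 1
step = 1e-5              # finite-difference step for the gradient
grad1 = (log_g(eta1 + step, eta2) - log_g(eta1 - step, eta2)) / (2 * step)
grad2 = (log_g(eta1, eta2 + step) - log_g(eta1, eta2 - step)) / (2 * step)
e_x, e_x2 = moments(eta1, eta2)

print(-grad1, e_x)    # both ≈ 1.0  (E[x] = μ)
print(-grad2, e_x2)   # both ≈ 2.0  (E[x²] = σ² + μ²)
```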
A Variational Inference Perspective on EP
For the distribution $q(\mathbf{x})$ in variational inference, we take an exponential family distribution into account:
$$q(\mathbf{x})=h(\mathbf{x})\,g(\boldsymbol{\eta})\exp\left(\boldsymbol{\eta}^T\boldsymbol{u}(\mathbf{x})\right)$$
Substituting this into $\mathcal{D}_{\text{KL}}(p\|q)=\int p(\mathbf{x})\log\frac{p(\mathbf{x})}{q(\mathbf{x})}\,\text{d}\mathbf{x}$ and collecting the terms independent of $\boldsymbol{\eta}$ into a constant, we write $\mathcal{D}_{\text{KL}}(p\|q)$ as
$$\mathcal{D}_{\text{KL}}(p\|q)=-\log g(\boldsymbol{\eta})-\boldsymbol{\eta}^T\mathbb{E}_{p(\mathbf{x})}[\boldsymbol{u}(\mathbf{x})]+\text{const}$$
Setting the gradient of the above w.r.t. $\boldsymbol{\eta}$ to zero yields
$$-\nabla\log g(\boldsymbol{\eta})=\mathbb{E}_{p(\mathbf{x})}[\boldsymbol{u}(\mathbf{x})]$$
Since $(*1)$ identifies the left-hand side as $\mathbb{E}_{q(\mathbf{x})}[\boldsymbol{u}(\mathbf{x})]$, we then get
$$\mathbb{E}_{q(\mathbf{x})}[\boldsymbol{u}(\mathbf{x})]=\mathbb{E}_{p(\mathbf{x})}[\boldsymbol{u}(\mathbf{x})]$$
That is, minimizing $\mathcal{D}_{\text{KL}}(p\|q)$ over an exponential family reduces to matching the expected sufficient statistics (moments) of $p$, which is the moment-matching step at the core of expectation propagation.
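The moment-matching conclusion can also be illustrated numerically. In the sketch below (the two-component Gaussian-mixture target and all numeric values are assumptions for illustration), the Gaussian $q$ has sufficient statistics $u(x)=(x,x^2)$, so the result says the KL-optimal $q$ must match the mean and variance of $p$; we verify that perturbing either matched moment increases $\mathcal{D}_{\text{KL}}(p\|q)$:

```python
import math

# Target p: a two-component Gaussian mixture (illustrative assumption).
def p(x):
    n = lambda x, m, s: math.exp(-(x - m) ** 2 / (2 * s * s)) / (s * math.sqrt(2 * math.pi))
    return 0.5 * n(x, -2.0, 1.0) + 0.5 * n(x, 2.0, 0.5)

def integrate(f, lo=-12.0, hi=12.0, n=24000):
    """Midpoint-rule integral of f over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * dx) for i in range(n)) * dx

def kl_p_q(mu, var):
    """D_KL(p || N(mu, var)) by numerical integration."""
    def log_q(x):
        return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)
    return integrate(lambda x: p(x) * (math.log(p(x)) - log_q(x)))

# Moment matching: E_q[x] = E_p[x] and E_q[x²] = E_p[x²].
mean = integrate(lambda x: x * p(x))
var = integrate(lambda x: x * x * p(x)) - mean ** 2

best = kl_p_q(mean, var)
# Any perturbation of the matched moments increases the divergence.
assert best < kl_p_q(mean + 0.1, var)
assert best < kl_p_q(mean, var * 1.1)
```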