Derivation of the Core Steps in Bayesian Non-negative Matrix Factorization

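I recently read an older paper, "Bayesian non-negative matrix factorization", which appears in a proceedings volume on pp. 540-547. The paper reworks non-negative matrix factorization with Bayesian methods, but its derivations are very terse. This note records the core derivations, namely those of Eq. (5) and Eq. (7) in the original paper.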

Define $\mathbf{X} = \mathbf{AB} + \mathbf{E}$, where $\mathbf{X} \in \mathbb{R}^{I \times J}$, $\mathbf{A} \in \mathbb{R}^{I \times N}$, and $\mathbf{B} \in \mathbb{R}^{N \times J}$. The likelihood of $\mathbf{X}$ is:

$$p\left(\mathbf{X} \mid \mathbf{A},\mathbf{B},\sigma^2\right) = \prod_{i,j} \mathcal{N}\left(\mathbf{X}_{i,j} \mid (\mathbf{AB})_{i,j},\sigma^2\right) = \prod_{i,j} \left(2\pi\sigma^2\right)^{-1/2} \exp\left\{ -\frac{\left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2}{2\sigma^2} \right\}$$
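As a small illustration, the log of this likelihood can be evaluated in a few lines of NumPy. This is only a sketch: the function name `gaussian_log_likelihood` and the toy matrices are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def gaussian_log_likelihood(X, A, B, sigma2):
    """log p(X | A, B, sigma2) for i.i.d. Gaussian noise with variance sigma2."""
    resid = X - A @ B          # E = X - AB, shape (I, J)
    n_obs = X.size             # I * J factors in the product
    return (-0.5 * n_obs * np.log(2.0 * np.pi * sigma2)
            - 0.5 * np.sum(resid ** 2) / sigma2)

# Toy data (hypothetical sizes I=4, N=2, J=5)
rng = np.random.default_rng(0)
A = rng.exponential(size=(4, 2))
B = rng.exponential(size=(2, 5))
X = A @ B + 0.1 * rng.standard_normal((4, 5))
print(gaussian_log_likelihood(X, A, B, sigma2=0.01))
```

Working in the log domain turns the product over $(i,j)$ into a sum, which is numerically safer than multiplying $IJ$ densities.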

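The priors on $\mathbf{A}$ and $\mathbf{B}$ are exponential, where $\varepsilon$ denotes the exponential density and $u(\cdot)$ is the unit step function: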

$$p\left(\mathbf{A}\right) = \prod_{i,n} \varepsilon\left(\mathbf{A}_{i,n};\alpha_{i,n}\right) = \prod_{i,n} \alpha_{i,n}\exp\left(-\alpha_{i,n}\mathbf{A}_{i,n}\right) u\left(\mathbf{A}_{i,n}\right)$$

$$p\left(\mathbf{B}\right) = \prod_{n,j} \varepsilon\left(\mathbf{B}_{n,j};\beta_{n,j}\right) = \prod_{n,j} \beta_{n,j}\exp\left(-\beta_{n,j}\mathbf{B}_{n,j}\right) u\left(\mathbf{B}_{n,j}\right)$$

In addition, the noise variance $\sigma^2$ is given an inverse-gamma prior:

$$p\left(\sigma^2\right) = \mathcal{G}^{-1}\left(\sigma^2;k,\theta\right) = \frac{\theta^k}{\Gamma\left(k\right)}\left(\sigma^2\right)^{-k-1}\exp\left(-\frac{\theta}{\sigma^2}\right)$$
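To make the model concrete, here is a minimal sketch that draws one data set from the priors and the likelihood defined so far. The function name `sample_model`, the default hyperparameter values, and the use of a single scalar rate for all entries of $\mathbf{A}$ (and of $\mathbf{B}$) are assumptions made for illustration; the text allows a separate rate $\alpha_{i,n}$, $\beta_{n,j}$ per entry.

```python
import numpy as np

def sample_model(I, J, N, alpha=1.0, beta=1.0, k=2.0, theta=1.0, seed=0):
    """Draw (X, A, B, sigma2) from the generative model (illustrative defaults)."""
    rng = np.random.default_rng(seed)
    # NumPy parameterizes the exponential by its scale, i.e. 1 / rate.
    A = rng.exponential(scale=1.0 / alpha, size=(I, N))
    B = rng.exponential(scale=1.0 / beta, size=(N, J))
    # If g ~ Gamma(shape=k, rate=theta), then 1/g follows the inverse-gamma
    # density theta^k / Gamma(k) * x^(-k-1) * exp(-theta / x).
    sigma2 = 1.0 / rng.gamma(shape=k, scale=1.0 / theta)
    X = A @ B + np.sqrt(sigma2) * rng.standard_normal((I, J))
    return X, A, B, sigma2

X, A, B, sigma2 = sample_model(I=5, J=8, N=3)
print(X.shape, A.min() >= 0, B.min() >= 0, sigma2 > 0)
```

Note that $\mathbf{X}$ itself may contain negative entries here, because the Gaussian noise is not constrained; only $\mathbf{A}$ and $\mathbf{B}$ are non-negative by construction.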

The conditional posterior density of each element of $\mathbf{A}$ and $\mathbf{B}$ is a Gaussian multiplied by an exponential density supported on the non-negative half-line, that is, a truncated Gaussian. We denote this form by $\mathcal{R}\left(x;\mu,\sigma^2,\lambda\right) \propto \mathcal{N}\left(x;\mu,\sigma^2\right)\varepsilon\left(x;\lambda\right)$. The conditional density of $\mathbf{A}_{i,n}$ is therefore:

$$\begin{aligned} p\left(\mathbf{A}_{i,n} \mid \mathbf{X},\mathbf{A}_{\backslash(i,n)},\mathbf{B},\sigma^2\right) &= \mathcal{R}\left(\mathbf{A}_{i,n};\mu_{\mathbf{A}_{i,n}},\sigma_{\mathbf{A}_{i,n}}^2,\alpha_{i,n}\right) \propto \mathcal{N}\left(\mathbf{A}_{i,n};\mu_{\mathbf{A}_{i,n}},\sigma_{\mathbf{A}_{i,n}}^2\right)\varepsilon\left(\mathbf{A}_{i,n};\alpha_{i,n}\right)\\ &\propto \varepsilon\left(\mathbf{A}_{i,n};\alpha_{i,n}\right)\prod_j \left(2\pi\sigma^2\right)^{-1/2}\exp\left\{ -\frac{\left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2}{2\sigma^2} \right\} \end{aligned} \tag{1}$$

 

For convenience, first consider the exponent of the expression above:

$$\begin{aligned} &-\frac{1}{2\sigma^2}\sum_j \left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2\\ &= -\frac{1}{2\sigma^2}\sum_j \left\{ \mathbf{X}_{i,j}^2 - 2\mathbf{X}_{i,j}(\mathbf{AB})_{i,j} + \left[(\mathbf{AB})_{i,j}\right]^2 \right\}\\ &= -\frac{1}{2\sigma^2}\sum_j \left\{ \mathbf{X}_{i,j}^2 - 2\mathbf{X}_{i,j}\mathbf{A}_{i,n}\mathbf{B}_{n,j} - 2\mathbf{X}_{i,j}\sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j} + \left(\sum_{m} \mathbf{A}_{i,m}\mathbf{B}_{m,j}\right)^2 \right\} \end{aligned} \tag{2}$$

where the sum over $m$ runs over all $N$ components and can be split into the $n$-th term and the rest:

$$\left(\sum_{m} \mathbf{A}_{i,m}\mathbf{B}_{m,j}\right)^2 = \left(\mathbf{A}_{i,n}\mathbf{B}_{n,j}\right)^2 + 2\mathbf{A}_{i,n}\mathbf{B}_{n,j}\sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j} + \left(\sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)^2 \tag{3}$$

Substituting (3) back into (2):

$$\begin{aligned} &-\frac{1}{2\sigma^2}\sum_j \left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2\\ &= -\frac{1}{2\sigma^2}\sum_j \left\{ \mathbf{X}_{i,j}^2 - 2\mathbf{X}_{i,j}\mathbf{A}_{i,n}\mathbf{B}_{n,j} - 2\mathbf{X}_{i,j}\sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j} + \left(\mathbf{A}_{i,n}\mathbf{B}_{n,j}\right)^2 + 2\mathbf{A}_{i,n}\mathbf{B}_{n,j}\sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j} + \left(\sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)^2 \right\}\\ &= -\frac{1}{2\sigma^2}\sum_j \left\{ \left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)^2 - 2\mathbf{A}_{i,n}\mathbf{B}_{n,j}\left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right) + \left(\mathbf{A}_{i,n}\mathbf{B}_{n,j}\right)^2 \right\}\\ &= -\frac{1}{2\sigma^2}\left[ \left(\mathbf{A}_{i,n}\right)^2\sum_j \left(\mathbf{B}_{n,j}\right)^2 - 2\mathbf{A}_{i,n}\sum_j \mathbf{B}_{n,j}\left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right) + \sum_j \left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)^2 \right]\\ &= -\frac{\sum_j \left(\mathbf{B}_{n,j}\right)^2}{2\sigma^2}\left[ \left(\mathbf{A}_{i,n}\right)^2 - \frac{2\mathbf{A}_{i,n}\sum_j \mathbf{B}_{n,j}\left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)}{\sum_j \left(\mathbf{B}_{n,j}\right)^2} + \cdots \right] \end{aligned}$$

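At this point it looks as though the expression cannot be turned into a perfect square. Looking at the second term, however, everything apart from $\mathbf{A}_{i,n}$ itself, namely the factor $\frac{2\sum_j \mathbf{B}_{n,j}\left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)}{\sum_j \left(\mathbf{B}_{n,j}\right)^2}$, does not depend on $\mathbf{A}_{i,n}$. Beyond the first two terms, the third term $\sum_j \left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)^2$ is also independent of $\mathbf{A}_{i,n}$. We can therefore complete the square using the first two terms alone. Whatever is left over sits in the exponent and ends up as a constant of proportionality, so the expression above can be rewritten as: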

$$\begin{aligned} &-\frac{1}{2\sigma^2}\sum_j \left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2\\ &= -\frac{\sum_j \left(\mathbf{B}_{n,j}\right)^2}{2\sigma^2}\left[ \left(\mathbf{A}_{i,n}\right)^2 - \frac{2\mathbf{A}_{i,n}\sum_j \mathbf{B}_{n,j}\left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)}{\sum_j \left(\mathbf{B}_{n,j}\right)^2} + \left\{ \frac{\sum_j \mathbf{B}_{n,j}\left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)}{\sum_j \left(\mathbf{B}_{n,j}\right)^2} \right\}^2 \right] + C\\ &= -\frac{\sum_j \left(\mathbf{B}_{n,j}\right)^2}{2\sigma^2}\left\{ \mathbf{A}_{i,n} - \frac{\sum_j \mathbf{B}_{n,j}\left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)}{\sum_j \left(\mathbf{B}_{n,j}\right)^2} \right\}^2 + C \end{aligned} \tag{4}$$

Here $C$ collects all terms that do not depend on $\mathbf{A}_{i,n}$.
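The completed square in (4) is easy to check numerically: if the algebra is right, the difference between the original exponent and the squared term must be the same constant $C$ for every value of $\mathbf{A}_{i,n}$. The sketch below does this on random data; the sizes, the seed, the chosen entry `(i, n)`, and the helper names are arbitrary assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, N = 3, 6, 4
A = rng.exponential(size=(I, N))
B = rng.exponential(size=(N, J))
X = rng.exponential(size=(I, J))
sigma2, i, n = 0.5, 1, 2

def exponent(a):
    """Left-hand side of (4) with A[i, n] set to a."""
    A_try = A.copy()
    A_try[i, n] = a
    return -np.sum((X[i] - A_try[i] @ B) ** 2) / (2.0 * sigma2)

# r_j = X_ij - sum_{n' != n} A_in' B_n'j: row-i residual without component n
r = X[i] - A[i] @ B + A[i, n] * B[n]
s = np.sum(B[n] ** 2)          # sum_j B_nj^2
mu = (B[n] @ r) / s            # centre of the completed square

def completed_square(a):
    """Squared term on the right-hand side of (4), without the constant C."""
    return -s / (2.0 * sigma2) * (a - mu) ** 2

grid = np.linspace(0.0, 3.0, 7)
C = np.array([exponent(a) - completed_square(a) for a in grid])
print(C.max() - C.min())       # spread of C over the grid; should be ~0
```

A spread at the level of floating-point round-off indicates that $C$ really is independent of $\mathbf{A}_{i,n}$.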

Substituting Eq. (4) back into Eq. (1):

$$\begin{aligned} p\left(\mathbf{A}_{i,n} \mid \mathbf{X},\mathbf{A}_{\backslash(i,n)},\mathbf{B},\sigma^2\right) &= \mathcal{R}\left(\mathbf{A}_{i,n};\mu_{\mathbf{A}_{i,n}},\sigma_{\mathbf{A}_{i,n}}^2,\alpha_{i,n}\right) \propto \mathcal{N}\left(\mathbf{A}_{i,n};\mu_{\mathbf{A}_{i,n}},\sigma_{\mathbf{A}_{i,n}}^2\right)\varepsilon\left(\mathbf{A}_{i,n};\alpha_{i,n}\right)\\ &\propto \varepsilon\left(\mathbf{A}_{i,n};\alpha_{i,n}\right)\left(\frac{2\pi\sigma^2}{\sum_j \left(\mathbf{B}_{n,j}\right)^2}\right)^{-1/2}\exp\left\{ -\frac{\sum_j \left(\mathbf{B}_{n,j}\right)^2}{2\sigma^2}\left[ \mathbf{A}_{i,n} - \frac{\sum_j \mathbf{B}_{n,j}\left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)}{\sum_j \left(\mathbf{B}_{n,j}\right)^2} \right]^2 \right\} \end{aligned}$$

Therefore, the parameters of $\mathcal{N}\left(\mathbf{A}_{i,n};\mu_{\mathbf{A}_{i,n}},\sigma_{\mathbf{A}_{i,n}}^2\right)$ are:

$$\mu_{\mathbf{A}_{i,n}} = \frac{\sum_j \mathbf{B}_{n,j}\left(\mathbf{X}_{i,j} - \sum_{n' \ne n} \mathbf{A}_{i,n'}\mathbf{B}_{n',j}\right)}{\sum_j \left(\mathbf{B}_{n,j}\right)^2}$$

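$$\sigma_{\mathbf{A}_{i,n}}^2 = \frac{\sigma^2}{\sum_j \left(\mathbf{B}_{n,j}\right)^2}$$

The variance follows from matching the coefficient of the square in (4), $-\frac{\sum_j \left(\mathbf{B}_{n,j}\right)^2}{2\sigma^2}$, with the Gaussian form $-\frac{1}{2\sigma_{\mathbf{A}_{i,n}}^2}$.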
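The two parameters can be computed directly from the row residual, and a draw from $\mathcal{R}$ can then be obtained as a sketch. Reading the variance off the exponent in (4) gives $\sigma^2/\sum_j \mathbf{B}_{n,j}^2$. Multiplying the Gaussian by the exponential factor and completing the square once more shows that $\mathcal{R}(x;\mu,\sigma^2,\lambda)$ is a Gaussian with mean $\mu - \lambda\sigma^2$ and variance $\sigma^2$, truncated to $x \ge 0$. The function names below are hypothetical, and the plain rejection sampler is a simple stand-in chosen for brevity, not necessarily the sampler used in the paper.

```python
import numpy as np

def conditional_params(X, A, B, sigma2, i, n):
    """Mean and variance of the Gaussian factor in the conditional of A[i, n]."""
    r = X[i] - A[i] @ B + A[i, n] * B[n]   # X_ij - sum_{n' != n} A_in' B_n'j
    s = np.sum(B[n] ** 2)                  # sum_j B_nj^2
    return (B[n] @ r) / s, sigma2 / s

def sample_rectified(mu, var, lam, rng, max_tries=100_000):
    """Draw from R(x; mu, var, lam), proportional to N(x; mu, var) * Exp(x; lam).

    Equivalent to a Gaussian with mean mu - lam * var and variance var,
    truncated to x >= 0. Plain rejection sampling: simple, but slow when
    the shifted mean lies far below zero.
    """
    shifted = mu - lam * var
    for _ in range(max_tries):
        x = rng.normal(shifted, np.sqrt(var))
        if x >= 0.0:
            return x
    raise RuntimeError("rejection sampler did not accept a draw")

# Toy usage (hypothetical sizes and hyperparameters)
rng = np.random.default_rng(2)
A = rng.exponential(size=(3, 2))
B = rng.exponential(size=(2, 6))
X = A @ B + 0.1 * rng.standard_normal((3, 6))
mu, var = conditional_params(X, A, B, sigma2=0.01, i=0, n=1)
a_new = sample_rectified(mu, var, lam=1.0, rng=rng)
print(mu, var, a_new)
```

In a Gibbs sweep one would apply this update to every entry $(i,n)$ in turn, and the analogous update to $\mathbf{B}$ with the roles of the two factors exchanged.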


For the noise variance:

$$\begin{aligned} p\left(\sigma^2 \mid \mathbf{A},\mathbf{B},\mathbf{X}\right) &\propto p\left(\sigma^2\right)p\left(\mathbf{X} \mid \mathbf{A},\mathbf{B},\sigma^2\right)\\ &= \frac{\theta^k}{\Gamma\left(k\right)}\left(\sigma^2\right)^{-k-1}\exp\left(-\frac{\theta}{\sigma^2}\right)\prod_{i,j} \left(2\pi\sigma^2\right)^{-1/2}\exp\left\{ -\frac{\left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2}{2\sigma^2} \right\}\\ &= \frac{\theta^k}{\Gamma\left(k\right)}\left(\sigma^2\right)^{-k-1}\exp\left(-\frac{\theta}{\sigma^2}\right)\left(2\pi\sigma^2\right)^{-IJ/2}\exp\left\{ -\frac{\sum_{i,j} \left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2}{2\sigma^2} \right\}\\ &\propto \frac{\left[\theta + \frac{1}{2}\sum_{i,j} \left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2\right]^{k + IJ/2}}{\Gamma\left(k + IJ/2\right)}\left(\sigma^2\right)^{-\left(k + IJ/2\right) - 1}\exp\left( -\frac{\theta + \frac{1}{2}\sum_{i,j} \left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2}{\sigma^2} \right)\\ &= \mathcal{G}^{-1}\left(\sigma^2;k_{\sigma^2},\theta_{\sigma^2}\right) \end{aligned}$$

Hence the parameters of the inverse-gamma distribution followed by the noise variance are updated as:

$k_{\sigma^2} = k + IJ/2$ (I suspect the original paper is in error here: it adds an extra 1),

$$\theta_{\sigma^2} = \theta + \frac{1}{2}\sum_{i,j} \left(\mathbf{X}_{i,j} - (\mathbf{AB})_{i,j}\right)^2$$
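As a sketch of how these two updates might be used inside a sampler, the code below computes the posterior shape and scale and draws $\sigma^2$ by inverting a gamma variate. It uses the shape $k + IJ/2$ derived in this note, without the extra 1. The function names and the toy inputs are assumptions made for the example.

```python
import numpy as np

def noise_posterior_params(X, A, B, k, theta):
    """Shape and scale of the inverse-gamma conditional of sigma^2."""
    resid = X - A @ B
    k_post = k + X.size / 2.0                       # k + IJ/2
    theta_post = theta + 0.5 * np.sum(resid ** 2)   # theta + (1/2) * sum of squared residuals
    return k_post, theta_post

def sample_sigma2(X, A, B, k, theta, rng):
    """Draw sigma^2 from its conditional: 1/g with g ~ Gamma(shape, rate)."""
    k_post, theta_post = noise_posterior_params(X, A, B, k, theta)
    return 1.0 / rng.gamma(shape=k_post, scale=1.0 / theta_post)

# Toy usage (hypothetical sizes and hyperparameters)
rng = np.random.default_rng(3)
A = rng.exponential(size=(4, 2))
B = rng.exponential(size=(2, 5))
X = A @ B                                           # noise-free data for the check
k_post, theta_post = noise_posterior_params(X, A, B, k=2.0, theta=1.0)
sigma2_draw = sample_sigma2(X, A, B, k=2.0, theta=1.0, rng=rng)
print(k_post, theta_post, sigma2_draw)
```

With noise-free data the residual sum vanishes, so the scale stays at its prior value while the shape grows by $IJ/2$.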

 
