PLSA+EM

  1. With the latent topic variable $z_k$ included, the joint and conditional probabilities for document $d_i$ and word $w_j$ are:
    $$p\left(d_{i}, z_{k}, w_{j}\right)=p\left(d_{i}\right) p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)$$
    $$\begin{array}{c} P\left(w_{j} | d_{i}\right)=\sum_{k=1}^{K} P\left(z_{k} | d_{i}\right) P\left(w_{j} | z_{k}\right) \\ P\left(d_{i}, w_{j}\right)=P\left(d_{i}\right) \sum_{k=1}^{K} P\left(w_{j} | z_{k}\right) P\left(z_{k} | d_{i}\right) \end{array}$$
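The mixture formula for $P(w_j | d_i)$ can be checked numerically with a small sketch. The parameter values and the names `p_z_d`, `p_w_z`, and `word_given_doc` are assumptions made for illustration; they do not come from the original post.

```python
# Toy parameters (assumed values, for illustration only).
# p_z_d[i][k] = p(z_k | d_i): one row per document, each row sums to 1.
p_z_d = [[0.9, 0.1],
         [0.2, 0.8]]
# p_w_z[k][j] = p(w_j | z_k): one row per topic, each row sums to 1.
p_w_z = [[0.5, 0.5, 0.0],
         [0.0, 0.25, 0.75]]


def word_given_doc(p_z_d, p_w_z):
    """Return p_w_d[i][j] = sum_k p(z_k | d_i) * p(w_j | z_k)."""
    n_topics = len(p_w_z)
    n_words = len(p_w_z[0])
    return [[sum(row[k] * p_w_z[k][j] for k in range(n_topics))
             for j in range(n_words)]
            for row in p_z_d]


p_w_d = word_given_doc(p_z_d, p_w_z)
```

Each row of `p_w_d` is again a probability distribution over words, because it is a convex combination of the topic rows of `p_w_z`.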

  2. This gives the log-likelihood function, where $n\left(d_{i}, w_{j}\right)$ is the number of times word $w_j$ occurs in document $d_i$:
    $$L=\sum_{i=1}^{N} \sum_{j=1}^{M}\left[n\left(d_{i}, w_{j}\right) \log P\left(d_{i}\right)+n\left(d_{i}, w_{j}\right) \log \sum_{k=1}^{K} P\left(w_{j} | z_{k}\right) P\left(z_{k} | d_{i}\right)\right]$$
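As a rough sketch of how this log-likelihood could be evaluated, the function below loops over a count matrix directly. The toy counts, the uniform `p_d`, and all names are assumptions made for illustration; the post does not say how $P(d_i)$ is estimated.

```python
import math

# Toy inputs (assumed values, for illustration only).
counts = [[2, 1, 0],          # counts[i][j] = n(d_i, w_j)
          [0, 1, 3]]
p_d = [0.5, 0.5]              # p_d[i] = P(d_i), assumed uniform here
p_z_d = [[0.9, 0.1],          # p_z_d[i][k] = P(z_k | d_i)
         [0.2, 0.8]]
p_w_z = [[0.5, 0.5, 0.0],     # p_w_z[k][j] = P(w_j | z_k)
         [0.0, 0.25, 0.75]]


def log_likelihood(counts, p_d, p_z_d, p_w_z):
    """Evaluate L = sum_ij n_ij * [log P(d_i) + log sum_k P(w_j|z_k) P(z_k|d_i)]."""
    n_topics = len(p_w_z)
    total = 0.0
    for i, row in enumerate(counts):
        for j, n_ij in enumerate(row):
            if n_ij == 0:
                continue  # a zero count contributes nothing to the sum
            mixture = sum(p_z_d[i][k] * p_w_z[k][j] for k in range(n_topics))
            total += n_ij * (math.log(p_d[i]) + math.log(mixture))
    return total


L = log_likelihood(counts, p_d, p_z_d, p_w_z)
```

Skipping zero counts also avoids taking the logarithm of a zero mixture probability for words that never occur in a document.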

  3. E-step: compute the posterior probability of the latent variable. From the factorization in step 1:
    $$\gamma\left(z_{i j k}\right)=p\left(z_{k} | d_{i}, w_{j}\right)=\frac{p\left(d_{i}\right) p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)}{\sum_{k=1}^{K} p\left(d_{i}\right) p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)}$$
    Since $p(d_i)$ does not depend on $k$, it cancels between the numerator and the denominator, giving:
    $$\gamma\left(z_{i j k}\right)=\frac{p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)}{\sum_{k=1}^{K} p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)}$$
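The E-step formula translates fairly directly into code. The following is a minimal sketch under assumed toy parameters; the names and the uniform fallback for a zero normalizer are illustrative choices, not part of the original derivation.

```python
# Toy parameters (assumed values, for illustration only).
p_z_d = [[0.9, 0.1],          # p_z_d[i][k] = p(z_k | d_i)
         [0.2, 0.8]]
p_w_z = [[0.5, 0.5, 0.0],     # p_w_z[k][j] = p(w_j | z_k)
         [0.0, 0.25, 0.75]]


def e_step(p_z_d, p_w_z):
    """Return gamma[i][j][k] = p(z_k | d_i, w_j) for every (document, word) pair."""
    n_topics = len(p_w_z)
    n_words = len(p_w_z[0])
    gamma = []
    for row in p_z_d:
        doc_posteriors = []
        for j in range(n_words):
            # Unnormalized posterior: p(z_k | d_i) * p(w_j | z_k).
            joint = [row[k] * p_w_z[k][j] for k in range(n_topics)]
            norm = sum(joint)
            if norm > 0:
                doc_posteriors.append([v / norm for v in joint])
            else:
                # Assumption: fall back to a uniform posterior when no topic
                # can generate this word under the current parameters.
                doc_posteriors.append([1.0 / n_topics] * n_topics)
        gamma.append(doc_posteriors)
    return gamma


gamma = e_step(p_z_d, p_w_z)
```

Note that $p(d_i)$ never appears in the code, which mirrors the cancellation in the formula.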

  4. M-step
    (1) Form the Q function. For a single (document, word) pair, the expected complete-data log-likelihood is:
    $$\sum_{k=1}^{K} \gamma\left(z_{i j k}\right) \log p\left(d_{i}, z_{k}, w_{j}\right)=\sum_{k=1}^{K} \gamma\left(z_{i j k}\right)\left(\log p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)+\log p\left(d_{i}\right)\right)$$
    Because $\log p(d_i)$ does not depend on the parameters $p\left(z_{k} | d_{i}\right)$ and $p\left(w_{j} | z_{k}\right)$ being optimized, it is a constant in this optimization and can be dropped, which simplifies the expression to:
    $$\sum_{k=1}^{K} \gamma\left(z_{i j k}\right)\left(\log p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)\right)$$
    (2) Summing over all samples, with each pair weighted by its count:

$$Q=\sum_{i=1}^{N} \sum_{j=1}^{M} n\left(d_{i}, w_{j}\right) \sum_{k=1}^{K} \gamma\left(z_{i j k}\right)\left(\log p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)\right)$$

(3) Maximize the Q function subject to the constraints $\sum_{k=1}^{K} p\left(z_{k} | d\right)=1$ and $\sum_{w \in V} p\left(w | z_{k}\right)=1$. Each constraint is handled in turn as follows:

1) For $p\left(z_{k} | d_{i}\right)$, apply the method of Lagrange multipliers:
$$Lg=Q\left(\theta, \theta^{old}\right)+\lambda\left(\sum_{k=1}^{K} p\left(z_{k} | d_{i}\right)-1\right)$$
2) Setting the partial derivative with respect to $p\left(z_{k} | d_{i}\right)$ to zero, that is $\frac{\sum_{j=1}^{M} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)}{p\left(z_{k} | d_{i}\right)}+\lambda=0$, gives:
$$-\sum_{j=1}^{M} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)=\lambda p\left(z_{k} | d_{i}\right)$$
3) Summing both sides over $k$ and using $\sum_{k=1}^{K}\gamma\left(z_{i j k}\right)=1$ and $\sum_{k=1}^{K}p\left(z_{k} | d_{i}\right)=1$ gives:

$$\lambda=-\sum_{j=1}^{M} n\left(d_{i}, w_{j}\right)$$
4) Substituting $\lambda$ back into the equation from 2) gives the expression for $p\left(z_{k} | d_{i}\right)$:
$$p\left(z_{k} | d_{i}\right)=\frac{\sum_{j=1}^{M} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)}{\sum_{j=1}^{M} n\left(d_{i}, w_{j}\right)}$$
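The closed-form update for $p(z_k | d_i)$ is a count-weighted average of the posteriors within one document. A minimal sketch, assuming hand-picked toy counts and posteriors (the names and values are illustrative, not from the original post):

```python
# Toy inputs (assumed values, for illustration only).
# counts[i][j] = n(d_i, w_j)
counts = [[2, 1, 0],
          [0, 1, 3]]
# gamma[i][j][k] = gamma(z_ijk); each innermost list sums to 1.
gamma = [[[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]],
         [[1.0, 0.0], [0.25, 0.75], [0.0, 1.0]]]


def update_topic_given_doc(counts, gamma):
    """Return p(z_k | d_i) = sum_j n_ij * gamma_ijk / sum_j n_ij."""
    n_topics = len(gamma[0][0])
    new_p_z_d = []
    for n_row, g_row in zip(counts, gamma):
        doc_length = sum(n_row)  # assumes every document has at least one word
        new_p_z_d.append([
            sum(n * g[k] for n, g in zip(n_row, g_row)) / doc_length
            for k in range(n_topics)
        ])
    return new_p_z_d


p_z_d = update_topic_given_doc(counts, gamma)
```

The denominator is simply the document length, which is why each updated row sums to 1 without an extra normalization pass.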

Similarly, the method of Lagrange multipliers yields the expression for $p\left(w_{j} | z_{k}\right)$. The steps are:
1) The Lagrangian, where the constraint sums over the words $j$ for a fixed topic $z_k$:
$$Lg=Q\left(\theta, \theta^{\text{old}}\right)+\lambda\left(\sum_{j=1}^{M} p\left(w_{j} | z_{k}\right)-1\right)$$
2) Setting the partial derivative with respect to $p\left(w_{j} | z_{k}\right)$ to zero gives:
$$-\sum_{i=1}^{N} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)=\lambda p\left(w_{j} | z_{k}\right)$$
3) Summing both sides over the word index $j$ and using $\sum_{j=1}^{M} p\left(w_{j} | z_{k}\right)=1$ gives:
$$\lambda=-\sum_{i=1}^{N} \sum_{j=1}^{M} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)$$
4) Substituting this back into the equation from 2) gives:
$$p\left(w_{j} | z_{k}\right)=\frac{\sum_{i=1}^{N} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)}{\sum_{i=1}^{N} \sum_{j=1}^{M} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)}$$
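The update for $p(w_j | z_k)$ can be sketched the same way: accumulate the count-weighted posteriors per word, then normalize over the vocabulary. The toy counts and posteriors below are assumed values chosen for illustration.

```python
# Toy inputs (assumed values, for illustration only).
# counts[i][j] = n(d_i, w_j)
counts = [[2, 1, 0],
          [0, 1, 3]]
# gamma[i][j][k] = gamma(z_ijk); each innermost list sums to 1.
gamma = [[[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]],
         [[1.0, 0.0], [0.25, 0.75], [0.0, 1.0]]]


def update_word_given_topic(counts, gamma):
    """Return p(w_j | z_k) = sum_i n_ij * gamma_ijk / sum_i sum_j n_ij * gamma_ijk."""
    n_docs = len(counts)
    n_words = len(counts[0])
    n_topics = len(gamma[0][0])
    new_p_w_z = []
    for k in range(n_topics):
        # Numerator for each word j: sum over documents i.
        weighted = [sum(counts[i][j] * gamma[i][j][k] for i in range(n_docs))
                    for j in range(n_words)]
        total = sum(weighted)  # assumes topic k received some posterior mass
        new_p_w_z.append([v / total for v in weighted])
    return new_p_w_z


p_w_z = update_word_given_topic(counts, gamma)
```

Here the normalizer runs over words for a fixed topic, matching the constraint $\sum_{w \in V} p(w | z_k)=1$.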

  5. In summary, the optimization alternates between the following steps:
    E-step, compute the posterior probabilities:
    $$\gamma\left(z_{i j k}\right)=\frac{p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)}{\sum_{k=1}^{K} p\left(z_{k} | d_{i}\right) p\left(w_{j} | z_{k}\right)}$$
    M-step, update the parameters:
    $$p\left(z_{k} | d_{i}\right)=\frac{\sum_{j=1}^{M} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)}{\sum_{j=1}^{M} n\left(d_{i}, w_{j}\right)}$$

$$p\left(w_{j} | z_{k}\right)=\frac{\sum_{i=1}^{N} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)}{\sum_{i=1}^{N} \sum_{j=1}^{M} n\left(d_{i}, w_{j}\right) \gamma\left(z_{i j k}\right)}$$
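Putting the E-step and M-step together, one possible end-to-end sketch of the iteration is shown below. The toy count matrix, the random initialization, the fixed iteration count, and every name here are assumptions made for illustration. The recorded value is only the parameter-dependent part of the log-likelihood, since the $\log P(d_i)$ term is constant with respect to these parameters.

```python
import math
import random


def normalize(values):
    """Scale a list of non-negative numbers so that it sums to 1."""
    total = sum(values)
    return [v / total for v in values]


def plsa_em(counts, n_topics, n_iters=20, seed=0):
    """Run EM for PLSA on counts[i][j] = n(d_i, w_j).

    Assumes every document and every word has at least one occurrence.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random positive initialization (an assumed choice, not from the post).
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    history = []
    for _ in range(n_iters):
        z_d_acc = [[0.0] * n_topics for _ in range(n_docs)]
        w_z_acc = [[0.0] * n_words for _ in range(n_topics)]
        ll = 0.0
        for i in range(n_docs):
            for j in range(n_words):
                n_ij = counts[i][j]
                if n_ij == 0:
                    continue  # zero counts affect neither Q nor the updates
                # E-step for this (document, word) pair.
                joint = [p_z_d[i][k] * p_w_z[k][j] for k in range(n_topics)]
                norm = sum(joint)
                ll += n_ij * math.log(norm)
                for k in range(n_topics):
                    weighted = n_ij * joint[k] / norm  # n_ij * gamma_ijk
                    z_d_acc[i][k] += weighted
                    w_z_acc[k][j] += weighted
        history.append(ll)  # log-likelihood of the parameters before this update
        # M-step: normalizing the accumulators reproduces both closed forms.
        p_z_d = [normalize(row) for row in z_d_acc]
        p_w_z = [normalize(row) for row in w_z_acc]
    return p_z_d, p_w_z, history


# Toy corpus (assumed): 4 documents, 4 words, two loosely separated groups.
counts = [[4, 3, 0, 0],
          [3, 4, 1, 0],
          [0, 0, 5, 3],
          [0, 1, 3, 4]]
p_z_d, p_w_z, history = plsa_em(counts, n_topics=2)
```

Normalizing `z_d_acc[i]` divides by $\sum_j n(d_i, w_j)$ because the posteriors sum to 1 over $k$, so this is the same update as the formula for $p(z_k | d_i)$. A practical implementation would more likely stop on a convergence threshold for `history` than on a fixed iteration count.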
