# Gibbs sampling formula for each word's topic $z_{i}$ in LDA

$$p(z_{i}=k\mid\vec{z}_{-i},w_{i}=t,\vec{w}_{-i})\propto\frac{n_{k,-i}^{(t)}+\beta_{t}}{\sum_{t=1}^{V}\left[n_{k,-i}^{(t)}+\beta_{t}\right]}\cdot\frac{n_{m,-i}^{(k)}+\alpha_{k}}{\left[\sum_{k=1}^{K}n_{m}^{(k)}+\alpha_{k}\right]-1}$$
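
A minimal sketch of one collapsed Gibbs sweep implementing this conditional (the count arrays `n_kt`, `n_mk`, `n_k` and all names are illustrative assumptions, not a reference implementation):

```python
import numpy as np

def gibbs_sweep(docs, z, n_kt, n_mk, n_k, alpha, beta, rng):
    """One collapsed Gibbs sweep over all tokens.

    docs[m] is a list of word ids; z[m][i] is the current topic of token i.
    n_kt[k, t]: topic-word counts; n_mk[m, k]: doc-topic counts;
    n_k[k] = n_kt[k].sum(). alpha (K,) and beta (V,) are hyperparameter vectors.
    """
    beta_sum = beta.sum()
    K = len(alpha)
    for m, doc in enumerate(docs):
        for i, t in enumerate(doc):
            k_old = z[m][i]
            # remove token i from all counts (the "-i" in the formula)
            n_kt[k_old, t] -= 1; n_mk[m, k_old] -= 1; n_k[k_old] -= 1
            # full conditional: (n_kt + beta_t)/(n_k + sum beta) * (n_mk + alpha_k);
            # the document-length denominator is constant in k, so it cancels
            p = (n_kt[:, t] + beta[t]) / (n_k + beta_sum) * (n_mk[m] + alpha)
            p /= p.sum()
            k_new = rng.choice(K, p=p)
            # add the token back under its newly sampled topic
            n_kt[k_new, t] += 1; n_mk[m, k_new] += 1; n_k[k_new] += 1
            z[m][i] = k_new
    return z
```

Because the token is removed before its conditional is computed and re-added afterwards, every count array keeps the same totals after a sweep, which is a convenient invariant to assert in tests.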

# Metropolis-Hastings (MH) sampling for the hyperparameters

The hyperparameters are given log-normal priors:

$$\alpha_{k}\sim \mathrm{logN}(\mu_{\alpha},\sigma_{\alpha}^{2})$$

$$\beta_{t}\sim \mathrm{logN}(\mu_{\beta},\sigma_{\beta}^{2})$$

$$p(\alpha_{k}\mid\vec{\alpha}_{-k},Z,\vec{\beta},W)\propto\pi(\alpha_{k})\,p(Z\mid\vec{\alpha})\propto\pi(\alpha_{k})\prod_{d=1}^{D}\frac{\Gamma\left(\sum_{l=1}^{K}\alpha_{l}\right)}{\Gamma\left[\sum_{l=1}^{K}\left(\alpha_{l}+n_{d}^{(l)}\right)\right]}\cdot\frac{\Gamma\left(\alpha_{k}+n_{d}^{(k)}\right)}{\Gamma(\alpha_{k})}$$

where $\pi(\alpha_{k})$ is the $\mathrm{logN}(\mu_{\alpha},\sigma_{\alpha}^{2})$ density.

$$p(\beta_{t}\mid\vec{\beta}_{-t},Z,\vec{\alpha},W)\propto\pi(\beta_{t})\,p(W\mid Z,\vec{\beta})\propto\pi(\beta_{t})\prod_{k=1}^{K}\frac{\Gamma\left(\sum_{s=1}^{V}\beta_{s}\right)}{\Gamma\left[\sum_{s=1}^{V}\left(\beta_{s}+n_{k}^{(s)}\right)\right]}\cdot\frac{\Gamma\left(\beta_{t}+n_{k}^{(t)}\right)}{\Gamma(\beta_{t})}$$

where $\pi(\beta_{t})$ is the $\mathrm{logN}(\mu_{\beta},\sigma_{\beta}^{2})$ density.
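
A random-walk MH step on $\log\alpha_{k}$ can be sketched as follows (pure-Python sketch under the lognormal-prior posterior above; `n_dk` and the step size are illustrative assumptions). Expressing the target in $\log\alpha_{k}$ turns the lognormal prior into a normal density and absorbs the change-of-variable Jacobian, so the symmetric proposal needs no correction:

```python
import math
import random

def log_post_alpha(alpha, k, n_dk, mu, sigma):
    """Unnormalized log posterior of log(alpha[k]) given doc-topic counts.

    alpha: list of K values; n_dk[d][l]: count of topic l in document d;
    prior alpha_k ~ logN(mu, sigma^2), written as a normal in log alpha_k.
    """
    a_sum = sum(alpha)
    lp = 0.0
    for n_d in n_dk:
        # per-document Dirichlet-multinomial terms that depend on alpha_k
        lp += math.lgamma(a_sum) - math.lgamma(a_sum + sum(n_d))
        lp += math.lgamma(alpha[k] + n_d[k]) - math.lgamma(alpha[k])
    # normal prior on x = log alpha_k (the Jacobian of alpha -> log alpha
    # is absorbed by working with the density of log alpha_k)
    x = math.log(alpha[k])
    lp += -(x - mu) ** 2 / (2 * sigma ** 2)
    return lp

def mh_step_alpha(alpha, k, n_dk, mu=0.0, sigma=1.0, step=0.3, rng=random):
    """One symmetric random-walk MH step on log(alpha_k), in place."""
    cur = log_post_alpha(alpha, k, n_dk, mu, sigma)
    old = alpha[k]
    alpha[k] = old * math.exp(step * rng.gauss(0.0, 1.0))  # propose in log space
    new = log_post_alpha(alpha, k, n_dk, mu, sigma)
    if math.log(rng.random()) >= new - cur:
        alpha[k] = old  # reject: restore the previous value
    return alpha[k]
```

The update for $\beta_{t}$ is entirely symmetric: replace the document loop with a loop over topics and `n_dk` with topic-word counts.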

