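Variational Inference
Variational inference amounts to minimizing the KL divergence between the approximate distribution and the true posterior: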
$$
\mathrm{KL}\left(q_{\boldsymbol{\theta}}(\boldsymbol{\omega}) \,\|\, p(\boldsymbol{\omega} \mid \mathcal{D})\right)=\int q_{\boldsymbol{\theta}}(\boldsymbol{\omega}) \log \frac{q_{\boldsymbol{\theta}}(\boldsymbol{\omega})}{p(\boldsymbol{\omega} \mid \mathcal{D})} \, d\boldsymbol{\omega}
$$
Here $\boldsymbol{\omega}$ denotes the weight parameters of the network and $\mathcal{D}$ the given training data. Because the posterior $p(\boldsymbol{\omega} \mid \mathcal{D})$ is very difficult to compute, the distribution $q_{\boldsymbol{\theta}}(\boldsymbol{\omega})$ is used to approximate $p(\boldsymbol{\omega} \mid \mathcal{D})$
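
As a rough illustration of the KL integral above, the following sketch estimates $\mathrm{KL}(q \,\|\, p)$ by sampling $\boldsymbol{\omega}$ from $q$ and averaging $\log q(\boldsymbol{\omega}) - \log p(\boldsymbol{\omega})$. It assumes a toy setting that is not from the original text: a single scalar weight, with both $q_{\boldsymbol{\theta}}$ and the "posterior" taken to be one-dimensional Gaussians so that a closed-form value is available for comparison. The function names and the example parameters are hypothetical.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a 1-D Gaussian N(mean, std^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def kl_monte_carlo(q_mean, q_std, p_mean, p_std, n_samples=100_000, seed=0):
    """Estimate KL(q || p) as the sample mean of log q(w) - log p(w), with w ~ q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        w = rng.gauss(q_mean, q_std)  # draw a weight sample from q
        total += log_normal_pdf(w, q_mean, q_std) - log_normal_pdf(w, p_mean, p_std)
    return total / n_samples


def kl_closed_form(q_mean, q_std, p_mean, p_std):
    """Exact KL(q || p) between two 1-D Gaussians, used here only as a check."""
    return (
        math.log(p_std / q_std)
        + (q_std ** 2 + (q_mean - p_mean) ** 2) / (2.0 * p_std ** 2)
        - 0.5
    )


if __name__ == "__main__":
    # Assumed toy values: q = N(0, 1), stand-in posterior p = N(1, 0.5^2).
    print("Monte Carlo estimate:", kl_monte_carlo(0.0, 1.0, 1.0, 0.5))
    print("Closed form:         ", kl_closed_form(0.0, 1.0, 1.0, 0.5))
```

In a real Bayesian neural network the posterior density cannot be evaluated directly, which is exactly why this toy check is only possible with a stand-in Gaussian; the sampling pattern (draw from $q$, average a log-ratio) is the part that carries over.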