VAE Loss Function

Latest recommended article published on 2024-04-09 19:25:08

lyang~

Tags: machine learning, algorithms, probability theory

Article link: https://blog.csdn.net/qq_69515036/article/details/132403010


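An instance $\mathbf{x}$ is generated from the conditional distribution $p\left(\mathbf{x}|\mathbf{z}\right)$, where $\mathbf{z}$ is a latent random variable whose prior distribution $p\left(\mathbf{z}\right)$ can be chosen freely.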

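The VAE objective is to maximize the log-likelihood $\ln p\left(\mathbf{x}\right)$. Since $q\left(\mathbf{z}|\mathbf{x}\right)$ integrates to 1 over $\mathbf{z}$ and $\ln p\left(\mathbf{x}\right)$ does not depend on $\mathbf{z}$, we have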
$\begin{aligned} \ln p\left(\mathbf{x}\right) &=\int_\mathbf{z}q\left(\mathbf{z}|\mathbf{x}\right)\ln p\left(\mathbf{x}\right)d\mathbf{z}\\ &=\int_\mathbf{z}q\left(\mathbf{z}|\mathbf{x}\right)\ln \left(\frac{p\left(\mathbf{x},\mathbf{z}\right)}{q\left(\mathbf{z}|\mathbf{x}\right)}\frac{q\left(\mathbf{z}|\mathbf{x}\right)}{p\left(\mathbf{z}|\mathbf{x}\right)}\right)d\mathbf{z}\\ &=\int_\mathbf{z}q\left(\mathbf{z}|\mathbf{x}\right)\ln\left(\frac{p\left(\mathbf{x},\mathbf{z}\right)}{q\left(\mathbf{z}|\mathbf{x}\right)}\right)d\mathbf{z}+\int_\mathbf{z}q\left(\mathbf{z}|\mathbf{x}\right)\ln\left(\frac{q\left(\mathbf{z}|\mathbf{x}\right)}{p\left(\mathbf{z}|\mathbf{x}\right)}\right)d\mathbf{z}\\ &=LB + KL\left(q\left(\mathbf{z}|\mathbf{x}\right)||p\left(\mathbf{z}|\mathbf{x}\right)\right) \end{aligned}$
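The decomposition above can be checked numerically on a small discrete model. The following is a minimal sketch, not part of the original derivation: the three-valued latent variable and all probability values are made-up assumptions chosen only for illustration, and the integrals become finite sums.

```python
import math

# Hypothetical toy model: z takes 3 values, x is one fixed observation.
p_z = [0.5, 0.3, 0.2]             # prior p(z)
p_x_given_z = [0.10, 0.40, 0.70]  # likelihood p(x|z) at the observed x
q = [0.2, 0.3, 0.5]               # an arbitrary approximate posterior q(z|x)

p_xz = [pz * px for pz, px in zip(p_z, p_x_given_z)]  # joint p(x, z)
p_x = sum(p_xz)                                       # evidence p(x)
posterior = [j / p_x for j in p_xz]                   # true posterior p(z|x)

# LB = sum_z q(z|x) ln( p(x, z) / q(z|x) )
lb = sum(qi * math.log(j / qi) for qi, j in zip(q, p_xz))
# KL( q(z|x) || p(z|x) )
kl = sum(qi * math.log(qi / pi) for qi, pi in zip(q, posterior))

log_px = math.log(p_x)
print(log_px, lb + kl)  # the two values agree up to floating-point error
```

Changing `q` to any other valid distribution leaves `lb + kl` unchanged; only the split between the two terms moves.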
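Because the KL divergence is non-negative, $LB$ is a lower bound on $\ln p\left(\mathbf{x}\right)$. The KL term involves the true posterior $p\left(\mathbf{z}|\mathbf{x}\right)$, which we cannot compute, so instead of maximizing $\ln p\left(\mathbf{x}\right)$ directly we maximize $LB$. Using $p\left(\mathbf{x},\mathbf{z}\right)=p\left(\mathbf{x}|\mathbf{z}\right)p\left(\mathbf{z}\right)$, $LB$ expands to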
$\begin{aligned} LB &=\int_\mathbf{z}q\left(\mathbf{z}|\mathbf{x}\right)\ln\left(\frac{p\left(\mathbf{x}|\mathbf{z}\right)p\left(\mathbf{z}\right)}{q\left(\mathbf{z}|\mathbf{x}\right)}\right)d\mathbf{z}\\ &=\int_\mathbf{z}q\left(\mathbf{z}|\mathbf{x}\right)\ln\left(\frac{p\left(\mathbf{z}\right)}{q\left(\mathbf{z}|\mathbf{x}\right)}\right)d\mathbf{z}+\int_\mathbf{z}q\left(\mathbf{z}|\mathbf{x}\right)\ln p\left(\mathbf{x}|\mathbf{z}\right)d\mathbf{z}\\ &=-KL\left(q\left(\mathbf{z}|\mathbf{x}\right)||p\left(\mathbf{z}\right)\right)+\mathbb{E}_{q\left(\mathbf{z}|\mathbf{x}\right)}\ln p\left(\mathbf{x}|\mathbf{z}\right) \end{aligned}$
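As a quick sanity check, the two forms of $LB$ (the joint form and the "negative KL plus expected log-likelihood" form) can be compared on a toy discrete model. This is only an illustrative sketch: the probability values and the helper `kl_divergence` are assumptions introduced here, not something from the original post.

```python
import math

def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions given as lists."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Hypothetical toy model: z takes 3 values, x is one fixed observation.
p_z = [0.5, 0.3, 0.2]             # prior p(z)
p_x_given_z = [0.10, 0.40, 0.70]  # likelihood p(x|z) at the observed x
q = [0.2, 0.3, 0.5]               # approximate posterior q(z|x)

# Joint form: sum_z q(z|x) ln( p(x|z) p(z) / q(z|x) )
lb_joint = sum(
    qi * math.log(px * pz / qi)
    for qi, px, pz in zip(q, p_x_given_z, p_z)
)

# Split form: -KL(q(z|x) || p(z)) + E_q[ ln p(x|z) ]
expected_log_lik = sum(qi * math.log(px) for qi, px in zip(q, p_x_given_z))
lb_split = -kl_divergence(q, p_z) + expected_log_lik

print(lb_joint, lb_split)  # equal up to floating-point error
```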
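Maximizing $LB$, commonly called the evidence lower bound (ELBO), is our goal. Optimization problems are conventionally written as minimizations, however, so we negate $LB$ to obtain the objective to minimize. This loss function is the negative ELBO:
$Loss=-LB=KL\left(q\left(\mathbf{z}|\mathbf{x}\right)||p\left(\mathbf{z}\right)\right)-\mathbb{E}_{q\left(\mathbf{z}|\mathbf{x}\right)}\ln p\left(\mathbf{x}|\mathbf{z}\right)$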
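Note that $p\left(\mathbf{z}\right)$ here is the prior and can be any distribution we choose. The forms of $q\left(\mathbf{z}|\mathbf{x}\right)$ and $p\left(\mathbf{x}|\mathbf{z}\right)$ are likewise specified by us. $q\left(\mathbf{z}|\mathbf{x}\right)$ serves as an approximation to the true posterior $p\left(\mathbf{z}|\mathbf{x}\right)$: since $p\left(\mathbf{z}|\mathbf{x}\right)$ cannot be computed, we approximate it directly.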
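To make the loss concrete, here is a minimal sketch under one common set of choices that the text above does not prescribe: a standard normal prior $p\left(\mathbf{z}\right)$, a diagonal Gaussian $q\left(\mathbf{z}|\mathbf{x}\right)$ described by `mu` and `log_var`, and a unit-variance Gaussian $p\left(\mathbf{x}|\mathbf{z}\right)$. With these assumptions the KL term has a closed form, and the expectation is estimated by Monte Carlo sampling using the reparameterization $\mathbf{z}=\boldsymbol{\mu}+\boldsymbol{\sigma}\epsilon$. The function names and `toy_decode` are hypothetical stand-ins for a real encoder and decoder.

```python
import math
import random

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def gaussian_log_likelihood(x, x_mean):
    """ln p(x|z) for a unit-variance Gaussian decoder with mean x_mean."""
    sq_err = sum((a - b) ** 2 for a, b in zip(x, x_mean))
    return -0.5 * sq_err - 0.5 * len(x) * math.log(2.0 * math.pi)

def vae_loss(x, mu, log_var, decode, n_samples=1):
    """Monte Carlo estimate of KL(q(z|x) || p(z)) - E_q[ln p(x|z)]."""
    recon = 0.0
    for _ in range(n_samples):
        # Reparameterized sample: z = mu + sigma * eps, eps ~ N(0, 1)
        z = [m + math.exp(0.5 * lv) * random.gauss(0.0, 1.0)
             for m, lv in zip(mu, log_var)]
        recon += gaussian_log_likelihood(x, decode(z))
    recon /= n_samples
    return kl_to_standard_normal(mu, log_var) - recon

def toy_decode(z):
    """Hypothetical decoder: maps a 2-d latent to a 3-d mean."""
    return [z[0], z[1], z[0] + z[1]]

random.seed(0)
loss = vae_loss(x=[0.5, -0.2, 0.3], mu=[0.4, -0.1],
                log_var=[-1.0, -1.0], decode=toy_decode, n_samples=100)
print(loss)
```

In a real VAE, `mu` and `log_var` would be produced by an encoder network from `x`, `decode` would be a decoder network, and the loss would be minimized with respect to both sets of parameters.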