[Diffusion Models] Guided Diffusion Methods: Classifier-free Guidance

福尔马林灌汤包

Published 2024-08-15 11:27:30

Classifier-free Guidance

We now turn to another guided-diffusion method, Classifier-free Guidance, which is also widely used in later diffusion models.

Paper: Classifier-Free Diffusion Guidance

https://arxiv.org/pdf/2207.12598.pdf

Reference video

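Between classifier guidance and classifier-free guidance, diffusion models saw further research. The notation changed along the way, and so did the way the sampling time is parameterized.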

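The paper uses different notation from the papers discussed earlier, and it replaces the sampling timestep with the log signal-to-noise ratio $\lambda=\log\alpha_{\lambda}^{2}/\sigma_{\lambda}^{2}$. Its main contribution is to combine the conditional and unconditional diffusion models, so that no separate classifier has to be trained.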
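As a rough illustration of this parameterization, the sketch below recovers $\alpha_\lambda$ and $\sigma_\lambda$ from a log-SNR value and maps them back to $\lambda$. It assumes the variance-preserving convention $\alpha_\lambda^2+\sigma_\lambda^2=1$, so that $\alpha_\lambda^2=1/(1+e^{-\lambda})$. The function names `alpha_sigma_from_logsnr` and `logsnr` are made up for this example.

```python
import math

def alpha_sigma_from_logsnr(lam: float) -> tuple[float, float]:
    """Return (alpha, sigma) for log-SNR `lam`.

    Assumption: variance-preserving process, alpha^2 + sigma^2 = 1.
    """
    alpha_sq = 1.0 / (1.0 + math.exp(-lam))  # sigmoid(lam)
    sigma_sq = 1.0 - alpha_sq
    return math.sqrt(alpha_sq), math.sqrt(sigma_sq)

def logsnr(alpha: float, sigma: float) -> float:
    """lambda = log(alpha^2 / sigma^2)."""
    return math.log(alpha**2 / sigma**2)

if __name__ == "__main__":
    # Large negative lambda: mostly noise. Large positive lambda: mostly signal.
    for lam in (-5.0, 0.0, 5.0):
        a, s = alpha_sigma_from_logsnr(lam)
        print(f"lambda={lam:+.1f}  alpha={a:.4f}  sigma={s:.4f}")
```

At $\lambda=0$ signal and noise have equal weight, so both values come out as $\sqrt{0.5}$.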

Derivation of the formula

Start from the formula obtained in the Classifier Guidance paper (for a walkthrough of that paper, see the companion post "[Diffusion Models] Guided Diffusion Methods: Classifier Guidance"):

$\hat{\epsilon}(x_t):=\epsilon_\theta(x_t)-\sqrt{1-\bar{\alpha}_t} \nabla_{x_t}\log p_\phi(y|x_t)$

and rewrite it in the new notation:

$\tilde{\epsilon}_\theta(\mathbf{z}_\lambda,\mathbf{c})=\epsilon_\theta(\mathbf{z}_\lambda,\mathbf{c})-w\sigma_\lambda\nabla_{\mathbf{z}_\lambda}\log p_\theta(\mathbf{c}|\mathbf{z}_\lambda)$
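To make the role of each term concrete, here is a minimal sketch of the classifier-guided prediction $\epsilon_\theta(\mathbf{z}_\lambda,\mathbf{c})-w\sigma_\lambda\nabla_{\mathbf{z}_\lambda}\log p_\theta(\mathbf{c}|\mathbf{z}_\lambda)$ on plain Python lists. The function name `classifier_guided_eps` and the numbers are hypothetical. In a real system the noise prediction would come from the diffusion network and the gradient from a separately trained classifier.

```python
def classifier_guided_eps(eps_cond, grad_log_p_c, sigma, w):
    """Shift the noise prediction against the classifier gradient.

    eps_cond     : noise prediction eps_theta(z, c), one float per dimension
    grad_log_p_c : gradient of log p(c | z) with respect to z (same length)
    sigma        : noise level sigma_lambda
    w            : guidance weight
    """
    return [e - w * sigma * g for e, g in zip(eps_cond, grad_log_p_c)]

if __name__ == "__main__":
    # Made-up values, for illustration only.
    eps = [0.5, -1.0]
    grad = [0.2, 0.4]
    print(classifier_guided_eps(eps, grad, sigma=0.5, w=2.0))
```

With `w = 0` the gradient term vanishes and the prediction is returned unchanged.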

(equivalently, this amounts to sampling from the distribution)

$\tilde{p}_\theta(\mathbf{z}_\lambda|\mathbf{c})\propto p_\theta(\mathbf{z}_\lambda|\mathbf{c})p_\theta(\mathbf{c}|\mathbf{z}_\lambda)^w$

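Now change the weight from $w$ to $w+1$ and apply the guidance to the unconditional model. Using the score relation $\boldsymbol{\epsilon}_{\theta}(\mathbf{z}_{\lambda})\approx-\sigma_{\lambda}\nabla_{\mathbf{z}_{\lambda}}\log p(\mathbf{z}_{\lambda})$ and Bayes' rule, $\log p(\mathbf{z}_{\lambda})+\log p_{\theta}(\mathbf{c}|\mathbf{z}_{\lambda})=\log p(\mathbf{z}_{\lambda}|\mathbf{c})+\log p(\mathbf{c})$ (the last term has zero gradient with respect to $\mathbf{z}_{\lambda}$), we get: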

$\begin{aligned} \boldsymbol{\epsilon}_{\theta}(\mathbf{z}_{\lambda})-(w+1)\sigma_{\lambda}\nabla_{\mathbf{z}_{\lambda}}\log p_{\theta}(\mathbf{c}|\mathbf{z}_{\lambda})& \approx-\sigma_{\lambda}\nabla_{\mathbf{z}_{\lambda}}[\log p(\mathbf{z}_{\lambda})+(w+1)\log p_{\theta}(\mathbf{c}|\mathbf{z}_{\lambda})] \\ &=-\sigma_{\lambda}\nabla_{\mathbf{z}_{\lambda}}[\log p(\mathbf{z}_{\lambda}|\mathbf{c})+w\log p_{\theta}(\mathbf{c}|\mathbf{z}_{\lambda})], \end{aligned}$

In terms of distributions, this shows that

$p_\theta(\mathbf{z}_\lambda|\mathbf{c})p_\theta(\mathbf{c}|\mathbf{z}_\lambda)^w\propto p_\theta(\mathbf{z}_\lambda)p_\theta(\mathbf{c}|\mathbf{z}_\lambda)^{w+1}$
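This proportionality can be checked numerically on a small discrete toy problem. The sketch below builds a made-up joint table over four values of $\mathbf{z}$ and two classes, then compares the two sides after normalizing each over $\mathbf{z}$. All numbers and names are assumptions chosen for illustration.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [x / total for x in weights]

# Toy joint p(z, c): rows are z-states, columns are classes (made-up numbers).
joint = [
    [0.10, 0.05],
    [0.20, 0.05],
    [0.05, 0.25],
    [0.10, 0.20],
]
c = 1      # class we condition on
w = 3.0    # guidance weight

p_z = [sum(row) for row in joint]                            # p(z)
p_c = sum(row[c] for row in joint)                           # p(c)
p_c_given_z = [row[c] / pz for row, pz in zip(joint, p_z)]   # p(c | z)
p_z_given_c = [row[c] / p_c for row in joint]                # p(z | c)

# Left side:  p(z | c) * p(c | z)^w
lhs = normalize([a * b**w for a, b in zip(p_z_given_c, p_c_given_z)])
# Right side: p(z) * p(c | z)^(w + 1)
rhs = normalize([a * b**(w + 1) for a, b in zip(p_z, p_c_given_z)])

if __name__ == "__main__":
    print(lhs)
    print(rhs)
```

The two normalized lists agree up to floating-point error, because the two sides differ only by the constant factor $p(\mathbf{c})$.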

Therefore, for the conditional model we obtain:

$\tilde{\boldsymbol{\epsilon}}_\theta(\mathbf{z}_\lambda,\mathbf{c})=\boldsymbol{\epsilon}_\theta(\mathbf{z}_\lambda,\mathbf{c})-w\sigma_\lambda\nabla_{\mathbf{z}_\lambda}\log p_\theta(\mathbf{c}|\mathbf{z}_\lambda)\approx-\sigma_\lambda\nabla_{\mathbf{z}_\lambda}[\log p(\mathbf{z}_\lambda|\mathbf{c})+w\log p_\theta(\mathbf{c}|\mathbf{z}_\lambda)]$

=====================================================
Next, unconditional and conditional sampling are combined in a single model, which finally gives:

$\tilde{\boldsymbol{\epsilon}}_\theta(\mathbf{z}_\lambda,\mathbf{c})=(1+w)\boldsymbol{\epsilon}_\theta(\mathbf{z}_\lambda,\mathbf{c})-w\boldsymbol{\epsilon}_\theta(\mathbf{z}_\lambda)$
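In code, this combination is a single weighted difference of two noise predictions. The sketch below uses plain Python lists to stay self-contained. The name `cfg_combine` is hypothetical, and in practice the two inputs would be the outputs of the same network evaluated with and without the condition.

```python
def cfg_combine(eps_cond, eps_uncond, w):
    """Classifier-free guidance: (1 + w) * eps(z, c) - w * eps(z).

    eps_cond   : conditional noise prediction, one float per dimension
    eps_uncond : unconditional noise prediction (same length)
    w          : guidance weight; w = 0 recovers the conditional prediction
    """
    return [(1.0 + w) * ec - w * eu for ec, eu in zip(eps_cond, eps_uncond)]

if __name__ == "__main__":
    # Made-up predictions, for illustration only.
    eps_c = [0.8, -0.2, 0.1]
    eps_u = [0.5, 0.0, 0.1]
    for w in (0.0, 1.0, 3.0):
        print(w, cfg_combine(eps_c, eps_u, w))
```

Wherever the two predictions agree, the output equals that shared value for any `w`. Where they differ, a larger `w` pushes the result further along the direction from the unconditional prediction to the conditional one.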

Derivation:

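By Bayes' rule, dividing $p(\mathbf{z}_{\lambda}|\mathbf{c})$ by $p(\mathbf{z}_{\lambda})$ defines an implicit classifier [the superscript $i$ in $p^{i}(\mathbf{c}|\mathbf{z}_{\lambda})$ stands for "implicit"; it is a label, not an exponent]:
$p^{i}(\mathbf{c}|\mathbf{z}_{\lambda})\propto p(\mathbf{z}_{\lambda}|\mathbf{c})/p(\mathbf{z}_{\lambda})$

Take the logarithm of both sides and differentiate with respect to $\mathbf{z}_{\lambda}$ (the score). With the exact scores $\epsilon^{*}(\mathbf{z}_{\lambda},\mathbf{c})=-\sigma_{\lambda}\nabla_{\mathbf{z}_{\lambda}}\log p(\mathbf{z}_{\lambda}|\mathbf{c})$ and $\epsilon^{*}(\mathbf{z}_{\lambda})=-\sigma_{\lambda}\nabla_{\mathbf{z}_{\lambda}}\log p(\mathbf{z}_{\lambda})$, this gives:

$\nabla_{\mathbf{z}_{\lambda}}\log p^{i}(\mathbf{c}|\mathbf{z}_{\lambda})=-\frac{1}{\sigma_{\lambda}}[\epsilon^{*}(\mathbf{z}_{\lambda},\mathbf{c})-\epsilon^{*}(\mathbf{z}_{\lambda})]$

Substitute this gradient into the classifier-guidance formula $\tilde{\boldsymbol{\epsilon}}_\theta(\mathbf{z}_\lambda,\mathbf{c})=\boldsymbol{\epsilon}_\theta(\mathbf{z}_\lambda,\mathbf{c})-w\sigma_\lambda\nabla_{\mathbf{z}_\lambda}\log p^{i}(\mathbf{c}|\mathbf{z}_\lambda)$ and replace the exact scores $\epsilon^{*}$ with the model estimates $\boldsymbol{\epsilon}_\theta$. The guidance term becomes $w[\boldsymbol{\epsilon}_\theta(\mathbf{z}_\lambda,\mathbf{c})-\boldsymbol{\epsilon}_\theta(\mathbf{z}_\lambda)]$, and collecting the $\boldsymbol{\epsilon}_\theta(\mathbf{z}_\lambda,\mathbf{c})$ terms gives: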

$\tilde{\boldsymbol{\epsilon}}_\theta(\mathbf{z}_\lambda,\mathbf{c})=(1+w)\boldsymbol{\epsilon}_\theta(\mathbf{z}_\lambda,\mathbf{c})-w\boldsymbol{\epsilon}_\theta(\mathbf{z}_\lambda)$
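As a sanity check, the sketch below compares the two routes to the guided prediction on a one-dimensional toy problem: guiding with the gradient of the implicit classifier $p(\mathbf{z}_\lambda|\mathbf{c})/p(\mathbf{z}_\lambda)$ versus applying the final formula directly. It assumes Gaussian stand-ins for $p(\mathbf{z}_\lambda|\mathbf{c})$ and $p(\mathbf{z}_\lambda)$ with arbitrary parameters (the toy $p(\mathbf{z}_\lambda)$ is not the true marginal of anything), and it takes the classifier gradient by finite differences so that the two routes are computed independently. All names and values are made up for illustration.

```python
import math

def log_gauss(z, mu, std):
    """Log density of a 1-D Gaussian N(mu, std^2)."""
    return -0.5 * ((z - mu) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def gauss_score(z, mu, std):
    """Analytic derivative of log_gauss with respect to z."""
    return -(z - mu) / std**2

def finite_diff(f, z, h=1e-5):
    """Central finite-difference derivative of f at z."""
    return (f(z + h) - f(z - h)) / (2 * h)

sigma = 0.7               # noise level sigma_lambda (arbitrary)
w = 2.0                   # guidance weight
z = 0.3                   # point at which to evaluate
mu_c, std_c = 1.0, 0.8    # toy p(z | c)
mu_u, std_u = 0.0, 1.5    # toy p(z)

# Exact noise predictions: eps* = -sigma * score.
eps_cond = -sigma * gauss_score(z, mu_c, std_c)
eps_uncond = -sigma * gauss_score(z, mu_u, std_u)

def log_implicit_classifier(zz):
    """log p^i(c | z) up to an additive constant: log p(z | c) - log p(z)."""
    return log_gauss(zz, mu_c, std_c) - log_gauss(zz, mu_u, std_u)

grad_implicit = finite_diff(log_implicit_classifier, z)

# Route 1: classifier guidance using the implicit classifier's gradient.
via_classifier = eps_cond - w * sigma * grad_implicit
# Route 2: the classifier-free combination.
via_cfg = (1 + w) * eps_cond - w * eps_uncond

if __name__ == "__main__":
    print(via_classifier, via_cfg)
```

The two values match up to finite-difference error, which is the content of the last step of the derivation.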