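1. Background
1) Problems with Classifier Guidance
a) It requires training an extra classifier. The classifier must be trained on noisy images, so an off-the-shelf pretrained classifier cannot be used, and this makes the diffusion model training pipeline more complicated.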
b) It is unclear whether classifier guidance improves classifier-based metrics such as FID and Inception Score (IS) simply because it is adversarial against such classifiers. Classifier guidance mixes a score estimate with a classifier gradient during sampling, so classifier-guided diffusion sampling can be interpreted as attempting to confuse an image classifier with a gradient-based adversarial attack.
c) It is also unclear whether classifier-guided diffusion models perform well on classifier-based metrics because they are beginning to resemble GANs, which are already known to perform well on such metrics. Stepping in the direction of classifier gradients bears some resemblance to GAN training, particularly with nonparametric generators.
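2) In addition, GANs and flow-based models can perform truncated or low-temperature sampling by reducing the variance or the range of the noise input at sampling time, which trades variety against fidelity. In the reverse process of a diffusion model, however, scaling the model score or reducing the variance of the Gaussian noise produces blurry, low-quality images.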
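2. Method
1) During training, randomly set the condition c to the null token ∅ with probability p_uncond, so that a single network is trained for both conditional and unconditional generation.
2) At sampling time, take a weighted combination of the conditional and unconditional predictions, controlled by a guidance weight w, to obtain the final prediction.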
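The two steps above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes the combination takes the form (1 + w) * eps_cond - w * eps_uncond as given in the classifier-free guidance paper, and the names `NULL_LABEL`, `drop_condition`, and `guided_eps` are hypothetical helpers introduced only for this example.

```python
import random

# Hypothetical stand-in for the null token ∅; a real model would typically
# reserve a dedicated class index or embedding for it.
NULL_LABEL = None


def drop_condition(c, p_uncond, rng=random):
    """Training-time step: return NULL_LABEL with probability p_uncond,
    otherwise return the condition c unchanged."""
    return NULL_LABEL if rng.random() < p_uncond else c


def guided_eps(eps_cond, eps_uncond, w):
    """Sampling-time step: combine the two noise predictions elementwise as
    (1 + w) * eps_cond - w * eps_uncond. With w = 0 this reduces to the
    plain conditional prediction."""
    return [(1.0 + w) * ec - w * eu for ec, eu in zip(eps_cond, eps_uncond)]


if __name__ == "__main__":
    labels = [drop_condition(7, p_uncond=0.2) for _ in range(10)]
    print(labels)  # mostly 7, occasionally None
    print(guided_eps([1.0, 2.0], [0.5, 1.0], w=1.0))
```

Larger values of w push the result further away from the unconditional prediction, which is the knob that the results section sweeps.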
3. Results
Apart from training in continuous time, the model architecture and hyperparameters are the same as in Classifier-Guided Diffusion.
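1) Guidance weight w: FID is best at w = 0.1 or 0.3, and IS is best at w >= 4. Stronger guidance raises per-sample fidelity (which IS rewards) at the cost of variety (which FID also penalizes).
2) Unconditional training probability p_uncond: the overall IS/FID trade-off is best at p_uncond = 0.1 or 0.2. This suggests that only a small share of the model's capacity needs to be spent on the unconditional generation task.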
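3) Sample quality improves as the number of sampling steps T increases. T = 256 gives a good balance between sample quality and sampling speed. Note that each sampling step runs inference on the denoising model twice, once for the conditional prediction and once for the unconditional one.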
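The cost noted in the last item can be made concrete with a toy loop that counts model evaluations. This is only a sketch under stated assumptions: `eps_model` is a hypothetical stand-in for the denoising network, and the update rule is a placeholder rather than a real diffusion sampler. Only the call pattern (two evaluations per step) reflects the note above.

```python
def count_model_calls(T, w=0.3):
    """Run T toy sampling steps and return how many times the model was evaluated."""
    calls = 0

    def eps_model(z, c):
        # Hypothetical stand-in for the denoising network; c=None plays the
        # role of the null token (unconditional prediction).
        nonlocal calls
        calls += 1
        shift = 0.0 if c is None else 0.01 * c
        return [0.1 * v + shift for v in z]

    z = [1.0, -1.0]
    for _ in range(T):
        eps_cond = eps_model(z, 3)       # conditional pass
        eps_uncond = eps_model(z, None)  # unconditional pass
        eps = [(1.0 + w) * a - w * b for a, b in zip(eps_cond, eps_uncond)]
        # Placeholder update, NOT a real reverse-diffusion step.
        z = [v - e for v, e in zip(z, eps)]
    return calls


if __name__ == "__main__":
    print(count_model_calls(256))  # 2 evaluations per step
```

In practice the two passes are often stacked into a single batched forward call, which changes the wall-clock cost but not the amount of compute per step.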