Diffusion Models With Efficient Sampling

Latest recommended article published on 2024-07-09 18:21:24

李木木乃伊

Categories: AIGC, Diffusion. Tags: artificial intelligence, deep learning, machine learning

Article link: https://blog.csdn.net/u014266895/article/details/129672075

Introduction

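Generating samples usually takes many steps, so a lot of work focuses on accelerating the sampling process. For example, the sampling methods offered by Stable Diffusion include accelerated variants such as DPM:
[Figure: sampling methods offered by Stable Diffusion]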

Methodologically, these approaches fall into two types: learning-free sampling and learning-based sampling.

Learning-free sampling methods

Most learning-free methods build on the SDE/ODE formulation derived by Dr. Yang Song in [Score-based generative modeling through stochastic differential equations]; his blog is also worth reading.
Under this formulation, sampling amounts to numerically solving a differential equation, so new numerical solvers can be designed to optimize the step size and the number of model iterations.
For example:

DPM-Solver and DPM-Solver++, which converge in 15 steps
step size: Gotta Go Fast When Generating Data with Score-Based Models
the order of the solver: Elucidating the Design Space of Diffusion-Based Generative Models
the initialization point: Come-Closer-Diffuse-Faster: Accelerating Conditional Diffusion Models for Inverse Problems through Stochastic Contraction
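
To make "the order of the solver" concrete, here is a minimal sketch on a toy problem, not taken from any of the papers above. It assumes 1-D data distributed as N(0, S^2), for which the score is known in closed form, and integrates the probability-flow ODE in the noise-level parameterisation with a first-order Euler step and a second-order Heun step. The names `dx_dsigma`, `euler`, `heun` and the constant `S` are illustrative assumptions.

```python
import numpy as np

S = 1.0  # assumed std of the toy data distribution N(0, S^2)

def dx_dsigma(x, sigma):
    # Probability-flow ODE written in terms of the noise level sigma:
    # dx/dsigma = -sigma * score(x; sigma).
    # For N(0, S^2) data plus noise of std sigma, score = -x / (S^2 + sigma^2).
    return sigma * x / (S**2 + sigma**2)

def euler(x, sigmas):
    # First-order solver: one derivative evaluation per step.
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        x = x + (s_next - s_cur) * dx_dsigma(x, s_cur)
    return x

def heun(x, sigmas):
    # Second-order solver: Euler predictor, then trapezoidal correction
    # (two derivative evaluations per step).
    for s_cur, s_next in zip(sigmas[:-1], sigmas[1:]):
        h = s_next - s_cur
        d_cur = dx_dsigma(x, s_cur)
        d_next = dx_dsigma(x + h * d_cur, s_next)
        x = x + 0.5 * h * (d_cur + d_next)
    return x

def exact(x, sigma_start, sigma_end):
    # Closed-form solution of the toy ODE, used only as a reference.
    return x * np.sqrt((S**2 + sigma_end**2) / (S**2 + sigma_start**2))

sigmas = np.linspace(10.0, 0.0, 21)  # 20 uniform steps from sigma=10 down to 0
x_start = 5.0
truth = exact(x_start, 10.0, 0.0)
euler_err = abs(euler(x_start, sigmas) - truth)
heun_err = abs(heun(x_start, sigmas) - truth)
print(f"Euler error: {euler_err:.4f}, Heun error: {heun_err:.4f}")
```

On this toy problem the Heun solver lands closer to the exact solution for the same step grid, at the price of twice as many derivative evaluations per step. With a real model each derivative evaluation is a network forward pass, which is why step count and solver order are tuned together.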

Another example is the acceleration method described in DDIM:
[Figure: accelerated sampling in DDIM]
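
The DDIM acceleration can be sketched roughly as follows: instead of visiting all T timesteps, the deterministic update is applied only on a short sub-sequence of them. This is a toy sketch under stated assumptions, not the reference implementation. The linear beta schedule, the constant `MU`, and `eps_model` (an exact noise predictor for a data distribution where every sample equals `MU`, standing in for a trained network) are all hypothetical.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # assumed linear noise schedule
alpha_bar = np.cumprod(1.0 - betas)     # cumulative product of (1 - beta)

MU = 3.0  # toy data distribution: every data point equals MU

def eps_model(x, t):
    # Hypothetical stand-in for a trained noise-prediction network.
    # When all data equal MU, the optimal noise prediction is exact.
    return (x - np.sqrt(alpha_bar[t]) * MU) / np.sqrt(1.0 - alpha_bar[t])

def ddim_sample(x, num_steps):
    # Evenly spaced, descending sub-sequence of the T training timesteps.
    taus = np.linspace(T - 1, 0, num_steps).round().astype(int)
    for i, t in enumerate(taus):
        a_t = alpha_bar[t]
        # After the last timestep we jump to the clean sample (alpha_bar = 1).
        a_prev = alpha_bar[taus[i + 1]] if i + 1 < len(taus) else 1.0
        eps = eps_model(x, t)
        x0_pred = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)
        # Deterministic DDIM update (eta = 0).
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps
    return x

rng = np.random.default_rng(0)
x_T = rng.standard_normal(4)
x_10 = ddim_sample(x_T, num_steps=10)   # 10 model calls instead of 1000
print(x_10)
```

Because the toy noise predictor is exact, even 10 steps recover `MU`; with a learned network, the number of steps trades speed against sample quality.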

Below are two papers that do not start from numerical solvers and may help broaden the perspective; one is learning-free and the other is learning-based:

ReDi: Efficient Learning-Free Diffusion Inference via Trajectory Retrieval

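Given a reverse iteration sequence, many methods enlarge the step size; this paper instead aims to skip steps. It relies on an observation made in Sdedit: Guided image synthesis and editing with stochastic differential equations and Uncovering the disentanglement capability in text-to-image diffusion models: the first few steps determine the layout of the image, and the following steps determine the details:
[Figure]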
Algorithm framework:

Construction of the knowledge base it depends on, and the overall inference procedure:
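
As a rough sketch of the retrieval idea, assuming the knowledge base stores, for each precomputed trajectory, an early state as the key and a later state as the value: at inference time the first few steps are run normally, the nearest key is looked up, and sampling resumes from the stored value, which skips the steps in between. The function names, the L2 nearest-neighbour search, and the random-walk "trajectories" below are illustrative assumptions, not the paper's code.

```python
import numpy as np

def build_knowledge_base(trajectories, key_step, value_step):
    # trajectories: array of shape (num_trajectories, num_steps, dim),
    # where index 0 along axis 1 is the start of the reverse process.
    keys = trajectories[:, key_step]      # early states, used for lookup
    values = trajectories[:, value_step]  # later states, used to resume sampling
    return keys, values

def retrieve(query, keys, values):
    # Nearest neighbour in L2 distance between the query and all keys.
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = int(np.argmin(dists))
    return values[nearest], nearest

# Toy stand-in for precomputed sampling trajectories: 5 random walks in 3-D.
rng = np.random.default_rng(0)
trajectories = np.cumsum(rng.standard_normal((5, 20, 3)), axis=1)
keys, values = build_knowledge_base(trajectories, key_step=2, value_step=12)

# Pretend a new sample has been denoised up to step 2 and looks like trajectory 3.
query = trajectories[3, 2] + 0.01
x_skip, idx = retrieve(query, keys, values)
print(idx, x_skip)  # sampling would now continue from step 12 with x_skip
```

In this toy setup the steps between `key_step` and `value_step` are never computed for the new sample, which is where the saving comes from.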

Result 1: quality

Result 2: Trajectories are better retrieval keys than text-image representations
The baseline here amounts to replacing the keys of the knowledge base with CLIP embeddings.

Result 3: REDI can perform zero-shot domain adaptation without a domain-specific knowledge base
[Figure]
Ablation study:

Flow Straight and Fast

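The sampling process is viewed as transporting distribution A to distribution B, and the goal is to find the shortest path between them, which is what the paper calls going in a straight line. Generative modeling and transfer modeling differ in that, in the former, A is a simple base distribution, whereas in the latter both are empirically observed unknown distributions. I will write a separate article later on how the general optimal transport problem is solved.
The method in this paper is called Rectified flow; it learns the transport map T implicitly through an ODE.
[Figure]
Going in a straight line is what makes one-step generation possible, so the goal is to find a flow that follows the linear interpolation from A to B: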
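
In symbols, the objective can be written as below. This follows the formulation in the Rectified flow paper as I recall it; the notation X_0 ~ A, X_1 ~ B is an assumption chosen to match the A/B naming used here.

```latex
\begin{aligned}
X_t &= t\,X_1 + (1 - t)\,X_0, \qquad X_0 \sim A,\; X_1 \sim B,\; t \in [0, 1] \\
\min_{v}\; & \int_0^1 \mathbb{E}\left[\, \big\| (X_1 - X_0) - v(X_t, t) \big\|^2 \,\right] dt \\
\frac{dZ_t}{dt} &= v(Z_t, t), \qquad Z_0 \sim A
\end{aligned}
```

The regression target X_1 - X_0 is the direction of the straight line joining the pair, and sampling means integrating the learned ODE from Z_0 to Z_1.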

Here is a simple example:
two distributions:

rectified_flow_1, N=1:

rectified_flow_1, N=100:

rectified_flow_2, N=1:
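
The same three settings can be reproduced on an even simpler toy case than the one pictured, as a sketch under stated assumptions: source N(0, 1), target N(M, S^2) in 1-D, paired independently. For this case the optimal velocity fields have closed forms (derived by hand for this sketch, not taken from the paper), so no training is needed. Here N is read as the number of Euler steps, and `v1`, `v2`, `euler_sample` are illustrative names.

```python
import numpy as np

M, S = 2.0, 0.5  # assumed target distribution N(M, S^2); the source is N(0, 1)

def v1(x, t):
    # 1-rectified flow for the independent pairing:
    # v(x, t) = E[X1 - X0 | X_t = x], which is linear in x for Gaussians.
    var_t = (t * S) ** 2 + (1.0 - t) ** 2
    return M + (t * S**2 - (1.0 - t)) / var_t * (x - t * M)

def v2(x, t):
    # 2-rectified flow: fitted to the pairs (Z0, Z1 = M + S * Z0) that the
    # exact ODE of v1 produces. Each path is a straight line, so the velocity
    # is constant along it.
    z0 = (x - t * M) / (1.0 + t * (S - 1.0))
    return M + (S - 1.0) * z0

def euler_sample(v, z0, n_steps):
    # Integrate dz/dt = v(z, t) from t=0 to t=1 with n_steps Euler steps.
    z, h = z0.copy(), 1.0 / n_steps
    for k in range(n_steps):
        z = z + h * v(z, k * h)
    return z

z0 = np.linspace(-2.0, 2.0, 9)   # a few source points
target = M + S * z0              # where the exact ODE sends each point

flow1_n1 = euler_sample(v1, z0, 1)      # collapses every point onto the mean M
flow1_n100 = euler_sample(v1, z0, 100)  # close to the target
flow2_n1 = euler_sample(v2, z0, 1)      # matches the target in a single step
print(flow1_n1, flow1_n100, flow2_n1, sep="\n")
```

With the first flow, a single Euler step sends every point to the mean, because the paths are curved and the initial velocity is a poor summary of them. After one reflow the paths are straight, so one step is enough.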

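The core of this paper is understanding what non-crossing means. Frankly, I don't think the authors explain it very clearly; they cover it in a single sentence.
[Figure]
The most relevant proof uses Jensen's inequality to show that the transport cost keeps decreasing, which tells us the flow is moving toward cheaper routes, so presumably the pairing becomes more non-crossing? The proof that the transport cost decreases: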
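
The chain of inequalities can be sketched as follows, as I understand the argument in the paper. The notation is an assumption: c is any convex cost, (X_0, X_1) is the original pairing with X_t = tX_1 + (1-t)X_0, and (Z_0, Z_1) is the pairing produced by the rectified flow with velocity v(x, t) = E[X_1 - X_0 | X_t = x].

```latex
\begin{aligned}
\mathbb{E}\big[c(Z_1 - Z_0)\big]
  &= \mathbb{E}\Big[c\Big(\int_0^1 v(Z_t, t)\,dt\Big)\Big] \\
  &\le \mathbb{E}\Big[\int_0^1 c\big(v(Z_t, t)\big)\,dt\Big]
     && \text{(Jensen, averaging over } t\text{)} \\
  &= \mathbb{E}\Big[\int_0^1 c\big(v(X_t, t)\big)\,dt\Big]
     && \text{(}Z_t \text{ and } X_t \text{ have the same marginal law)} \\
  &= \mathbb{E}\Big[\int_0^1 c\big(\mathbb{E}[X_1 - X_0 \mid X_t]\big)\,dt\Big] \\
  &\le \mathbb{E}\Big[\int_0^1 \mathbb{E}\big[c(X_1 - X_0) \mid X_t\big]\,dt\Big]
     && \text{(Jensen, conditional expectation)} \\
  &= \mathbb{E}\big[c(X_1 - X_0)\big]
\end{aligned}
```

Jensen's inequality is applied twice: once to the time average along a path, and once to the conditional expectation that defines v.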

I have not fully understood this part myself; if you have, please share your thoughts in the comments.

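Reflow VS Distillation
Fixing t=0 is effectively one-step distillation:
The difference: distillation tries to reproduce the given pairing exactly, whereas reflow focuses on getting the correct marginal distributions and lowers the chance of paths crossing.
Reflow and Distillation can also be combined: first use Reflow to obtain a good pairing, then run Distillation on that already good pairing.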
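
The relation between the two objectives can be sketched in a few lines, assuming pairs (z0, z1) produced by a previous flow. The functions `reflow_loss`, `distill_loss`, and `toy_velocity` are illustrative names, and `toy_velocity` is an arbitrary hand-written function standing in for a trainable network.

```python
import numpy as np

def reflow_loss(v, z0, z1, t):
    # Reflow: regress the straight-line direction z1 - z0 at a point x_t
    # on the line between the pair, for times t drawn from [0, 1].
    x_t = t * z1 + (1.0 - t) * z0
    return np.mean((z1 - z0 - v(x_t, t)) ** 2)

def distill_loss(v, z0, z1):
    # One-step distillation: the same regression, evaluated only at t = 0,
    # so that z0 + v(z0, 0) reproduces z1 in a single step.
    return np.mean((z1 - z0 - v(z0, 0.0)) ** 2)

def toy_velocity(x, t):
    # Hypothetical velocity model, used only to exercise the two losses.
    return 1.5 - 0.3 * x + 0.1 * t

rng = np.random.default_rng(0)
z0 = rng.standard_normal(256)
z1 = 2.0 + 0.5 * z0              # toy pairing from a previous flow
t = rng.uniform(size=256)

loss_reflow = reflow_loss(toy_velocity, z0, z1, t)
loss_distill = distill_loss(toy_velocity, z0, z1)
loss_reflow_t0 = reflow_loss(toy_velocity, z0, z1, np.zeros(256))
print(loss_reflow, loss_distill, loss_reflow_t0)
```

Passing all-zero times to `reflow_loss` gives the same value as `distill_loss`, which is the sense in which fixing t=0 reduces reflow to one-step distillation.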
[Figure]
Results: