Custom Diffusion Project Tutorial

Custom Diffusion: Multi-Concept Customization of Text-to-Image Diffusion (CVPR 2023). Project repository: https://gitcode.com/gh_mirrors/cu/custom-diffusion

1. Project Overview

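Custom Diffusion is an open-source project for multi-concept customization of text-to-image diffusion models, developed by Adobe Research and published at CVPR 2023. Given a small set of images of a new concept (roughly 4 to 20), it fine-tunes a pretrained text-to-image diffusion model such as Stable Diffusion so that the model can generate images of that concept. Because it fine-tunes only the key and value projection matrices in the model's cross-attention layers, it substantially reduces training time and storage requirements. It also supports composing multiple concepts, for example a new object combined with a new artistic style.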
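
To make the "key and value projections only" idea concrete, here is a minimal sketch in plain Python. The parameter names and shapes are hypothetical stand-ins modeled on the `attn1` (self-attention) and `attn2` (cross-attention) naming commonly seen in Stable Diffusion's U-Net. They are assumptions for illustration, not the project's actual parameter list or selection code.

```python
from math import prod

# Hypothetical parameter table: name -> weight shape.
# Names imitate cross-attention naming (attn2.to_k / attn2.to_v);
# the shapes are illustrative only.
TOY_PARAMS = {
    "block0.attn1.to_q.weight": (320, 320),
    "block0.attn1.to_k.weight": (320, 320),
    "block0.attn1.to_v.weight": (320, 320),
    "block0.attn2.to_q.weight": (320, 320),
    "block0.attn2.to_k.weight": (320, 768),
    "block0.attn2.to_v.weight": (320, 768),
    "block0.ff.net.weight": (1280, 320),
}


def is_trainable(name: str) -> bool:
    """Return True only for cross-attention key/value projections."""
    return "attn2.to_k" in name or "attn2.to_v" in name


def summarize(params: dict) -> tuple:
    """Return (trainable names, trainable entry count, total entry count)."""
    names = [n for n in params if is_trainable(n)]
    trainable = sum(prod(params[n]) for n in names)
    total = sum(prod(shape) for shape in params.values())
    return names, trainable, total


if __name__ == "__main__":
    names, trainable, total = summarize(TOY_PARAMS)
    print("trainable:", names)
    print(f"{trainable} of {total} entries ({trainable / total:.1%})")
```

Everything not selected by `is_trainable` would stay frozen during fine-tuning. The ratio printed for this toy table is not representative of the real model, where the selected matrices are a much smaller share of all weights.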

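2. Quick Start

2.1 Environment Setup

First, install the required dependencies. The following commands clone the project, clone Stable Diffusion inside it, and set up the conda environment: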

git clone https://github.com/adobe-research/custom-diffusion.git
cd custom-diffusion
git clone https://github.com/CompVis/stable-diffusion.git
cd stable-diffusion
conda env create -f environment.yaml
conda activate ldm
pip install clip-retrieval tqdm

2.2 Download the Pretrained Model

Download the pretrained Stable Diffusion checkpoint:

wget https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt

2.3 Single-Concept Fine-Tuning

Fine-tune using real images as regularization samples:

# Download the dataset
wget https://www.cs.cmu.edu/~custom-diffusion/assets/data.zip
unzip data.zip

# Run training (requires 2 GPUs with 30 GB of memory each)
bash scripts/finetune_real.sh "cat" data/cat real_reg/samples_cat cat finetune_addtoken.yaml <pretrained-model-path>

# Save the updated model weights
python src/get_deltas.py --path logs/<folder-name> --newtoken 1

# Generate samples
python sample.py --prompt "<new1> cat playing with a ball" --delta_ckpt logs/<folder-name>/checkpoints/delta_epoch\=000004.ckpt --ckpt <pretrained-model-path>
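
The weight-saving step above keeps storage small because only the weights changed by fine-tuning need to be stored. The sketch below illustrates that idea on plain dictionaries. `extract_delta` and `apply_delta` are hypothetical helpers, not functions from the repository, and the repository's script may select weights differently (for example by parameter name). Real checkpoints hold tensors rather than lists of floats.

```python
def extract_delta(pretrained: dict, finetuned: dict) -> dict:
    """Keep only entries that are new or differ from the pretrained weights."""
    return {
        name: value
        for name, value in finetuned.items()
        if name not in pretrained or pretrained[name] != value
    }


def apply_delta(pretrained: dict, delta: dict) -> dict:
    """Rebuild the fine-tuned weights from the base weights plus the delta."""
    merged = dict(pretrained)  # copy so the base weights stay untouched
    merged.update(delta)
    return merged


# Toy "checkpoints" with made-up names and values.
pretrained = {
    "attn1.to_q.weight": [0.1, 0.2],
    "attn2.to_k.weight": [0.3, 0.4],
    "attn2.to_v.weight": [0.5, 0.6],
}
finetuned = {
    "attn1.to_q.weight": [0.1, 0.2],    # frozen, unchanged
    "attn2.to_k.weight": [0.31, 0.38],  # updated by fine-tuning
    "attn2.to_v.weight": [0.52, 0.61],  # updated by fine-tuning
    "embedding.new1": [0.9, -0.7],      # embedding for a newly added token
}

delta = extract_delta(pretrained, finetuned)
restored = apply_delta(pretrained, delta)
print(sorted(delta))
```

This mirrors why the sampling command takes both `--ckpt` (the full pretrained model) and `--delta_ckpt` (the small file of updated weights): the two are combined at load time.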