Stable Diffusion is a text-to-image (Txt2Img) generative model built on Latent Diffusion Models (LDMs).
Related papers:
- *High-Resolution Image Synthesis with Latent Diffusion Models* (CVPR 2022)
I pivoted to generative models on 2025-03-30; this post will be extended as needed.
#1. Environment Setup
# Clone the project
git clone https://github.com/CompVis/stable-diffusion.git
# Create the environment
conda env create -f environment.yaml
conda activate ldm
Bug fix: running `conda env create -f environment.yaml` hangs while installing the pip dependencies.
Cause: the pip packages listed in the yaml are downloaded without a proxy by default, so the downloads are extremely slow.
Fix: install the pip dependencies from the yaml manually through a proxy, e.g.:
pip install albumentations==0.4.3 --proxy="socks5:xx:xxxxx"
...
pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers --proxy="socks5:xx:xxxxx"
pip install -e git+https://github.com/openai/CLIP.git@main#egg=clip --proxy="socks5:xx:xxxxx"
pip install -e . --proxy="socks5:xx:xxxxx"
#2. Download the Model Weights
From https://huggingface.co/CompVis/stable-diffusion, the recommended checkpoint is sd-v1-4; the commands below assume it is saved as `weights/sd-v1-4.ckpt`.
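If you prefer to fetch the checkpoint programmatically, here is a minimal sketch using `huggingface_hub` (the repo ID `CompVis/stable-diffusion-v-1-4-original` is my assumption for where the original v1.4 weights live; the repo is license-gated, so you may first need to accept the license on the site and run `huggingface-cli login`):

```python
from huggingface_hub import hf_hub_download

# Downloads sd-v1-4.ckpt into ~/.cache/huggingface/hub and returns the
# local path; copy or symlink it to weights/sd-v1-4.ckpt for the commands below.
path = hf_hub_download(
    repo_id="CompVis/stable-diffusion-v-1-4-original",  # assumed repo ID
    filename="sd-v1-4.ckpt",
)
print(path)
```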
#3. Inference: Text2Image
proxychains python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms --ckpt weights/sd-v1-4.ckpt
Bug fix #1:
OSError: We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it looks like CompVis/stable-diffusion-safety-checker is not the path to a directory containing a preprocessor_config.json file. Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
Cause: the files for `CompVis/stable-diffusion-safety-checker` (and the CLIP model the script depends on) were not found locally, and https://huggingface.co is unreachable, so they cannot be downloaded.
Fix: run the command behind a proxy once; the missing files are downloaded automatically into the `~/.cache/huggingface/hub` directory.
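Alternatively, pre-fetch the files into the cache once from a session that has proxy access. A minimal sketch using `transformers` (loading via `AutoFeatureExtractor`/`CLIPTokenizer` is my assumption about which classes these repos ship configs for):

```python
from transformers import AutoFeatureExtractor, CLIPTokenizer

# Each from_pretrained call downloads the missing files into
# ~/.cache/huggingface/hub, where txt2img.py will later find them offline.
AutoFeatureExtractor.from_pretrained("CompVis/stable-diffusion-safety-checker")
CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
```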
Bug fix #2:
CUDA out of memory
Cause: 12 GB of VRAM is not enough at the default settings.
Fix:
- Way 1. Lower the number of samples with `--n_samples 1`; this parameter is effectively the batch size.
- Way 2. Reduce the output resolution with `--H 256 --W 256` (the default is 512).
- Way 3. In `scripts/txt2img.py`, change `model.cuda()` to `model.cuda().half()`. This halves the precision of every model parameter but significantly reduces VRAM usage; see the sketch after this list.
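A quick, repo-independent way to see why Way 3 helps: casting a module to fp16 halves its parameter memory (the `nn.Linear` here is just a stand-in for the SD model):

```python
import torch.nn as nn

def param_bytes(m: nn.Module) -> int:
    return sum(p.numel() * p.element_size() for p in m.parameters())

model = nn.Linear(4096, 4096)   # stand-in for the SD model
before = param_bytes(model)     # fp32: 4 bytes per parameter
model = model.half()            # the same cast as model.cuda().half()
after = param_bytes(model)      # fp16: 2 bytes per parameter
print(before, after)            # after == before // 2
```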
#4. Inference: Image Modification
python scripts/img2img.py --prompt "A fantasy landscape, trending on artstation" --init-img demo.jpg --strength 0.8 --config configs/stable-diffusion/v1-inference.yaml --ckpt weights/sd-v1-4.ckpt
Bug fix #1:
OSError: Can't load tokenizer for 'openai/clip-vit-large-patch14'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'openai/clip-vit-large-patch14' is the correct path to a directory containing all relevant files for a CLIPTokenizer tokenizer.
Cause: the `openai/clip-vit-large-patch14` tokenizer files were not found locally, and https://huggingface.co/models is unreachable, so they cannot be downloaded.
Fix: same as Bug fix #1 in Section 3: run the command behind a proxy once (or pre-download with the snippet there); the files land in the `~/.cache/huggingface/hub` directory.
Bug fix #2:
CUDA out of memory
Cause: 12 GB of VRAM is not enough at the default settings.
Fix:
- Way 1. Lower the number of samples with `--n_samples 1`; this parameter is effectively the batch size.
- Way 2. Use a smaller `init-img` as input; see the resizing sketch below.
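A minimal sketch for Way 2, shrinking the init image with Pillow before passing it to the script (`demo_small.jpg` is a placeholder name; SD v1 works on dimensions that are multiples of 32/64, so the sizes are snapped to 64 here to be safe):

```python
from PIL import Image

img = Image.open("demo.jpg")
# Halve each side, then round down to a multiple of 64 (minimum 64) so the
# latent sizes stay valid for the v1 models.
w, h = (max(64, (s // 2) // 64 * 64) for s in img.size)
img.resize((w, h), Image.LANCZOS).save("demo_small.jpg")  # placeholder output name
```

Then run the img2img command above with `--init-img demo_small.jpg`.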