The T2I-Adapter is a lighter-weight adapter than ControlNet that provides an additional input to condition a pretrained model. It is faster than ControlNet, but the results may be slightly worse.
Additional T2I-Adapter checkpoints trained on other inputs can be found in TencentARC's repositories.
Load a T2IAdapter trained on Canny edge images and pass it to the [StableDiffusionXLAdapterPipeline]. Then load the LCM checkpoint into a [UNet2DConditionModel] and replace the scheduler with the [LCMScheduler].
Now pass the Canny edge image to the pipeline and generate an image.
# Environment setup for running the script
import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"
import torch
import cv2
import numpy as np
from PIL import Image
from diffusers import StableDiffusionXLAdapterPipeline, UNet2DConditionModel, T2IAdapter, LCMScheduler
from diffusers.utils import load_image, make_image_grid
# Detect edges at low resolution to avoid picking up high-frequency detail
image = load_image(
"https://hf-mirror.com/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
).resize((384, 384))
image = np.array(image)
low_threshold = 100
high_threshold = 200
image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image).resize((1024, 1216))
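The three-line numpy step above stacks the single-channel Canny edge map into three identical channels so the adapter receives an RGB image. As a sketch (using a blank placeholder image instead of a real edge map), PIL's `convert("RGB")` does the same replication:

```python
from PIL import Image

# A single-channel ("L") edge map, as cv2.Canny would produce
# (a blank placeholder here, standing in for the real detection result).
edges = Image.new("L", (384, 384))

# Equivalent to np.concatenate([e, e, e], axis=2): PIL replicates the
# grayscale values into the R, G and B channels.
canny_rgb = edges.convert("RGB")
```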
# Load the T2I-Adapter and related models
adapter = T2IAdapter.from_pretrained(
    "TencentARC/t2i-adapter-canny-sdxl-1.0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")
unet = UNet2DConditionModel.from_pretrained(
"latent-consistency/lcm-sdxl",
torch_dtype=torch.float16,
variant="fp16",
)
pipe = StableDiffusionXLAdapterPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0",
unet=unet,
adapter=adapter,
torch_dtype=torch.float16,
variant="fp16",
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
prompt = "the mona lisa, 4k picture, high quality"
negative_prompt = "extra digit, fewer digits, cropped, worst quality, low quality, glitch, deformed, mutated, ugly, disfigured"
generator = torch.manual_seed(0)
image = pipe(
prompt=prompt,
negative_prompt=negative_prompt,
image=canny_image,
num_inference_steps=4,
guidance_scale=5,
adapter_conditioning_scale=0.8,
adapter_conditioning_factor=1,
generator=generator,
).images[0]
image.show()
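`make_image_grid` is imported above but never used; a minimal PIL-only equivalent (the helper name `image_grid` is hypothetical) is handy for viewing the control image next to the result. In the script above you would pass it `[canny_image.resize(image.size), image]`:

```python
from PIL import Image

def image_grid(images, rows, cols):
    """Paste same-sized PIL images into a rows x cols grid."""
    w, h = images[0].size
    grid = Image.new("RGB", (cols * w, rows * h))
    for i, img in enumerate(images):
        grid.paste(img, ((i % cols) * w, (i // cols) * h))
    return grid
```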
The original image is shown below.
The generated image is shown below (the change from the original is quite noticeable).