T2IAdapter Code Walkthrough

Both the Stable Diffusion XL model and the t2i-adapter-sketch-sdxl model can be downloaded from ModelScope.

import torch
from diffusers import T2IAdapter, StableDiffusionXLAdapterPipeline, DDPMScheduler
from diffusers.utils import load_image

# Sketch image used as the conditioning input
sketch_image = load_image("https://www.modelscope.cn/api/v1/models/AI-ModelScope/t2iadapter_sketch_sd14v1/repo?Revision=master&FilePath=.%2Fimages%2Fsketch.png&View=true")

# Local paths of the models downloaded from ModelScope
model_id = "./modelscope/hub/AI-ModelScope/stable-diffusion-xl-base-1___0"
adapter = T2IAdapter.from_pretrained(
    "./modelscope/hub/AI-ModelScope/t2i-adapter-sketch-sdxl-1___0",
    # subfolder="models_XL",
    torch_dtype=torch.float16,
    adapter_type="full_adapter_xl",
)
scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionXLAdapterPipeline.from_pretrained(
    model_id, adapter=adapter, torch_dtype=torch.float16, variant="fp16", scheduler=scheduler
).to("cuda")
# Fixed seed so the result is reproducible
generator = torch.manual_seed(42)
sketch_image_out = pipe(
    prompt="a photo of a room in real world, high quality",
    negative_prompt="extra digit, fewer digits, cropped, worst quality, low quality",
    image=sketch_image,
    generator=generator,
    guidance_scale=7.5,
).images[0]

sketch_image_out.save("sketch.jpg")

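Network architecture diagram

(Figure: network architecture diagram; image not available)
T2IAdapter is built from four AdapterBlock layers, each containing two AdapterResnetBlock modules. Each layer produces one output, giving four feature maps in total. Each feature map is added to the output of the corresponding encoder level of the Stable Diffusion UNet, and the sum is then passed on to the next encoder level.
ControlNet's outputs are also added to the output of each encoder level, but the sum does not enter the next encoder level; it reaches the decoder only through the skip connection.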
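The difference between the two injection points can be sketched with a toy encoder. This is only an illustration under simplifying assumptions: features are plain floats instead of tensors, `run_encoder` and the doubling stages are hypothetical names invented for this example, and ControlNet's extra mid-block residual is left out.

```python
def run_encoder(x, stages, residuals, mode):
    """Toy encoder: returns the final hidden state and the skip connections.

    mode="t2i_adapter": the residual is merged into the hidden state, so it
    reaches both the skip connection and the next encoder stage.
    mode="controlnet": the residual is added only to the skip-connection copy,
    so the next encoder stage sees the unmodified hidden state.
    """
    hidden = x
    skips = []
    for stage, residual in zip(stages, residuals):
        hidden = stage(hidden)
        if mode == "t2i_adapter":
            hidden = hidden + residual
            skips.append(hidden)
        elif mode == "controlnet":
            skips.append(hidden + residual)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return hidden, skips


# Hypothetical stand-ins for encoder levels: each one just doubles its input.
stages = [lambda v: v * 2, lambda v: v * 2, lambda v: v * 2]
residuals = [1.0, 1.0, 1.0]

h_t2i, skips_t2i = run_encoder(1.0, stages, residuals, "t2i_adapter")
h_cn, skips_cn = run_encoder(1.0, stages, residuals, "controlnet")
print(h_t2i, skips_t2i)
print(h_cn, skips_cn)
```

In the T2I-Adapter mode the residual from an early level compounds through the later levels, while in the ControlNet mode the encoder's own hidden state is unchanged and only the skip connections differ.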
Some details of the ControlNet code
(The code here uses Stable Diffusion XL, so its values differ from those in the figure.)

Network structure code

ModuleList(
  (0): AdapterBlock(
    (resnets): Sequential(
      (0): AdapterResnetBlock( #(1,320,64,64)
        (block1): Conv2d(320, 320, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (act): ReLU()
        (block2): Conv2d(320, 320, kernel_size=(1, 1), stride=(1, 1))
      )
      (1): AdapterResnetBlock( )
    )
  )
  (1): AdapterBlock(
    (in_conv): Conv2d(320, 640, kernel_size=(1, 1), stride=(1, 1))  #(1,640,64,64)
    (resnets): Sequential(
      (0): AdapterResnetBlock( )
      (1): AdapterResnetBlock( )
    )
  )
  (2): AdapterBlock(
    (downsample): AvgPool2d(kernel_size=2, stride=2, padding=0)  #(1,640,32,32)
    (in_conv): Conv2d(640, 1280, kernel_size=(1, 1), stride=(1, 1)) #(1,1280,32,32)
    (resnets): Sequential(
      (0): AdapterResnetBlock( )
      (1): AdapterResnetBlock( )
    )
  )
  (3): AdapterBlock(
    (resnets): Sequential(
      (0): AdapterResnetBlock() #(1,1280,32,32)
      (1): AdapterResnetBlock()
    )
  )
)
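The shape comments in the printout can be reproduced with a small shape tracer. This is a minimal sketch under assumptions not stated in the printout: the input is taken to be 1024x1024 and is first reduced by a pixel-unshuffle factor of 16 (which gives the 64x64 starting resolution), and the helper name `adapter_feature_shapes` is hypothetical. Only block 2 has an `AvgPool2d`, so only that block halves the spatial size.

```python
def adapter_feature_shapes(height, width,
                           channels=(320, 640, 1280, 1280),
                           unshuffle=16,
                           downsample_at=(2,)):
    """Trace the (batch, channels, height, width) shape after each AdapterBlock.

    unshuffle: assumed pixel-unshuffle factor applied before the first block.
    downsample_at: indices of blocks that start with AvgPool2d(kernel_size=2).
    """
    h, w = height // unshuffle, width // unshuffle
    shapes = []
    for index, out_channels in enumerate(channels):
        if index in downsample_at:
            h, w = h // 2, w // 2  # AvgPool2d with stride 2 halves H and W
        # in_conv (when present) changes the channel count;
        # the two AdapterResnetBlocks keep the shape unchanged.
        shapes.append((1, out_channels, h, w))
    return shapes


shapes = adapter_feature_shapes(1024, 1024)
for index, shape in enumerate(shapes):
    print(index, shape)
```

Under these assumptions the four traced shapes agree with the comments in the printout: two feature maps at 64x64 and two at 32x32.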