Deploying Stable Cascade, a Recent Cascaded Text-to-Image Model

Stable Cascade is a text-to-image generation model developed by Stability AI.

Stable Cascade consists of three models: Stage A, Stage B, and Stage C. Each one handles a different phase of image generation, and together they form a "cascade".

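Stage C takes the text prompt and generates a low-resolution latent. That latent is passed to Stage B, which upscales it. The result then goes to Stage A, which upscales it once more and decodes it into pixel space to produce the final image.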
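To make the hand-off between stages more concrete, the sketch below computes the latent shapes that Stage C and Stage B would work with for a given output resolution. It is a stand-alone approximation of what the repository's `calculate_latent_sizes` helper returns. The compression factors (about 42.67 for Stage C, 4 for Stage B) and channel counts (16 and 4) are assumptions recalled from the upstream code, so check them against the repository before relying on them.

```python
from math import ceil

def sketch_latent_sizes(height=1024, width=1024, batch_size=4,
                        compression_c=42.67, compression_b=4.0):
    """Approximate latent shapes for Stage C and Stage B.

    The compression factors and channel counts are assumptions,
    not values confirmed by this article.
    """
    # Stage C works on a heavily compressed latent (assumed 16 channels).
    stage_c_shape = (batch_size, 16,
                     ceil(height / compression_c),
                     ceil(width / compression_c))
    # Stage B works on a larger latent that Stage A decodes to pixels
    # (assumed 4 channels).
    stage_b_shape = (batch_size, 4,
                     ceil(height / compression_b),
                     ceil(width / compression_b))
    return stage_c_shape, stage_b_shape

if __name__ == "__main__":
    c_shape, b_shape = sketch_latent_sizes(1024, 1024, batch_size=4)
    print("Stage C latent:", c_shape)
    print("Stage B latent:", b_shape)
```

Under these assumptions a 1024x1024 image corresponds to a 24x24 Stage C latent and a 256x256 Stage B latent, which shows how small the space is in which the text-conditional model operates.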

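This staged architecture makes Stable Cascade more flexible and efficient: each stage can use a model of a different size, so users can pick the variants that fit their hardware, which lowers the hardware requirements.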
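As a rough illustration of how the choice of model sizes affects hardware needs, the sketch below estimates the memory taken by the weights alone. The parameter counts (3.6B and 1B for Stage C, 1.5B and 700M for Stage B, 20M for Stage A) are assumptions based on the figures reported in the upstream README. Activations, text encoders, and framework overhead are not included, so real usage will be higher.

```python
# Approximate parameter counts per stage; assumed from the upstream README.
PARAMS = {
    "stage_c": {"big": 3.6e9, "small": 1.0e9},
    "stage_b": {"big": 1.5e9, "small": 0.7e9},
    "stage_a": 20e6,
}

def weight_memory_gb(stage_c="big", stage_b="big", bytes_per_param=2):
    """Estimate weight memory in GB (10^9 bytes).

    bytes_per_param=2 corresponds to bfloat16, 4 to float32.
    Only the three stage models are counted.
    """
    total_params = (PARAMS["stage_c"][stage_c]
                    + PARAMS["stage_b"][stage_b]
                    + PARAMS["stage_a"])
    return total_params * bytes_per_param / 1e9

if __name__ == "__main__":
    for c in ("big", "small"):
        for b in ("big", "small"):
            gb = weight_memory_gb(c, b)
            print(f"Stage C {c}, Stage B {b}: ~{gb:.2f} GB in bfloat16")
```

The estimate only compares variants against each other; it is not a measured VRAM requirement.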

GitHub repository: https://github.com/Stability-AI/StableCascade

I. Environment setup

1. Python environment

Python 3.10 or later is recommended.

2. Installing the pip packages

pip install torch==2.3.0+cu118 torchvision==0.18.0+cu118 torchaudio==2.3.0 --extra-index-url https://download.pytorch.org/whl/cu118

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install gradio accelerate -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install git+https://github.com/kashif/diffusers.git@wuerstchen-v3
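After running the commands above, a quick sanity check can confirm that the interpreter is recent enough and that the packages are importable. The sketch below uses only the standard library. The package list is taken from the pip commands in this section, and the helper names are hypothetical.

```python
import importlib.util
import sys

# Top-level package names installed by the pip commands above.
REQUIRED = ("torch", "torchvision", "torchaudio",
            "gradio", "accelerate", "diffusers")

def python_ok(min_version=(3, 10)):
    """Return True if the running interpreter is at least min_version."""
    return sys.version_info[:2] >= tuple(min_version)

def missing_packages(names=REQUIRED):
    """Return the names that cannot be found on the import path."""
    return [name for name in names
            if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing packages:", missing_packages() or "none")
```

This checks only that the packages can be located; it does not verify the pinned versions or that a CUDA device is available.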

3. Downloading the models

bash models/download_models.sh essential big-big small-small bfloat16

II. Functional testing

1. Command-line test

(1) Calling the model from Python code

import torch
from tqdm import tqdm

# Assumes calculate_latent_sizes, show_images, core, models, extras (Stage C)
# and core_b, models_b, extras_b (Stage B) have already been imported or
# created elsewhere, for example by the setup code in the repository.

batch_size = 4
caption = "Cinematic photo of an anthropomorphic penguin sitting in a cafe reading a book and having a coffee"
height, width = 1024, 1024
stage_c_latent_shape, stage_b_latent_shape = calculate_latent_sizes(height, width, batch_size=batch_size)

# Stage C Parameters
extras.sampling_configs['cfg'] = 4
extras.sampling_configs['shift'] = 2
extras.sampling_configs['timesteps'] = 20
extras.sampling_configs['t_start'] = 1.0

# Stage B Parameters
extras_b.sampling_configs['cfg'] = 1.1
extras_b.sampling_configs['shift'] = 1
extras_b.sampling_configs['timesteps'] = 10
extras_b.sampling_configs['t_start'] = 1.0

# Prepare conditions
batch = {'captions': [caption] * batch_size}
conditions = core.get_conditions(batch, models, extras, is_eval=True, is_unconditional=False, eval_image_embeds=False)
unconditions = core.get_conditions(batch, models, extras, is_eval=True, is_unconditional=True, eval_image_embeds=False)
conditions_b = core_b.get_conditions(batch, models_b, extras_b, is_eval=True, is_unconditional=False)
unconditions_b = core_b.get_conditions(batch, models_b, extras_b, is_eval=True, is_unconditional=True)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Sampling
with torch.no_grad(), torch.cuda.amp.autocast(dtype=torch.bfloat16):

    torch.manual_seed(42)

    # Stage C Sampling
    sampling_c = extras.gdf.sample(
        models.generator, conditions, stage_c_latent_shape,
        unconditions, device=device, **extras.sampling_configs,
    )
    # Run the sampler to completion; after the loop, sampled_c holds the final Stage C latent
    for (sampled_c, _, _) in tqdm(sampling_c, total=extras.sampling_configs['timesteps']):
        pass

    # Uncomment to preview stage C results
    # preview_c = models.previewer(sampled_c).float()
    # show_images(preview_c)

    # Update conditions for Stage B
    conditions_b['effnet'] = sampled_c
    unconditions_b['effnet'] = torch.zeros_like(sampled_c, device=device)

    # Stage B Sampling
    sampling_b = extras_b.gdf.sample(
        models_b.generator, conditions_b, stage_b_latent_shape,
        unconditions_b, device=device, **extras_b.sampling_configs
    )
    # Run the sampler to completion; after the loop, sampled_b holds the final Stage B latent
    for (sampled_b, _, _) in tqdm(sampling_b, total=extras_b.sampling_configs['timesteps']):
        pass

    # Decode the Stage B latent to pixel space with Stage A
    sampled = models_b.stage_a.decode(sampled_b).float()

show_images(sampled)
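The `cfg` entries set in the script (4 for Stage C, 1.1 for Stage B) are guidance scales. The sketch below shows the standard classifier-free guidance combination on plain Python lists, assuming the sampler mixes the conditional and unconditional predictions in the usual way. The function name is hypothetical, and the repository's sampler may differ in details.

```python
def cfg_mix(cond, uncond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond).

    cond and uncond are equal-length sequences of floats standing in for
    the model's conditional and unconditional predictions.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]

if __name__ == "__main__":
    cond = [1.0, 3.0]
    uncond = [0.0, 1.0]
    # scale = 1 returns the conditional prediction unchanged
    print(cfg_mix(cond, uncond, 1.0))
    # larger scales push the result further away from the unconditional one
    print(cfg_mix(cond, uncond, 4.0))
```

A scale close to 1, as used for Stage B, leaves the conditional prediction almost untouched, while the larger Stage C value makes the output follow the prompt more strongly.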
