SOTA Video Generation Model Inference Out of the Box: MindSpore Supports Wan2.1

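In AI video generation, Wan2.1 is a recent visual generation model that produces video from text, images, or other control signals, and it has drawn wide attention for its strong results. On the VBench benchmark, Wan2.1 takes first place with a total score of 86.22%.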

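MindSpore quickly adapted Wan2.1 and open-sourced the adaptation in the MindSpore ONE repository; combined with Ascend hardware, it gives developers an efficient experience. This article walks through the complete deployment of Wan2.1 video generation on MindSpore with a single Atlas 800T A2 server.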

Quick start: up and running with Wan2.1 in 5 minutes

01 Environment setup

[Figure: environment requirements]

CANN download:

https://www.hiascend.com/developer/download/community/result

MindSpore download:

https://www.mindspore.cn/install

02 Install dependencies

git clone https://github.com/mindspore-lab/mindone
cd mindone/examples/wan2_1

pip install -r requirements.txt
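
After installing, it can help to confirm that the key packages are visible to Python before launching a long generation job. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and checking only `mindspore` is an assumption (requirements.txt remains the authoritative list).

```python
import importlib.util


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # "mindspore" is the import name of the MindSpore package; extend the
    # list with whatever requirements.txt installs in your checkout.
    missing = missing_packages(["mindspore"])
    print("missing:", missing if missing else "none")
```

An empty result only shows that the packages can be located; it does not verify that CANN and the Ascend driver are set up correctly.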

03 Model download

Available models (name, 🤗 Hugging Face link, 🤖 ModelScope link, notes):

T2V-14B
  Hugging Face: https://huggingface.co/Wan-AI/Wan2.1-T2V-14B
  ModelScope: https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-14B
  Notes: supports 480P and 720P

I2V-14B-720P
  Hugging Face: https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P
  ModelScope: https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P
  Notes: supports 720P

I2V-14B-480P
  Hugging Face: https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-480P
  ModelScope: https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-480P
  Notes: supports 480P

T2V-1.3B
  Hugging Face: https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B
  ModelScope: https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B
  Notes: supports 480P

Download the model you need from Hugging Face or ModelScope, for example:

huggingface-cli download Wan-AI/Wan2.1-T2V-14B --local-dir ./Wan2.1-T2V-14B
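
If you need more than one checkpoint, the same command can be generated for each model. This is a small sketch, not part of the repository: the helper name `download_command` is hypothetical, and the repository IDs are the ones listed in this section.

```python
# Hugging Face repository IDs as listed in this section.
MODELS = {
    "T2V-14B": "Wan-AI/Wan2.1-T2V-14B",
    "I2V-14B-720P": "Wan-AI/Wan2.1-I2V-14B-720P",
    "I2V-14B-480P": "Wan-AI/Wan2.1-I2V-14B-480P",
    "T2V-1.3B": "Wan-AI/Wan2.1-T2V-1.3B",
}


def download_command(model):
    """Build the huggingface-cli argument list for one model (hypothetical helper)."""
    repo_id = MODELS[model]
    # Mirror the article's convention: ./<repository name> as the local directory.
    local_dir = "./" + repo_id.split("/", 1)[1]
    return ["huggingface-cli", "download", repo_id, "--local-dir", local_dir]


if __name__ == "__main__":
    for name in MODELS:
        print(" ".join(download_command(name)))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` to perform the actual download.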

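04 Text-to-video (T2V)

Supports the 1.3B and 14B models. According to the model list in section 03, T2V-14B supports 480P and 720P, while T2V-1.3B supports 480P.

  • Single-card inference: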

python generate.py  \
    --task t2v-1.3B \
    --size 832*480 \
    --ckpt_dir ./Wan2.1-T2V-1.3B \
    --sample_shift 8 \
    --sample_guide_scale 6 \
    --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

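You can supply your own prompt to generate a custom 480P video. Lowering --sample_guide_scale favors visual quality, while raising it strengthens how closely the video matches the text.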
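
To make this trade-off concrete, here is a toy sketch of classifier-free guidance, the usual mechanism behind a guidance scale in diffusion samplers. That sample_guide_scale works this way inside generate.py is an assumption, not something stated in this article; the function name and the list-based "predictions" are purely illustrative.

```python
def apply_guidance(uncond, cond, guide_scale):
    """Blend unconditional and text-conditioned predictions element-wise.

    guide_scale = 1 returns the conditional prediction unchanged; larger
    values push the result further along the text-conditioned direction.
    """
    return [u + guide_scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy prediction without the prompt
cond = [1.0, 3.0]    # toy prediction with the prompt

print(apply_guidance(uncond, cond, 1.0))  # [1.0, 3.0]
print(apply_guidance(uncond, cond, 6.0))  # [6.0, 13.0]
```

Under this assumption, a larger scale amplifies the difference between the two predictions, which is why it tends to improve prompt adherence at some cost to visual quality.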

  • Multi-card acceleration:

msrun --worker_num=4 --local_worker_num=4 generate.py \
    --task t2v-14B \
    --size 1280*720 \
    --ckpt_dir ./Wan2.1-T2V-14B \
    --dit_zero3 --t5_zero3 --ulysses_sp \
    --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

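Sequence parallelism and optimizer parallelism can be enabled (the --ulysses_sp, --dit_zero3, and --t5_zero3 flags in the command above) to speed up 720P video generation.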
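
The single-card and multi-card invocations differ only in the launcher and the parallelism flags, which the following sketch makes explicit. The helper `launch_command` is hypothetical; it only reuses the launchers and flags that appear in the commands in this section, and applying the same three flags for every multi-card run is an assumption.

```python
def launch_command(task, size, ckpt_dir, prompt, num_cards=1):
    """Return the argument list for a single-card or multi-card run (hypothetical helper)."""
    args = ["generate.py", "--task", task, "--size", size,
            "--ckpt_dir", ckpt_dir]
    if num_cards == 1:
        launcher = ["python"]
    else:
        launcher = ["msrun", f"--worker_num={num_cards}",
                    f"--local_worker_num={num_cards}"]
        # Parallelism flags as used in the multi-card example above.
        args += ["--dit_zero3", "--t5_zero3", "--ulysses_sp"]
    return launcher + args + ["--prompt", prompt]


if __name__ == "__main__":
    cmd = launch_command("t2v-14B", "1280*720", "./Wan2.1-T2V-14B",
                         "a cat boxing on a stage", num_cards=4)
    print(" ".join(cmd[:-1]), repr(cmd[-1]))
```

Run from mindone/examples/wan2_1, the list can be handed to `subprocess.run(cmd, check=True)`; passing a list avoids shell quoting problems with long prompts.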

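05 Image-to-video (I2V)

Supports the 14B model at 480P or 720P, with a separate checkpoint for each resolution (I2V-14B-480P and I2V-14B-720P).

  • Single-card inference: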

python generate.py \
    --task i2v-14B \
    --size 832*480 \
    --ckpt_dir ./Wan2.1-I2V-14B-480P \
    --image examples/i2v_input.JPG \
    --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."

  • Multi-card acceleration:

msrun --worker_num=2 --local_worker_num=2 generate.py \
    --task i2v-14B --size 1280*720 \
    --ckpt_dir ./Wan2.1-I2V-14B-720P \
    --dit_zero3 --t5_zero3 --ulysses_sp \
    --image examples/i2v_input.JPG \
    --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
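
Because I2V uses a different checkpoint per resolution, it is easy to pair --size with the wrong --ckpt_dir. The lookup below is a small sketch of a guard against that; the helper `i2v_checkpoint` is hypothetical, and the size-to-directory pairs are taken from the two I2V commands in this section.

```python
# Size-to-checkpoint pairs as used in the I2V commands in this section.
I2V_CHECKPOINTS = {
    "832*480": "./Wan2.1-I2V-14B-480P",
    "1280*720": "./Wan2.1-I2V-14B-720P",
}


def i2v_checkpoint(size):
    """Return the checkpoint directory for an I2V --size value (hypothetical helper)."""
    try:
        return I2V_CHECKPOINTS[size]
    except KeyError:
        raise ValueError(f"no I2V checkpoint listed for size {size!r}") from None


if __name__ == "__main__":
    print(i2v_checkpoint("1280*720"))
```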

06 Measured performance: Ascend hardware acceleration improves efficiency

The performance test results are as follows:

[Figure: performance test results]

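1.3B model: low resource usage and fast generation, suited to lightweight applications.

14B model: supports higher resolution and produces better quality; multi-card parallelism markedly improves efficiency, with 4 cards giving a 3.8x speedup.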
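
As a quick piece of arithmetic on the figure above, a speedup can be converted into parallel efficiency by dividing by the number of cards. The sketch assumes the 3.8x speedup is measured against a single-card run, which the article does not state explicitly; the function name is illustrative.

```python
def parallel_efficiency(speedup, num_cards):
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return speedup / num_cards


efficiency = parallel_efficiency(3.8, 4)
print(f"{efficiency:.0%}")  # 95%
```

Under that assumption, the 4-card run reaches about 95% of ideal linear scaling.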

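Summary

Want to turn your ideas into video? Download the code from MindSpore ONE and try it yourself. If you have questions, leave a comment and we will reply as soon as we can.

MindSpore ONE open-source repository: https://github.com/mindspore-lab/mindone/tree/master/examples/wan2_1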
