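In AI video generation, Wan2.1 is a recent visual generation model that produces video from text, images, or other control signals, and it has drawn wide attention for its strong results. On the VBench benchmark, Wan2.1 took first place with an overall score of 86.22%.
MindSpore adapted Wan2.1 quickly and open-sourced the port in the MindSpore ONE repository, where it pairs with Ascend hardware to give developers an efficient workflow. This article walks through the complete deployment of Wan2.1 video generation on MindSpore with a single Atlas 800T A2 server.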
Quick start: Wan2.1 in 5 minutes
01 Environment setup
CANN download:
https://www.hiascend.com/developer/download/community/result
MindSpore download:
https://www.mindspore.cn/install
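Before moving on, it can help to confirm that the MindSpore package is visible to the Python interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the check only tests importability, not whether CANN and the Ascend driver are configured correctly.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "mindspore" is the import name of the MindSpore wheel; extend the list as needed.
missing = missing_packages(["mindspore"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    # mindspore.run_check() is MindSpore's own installation self-test.
    print("MindSpore is importable; mindspore.run_check() can confirm the device works.")
```

If the package is found, running `mindspore.run_check()` in the same interpreter gives a fuller check that the backend can actually execute an operator.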
02 Install dependencies
git clone https://github.com/mindspore-lab/mindone
cd mindone/examples/wan2_1
pip install -r requirements.txt
03 Download the models
Model | 🤗Hugging Face | 🤖ModelScope | Notes |
T2V-14B | https://huggingface.co/Wan-AI/Wan2.1-T2V-14B | https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-14B | Supports 480P and 720P |
I2V-14B-720P | https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P | https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P | Supports 720P |
I2V-14B-480P | https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-480P | https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-480P | Supports 480P |
T2V-1.3B | https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B | https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B | Supports 480P |
Download the model you need from Hugging Face or ModelScope, for example:
huggingface-cli download Wan-AI/Wan2.1-T2V-14B --local-dir ./Wan2.1-T2V-14B
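The same command pattern applies to every row of the model table. As a sketch, the table can be captured in a small lookup that builds the Hugging Face download command and checks which resolutions a checkpoint supports. The names `WAN21_MODELS`, `download_command`, and `supports` are hypothetical helpers for illustration; only the repository names, resolutions, and the `huggingface-cli download ... --local-dir` form come from the text above.

```python
# Rows copied from the model table above: model name -> supported resolutions.
WAN21_MODELS = {
    "T2V-14B": ("480P", "720P"),
    "I2V-14B-720P": ("720P",),
    "I2V-14B-480P": ("480P",),
    "T2V-1.3B": ("480P",),
}


def download_command(model):
    """Build the huggingface-cli command for one of the listed checkpoints."""
    if model not in WAN21_MODELS:
        raise KeyError(f"unknown model: {model}")
    name = f"Wan2.1-{model}"
    return f"huggingface-cli download Wan-AI/{name} --local-dir ./{name}"


def supports(model, resolution):
    """Return True if the table lists this resolution for the model."""
    return resolution in WAN21_MODELS[model]


print(download_command("T2V-1.3B"))
```

A check such as `supports("T2V-1.3B", "720P")` returns `False`, which is a cheap way to catch a mismatched `--size` before starting a long generation run.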
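04 Text-to-video (T2V)
T2V supports the 1.3B and 14B models. Per the model table, T2V-14B handles 480P and 720P, while T2V-1.3B handles 480P.
- Single-card inference: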
python generate.py \
--task t2v-1.3B \
--size 832*480 \
--ckpt_dir ./Wan2.1-T2V-1.3B \
--sample_shift 8 \
--sample_guide_scale 6 \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
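You can swap in your own prompt to generate a custom 480P video. Lowering sample_guide_scale favors visual quality, while raising it favors closer agreement between the video and the text prompt.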
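To see why the guidance scale trades visual quality against prompt adherence, here is a minimal sketch of classifier-free guidance. It assumes that sample_guide_scale plays the role of the usual guidance weight; the article does not spell out the formula, so treat this as an illustration of the general technique rather than the code in generate.py. The function name and the toy numbers are hypothetical.

```python
def apply_guidance(uncond, cond, scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional output and toward the text-conditioned one.

    uncond, cond: equal-length lists of floats standing in for the model's
    noise predictions without and with the prompt.
    scale: the guidance weight (assumed to correspond to sample_guide_scale).
    """
    if len(uncond) != len(cond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy values: scale 1 reproduces the conditional prediction,
# larger scales exaggerate the difference the prompt makes.
print(apply_guidance([0.0, 1.0], [1.0, 2.0], 1.0))
print(apply_guidance([0.0, 1.0], [1.0, 2.0], 6.0))
```

Under this formulation a larger scale amplifies whatever the prompt changes in the prediction, which tends to tighten text alignment at the cost of more saturated or less natural frames.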
- Multi-card acceleration:
msrun --worker_num=4 --local_worker_num=4 generate.py \
--task t2v-14B \
--size 1280*720 \
--ckpt_dir ./Wan2.1-T2V-14B \
--dit_zero3 --t5_zero3 --ulysses_sp \
--prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
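Enabling sequence parallelism (--ulysses_sp) and optimizer parallelism (--dit_zero3, --t5_zero3) speeds up 720P video generation.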
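As a rough picture of what Ulysses-style sequence parallelism does, the sketch below computes how work is laid out across devices: outside attention each device holds a slice of the token sequence with all heads, and inside attention an all-to-all exchange gives each device the full sequence for a slice of the heads. The function name and the example sizes are hypothetical, and the divisibility checks are a simplifying assumption; the actual implementation in the repository may pad or shard differently.

```python
def ulysses_layout(seq_len, num_heads, world_size):
    """Return (tokens_per_device, heads_per_device) for Ulysses-style
    sequence parallelism.

    tokens_per_device: sequence slice each device holds outside attention.
    heads_per_device: attention heads each device handles after the
    all-to-all exchange, when it sees the full sequence.
    """
    if world_size < 1:
        raise ValueError("world_size must be at least 1")
    if seq_len % world_size != 0:
        raise ValueError("sequence length must divide evenly across devices")
    if num_heads % world_size != 0:
        raise ValueError("head count must divide evenly across devices")
    return seq_len // world_size, num_heads // world_size


# Hypothetical sizes for illustration only, not Wan2.1's real dimensions.
tokens, heads = ulysses_layout(seq_len=1024, num_heads=8, world_size=4)
print(tokens, heads)
```

The head-count constraint is the practical one to remember: under this scheme the number of workers passed to msrun has to be compatible with the model's attention head count.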
05 Image-to-video (I2V)
I2V supports the 14B model at 480P or 720P, with a separate checkpoint for each resolution.
- Single-card inference:
python generate.py \
--task i2v-14B \
--size 832*480 \
--ckpt_dir ./Wan2.1-I2V-14B-480P \
--image examples/i2v_input.JPG \
--prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
- Multi-card acceleration:
msrun --worker_num=2 --local_worker_num=2 generate.py \
--task i2v-14B --size 1280*720 \
--ckpt_dir ./Wan2.1-I2V-14B-720P \
--dit_zero3 --t5_zero3 --ulysses_sp \
--image examples/i2v_input.JPG \
--prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
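Because I2V uses a different checkpoint for each resolution, the --size and --ckpt_dir arguments have to be kept in step. A small sketch of that pairing follows, using only the two size values and checkpoint directories that appear in the commands above; the names `I2V_CKPT_BY_SIZE` and `i2v_args` are hypothetical, and the directories assume the models were downloaded into the current folder.

```python
# Size -> checkpoint directory, taken from the two I2V commands above.
I2V_CKPT_BY_SIZE = {
    "832*480": "./Wan2.1-I2V-14B-480P",
    "1280*720": "./Wan2.1-I2V-14B-720P",
}


def i2v_args(size, image, prompt):
    """Build the generate.py argument list with a matching checkpoint."""
    if size not in I2V_CKPT_BY_SIZE:
        raise ValueError(f"no I2V checkpoint listed for size {size}")
    return [
        "generate.py",
        "--task", "i2v-14B",
        "--size", size,
        "--ckpt_dir", I2V_CKPT_BY_SIZE[size],
        "--image", image,
        "--prompt", prompt,
    ]


args = i2v_args("832*480", "examples/i2v_input.JPG", "A white cat on a surfboard.")
print(" ".join(args[:7]))
```

Prefixing the list with `["python"]` and passing it to `subprocess.run` gives the single-card run; handing arguments over as a list also keeps the shell from ever trying to expand the `*` in the size value.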
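06 Performance: efficiency gains from Ascend hardware acceleration
The performance test results are summarized as follows:
1.3B model: low resource usage and fast generation, a good fit for lightweight applications.
14B model: supports higher resolutions with better generation quality; multi-card parallelism improves throughput substantially, reaching a 3.8x speedup on 4 cards.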
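One way to read the 4-card figure is as parallel efficiency, the speedup divided by the number of cards. The short calculation below assumes the 3.8x speedup is measured against a single-card run of the same model and settings, which the article does not state explicitly; the function name is hypothetical.

```python
def parallel_efficiency(speedup, num_devices):
    """Fraction of ideal linear scaling achieved (1.0 means perfect)."""
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    return speedup / num_devices


# 3.8x on 4 cards, as reported above (single-card baseline assumed).
efficiency = parallel_efficiency(3.8, 4)
print(f"{efficiency:.0%}")
```

Under that assumption the run keeps about 95% of ideal linear scaling, so communication and synchronization overhead cost roughly 5%.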
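Summary
Want to turn your ideas into video? Grab the code from MindSpore ONE and try it yourself. If you run into problems, leave a comment and we will get back to you as soon as we can.
MindSpore ONE open-source repository: https://github.com/mindspore-lab/mindone/tree/master/examples/wan2_1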