Out-of-the-Box Inference for a SOTA Video Generation Model: MindSpore Supports Wan2.1

Article link: https://blog.csdn.net/Kenji_Shinji/article/details/146098871

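In AI video generation, Wan2.1 is a recent visual generation model that can produce video from text, images, or other control signals, and it has drawn wide attention for its strong results. On the VBench benchmark, Wan2.1 took first place with an overall score of 86.22%.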

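MindSpore was quickly adapted to Wan2.1, and the adaptation has been open-sourced in the MindSpore ONE repository; combined with Ascend hardware, it gives developers an efficient experience. This article walks through the complete deployment flow for Wan2.1 video generation on MindSpore and a single Atlas 800T A2 server.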

Quick start: up and running with Wan2.1 in 5 minutes

01 Environment setup

CANN download:

https://www.hiascend.com/developer/download/community/result

MindSpore download:

https://www.mindspore.cn/install
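Before moving on, it can help to confirm that the Python environment actually sees the installed package. The following is a minimal sketch, not part of the original walkthrough: the helper name `describe_package` is hypothetical, and it only inspects the local Python environment (it does not check CANN or the Ascend driver).

```python
import importlib.util
from importlib import metadata


def describe_package(name: str) -> str:
    """Return a one-line install status for a top-level package name."""
    if importlib.util.find_spec(name) is None:
        return f"{name}: not installed"
    try:
        return f"{name}: {metadata.version(name)}"
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this exact name.
        return f"{name}: installed (version unknown)"


if __name__ == "__main__":
    # Hypothetical check; the package name "mindspore" is the only input.
    print(describe_package("mindspore"))
```

For a functional check on the device itself, the MindSpore installation guide describes `mindspore.run_check()`; the sketch above only answers whether the package is importable.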

02 Install dependencies

git clone https://github.com/mindspore-lab/mindone
cd mindone/examples/wan2_1

pip install -r requirements.txt

03 Model download

Model	🤗Huggingface	🤖Modelscope	Notes
T2V-14B	https://huggingface.co/Wan-AI/Wan2.1-T2V-14B	https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-14B	Supports 480P and 720P
I2V-14B-720P	https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-720P	https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-720P	Supports 720P
I2V-14B-480P	https://huggingface.co/Wan-AI/Wan2.1-I2V-14B-480P	https://www.modelscope.cn/models/Wan-AI/Wan2.1-I2V-14B-480P	Supports 480P
T2V-1.3B	https://huggingface.co/Wan-AI/Wan2.1-T2V-1.3B	https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B	Supports 480P

Download the model you need from Hugging Face or ModelScope, for example:

huggingface-cli download Wan-AI/Wan2.1-T2V-14B --local-dir ./Wan2.1-T2V-14B
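The same download can also be scripted from Python. The sketch below is an illustration under assumptions: the `MODELS` mapping and the helper names are hypothetical, and the local directory layout simply mirrors the `--local-dir` used in the command above. It relies on `snapshot_download` from the `huggingface_hub` package, imported lazily so the path helper works even where that package is absent.

```python
# Hugging Face repository IDs for the Wan2.1 checkpoints.
MODELS = {
    "T2V-14B": "Wan-AI/Wan2.1-T2V-14B",
    "I2V-14B-720P": "Wan-AI/Wan2.1-I2V-14B-720P",
    "I2V-14B-480P": "Wan-AI/Wan2.1-I2V-14B-480P",
    "T2V-1.3B": "Wan-AI/Wan2.1-T2V-1.3B",
}


def local_dir_for(model: str) -> str:
    """Map a model key to a local directory such as ./Wan2.1-T2V-14B."""
    repo_id = MODELS[model]
    return "./" + repo_id.split("/", 1)[1]


def download(model: str) -> str:
    """Download one checkpoint and return the local path (needs network access)."""
    # Lazy import: only required when a download is actually requested.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=MODELS[model], local_dir=local_dir_for(model))
```

Calling `download("T2V-1.3B")` would fetch the smallest checkpoint into `./Wan2.1-T2V-1.3B`, which is the directory the single-card example expects.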

04 Text-to-video (T2V)

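Both the 1.3B and 14B models are supported. Per the model table, the 14B model runs at 480P or 720P, while the 1.3B model supports 480P.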

Single-card inference:

python generate.py  \
    --task t2v-1.3B \
    --size 832*480 \
    --ckpt_dir ./Wan2.1-T2V-1.3B \
    --sample_shift 8 \
    --sample_guide_scale 6 \
    --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

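You can supply your own prompt to generate a custom 480P video. Lowering sample_guide_scale tends to improve visual quality, while raising it strengthens how closely the video matches the text.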
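To compare the effect of sample_guide_scale, one option is to generate the same prompt at several values. The sketch below only builds and prints the command lines; it reuses the flags from the single-card command above, while the helper name `t2v_command` and the extra scale values 4 and 8 are assumptions chosen for illustration, not recommended settings.

```python
import shlex

PROMPT = (
    "Two anthropomorphic cats in comfy boxing gear and bright gloves "
    "fight intensely on a spotlighted stage."
)


def t2v_command(guide_scale: float, prompt: str = PROMPT) -> list[str]:
    """Build the single-card T2V command for one guidance scale."""
    return [
        "python", "generate.py",
        "--task", "t2v-1.3B",
        "--size", "832*480",
        "--ckpt_dir", "./Wan2.1-T2V-1.3B",
        "--sample_shift", "8",
        "--sample_guide_scale", str(guide_scale),
        "--prompt", prompt,
    ]


if __name__ == "__main__":
    # 6 is the value used in the article; 4 and 8 are hypothetical neighbours.
    for scale in (4, 6, 8):
        print(shlex.join(t2v_command(scale)))
```

Each printed line can be pasted into a shell, or the argument list can be passed to `subprocess.run` to launch the runs one after another.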

Multi-card acceleration:

msrun --worker_num=4 --local_worker_num=4 generate.py \
    --task t2v-14B \
    --size 1280*720 \
    --ckpt_dir ./Wan2.1-T2V-14B \
    --dit_zero3 --t5_zero3 --ulysses_sp \
    --prompt "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."

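Sequence parallelism and optimizer parallelism can be enabled to speed up 720P video generation; in the command above these correspond to --ulysses_sp and to --dit_zero3 / --t5_zero3.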
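A rough token count helps show why splitting the sequence across cards matters at 720P. The sketch below is a back-of-the-envelope estimate under assumptions that this article does not state: 81 frames per clip, a VAE that compresses by 4x in time and 8x in each spatial dimension, and a 1x2x2 patch size in the diffusion transformer. The function names are hypothetical.

```python
def latent_token_count(width, height, frames=81,
                       vae_stride=(4, 8, 8), patch=(1, 2, 2)):
    """Estimate the transformer sequence length for one video clip."""
    # Assumed VAE compression: the first frame is kept, the rest grouped by 4.
    t = (frames - 1) // vae_stride[0] + 1
    h = height // vae_stride[1]
    w = width // vae_stride[2]
    # Assumed patchification of the latent video into tokens.
    return (t // patch[0]) * (h // patch[1]) * (w // patch[2])


def tokens_per_worker(total_tokens, workers):
    """Tokens each card holds when the sequence is split evenly (rounded up)."""
    return -(-total_tokens // workers)


if __name__ == "__main__":
    for width, height in ((832, 480), (1280, 720)):
        total = latent_token_count(width, height)
        print(f"{width}*{height}: {total} tokens, "
              f"{tokens_per_worker(total, 4)} per card on 4 cards")
```

Under these assumptions a 720P clip has more than twice as many tokens as a 480P clip, and since attention cost grows faster than linearly with sequence length, sharding the sequence across four cards cuts the per-card workload substantially.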

05 Image-to-video (I2V)

The 14B model is supported, at either 480P or 720P.

Single-card inference:

python generate.py \
    --task i2v-14B \
    --size 832*480 \
    --ckpt_dir ./Wan2.1-I2V-14B-480P \
    --image examples/i2v_input.JPG \
    --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."

Multi-card acceleration:

msrun --worker_num=2 --local_worker_num=2 generate.py \
    --task i2v-14B --size 1280*720 \
    --ckpt_dir ./Wan2.1-I2V-14B-720P \
    --dit_zero3 --t5_zero3 --ulysses_sp \
    --image examples/i2v_input.JPG \
    --prompt "Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside."
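Because I2V ships a separate checkpoint for each resolution, it is easy to pair a size with the wrong directory. The sketch below is a hypothetical helper (`i2v_msrun_command` is not part of the repository) that picks the checkpoint from the size and assembles the multi-card command using only the flags shown above; the directory names assume the checkpoints were downloaded next to `generate.py`.

```python
# Resolution-specific I2V checkpoints, as listed in the model table.
I2V_CHECKPOINTS = {
    "832*480": "./Wan2.1-I2V-14B-480P",
    "1280*720": "./Wan2.1-I2V-14B-720P",
}


def i2v_msrun_command(size, workers, image, prompt):
    """Build an msrun command for multi-card I2V on a single machine."""
    if size not in I2V_CHECKPOINTS:
        raise ValueError(f"unsupported size {size!r}; expected one of {sorted(I2V_CHECKPOINTS)}")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return [
        "msrun", f"--worker_num={workers}", f"--local_worker_num={workers}",
        "generate.py",
        "--task", "i2v-14B",
        "--size", size,
        "--ckpt_dir", I2V_CHECKPOINTS[size],
        "--dit_zero3", "--t5_zero3", "--ulysses_sp",
        "--image", image,
        "--prompt", prompt,
    ]
```

For example, `i2v_msrun_command("1280*720", 2, "examples/i2v_input.JPG", "a white cat on a surfboard")` yields the argument list for the two-card 720P run, with the short prompt standing in for the full one.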

06 Measured performance: Ascend hardware acceleration improves efficiency

The performance test results are as follows: