[Motion Generation] MoMask: Generative Masked Modeling of 3D Human Motions


2023.11 CVPR 2024

Paper link
Code link


Abstract

We introduce MoMask, a novel masked modeling framework for text-driven 3D human motion generation. In MoMask, a hierarchical quantization scheme is employed to represent human motion as multi-layer discrete motion tokens with high-fidelity details. Starting at the base layer, with a sequence of motion tokens obtained by vector quantization, the residual tokens of increasing orders are derived and stored at the subsequent layers of the hierarchy. This is consequently followed by two distinct bidirectional transformers. For the base-layer motion tokens, a Masked Transformer is designated to predict randomly masked motion tokens conditioned on text input at training stage. During generation (i.e. inference) stage, starting from an empty sequence, our Masked Transformer iteratively fills up the missing tokens; Subsequently, a Residual Transformer learns to progressively predict the next-layer tokens based on the results from current layer. Extensive experiments demonstrate that MoMask outperforms the state-of-art methods on the text-to-motion generation task, with an FID of 0.045 (vs e.g. 0.141 of T2M-GPT) on the HumanML3D dataset, and 0.228 (vs 0.514) on KIT-ML, respectively. MoMask can also be seamlessly applied in related tasks without further model fine-tuning, such as text-guided temporal inpainting.

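The paper introduces MoMask, a new masked modeling framework for text-driven 3D human motion generation.

MoMask uses a hierarchical quantization scheme to represent human motion as multi-layer discrete motion tokens that preserve high-fidelity detail.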

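  • The base layer holds a sequence of motion tokens obtained by vector quantization; residual tokens of increasing order are then derived and stored in the subsequent layers of the hierarchy.

  • Two distinct bidirectional transformers follow. For the base-layer motion tokens, a Masked Transformer is trained to predict randomly masked motion tokens conditioned on the text input.

  • At generation (i.e. inference) time, the Masked Transformer starts from an empty sequence and iteratively fills in the missing tokens; a Residual Transformer then progressively predicts the next-layer tokens from the results of the current layer.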
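
To make the hierarchical quantization concrete, here is a minimal sketch of residual quantization in plain Python. It is not the authors' implementation: the codebooks are random toy values, and the names `nearest`, `residual_quantize`, and `reconstruct` are hypothetical. It only illustrates the idea that the base layer quantizes the latent vector and each later layer quantizes whatever the earlier layers left over.

```python
import random

def nearest(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    def dist(code):
        return sum((c - v) ** 2 for c, v in zip(code, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def residual_quantize(vec, codebooks):
    """Quantize vec layer by layer; each layer encodes what earlier layers missed."""
    residual = list(vec)
    tokens = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)  # one discrete token per layer
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens, residual

def reconstruct(tokens, codebooks):
    """Sum the selected code vectors from every layer."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out

# Toy setup (assumption): 3 layers, 16 codes per layer, 4-dim latents,
# with the code scale shrinking at deeper layers.
rng = random.Random(0)
codebooks = [
    [[rng.uniform(-s, s) for _ in range(4)] for _ in range(16)]
    for s in (1.0, 0.5, 0.25)
]
latent = [0.3, -0.7, 0.1, 0.9]
tokens, residual = residual_quantize(latent, codebooks)
approx = reconstruct(tokens, codebooks)
```

Here `tokens[0]` plays the role of the base-layer token and `tokens[1:]` the residual tokens. By construction, `approx` plus the final `residual` equals the original `latent`, so every extra layer accounts for part of what the previous layers could not represent.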

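Extensive experiments show that MoMask outperforms state-of-the-art methods on text-to-motion generation, with an FID of 0.045 on HumanML3D (vs. 0.141 for T2M-GPT) and 0.228 on KIT-ML (vs. 0.514 for T2M-GPT).

MoMask can also be applied to related tasks, such as text-guided temporal inpainting, without further model fine-tuning.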
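
The sketch below illustrates iterative filling of masked tokens, and one plausible reason inpainting needs no fine-tuning: known tokens are simply left unmasked at the start. Several parts are assumptions rather than details from the abstract: `toy_predictor` is a random stand-in for the text-conditioned Masked Transformer, and the confidence-based commit order and cosine schedule are borrowed from common masked-decoding practice.

```python
import math
import random

MASK = -1  # placeholder id for a token that has not been generated yet

def toy_predictor(tokens, rng, vocab_size=8):
    """Hypothetical stand-in for the Masked Transformer.

    Returns a (token, confidence) guess for every position.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]

def iterative_fill(tokens, steps, predictor, rng):
    """Fill MASK positions over `steps` rounds, most confident guesses first."""
    tokens = list(tokens)
    total_masked = sum(t == MASK for t in tokens)
    for step in range(1, steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        guesses = predictor(tokens, rng)
        # Cosine schedule (assumption): number of positions left masked after this round.
        keep_masked = int(total_masked * math.cos(math.pi / 2 * step / steps))
        n_commit = max(1, len(masked) - keep_masked)
        # Commit only the most confident guesses among the masked positions.
        masked.sort(key=lambda i: guesses[i][1], reverse=True)
        for i in masked[:n_commit]:
            tokens[i] = guesses[i][0]
    return tokens

rng = random.Random(0)

# Generation: start from a fully masked ("empty") sequence.
generated = iterative_fill([MASK] * 12, steps=4, predictor=toy_predictor, rng=rng)

# Inpainting: known tokens stay fixed, only the masked middle is filled.
known = [3, 1, 4, MASK, MASK, MASK, MASK, 2, 6, 5]
inpainted = iterative_fill(known, steps=3, predictor=toy_predictor, rng=rng)
```

Because the loop only ever writes to masked positions, the same routine serves both full generation and inpainting; only the initial mask differs.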
