Self-supervised Learning of Motion Capture: Reading Notes

菜鸟本尊

Tags: pose estimation, keypoint detection, motion capture

Article link: https://blog.csdn.net/qq_40045309/article/details/89817787

Notes:

1. Authors

Hsiao-Yu Fish Tung, Katerina Fragkiadaki (Carnegie Mellon University)

I. Overview

1. Abstract

(1) Rather than directly optimizing mesh and skeleton parameters, we optimize the weights of a network that predicts the 3D shape and skeleton configuration from a monocular RGB video;

(2) The model uses an end-to-end framework;

(3) Training combines strong supervision from synthetic data with self-supervision from differentiable rendering of skeleton keypoints, dense 3D mesh motion, and human-background segmentation;

(4) It combines supervised learning with test-time optimization: supervised learning initializes the model parameters in the right regime, ensuring good pose and surface initialization at test time;

(5) Advantage: self-supervision by backpropagating through differentiable rendering allows (unsupervised) adaptation of the model to the test data, and offers a much tighter fit than a pretrained fixed model.
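To make points (1), (4), and (5) concrete, here is a minimal sketch of test-time adaptation, not the paper's implementation. Everything in it is a toy assumption: the "network" is a two-parameter linear map from a scalar image feature to a joint angle, the "differentiable renderer" is a single limb of fixed length `LIMB_LENGTH` rotated in the image plane, and the gradient is written by hand instead of using an autodiff framework. What it illustrates is that the update acts on the shared weights `w, b` (starting from a stand-in "pretrained" initialization), not on each frame's pose directly.

```python
import math

LIMB_LENGTH = 1.0  # hypothetical fixed bone length for the toy renderer


def render_keypoint(theta):
    """Toy differentiable 'renderer': 2D endpoint of a limb rotated by theta."""
    return (LIMB_LENGTH * math.cos(theta), LIMB_LENGTH * math.sin(theta))


def reprojection_loss(w, b, features, keypoints_2d):
    """Mean squared distance between rendered and detected 2D keypoints."""
    total = 0.0
    for x, (kx, ky) in zip(features, keypoints_2d):
        u, v = render_keypoint(w * x + b)  # network output -> rendered keypoint
        total += (u - kx) ** 2 + (v - ky) ** 2
    return total / len(features)


def adapt_weights(w, b, features, keypoints_2d, lr=0.2, steps=500):
    """Gradient descent on the network weights (w, b), not on per-frame poses."""
    n = len(features)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, (kx, ky) in zip(features, keypoints_2d):
            theta = w * x + b
            u, v = render_keypoint(theta)
            # Chain rule through the renderer: dLoss/dtheta for this frame.
            dloss_dtheta = 2.0 * (
                (u - kx) * (-LIMB_LENGTH * math.sin(theta))
                + (v - ky) * (LIMB_LENGTH * math.cos(theta))
            )
            grad_w += dloss_dtheta * x / n  # dtheta/dw = x
            grad_b += dloss_dtheta / n      # dtheta/db = 1
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy "test video": four frames with scalar features and detected 2D keypoints.
features = [0.0, 0.3, 0.6, 0.9]
detected = [render_keypoint(1.0 * x + 0.2) for x in features]

w0, b0 = 0.8, 0.0  # stands in for the supervised (pretrained) initialization
loss_before = reprojection_loss(w0, b0, features, detected)
w1, b1 = adapt_weights(w0, b0, features, detected)
loss_after = reprojection_loss(w1, b1, features, detected)
```

In this toy setting `loss_after` ends up far below `loss_before`, which mirrors the "tighter fit than a pretrained fixed model" claim only in spirit; the real model predicts SMPL parameters and uses richer losses.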

2. Applications

Understanding humans and their motion from a single view, outside controlled lab settings, is important. Application scenarios include:

automated gym and dancing teachers, rehabilitation guidance, patient monitoring, and safer human-robot interaction;

character motion capture (MOCAP) and retargeting in the film industry (which still require tedious labor from artists to achieve the desired accuracy, or the use of expensive multi-camera setups and green-screen backgrounds).

II. Network Architecture

1. Main Idea

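(1) We propose a network model for motion capture from monocular video that learns to map an image sequence to the corresponding sequence of 3D meshes;

(2) Strong supervision comes from synthetically rendered models; self-supervision comes from rendering the 3D predictions into 2D and matching them against the corresponding 2D detections in real monocular videos;

(3) Self-supervision uses 2D body joint detections, 2D figure-ground segmentation, and 2D optical flow; moreover, 2D body joint annotations are easier to obtain, and optical flow generalizes readily from synthetic to real data;

(4) Unlike previous optimization-based motion capture work, we use differentiable warping and differentiable camera projection for the optical flow and segmentation losses; together these make end-to-end learning with backpropagation possible;

(5) We use SMPL as the dense 3D human mesh model; our task is to reverse-engineer the rendering process and predict the SMPL parameters;

(6) Given 3D mesh predictions for two consecutive frames, we differentiably project the 3D motion vectors of the mesh vertices and match them against the estimated 2D optical flow vectors; differentiable motion rendering and matching requires vertex visibility estimation, which we perform using ray casting integrated with our neural model for code acceleration (see the figure below); similarly, in each frame, 3D keypoints are projected and their distances to corresponding d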
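The keypoint and motion matching described in (6) can be sketched as below. This is an illustration under stated assumptions, not the paper's code: the camera is assumed to be a simple pinhole with focal length `focal`, the distance is assumed to be Euclidean, the optical flow is assumed to be already sampled at each vertex's pixel, and the `visible` mask is supplied by hand instead of being computed by ray casting. All function names are hypothetical.

```python
import math


def project(point_3d, focal=1.0):
    """Pinhole projection (assumed camera model) of a 3D point to 2D."""
    x, y, z = point_3d
    return (focal * x / z, focal * y / z)


def keypoint_loss(joints_3d, detected_2d, focal=1.0):
    """Mean Euclidean distance between projected 3D joints and 2D detections."""
    total = 0.0
    for joint, (dx, dy) in zip(joints_3d, detected_2d):
        u, v = project(joint, focal)
        total += math.hypot(u - dx, v - dy)
    return total / len(joints_3d)


def motion_loss(verts_t, verts_t1, flow_2d, visible, focal=1.0):
    """Mean distance between projected vertex motion and 2D optical flow.

    Only vertices flagged as visible contribute; hidden vertices have no
    reliable flow measurement to compare against.
    """
    total = 0.0
    count = 0
    for p0, p1, (fx, fy), vis in zip(verts_t, verts_t1, flow_2d, visible):
        if not vis:
            continue
        u0, v0 = project(p0, focal)
        u1, v1 = project(p1, focal)
        # Projected 2D motion of this vertex between the two frames.
        total += math.hypot((u1 - u0) - fx, (v1 - v0) - fy)
        count += 1
    return total / count if count else 0.0


# Two joints; the second detection is off by 0.3 in the image plane.
joints = [(0.0, 0.0, 2.0), (2.0, 0.0, 2.0)]
detections = [(0.0, 0.0), (1.0, 0.3)]
kp = keypoint_loss(joints, detections)

# Three vertices over two frames; the third is occluded, so its
# (deliberately wrong) flow value is ignored by the visibility mask.
verts_t = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 4.0)]
verts_t1 = [(0.2, 0.0, 2.0), (1.0, 0.4, 2.0), (0.0, 1.0, 4.0)]
flow = [(0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
visible = [True, True, False]
mo = motion_loss(verts_t, verts_t1, flow, visible)
```

Both functions are built from smooth operations on the 3D inputs, which is what lets an autodiff framework backpropagate these losses to the mesh parameters and, from there, to the network weights.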
