DensePose: Dense Human Pose Estimation In The Wild

qq_36356761

Category: paper reading notes

Article link: https://blog.csdn.net/qq_36356761/article/details/80794932

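DensePose is a technique for dense human pose estimation in the wild: it establishes dense correspondences between an RGB image and a surface-based representation of the human body. The paper introduces DensePose-COCO, a large-scale dataset with manual annotations on 50K COCO images, and DensePose-RCNN, a method that regresses part-specific UV coordinates at multiple frames per second. Manual annotation is hard, but a new annotation scheme and CNN systems trained on the resulting data yield accurate correspondences despite cluttered backgrounds, occlusions and scale variations. The region-based DensePose-RCNN combines the strengths of DenseReg and Mask R-CNN to improve the accuracy of dense correspondence prediction.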

Rıza Alp Güler, Natalia Neverova, Iasonas Kokkinos
DensePose-COCO: a large-scale ground-truth dataset with image-to-surface correspondences manually annotated on 50K COCO images (it is easy to imagine how hard this data was to annotate)
DensePose-RCNN: densely regress part-specific UV coordinates within every human region at multiple frames per second (still using something as primitive as regression? And R-CNN at that?)
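
To make "part-specific UV coordinates" concrete, here is a minimal NumPy sketch of how per-part outputs could be merged into one prediction per pixel: a classifier picks the body part, and the U and V values are read from that part's own regression channel. The function name, the array layout and the convention that index 0 means background are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def select_part_uv(part_logits, u_maps, v_maps):
    """Merge per-part outputs into one (part, U, V) prediction per pixel.

    part_logits: (P + 1, H, W) scores; channel 0 is background (assumed layout).
    u_maps, v_maps: (P, H, W) regressed coordinates, one channel per body part.
    """
    # Most likely label per pixel: 0 = background, 1..P = body parts.
    part = part_logits.argmax(axis=0)
    # Channel to read for each pixel; background pixels read channel 0
    # temporarily and are zeroed out below.
    idx = np.clip(part - 1, 0, None)[None]
    u = np.take_along_axis(u_maps, idx, axis=0)[0]
    v = np.take_along_axis(v_maps, idx, axis=0)[0]
    foreground = part > 0
    return part, np.where(foreground, u, 0.0), np.where(foreground, v, 0.0)
```

Each pixel therefore carries three values: a part index plus a (U, V) location inside that part's chart.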

Abstract

dense human pose estimation: dense correspondences between an RGB image and a surface-based representation of the human body
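Annotating even a single image is clearly hard, so the authors introduce an efficient annotation method
in the wild: in the presence of background, occlusions and scale variations; such data is even harder to annotate, let alone predict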

Introduction

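2D image understanding and 3D reconstruction are closely related
Builds on DenseReg: a CNN regresses point correspondences between a 3D model and an RGB image. The problem here is harder than in DenseReg, though, because in the wild human poses vary much more drastically.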
contributions:
1. introduce the first manually-collected ground truth dataset for the task, by gathering dense correspondences between the SMPL model and persons appearing in the COCO dataset
2. use the resulting dataset to train CNN-based systems that deliver dense correspondence ‘in the wild’, by regressing body surface coordinates at any image pixel, observing a superiority of region-based models over fully-convolutional networks
3. use sparse correspondences defined over a randomly chosen subset of image pixels per training sample to ‘inpaint’ the supervision signal in the rest of the image domain

COCO-DensePose Dataset

Head, Torso, Lower/Upper Arms, Lower/Upper Legs, Hands and Feet
head, hands and feet: use the manually obtained UV fields provided in the SMPL model
rest of the parts: obtain the unwrapping via multidimensional scaling applied to pairwise geodesic distances
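
As a rough illustration of the unwrapping step, the sketch below runs classical (Torgerson) multidimensional scaling on a pairwise distance matrix and rescales the result to the unit square. The notes do not say which MDS variant the paper uses, so the choice of classical MDS, both function names and the rescaling step are assumptions; for a real body part the input would be geodesic distances between the vertices of that part's mesh.

```python
import numpy as np

def classical_mds(dist, dim=2):
    """Embed n points in `dim` dimensions from an (n, n) distance matrix."""
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # The top eigenpairs of the Gram matrix give the embedding coordinates.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

def to_unit_square(xy):
    """Rescale each coordinate to [0, 1] so it can serve as a (U, V) chart."""
    return (xy - xy.min(axis=0)) / (xy.max(axis=0) - xy.min(axis=0))
```

MDS tries to preserve pairwise distances in the plane, so points that are close on the surface stay close in the resulting UV chart.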

Accuracy of human annotators

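Human annotations contain errors too. According to the paper, the errors are small on small parts with distinctive features (face, hands, feet) and larger on big uniform areas that are usually covered by clothes (torso, back, hips)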

Evaluation Measures

Pointwise evaluation: evaluates correspondence accuracy over the whole image domain through the Ratio of Correct Point (RCP) correspondences (a correspondence is declared correct if the geodesic distance is below a certain threshold). The AUC (Area Under the Curve) is then computed as the threshold $t$ varies
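
The pointwise measure can be sketched in a few lines of NumPy. The code assumes that a geodesic error has already been computed for every evaluated point; the function names, the trapezoidal integration and the normalization of the area by the threshold range are illustrative assumptions rather than the paper's exact protocol.

```python
import numpy as np

def rcp_curve(geodesic_errors, thresholds):
    """Ratio of Correct Points: share of errors strictly below each threshold."""
    errors = np.asarray(geodesic_errors, dtype=float)
    return np.array([(errors < t).mean() for t in thresholds])

def normalized_auc(thresholds, ratios):
    """Area under the RCP curve (trapezoidal rule), divided by the threshold range."""
    t = np.asarray(thresholds, dtype=float)
    r = np.asarray(ratios, dtype=float)
    area = np.sum(0.5 * (r[1:] + r[:-1]) * np.diff(t))
    return area / (t[-1] - t[0])

# Toy example with made-up errors and thresholds (same unit, e.g. centimeters).
errors = [1.0, 5.0, 12.0, 40.0]
thresholds = [0.0, 10.0, 20.0, 30.0]
ratios = rcp_curve(errors, thresholds)
auc = normalized_auc(thresholds, ratios)
```

With this normalization, an AUC of 1 would mean every point is already correct at the smallest threshold.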