1. By Object
1.1 Human Body
- Learning to Estimate 3D Human Pose and Shape From a Single Color Image
- Recognizing Human Actions as the Evolution of Pose Estimation Maps
- Human Pose Estimation With Parsing Induced Learner
- Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes - The Importance of Multiple Scene Constraints
- Jointly Optimize Data Augmentation and Network Training: Adversarial Data Augmentation in Human Pose Estimation
- V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation From a Single Depth Map
- PoseTrack: A Benchmark for Human Pose Estimation and Tracking
- Cascaded Pyramid Network for Multi-Person Pose Estimation
- Ordinal Depth Supervision for 3D Human Pose Estimation
- Through-Wall Human Pose Estimation Using Radio Signals
- Learning Monocular 3D Human Pose Estimation From Multi-View Images
1.2 Hands
- First-Person Hand Action Benchmark With RGB-D Videos and 3D Hand Pose Annotations
- Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals
- Dense 3D Regression for Hand Pose Estimation
- Gesture Recognition: Focus on the Hands
- Hand PointNet: 3D Hand Pose Estimation Using Point Sets
- Cross-Modal Deep Variational Hand Pose Estimation
- Augmented Skeleton Space Transfer for Depth-Based Hand Pose Estimation
- GANerated Hands for Real-Time 3D Hand Tracking From Monocular RGB
1.3 Others
- Detect-and-Track: Efficient Pose Estimation in Videos
- Feature Mapping for Learning Fast and Accurate 3D Pose Inference From Synthetic Images
- DensePose: Dense Human Pose Estimation in the Wild
- 3D Human Pose Estimation in the Wild by Adversarial Learning
- 3D Pose Estimation and 3D Model Retrieval for Objects in the Wild
- RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews From Unsupervised Viewpoints
- 2D/3D Pose Estimation and Action Recognition Using Multitask Deep Learning
- Learning Pose Specific Representations by Predicting Different Views
- Real-Time Seamless Single Shot 6D Object Pose Prediction
- Multi-View Consistency as Supervisory Signal for Learning Shape and Pose Prediction
2. By Task
2.1 Pose Estimation:
- DensePose: Dense Human Pose Estimation in the Wild
- Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies
- Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer
- Synthesizing Images of Humans in Unseen Poses
- Cascaded Pyramid Network for Multi-Person Pose Estimation
2.2 Video Classification/Action Recognition:
- Non-local Neural Networks
- Appearance-and-Relation Networks for Video Classification
- Optical Flow Guided Feature: A Fast and Robust Motion Representation for Video Action Recognition
- Learning to Localize Sound Source in Visual Scenes
- Towards Universal Representation for Unseen Action Recognition
- Non-Linear Temporal Subspace Representations for Activity Recognition
- Fine-grained Activity Recognition in Baseball Videos (workshop)
- Learning Latent Super-Events to Detect Multiple Activities in Videos
2.3 Video Understanding:
- What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets (spotlight, Fei-Fei Li's group)
- What have we learned from deep representations for action recognition?
- A Closer Look at Spatiotemporal Convolutions for Action Recognition
- Rethinking Spatiotemporal Feature Learning for Video Understanding
- On the Integration of Optical Flow and Action Recognition (recommended)
- End-to-End Learning of Motion Representation for Video Understanding (Tencent AI Lab)
- Guess Where? Actor-Supervision for Spatiotemporal Action Localization
- A Unifying Contrast Maximization Framework for Event Cameras, with Applications to Motion, Depth, and Optical Flow Estimation
- Video Representation Learning Using Discriminative Pooling
- Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
- Fast End-to-End Trainable Guided Filter
- Density-aware Single Image De-raining using a Multi-stream Dense Network
3. Related Work
3.1 Synthesis
- Multistage Adversarial Losses for Pose-Based Human Image Synthesis
- Synthesizing Images of Humans in Unseen Poses
- Unsupervised Person Image Synthesis in Arbitrary Poses
- End-to-End Recovery of Human Shape and Pose
- Deformable GANs for Pose-Based Human Image Generation
3.2 Camera
- GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose
- Hybrid Camera Pose Estimation
- Camera Pose Estimation With Unknown Principal Point
3.3 Face
- Super-FAN: Integrated Facial Landmark Localization and Super-Resolution of Real-World Low Resolution Faces in Arbitrary Poses With GANs
- Disentangling 3D Pose in a Dendritic CNN for Unconstrained 2D Face Alignment
- Joint Pose and Expression Modeling for Facial Expression Recognition
- Towards Pose Invariant Face Recognition in the Wild
- Pose-Robust Face Recognition via Deep Residual Equivariant Mapping
- UV-GAN: Adversarial Facial UV Map Completion for Pose-Invariant Face Recognition
- Pose-Guided Photorealistic Face Rotation
- Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies
3.4 Others
- Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer
- A Certifiably Globally Optimal Solution to the Non-Minimal Relative Pose Problem
- Fight Ill-Posedness With Ill-Posedness: Single-Shot Variational Depth Super-Resolution From Shading
- Factoring Shape, Pose, and Layout From the 2D Image of a 3D Scene
- A Pose-Sensitive Embedding for Person Re-Identification With Expanded Cross Neighborhood Re-Ranking
- Improving Occlusion and Hard Negative Handling for Single-Stage Pedestrian Detectors
- End-to-End Learning of Keypoint Detector and Descriptor for Pose Invariant 3D Matching
- Non-Blind Deblurring: Handling Kernel Uncertainty With CNNs
- Pose Transferrable Person Re-Identification
- LSTM Pose Machines
- MX-LSTM: Mixing Tracklets and Vislets to Jointly Forecast Trajectories and Head Poses
- PoseFlow: A Deep Motion Representation for Understanding Human Behaviors in Videos
- PoTion: Pose MoTion Representation for Action Recognition
- Analysis of Hand Segmentation in the Wild
References
Author: 王弗兰克 https://zhuanlan.zhihu.com/p/34604585
Author: 李光睿 https://zhuanlan.zhihu.com/cvpr2018
Author: 旷视科技 https://zhuanlan.zhihu.com/p/37582402
Author: 梦寐mayshine http://www.cvmart.net/community/article/detail/286?from=groupmessage&appinstall=0