Broad applications of 3D data
- Robotics
- Autonomous driving
- Augmented Reality
- Medical Image Processing
3D deep learning tasks
- 3D geometry analysis: classification, parsing (object/scene), correspondence (matching corresponding parts of similar 3D objects)
- 3D-assisted image analysis: cross-view image retrieval (retrieving 3D models given an image), intrinsic decomposition
- 3D synthesis: monocular 3D reconstruction (from a single camera), shape completion (filling in missing parts), shape modeling (under other constraints)
3D data has many representations:
- multi-view RGB(D) images: photos of one object taken from different viewpoints
- volumetric (common in medical imaging)
- polygonal mesh
- point cloud
- primitive-based CAD models (used in modeling)
These fall into two main categories:
- Rasterized form (regular grids): RGB(D) images, volumetric
- Geometric form (irregular): polygonal mesh, point cloud, primitive-based CAD models
Rasterized form (regular grids): can directly apply CNNs, but this comes with other problems
Geometric form (irregular): cannot directly apply CNNs; new network architectures must be designed
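To make the regular vs. irregular distinction concrete, here is a minimal NumPy sketch. It converts a point cloud (an unordered set of points) into a binary occupancy grid (a regular structure that a 3D CNN could consume). The function name `voxelize`, the resolution of 8, and the random test cloud are assumptions made for illustration; they are not from the notes.

```python
import numpy as np

def voxelize(points, resolution=8):
    """Convert an (N, 3) point cloud into a binary occupancy grid (hypothetical helper)."""
    points = np.asarray(points, dtype=np.float64)
    mins = points.min(axis=0)
    extent = points.max(axis=0) - mins
    extent[extent == 0] = 1.0  # avoid division by zero for flat clouds
    # Normalize into [0, 1], scale to voxel indices, and clamp the upper boundary.
    idx = np.floor((points - mins) / extent * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True  # mark occupied cells
    return grid

rng = np.random.default_rng(0)
cloud = rng.random((1000, 3))      # irregular form: rows have no canonical order
shuffled = rng.permutation(cloud)  # same shape, different row order

grid_a = voxelize(cloud)           # regular form: fixed (8, 8, 8) layout
grid_b = voxelize(shuffled)
```

Reordering the rows changes the point cloud array but not the shape it describes, so both calls yield the same grid. A network that reads the raw point array directly would have to handle that ordering ambiguity itself, which is one reason irregular forms need their own architectures.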
Part I: Deep learning on regular structures
Multi-view representation & Volumetric representation
Deep learning on multi-view representation
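- classification: assume cameras at multiple viewpoints photograph the object; the multi-view images are fed into a CNN, and the per-view features are then aggregated by pooling (or by a second CNN) for classification. Representative work: MVCNN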
- segmentation
- reconstruction
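The multi-view classification pipeline in the first bullet above can be sketched roughly as follows. This is a NumPy toy under stated assumptions, not the MVCNN implementation: the per-view "CNN" is replaced by a stand-in (one shared linear projection plus ReLU), the aggregation is an element-wise max across views, and all names and sizes (`view_features`, `mvcnn_forward`, 12 views, 10 classes) are hypothetical.

```python
import numpy as np

def view_features(views, w_feat):
    """Stand-in for the shared per-view CNN: flatten, linear projection, ReLU."""
    flat = views.reshape(views.shape[0], -1)  # (V, H*W)
    return np.maximum(flat @ w_feat, 0.0)     # (V, D), same weights for every view

def mvcnn_forward(views, w_feat, w_cls):
    """Per-view features -> pooling across views -> class scores."""
    feats = view_features(views, w_feat)      # (V, D)
    pooled = feats.max(axis=0)                # (D,), element-wise max over views
    return pooled @ w_cls                     # (C,) class logits

rng = np.random.default_rng(0)
V, H, W, D, C = 12, 16, 16, 32, 10            # illustrative sizes only
views = rng.random((V, H, W))                 # V rendered views of one object
w_feat = rng.standard_normal((H * W, D)) * 0.1
w_cls = rng.standard_normal((D, C)) * 0.1

logits = mvcnn_forward(views, w_feat, w_cls)
pred = int(np.argmax(logits))                 # predicted class index
logits_reordered = mvcnn_forward(views[::-1], w_feat, w_cls)
```

Because the pooling step takes a maximum over the view axis, the output does not depend on the order in which the views are supplied. A real system would use a trained image CNN in place of the linear stand-in.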