object shapes decoded from latent code
Foreground objects, background features, and camera poses
monocular, stereo, stereo+LiDAR
stereo+LiDAR:
- incorporates a sparse set of LiDAR measurements (as few as 50 per object) for object reconstruction and pose-only optimization.
- specifically, 50 3D points per object are used to obtain accurate shape estimates
for object shape and pose estimation: improves quantitatively and qualitatively over auto-labelling
stereo+LiDAR: state of the art
monocular: achieves promising qualitative reconstruction results
Comparison with other methods:
- FroDO: a batch method
- Node-SLAM: features and objects are not optimized jointly
- DeepSLAM++: forward shape generation is unstable
Object representation:
Each object is represented as a compact and optimizable code vector z
Method:
Employs DeepSDF [25] as the shape embedding: it takes a shape code z (64 dims) and a 3D query location x as input, and outputs the signed distance function (SDF) value s = G(x, z) at the given point.
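To make the interface s = G(x, z) concrete, here is a minimal sketch of a decoder with that signature. It is not the DeepSDF network: the layer sizes, the random weights, and the names `make_decoder` and `CODE_DIM` are assumptions for illustration only. The one detail taken from the notes is the 64-dimensional shape code concatenated with a 3D query point.

```python
import math
import random

CODE_DIM = 64  # shape code size stated in the notes


def make_decoder(hidden=32, seed=0):
    """Build a toy MLP G(x, z) with random weights (NOT trained DeepSDF weights)."""
    rng = random.Random(seed)
    in_dim = CODE_DIM + 3  # shape code concatenated with the 3D query point
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [rng.gauss(0.0, 0.1) for _ in range(hidden)]

    def G(x, z):
        assert len(x) == 3 and len(z) == CODE_DIM
        inp = list(z) + list(x)
        # One hidden layer with ReLU activation.
        h = [max(0.0, sum(w * v for w, v in zip(row, inp))) for row in w1]
        # tanh keeps the predicted signed distance bounded in [-1, 1].
        return math.tanh(sum(w * v for w, v in zip(w2, h)))

    return G


G = make_decoder()
z = [0.0] * CODE_DIM          # a shape code; the all-zero vector is just a placeholder
s = G((0.1, 0.2, 0.3), z)     # signed distance at one query location
```

Because the shape is just an input vector to G, refining an object's shape amounts to adjusting z while the decoder weights stay fixed.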
- Detections:
  - At each keyframe, a 2D bounding box and a segmentation mask are estimated; the initial estimate for the object pose comes from 3D bounding box detection.
- Data association:
  - New detections are either associated with existing map objects,
  - or instantiated as a new object via object-level data association.
  - An object instance consists of a 2D bounding box B, a 2D mask M, the depth observation of sparse 3D point cloud D, and the initial object pose Tco,0.
- Prior-based object reconstruction:
  - For a new instance, taking the 3D points as input, the method optimises the shape code and object pose to minimise the surface consistency and depth rendering losses.
  - For instances that already exist, only their 6-DoF pose is optimised.
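The per-detection flow above can be sketched as a small data structure plus a dispatch step. This is a hedged illustration, not the authors' code: the class names `Detection` and `MapObject`, the function `process_detection`, and the `matched` argument (standing in for the result of object-level data association, which the notes do not detail) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Detection:
    bbox: Tuple[float, float, float, float]      # 2D bounding box B
    mask: List[List[bool]]                       # 2D mask M
    points: List[Tuple[float, float, float]]     # sparse 3D point cloud D
    T_co_init: List[List[float]]                 # initial object pose T_co,0 (4x4)


@dataclass
class MapObject:
    object_id: int
    T_co: List[List[float]]
    shape_code: List[float] = field(default_factory=lambda: [0.0] * 64)


def process_detection(det: Detection, matched: Optional[MapObject], next_id: int):
    """Return the map object and the variables to optimise for this detection."""
    if matched is None:
        # New instance: optimise both the shape code and the object pose.
        return MapObject(next_id, det.T_co_init), ("shape_code", "pose")
    # Existing instance: pose-only optimisation.
    return matched, ("pose",)


identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
det = Detection((10.0, 20.0, 50.0, 80.0), [[True]], [(0.0, 0.0, 5.0)], identity)
new_obj, new_vars = process_detection(det, None, next_id=7)
same_obj, old_vars = process_detection(det, new_obj, next_id=8)
```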
Object Reconstruction with Shape Priors
Surface Consistency Term
![image-20211214151709395](https://i-blog.csdnimg.cn/blog_migrate/6da1eaf8dac88daf13595bca0df31718.png)
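The equation for this term is only available in the figure above, so the following is a sketch under an assumption: that the term is the mean squared SDF value at the observed 3D points after transforming them into the object frame. Points lying on the reconstructed surface then contribute zero. An analytic sphere SDF stands in for the learned decoder G(x, z); `sphere_sdf`, `transform`, and `surface_consistency` are illustrative names.

```python
import math


def sphere_sdf(x, radius=1.0):
    """Stand-in for G(x, z): signed distance to a sphere centred at the origin."""
    return math.sqrt(sum(c * c for c in x)) - radius


def transform(T, p):
    """Apply a 4x4 rigid transform (nested lists, row-major) to a 3D point."""
    return tuple(sum(T[r][c] * p[c] for c in range(3)) + T[r][3] for r in range(3))


def surface_consistency(points_cam, T_oc, sdf):
    """Mean squared SDF of camera-frame points mapped into the object frame."""
    residuals = [sdf(transform(T_oc, p)) for p in points_cam]
    return sum(r * r for r in residuals) / len(residuals)


# Points sampled on a unit sphere centred at (0, 0, 5) in the camera frame.
points = [(1.0, 0.0, 5.0), (0.0, 1.0, 5.0), (0.0, 0.0, 4.0), (0.0, 0.0, 6.0)]
good_pose = [[1, 0, 0, 0.0], [0, 1, 0, 0.0], [0, 0, 1, -5.0], [0, 0, 0, 1]]
bad_pose = [[1, 0, 0, 0.0], [0, 1, 0, 0.0], [0, 0, 1, 0.0], [0, 0, 0, 1]]

loss_good = surface_consistency(points, good_pose, sphere_sdf)
loss_bad = surface_consistency(points, bad_pose, sphere_sdf)
```

With the correct pose the points land on the zero level set and the loss vanishes, while the wrong pose leaves a large residual. That gap is what drives the pose and shape optimisation.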
Differentiable SDF Renderer
- Computing occupancy probabilities
![image-20211214161656140](https://i-blog.csdnimg.cn/blog_migrate/baf26ad4b30c1b6302cc29123cc53024.png)
- Computing event probabilities
![image-20211214161754432](https://i-blog.csdnimg.cn/blog_migrate/4b126040fe985f3360d76292666f7a37.png)
- Rendered Depth and Rendering Term
![image-20211214161821027](https://i-blog.csdnimg.cn/blog_migrate/a8a1d56dfa58fd592bbeb7d60ce8435e.png)
![image-20211214161837979](https://i-blog.csdnimg.cn/blog_migrate/aaa5be601e13527aaab379c05dc4a40b.png)
The rendering term is evaluated over the union of surface pixels and pixels that are not on the object surface but lie inside the 2D bounding box B.
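The three renderer steps can be sketched for a single ray as follows. The exact formulas live in the figures above, so this sketch rests on assumptions. The occupancy is taken to be a piecewise-linear function of the SDF value with cut-off `sigma` (1 well inside the object, 0 well outside). The event probability of a sample is its occupancy times the probability that all earlier samples were free. Rays that terminate nowhere are assigned a hypothetical `background_depth`. All function names are illustrative.

```python
def occupancy(s, sigma):
    """Map an SDF value to an occupancy probability in [0, 1] (assumed form)."""
    return min(1.0, max(0.0, 0.5 - s / (2.0 * sigma)))


def event_probabilities(occ):
    """Probability that the ray terminates at each sample, plus the miss probability."""
    probs, transmittance = [], 1.0
    for o in occ:
        probs.append(o * transmittance)   # occupied here, free at all earlier samples
        transmittance *= 1.0 - o
    return probs, transmittance           # transmittance left over = ray hits nothing


def render_depth(sdf_values, depths, sigma, background_depth):
    """Expected termination depth along one ray."""
    occ = [occupancy(s, sigma) for s in sdf_values]
    probs, p_miss = event_probabilities(occ)
    return sum(p * d for p, d in zip(probs, depths)) + p_miss * background_depth


def rendering_loss(rendered, observed):
    """Mean squared depth difference over the evaluated pixels."""
    return sum((r - o) ** 2 for r, o in zip(rendered, observed)) / len(rendered)


depths = [1.0, 1.1, 1.2, 1.3, 1.4]
hit = render_depth([0.3, 0.2, 0.1, -0.1, -0.2], depths, sigma=0.05, background_depth=2.0)
miss = render_depth([0.5] * 5, depths, sigma=0.05, background_depth=2.0)
loss = rendering_loss([hit, miss], [1.3, 2.0])
```

In this toy case the first ray terminates at the first sample inside the surface (depth 1.3), and the second ray never enters the object, so it falls back to the background depth. Every step is built from simple arithmetic on the SDF values, so the rendered depth stays differentiable with respect to them almost everywhere.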
Optimization detail
![image-20211214162403969](https://i-blog.csdnimg.cn/blog_migrate/3f436c069c6baff052377f8290773ecc.png)