Contents
[1] Avetisyan A, Khanova T, Choy C, et al. SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans[J]. 2020.
Fig. 1. Our method takes as input a 3D scan and a set of CAD models. We jointly detect
objects and layout elements in the scene. Each detected object or layout component
then forms a node in a graph neural network which estimates object-object relationships
and object-layout relationships. This holistic understanding of the scene results
in a lightweight CAD-based representation of the scene.
Abstract.
We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors. Our key idea is to jointly optimize for both CAD model alignments and layout estimations of the scanned scene, explicitly modeling object-to-object and object-to-layout inter-relationships. Since object arrangement and scene layout are intrinsically coupled, we show that treating the problem jointly significantly helps to produce globally consistent representations of a scene. Object CAD models are aligned to the scene by establishing dense correspondences between geometry, and we introduce a hierarchical layout prediction approach to estimate layout planes from corners and edges of the scene. To this end, we propose a message-passing graph neural network to model the inter-relationships between objects and layout, guiding generation of a globally consistent object alignment in a scene. By considering the global scene layout, we achieve significantly improved CAD alignments compared to state-of-the-art methods, improving alignment accuracy from 41.83% to 58.41% on SUNCG and from 50.05% to 61.24% on ScanNet. The resulting CAD-based representations make our method well-suited for applications in content creation such as augmented or virtual reality.
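To make the message-passing idea concrete, the following is a minimal sketch of one round of message passing over a small scene graph whose nodes are objects and layout elements. It is an illustration only, not the network from the paper: the node names, the two-dimensional features, the edge list, and the fixed mean-and-average update rule are all assumptions chosen for readability, whereas a real graph neural network would use learned update functions.

```python
# Toy scene graph. Node names, features, and the update rule are illustrative
# assumptions; they do not reproduce the architecture described in the paper.

def message_passing_round(features, edges):
    """Run one round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node name -> list of floats
    edges:    list of (node_a, node_b) pairs
    Each node's new feature is the average of its own feature and the mean of
    its neighbours' features. Isolated nodes keep their feature unchanged.
    """
    neighbours = {node: [] for node in features}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    updated = {}
    for node, feat in features.items():
        if not neighbours[node]:
            updated[node] = list(feat)
            continue
        # Mean of the incoming messages (here: the raw neighbour features).
        mean_msg = [
            sum(features[n][i] for n in neighbours[node]) / len(neighbours[node])
            for i in range(len(feat))
        ]
        updated[node] = [0.5 * (f + m) for f, m in zip(feat, mean_msg)]
    return updated


# Two objects and two layout elements; the wall touches nothing in this toy scene.
features = {
    "chair": [1.0, 0.0],
    "table": [0.0, 1.0],
    "floor": [0.0, 0.0],
    "wall": [1.0, 1.0],
}
edges = [("chair", "floor"), ("table", "floor"), ("chair", "table")]

updated = message_passing_round(features, edges)
print(updated["chair"])  # [0.5, 0.25]
```

After one round, each connected node's feature already mixes in information from the objects and layout elements it is linked to, which is the property that lets relationship predictions depend on scene context.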
Contributions
Conclusion
In this work we formulated a method to digitize 3D scans that goes beyond
focusing only on the objects in the scene. We propose a novel method that estimates the
layout of the scene by sequentially predicting corners, then edges, and finally
quads in a fully differentiable way. The estimated layout is used in conjunction
with an object detector to predict contact relationships between objects and
the layout and ultimately to predict a CAD arrangement of the scene. We
show that objects and their surroundings (the scene layout) go hand in hand and are
crucial to full scene digitization and scene understanding. Objects
in a scene are rarely arranged arbitrarily: for instance, cabinets often
lean against walls and a table is surrounded by chairs in a dining room; hence
we leverage the inherent coupling between objects and layout structure in the
learning process. Our approach improves global CAD alignment accuracy by
learning those patterns on both real and synthetic scans. We hope to
encourage further research in this direction, and see texturing the digitized
shapes as an immediate next step for future work, in order to enhance
the immersive experience in VR environments.
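The corner, edge, quad hierarchy mentioned above can be illustrated with a small combinatorial sketch. It only shows how candidates at each level can be built from the level below; it is not the differentiable pipeline from the paper. The function names, the threshold, and in particular `toy_edge_score` (a hand-written stand-in for a learned edge classifier) are assumptions, and the eight corners of a unit cube stand in for detected room corners.

```python
from itertools import combinations


def toy_edge_score(p, q):
    """Stand-in for a learned edge classifier (assumption): score 1.0 when two
    corners differ along exactly one axis, otherwise 0.0."""
    differing_axes = sum(1 for a, b in zip(p, q) if a != b)
    return 1.0 if differing_axes == 1 else 0.0


def predict_edges(corners, score_fn, threshold=0.5):
    """Keep corner index pairs whose score exceeds the threshold."""
    return [
        (i, j)
        for i, j in combinations(range(len(corners)), 2)
        if score_fn(corners[i], corners[j]) > threshold
    ]


def predict_quads(num_corners, edges):
    """Enumerate candidate quads as 4-cycles in the corner/edge graph."""
    adjacency = {i: set() for i in range(num_corners)}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)

    quads = set()
    # A 4-cycle a-b-c-d has a diagonal (a, c) whose endpoints share the
    # neighbours b and d. Storing each cycle as a set of its edges removes
    # the duplicate found via the other diagonal (b, d).
    for a, c in combinations(range(num_corners), 2):
        for b, d in combinations(sorted(adjacency[a] & adjacency[c]), 2):
            quads.add(
                frozenset(frozenset(pair) for pair in ((a, b), (b, c), (c, d), (d, a)))
            )
    return quads


# Eight corners of a box-shaped room (unit cube) in place of detected corners.
corners = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
edges = predict_edges(corners, toy_edge_score)
quads = predict_quads(len(corners), edges)
print(len(corners), len(edges), len(quads))  # 8 12 6
```

For the box-shaped room the sketch recovers 12 edges and 6 quads (floor, ceiling, and four walls). In a learned setting, each level would presumably be scored by a classifier rather than by a fixed geometric rule.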
Fig. 7. Qualitative CAD alignment and layout estimation results on ScanNet [9] scans
(zoomed-in views at the bottom). Our approach incorporating object and layout
relationships produces globally consistent alignments along with the room layout.