Papers on Domain Adaptation for Large-Scale 3D Point Clouds

Tony2wang

Article link: https://blog.csdn.net/Tony2wang/article/details/126381678

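This post summarizes domain adaptation methods for point cloud data, covering transfer across different sensors and transfer from simulated to real data. The papers covered include deep domain adaptation for point cloud detection and segmentation, sensor-oriented transfer, unsupervised domain adaptation strategies, a boundary-aware domain adaptation model, cross-modal unsupervised domain adaptation, and point cloud completion followed by segmentation. These works focus on narrowing the gap between datasets so that models generalize better to new environments.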

The papers fall into two main categories:
1) transfer between real datasets;
2) transfer from simulated data to real data.

# 1. Transfer between real datasets

## Paper 1: Cross-Sensor Deep Domain Adaptation for LiDAR Detection and Segmentation
Authors: Daimler AG and Technical University of Munich (Germany), IV 2019
Code: not available

Approach:
We analyze the transferability of point cloud features between two different LiDAR sensor set-ups (32 and 64 vertical scanning planes with different geometry).
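Comments:
The first paper on domain adaptation for point clouds.
It is a multi-task setup: both segmentation and detection are covered.
The datasets are Velodyne HDL64 (KITTI) and Velodyne VLP32; the authors annotated the 32-beam data themselves: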
we have manually annotated scans from Velodyne-VLP32 recordings

## Paper 2: Domain Transfer for Semantic Segmentation of LiDAR Data using Deep Neural Networks
Authors: University of Bonn and Argo.ai, IROS 2020
Approach:
The main contribution of this paper is a sensor-oriented transfer method that allows us to exploit existing labels provided for a specific sensor setup and use it in a new setup. The aim of our paper is adapting the weights of a deep neural network to another dataset without utilizing human labeling for the target dataset. We aim to transfer Velodyne HDL-64 scans to match scans from a Velodyne HDL-32, a sensor with a lower resolution and different FOV.
Task: point cloud segmentation
Datasets:
Source dataset: SemanticKITTI
Target dataset: nuScenes
Baseline model:
RangeNet++ (https://github.com/PRBonn/lidar-bonnetal)
pre-trained on the SemanticKITTI dataset
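
To make the HDL-64 to HDL-32 gap concrete, here is a deliberately naive sketch that thins a dense scan by keeping every second beam. It is not the transfer method from the paper; the uniform beam spacing, the field-of-view defaults, and the function name `downsample_beams` are all assumptions made for illustration.

```python
import math


def downsample_beams(points, n_beams=64, keep_every=2,
                     fov_up_deg=3.0, fov_down_deg=-25.0):
    """Keep only points that fall on every `keep_every`-th beam.

    points: list of (x, y, z) tuples in the sensor frame.
    The beam index is estimated by binning the elevation angle uniformly
    over the vertical field of view. Real sensors have non-uniform beam
    spacing, so this is only an approximation (assumed values).
    """
    fov = fov_up_deg - fov_down_deg
    kept = []
    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue  # skip invalid returns at the origin
        elev = math.degrees(math.asin(z / rng))
        if not (fov_down_deg <= elev <= fov_up_deg):
            continue  # outside the assumed vertical field of view
        beam = min(int((elev - fov_down_deg) / fov * n_beams), n_beams - 1)
        if beam % keep_every == 0:
            kept.append((x, y, z))
    return kept
```

With `keep_every=2`, a 64-beam scan is reduced to roughly 32 beams. The differing field of view mentioned in the quote could be imitated by narrowing `fov_up_deg` and `fov_down_deg`.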

## Paper 3: Domain Adaptation in LiDAR Semantic Segmentation
Authors: Universidad de Zaragoza (Spain), arXiv 2020

Approach:
This work proposes two strategies to improve unsupervised domain adaptation (UDA) in LiDAR semantic segmentation, see a sample result on Fig. 1. The first strategy addresses this problem by applying a set of simple steps to align the data distribution reducing the domain gap on the input space. The second strategy proposes how to align the distribution on the output space by aligning the class distribution. These two proposed strategies can be applied in conjunction with current state-of-the-art approaches boosting their performance.
Datasets:
Source domain dataset: SemanticKITTI (Velodyne HDL-64E)
Target domain datasets:
1. Paris-Lille-3D (Velodyne HDL-32E)
2. SemanticPOSS (from Huijing Zhao's group at Peking University, which focuses on vehicles and robotics; homepage: http://www.poss.pku.edu.cn/papers.html)
3. I3A (Velodyne VLP-16)
Task: point cloud segmentation
Segmentation baseline model: 3D-MiniNet (IROS 2020, https://github.com/Shathe/3D-MiniNet)
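
The second strategy, aligning the class distribution on the output space, can be sketched as a penalty on the gap between the average predicted class histogram on target data and a class prior taken from the source labels. The KL form below and the names `mean_class_distribution` and `class_alignment_loss` are assumptions for illustration; the paper's exact loss may differ.

```python
import math


def mean_class_distribution(probs):
    """Average per-point softmax outputs into a single class histogram."""
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]


def class_alignment_loss(target_probs, source_prior, eps=1e-8):
    """KL(source_prior || mean target prediction).

    target_probs: per-point class probabilities predicted on target data.
    source_prior: class frequencies measured on the labeled source data.
    """
    q = mean_class_distribution(target_probs)
    return sum(p * math.log((p + eps) / (qc + eps))
               for p, qc in zip(source_prior, q) if p > 0)
```

The loss is zero when the target predictions, averaged over points, match the source class frequencies, and it grows as the network collapses toward a few dominant classes.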

## Paper 4: LiDARNet: A Boundary-Aware Domain Adaptation Model for Lidar Point Cloud Semantic Segmentation
Authors: Texas A&M University, arXiv 2020
Approach: the model has two branches: domain private branch and domain shared branch.
Task: point cloud segmentation
Source dataset: SemanticKITTI (Velodyne HDL-64)
Target dataset: "target dataset was collected by us on Texas A&M University, College Station campus, and RELLIS campus." (Ouster OS1-64)
The two datasets were recorded with different sensors.
Baseline model: RangeNet++

## Paper 5: xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation
CVPR 2020, Inria (France) and Valeo.ai
GitHub: https://github.com/valeoai/xmuda

Approach:
Unsupervised Domain Adaptation (UDA) is crucial to tackle the lack of annotations in a new domain. There are many multi-modal datasets, but most UDA approaches are uni-modal. In this work, we explore how to learn from multi-modality and propose cross-modal UDA (xMUDA) where we assume the presence of 2D images and 3D point clouds for 3D semantic segmentation. This is challenging as the two input spaces are heterogeneous and can be impacted differently by domain shift. In xMUDA, modalities learn from each other through mutual mimicking, disentangled from the segmentation objective, to prevent the stronger modality from adopting false predictions from the weaker one. We evaluate on new UDA scenarios including day-to-night, country-to-country and dataset-to-dataset, leveraging recent autonomous driving datasets. xMUDA brings large improvements over uni-modal UDA on all tested scenarios.
Datasets: nuScenes, A2D2 and SemanticKITTI
Task:
To evaluate xMUDA, we identified 3 real-to-real adaptation scenarios. In the day-to-night case, LiDAR has a small domain gap, as it is an active sensor sending out laser beams which are mostly invariant to lighting conditions. In contrast, camera has a large domain gap as its passive sensing suffers from lack of light sources, leading to drastic changes in object appearance. The second scenario is country-to-country adaptation, where the domain gap can be larger for LiDAR or camera: for some classes the 3D shape might change more than the visual appearance or vice versa. The third scenario, dataset-to-dataset, comprises changes in the sensor setup, such as camera optics, but most importantly a higher LiDAR resolution on target. 3D networks are sensitive to varying point cloud density and the image could help to guide and stabilize adaptation.
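
The "mutual mimicking" idea can be sketched as a KL divergence between the class distribution predicted by one modality and a separate mimicry output of the other modality. This is a minimal plain-Python sketch; the function names are hypothetical, and the use of a dedicated mimicry output is an assumption read from the phrase "disentangled from the segmentation objective".

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mimicry_loss(main_logits_other, mimic_logits_self):
    """Mean per-point KL between the other modality's main prediction
    (treated as a fixed target) and this modality's mimicry prediction."""
    total = 0.0
    for target_logits, mimic_logits in zip(main_logits_other, mimic_logits_self):
        target = softmax(target_logits)  # in a real framework: no gradient here
        pred = softmax(mimic_logits)
        total += kl_divergence(target, pred)
    return total / len(main_logits_other)
```

In such a setup the loss would be computed in both directions (2D mimicking 3D and 3D mimicking 2D) and needs no labels, so it can also be applied to target-domain data.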

## Paper 6: Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds
Google, arXiv 2020
Approach:
Our network architecture is composed of two phases: surface completion and semantic labeling. In the first phase, we use a sparse voxel completion network (SVCN) to recover the 3D surface from a LiDAR point cloud. In the second phase, we use a sparse convolutional U-Net to predict a semantic label for each voxel on the completed surface.
In short: complete the point cloud first, then segment it.
Our contributions are three-fold. First and foremost, we identify the cross-sensor domain gap for LiDAR point clouds caused by sampling differences, and we propose to recover complete 3D surfaces from the point clouds to eliminate the discrepancies in sampling patterns. Second, we present a novel sparse voxel completion network, which efficiently processes sparse and incomplete LiDAR point clouds and completes the underlying 3D surfaces with high resolution. Third, we provide thorough quantitative evaluations to validate our design choices on three datasets.
Task: point cloud completion and segmentation
Datasets: Waymo, nuScenes and KITTI, with transfer between each pair
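
Both phases quoted above operate on sparse voxels. As a rough illustration of that input representation only (not of the completion network itself), the sketch below groups points into occupied voxels. The voxel size and the function name `voxelize` are assumptions.

```python
import math


def voxelize(points, voxel_size=0.2):
    """Map points to a sparse dict: integer voxel index -> points inside it.

    Only occupied voxels are stored, which is what makes the
    representation sparse.
    """
    voxels = {}
    for x, y, z in points:
        key = (math.floor(x / voxel_size),
               math.floor(y / voxel_size),
               math.floor(z / voxel_size))
        voxels.setdefault(key, []).append((x, y, z))
    return voxels
```

Two scans of the same surface taken with different sampling patterns still occupy different sets of voxels, which is the discrepancy the completion phase is meant to remove.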

## Paper 7: Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection
Baidu, CVPR 2020
Task: 3D object detection on point clouds
Domains: perceptual domain and conceptual domain
Approach:
In this paper, we innovatively propose a domain adaptation like approach to enhance the robustness of the feature representation. More specifically, we bridge the gap between the perceptual domain where the feature comes from a real scene and the conceptual domain where the feature is extracted from an augmented scene consisting of non-occlusion point cloud rich of detailed information. This domain adaptation approach mimics the functionality of the human brain when proceeding object perception.
I have not fully understood this paper yet.

# 2. Transfer from simulated data to real data
## Paper 1: SqueezeSegV2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a LiDAR Point Cloud
Open source, GitHub: https://github.com/xuanyuzhou98/SqueezeSegV2
Authors: UC Berkeley, ICRA 2019
Approach: First, before training, we render intensity channels in synthetic data through learned intensity rendering. We train a neural network that takes the point coordinates as input, and predicts intensity values. This rendering network can be trained in a "self-supervised" fashion on unlabeled real data. After training the network, we feed the synthetic data into the network and render the intensity channel, which is absent from the original simulation. Second, we use the synthetic data augmented with rendered intensity to train the network. Meanwhile, we follow [6] and use geodesic correlation alignment to align the batch statistics between real data and synthetic data. Third, after training, we propose progressive domain calibration to further reduce the gap between the target domain and the trained network.
Comments:
The first simulation-to-real point cloud segmentation method.
The synthetic dataset is released as open source: We create a large-scale 3D LiDAR point cloud dataset, GTA-LiDAR, which consists of 100,000 samples of synthetic point cloud augmented with rendered intensity.
Transfer setting: GTA-LiDAR to KITTI.
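
The second step aligns batch statistics between synthetic and real features. The paper uses geodesic correlation alignment; the sketch below swaps in the simpler Euclidean (Frobenius-norm) CORAL loss to show the underlying idea of matching feature covariances. The function names are hypothetical.

```python
def covariance(feats):
    """Unbiased covariance matrix of a batch of feature vectors (n >= 2)."""
    n, d = len(feats), len(feats[0])
    mean = [sum(f[j] for f in feats) / n for j in range(d)]
    return [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in feats) / (n - 1)
             for j in range(d)]
            for i in range(d)]


def coral_loss(source_feats, target_feats):
    """Euclidean CORAL: ||C_s - C_t||_F^2 / (4 d^2).

    Note: this is the plain Frobenius variant, not the geodesic
    distance used in the paper.
    """
    d = len(source_feats[0])
    cs, ct = covariance(source_feats), covariance(target_feats)
    sq = sum((cs[i][j] - ct[i][j]) ** 2 for i in range(d) for j in range(d))
    return sq / (4 * d * d)
```

Because only second-order statistics are compared, a constant offset between the two feature batches leaves the loss at zero, while a change in scale or correlation does not.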

## Paper 2: ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation
Authors: DiDi Chuxing, AAAI 2021
Approach:
First, we render the dropout noise for synthetic data based on a self-supervised model trained on unlabeled real data, taking point coordinates as input and dropout noise as predictions. Second, we align the features of the simulation and real domains based on higher-order moment matching (Chen et al. 2020) with statistics-invariant normalized features by instance normalization (Ulyanov, Vedaldi, and Lempitsky 2016) and domain-invariant spatial attention by improving spatially-adaptive convolution (Xu et al. 2020). The specific feature alignment method not only helps bridge the spatial feature gap, but also does not require prior access to sufficient real data to obtain the statistics, allowing it to deal better with the incremental real data and thus making it more robust and practical. Finally, we learn a transferable segmentation model based on the adapted images and corresponding synthetic labels.
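Datasets: synthetic GTA-LiDAR to real KITTI and SemanticKITTI
There are only two categories in the GTA-LiDAR dataset, i.e., car and pedestrian.
Comments:
Transfers from a simulator to real data, i.e., simulation-to-real domain adaptation (SRDA).
The task is point cloud segmentation; the segmentation network is SqueezeSegV2.
The paper includes a comparison against SqueezeSegV2.
Transfer setting: GTA-LiDAR to KITTI and SemanticKITTI.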
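
The "statistics-invariant normalized features by instance normalization" can be illustrated on a single feature channel: each sample is normalized with its own mean and variance, so no dataset-level statistics from the real domain are needed. This is a plain-Python sketch without the learnable affine parameters; the function name is hypothetical.

```python
import math


def instance_norm(channel, eps=1e-5):
    """Normalize one 2D feature map (list of rows) of a single sample
    to zero mean and roughly unit variance using its own statistics."""
    flat = [v for row in channel for v in row]
    mean = sum(flat) / len(flat)
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    std = math.sqrt(var + eps)
    return [[(v - mean) / std for v in row] for row in channel]
```

Two feature maps that differ only by a per-sample scale and offset end up almost identical after normalization, which is the sense in which the features become invariant to such statistics.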

I will keep adding reading notes on papers in this area.