PointPillars: Fast Encoders for Object Detection from Point Clouds
CVPR 2019
Object detection in point clouds is an important component of many robotics applications such as autonomous driving. In this paper, we consider the problem of encoding a point cloud into a format appropriate for a downstream detection pipeline. Recent literature suggests two types of encoders: fixed encoders tend to be fast but sacrifice accuracy, while encoders learned from data are more accurate but slower. In this work, we propose PointPillars, a novel encoder which utilizes PointNets to learn a representation of point clouds organized in vertical columns (pillars). While the encoded features can be used with any standard 2D convolutional detection network, we further propose a lean downstream network. Extensive experimentation shows that PointPillars outperforms previous encoders with respect to both speed and accuracy. Despite only using lidar, our pipeline even outperforms fusion methods on both the 3D and BEV KITTI benchmarks. This detection performance is achieved while running at 62 Hz: a 2-4 fold runtime improvement. A faster version of our method runs at 105 Hz. These benchmarks suggest that PointPillars is an appropriate encoding for object detection in point clouds.
Problem
The authors argue that 3D convolutions slow down inference and waste computational resources.
Contributions
- We propose a novel point cloud encoder and network, PointPillars, that operates on the point cloud to enable end-to-end training of a 3D object detection network.
- We show how all computations on pillars can be posed as dense 2D convolutions which enables inference at 62 Hz; a factor of 2-4 times faster than other methods.
- We conduct experiments on the KITTI dataset and demonstrate state of the art results on cars, pedestrians, and cyclists on both BEV and 3D benchmarks.
- We conduct several ablation studies to examine the key factors that enable a strong detection performance.
Dataset
- KITTI
KITTI data augmentation:
First, following SECOND [2], we create a lookup table of the ground truth 3D boxes for all classes and the associated point clouds that fall inside these 3D boxes. Then for each sample, we randomly select 15, 0, and 8 ground truth samples for cars, pedestrians, and cyclists respectively and place them into the current point cloud. We found these settings to perform better than the proposed settings.
Next, all ground truth boxes are individually augmented. Each box is rotated (uniformly drawn from [-π/20,π/20]) and translated (x, y, and z independently drawn from N(0, 0.25)) to further enrich the training set.
Finally, we perform two sets of global augmentations that are jointly applied to the point cloud and all boxes. First, we apply a random mirroring flip along the x axis, then a global rotation and scaling. Finally, we apply a global translation with x, y, z drawn from N(0, 0.2) to simulate localization noise.
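The global augmentation steps above could be sketched as follows. This is a minimal illustration, not the authors' code: the rotation and scaling ranges are assumptions based on common settings in SECOND-style pipelines, and the box layout `(x, y, z, w, l, h, theta)` and function name are illustrative.

```python
import numpy as np

def global_augment(points, boxes, rng=np.random.default_rng()):
    """Jointly augment a point cloud (N, 4) and its 3D boxes (M, 7).

    Box layout assumed: (x, y, z, w, l, h, theta). Rotation range
    [-pi/4, pi/4] and scaling range [0.95, 1.05] are assumptions.
    """
    # Random mirroring flip along the x axis (negates y and heading).
    if rng.random() < 0.5:
        points[:, 1] = -points[:, 1]
        boxes[:, 1] = -boxes[:, 1]
        boxes[:, 6] = -boxes[:, 6]
    # Global rotation about the z axis.
    theta = rng.uniform(-np.pi / 4, np.pi / 4)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    points[:, :2] = points[:, :2] @ rot.T
    boxes[:, :2] = boxes[:, :2] @ rot.T
    boxes[:, 6] += theta
    # Global scaling of coordinates and box sizes.
    scale = rng.uniform(0.95, 1.05)
    points[:, :3] *= scale
    boxes[:, :6] *= scale
    # Global translation with x, y, z drawn from N(0, 0.2)
    # to simulate localization noise.
    t = rng.normal(0.0, 0.2, size=3)
    points[:, :3] += t
    boxes[:, :3] += t
    return points, boxes
```

Applying the flip, rotation, scale, and translation to points and boxes jointly keeps the labels consistent with the augmented scene.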
Network architecture
(1) A feature encoder that converts the point cloud into a sparse pseudo-image;
(2) A 2D convolutional backbone that processes the pseudo-image into a high-level representation;
(3) A detection head (SSD [1]) that regresses 3D boxes.
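The key step that links stage (1) to stage (2) is scattering the per-pillar feature vectors back to their grid locations to form the dense pseudo-image. A minimal NumPy sketch, assuming per-pillar features of shape (P, C) and pillar grid coordinates of shape (P, 2) as (row, col) (function and argument names are illustrative):

```python
import numpy as np

def scatter_to_pseudo_image(pillar_features, coords, H, W):
    """Scatter encoded pillar features (P, C) onto a dense (C, H, W)
    canvas at their (row, col) grid cells, producing the pseudo-image
    that a standard 2D convolutional backbone can consume."""
    C = pillar_features.shape[1]
    canvas = np.zeros((C, H, W), dtype=pillar_features.dtype)
    # Empty cells stay zero; occupied cells receive the pillar's features.
    canvas[:, coords[:, 0], coords[:, 1]] = pillar_features.T
    return canvas
```

Because only non-empty pillars are encoded, this scatter step is what lets the rest of the network use dense 2D convolutions instead of 3D or sparse convolutions.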
Loss
Same as SECOND [2].
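Concretely, the loss adopted from SECOND [2] combines weighted localization, classification, and direction terms (weights and residual definitions as given in the PointPillars paper):

```latex
\mathcal{L} = \frac{1}{N_{pos}}\left(\beta_{loc}\,\mathcal{L}_{loc}
  + \beta_{cls}\,\mathcal{L}_{cls} + \beta_{dir}\,\mathcal{L}_{dir}\right),
\qquad \beta_{loc}=2,\; \beta_{cls}=1,\; \beta_{dir}=0.2
```

where the localization loss is a SmoothL1 over the box residuals,

```latex
\Delta x = \frac{x^{gt}-x^{a}}{d^{a}},\quad
\Delta y = \frac{y^{gt}-y^{a}}{d^{a}},\quad
\Delta z = \frac{z^{gt}-z^{a}}{h^{a}},\quad
d^{a} = \sqrt{(w^{a})^{2}+(l^{a})^{2}},
```

```latex
\Delta w = \log\frac{w^{gt}}{w^{a}},\quad
\Delta l = \log\frac{l^{gt}}{l^{a}},\quad
\Delta h = \log\frac{h^{gt}}{h^{a}},\quad
\Delta\theta = \sin(\theta^{gt}-\theta^{a}),
```

so that \(\mathcal{L}_{loc} = \sum_{b \in (x,y,z,w,l,h,\theta)} \text{SmoothL1}(\Delta b)\), \(\mathcal{L}_{cls}\) is the focal loss with \(\alpha = 0.25\), \(\gamma = 2\), and \(\mathcal{L}_{dir}\) is a softmax loss on discretized heading direction (needed because \(\Delta\theta\) cannot distinguish flipped boxes).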
Experimental results
Links
References
[1] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg. SSD: Single shot multibox detector. In ECCV, 2016.
[2] Y. Yan, Y. Mao, and B. Li. SECOND: Sparsely embedded convolutional detection. Sensors, 18(10), 2018.