Introductory papers on 3D object detection: CaDDN, DETR3D

[CVPR2021] Projecting Your View Attentively: Monocular Road Scene Layout Estimation via Cross-view Transformation

  • Overview

Transforms a front-view monocular image into a top-view road layout.

Proposes a cross-view transformation module that exploits the relationship between the two views, together with cycle consistency, to strengthen the view transformation.

Proposes a context-aware discriminator that further refines the results by taking the spatial relationship between vehicles and the road surface into account for the vehicle occupancy estimation task.

  • Method

Cross-view Transformation = Cycled View Projection (CVP) + Cross-View Transformer (CVT)

Cycled View Projection (CVP): an MLP first projects the FoV (front-view) features to BEV; cycled self-supervision then projects the BEV features back to FoV, and a cycle loss constrains the difference between the two FoV features.
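The CVP step above can be sketched as follows; the module structure, feature shapes, and the choice of an L1 cycle loss are illustrative assumptions, not the paper's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CycledViewProjection(nn.Module):
    """Sketch of CVP: one MLP projects front-view (FoV) features to BEV,
    a second MLP projects them back, and a cycle loss ties the cycled
    FoV features to the originals (dimensions are illustrative)."""
    def __init__(self, dim=256):
        super().__init__()
        self.fov_to_bev = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.bev_to_fov = nn.Sequential(
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, fov_feat):
        bev_feat = self.fov_to_bev(fov_feat)          # project FoV -> BEV
        fov_cycled = self.bev_to_fov(bev_feat)        # project BEV -> FoV
        cycle_loss = F.l1_loss(fov_cycled, fov_feat)  # keep the round trip consistent
        return bev_feat, fov_cycled, cycle_loss

x = torch.randn(2, 100, 256)  # (batch, flattened spatial tokens, channels)
bev, cycled, loss = CycledViewProjection()(x)
```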

 Cross-View Transformer (CVT):

 Context-aware Discriminator:

  • Experiments

Achieves state-of-the-art results on both road layout estimation and vehicle occupancy estimation.


[CVPR2021] Categorical Depth Distribution Network for Monocular 3D Object Detection

  • Overview

Previous methods that aid 3D detection by directly estimating depth are limited by the inaccuracy of that depth estimation. This paper instead discretizes depth and proposes a monocular 3D object detection method that estimates a categorical depth distribution for each pixel.
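One way to discretize depth into categories is linear-increasing discretization (LID), where bin width grows linearly with depth so that nearby depths get finer bins; the sketch below shows this idea, with the depth range and bin count chosen as illustrative values rather than taken from the paper:

```python
import numpy as np

def lid_bin_index(depth, d_min=2.0, d_max=46.8, num_bins=80):
    """Linear-increasing discretization (LID): bin width grows linearly
    with depth, so close-range depths are resolved more finely.
    Returns the bin index for each continuous depth value."""
    # delta is the width of the first (smallest) bin
    delta = 2.0 * (d_max - d_min) / (num_bins * (num_bins + 1))
    # invert the cumulative bin-edge formula to get the index
    idx = np.floor(-0.5 + 0.5 * np.sqrt(1.0 + 8.0 * (depth - d_min) / delta))
    return np.clip(idx, 0, num_bins - 1).astype(np.int64)

bins = lid_bin_index(np.array([2.0, 10.0, 46.0]))
```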

  • Method

CaDDN's four modules: estimate a categorical depth distribution for each pixel of the input image to build Frustum Features; transform these into Voxel Features using the camera parameters and interpolated sampling; collapse the voxel grid by concatenating its z and c dimensions and reducing the channels to obtain BEV features; finally, run 3D detection on the BEV features.
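The first step, lifting 2D image features into a frustum with the per-pixel depth distribution, amounts to an outer product over the channel and depth axes. A minimal sketch, with all shapes illustrative:

```python
import torch
import torch.nn.functional as F

# Sketch of CaDDN-style frustum features: per-pixel image features are
# weighted by a per-pixel categorical depth distribution via an outer
# product, yielding a (C, D, H, W) frustum per image.
B, C, D, H, W = 1, 64, 80, 12, 40
image_feat = torch.randn(B, C, H, W)         # 2D image features
depth_logits = torch.randn(B, D, H, W)       # per-pixel depth-bin scores
depth_dist = F.softmax(depth_logits, dim=1)  # categorical depth distribution

# Broadcasted outer product: each depth bin gets the image feature
# scaled by that bin's probability.
frustum = image_feat.unsqueeze(2) * depth_dist.unsqueeze(1)  # (B, C, D, H, W)
```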

 

  • Experiments

Ranked 1st on the KITTI benchmark, and reports the first monocular 3D detection results on the Waymo Open Dataset.


DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

 DETR3D uses several tricks to boost performance. First is the iterative refinement of object queries. Essentially, the predicted bbox centers in BEV are reprojected back to the images with the camera transformation matrices (intrinsics and extrinsics), and multi-camera image features are sampled and integrated to refine the queries. This process can be repeated multiple times (6 in this paper) to boost performance.
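The reprojection-and-sampling step for a single camera can be sketched as below; the function name, the 4x4 lidar-to-image matrix convention, and the shapes are assumptions for illustration, not the paper's code:

```python
import torch
import torch.nn.functional as F

def sample_image_features(ref_points_3d, feat_map, lidar2img):
    """Sketch of DETR3D's 3D-to-2D refinement for one camera: project 3D
    reference points with a 4x4 projection matrix, then bilinearly sample
    image features at the projected pixel locations."""
    N = ref_points_3d.shape[0]
    pts = torch.cat([ref_points_3d, torch.ones(N, 1)], dim=1)  # homogeneous
    cam = (lidar2img @ pts.T).T                                # (N, 4)
    uv = cam[:, :2] / cam[:, 2:3].clamp(min=1e-5)              # pixel coords
    H, W = feat_map.shape[-2:]
    # normalize pixel coords to [-1, 1] as grid_sample expects
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,
                        uv[:, 1] / (H - 1) * 2 - 1], dim=-1)
    grid = grid.view(1, N, 1, 2)
    sampled = F.grid_sample(feat_map.unsqueeze(0), grid, align_corners=True)
    return sampled.view(feat_map.shape[0], N).T                # (N, C)

# 8 reference points with depth > 0, a 64-channel feature map, identity projection
feats = sample_image_features(torch.rand(8, 3) * 10 + 1.0,
                              torch.randn(64, 32, 32), torch.eye(4))
```

In the full model, this sampling is repeated across all cameras and the features are fused before updating each query.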

The second trick is to use a pretrained mono3D network backbone. Initialization seems to matter quite a lot for Transformer-based BEV perception networks.


End-to-End Object Detection with Transformers

 An object query is an encoding of an anchor, and this anchor is fully learnable (all of its parameters are learned).
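In DETR this amounts to a fixed set of learned embeddings, one per query slot; the sizes below are illustrative:

```python
import torch
import torch.nn as nn

# Sketch of DETR's learnable object queries: a fixed set of embeddings,
# each readable as a fully learnable "anchor" that the decoder refines
# into a box prediction.
num_queries, d_model = 100, 256
query_embed = nn.Embedding(num_queries, d_model)  # learned via backprop

# The decoder consumes all queries at once for one image:
queries = query_embed.weight.unsqueeze(0)         # (1, num_queries, d_model)
```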


 [CVPR2020] SampleNet: Differentiable Point Cloud Sampling

1. Overview:

The growth in computation caused by larger point clouds can be mitigated by sampling the point cloud before running the downstream task. Classical sampling methods ignore the downstream task, while task-aware sampling methods fail to handle the non-differentiability of the sampling operation. This paper therefore proposes a differentiable relaxation of sampling that outputs a smaller point cloud optimized for the downstream task, outperforming all non-learned and learned sampling methods on point cloud classification, registration, and reconstruction.
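The core differentiable-relaxation idea is a soft projection: each generated point is replaced by a temperature-weighted average of its nearest input points, so gradients flow through the sampling step. A minimal sketch, with the function name, neighborhood size, and temperature chosen as illustrative assumptions:

```python
import torch

def soft_project(generated, cloud, k=3, temperature=0.1):
    """Sketch of SampleNet-style soft projection: each generated point
    becomes a softmax-weighted average of its k nearest input points,
    keeping the sampling step differentiable."""
    d2 = torch.cdist(generated, cloud) ** 2           # (M, N) squared distances
    knn_d2, knn_idx = d2.topk(k, dim=1, largest=False)
    w = torch.softmax(-knn_d2 / temperature, dim=1)   # (M, k) soft weights
    neighbors = cloud[knn_idx]                        # (M, k, 3)
    return (w.unsqueeze(-1) * neighbors).sum(dim=1)   # (M, 3)

cloud = torch.randn(1024, 3)   # input point cloud
gen = torch.randn(32, 3)       # points produced by the sampling network
projected = soft_project(gen, cloud)
```

As the temperature is annealed toward zero, the weighted average approaches a hard nearest-neighbor selection, recovering a true subset of the input cloud.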

2. Method:

 
