Paper reading notes: (2021.10 CoRL) DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

chaoqinyou

Last modified: 2022-12-01 13:20:16

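First published: 2022-04-19 22:12:20

Copyright notice: this is an original article by the author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/chaoqinyou/article/details/124271965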

Paper: DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries (OpenReview), https://openreview.net/forum?id=xHnJS2GYFDz

Abstract: We introduce a framework for multi-camera 3D object detection. In contrast to existing works, which estimate 3D bounding boxes directly from monocular images or use depth prediction networks to generate input for 3D object detection from 2D information, our method manipulates predictions directly in 3D space. Our architecture extracts 2D features from multiple camera images and then uses a sparse set of 3D object queries to index into these 2D features, linking 3D positions to multi-view images using camera transformation matrices. Finally, our model makes a bounding box prediction per object query, using a set-to-set loss to measure the discrepancy between the ground-truth and the prediction. This top-down approach outperforms its bottom-up counterpart in which object bounding box prediction follows per-pixel depth estimation, since it does not suffer from the compounding error introduced by a depth prediction model. Moreover, our method does not require post-processing such as non-maximum suppression, dramatically improving inference speed. We achieve state-of-the-art performance on the nuScenes autonomous driving benchmark.

The code is open source and is built as a modification of MMDetection3D: https://github.com/WangYueFt/detr3d

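I. Motivation, contributions, and novelty

2D-based methods need a lot of post-processing, and pseudo-point-cloud (pseudo-LiDAR) methods are heavily affected by depth estimation errors. This method is proposed to address both issues.

1. A top-down, multi-view 3D object detection network for monocular cameras that can fuse information from multiple views at every layer; "the first attempt to cast multi-camera detection as 3D set-to-set prediction".

2. It associates 2D features with 3D boxes directly, avoiding the impact of inaccurate depth estimation: it "uses information from multiple cameras by back-projecting 3D information onto all available frames".

3. No NMS, with comparatively good results in regions where camera views overlap. This is really a consequence of point 2: the multi-head attention there already lets the boxes interact with one another, and the set-to-set loss (Hungarian loss) drives the network toward exactly one best prediction per ground truth.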

II. Accuracy

Evaluated mainly on the nuScenes dataset.

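III. Implementation

3.1 Feature extraction: a ResNet-FPN backbone extracts image features from each camera.

3.2 Detection head (2D-to-3D Feature Transformation). Notation: i is the query index, L is the transformer layer index, m is the camera index, and k is the feature level of the backbone.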
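To make the indices concrete, here is a minimal pure-Python sketch of the 3D-to-2D lookup for a single query i at one layer: its 3D reference point is projected into every camera m with a 3x4 camera matrix, the feature map of each level k is sampled bilinearly at the projected location, and the samples are averaged over the cameras in which the point is actually visible. This is an illustration under stated assumptions, not the authors' code: the function names, the single-channel feature maps, and the `image_size` normalization are all hypothetical simplifications.

```python
from typing import Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def project_point(T: Matrix, point_xyz: Sequence[float],
                  eps: float = 1e-5) -> Optional[Tuple[float, float]]:
    """Project a 3D point with a 3x4 camera matrix T; return pixel (u, v)."""
    homo = (point_xyz[0], point_xyz[1], point_xyz[2], 1.0)  # append 1
    u, v, d = (sum(T[r][c] * homo[c] for c in range(4)) for r in range(3))
    if d <= eps:
        return None  # the point lies behind this camera
    return u / d, v / d


def bilinear_sample(fmap: Matrix, u: float, v: float) -> float:
    """Bilinearly sample a single-channel H x W map at (u, v), in map pixels."""
    h, w = len(fmap), len(fmap[0])
    x0, y0 = int(u), int(v)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = u - x0, v - y0
    top = fmap[y0][x0] * (1 - ax) + fmap[y0][x1] * ax
    bottom = fmap[y1][x0] * (1 - ax) + fmap[y1][x1] * ax
    return top * (1 - ay) + bottom * ay


def sample_query_feature(ref_point, cam_matrices, feature_maps,
                         image_size, eps: float = 1e-5) -> float:
    """Average the samples over cameras m and levels k where the point is visible.

    feature_maps[k][m] is the (assumed single-channel) map of level k, camera m.
    image_size is (width, height) of the input images in pixels.
    """
    img_w, img_h = image_size
    total, valid = 0.0, 0
    for m, T in enumerate(cam_matrices):
        uv = project_point(T, ref_point, eps)
        if uv is None:
            continue
        u_norm, v_norm = uv[0] / img_w, uv[1] / img_h
        if not (0.0 <= u_norm <= 1.0 and 0.0 <= v_norm <= 1.0):
            continue  # projects outside this camera's image
        for level in feature_maps:
            fmap = level[m]
            h, w = len(fmap), len(fmap[0])
            total += bilinear_sample(fmap, u_norm * (w - 1), v_norm * (h - 1))
            valid += 1
    # eps keeps the division defined when no camera sees the point
    return total / (valid + eps)
```

In a real model the sampled feature would be a vector that is added back to the query before the next layer, and the sampling would typically be done in batch on tensors; the scalar version above only shows how the camera and feature-level indices interact.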

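3.3 Loss: Hungarian loss. Reference:

(2015.06 CVPR) End-to-end people detection in crowded scenes

Alternatively, see the DETR paper. The rough procedure is:

1. Pad the ground-truth set up to the number of object queries N. There are generally fewer ground truths than queries, so the missing entries are filled with empty ("no object") ones.

2. Compute the N*N cost matrix between the {gt} set and the {obj query} set. Each cost is a weighted combination of a classification term and a box IoU term.

3. Run weighted Hungarian matching, for example Munkres, to compute an optimal bipartite matching (assignment).

4. Given this assignment, minimize the resulting loss, which again consists of a classification loss and a box loss.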
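The four steps can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the dictionary layout of predictions and ground truths, the 2D axis-aligned IoU, the weights, and all function names are hypothetical. For self-containment the optimal assignment is found by brute force over permutations instead of the Hungarian algorithm, which gives the same result but is only feasible for very small N; in practice a solver such as `scipy.optimize.linear_sum_assignment` would be used.

```python
import math
from itertools import permutations


def iou_2d(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def build_cost_matrix(preds, gts, w_cls=1.0, w_iou=1.0):
    """Steps 1 and 2: pad the ground truths to N, then fill the N x N costs."""
    n = len(preds)
    padded = list(gts) + [None] * (n - len(gts))  # None = empty ground truth
    cost = [[0.0] * n for _ in range(n)]  # padded columns keep cost 0
    for i, p in enumerate(preds):
        for j, g in enumerate(padded):
            if g is not None:
                cost[i][j] = (-w_cls * p["probs"][g["label"]]
                              - w_iou * iou_2d(p["box"], g["box"]))
    return cost, padded


def optimal_assignment(cost):
    """Step 3 (brute force): result[j] is the query matched to ground truth j."""
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda perm: sum(cost[perm[j]][j] for j in range(n)))
    return list(best)


def set_loss(preds, padded, assignment, no_object_index):
    """Step 4: classification loss for every query, box loss for matched ones."""
    loss = 0.0
    for j, i in enumerate(assignment):
        g = padded[j]
        target = no_object_index if g is None else g["label"]
        loss += -math.log(preds[i]["probs"][target])  # negative log-likelihood
        if g is not None:
            loss += sum(abs(a - b) for a, b in zip(preds[i]["box"], g["box"]))  # L1
    return loss / len(preds)
```

Because every ground truth is matched to exactly one query and all remaining queries are pushed toward "no object", duplicate predictions of the same object are penalized during training, which is why no NMS step is needed afterwards.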

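IV. Personal views and key references

Projecting 3D proposals to 2D, combined with multi-layer iterative refinement, solves the multi-camera problem.

Multi-head attention among the queries plus the set-to-set loss eliminates NMS.

Impressive!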
