Autonomous Driving BEV & Occupancy Papers (About Fifty) and Algorithm Summary

Author | eyesighting  Editor | 自动驾驶之星

Original link: https://zhuanlan.zhihu.com/p/624241501

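Foreword:

    As autonomous driving technology keeps developing, the field's focus has shifted from BEV Transformers to end-to-end approaches. We will be sharing a series of related papers to discuss and study together.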

1. Bird's-Eye-View Survey Papers

VisionBEVPerceptionSurvey

Title: Vision-Centric BEV Perception: A Survey

Paper: https://arxiv.org/abs/2208.02797

BEVPerceptionReviewEvaluationRecipe

Title: Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe

Paper: https://arxiv.org/abs/2209.05324

Code: https://github.com/OpenPerceptionX/BEVPerception-Survey-Recipe

SurroundViewVision3DDetSurvey

Title: Surround-View Vision-based 3D Detection for Autonomous Driving: A Survey

Paper: https://arxiv.org/abs/2302.06650

VisionRadarFusionRobBEVDetSurvey

Title: Vision-RADAR fusion for Robotics BEV Detections: A Survey

Paper: https://arxiv.org/abs/2302.06643

2. Bird's-Eye-View Open-Source Algorithms

360BEV

Title: 360BEV: Panoramic Semantic Mapping for Indoor Bird's-Eye View

Paper: https://arxiv.org/abs/2303.11910

Code: https://github.com/jamycheung/360BEV

BEVDepth

Title: BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection

Paper: https://arxiv.org/abs/2206.10092

Code: https://github.com/Megvii-BaseDetection/BEVDepth

BEVDet

Title: BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View

Paper: https://arxiv.org/abs/2112.11790

Code: https://github.com/HuangJunJie2017/BEVDet

BEVDistill

Title: BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection

Paper: https://arxiv.org/abs/2211.09386

Code: https://github.com/zehuichen123/BEVDistill

BEVerse

Title: BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving

Paper: https://arxiv.org/abs/2205.09743

Code: https://github.com/zhangyp15/BEVerse

BEVFeatStitch

Title: Understanding Bird's-Eye View of Road Semantics using an Onboard Camera

Paper: https://arxiv.org/abs/2012.03040

Code: https://github.com/ybarancan/BEV_feat_stitch

BEVFormer

Title: BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers

Paper: https://arxiv.org/abs/2203.17270

Code: https://github.com/zhiqi-li/BEVFormer

BEVFormerV2

Title: BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision

Paper: https://arxiv.org/abs/2211.10439

Code: https://github.com/zhiqi-li/BEVFormer

BEVFusion

Title: BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework

Paper: https://arxiv.org/abs/2205.13790

Code: https://github.com/ADLab-AutoDrive/BEVFusion

BEV-LaneDet

Title: BEV-LaneDet: a Simple and Effective 3D Lane Detection Baseline

Paper: https://arxiv.org/abs/2210.06006

Code: https://github.com/gigo-team/bev_lane_det

BEVPlace

Title: BEVPlace: Learning LiDAR-based Place Recognition using Bird's Eye View Images

Paper: https://arxiv.org/abs/2302.14325

Code: https://github.com/zjuluolun/BEVPlace

BEVSimDet

Title: BEVSimDet: Simulated Multi-modal Distillation in Bird's-Eye View for Multi-view 3D Object Detection

Paper: https://arxiv.org/abs/2303.16818

Code: https://github.com/ViTAE-Transformer/BEVSimDet

BEVStereo

Title: BEVStereo: Enhancing Depth Estimation in Multi-view 3D Object Detection with Dynamic Temporal Stereo

Paper: https://arxiv.org/abs/2209.10248

Code: https://github.com/Megvii-BaseDetection/BEVStereo

Cam2BEV

Title: A Sim2Real Deep Learning Approach for the Transformation of Images from Multiple Vehicle-Mounted Cameras to a Semantically Segmented Image in Bird's Eye View

Paper: https://arxiv.org/abs/2005.04078

Code: https://github.com/ika-rwth-aachen/Cam2BEV

DeepIPC

Title: DeepIPC: Deeply Integrated Perception and Control for an Autonomous Vehicle in Real Environments

Paper: https://arxiv.org/abs/2207.09934

Code: https://github.com/oskarnatan/DeepIPC

DETR3D

Title: DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries

Paper: https://arxiv.org/abs/2110.06922

Code: https://github.com/WangYueFt/detr3d

Fast-BEV

Title: Fast-BEV: Towards Real-time On-vehicle Bird's-Eye View Perception

Paper: https://arxiv.org/abs/2301.07870

Code: https://github.com/Sense-GVT/Fast-BEV

FIERY

Title: FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras

Paper: https://arxiv.org/abs/2104.10490

Code: https://github.com/wayveai/fiery

GKT-BEV

Title: Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer

Paper: https://arxiv.org/abs/2206.04584

Code: https://github.com/hustvl/GKT

HoPMV3D

Title: Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction

Paper: https://arxiv.org/abs/2304.00967

Code: https://github.com/Sense-X/HoP

Img2Maps

Title: Translating Images into Maps

Paper: https://arxiv.org/abs/2110.00966

Code: https://github.com/avishkarsaha/translating-images-into-maps

LaRa-BEV

Title: LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation

Paper: https://arxiv.org/abs/2206.13294

Code: https://github.com/valeoai/LaRa

LSS

Title: Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D

Paper: https://arxiv.org/abs/2008.05711

Code: https://nv-tlabs.github.io/lift-splat-shoot

MetaBEV

Title: MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

Paper: https://arxiv.org/abs/2304.09801

Code: https://chongjiange.github.io/metabev.html

MotionNet

Title: MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps

Paper: https://arxiv.org/abs/2003.06754

Code: https://github.com/pxiangwu/MotionNet

PersFormer

Title: PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark

Paper: https://arxiv.org/abs/2203.11089

Code: https://github.com/OpenPerceptionX/OpenLane

PETR

Title: PETR: Position Embedding Transformation for Multi-View 3D Object Detection

Paper: https://arxiv.org/abs/2203.05625

Code: https://github.com/megvii-research/PETR

PETRv2

Title: PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

Paper: https://arxiv.org/abs/2206.01256

Code: https://github.com/megvii-research/PETR

PolarBEV

Title: Vision-based Uneven BEV Representation Learning with Polar Rasterization and Surface Estimation

Paper: https://arxiv.org/abs/2207.01878

Code: https://github.com/SuperZ-Liu/PolarBEV

RoboBEV

Title: RoboBEV: Towards Robust Bird's Eye View Perception under Corruptions

Paper: https://arxiv.org/abs/2304.06719

Code: https://github.com/Daniel-xsy/RoboBEV

STSU

Title: Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images

Paper: https://arxiv.org/abs/2110.01997

Code: https://github.com/ybarancan/STSU

TiG-BEV

Title: TiG-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning

Paper: https://arxiv.org/abs/2212.13979

Code: https://github.com/ADLab3Ds/TiG-BEV

TransFusion

Title: TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers

Paper: https://arxiv.org/abs/2203.11496

Code: https://github.com/XuyangBai/TransFusion

UniDistill

Title: UniDistill: A Universal Cross-Modality Knowledge Distillation Framework for 3D Object Detection in Bird's-Eye View

Paper: https://arxiv.org/abs/2303.15083

Code: https://github.com/megvii-research/CVPR2023-UniDistill

3. Occupancy Survey Papers

GridCentricFusPepADSurvey

Title: Grid-Centric Traffic Scenario Perception for Autonomous Driving: A Comprehensive Review

Paper: https://arxiv.org/abs/2303.01212

4. Occupancy Open-Source Algorithms

MonoScene

Title: MonoScene: Monocular 3D Semantic Scene Completion

Paper: https://arxiv.org/abs/2112.00726

Code: https://github.com/astra-vision/MonoScene

OccDepth

Title: OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion

Paper: https://arxiv.org/abs/2302.13540

Code: https://github.com/megvii-research/OccDepth

OccFormer

Title: OccFormer: Dual-path Transformer for Vision-based 3D Semantic Occupancy Prediction

Paper: https://arxiv.org/abs/2304.05316

Code: https://github.com/zhangyp15/OccFormer

OpenOccupancy

Title: OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception

Paper: https://arxiv.org/abs/2303.03991

Code: https://github.com/JeffWang987/OpenOccupancy

SimpleOccupancy

Title: A Simple Attempt for 3D Occupancy Estimation in Autonomous Driving

Paper: https://arxiv.org/abs/2303.10076

Code: https://github.com/GANWANSHUI/SimpleOccupancy

StereoVoxelNet

Title: StereoVoxelNet: Real-Time Obstacle Detection Based on Occupancy Voxels from a Stereo Camera Using Deep Neural Networks

Paper: https://arxiv.org/abs/2209.08459

Code: https://github.com/RIVeR-Lab/stereovoxelnet

SurroundOcc

Title: SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving

Paper: https://arxiv.org/abs/2303.09551

Code: https://github.com/weiyithu/SurroundOcc

TPVFormer

Title: Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction

Paper: https://arxiv.org/pdf/2302.07817

Code: https://github.com/wzzheng/TPVFormer

VoxFormer

Title: VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

Paper: https://arxiv.org/abs/2302.12251

Code: https://github.com/NVlabs/VoxFormer

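Summary

The earlier early-fusion and late-fusion perception schemes are gradually being phased out. Unified algorithm frameworks built on BEV and Occupancy, combining multi-sensor fusion, multi-task learning, and temporal information, should strongly push large-scale L3/L4 deployment. The supporting chips, toolchains, methodologies, industry-chain relationships, and business models will change as well.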

The contributing author is an invited guest of the 自动驾驶之心 Knowledge Planet (知识星球) community.
