A Comprehensive Roundup: Which Autonomous Driving Directions at CVPR 2024 Are Worth Watching?

Editor | 自动驾驶之心

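CVPR 2024 papers are being released one after another, and 自动驾驶Daily has been following them closely. Today we round up notable work from the conference, covering end-to-end autonomous driving, large language models, occupancy, SLAM, lane detection, 3D detection, cooperative perception, point cloud processing, MOT, millimeter-wave radar, NeRF, Gaussian Splatting, and other directions.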

We also recommend our CVPR 2024 repository: https://github.com/autodriving-heart/CVPR-2024-Papers-Autonomous-Driving (stars and bookmarks are welcome, so you can catch the latest additions first).

1) End-to-End Autonomous Driving

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

  • Paper: https://arxiv.org/pdf/2312.03031.pdf

  • Code: https://github.com/NVlabs/BEV-Planner

Visual Point Cloud Forecasting enables Scalable Autonomous Driving

  • Paper: https://arxiv.org/pdf/2312.17655.pdf

  • Code: https://github.com/OpenDriveLab/ViDAR

PlanKD: Compressing End-to-End Motion Planner for Autonomous Driving

  • Paper: https://arxiv.org/pdf/2403.01238.pdf

  • Code: https://github.com/tulerfeng/PlanKD

VLP: Vision Language Planning for Autonomous Driving

  • Paper: https://arxiv.org/abs/2401.05577

2) LLM Agent | Large Language Model Agents

ChatSim: Editable Scene Simulation for Autonomous Driving via LLM-Agent Collaboration

  • Paper: https://arxiv.org/pdf/2402.05746.pdf

  • Code: https://github.com/yifanlu0227/ChatSim

LMDrive: Closed-Loop End-to-End Driving with Large Language Models

  • Paper: https://arxiv.org/pdf/2312.07488.pdf

  • Code: https://github.com/opendilab/LMDrive

MAPLM: A Real-World Large-Scale Vision-Language Dataset for Map and Traffic Scene Understanding

  • Code: https://github.com/LLVM-AD/MAPLM

One Prompt Word is Enough to Boost Adversarial Robustness for Pre-trained Vision-Language Models

  • Paper: https://arxiv.org/pdf/2403.01849.pdf

  • Code: https://github.com/TreeLLi/APT

PromptKD: Unsupervised Prompt Distillation for Vision-Language Models

  • Paper: https://arxiv.org/pdf/2403.02781

RegionGPT: Towards Region Understanding Vision Language Model

  • Paper: https://arxiv.org/pdf/2403.02330

3) SSC: Semantic Scene Completion

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

  • Paper: https://arxiv.org/pdf/2306.15670.pdf

  • Code: https://github.com/hustvl/Symphonies

PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness

  • Paper: https://arxiv.org/pdf/2312.02158.pdf

  • Code: https://github.com/astra-vision/PaSCo

4) OCC: Occupancy Prediction

SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction

  • Paper: https://arxiv.org/pdf/2311.12754.pdf

  • Code: https://github.com/huang-yh/SelfOcc

Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications

  • Paper: https://arxiv.org/pdf/2311.17663.pdf

  • Code: https://github.com/haomo-ai/Cam4DOcc

PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation

  • Paper: https://arxiv.org/pdf/2306.10013.pdf

  • Code: https://github.com/Robertwyq/PanoOcc

5) Lane Detection

Lane2Seq: Towards Unified Lane Detection via Sequence Generation

  • Paper: https://arxiv.org/abs/2402.17172

6) Pre-training

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

  • Paper: https://arxiv.org/pdf/2310.08370.pdf

  • Code: https://github.com/Nightmare-n/UniPAD

7) AIGC | AI-Generated Content

Panacea: Panoramic and Controllable Video Generation for Autonomous Driving

  • Paper: https://arxiv.org/pdf/2311.16813.pdf

  • Code: https://github.com/wenyuqing/panacea

SemCity: Semantic Scene Generation with Triplane Diffusion

  • Paper:

  • Code: https://github.com/zoomin-lee/SemCity

BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation

  • Paper: https://arxiv.org/pdf/2312.02136.pdf

  • Code: https://github.com/zqh0253/BerfScene

8) 3D Object Detection

PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection

  • Paper: https://arxiv.org/pdf/2312.08371.pdf

  • Code: https://github.com/KuanchihHuang/PTT

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

  • Code: https://github.com/skmhrk1209/VSRD

CaKDP: Category-aware Knowledge Distillation and Pruning Framework for Lightweight 3D Object Detection

  • Code: https://github.com/zhnxjtu/CaKDP

CN-RMA: Combined Network with Ray Marching Aggregation for 3D Indoors Object Detection from Multi-view Images

  • Paper: https://arxiv.org/abs/2403.04198

  • Code: https://github.com/SerCharles/CN-RMA

UniMODE: Unified Monocular 3D Object Detection

  • Paper: https://arxiv.org/abs/2402.18573

Enhancing 3D Object Detection with 2D Detection-Guided Query Anchors

  • Paper: https://arxiv.org/abs/2403.06093

  • Code: https://github.com/nullmax-vision/QAF2D

SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection

  • Paper: https://arxiv.org/abs/2403.05817

  • Code: https://github.com/zhanggang001/HEDNet

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

  • Paper: https://arxiv.org/pdf/2403.05061

9) Stereo Matching

MoCha-Stereo: Motif Channel Attention Network for Stereo Matching

  • Code: https://github.com/ZYangChen/MoCha-Stereo

Learning Intra-view and Cross-view Geometric Knowledge for Stereo Matching

  • Paper: https://arxiv.org/abs/2402.19270

  • Code: https://github.com/DFSDDDDD1199/ICGNet

Selective-Stereo: Adaptive Frequency Information Selection for Stereo Matching

  • Paper: https://arxiv.org/abs/2403.00486

  • Code: https://github.com/Windsrain/Selective-Stereo

10) Cooperative Perception

RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception

  • Code: https://github.com/ryhnhao/RCooper

11) SLAM

SNI-SLAM: Semantic Neural Implicit SLAM

  • Paper: https://arxiv.org/pdf/2311.11016.pdf

CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition

  • Paper: https://arxiv.org/abs/2402.19231

  • Code: https://github.com/Lu-Feng/CricaVPR

12) Scene Flow Estimation

DifFlow3D: Toward Robust Uncertainty-Aware Scene Flow Estimation with Iterative Diffusion-Based Refinement

  • Paper: https://arxiv.org/pdf/2311.17456.pdf

  • Code: https://github.com/IRMVLab/DifFlow3D

3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling

  • Paper: https://arxiv.org/pdf/2402.18146.pdf

  • Code: https://github.com/jiangchaokang/3DSFLabelling

Regularizing Self-supervised 3D Scene Flows with Surface Awareness and Cyclic Consistency

  • Paper: https://arxiv.org/pdf/2312.08879.pdf

  • Code: https://github.com/vacany/sac-flow

13) Point Cloud

Point Transformer V3: Simpler, Faster, Stronger

  • Paper: https://arxiv.org/pdf/2312.10035.pdf

  • Code: https://github.com/Pointcept/PointTransformerV3

Rethinking Few-shot 3D Point Cloud Semantic Segmentation

  • Paper: https://arxiv.org/pdf/2403.00592.pdf

  • Code: https://github.com/ZhaochongAn/COSeg

PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation

  • Code: https://github.com/JinfengX/PointCloudPDF

14) Efficient Network

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications

  • Paper: https://arxiv.org/pdf/2401.06197.pdf

RepViT: Revisiting Mobile CNN From ViT Perspective

  • Paper: https://arxiv.org/pdf/2307.09283.pdf

  • Code: https://github.com/THU-MIG/RepViT

15) Segmentation

OMG-Seg: Is One Model Good Enough For All Segmentation?

  • Paper: https://arxiv.org/pdf/2401.10229.pdf

  • Code: https://github.com/lxtGH/OMG-Seg

Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation

  • Paper: https://arxiv.org/pdf/2312.04265.pdf

  • Code: https://github.com/w1oves/Rein

SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation

  • Paper: https://arxiv.org/abs/2311.15707

SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation

  • Paper: https://arxiv.org/abs/2311.15537

Style Blind Domain Generalized Semantic Segmentation via Covariance Alignment and Semantic Consistence Contrastive Learning

  • Paper: https://arxiv.org/abs/2403.06122

16) Radar | Millimeter-Wave Radar

DART: Doppler-Aided Radar Tomography

  • Code: https://github.com/thetianshuhuang/dart

17) NeRF and Gaussian Splatting

Dynamic LiDAR Re-simulation using Compositional Neural Fields

  • Paper: https://arxiv.org/pdf/2312.05247.pdf

  • Code: https://github.com/prs-eth/Dynamic-LiDAR-Resimulation

GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding

  • Paper: https://arxiv.org/abs/2403.03608

NARUTO: Neural Active Reconstruction from Uncertain Target Observations

  • Paper: https://arxiv.org/abs/2402.18771

DNGaussian: Optimizing Sparse-View 3D Gaussian Radiance Fields with Global-Local Depth Normalization

  • Paper: https://arxiv.org/abs/2403.06912

S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes

  • Paper: https://arxiv.org/pdf/2403.06205

SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting

  • Paper: https://arxiv.org/pdf/2403.05087

DaReNeRF: Direction-aware Representation for Dynamic Scenes

  • Paper: https://arxiv.org/pdf/2403.02265

18) MOT: Multi-Object Tracking

Delving into the Trajectory Long-tail Distribution for Muti-object Tracking

  • Code: https://github.com/chen-si-jia/Trajectory-Long-tail-Distribution-for-MOT

DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking

  • Paper: https://arxiv.org/abs/2403.02767

19) Multi-label Atomic Activity Recognition

Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes

  • Paper: https://arxiv.org/pdf/2311.17948.pdf

  • Code: https://github.com/HCIS-Lab/Action-slot

20) Motion Prediction

SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction

  • Code: https://github.com/opendilab/SmartRefine

21) Convolutional Networks

CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective

  • Paper: https://arxiv.org/abs/2403.06676

  • Code: https://github.com/snskysk/CAM-Back-Again
