A Roundup of World Models and Autonomous Driving Scene Generation

Author | 冰锐  Editor | 自动驾驶之心

Original link: https://zhuanlan.zhihu.com/p/686277501



Frameworks

mmagic: https://github.com/open-mmlab/mmagic

Surveys

World Models for Autonomous Driving: An Initial Survey

The JEPA model aims to construct mapping relationships between different inputs in the encoding space by minimizing input information and prediction errors.
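
The prediction-error part of that description can be illustrated with a toy sketch. It assumes linear encoders and a linear predictor stored as plain lists, and it covers only the prediction error in embedding space; the information-minimizing regularizer mentioned in the survey is left out. The names `linear` and `jepa_prediction_loss` are hypothetical and do not come from the survey.

```python
def linear(weights, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def jepa_prediction_loss(enc_x, enc_y, predictor, x, y):
    """Squared error between the predicted and the actual embedding of y."""
    s_x = linear(enc_x, x)            # embed the context input x
    s_y = linear(enc_y, y)            # embed the target input y
    s_y_hat = linear(predictor, s_x)  # predict the embedding of y from that of x
    return sum((a - b) ** 2 for a, b in zip(s_y_hat, s_y))


# Toy data: y is x with its two coordinates swapped.
identity = [[1.0, 0.0], [0.0, 1.0]]
swap = [[0.0, 1.0], [1.0, 0.0]]
x, y = [1.0, 0.0], [0.0, 1.0]

# An identity predictor cannot map s_x onto s_y, so the loss is non-zero.
loss_identity = jepa_prediction_loss(identity, identity, identity, x, y)
# A predictor that swaps coordinates captures the relationship exactly.
loss_swap = jepa_prediction_loss(identity, identity, swap, x, y)
```

The point of the sketch is that the loss is measured between embeddings rather than between raw inputs, so the predictor only has to capture the relationship between `x` and `y` in the encoding space.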

Towards Knowledge-driven Autonomous Driving

Embodied AI is a facet of intelligence emphasizing the direct interaction between an intelligent system and its environment, involving perception, understanding, and action.

Diffusion models

Diffusion model

Diffusion Model (李宏毅, 2023), video on bilibili

Diffusion models explained in plain language: still confused after watching? Impossible!

Multimodal pre-training: CLIP

Diffusion paper: Denoising Diffusion Probabilistic Models
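
As a rough illustration of the forward (noising) process in the DDPM paper, the sketch below samples x_t directly from x_0 using the closed form q(x_t | x_0) = N(sqrt(ᾱ_t) x_0, (1 - ᾱ_t) I). The linear schedule from 1e-4 to 0.02 over 1000 steps follows the values reported in the paper; the function names and the list-based "image" are illustrative assumptions, not code from the paper.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1 .. beta_T."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) in a single step."""
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)           # fixed seed for repeatability
x0 = [1.0, -1.0, 0.5]            # stand-in for a flattened image
x_early = q_sample(x0, 0, alpha_bars, rng)    # almost identical to x0
x_late = q_sample(x0, 999, alpha_bars, rng)   # close to pure Gaussian noise
```

Because ᾱ_t shrinks toward zero as t grows, the signal term fades and the sample approaches standard Gaussian noise, which is what the reverse (denoising) network is trained to undo.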

Stable Diffusion paper: High-Resolution Image Synthesis with Latent Diffusion Models

Diffusers

stable-diffusion-v1-5 weights: runwayml/stable-diffusion-v1-5 at main

Diffusion model written from scratch in PyTorch: The Annotated Diffusion Model

Stable Diffusion with Diffusers
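
For orientation, a minimal text-to-image call with the Diffusers library might look like the sketch below. It assumes `torch` and `diffusers` are installed, that a CUDA GPU is available by default, and that the checkpoint is still hosted under the `runwayml/stable-diffusion-v1-5` identifier listed above. The wrapper name `generate_image` and the example prompt are hypothetical.

```python
def generate_image(prompt, model_id="runwayml/stable-diffusion-v1-5", device="cuda"):
    """Generate one image from a text prompt with a Stable Diffusion pipeline."""
    # Imports are kept inside the function so that merely defining it does not
    # require torch or diffusers to be installed.
    import torch
    from diffusers import StableDiffusionPipeline

    # Half precision saves GPU memory; fall back to float32 on other devices.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)
    return pipe(prompt).images[0]  # a PIL image
```

A call such as `generate_image("a rainy city street seen from a front camera").save("street.png")` would download the weights on first use and write the result to disk.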

Writing a diffusion model from scratch: an introduction to diffusers (video on bilibili)

World models

GAIA-1 (2023.9.29)

MagicDrive (2024.1.26)

Paper: MagicDrive: Street View Generation with Diverse 3D Geometry Control

GitHub: https://github.com/cure-lab/MagicDrive

Drive WM (2023.11.29)

https://github.com/BraveGroup/Drive-WM?tab=readme-ov-file

MUVO (2023.11.23)

Multimodal generation

DriveDreamer (2023.11.27)

https://github.com/JeffWang987/DriveDreamer

https://drivedreamer.github.io/

DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation (2024.3.11)

Multi-view video generation

WorldDreamer

https://world-dreamer.github.io/

OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving

Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving (in CARLA-v2)

Driving with LLMs

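DriveVLM: The Convergence of Autonomous Driving and Large Vision-Language Models (2024.2.25)

DriveVLM integrates chain-of-thought (CoT) modules for scene description, scene analysis, and hierarchical planning.

Because VLMs are computationally expensive, the authors propose DriveVLM-Dual, a hybrid system that combines the strengths of DriveVLM with a traditional autonomous driving pipeline.

It can reportedly be deployed on the Orin chip, but my guess is that it has only been made to run on Orin and has not yet been deployed to vehicles at scale.

In one example, DriveVLM recognizes a fallen tree and outputs a driving decision to shift slightly to the right.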

Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving

Paper: Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving

GitHub: https://github.com/wayveai/Driving-with-LLMs

ADriver-I: A General World Model for Autonomous Driving

GAN (image-to-image)

CYCLEGAN

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
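
The central idea of that paper, the cycle-consistency loss, can be sketched in a few lines. With generators G: X → Y and F: Y → X, the loss penalizes the L1 distance between F(G(x)) and x and between G(F(y)) and y. The toy "generators" below just shift pixel values and stand in for neural networks; all names, and the choice to average the L1 distance per pixel, are illustrative assumptions.

```python
def l1_distance(a, b):
    """Mean absolute difference between two flattened images."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(G, F, batch_x, batch_y):
    """Average of |F(G(x)) - x| over X plus average of |G(F(y)) - y| over Y."""
    forward = sum(l1_distance(F(G(x)), x) for x in batch_x) / len(batch_x)
    backward = sum(l1_distance(G(F(y)), y) for y in batch_y) / len(batch_y)
    return forward + backward


# Toy generators: "brighten" maps domain X to Y, "darken" maps Y back to X.
def brighten(img):
    return [p + 0.5 for p in img]


def darken(img):
    return [p - 0.5 for p in img]


def darken_too_much(img):
    return [p - 0.75 for p in img]


batch_x = [[0.0, 0.25], [0.5, 0.75]]
batch_y = [[0.5, 1.0]]

# Exact inverses reconstruct every input, so the loss is zero.
perfect = cycle_consistency_loss(brighten, darken, batch_x, batch_y)
# A mismatched pair leaves a 0.25 error per pixel in each direction.
imperfect = cycle_consistency_loss(brighten, darken_too_much, batch_x, batch_y)
```

In the full method this term is added to the two adversarial losses, which is what allows training on unpaired images such as sunny and adverse-weather scenes.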

mmgeneration: https://github.com/open-mmlab/mmgeneration?tab=readme-ov-file

Lidar GAN

GAN-Based LiDAR Translation between Sunny and Adverse Weather for Autonomous Driving and Driving Simulation
