【论文翻译】nuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles

nuPlan是首个面向自动驾驶车辆的、基于机器学习的闭环规划benchmark。它弥补了现有预测类数据集和度量标准的不足,提供了一个大规模的真实世界驾驶数据集,包含来自4个城市的1500小时数据。该benchmark强调长期规划的重要性,通过闭环评估以及通用和特定场景的规划度量指标,推动机器学习在自动驾驶规划中的应用。

论文链接:https://arxiv.org/pdf/2106.11810.pdf

标题

nuPlan: A closed-loop ML-based planning benchmark for autonomous vehicles
nuPlan:面向自动驾驶车辆的基于机器学习的闭环规划benchmark

摘要/Abstract

In this work, we propose the world’s first closed-loop ML-based planning benchmark for autonomous driving. While there is a growing body of ML-based motion planners, the lack of established datasets and metrics has limited the progress in this area. Existing benchmarks for autonomous vehicle motion prediction have focused on short-term motion forecasting, rather than long-term planning. This has led previous works to use open-loop evaluation with L2-based metrics, which are not suitable for fairly evaluating long-term planning. Our benchmark overcomes these limitations by introducing a large-scale driving dataset, lightweight closed-loop simulator, and motion-planning-specific metrics. We provide a high-quality dataset with 1500h of human driving data from 4 cities across the US and Asia with widely varying traffic patterns (Boston, Pittsburgh, Las Vegas and Singapore). We will provide a closed-loop simulation framework with reactive agents and provide a large set of both general and scenario-specific planning metrics. We plan to release the dataset at NeurIPS 2021 and organize benchmark challenges starting in early 2022.

在这项工作中,我们提出了世界上第一个用于自动驾驶的基于机器学习的闭环规划benchmark。虽然基于机器学习的运动规划器越来越多,但缺乏成熟的数据集和度量指标限制了这一领域的进展。现有用于自动驾驶车辆运动预测的benchmark主要关注短期运动预测,而不是长期规划。这导致以前的工作采用基于L2度量指标(欧氏距离)的开环评估方式,而这种方式并不适合公平地评估长期规划。为了克服这些限制,我们的benchmark引入了大规模驾驶数据集、轻量级闭环模拟器和特定于运动规划的度量指标。我们提供了一个高质量的数据集,其中包含来自美国和亚洲四个交通模式迥异的城市(波士顿、匹兹堡、拉斯维加斯和新加坡)的1500小时人类驾驶数据。我们将提供一个带有反应式智能体(reactive agents)的闭环仿真框架,并提供一大组通用的和特定场景的规划指标。我们计划在NeurIPS 2021发布该数据集,并从2022年初开始组织benchmark挑战赛。
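To make the closed-loop evaluation idea in the abstract more concrete, here is a minimal Python sketch of a receding-horizon rollout. It is an illustration only, under assumed interfaces: the `plan_fn` callable, the `(x, y)` / `[x, y, vx, vy]` state layouts, and the constant-velocity update standing in for reactive agents are all hypothetical, not the nuPlan devkit API.

```python
import numpy as np

def closed_loop_rollout(plan_fn, ego_xy, agents, route, horizon_s=15.0, dt=0.1):
    """Receding-horizon rollout: re-plan every step from the *simulated* ego state.

    plan_fn(ego_xy, agents, route) -> (N, 2) array of planned future (x, y) waypoints.
    ego_xy: (2,) current ego position; agents: (A, 4) array of [x, y, vx, vy] per agent.
    """
    ego_xy = np.asarray(ego_xy, dtype=float)
    agents = np.asarray(agents, dtype=float)
    ego_trajectory = [ego_xy.copy()]

    for _ in range(int(horizon_s / dt)):
        # 1. Plan from the current simulated state, not from the logged state.
        waypoints = plan_fn(ego_xy, agents, route)
        # 2. Execute only the first waypoint of the plan, then re-plan next step.
        ego_xy = np.asarray(waypoints[0], dtype=float)
        # 3. Propagate background agents (constant velocity here, as a simple
        #    stand-in for the reactive agent models described in the paper).
        agents[:, :2] += agents[:, 2:4] * dt
        ego_trajectory.append(ego_xy.copy())

    return np.stack(ego_trajectory)

# Open-loop evaluation, by contrast, would call plan_fn once on the logged state and
# compare its output to the logged future trajectory with an L2-based metric, so
# planning errors never feed back into later decisions.
```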

引言/Introduction

Large-scale human-labeled datasets in combination with deep Convolutional Neural Networks have led to an impressive performance increase in autonomous vehicle (AV) perception over the last few years [9, 4]. In contrast, existing solutions for AV planning are still primarily based on carefully engineered expert systems that require significant amounts of engineering to adapt to new geographies and do not scale with more training data. We believe that providing suitable data and metrics will enable ML-based planning and pave the way towards a full “Software 2.0” stack.

在过去的几年里,大规模的人类标注数据集与深度卷积神经网络相结合,使得自动驾驶车辆(AV)感知的性能显著提高[9,4]。相比之下,现有的AV规划解决方案仍然主要基于精心设计的专家系统,这些系统需要大量的工程工作来适应新的地理区域,并且无法随着训练数据的增加而扩展。我们相信,提供合适的数据和度量指标将使基于机器学习(ML)的规划成为可能,并为实现完整的“软件2.0”软件栈铺平道路。

Existing real-world benchmarks are focused on short-term motion forecasting, also known as prediction [6, 4, 11, 8], rather than planning. This is evident in the lack of high-level goals, the choice of metrics, and the open-loop evaluation. Prediction focuses on the behavior of other agents, while planning relates to the ego vehicle behavior. Prediction is typically multi-modal, which means that for each agent we predict the N most likely trajectories. In contrast, planning is typically uni-modal (except for contingency planning) and we predict a single trajectory. As an example, in Fig. 1a, turning left or right at an intersection are equally likely options. Prediction datasets lack a baseline navigation route to indicate the high-level goals of the agents. In Fig. 1b, the options of merging immediately or later are both equally valid, but the commonly used L2 distance-based metrics (minADE, minFDE, and miss rate) penalize the option that was not observed in the data. Intuitively, the distance between the predicted trajectory and the observed trajectory is not a suitable indicator in a multi-modal scenario. In Fig. 1c, the decision whether to continue to overtake or get back into the lane should be based on the consecutive actions of all agent vehicles, which is not possible in open-loop evaluation. Lack of closed-loop evaluation leads to systematic drift, making it difficult to evaluate beyond a short time horizon (3-8s).

Figure 1. We show different driving scenarios to emphasize the limitations of existing benchmarks. The observed driving route of the ego vehicle is shown in white and the hypothetical planner route in red. (a) The absence of a goal leads to ambiguity at intersections. (b) Displacement metrics do not take into account the multi-modal nature of driving. (c) Open-loop evaluation does not take into account agent interaction.
图1. 我们展示了不同的驾驶场景,以强调现有benchmark的局限性。观察到的自车行驶路线显示为白色,假设的规划路线显示为红色。(a) 没有目标会导致交叉路口的模糊性。(b) 位移指标没有考虑到驾驶的多模态特性。(c) 开环评估不考虑车辆交互

现有的真实世界benchmark主要关注短期运动预测(prediction)[6,4,11,8],而不是规划,这一点体现在缺乏高层目标、度量指标的选择以及开环评估上。预测侧重于他车的行为,而规划则与自车的行为有关。预测通常是多模态的,即对每个智能体都需要预测出N条最可能的轨迹。相比之下,规划通常是单模态的(应急规划/contingency planning除外),我们只预测一条轨迹。例如,在图1a中,在十字路口左转或右转是同样可能的选择。预测数据集缺乏一条基线导航路线来指示车辆的高层目标。在图1b中,立即并道或稍后并道的选项同样有效,但常用的基于L2距离的度量(minADE、minFDE和未命中率miss rate)会惩罚数据中未被观察到的那个选项。直觉上,预测轨迹与观测轨迹之间的距离在多模态场景中并不是一个合适的评价指标。在图1c中,是否继续超车或返回原车道的决定应基于所有他车的连续动作,这在开环评估中是不可能的。缺乏闭环评估会导致系统性漂移,使得难以在较短的时间范围(3-8秒)之外进行评估。
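As a concrete illustration of the L2 displacement metrics criticized above (minADE, minFDE, miss rate) and why they only reward the single behaviour that happened to be logged, here is a small Python sketch. The array shapes and the 2 m miss threshold are common conventions assumed for this example, not the benchmark's official metric definitions.

```python
import numpy as np

def displacement_metrics(predictions, ground_truth, miss_threshold_m=2.0):
    """Compute minADE, minFDE and miss rate for one agent.

    predictions:  (K, T, 2) array, K candidate trajectories of T (x, y) waypoints.
    ground_truth: (T, 2) array, the single trajectory observed in the log.
    """
    # Per-waypoint Euclidean (L2) errors for every candidate: shape (K, T).
    errors = np.linalg.norm(predictions - ground_truth[None], axis=-1)

    ade_per_mode = errors.mean(axis=1)   # average displacement of each candidate
    fde_per_mode = errors[:, -1]         # final displacement of each candidate

    min_ade = ade_per_mode.min()
    min_fde = fde_per_mode.min()
    # "Miss" if even the best candidate ends more than the threshold from the log.
    miss = float(fde_per_mode.min() > miss_threshold_m)
    return min_ade, min_fde, miss

# In the Fig. 1b situation, "merge now" and "merge later" are both reasonable, but
# whichever option is absent from the log is penalized by these metrics.
```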

We instead provide a planning benchmark to address these shortcomings. Our main contributions are:

  • The largest existing public real-world dataset for autonomous driving with high-quality auto-labeled tracks from 4 cities.
  • Planning metrics related to traffic rule violation, human driving similarity, vehicle dynamics, goal achievement, as well as scenario-specific metrics.
  • The first public benchmark for real-world data with a closed-loop planner evaluation protocol.

相反,我们提供了一个规划benchmark来解决这些缺点。我们的主要贡献是:

  • 现有最大的自动驾驶公开真实世界数据集,包含来自4个城市的高质量自动标注轨迹。
  • 与违反交通规则、与人类驾驶的相似度、车辆动力学、目标达成相关的规划指标,以及基于特定场景的指标(示意代码见本列表之后)。
  • 首个针对真实世界数据、采用闭环规划器评估协议的公开benchmark。
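Below is a rough sketch of the kind of closed-loop planning metrics named in the second bullet (vehicle dynamics/comfort and a simple collision check). The thresholds, function names, and circle-based collision test are illustrative placeholders, not the official nuPlan metric implementations.

```python
import numpy as np

def comfort_violations(ego_trajectory, dt=0.1, max_accel=3.0, max_jerk=5.0):
    """Count time steps where acceleration or jerk exceed (placeholder) comfort limits.

    ego_trajectory: (T, 2) array of simulated (x, y) positions from a closed-loop rollout.
    """
    velocity = np.gradient(ego_trajectory, dt, axis=0)      # (T, 2) finite-difference velocity
    speed = np.linalg.norm(velocity, axis=1)                 # (T,) speed profile
    accel = np.gradient(speed, dt)
    jerk = np.gradient(accel, dt)
    return int(np.sum((np.abs(accel) > max_accel) | (np.abs(jerk) > max_jerk)))

def collision_count(ego_trajectory, agent_trajectories, radius_m=1.5):
    """Crude circle-overlap collision check against every agent at every time step.

    agent_trajectories: (A, T, 2) simulated positions of A background agents.
    """
    distances = np.linalg.norm(agent_trajectories - ego_trajectory[None], axis=-1)
    return int(np.sum(distances < 2 * radius_m))
```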