An Auto-tuning Framework for Autonomous Vehicles

As scenarios grow more complex, tuning an autonomous-driving motion planner becomes difficult. To address this, a data-driven auto-tuning system was developed on top of the Apollo framework: it uses expert driving data together with automatically labeled information about the surrounding environment, and improves performance via imitation learning and optimization of the reward functional. The system consists of online trajectory optimization and offline parameter tuning, and aims to adapt to different driving scenarios while ensuring both optimality and robustness.

Motivation:

As the scenario becomes more complicated, tuning to improve the motion planner performance becomes increasingly difficult. To systematically solve this issue, we develop a data-driven auto-tuning framework based on the Apollo autonomous driving framework.
【CC】Simply put: as scenarios get more and more complex, the only way to solve this problem systematically is a data-driven approach.

Third, the expert driving data and information about the surrounding environment are collected and automatically labeled.
【CC】So this framework can label data automatically? The whole design is oriented toward automation.

Typically, two major approaches are used to develop such a map: learning via demonstration (imitation learning) or optimizing the current reward/cost functional.
【CC】Background: the two typical ways of building a motion planner are either imitation learning or the optimization route.

In an imitation learning system, the state-to-action mapping is directly learned from expert demonstration; a multimodal distribution loss function is necessary but will slow the training process.
【CC】The typical idea of imitation learning is to learn a distributional map state -> action from data; training is generally slow.
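As an aside, a minimal sketch of what such a multimodal (mixture-density) imitation loss could look like, assuming a PyTorch mixture-density output head. The paper does not publish its network or loss, so the layer sizes, component count, and names below are illustrative only:

```python
import torch
import torch.nn as nn

class MDNHead(nn.Module):
    """K-component Gaussian mixture over actions for a given state.
    Illustrative only; not the paper's actual architecture."""
    def __init__(self, state_dim, action_dim, n_components=5, hidden=128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.pi = nn.Linear(hidden, n_components)                # mixture weights
        self.mu = nn.Linear(hidden, n_components * action_dim)   # component means
        self.log_sigma = nn.Linear(hidden, n_components * action_dim)
        self.k, self.a = n_components, action_dim

    def forward(self, state):
        h = self.backbone(state)
        log_pi = torch.log_softmax(self.pi(h), dim=-1)
        mu = self.mu(h).view(-1, self.k, self.a)
        sigma = self.log_sigma(h).view(-1, self.k, self.a).exp()
        return log_pi, mu, sigma

def mdn_nll(log_pi, mu, sigma, expert_action):
    """Negative log-likelihood of the expert action under the mixture;
    this is the 'multimodal distribution loss' the quote refers to."""
    comp = torch.distributions.Normal(mu, sigma)
    # log-prob per component: sum over action dims -> (batch, K)
    log_prob = comp.log_prob(expert_action.unsqueeze(1)).sum(dim=-1)
    return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()
```

Compared with a plain MSE loss, this kind of loss can represent the fact that an expert may take several distinct but equally valid actions in the same state, which is also part of why training is slower.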

Optimizing through a reward functional: the reward/cost functionals are typically provided by an expert or learned from data via inverse reinforcement learning.
【CC】The optimization route: the cost function is either defined by experts or learned via IRL; this paper takes the IRL route. A simplified sketch of that idea follows.
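A highly simplified sketch of learning a linear cost functional by ranking the human trajectory above sampled candidate trajectories. The paper's own IRL formulation is more elaborate and is not reproduced here; the logistic ranking loss, feature layout, and function names below are assumptions:

```python
import numpy as np

def ranking_irl(expert_feats, sampled_feats, lr=0.01, epochs=200):
    """Learn linear cost weights w so that cost(expert) < cost(sampled candidates).

    expert_feats:  (N, D)    feature vector of the human trajectory per scenario
    sampled_feats: (N, M, D) features of M candidate trajectories per scenario
    Uses a plain logistic ranking loss; purely illustrative.
    """
    n, m, d = sampled_feats.shape
    w = np.zeros(d)
    for _ in range(epochs):
        grad = np.zeros(d)
        for i in range(n):
            diff = sampled_feats[i] - expert_feats[i]   # (M, D)
            margin = diff @ w                           # cost gap per candidate
            p = 1.0 / (1.0 + np.exp(margin))            # prob of mis-ranking
            grad += -(p[:, None] * diff).sum(axis=0) / m
        w -= lr * grad / n                              # gradient descent step
    return w

# Hypothetical usage with features such as [jerk, lateral offset, clearance]:
expert = np.random.rand(50, 3)
samples = expert[:, None, :] + 0.3 * np.random.rand(50, 10, 3)  # perturbed candidates
weights = ranking_irl(expert, samples)
```

The learned weights are then plugged back into the cost functional used by the online planner.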

Expert driving data from different scenarios are easy to collect but extremely difficult to reproduce in simulation, since the ego car requires interaction with the surrounding environment.
【CC】The data-collection pain point: data are easy to collect but hard to reproduce in simulation, because the ego car interacts with the environment.

We build an auto-tuning system that includes both online trajectory optimization and offline parameter tuning.
【CC】Trajectory optimization runs online, parameter tuning runs offline. According to the figure below, the trained cost function / parameters are fed back into the online system. However, the data flow is not shown, so it is unclear whether a digital-twin style replay is performed. A minimal sketch of this online/offline loop follows the figure.
[Figure: auto-tuning framework, with the cost function / parameters produced by offline tuning fed back into the online trajectory optimizer]
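To make the loop concrete, here is a minimal sketch of how offline-tuned cost weights could be consumed by the online trajectory optimizer. The interface, weight values, and feature layout are made up for illustration and are not Apollo's actual API:

```python
import numpy as np

class LinearCost:
    """Online side: scores a candidate trajectory's feature vector with
    weights produced by the offline tuner."""
    def __init__(self, weights):
        self.w = np.asarray(weights, dtype=float)

    def __call__(self, features):
        return float(self.w @ features)

def select_trajectory(candidate_features, cost):
    """Online trajectory optimization reduced to its last step:
    pick the candidate with the lowest learned cost."""
    return int(np.argmin([cost(f) for f in candidate_features]))

# Offline: weights learned from logged expert data (e.g. the ranking sketch above),
# then shipped to the online planner as a plain parameter file.
w = np.array([0.4, 1.2, 0.1])                     # illustrative values

# Online: each planning cycle scores its candidate trajectories.
cost = LinearCost(w)
candidates = [np.array([0.2, 0.5, 1.0]),          # feature vectors, e.g.
              np.array([0.1, 0.3, 2.0])]          # [jerk, lateral offset, clearance]
best_idx = select_trajectory(candidates, cost)
```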
Our motion planner module is not tied to a specific approach.
【CC】Given the Apollo framework, my guess is that the motion planner may exist not only as an optimization-based version but possibly also as an imitation-learning version. What needs to be done is to "define" / "learn" a good cost functional to evaluate the optimized/generated results, and that is exactly the focus of the IRL in this paper.

The performance of these motion planners is evaluated with metrics that quantify both optimality and robustness.
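Purely as an illustration of what "optimality" and "robustness" metrics might look like (the paper's exact metric definitions are not reproduced here), one could measure the cost gap to the human trajectory and its spread across scenarios:

```python
import numpy as np

def evaluate(planned_costs, expert_costs):
    """Illustrative metrics, not the paper's exact definitions.

    planned_costs: per-scenario cost of the planner's chosen trajectory
    expert_costs:  per-scenario cost of the human-driven trajectory
    """
    gap = np.asarray(planned_costs) - np.asarray(expert_costs)
    optimality = gap.mean()   # how close to expert behaviour on average
    robustness = gap.std()    # how consistent that gap is across scenarios
    return optimality, robustness
```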
