[Autonomous Driving] Training Set Optimization | Does Expanding the Amount of Data Really Improve Training Accuracy? (Part 1)

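The article discusses how training data affects the performance of advanced driver-assistance systems (ADAS). At present, large volumes of data are used to train neural networks in order to improve ADAS functionality, but this can mean that inaccurate or unsuitable data is included. ARRK Engineering proposes an approach that focuses on the quality rather than the quantity of data sets in order to improve the functional safety and development efficiency of ADAS, which is crucial for the development of autonomous driving.
(Summary generated automatically by CSDN.)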

Background

Advanced driver-assistance systems (ADAS) are now not only standard equipment in new cars but also an important milestone on the road to autonomous driving.

Whether keeping the lane, maintaining the distance to the vehicle in front, or parking in tight spaces: the more these technical assistants are expected to do independently, the better the neural networks underlying them have to be trained. Accordingly, the data sets used continue to grow.

Yet this raises a question: to what extent do the training data actually reflect the operational domains of the ADAS? This question is often treated as secondary and is rarely checked. To reduce the systems' susceptibility to errors, the usual response so far has been to keep increasing the quantity of data. The result is unnecessarily complex, lengthy and therefore inefficient development processes.

ARRK Engineering has therefore developed an approach for analyzing the models with regard to concrete operational domains and relevant scenarios, such as urban traffic or motorways. Data that is inaccurate or distorts reality can be corrected or removed, so that the ADAS can be trained in a targeted, reliable and resource-efficient manner.
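To make the idea of correcting or removing unsuitable data more concrete, here is a minimal Python sketch. It is not ARRK Engineering's method, which this article does not describe in detail. The `Sample` record, its fields, and the bounding-box plausibility rule are all hypothetical and chosen only for illustration: samples are first restricted to one operational domain, and samples whose annotation is geometrically impossible are set aside for correction or removal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical annotated sample; all field names are assumptions."""
    sample_id: str
    scenario: str      # e.g. "urban" or "motorway"
    box: tuple         # (x_min, y_min, x_max, y_max) in pixels
    image_size: tuple  # (width, height) in pixels


def box_is_plausible(sample):
    """Return True if the box has positive area and lies inside the image."""
    x_min, y_min, x_max, y_max = sample.box
    width, height = sample.image_size
    return 0 <= x_min < x_max <= width and 0 <= y_min < y_max <= height


def select_for_domain(samples, domain):
    """Split the samples of one domain into usable and suspect ones."""
    kept, rejected = [], []
    for sample in samples:
        if sample.scenario != domain:
            continue  # outside the operational domain under analysis
        if box_is_plausible(sample):
            kept.append(sample)
        else:
            rejected.append(sample)  # candidate for correction or removal
    return kept, rejected


samples = [
    Sample("a", "urban", (10, 20, 110, 220), (1280, 720)),
    Sample("b", "urban", (300, 50, 250, 90), (1280, 720)),  # x_max < x_min
    Sample("c", "motorway", (0, 0, 50, 50), (1280, 720)),
]
kept, rejected = select_for_domain(samples, "urban")
print([s.sample_id for s in kept], [s.sample_id for s in rejected])
```

In a real pipeline the plausibility rule would be replaced by whatever checks matter for the domain in question; the point of the sketch is only that the selection is made per operational domain rather than over the whole data pool.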


Current Situation

Well-known OEMs are leading the way, mobility start-ups are following suit, and consumers want it: more and more vehicles are being equipped with level 2 and level 3 driver-assistance systems.

Every day, numerous road users therefore rely on lane keeping assist or autosteer (LKA/LCA), automated parking and adaptive cruise control (ACC). General road safety, and thus the safety of all road users, depends to a large extent on these systems functioning properly. To ensure this, their neural networks are trained on huge data sets. The models are intended to represent all possible situations that the vehicle may encounter in everyday traffic and thus serve as the basis for recognition and calculation behind the autonomous reactions of the ADAS in the field.


Huge data sets: the vicious cycle of mass

The more complex the functionalities of the different ADAS, the more specific the data sets required for their training. To cover all possible traffic situations, the data sets have been expanded more and more in recent years, with the focus primarily on mass, i.e. the sheer number of recording hours or of annotated objects in different weather and lighting conditions.
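A large total count says little about how the data is spread across conditions. The following Python sketch illustrates this under stated assumptions: the record layout (`weather`, `lighting`), the target shares, and the tolerance are all hypothetical and do not come from the article. It compares the share of each weather and lighting combination in a data set with the share expected in the operational domain and reports where the data set falls short.

```python
from collections import Counter


def condition_shares(records):
    """Fraction of records per (weather, lighting) combination."""
    counts = Counter((r["weather"], r["lighting"]) for r in records)
    total = sum(counts.values())
    return {condition: n / total for condition, n in counts.items()}


def coverage_gaps(shares, target, tolerance=0.05):
    """Conditions whose share is below the target by more than `tolerance`."""
    gaps = {}
    for condition, wanted in target.items():
        have = shares.get(condition, 0.0)
        if wanted - have > tolerance:
            gaps[condition] = round(wanted - have, 3)
    return gaps


# Hypothetical data set: 8 clear daytime records, 2 rainy night records.
records = (
    [{"weather": "clear", "lighting": "day"}] * 8
    + [{"weather": "rain", "lighting": "night"}] * 2
)
# Hypothetical target distribution for the operational domain.
target = {("clear", "day"): 0.5, ("rain", "night"): 0.5}

shares = condition_shares(records)
gaps = coverage_gaps(shares, target)
print(gaps)
```

Adding more clear daytime recordings to this toy data set would increase its size while making the reported gap for rainy night conditions worse, which is the kind of imbalance that counting hours or objects alone does not reveal.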


However, this inevitably increases the proportion of data that is inaccurate or simply unsuitable for a particular operational domain. To keep newly developed ADAS functioning reliably, this quality deficit has in turn been compensated for with even more quantity: a vicious cycle. This has already led to very long development times with many iteration loops, in which the training of the neural networks alone takes several weeks.

Technical Solution

To escape this dilemma, the automotive industry needs to shift its focus away from the quantity and towards the quality of data sets. The machine learning specialists at ARRK Engineering have therefore developed an approach to validate the processes with regard to an operational domain and to correct them where necessary. In this way, development can be made more efficient and, more importantly, the functional safety of the ADAS can be increased, an essential prerequisite for its further development towards higher levels of autonomous driving.

The above content was created by Václav Diviš, Senior Expert Machine Learning at ARRK Engineering, and presented at the SafetyAI 2023 session of the AAAI conference on artificial intelligence. In the next issue, we will continue with the case studies from the report and provide a way to download the full report, so stay tuned!


Senior Expert Machine Learning: Václav Diviš

If you are also interested in ADAS, follow us on WeChat (ARRK 德国埃尔科) for more details.


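About ARRK Engineering
ARRK Engineering GmbH, formerly P+Z Engineering GmbH, was founded in 1967 and is headquartered in Munich, Germany, with subsidiaries or offices in the United Kingdom, Romania, Japan and China. It is a long-term partner of many leading international brands in the automotive, aerospace and other industries, providing high-end engineering development and consulting services.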

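埃尔科工程技术开发(上海)有限公司 is the wholly owned subsidiary of ARRK Engineering GmbH in China. It was established in September 2019 and aims to provide world-class engineering services to the Chinese automotive sector.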


 
