Applying Machine Learning to ACC Target Selection

路人贾

Last modified: 2024-07-15 18:55:09

Tags: machine learning, artificial intelligence

First published: 2024-07-12 17:33:58

Article link: https://blog.csdn.net/j1041510723/article/details/140372409

Abstract

Vehicles will be more complex, safe, and intelligent in the future. For instance, with the support of advanced driver assistance systems (ADAS), the safety and comfort of the driver and passengers can be significantly improved. This degree project proposes data-driven solutions for adaptive cruise control (ACC) target selection, which select one of the preceding vehicles as the primary target in a way similar to the choice of human drivers. This master's degree project was carried out at Scania CV AB. A shared network and a shared-LSTM network were used to select the primary target. In addition, a novel machine-learning-based target selection model (the compare-target model) was designed, which considers all neighboring vehicles together by comparing them. A compare-target network and a compare-target XGBoost model were developed based on the compare-target model. In total, four machine learning methods were adopted to select the primary target for ACC: a shared network, a shared-LSTM network, a compare-target network, and a compare-target XGBoost model. These methods were compared and analyzed. Fine-tuning was adopted to overcome the data imbalance problem of rare situations. The compare-target XGBoost model achieves 94.85% accuracy on the test set.

1. Introduction

This chapter introduces the subject of machine learning for adaptive cruise control target selection. Autonomous systems are expected to cope with many complex situations. However, when situations are complicated and changeable, solutions are difficult to design through classic programming. With a relatively large amount of data, it becomes possible to tackle such situations through machine learning. The research questions and objectives are formulated in this chapter. A summary of contributions and delimitations is presented, followed by ethical considerations and an outline of the thesis.
In modern society, transportation is closely tied to people's lives and is an indispensable part of them. Many developed cities around the world face problems caused by excessive use of private cars, such as traffic jams, frequent accidents, parking difficulties, energy shortages, noise pollution, and environmental pollution. These problems have severely reduced the quality of life [1]. Autonomous or semi-autonomous vehicles are expected not only to improve the efficiency and safety of the transportation system, but also to offer other benefits such as high flexibility, low operating cost, and environmental protection. With these advantages, autonomous vehicles can serve as a useful supplement to intensive urban public transportation, providing high-quality regional transportation services [2].
When a driver is less involved in operating a vehicle, there is an increased demand for sophisticated solutions that make correct driving decisions with respect to surrounding traffic. Adaptive cruise control (ACC) is an assist function that relieves the driver from having to adapt the vehicle's speed and distance to surrounding traffic. Based on data from different sensors, such as radar and camera, it automatically keeps a set time gap to one preceding vehicle to avoid vehicle-to-vehicle collisions [3, 4]. For ACC, it is vital to select the right preceding vehicle as the primary target, both to avoid accidents and to optimize fuel consumption. The right choice also makes the driver feel comfortable and able to rely on the system. For this reason, Scania is exploring alternative ways of choosing the right target. This study is part of a project in the Autonomous System Group at Scania CV AB, and Scania provides the necessary data for this thesis project.

1.1 Problem Statement

Scania trucks carry several sensors, such as radar and camera, that collect information about the environment. Through data fusion, a range of information about the surrounding environment and vehicles can be obtained. For example, the position of each vehicle relative to the lane can be calculated from this information by a corresponding algorithm. The primary target is chosen by a target selection algorithm, and the information about the primary target is transmitted to the longitudinal controller so that the ACC system can maintain a chosen distance. A simple solution for target selection is to choose the vehicle that drives in the same lane. As shown in Fig. 1.1(a), the primary target is then the preceding vehicle in the same lane. However, when a vehicle tries to cut in, as shown in Fig. 1.1(b), the primary target could be either the cut-in vehicle or the one in the same lane. It is also difficult to estimate the driving lane when the lane is only partly observed or when there are no lane markings.
Figure 1.1: The situations of primary target selection [5].
A human driver bases each decision on personal experience, acquired knowledge, and a limited focus. Most existing algorithms consider only a few situations (those considered significant) and the status of neighboring vehicles. Possible meaningful targets are detected using relative distance and velocity together with in-lane target detection and motion-based analysis [5]. The target is selected based on position and confidence. For example, in most cases the algorithm selects the closest vehicle in the same lane as the target [6, 7]. When the preceding vehicle changes to another lane or is about to stop, the algorithm considers choosing another target. One advantage of classic programming is the ability to adapt the behavior to situations that may not have occurred yet. In the real world, however, there are many complex traffic situations, and it is hard to cover all of them with classical programming. With access to a large amount of training data and real-time sensor data, machine learning algorithms should be able to make more informed decisions than the current algorithm. Moreover, the truck could then be used in more complex traffic situations and new environments, which would enhance its adaptability and robustness.
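The rule-based baseline described above (pick the closest vehicle in the ego lane) can be sketched in a few lines. This is only an illustration, not Scania's algorithm: the `Track` fields, the fixed lane width, and the coordinate convention (lateral offset measured from the ego lane centre) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Track:
    """One fused object track. All field names are hypothetical."""
    track_id: int
    longitudinal_m: float  # distance ahead of the ego truck
    lateral_m: float       # offset from the ego lane centre


def closest_in_lane(tracks: Sequence[Track],
                    lane_width_m: float = 3.5) -> Optional[Track]:
    """Return the nearest track that is ahead of the ego truck and inside the ego lane."""
    half_width = lane_width_m / 2.0
    candidates = [t for t in tracks
                  if t.longitudinal_m > 0.0 and abs(t.lateral_m) <= half_width]
    # default=None covers the case where no vehicle is in the lane
    return min(candidates, key=lambda t: t.longitudinal_m, default=None)


tracks = [Track(1, 60.0, 0.2), Track(2, 35.0, 1.9), Track(3, 80.0, -0.4)]
primary = closest_in_lane(tracks)
```

In this example track 2 is the nearest vehicle but sits just outside the assumed half lane width, so the rule ignores it and keeps track 1. That is exactly the cut-in ambiguity the text describes: a small change in the lane estimate flips the decision.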
The most challenging problems are data imbalance and noise. The data were collected in the real world, where some traffic situations are rare; in other words, the data are imbalanced. In addition to the imbalanced traffic situations, the algorithm also needs to handle noisy and inaccurate data, such as poor lane estimation in curves or incorrect prediction of ego vehicle movement. Part of the data is also incomplete because of the sensors (a value does not always exist).
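To make the two data problems concrete, here is a minimal sketch of two common generic remedies: inverse-frequency class weights for imbalance, and a validity flag next to each filled value for missing signals. Neither is claimed to be what the thesis does (the abstract reports fine-tuning for rare situations); the label names and the default fill value are assumptions for illustration.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes count more in the loss."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def fill_missing(value, default=0.0):
    """Return (filled_value, valid_flag); the flag lets a model see that the signal was absent."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return default, 0.0
    return float(value), 1.0


# Hypothetical labels: nine ordinary following samples and one rare cut-in sample
labels = ["follow"] * 9 + ["cut_in"]
weights = inverse_frequency_weights(labels)
```

With these weights the single rare sample contributes as much to a weighted loss as the nine common ones together, which is the usual motivation for this scheme.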
Several interesting research questions arise:
• How should the training data be chosen and preprocessed?
• How can the challenges of the data (imbalanced, incomplete, incorrect) be solved or overcome?
• What kind of model or algorithm can be chosen and extended to fit this project?
• How can the performance of the algorithm(s) be validated and evaluated?

1.2 Objectives and Tasks

The structure of the ACC system is shown in Fig. 1.2. This thesis focuses on the primary target selection module. The objective is to investigate whether machine learning algorithms can correctly select the preceding vehicle. The proposed methods should be implemented, evaluated, and compared with each other. Furthermore, the proposed solution should be suitable for autonomous heavy-duty vehicles, which means that the method should be based on the data from the sensor fusion module in Fig. 1.2.
Figure 1.2: The ACC system structure of the truck.
The tasks can be split into three main parts. The first part is to preprocess the data. The data used for this project are log files that contain the information about the ego truck and neighboring vehicles after sensor fusion. The data are time series recorded every ten milliseconds and were labeled manually. The second part is to develop the machine learning algorithms, which includes model selection, training, and testing. The third part is to evaluate and compare the performance of the different machine learning algorithms and to visualize the selection process.
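Because the logs are time series sampled every ten milliseconds, one plausible preprocessing step for a sequence model is to cut them into fixed-length windows. The sketch below assumes a window of 100 samples (1 s) and a stride of 50 samples (0.5 s); both numbers are illustrative choices, not values from the thesis.

```python
SAMPLE_PERIOD_S = 0.010  # one log sample every ten milliseconds, as stated in the text


def sliding_windows(samples, window_len, stride):
    """Cut a chronologically ordered log into fixed-length, possibly overlapping windows."""
    last_start = len(samples) - window_len
    # range() is empty when the log is shorter than one window
    return [samples[i:i + window_len] for i in range(0, last_start + 1, stride)]


log = list(range(300))  # stand-in for 3 s of fused samples
windows = sliding_windows(log, window_len=100, stride=50)
window_duration_s = 100 * SAMPLE_PERIOD_S
```

Overlapping windows yield more training sequences from the same log, at the cost of correlated samples, so train and test splits are usually made per log file rather than per window.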

1.3 Contributions

A well-functioning model for target selection is an important module in an adaptive cruise control system. To the best of the author's knowledge, there is currently no reliable machine-learning-based model for adaptive cruise control target selection. The main contribution is the implementation and evaluation of several machine learning methods for ACC target selection. In addition, a novel method of target selection by comparing vehicles is proposed.
The decisions of primary target selection are based on the information collected through sensors. The most crucial decision-making factors are the states of the surrounding vehicles, but other factors such as the environment and the road also affect decision-making to some degree. With traditional methods, it is difficult to model and analyze the relations among a variety of complex factors, and when the information obtained by the sensors changes, the algorithm needs to change accordingly. A machine-learning-based approach avoids manual analysis of the relationships between the various signals, and a complex model can take more factors into account. This is useful because the available information may differ between types of trucks. With machine learning, the process of creating a specific mathematical algorithm for each truck can be avoided: a machine learning model should be applicable to a variety of similar trucks, requiring only the corresponding data to train it.
The novelty of the proposed method (the compare-target model, see Section 3.2) is that it considers all surrounding vehicles together instead of one by one. Scania's current algorithm considers the situation of each neighboring vehicle separately and makes decisions based on the condition of each vehicle. In reality, however, target selection should consider all vehicles at the same time, since the state of one vehicle affects the choices made about the others. Making decisions by considering all vehicles simultaneously is therefore more reasonable. We develop a compare-target network and a compare-target XGBoost model based on this method. Their performance is evaluated and compared with that of the shared models (the shared network and the shared-LSTM network, see Section 3.2).
Answering the research questions listed in Section 1.1 can provide insight into how to properly build a decision model for target selection and into which features and situations matter most.

1.4 Ethical Considerations

Machine learning and deep learning models are being deployed in an increasing number of domains. Deep learning, as an essential branch of artificial intelligence, is also receiving more and more attention.
Deep learning has an increasing impact on human life, but its internal operations are often opaque. After a deep learning system is trained, it is hard to see how it arrives at a decision. In many cases this is unacceptable even if the system gets the right answer. The growing number of weaknesses exposed in deep learning is drawing public attention to artificial intelligence. This is especially true in the field of driverless cars, where autonomous driving uses similar deep learning techniques for navigation, which has led to well-known disasters and fatalities [8]. From a legal point of view, GDPR states that individuals have the right to know the reasoning behind a decision that has affected them adversely, even if the reasoning is purely algorithmic. Thus, model interpretability and explainability are becoming increasingly important.
Still, deep learning is undeniably a powerful tool. It has made applications such as facial recognition and speech recognition commonplace, at a level that was almost impossible to achieve ten years ago. It is therefore hard to imagine that deep learning will be abandoned. It is more likely that deep learning methods will be modified or enhanced, for example by combining deep learning with traditional methods to improve the interpretability of the model. We hope that a better understanding of machine learning models will provide more insight into high-dimensional data and contribute to addressing vulnerabilities in current decision models.

1.5 Thesis Outline

This thesis follows the typical structure of a degree report. Chapter 2 provides the background and theory necessary for understanding the problems and the proposed solutions, and reviews the literature on adaptive cruise control target selection, machine learning, and time series to identify a promising approach. Chapter 3 describes the dataset and its preprocessing, and presents the approach for target selection and the methods used for performance evaluation. The results and evaluations of target selection are presented and analyzed in Chapter 4 and discussed in Chapter 5. Finally, Chapter 6 summarizes the work in this thesis and gives suggestions for future work.

2. Background

This chapter presents the background of this thesis project and the related work. Section 2.1 describes the concept of adaptive cruise control, presents the strengths and prerequisites of ACC, and briefly introduces the current algorithm. Section 2.2 presents the theory behind the machine learning methods used in this thesis, including artificial neural networks, XGBoost, and LSTM. Section 2.3 presents related work in this field, including some schemes for primary target selection (other than the one used at Scania).

2.1 Adaptive Cruise Control (ACC)

Adaptive cruise control builds on conventional cruise control. ACC improves the comfort and energy efficiency of driving while ensuring safety, and overcomes some limitations of human drivers. The distance and relative speed between the ego vehicle and the preceding vehicle are measured by different sensors and fused in real time. Appropriate control signals are then calculated to adjust the ego vehicle speed and thereby control the distance automatically. The control system of adaptive cruise control must be provided with the information about a primary target, and some research therefore focuses on the primary target selection algorithm.
Human limitations, the strengths of ACC, and target selection algorithms are discussed in the following subsections.

2.1.1 Human Limitations

The limitation of the driver's judgment of and reaction to surrounding traffic conditions (e.g., the driving states of neighboring vehicles) is a critical cause of traffic flow instability, traffic congestion, and traffic accidents [9]. Researchers generally believe that if traffic flow characteristics cannot be effectively improved, it is difficult to achieve a fundamental breakthrough in optimizing road capacity and traffic safety [10]. Automated driving technology is expected to improve traditional traffic flow characteristics at the microscopic vehicle level, thus providing an effective way to solve traffic problems.

2.1.2 Strengths of ACC

Adaptive cruise control is an essential type of advanced driver-assistance system (ADAS) in the study of autonomous vehicles. ACC vehicles can obtain the driving status of preceding vehicles in real time through onboard detection devices and can perceive traffic conditions in a more timely and accurate manner than average drivers.
ACC systems can automatically adjust the vehicle speed to ensure safety and to improve driving comfort and energy efficiency [11]:
• Safety: The average time a normal driver needs to become aware of a situation and react is about 1.0 to 1.3 seconds [12]. The response time of ACC is shorter than that of human drivers, so ACC can avoid most traffic accidents more effectively.
• Comfort: The driver needs to concentrate while driving and constantly operate the vehicle to maintain a safe braking time. In traffic congestion, the vehicle repeatedly moves forward and stops, and the driver has to coordinate hands and feet continuously. This is the main cause of driver fatigue, and ACC can free the driver from this repetitive and stressful task.
• Energy efficiency: Low-carbon living and energy conservation are themes and trends of social development. More emissions are generated when the driving speed changes frequently, and ACC keeps the vehicle driving more smoothly. Furthermore, ACC can maintain a proper distance [13], so it can effectively improve road capacity and ease traffic congestion, which also brings economic benefits. Recent studies suggest that if the proportion of vehicles equipped with ACC systems on a highway reaches 25%, highway congestion can be eliminated [14].

2.1.3 Target Selection

Since real highway situations involve more than one vehicle, and various transitions occur between the ego truck and the surrounding vehicles, a proper target selection strategy is needed to apply the adaptive cruise control system to multi-vehicle scenarios. To this end, the algorithm needs to determine which surrounding vehicle is the best target for the ACC system in the current traffic situation. After determining the target within the lane, the primary target selection module forwards its information to the longitudinal controller, which navigates the ego truck smoothly and guarantees safety in complex traffic situations. Existing primary target selection algorithms are based on classic programming and consider several significant situations. The truck obtains information about its surroundings through sensors, and after data fusion and calculation, some states of the surrounding vehicles and the ego truck can be obtained. Possible targets are determined using position, speed, lane placement, confidence, and other information. Finally, based on a series of criteria, the algorithm chooses the most appropriate of these possible targets as the primary target.

2.2 Machine Learning

The purpose is to use machine learning methods to address target selection for adaptive cruise control. Artificial neural networks (ANNs) are computational models that mimic the structure and function of biological neural networks and have been used to solve a wide variety of problems. Extreme gradient boosting (XGBoost) has shown outstanding performance in many machine learning competitions. The dataset suffers from imbalance and noise, and XGBoost is a tree-based approach that is less sensitive to such data problems. The dataset also contains time information, so LSTM is used for the time sequences. The concepts of artificial neural networks, extreme gradient boosting, and long short-term memory are introduced in this section.

2.2.1 Artificial Neural Networks

In computational science, artificial neural networks are models inspired by and abstracted from the biological central nervous system, and they are now commonly used in pattern recognition and machine learning. The nervous system of animals is made up of a large number of neurons, or nerve cells, and the stimulation signals of peripheral neurons are continuously transmitted for communication. Similarly, artificial neural networks are computational models consisting of a large number of connected nodes (or neurons). Each node represents a specific output function called the activation function. Starting from some input data, each node receives its input and produces an output that serves as the input of the next node. In this way, the information is passed from node to node until the output is obtained. The connection between every two nodes carries a weighting value for the transmitted signal, called the weight, which is equivalent to the memory of the artificial neural network. Just as the animal's nervous system constantly adjusts the synaptic connections between neurons while learning, the artificial neural network adjusts the weights of the connections between nodes. The output of the network depends on the connections in the network, the weight values, and the activation function [15].
The network mainly refers to the connections between neurons at different levels of the system. Take a typical three-layer artificial neural network as an example. The first layer contains only input neurons, which transmit the input signals to the neurons of the second layer through synapses (connections between neurons). The signals are then transmitted to the third layer, which contains the output neurons. More complex networks may contain more layers of neurons, and more input and output neurons. A multi-layer model has a stronger ability to describe and handle nonlinear problems.
A multi-layer artificial neural network uses back-propagation to adjust the weights [16, 17, 18] by computing the gradient of the cost function. The input data are repeatedly fed into the network for calculation, and the network is adjusted each time. An error is obtained by comparing the actual output of each calculation with the desired output. According to this error, the weights are adjusted so that the next time the same data are input, the output has a smaller error. This repetitive process is called learning (training).
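The forward pass and the back-propagation update described above can be sketched for a three-layer network (input, one hidden layer, one output neuron). The layer sizes, sigmoid activations, squared-error cost, learning rate, and the single training sample are all choices made for this illustration and do not reflect the networks used in the thesis.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Propagate the input through the hidden layer to a single output neuron."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One gradient-descent update for the cost 0.5 * (output - target)**2."""
    hidden, output = forward(x, w_hidden, w_out)
    delta_out = (output - target) * output * (1.0 - output)
    # Hidden-layer error terms use the output weights from before this update
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    for j in range(len(w_out)):
        w_out[j] -= lr * delta_out * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * delta_hidden[j] * x[i]
    return 0.5 * (output - target) ** 2


random.seed(0)
w_hidden = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(4)]
w_out = [random.uniform(-1.0, 1.0) for _ in range(4)]
x, target = [0.5, -0.2, 1.0], 1.0  # the last input acts as a constant bias term
losses = [train_step(x, target, w_hidden, w_out) for _ in range(200)]
```

Feeding the same sample repeatedly makes the recorded error shrink over the iterations, which is the "learning" loop the paragraph describes.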

2.2.2 Extreme Gradient Boosting (XGBoost)

eXtreme Gradient Boosting (XGBoost) is a boosting algorithm based on regression trees that offers fast running speed and good performance; it is an improvement on the Gradient Boosting Decision Tree (GBDT) [19]. Its effectiveness has been verified in a large number of machine learning and data mining competitions [20, 21].
Boosting is a kind of ensemble learning algorithm based on the concepts of strong and weak learnability proposed by Kearns and Valiant. It has been shown that a problem is strongly learnable if and only if it is weakly learnable [22].
First, a number of basic weak models are generated and trained. After obtaining the results of the individual models, the boosting method weights these results to produce the final output. In the boosting framework, the accuracy of each weak classifier may be low, but after weighted fusion the final result is greatly improved [23].
Gradient Boosting Decision Tree (GBDT) is an improvement on the boosting algorithm. GBDT uses tree models as base models and obtains the final result by iteration, where each iteration aims to reduce the residual of the previous iteration [24]. When the dataset is large and complex, the GBDT algorithm is computationally intensive and inefficient. In 2015, Tianqi Chen proposed XGBoost to address this shortcoming of GBDT [20].
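The residual-fitting idea behind GBDT can be illustrated with the smallest possible tree, a one-split regression stump. The sketch below is plain gradient boosting for squared error on a toy one-dimensional dataset; it is not XGBoost, which adds regularization, second-order gradient information, and system-level optimizations on top of this scheme. The data, the number of rounds, and the learning rate are arbitrary assumptions.

```python
def fit_stump(xs, residuals):
    """Find the single-threshold regression stump that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # dropping the largest value keeps the right side non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, rounds=20, lr=0.5):
    """Each round fits a stump to the residuals left by the previous rounds."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + lr * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
mean_y = sum(ys) / len(ys)
mse_before = mse(lambda x: mean_y, xs, ys)  # error of the constant starting model
model = boost(xs, ys)
mse_after = mse(model, xs, ys)
```

Each stump on its own is a weak model, but the sum of many small corrections tracks the step in the data closely, which mirrors the weighted-fusion argument in the text.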

2.2.3 Long Short-Term Memory (LSTM)

Unlike a traditional feedforward neural network, a recurrent neural network (RNN) has a signal feedback structure that makes the output state of the network at a given moment depend on the historical signals before that moment, giving it certain dynamic characteristics and a memory ability. In recent years, RNNs have achieved remarkable success in speech recognition, text analysis, and other tasks [25, 26]. One important reason is the use of the long short-term memory (LSTM) unit in the RNN structure. LSTM uses different gates to enhance the memory of the network, thereby addressing the vanishing gradient problem [27]. The unrolled structure of an RNN is shown in Fig. 2.1; it resembles a model with multiple layers of the same network. However, vanilla recurrent neural networks have limited memory because the gradient eventually vanishes or explodes as it propagates through time.
Figure 2.1: unrolling a recurrent neural network [28]. A is a chunk of neural network, Xt is the input, and ht is the output.
The LSTM-based RNN addresses this issue by improving the internal structure of this chain. Gate architectures are used in the LSTM unit to keep the error constant during propagation. Among these gates, the forget gate determines which information should be discarded from the cell state of the previous moment, the update gate determines which parts of the cell state need to be updated, and the output gate filters information based on the cell state. These three gates are implemented using the sigmoid function. As shown in Fig. 2.2, the most prominent characteristic is the use of three sigmoid layers and point-wise multiplication operations to strengthen the control of information transmission. For more details about the structure of LSTM and the functionality of the gates, readers are referred to [28, 29].
Figure 2.2: The structure of a network with three LSTM cells [28]
在这里插入图片描述 In summary, LSTM-RNN uses the gates to control the memory of the RNN model. In the training phase, the weights and biases are learned from the historical data, and the characteristics of the historical states are identified and memorized.
总之，LSTM-RNN通过门控机制控制RNN模型的记忆。在训练阶段，权重和偏置通过历史数据进行学习，并识别和记住历史状态的特征。
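The gate mechanism described above can be illustrated with a minimal sketch of one LSTM step. It assumes a single-unit cell with scalar input and state, and the weight values and the names `lstm_step` and `weights` are hypothetical, chosen only for illustration. It is not the network used in the thesis.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a single-unit LSTM cell with scalar input and state.

    `w` maps each gate name to a (w_x, w_h, bias) triple. The values are
    hypothetical here; in practice they are learned during training.
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))             # how much of the old cell state to keep
    i = sigmoid(pre("update"))             # how much of the candidate to write
    o = sigmoid(pre("output"))             # how much of the cell state to expose
    c_tilde = math.tanh(pre("candidate"))  # candidate cell state
    c = f * c_prev + i * c_tilde           # point-wise multiplications, then sum
    h = o * math.tanh(c)                   # filtered output
    return h, c

weights = {name: (0.5, 0.5, 0.0)
           for name in ("forget", "update", "output", "candidate")}
h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.3]:                 # a short made-up input sequence
    h, c = lstm_step(x, h, c, weights)
```

Because the forget gate multiplies the previous cell state directly, a gate value close to 1 lets the state pass through almost unchanged, which is how the cell preserves information over many steps.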

2.3 相关工作

This section overviews the previous related research on adaptive cruise control target selection. Seungwuk, Hyoung-Jin, and Kyongsu [5] studied the target selection strategy through simulations of several driving scenarios, using the information from sensors. A driving area of the ego vehicle is determined using the curvature estimated by the yaw rate and the speed, and the width of the driving area is calculated by the vehicle width. A monitoring area is defined to detect the neighboring vehicle inside the ego vehicle driving area.
The in-lane vehicles are determined by considering the relative distance and velocity of detected vehicles. Three indexes of longitudinal motion, lateral motion, and warning are calculated to determine the driving status of each target by comparison. The closest target with representative driving status is determined as the primary target.
本节概述了关于自适应巡航控制目标选择的相关研究。Seungwuk、Hyoung-Jin 和 Kyongsu [5] 通过模拟多种驾驶场景，利用传感器信息研究了目标选择策略。利用偏航率和速度估算曲率，确定自车的驾驶区域，并根据车辆宽度计算驾驶区域的宽度。定义了一个监控区域来检测自车驾驶区域内的邻近车辆。通过考虑检测到的车辆的相对距离和速度来确定车道内的车辆。通过比较计算纵向运动、横向运动和警告三个指标，确定每个目标的驾驶状态，并将具有代表性驾驶状态的最近目标确定为主要目标。
Il-ki et al. [30] analyzed several essential situations and made corresponding decisions for each situation under the assumptions that the ego vehicle is in the middle of the lane and the width of the lane is known. For instance, when the sign of the lateral position is opposite to the sign of the lateral velocity, the target is approaching the lane of the host vehicle. When this happens, a weighted lateral position is adopted in the primary target decision process. The weighted position function makes a rapid decision possible, while also considering the fact that the cut-in vehicle is not yet in the ego vehicle lane. The primary target selection algorithm in [30] integrates additional information from multiple models, which allows the ACC system to adapt to another target more intelligently and in a way more reminiscent of human drivers. Because the vehicle does not always follow the curvature of the lane exactly, and the radar cannot estimate the change in curvature, the target is sometimes confused.
Il-ki 等人 [30] 在假设自车位于车道中央且已知车道宽度的情况下，分析了几种重要情况，并为每种情况做出相应决策。例如，当横向位置的符号与横向速度的符号相反时，目标车辆正在接近自车车道。这时，在主要目标决策过程中采用加权横向位置。加权位置函数使快速决策成为可能，同时考虑到插入车辆尚未进入自车车道。文献[30]中的主要目标选择算法整合了多个模型的信息，使ACC系统能够更智能地适应其他目标，更类似于人类驾驶员。由于车辆并不总是按照车道曲率直线行驶，雷达无法估算曲率变化，有时会发生目标混淆。
Eskandarian and Azim [31] introduced two methods for target selection to improve the performance. Using the steering angle or yaw rate sensor on the ego vehicle to estimate the curvature is a common approach, but this method is only valid when the curvature is constant. Auxiliary visual sensors are utilized as another approach for detection. However, these sensors work unsatisfactorily and further raise the cost. In this thesis project, lane placement is used as one of the input features, which can represent the in-lane condition.
Eskandarian 和 Azim [31] 提出了两种目标选择方法以提高性能。使用自车上的转向角或偏航率传感器估算曲率是常见方法，但该方法仅在曲率恒定时有效。另一种方法是使用辅助视觉传感器进行检测。然而，这些传感器工作不理想且进一步增加了成本。在本论文项目中，车道位置作为输入特征，代表车道内状况。
Shifeng et al. [32] developed a method that takes the operational behavior of the ego vehicle into consideration and provides sufficient qualitative analysis of trajectories. They use both a preliminary classification and a final classification to decide between different operating conditions, and combine the result with the trajectory to determine the primary target. This approach overcomes some typical shortcomings of traditional ACC, such as confusion between lane changes and curve entries of preceding vehicles. In this project, the variable path placement is one of the input features and carries the trajectory information.
Shifeng 等人 [32] 开发了一种考虑自车操作行为的方法，并提供了充分的轨迹定性分析。Shifeng 等人通过初步分类和最终分类决定不同的操作条件，结合轨迹确定主要目标。这种方法克服了传统ACC的一些典型缺点，如车道变更和前车进入弯道之间的混淆。在本项目中，可变路径位置是包含轨迹信息的输入特征之一。
Jianqiang et al. [33] proposed a method to improve the accuracy of the primary target selection based on multi-feature fusion. The data is preprocessed by distance compensation factor (DCF) correction to correct the in-lane probability provided by the lidar. Kalman filtering is adopted to track and predict the distance and velocity of neighboring vehicles [34]. Besides, a two-layer artificial neural network is designed to obtain the importance weights of the feature variables. The training output is finally used for target selection. The method mentioned in [33] adopted an artificial neural network to predict the lane probability of a vehicle. In this thesis project, an artificial neural network is adopted to predict the selected target.
Jianqiang 等人 [33] 提出了一种基于多特征融合的主要目标选择方法。数据经过距离补偿因子(DCF)修正，修正了激光雷达提供的车道内概率。采用卡尔曼滤波器跟踪和预测邻近车辆的距离和速度【34】。此外，设计了一个两层人工神经网络以获取特征变量的重要权重。训练输出最终用于目标选择。文献[33]中的方法采用人工神经网络预测车辆的车道概率。在本论文项目中，将采用人工神经网络预测选择的目标。
A video and radar data fusion system was developed in [35] for improved target selection. This framework is capable of fusing target information captured from a camera system and a radar system. The information from the two sensor systems is kept at as low a level as possible to facilitate exchange. An improved path selection and fused target states are used to identify targets that perform cut-in or cut-out maneuvers. This information is utilized in the primary target selection scheme. The methods of this thesis are data-driven, so the accuracy of the data has a large impact on the results. More sensors could be utilized to improve the accuracy of sensor fusion.
开发了一种视频和雷达数据融合系统【35】，以改进目标选择。该框架能够融合来自摄像系统和雷达系统的目标信息。尽量保留两传感器系统的信息以便交换。改进的路径选择和融合的目标状态用于识别执行切入或切出动作的目标。这些信息用于主要目标选择方案。论文的方法是数据驱动的，因此数据准确性对结果有很大影响。可以利用更多传感器来提高传感器融合的准确性。

三、方法论

This chapter presents the methods used to select a target for adaptive cruise control. The data collection, annotation, and preprocessing are introduced in Section 3.1, as well as some details about the data. Section 3.2 introduces the problem-solving strategies and the architectures of the machine learning models, including shared models and compare-target models. Section 3.4 describes the algorithm evaluation method.
本章介绍了用于自适应巡航控制目标选择的方法。第3.1节介绍了数据收集、注释和预处理的方法，并提供了一些关于数据的详细信息。第3.2节介绍了问题解决策略，以及机器学习模型的架构，包括共享模型和比较目标模型。第3.4节描述了算法的评估方法。

3.1 数据准备

The experiments in this project are based on data collected and fused by Scania’s test vehicles. Because supervised learning is used, the data need to be annotated. In the next section, several methods are introduced, and for different models, different pre-processing of the data is required.
The details of the data, the annotations, and the pre-processing methods are described in the following subsection.
本项目中的实验基于由斯堪尼亚测试车辆收集和融合的数据。由于使用了监督学习，因此需要对数据进行注释。在下一节中，将介绍几种方法，并针对不同的模型，数据需要进行不同的预处理。以下小节中详细描述了数据、注释和预处理方法的细节。

3.1.1 数据介绍

The raw data for this thesis project was extracted from Scania’s database. There is one radar and one camera on Scania’s test trucks. The data is recorded by these sensors and stored in log files. The input data to the target selection algorithm comes from a sensor fusion algorithm based on the raw data. These trucks have driven on several different highways in Europe.
本论文项目的原始数据来自斯堪尼亚的数据库。在斯堪尼亚的测试卡车上配有一个雷达和一个摄像头。这些传感器记录的数据存储在日志文件中。目标选择算法的输入数据来自基于原始数据的传感器融合算法。这些卡车行驶在欧洲的多个不同高速公路上。
The data are time series recorded at 100 Hz. Of the data in the database, a total of 26 logs have been labeled in the way described in Section 3.1.2. Each log file covers around 4 minutes. The time length and the number of target changes are presented in Appendix A.1. After sensor fusion and a series of calculations, many features of the ego truck and neighboring vehicles can be obtained. However, in order to compare with the current algorithm, the same input as the current algorithm is used. This input contains the states of the ego truck and neighboring vehicles, such as position, velocity, path placement, lane placement, path confidence, lane confidence, type, and other path information. The parameters of interest in this project and their descriptions are listed in Appendix A.2.
这些数据是时间序列数据，以100 Hz的频率记录。在数据库中的数据中，共有26个日志按照第3.1.2节描述的方式进行了标注。每个日志文件包含大约4分钟的信息。时间长度和目标变化的信息在附录A.1中提供。通过传感器融合和一系列计算，可以获得自车及邻车的许多特征。然而，为了与当前算法进行比较，使用了与当前算法相同的输入。这些状态包含自车及邻车的状态，如位置、速度、路径位置、车道位置、路径置信度、车道置信度、类型和其他路径信息。项目中感兴趣的参数及其描述列在附录A.2中。
One of the main challenges comes from the data set. The position, velocity, width, and other parameters detected by sensors are not always accurate. Thus the path placement, lane placement, and confidence are also not very reliable.
数据集带来的一大挑战是，传感器检测到的位置、速度、宽度等参数并不总是准确的。因此，路径位置、车道位置和置信度也不十分可靠。
For example, the estimation of lane placement is not very accurate. By inspecting the parameters and raw video of the data set, it can be observed that sometimes the vehicle is in the same lane as the ego truck, but due to the inaccuracy, the value of the lane placement indicates that the vehicle is still in the other lane. Vehicles in other lanes are usually not selected as the primary target. For example, see Fig.3.1. We can see that part of the white vehicle on the right side is in the same lane as the ego truck.
例如，车道位置的估计并不十分准确。通过检查数据集的参数和原始视频，可以观察到，有时车辆与自车在同一车道上，但由于不准确性，车道位置的值表明车辆仍在另一车道上。通常情况下，其他车道的车辆不会被选为主要目标。例如，见图3.1。我们可以看到右侧白色车辆的一部分与自车在同一车道上。

Figure 3.1: An example of inaccurate estimation. The value of the lane placement of the white vehicle is -2.16, and the lane confidence is 1.00. The definition of the lane placement is illustrated in Fig. 3.2.
The definition of lane placement is based on the position of a vehicle on the driving lane. For example, as Fig.3.2 indicates, when the vehicle is exactly on the left side of the right-hand lane, the value of lane placement is -2.
车道位置的定义是基于车辆在驾驶车道上的位置。例如，如图3.2所示，当车辆恰好在右侧车道的左侧时，车道位置的值为-2。
Figure 3.2: Lane placement definition. The value of the vehicle’s lane placement depends on where the vehicle is located.
Based on the definition, the lane placement value of this white vehicle should be greater than -2 (because this vehicle is on the right side, the value should be between -2 and -1). However, the estimated value is -2.16, and the confidence is 1.
基于定义，这辆白色车的车道位置值应该大于-2（因为这辆车在右侧车道上，所以该值应在-1到-2之间）。然而，估算的值是-2.16，且置信度为1。
Another challenging problem is imbalance and rare situations. When there is a lane change, vehicle cut-in, etc., the selection of the target needs to be more cautiously handled. However, since the data is collected in the real world, lane changes and vehicle cut-in are much less frequent than other typical situations. The percentage of these situations in the data set is small.
另一个具有挑战性的问题是数据不平衡和罕见情况。当出现车道变换、车辆插入等情况时，目标的选择需要更加谨慎。然而，由于数据是在现实世界中收集的，车道变换和车辆插入的情况远少于其他典型情况。这些情况在数据集中所占的比例较小。

3.1.2 数据注释

The data set is labeled manually. The annotation is the ID of the vehicle that should be selected at each moment. The test truck that collected this data set can simultaneously detect 22 surrounding vehicles, so the annotation ranges from 0 to 22 (0 means that no vehicle is selected as the primary target at the current time). An example of the ground truth is shown in Fig.3.3. From step 20000 to step 25000 (200 s to 250 s), the ID of the best target is 6. It can be observed from this figure that at approximately the 27000th step, the primary target changes from 6 to 13.
该数据集是人工标注的，标注是每个时刻应该选择的车辆的ID。采集该数据集的测试卡车可以同时检测到22辆周边车辆，因此标注从0到22（0表示当前时刻没有车辆被选为主要目标）。图3.3给出了ground truth的一个例子。从第20000步到第25000步（200秒到250秒），最佳目标的ID是6。从该图中可以观察到，大约在第27000步，主要目标从6变为了13。

Figure 3.3: The ground truth of target selection of log file 30 on each time step.

3.1.3 数据预处理

Normalization
数据归一化
Normalization is to preprocess data so that the values fall into a uniform range of values, such as [0,1]. The normalization can eliminate the influence of data dimension on modeling and the importance bias caused by some numerical differences, which can speed up the training and promote the convergence of the algorithm.
归一化是预处理数据的过程，使数值落入统一的数值范围，例如 [0,1]。归一化可以消除数据维度对建模的影响和由一些数值差异引起的重要性偏差，这有助于加快训练速度并促进算法的收敛。
In this project, all features are scaled to the range [0, 1] over the whole data set according to Equation 3.1, where X is a feature before normalization and X′ is the feature after normalization. Fig.3.4 is an example of normalized data.
在这个项目中，数据集中的所有特征都按照方程式 3.1 缩放到 [0, 1] 的范围上。这里，X 表示归一化前的特征，X′ 表示归一化后的特征。图 3.4 展示了归一化数据的示例。
$$X' = \frac{X - X_{\min}}{X_{\max} - X_{\min}} \tag{3.1}$$
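As a small illustration of scaling a feature to [0, 1], the sketch below applies min-max scaling to one feature column. The function name `min_max_scale`, the sample values, and the handling of a constant feature are assumptions made for this example.

```python
def min_max_scale(column):
    """Scale one feature column to [0, 1] using its min and max over the whole data set."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]

relative_speed = [-4.0, 0.0, 6.0, 16.0]   # made-up values for one feature
scaled = min_max_scale(relative_speed)
```

The minimum and maximum are taken over the whole data set rather than per log, which matches the description above.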
Padding
数据填充
At each moment, the data set contains the features of up to 22 neighboring vehicles around the ego truck, but there are not always 22 vehicles present. When the number of vehicles is less than 22, zeros are used to pad the features of the nonexistent vehicles. This padding is only used for the input data of the shared models in Section 3.2.
数据集中每个时刻包含围绕自动驾驶卡车的最多22辆邻近车辆的特征，但并不总是有22辆车。当车辆数量少于22辆时，会用零来填充不存在车辆的特征。这种填充仅用于项目第3.2节中共享模型的输入数据。
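A minimal sketch of this zero padding is given below. The helper name `pad_vehicles` and the three-feature rows are hypothetical; the real feature set is the one listed in Appendix A.2.

```python
MAX_VEHICLES = 22   # maximum number of simultaneously detected vehicles

def pad_vehicles(vehicle_features, n_features):
    """Return exactly MAX_VEHICLES rows, zero-filling slots without a detected vehicle."""
    if len(vehicle_features) > MAX_VEHICLES:
        raise ValueError("more vehicles than the maximum of 22")
    missing = MAX_VEHICLES - len(vehicle_features)
    padded = [list(row) for row in vehicle_features]
    padded += [[0.0] * n_features for _ in range(missing)]
    return padded

# Two detected vehicles with three (already normalized) features each
frame = [[0.31, 0.52, 0.40], [0.77, 0.48, 0.55]]
padded = pad_vehicles(frame, n_features=3)
```

The fixed number of rows gives the shared models an input of constant shape regardless of how many vehicles are actually present.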

Figure 3.4: An example of normalized data.
Moreover, before the normalized data are converted to time-series samples, zeros are used to pad the beginning of the sequences so that they all have the same length, as shown in Fig.3.5.
此外，在将归一化数据转换为时间序列样本之前，用户会使用零在序列开头填充到相同的长度。如其项目中图3.5所示。

Figure 3.5: Using zero to pad before each sequence.
Time Series
时间序列
For the Shared-LSTM Network explained in Section 3.2.1, the time series data is used as input. The network makes decisions based on the timing information of each vehicle. Therefore, it is necessary to convert the original single time data sample into time series data samples.
在第3.2.1节中解释的共享LSTM网络中，使用时间序列数据作为输入。网络根据每辆车辆的时间信息做出决策。因此，需要将原始的单个时间数据样本转换为时间序列数据样本。
The main parameters of this process are the slice window size and the sampling time. The sampling frequency of the original data is 100 Hz, which means the sampling time is 0.01 second. In order to reduce the difficulty of network training and the time required for training, it is necessary to set the sampling time and resample within each window to reduce the length of the time sequence. For example, as Fig.3.6 indicates, the slice window size is 5 and the sampling time is 0.02 s. In Fig.3.6(a), the blue shaded part is the first time-series sample. The window is then slid along the data to obtain more samples, as the red shaded part in Fig.3.6(b) illustrates.
这一过程的主要参数是切片窗口大小和采样时间。原始数据的采样频率为100 Hz，这意味着采样时间为0.01秒。为了降低网络训练的难度和所需的训练时间，需要设置每个窗口的采样时间并进行重新采样，以减少时间序列的长度。例如，如图3.6所示，切片窗口大小为5，采样时间为0.02秒。在图3.6(a)中，蓝色阴影部分是第一个时间序列样本。然后滑动窗口以获取更多样本，如图3.6(b)中红色阴影部分所示。
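The windowing and resampling can be sketched as follows. The function name `make_sequences` is hypothetical, and the sketch assumes the window slides by one raw time step, which the text does not state explicitly.

```python
def make_sequences(samples, window_size, step):
    """Slide a window over per-time-step samples and resample inside each window.

    window_size: number of raw time steps covered by one window.
    step: keep every `step`-th raw sample (step=2 turns 0.01 s into 0.02 s).
    """
    sequences = []
    for start in range(len(samples) - window_size + 1):
        window = samples[start:start + window_size]
        sequences.append(window[::step])    # resample within the window
    return sequences

raw = list(range(8))   # stand-in for 8 consecutive samples recorded at 100 Hz
seqs = make_sequences(raw, window_size=5, step=2)
```

With a window of 5 raw steps and every second sample kept, each sequence has length 3, which agrees with the numbers in the caption of Fig.3.6.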

Figure 3.6: An example of generating time series data. Slice window size is 5, the sampling time is 0.02 second, and the sequence length is 3.
One-Hot Encoding
独热编码
Each sample in the data set is labeled with the ID of the vehicle that needs to be selected at that moment, and the IDs are values from 0 to 22. Machine learning tutorials usually suggest preparing data in a specific way. A typical instance is the use of one-hot encoding or integer encoding on categorical data, which is mainly for implementing machine learning algorithms efficiently [36].
在数据集中，每个时刻的标签是需要选择的车辆的ID，并且这些ID的取值范围是从0到22。通常，机器学习教程建议以特定的方式准备数据。一个典型的例子是在分类数据上使用独热编码或整数编码，这主要是为了高效实现机器学习算法 [36]。
If there is a natural order relationship between the classes, machine learning algorithms can learn it from the integer encoding. Otherwise, integer encoding is not enough. In this project, there is no relationship between the IDs of the vehicles, so it is necessary to convert the integer encoding to one-hot encoding. For example, when the maximum integer value is 22, the one-hot encoding of the integer value 5 is shown in Fig.3.7.
如果每个类之间存在自然的顺序关系，那么机器学习算法可以在整数编码下理解这种关系。否则，仅使用整数编码可能不足够。在这个项目中，每辆车辆的ID之间没有关系，因此需要将整数编码转换为独热编码。例如，当最大整数值为22时，整数编码值为5的独热编码表示如图3.7所示。
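A minimal sketch of this conversion is shown below; the function name `one_hot` is an assumption for illustration. Since the IDs run from 0 to 22, the encoded vector has 23 entries.

```python
def one_hot(label, max_label=22):
    """Encode an integer ID in 0..max_label as a vector of length max_label + 1."""
    if not 0 <= label <= max_label:
        raise ValueError("label out of range")
    vec = [0] * (max_label + 1)
    vec[label] = 1
    return vec

encoded = one_hot(5)   # a single 1 at index 5, zeros elsewhere
```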

Figure 3.7: One-hot encoding representation of the integer encoding value 5, and the maximum integer value is 22.
Concatenation of Features
特征的连接
For the compare-target network and tree model explained in Section 3.2.2, the most suitable vehicle is selected as the primary target by comparing the neighboring vehicles with each other. In order to achieve this comparison, the data need to be concatenated and relabeled. We concatenate the features of the primary target with those of each other surrounding vehicle and label the result 1 or 0 according to the order of the features.
对于第3.2.2节中解释的比较目标网络和树模型，通过相互比较邻近车辆来选择最合适的车辆作为主要目标。为了实现比较，需要对数据进行连接和重新标注。我们将主要目标的特征与其他每辆周围车辆的特征相连接，并根据特征的顺序标注为1或0。
Take the situation in Fig.3.8 as an example. Vehicle-1 (within the red frame) and vehicle-2 (within the yellow frame) are surrounding vehicles, and vehicle-1 is labeled as the primary target.
In Fig.3.9(a), the features of vehicle-1 (target vehicle) are concatenated with the features of vehicle-2 (non-target vehicle), and the annotation is 1. In Fig.3.9(b), conversely, the features of vehicle-2 (non-target vehicle) are concatenated with the features of vehicle-1 (target vehicle), and the annotation is 0.
在图3.8中，以车辆-1（红框内）和车辆-2（黄框内）为例，它们是周围的车辆，车辆-1被标记为主要目标。在图3.9(a)中，车辆-1（目标车辆）的特征与车辆-2（非目标车辆）的特征连接，并标注为1。相反，在图3.9(b)中，车辆-2（非目标车辆）的特征与车辆-1（目标车辆）的特征连接，并标注为0。
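The relabeling in this example can be sketched as below. The function name `make_pairs` and the feature values are hypothetical, and the sketch does not cover time steps where no target is labeled (ID 0), since the text does not say how those are handled.

```python
def make_pairs(features, target_id):
    """Build (concatenated_features, label) samples for one time step.

    features: dict mapping vehicle ID -> feature list.
    target_id: ID of the labeled primary target.
    """
    samples = []
    for other_id, other in features.items():
        if other_id == target_id:
            continue
        samples.append((features[target_id] + other, 1))   # target first  -> label 1
        samples.append((other + features[target_id], 0))   # target second -> label 0
    return samples

# Two vehicles with four features each, as in Fig.3.9; values are made up
features = {1: [0.1, 0.2, 0.3, 0.4], 2: [0.5, 0.6, 0.7, 0.8]}
pairs = make_pairs(features, target_id=1)
```

Each non-target vehicle therefore contributes two samples, one per ordering, so the two labels are balanced by construction.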

3.2 模型架构

This section mainly introduces the theories and model structures of implemented methods. Two types of models are introduced, the shared model and the compare-target model. The compare-target model is an original method.
这部分主要介绍了实现方法的理论和模型结构。介绍了两种类型的模型：共享模型和比较目标模型。比较目标模型是一个原创方法。
Figure 3.8: An example of the driving situation [5]. Vehicles within the red and yellow frames are surrounding vehicles, the blue vehicle is the ego vehicle.
Figure 3.9: An example of concatenating features for vehicle comparison. In this example, each vehicle has four features.

With shared models, each vehicle can share the same model. When LSTM is combined with the shared network, time series can be used as input, taking the time factor into account. The compare-target model can take the interactions between two vehicles into consideration. Moreover, the tree-based method XGBoost is adopted because the samples are imbalanced and tree-based models are less sensitive to data imbalance. A summary of the pros and cons is listed in Table 3.1.
在用户的项目中，每辆车辆可以共享同一个模型。他们将LSTM与共享网络结合，以处理时间序列输入，考虑到时间因素。比较目标模型能够考虑两辆车辆之间的交互作用。此外，由于样本不平衡，他们采用了基于树的方法XGBoost，因为树模型对数据不平衡的敏感性较低。用户在表3.1中总结了这些方法的优缺点。
在这里插入图片描述

3.2.1 共享模型

Shared Model Structure
共享模型架构
The decision at every moment depends on the states of the neighboring vehicles and the ego truck. Therefore, during training, each sample should include information about neighboring vehicles and ego truck, and neighboring vehicles should share one model.
每个时刻的决策依赖于周围车辆和自动驾驶卡车的状态。因此，在训练过程中，每个样本都应包含有关周围车辆和自动驾驶卡车的信息，并且周围车辆应共享同一个模型。
For instance, consider using an artificial neural network model to evaluate the semantic similarity between two sentences. The model has two inputs, which are the two sentences to compare, and it outputs a score ranging from 0 to 1 that represents the similarity. In this scenario, semantic similarity is a symmetric relationship, and the positions of the two input sentences can be exchanged. So it is unreasonable to train two models separately to handle the two inputs. Instead, these two inputs should be evaluated using the same model [37, 38]. Similarly, in this project, each input (information on a neighboring vehicle) is exchangeable. For this reason, it would not make sense to learn several independent models to process each neighboring vehicle.
举例来说，当我们使用人工神经网络模型评估两个句子之间的语义相似度时，模型有两个输入，即要比较的两个句子。模型输出一个从0到1的分数，表示相似度。在这种情况下，语义相似度是一种对应关系，两个输入句子的位置可以互换。因此，单独训练两个模型来处理这两个输入是不合理的。相反，应该使用同一个模型来评估这两个输入。类似地，在这个项目中，每个输入（周围车辆的信息）是可互换的。因此，学习多个独立模型来处理每辆邻近车辆是没有意义的。
The purpose of using the same model can be achieved by the shared model structure. The structure of shared model is shown in Fig. 3.10. The signals of neighboring vehicles are input to the same model, and the corresponding outputs are obtained. The loss function is based on these outputs.
通过共享模型结构可以实现使用同一个模型的目的。共享模型的结构如图3.10所示。周围车辆的信号被输入到同一个模型中，并获得相应的输出。损失函数基于这些输出。
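The idea of sharing one model across all neighboring vehicles can be sketched as follows. This is a simplified illustration, not the architecture of the thesis: it assumes a single layer per branch and one sigmoid output unit, and the parameter names (`ego_w`, `veh_w`, `out_w`) and all weight values are hypothetical.

```python
import math

def dense(x, weights, bias, activation):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def shared_outputs(ego, vehicles, params):
    """Apply the same parameters to every neighboring vehicle."""
    ego_hidden = dense(ego, params["ego_w"], params["ego_b"], relu)
    outputs = []
    for vehicle in vehicles:
        # The same veh_w / out_w are reused for each vehicle (weight sharing)
        veh_hidden = dense(vehicle, params["veh_w"], params["veh_b"], relu)
        joint = ego_hidden + veh_hidden      # list concatenation
        outputs.append(dense(joint, params["out_w"], params["out_b"], sigmoid)[0])
    return outputs

params = {
    "ego_w": [[0.5, -0.2]], "ego_b": [0.1],
    "veh_w": [[0.3, 0.8], [-0.6, 0.4]], "veh_b": [0.0, 0.0],
    "out_w": [[1.0, 0.7, -0.5]], "out_b": [0.0],
}
ego = [0.4, 0.6]
vehicles = [[0.2, 0.9], [0.7, 0.1]]
outputs = shared_outputs(ego, vehicles, params)
```

Because every vehicle passes through identical parameters, swapping the order of the vehicles only swaps the order of the outputs, which reflects the exchangeability argument made above.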

Figure 3.10: Shared model Structure.
Shared Network
共享网络
The structure of a shared network for target selection is proposed, see Fig. 3.11. The input layer of the model has two parts. The left-side input layer takes the features of the ego truck. The right-side input layer takes the features of the neighboring vehicles. The description of the features is given in Section 3.1.3. In the hidden layers, the features of the ego truck are connected to two fully connected layers. The features of the neighboring vehicles are connected to two shared fully connected layers whose activation function is ReLU. The shared model's output for each surrounding vehicle is concatenated with the output for the ego truck features. The model then feeds each concatenated signal into a shared fully connected layer with the sigmoid function as the activation function. The outputs of this layer are concatenated. Finally, the result is connected to a fully connected layer with the softmax function as the activation function.
用户在他们的项目中提出了用于目标选择的共享网络结构，如图3.11所示。模型的输入层包括两部分：左侧输入自动驾驶卡车的特征，右侧输入周围车辆的特征。特征的详细描述在他们项目的第3.1.3节中给出。对于隐藏层，自动驾驶卡车的特征连接到两个全连接层，而周围车辆的特征连接到两个共享的带ReLU激活函数的全连接层。共享模型的每个周围车辆特征输出分别与自动驾驶卡车特征的输出连接起来。连接后的信号输入到一个带有sigmoid激活函数的共享全连接层。这一层的输出被连接起来，并连接到一个带softmax激活函数的全连接层以得到最终的输出。

Figure 3.11: The model structure of shared network
Shared-LSTM Network
共享LSTM网络
The raw data contains information over time. Thus we consider taking a time series as input and making decisions with the timing information of each vehicle. The information of time sequence data is introduced in Section 3.1.3. For the model structure, the structure of the shared-LSTM network is similar to that of the shared network, see Fig. 3.12. The difference of shared-LSTM network is that the first fully connected layer after the input layer is replaced with LSTM and tanh is selected as the activation function.
用户的项目涉及将时间序列数据作为输入，并根据每辆车辆的时间信息做出决策。他们在项目的第3.1.3节中介绍了时间序列数据的信息。模型结构包括共享-LSTM网络，其结构类似于共享网络，但不同之处在于输入层后的第一个全连接层被替换为LSTM，且选择tanh作为激活函数。

Figure 3.12: The model structure of shared-LSTM network.

3.2.2 比较目标模型

Compare Target
比较目标
When making decisions, the state of one neighboring vehicle affects the decision about the other vehicles, as the example in Fig. 3.13 shows. In Fig 3.13(a), the probability of selecting vehicle-1 (within the red box) as the primary target is higher than the probability of selecting vehicle-2 (within the yellow box). In Fig 3.13(b), the state (position, speed, etc.) of vehicle-1 is almost unchanged, but since the state of vehicle-2 has changed, the probability of selecting vehicle-1 as the primary target is very low. Therefore, it is necessary to consider the relationship between vehicles. As illustrated in Section 3.1.3, the features are concatenated to learn the effects of features between vehicles. Ideally, the relationships between all vehicles should be considered together. Due to computational limitations and model complexity, we only consider the relationship between two vehicles at a time.
在做出决策时，一个邻近车辆的状态会影响对其他车辆的决策，如图3.13所示的示例。在图3.13(a)中，选择车辆-1（红框内）作为主要目标的概率高于选择车辆-2（黄框内）的概率。在图3.13(b)中，车辆-1的状态（位置、速度等）几乎恒定，但由于车辆-2的状态发生变化，选择车辆-1作为主要目标的概率非常低。因此，有必要考虑车辆之间的关系。正如第3.1.3节所示，特征被连接以学习车辆之间特征的影响。理想情况下，应该同时考虑所有车辆之间的关系。由于计算限制和模型复杂性，我们只考虑两辆车辆之间的关系。

Figure 3.13: An example for influences between vehicles [5]. See the main text for detailed explanation.

For the strategy of target selection, the features are concatenated in pairs between all detected vehicles, and the output for each pair should be 1 or 0. The primary target selected at each moment is the vehicle with the largest sum of outputs over the pairs in which its features come first. This method can be explained by Equation 3.2, where F stands for the machine learning model, ⊕ represents the operation of concatenation, x stands for the features, and i, j are the IDs of vehicles.
目标选择策略涉及在所有检测到的车辆之间对特征进行成对连接，输出应为1或0。每个时刻选择的主要目标是输出总和最大的车辆，其中它们在特征对中排在第一位。这种方法可以通过方程式3.2来解释。其中，F代表机器学习模型，表示连接操作，x代表特征，i和j表示车辆的ID。
$$\text{target} = \arg\max_{i} \sum_{j \neq i} F(x_i \oplus x_j) \tag{3.2}$$
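The selection rule can be sketched in a few lines. Here `compare` is a hypothetical stand-in for the trained model F, and `toy_compare` is a made-up rule used only to make the example runnable.

```python
def select_target(features, compare):
    """Pick the vehicle whose pairwise outputs, with it in first position, sum highest.

    features: dict mapping vehicle ID -> feature list.
    compare: callable standing in for the trained model F; it takes the
             concatenated features of two vehicles and returns a value in [0, 1].
    """
    totals = {}
    for i, x_i in features.items():
        totals[i] = sum(compare(x_i + x_j)
                        for j, x_j in features.items() if j != i)
    best = max(totals, key=totals.get)
    return best, totals

def toy_compare(pair):
    # Made-up rule: the vehicle with the smaller first feature wins the comparison
    half = len(pair) // 2
    return 1.0 if pair[0] < pair[half] else 0.0

features = {6: [0.2, 0.5], 13: [0.6, 0.5], 4: [0.9, 0.1]}
best, totals = select_target(features, toy_compare)
```

With N detected vehicles the model is evaluated N(N-1) times per time step, which is one reason the comparison is limited to pairs.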
In this project, a multi-layer artificial neural network and a tree model were used to achieve the comparison. It should be noted that the data set used for the models in this section is not a time sequence but data at a single moment. The reason is that, when comparing the results of the shared network and the shared-LSTM network described in Chapter 4, the improvement from using time sequences as input is not obvious, while the difficulty of training and the time required both increase.
在这个项目中，使用了多层人工神经网络和树模型进行比较。需要注意的是，本节中使用的数据集不是时间序列数据，而是代表单个时刻的数据。这一决定是在比较了第4节中共享网络和共享-LSTM模型的结果后做出的。在使用时间序列作为输入时，并没有显著改善结果，反而增加了训练的难度和时间要求。
Compare-Target Network
比较目标网络
The structure of the compare-target network is shown in Fig. 3.14. The purpose of this model is binary classification. The input layer of this model also has two parts. The inputs on the left side are the features of the ego truck, and the inputs on the other side are the features of the neighboring vehicles. The features are described in Section 3.1.3. In the hidden layers, the features are passed through two fully connected layers. The activation function of the first layer is ReLU and the activation function of the second layer is sigmoid.
用户的项目包括一个比较目标网络，如图3.14所示，旨在进行二元分类。该模型的输入层分为两部分：左侧是自动驾驶卡车的特征，右侧是周围车辆的特征。这些特征的详细信息在第3.1.3节中进行了描述。在隐藏层中，特征经过两个全连接层传递，第一层使用ReLU激活函数，第二层使用sigmoid激活函数。
Figure 3.14: The model structure for compare-target.
Compare Tree
比较树
In addition to using artificial neural networks, XGBoost is also adopted. The principles of XGBoost and the reasons for using this algorithm are described in Section 2.2.2.
除了使用人工神经网络外，用户在他们的项目中还采用了XGBoost这一基于树的方法。XGBoost的原理和采用该算法的原因在他们项目的第2.2.2节中有详细描述。

3.2.3 网络微调

A common problem in machine learning applications is data imbalance, in which some classes or situations have a significantly lower number of samples [39]. Data imbalance is also a problem in this project. Although the amount of data is large, there are only a few samples for some cases. In the data set, some vehicle lane-change situations are more important, and it is usually necessary to change the primary target in these situations. However, the number of samples for these situations is small in the data set; the number of target changes is shown in Appendix A.1. The number of these situations is particularly small compared to the total data set, which means the imbalance is severe. This problem leads to unsatisfactory performance in rare situations (such as lane changes). Methods of coping with imbalance have been studied for machine learning models, and random majority under-sampling [40] is adopted here.
处理数据不平衡是机器学习应用中常见的问题，某些类别或情况的样本数量明显较少[39]。数据不平衡问题在本项目中同样存在。尽管数据量很大，但某些情况下的数据非常有限。在数据集中，一些涉及车辆变道的情况更为重要，通常需要在这些情况下更改主要目标。然而，这些情况的样本数量在数据集中很少，变更目标的数量在附录A.1中显示。与整体数据集相比，这些情况的数量特别少，这意味着数据不平衡的情况非常严重。这个问题导致在罕见情况（如变道）下表现不佳。研究了处理机器学习模型不平衡的方法。这里采用了随机多数下采样[40]。
The specific implementation process is divided into the following steps:
具体实施过程分为以下几步：

1. Extract the data of target-change situations, together with the data shortly before and after each of them within a limited time window (a target-change situation can be inferred from a change of the ID).
提取目标变更情况的数据，以及在有限时间窗口内目标变更情况前后的数据（可以通过ID的变化推断目标变更情况）。
2. Sample randomly from the whole data set so that the number of samples of common situations is balanced with the number of samples of target-change situations.
在整个数据集中进行随机抽样，使常见情况的样本数量与目标变更情况的样本数量平衡。
3. First train the model on the original data set and save it. Then adjust the learning rate and fine-tune the saved model on the balanced data.
首先使用原始数据集训练并保存模型。然后调整学习率，并使用经过平衡处理的数据来微调已保存的模型。
It is emphasized that training first uses the whole training data set and the model is then fine-tuned with the extracted balanced data set, instead of training directly on the extracted subset. There are two main reasons:
强调首先使用完整的训练数据集进行训练，然后再用提取的平衡数据集对模型进行微调，而不是直接使用提取的子集。这样做的两个主要原因是：
1. There are many different kinds of situations in the whole data set, and training should include as many situations as possible. However, the data are not labeled by situation type.
整个数据集中有许多不同类型的情况，训练应该尽可能涵盖更多情况。然而，数据并未按这些情况进行标注。
2. The number of samples in the extracted subset is too small, and it is unreasonable to train directly on such a small set.
提取的子集样本数量太少，直接用这么小的集合进行训练是不合理的。
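The first two steps above can be sketched as follows. The function name `balanced_indices` and the parameter `half_window` are hypothetical, the window length is not given in the text, and drawing the common samples only from outside the target-change windows is an assumption of this sketch.

```python
import random

def balanced_indices(labels, half_window, seed=0):
    """Return indices of target-change samples and an equal number of other samples.

    labels: per-time-step ID of the labeled primary target.
    half_window: number of time steps kept before and after each change.
    """
    change_idx = set()
    for t in range(1, len(labels)):
        if labels[t] != labels[t - 1]:           # the selected ID changed
            lo = max(0, t - half_window)
            hi = min(len(labels), t + half_window + 1)
            change_idx.update(range(lo, hi))
    # Under-sample the majority (common) situations at random
    others = [t for t in range(len(labels)) if t not in change_idx]
    rng = random.Random(seed)
    sampled = rng.sample(others, min(len(others), len(change_idx)))
    return sorted(change_idx), sorted(sampled)

labels = [6] * 10 + [13] * 10     # made-up log: target changes from 6 to 13 at step 10
change, common = balanced_indices(labels, half_window=2)
```

The two index lists together form the balanced subset used for fine-tuning, while the initial training still runs on the full data set.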

3.3 异常情况

In the process of driving a vehicle, it is not always necessary to select a vehicle as the primary target. When there is no suitable vehicle as the primary target, there is no need to use adaptive control to keep the ego truck and other vehicles at a time gap. Therefore, making a judgment on whether or not to choose a target is an important step.
在驾驶车辆的过程中，并不总是需要选择一辆车作为主要目标。当没有合适的车辆作为主要目标时，就不需要使用自适应控制来保持自车与其他车辆之间的时间间隔。因此，判断是否选择目标车辆是一个重要的步骤。
For the shared network and shared-LSTM network, a threshold is set to determine whether a target should be selected. For these two model structures, as presented in Fig 3.11 and Fig 3.12, the outputs of the shared layers (purple blocks) stand for the probability of selecting each vehicle as the primary target. If this value is smaller than the threshold, the corresponding vehicle should not be selected.
对于共享网络和共享-LSTM网络，设置了一个阈值来确定是否选择目标车辆。在这两种模型结构中（如图3.11和图3.12所示），共享层（紫色块）的输出表示选择某辆车作为主要目标的概率。如果这个概率值小于设定的阈值，则不会选择对应的车辆作为目标。
For the compare-target model, the threshold cannot be applied directly, since the output value of the compare-target model indicates which vehicle is more suitable as the primary target. We therefore reuse the shared layers of the shared network. As indicated in Fig. 3.15, the most suitable target is first determined by the compare-target model, and then the features of this target and the features of the ego truck are input into the shared network and the output value is calculated. If the output value is smaller than the threshold, no target should be selected at this time.
对于比较目标模型，不能直接应用阈值，因为比较目标模型的输出值表示的是哪辆车更适合作为主要目标。因此，我们重用共享网络中的共享层。如图3.15所示，首先由比较目标模型确定最合适的目标，然后将该目标的特征与自车的特征输入共享网络并计算输出值。如果输出值小于阈值，则此时不选择任何目标。
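The two-stage check can be sketched as below. The threshold value of 0.5, the function names, and `toy_probability` (a stand-in for the shared layers) are all assumptions, since the text does not give the actual threshold.

```python
def select_with_threshold(candidate_id, candidate_features, ego_features,
                          shared_probability, threshold=0.5):
    """Return the candidate ID, or 0 (no target) if its probability is below the threshold."""
    p = shared_probability(ego_features, candidate_features)
    return candidate_id if p >= threshold else 0

def toy_probability(ego, vehicle):
    # Hypothetical stand-in for the shared layers: a smaller first feature scores higher
    return 1.0 - vehicle[0]

ego = [0.4, 0.6]
near = select_with_threshold(6, [0.2, 0.5], ego, toy_probability)    # kept as target
far = select_with_threshold(13, [0.9, 0.5], ego, toy_probability)    # rejected, returns 0
```

Returning 0 matches the annotation convention in Section 3.1.2, where 0 means that no vehicle is selected.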

Figure 3.15: The strategy to determine that no target should be selected.

3.4 评估

In this project, the evaluation consists of two parts: the accuracy of the selection and the cost of selection.
在这个项目中，评估主要包括两个部分：选择的准确性和选择的成本
Accuracy
准确度
Accuracy is the first evaluation criterion. Its definition is given in Equation 3.3.
准确度是评价标准。准确度的定义如公式3.3所示
$$\text{Accuracy} = \frac{N_{\text{correct}}}{N_{\text{total}}} \tag{3.3}$$
Ncorrect represents the number of correct selections, and Ntotal is the total number of time steps. A selection is correct when the target selected by the machine learning model is the same as the human label.
Ncorrect表示正确选择的次数，Ntotal表示总时间步数。正确选择的定义是机器学习模型选择的目标与人类标签相同。
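As a small worked example of this definition, the sketch below computes the accuracy over a few made-up time steps. The function name `accuracy` and the sample IDs are assumptions.

```python
def accuracy(predicted, labeled):
    """Fraction of time steps where the selected ID equals the human label."""
    if len(predicted) != len(labeled) or not labeled:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == y for p, y in zip(predicted, labeled))
    return correct / len(labeled)

# Four time steps; the third selection differs from the label
acc = accuracy([6, 6, 13, 0], [6, 6, 6, 0])
```

Here three of four selections match, so the accuracy is 0.75. A time step labeled 0 counts as correct only if the model also selects no target.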
Cost
成本
Besides the accuracy, the collision avoidance acceleration (CAA) cost and the distance cost were also adopted as evaluation criteria for the target selection performance. The reason for using these criteria is that, when the selected target is not the correct target, different wrong choices have different impacts. The total cost is 0 when the selected vehicle is the same as the labeled vehicle. If the impact of selecting a vehicle is similar to the result of selecting the correct vehicle, the cost is low. However, if the selected vehicle leads to a bad result (one that affects safety), the cost is high. This cost is based on Scania’s algorithm, so its details are not described in this report. It is only used in Chapter 4 to evaluate the performance of the machine learning models.
除了准确性外，还采用了碰撞回避加速度（CAA）成本和距离成本作为评估目标选择性能的标准。使用这个标准的原因是，当选择的目标不是正确的目标时，不同选择的影响是不同的。当所选车辆与标记车辆相同时，总成本为0。如果选择某个车辆的影响类似于选择正确车辆的结果，则成本较低。但是，如果选择会导致不良结果（影响安全），成本将较高。这个成本基于Scania的算法，因此本报告不会详细描述这个成本。它仅用于第四章评估机器学习模型的性能。

四、结果

This project’s objective is to explore machine learning methods for adaptive cruise control target selection. The methods introduced in Chapter 3 (shared network, shared-LSTM network, compare-target network, and XGBoost model) are implemented, analyzed, and compared in Section 4.1. A case study is presented in Section 4.2.
这个项目的目标是探索自适应巡航控制目标选择的机器学习方法。在第三章介绍的方法（共享网络，共享-LSTM网络，比较目标网络和XGBoost模型）在第四章的4.1节中被实施、分析和比较。案例研究在第4.2节中呈现。

4.1 比较与分析

All models in the experiment used the same data set. The final evaluations were based on the same test set and show the best performance achieved by each model. The experimental settings and details are given in Appendix B.1 and Appendix B.2. The accuracy and cost of the models introduced in Section 3.2 are listed in Table 4.1.
实验中每个模型使用的数据集均相同。本实验的最终评估基于同一测试集，展示了每个模型所达到的最佳性能。实验设置和详细信息请参阅附录 B.1 和附录 B.2。第3.2节介绍的模型在准确率和成本方面的评估列于表格 4.1 中。
In the time series model, LSTM is used to extract important information from the time series. As Table 4.1 indicates, the best accuracy of the shared network is 90.16% and the best accuracy of the shared-LSTM network is 91.94%. Since the decision mainly depends on the current moment, the time series carries little additional information, so the improvement is small. In addition, the recurrent neural network is harder to train and consumes more resources and time. For these two reasons, the compare-target model does not consider time sequences and uses only data from a single moment as input.
在时间序列模型中，LSTM被用来从时间序列中提取重要信息。正如表4.1所示，共享网络的最佳性能为90.16%，共享-LSTM网络的最佳性能为91.94%。由于决策主要取决于当前时刻，来自时间序列的信息并不太多，因此改进并不明显。此外，由于使用了循环神经网络，训练更加困难，消耗的资源和时间更多。基于这两个原因，在比较目标模型中，不再考虑时间序列，而是仅使用单个时刻的数据作为输入。
Comparing the shared network with the compare-target model shows that the compare-target model clearly performs better. The accuracy of both the compare-target network and the compare-target XGBoost exceeds 94%. A likely reason is that the compare-target methods take the interrelationships between vehicles into consideration. At any moment, the decision on a vehicle is not based on the information of that vehicle alone, because the vehicles influence one another.
通过比较共享网络和比较目标模型，显然可以看出比较目标模型的性能更好。比较目标网络和比较目标XGBoost的准确率可以达到94%以上。比较目标模型表现更好的原因在于，比较目标方法考虑了车辆之间的相互关系。在任何时刻，对车辆的决策都不是单独基于该车辆的信息，而是相互影响的结果。
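The idea of deciding by comparing vehicles with each other, rather than scoring each one in isolation, can be illustrated as follows. The compare-target model itself is defined in Chapter 3 and is not reproduced here, so this is only a sketch under stated assumptions: the comparator `toy_prefer`, the round-robin aggregation, and the feature names `lateral_offset_m` and `distance_m` are all hypothetical and stand in for a trained network or XGBoost model.

```python
from itertools import permutations


def select_by_pairwise_comparison(vehicles, prefer):
    """Return the ID of the vehicle with the highest total pairwise preference.

    `vehicles` maps an ID to a feature dict; `prefer(a, b)` returns a score
    in [0, 1] expressing how strongly `a` is preferred over `b` as target.
    """
    if not vehicles:
        return None
    scores = {vid: 0.0 for vid in vehicles}
    # Every ordered pair is compared once, so each vehicle is judged
    # relative to all of its neighbors.
    for a, b in permutations(vehicles, 2):
        scores[a] += prefer(vehicles[a], vehicles[b])
    return max(scores, key=scores.get)


def toy_prefer(a, b):
    """Hypothetical comparator: prefer vehicles in the ego lane, then closer ones."""
    a_key = (abs(a["lateral_offset_m"]) < 0.5, -a["distance_m"])
    b_key = (abs(b["lateral_offset_m"]) < 0.5, -b["distance_m"])
    return 1.0 if a_key > b_key else 0.0


vehicles = {
    "car_1": {"lateral_offset_m": 0.1, "distance_m": 60.0},
    "car_2": {"lateral_offset_m": 1.2, "distance_m": 25.0},
    "truck_3": {"lateral_offset_m": -0.2, "distance_m": 40.0},
}
print(select_by_pairwise_comparison(vehicles, toy_prefer))
```

In this toy scene the nearest vehicle, `car_2`, is not chosen because it sits outside the ego lane; the choice only emerges from comparing the candidates with each other.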
The result of the compare-target network (fine-tuned) is obtained by fine-tuning the compare-target network with the method described in Section 3.2.3. The result improves slightly after fine-tuning. Fig. 4.1 shows an example of the performance of the compare-target network on log file 78. As this figure shows, from the 5000th step to the 8000th step the selection is not stable or smooth enough. Between the 10000th step and the 10500th step, a vehicle tries to cut in (see Fig. 4.2), but the correct vehicle is not selected in time. One of the factors behind this behavior is data imbalance and noise. After fine-tuning (see Fig. 4.3), the performance improves. In particular, between the 6000th step and the 8000th step the selection is smoother, and the selection performance in the cut-in situation is also better.
经过第3.2.3节提到的方法对比目标网络进行微调后，得到了比较目标网络（经过微调）的结果。微调后性能有所提升。这里以日志文件78的比较目标网络性能为例，参见图4.1。从图中可以看出，从第5000步到8000步，选择并不稳定和流畅。在第10000步到10500步之间，有一辆车试图插入（见图4.2），但未能及时选择正确的车辆。导致这种现象的因素之一是数据不平衡和噪声。经过微调后（见图4.3），性能得到了改善。特别是从6000步到8000步之间，选择过程更加平稳。此外，在插入条件下的选择性能也有所提升。

4.2 案例研究

In this section, the performance of the algorithm is analyzed for specific situations in the test set and some target selection results are displayed. The results of the algorithms in this section are based on the compare-target XGBoost.
在本节中，针对测试集中的特定情况分析了算法的性能，并展示了一些目标选择的结果。本节算法的结果基于比较目标的XGBoost模型。
According to the results of the previous section, the machine learning-based approaches achieve high accuracy overall. That is, the selected target is similar to the target selected by the human driver. For example, in Fig. 4.4 the target selected by the algorithm (the vehicle in the yellow rectangle) is the same as the choice of human drivers. However, in some cases the results of machine learning differ from what we expect. As shown in Fig. 4.5, another vehicle is a better choice for adaptive cruise control, but the machine learning algorithm still chooses the original vehicle as the primary target. One possible cause is error from target detection and sensor fusion. The figure shows that the estimated lane placement differs from the real value (which should be between -1 and -2). The estimates of the yaw rate, trajectory, etc. of this vehicle are also inaccurate. This is just one example, and the parameters affect one another.
根据前一节的结果，基于机器学习的方法总体上能够达到较高的准确率。也就是说，算法选择的目标与人类驾驶员的选择相似。例如，在图4.4中，算法选择的目标（黄色矩形中的车辆）与人类驾驶员的选择相同。然而，在某些情况下，机器学习的结果与我们的预期不同。如图4.5所示，另一辆车辆更适合自适应巡航控制，但机器学习算法仍然选择原始车辆作为主要目标。可能的原因之一是目标检测和传感器融合中的误差。从这幅图中我们可以看出，车道位置的估计与实际值不同（实际值应在-1和-2之间）。实际上，这辆车的偏航率、轨迹等估计也不准确。这只是一个例子，每个参数都相互影响。
Figure 4.1: Target selection performance of the compare-target network on log file 78.
Figure 4.2: An example of a cut-in situation in log file 78. The vehicle in the orange box tries to cut in.
Figure 4.3: Target selection performance of the fine-tuned compare-target network on log file 78.

Figure 4.4: An example of target selection, explained in the main text (example-1).
By analyzing the state at the next moment (see Fig. 4.6), we found that the estimated lane placement of the white vehicle is between -1 and -2 (-1.80), and this vehicle is selected as the primary target. From the results of these three moments, one conclusion that can be initially drawn is that the model learns the impact of the lane placement on the target selection. However, due to the existence of errors and noise, the most suitable choice cannot be made at all times.
通过分析下一时刻的状态（见图4.6），我们发现白色车辆的估计车道位置在-1和-2之间（-1.80），并且这辆车被选为主要目标。从这三个时刻的结果来看，一个初步的结论是模型学习到了车道位置对目标选择的影响。然而，由于存在误差和噪音，并非始终能够做出最合适的选择。
Except for these rare situations, the model performs well. The performance of the algorithm in two real highway environments is shown in Fig. 4.7 and Fig. 4.8.
除了这些罕见的情况，模型表现良好。算法在两个真实的高速公路环境中的性能如图4.7和图4.8所示。
Figure 4.5: An example of target selection, explained in the main text (example-2).

Figure 4.6: An example of target selection, explained in the main text (example-3).

Figure 4.7: An example of the performance of the compare-target XGBoost on real highways (log file 32). The red lines represent the lane. The green cuboids represent the detected vehicles. The yellow rectangle represents the target selected by the machine learning algorithm.
这是比较目标 XGBoost 在真实高速公路上（日志文件 32）的性能示例。红色线条代表车道，绿色立方体代表检测到的车辆，黄色矩形代表机器学习算法选择的目标。

Figure 4.8: An example of the performance of the compare-target XGBoost on real highways (log file 53). The red lines represent the lane. The green cuboids represent the detected vehicles. The yellow rectangle represents the target selected by the machine learning algorithm.
这是比较目标 XGBoost 在真实高速公路上（日志文件 53）的性能示例。红色线条代表车道，绿色立方体代表检测到的车辆，黄色矩形代表机器学习算法选择的目标

五、讨论与未来工作

This chapter discusses the results presented in Chapter 4. Some problems are also analyzed, such as data imbalance and the case where no target should be selected, together with possible directions for future work.
这一章讨论了第四章展示的结果。还分析了一些问题，如数据不平衡和不应选择目标，以及未来工作的可能方向。
Data imbalance is a severe problem. Even after fine-tuning with a balanced data set, the performance in some important situations is still not ideal. Some possible reasons can be found by analyzing the existing data. The data set contains 26 log files, each lasting about 4 to 5 minutes, for a total of more than 100 minutes. For machine learning, this amount of data may still be insufficient. Moreover, the distribution of situations is imbalanced, which makes learning harder. Appendix A.1 shows that there are only 137 target changes in the data set (some of which are not vehicle cut-in or cut-out situations), so the amount of data for these rare cases is insufficient. When time sequences are used as input, the network is expected to detect cut-in situations, but because of the lack of data the improvement is small. To address the imbalance, more data needs to be collected, especially for vehicle cut-in situations. Data augmentation could also be adopted to increase the amount of training data.
Since vehicle cut-in and lane change situations are essential, these situations can be labeled explicitly. In the future, decisions can be made by predicting the lane change behavior of the vehicle.
数据不平衡是一个非常严重的问题。即使在使用平衡数据集进行微调之后，某些重要情况下的性能仍然不理想。通过分析现有数据集，可以找到一些可能的原因。日志文件数量为26个，每个文件持续约4-5分钟，总时长超过100分钟。然而，对于机器学习来说，这个数据集数量可能仍然不足够。此外，情况分布不均衡，这对机器学习不利。通过附录 A.1，我们知道数据集中仅有137次目标变更（其中一些不是车辆变道或退出的情况）。因此，这些稀有情况的数据量是不足的。当我们将时间序列作为输入时，网络预期能够检测到车辆变道情况。由于数据不足，改进并不明显。对于数据集不平衡的问题，需要收集更多数据，特别是车辆变道的情况。此外，可以采用数据增强来增加训练数据的量。
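To see how rare target changes are in a labeled sequence, one can count the time steps at which the labeled target differs from the previous step. This is a minimal sketch; the helper name `count_target_changes` and the toy label sequence are assumptions for illustration, not data from the thesis.

```python
def count_target_changes(labels):
    """Count time steps whose labeled target differs from the previous step."""
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


# Hypothetical labeled target IDs per time step; None means "no target".
labels = [7, 7, 7, 9, 9, None, None, 4, 4, 4]
changes = count_target_changes(labels)
# Share of time steps at which the target changes.
change_ratio = changes / len(labels)
print(changes, change_ratio)
```

On real logs with thousands of steps per file and only 137 changes overall, this ratio would be very small, which is the imbalance discussed above.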
While driving, a suitable neighboring vehicle cannot always be found as the primary target. In Section 3.3, a method is proposed that lets the shared models and the compare-target models decide whether to choose any target at all. However, this method requires a threshold hyper-parameter, which is difficult to set, and the chosen value is not necessarily suitable. A possible solution is to label the vehicles that can be selected as targets and then train a model to determine whether a target can be chosen.
由于车辆变道和车道变更是关键的情况，这些情况可以被标记。未来可以通过预测车辆的车道变更行为来做出决策。在驾驶过程中，并不总是能够找到合适的邻近车辆作为主要目标。在第3.3节中，提出了一种方法，用于确定是否选择任何目标的共享模型和比较目标模型。然而，这种方法需要设定一个超参数阈值，这是很难决定的。同时，也考虑到设置的值不一定合适。对于这个阈值，一个可能的解决方案是标记可以选择为目标的车辆。然后我们可以训练一个模型来确定是否可以选择一个目标。
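The threshold-based decision can be illustrated with a short sketch. The exact method of Section 3.3 is not reproduced here, so this assumes that each candidate vehicle receives a confidence score between 0 and 1; the function name `select_target` and the example scores are hypothetical.

```python
def select_target(scores, threshold):
    """Return the best-scoring vehicle ID, or None if no score reaches the threshold."""
    if not scores:
        return None
    best = max(scores, key=scores.get)
    # Reject the best candidate when the model is not confident enough.
    return best if scores[best] >= threshold else None


# Hypothetical per-vehicle confidence scores at two time steps.
clear_case = {"car_1": 0.20, "truck_3": 0.80}
unclear_case = {"car_1": 0.20, "car_2": 0.35}
print(select_target(clear_case, threshold=0.5))
print(select_target(unclear_case, threshold=0.5))
```

The outcome for `unclear_case` flips as soon as the threshold drops below 0.35, which shows why a fixed value is hard to choose and why a learned "is any target selectable" model is suggested instead.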
The best result is obtained by the compare-target model. This model considers the relationship between any two vehicles, but it does not consider the relationship between all vehicles at the same time. Considering the intrinsic relationship between all vehicles at once would make the dimension of the data extremely large, which increases the difficulty of model training. The data needs to be presented in a suitable format. A possible way to consider all vehicles together is to convert the original features to images. As shown in Fig. 5.1, different colors and shapes represent vehicles, trucks, lanes, and paths. In this way, convolutional neural networks can be used to extract features and consider the spatial relationship between vehicles. However, it is difficult to convert the raw data to images without losing information, and to extract the correct features. The convolutional neural network method finally reached an accuracy of 86.34%. Because the training requires much time and the results are not satisfying, it is not included in the main work. Nevertheless, this method is worth exploring.
比较目标模型获得了最佳结果。该模型考虑了任意两辆车之间的关系。然而，没有同时考虑所有车辆之间的关系。因为如果同时考虑所有车辆之间的内在关系，数据的维度将会非常大，这增加了模型训练的难度。数据需要以合适的格式进行学习。一种可能的方法是将原始特征转换为图形。如图5.1所示，使用不同颜色和形状表示车辆、卡车、车道和路径。通过这种方式，我们可以使用卷积神经网络提取特征，并考虑车辆之间的空间关系。然而，将原始数据转换为图像并提取正确特征是困难的。卷积神经网络方法最终达到了86.34%的准确率。由于训练时间长且结果不理想，这种方法并没有在主要工作中介绍。然而，这个方法值得探索。
The objective of this thesis is to explore machine learning-based methods for adaptive cruise control target selection. It is not necessary to use machine learning in all scenarios, even though machine learning has achieved excellent results in many applications. Making decisions by modeling the motion of surrounding vehicles and the road conditions could make the results more reliable and controllable.
本文的目标是探索基于机器学习的自适应巡航控制目标选择方法。实际上，并不需要在所有场景中都使用机器学习，即使机器学习在许多应用中取得了出色的结果。通过对周围车辆和道路条件的运动建模来做出决策，可能会使结果更可靠和可控。
Figure 5.1: Converting the raw data to images. A white rectangle represents the ego vehicle, green rectangles represent the neighboring vehicles, and white lines represent lanes and paths. Gaussian filters are applied to represent confidence.
图5.1：将原始数据转换为图像。使用白色矩形表示自车，绿色矩形表示邻近车辆，白色线表示车道和路径。应用高斯滤波器表示置信度。

六、结论

The objective is to develop a machine learning-based solution for adaptive cruise control target selection that selects a primary target suitable for the human driver. Data collected on roads is fused for target selection. To apply machine learning algorithms to primary target selection for the adaptive cruise control system, the data is labeled and preprocessed. In this thesis, mainly two methods are developed. The shared network shares the same hidden layers across neighboring vehicles. Based on the shared network, LSTM units replace the first fully-connected layer so that the network can process time-sequence data. Performance improves when the time series is considered, but the difference is not significant because the decisions mainly depend on the current condition. We proposed a compare-target method to consider the relationship between vehicles. The compare-target network performs better than the shared models. Because the data set is imbalanced and noisy, the tree-based method XGBoost is also adopted, and a fine-tuning strategy is used to reduce the impact of data imbalance. Comparing the four machine learning methods with respect to test accuracy and cost, the compare-target XGBoost model performed best, with an accuracy of 94.85% on the test set.
目标是开发基于机器学习的自适应巡航控制目标选择解决方案，能够选择适合人类驾驶员的主要目标。收集在道路上的数据用于目标选择。为了应用机器学习算法选择自适应巡航控制系统的主要目标，数据被标记和预处理。在这篇论文中，主要开发了两种方法。共享网络为周围车辆共享相同的隐藏层。基于共享网络，我们使用LSTM单元替换第一个全连接层来训练时间序列数据。考虑了时间序列数据后，性能有所提高。但由于决策主要取决于当前条件，改进并不明显。我们提出了一种比较目标方法来考虑车辆之间的关系。比较目标网络的性能优于共享模型。由于数据集存在不平衡和噪声，采用了基于树的方法XGBoost。采用了微调策略来减少数据不平衡的影响。比较四种机器学习方法在测试准确性和成本方面，比较目标XGBoost模型在测试集上表现最佳，准确率达到94.85%。

参考文献

在这里插入图片描述

A、数据信息

A.1 日志数据信息

在这里插入图片描述

A.2 信号描述

在这里插入图片描述

B、详细实验设置

B.1 硬件和软件设置

Experiments were run on an HP ZBook Studio G5 Mobile Workstation with an Intel® Core™ i7-8850H 2.60 GHz CPU, a Quadro P1000 4 GB GPU, and 16 GB of RAM. MATLAB and Simulink were used to simulate and unpack the data set, and pandas, a Python-based library, was used for data access. The artificial neural networks and tree models were implemented with the Python-based frameworks and libraries TensorFlow, Keras, xgboost, deepctr, and scikit-learn. The platforms and libraries of this project are detailed in Table B.1.
实验是在一台移动工作站上实施的，具体是HP ZBook Studio G5移动工作站，配备了Intel® Core™ i7-8850H CPU 2.60GHz处理器、Quadro P1000 4GB GPU和16GB RAM。为了模拟和解包数据集，使用了MATLAB和Simulink。而pandas这个基于Python的库被用来访问数据。为了实现人工神经网络和树模型，使用了基于Python的框架TensorFlow、Keras、xgboost、deepctr、scikit。这个项目的平台和库的详细信息显示在表B.1中。
在这里插入图片描述

B.2 实验细节

• Data Split
15% of the data is held out as test data. The remaining data is divided into training data and validation data. Specifically, 4 log files are used as test data, and the other 22 log files are used for training and validation.
数据分割
15%的数据被分割为测试数据。其他数据被划分为训练数据和验证数据。具体来说，4个日志文件被分割为测试数据，另外22个日志文件用于训练和验证。
• K-fold Cross-validation
K-fold cross-validation is applied. The original data samples are randomly divided into k equal-sized subsamples. Of these k subsamples, a single subsample is used as the validation data and the remaining k-1 subsamples are used as training data. The cross-validation process is then repeated k times [41], so that each subsample serves as validation data once. Here k is 10.
K折交叉验证
应用了K折交叉验证。原始数据样本被随机划分为k个等大小的子样本。在这些k个子样本中，一个单独的子样本被用作验证数据。剩下的k-1个子样本保留为训练数据。然后，交叉验证过程重复k次[41]。这里k为10。
• Optimization Algorithm and Hyper-parameters
The optimization algorithm Adam [42] was adopted in all neural network methods. The networks are trained for 20 epochs with a batch size of 100 and an initial learning rate of 0.001.
优化算法和超参数
在所有神经网络方法中采用了Adam优化算法[42]。网络训练20个周期，批量大小为100，初始学习率为0.001。
• Initialization
The weights of neural networks are initialized with He-Normal initialization.
初始化
神经网络的权重使用He-Normal初始化。
• Regularization
An L2 regularization term is applied in the neural network as weight decay to reduce over-fitting. Its coefficient is 0.00001.
正则化
在神经网络中应用了L2正则化项进行权重衰减，以减少过拟合的影响。值为0.00001。
• Learning Rate Decay
The learning rate is reduced when the metric has stopped improving for 3 epochs. The decay factor is 0.1, that is, lr_new = factor × lr.
学习率衰减
当度量指标在3个周期内停止改善时，学习率将减少。学习率衰减的因子为0.1，即新的学习率 lr_new = factor × lr。
• Early Stopping
The training will be stopped when the metric has stopped improving for 10 epochs.
早停
当度量指标在10个周期内停止改善时，训练将停止。
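The learning rate decay and early stopping rules listed above can be replayed on a recorded validation-loss history. This is a minimal sketch in plain Python, not the thesis training code. The function name `simulate_schedule` and the toy loss history are assumptions, and it assumes the monitored metric is a loss, where lower is better. The rules resemble the behavior of the Keras callbacks `ReduceLROnPlateau` and `EarlyStopping`, but the source does not state which implementation was used, and the exact epoch counting of those callbacks may differ from this sketch.

```python
def simulate_schedule(val_losses, lr=0.001, factor=0.1,
                      lr_patience=3, stop_patience=10):
    """Replay a validation-loss history.

    Returns (learning rate in effect after each epoch, epoch at which
    training stops early or None if it never stops).
    """
    best = float("inf")
    since_best = 0    # epochs since the last improvement (early stopping)
    since_decay = 0   # epochs without improvement since the last decay
    lrs = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            since_best = 0
            since_decay = 0
        else:
            since_best += 1
            since_decay += 1
        if since_decay >= lr_patience:
            lr *= factor          # lr_new = factor * lr
            since_decay = 0
        lrs.append(lr)
        if since_best >= stop_patience:
            return lrs, epoch
    return lrs, None


# Hypothetical history: three improving epochs, then a plateau.
history = [1.0, 0.9, 0.8] + [0.85] * 12
lrs, stop_epoch = simulate_schedule(history)
print(stop_epoch, lrs)
```

With the values from this appendix (factor 0.1, patience 3 and 10), a long plateau reduces the learning rate several times before early stopping ends the run.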