

Vehicles will be more complex, safe, and intelligent in the future. For instance, with the support of advanced driver assistance systems (ADAS), the safety and comfort of the driver and the passengers can be significantly improved. This degree project proposes data-driven solutions for adaptive cruise control (ACC) target selection, which select one of the preceding vehicles as the primary target in a way similar to the choice of human drivers. This master’s degree project was carried out at Scania CV AB. A shared network and a shared-LSTM network were used to select the primary target. In addition, a novel machine learning based target selection model (the compare-target model) was designed, which considers all neighboring vehicles together by comparing them. A compare-target network and a compare-target XGBoost model were developed based on the compare-target model. In total, four different machine learning methods were adopted to select the primary target for ACC: a shared network, a shared-LSTM network, a compare-target network, and a compare-target XGBoost model. These methods were compared and analyzed. Fine-tuning was adopted to overcome the data imbalance problem of rare situations. The compare-target XGBoost model achieves 94.85% accuracy on the test set.


This chapter introduces the subject of machine learning for adaptive cruise control target selection. Autonomous systems are expected to cope with plenty of complex situations. However, when the situations are complicated and changeable, the solutions are difficult to design through classic programming. With a relatively large amount of data, it is possible to tackle complex situations through machine learning. The research questions and objectives are formulated in this chapter. A summary of contribution and delimitation is
presented, followed by the consideration of ethics, together with an outline of this thesis.
In modern society, transportation is closely related to people’s lives and is an indispensable part of life. Many developed cities in the world are encountering problems caused by excessive use of private cars, such as traffic jams, frequent accidents, parking difficulties, energy shortages, noise pollution, and environmental pollution. These problems have severely reduced the quality of life [1]. Autonomous or semi-autonomous vehicles are not only expected to improve the efficiency and safety of the transportation system, but also have other benefits such as high flexibility, low operating cost, and environmental protection. With these advantages, autonomous vehicles can be used as a useful supplement to urban intensive public transportation, providing high-quality regional transportation services [2].
When a driver is less involved in operating a vehicle, there is an increased demand for sophisticated solutions that make correct driving decisions regarding surrounding traffic. Adaptive Cruise Control (ACC) is an assist function that relieves the driver from having to adapt the vehicle’s speed and distance to surrounding traffic. Based on data from different sensors, such as radar and camera, it automatically keeps a set time gap to one preceding vehicle to avoid vehicle-to-vehicle collisions [3, 4]. For Adaptive Cruise Control, it is vital to select the right preceding vehicle as the primary target, both to avoid accidents and to optimize fuel consumption. A correct choice also makes the driver feel comfortable and able to rely on the system. For this reason, Scania is exploring different ways of choosing the right target. This study is part of a project in the Autonomous System Group at Scania CV AB, and Scania provides the necessary data for this thesis project.

1.1 Problem Statement

There are several sensors, such as radar and camera, on Scania trucks. These sensors collect information about the environment. A series of information on the surrounding environment and vehicles can be obtained through data fusion. For example, the position of the vehicles in the lane can be calculated from this information by the corresponding algorithm. The primary target is chosen through a target selection algorithm, and the information of the primary target is transmitted to a longitudinal controller for the ACC system to maintain a chosen distance. A simple solution for target selection is to choose the vehicle that drives in the same lane. As shown in Fig.1.1(a), the primary target is the preceding vehicle that drives in the same lane. However, when a vehicle tries to cut in, as shown in Fig.1.1(b), the primary target could be the cut-in vehicle or the one in the same lane. Also, it is difficult to estimate the driving lane when the lane is only partly observed or when there are no lane markings.
Figure 1.1: The situations of primary target selection [5].
A human driver bases each decision on personal experience, gained knowledge, and a limited focus. Most existing algorithms consider a few situations (those considered significant) and the status of neighboring vehicles. The possible meaningful targets are detected using relative distance and velocity, with in-lane target detection and motion-based analysis [5]. The target is selected based on position and confidence. For example, the algorithm will mostly select the closest vehicle in the same lane as the target [6, 7]. When the preceding vehicle changes to another lane or is about to stop, the algorithm will consider choosing another target. One advantage of classic programming is the ability to adapt the behavior to situations that may not have occurred yet. However, in the real world, there are many complex traffic situations, and it is hard to cover all of them with classical programming. With access to a large amount of training data and real-time sensor data, machine learning algorithms should be able to make more informed decisions than the current algorithm. Moreover, the truck could then be used in more complex traffic situations and new environments, which could enhance its adaptability and robustness.
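The rule-based baseline described above (select the closest preceding vehicle in the ego lane) can be sketched as follows. This is only an illustrative sketch, not Scania’s algorithm: the `Vehicle` fields and the function name are hypothetical, and the lane assignment is assumed to come from sensor fusion.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vehicle:
    # Hypothetical fields; the real fused signals are not specified here.
    vehicle_id: int
    longitudinal_distance_m: float  # distance ahead of the ego truck
    in_ego_lane: bool               # lane assignment from sensor fusion


def select_closest_in_lane(vehicles: List[Vehicle]) -> Optional[Vehicle]:
    """Return the nearest preceding vehicle in the ego lane, or None."""
    candidates = [
        v for v in vehicles
        if v.in_ego_lane and v.longitudinal_distance_m > 0
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda v: v.longitudinal_distance_m)


vehicles = [
    Vehicle(1, 60.0, True),
    Vehicle(2, 35.0, False),  # closer, but still in the adjacent lane (possible cut-in)
    Vehicle(3, 90.0, True),
]
target = select_closest_in_lane(vehicles)
print(target.vehicle_id if target else None)
```

The example also shows the weakness discussed in the text: vehicle 2 is ignored until the lane assignment flips, even if it is already cutting in.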
The most challenging problems are data imbalance and noise. The data were collected from the real world, where some traffic situations are rare. In other words, the data are imbalanced. In addition to the imbalanced traffic situations, the algorithm also needs to handle noisy and inaccurate data, caused for example by poor lane estimation in curves or incorrect prediction of the ego vehicle movement. Also, part of the data is incomplete because the sensors do not always provide a value.
Several interesting research questions arise:
• How to choose and preprocess the training data?
• How to solve or overcome the challenges of the data (imbalanced, incomplete, incorrect)?
• What kind of model or algorithm can be chosen and extended to adapt to this project?
• How to validate and evaluate the performance of the algorithm(s)?

1.2 Objectives and Tasks

The structure of the ACC system is shown in Fig.1.2. This thesis focuses on the module of primary target selection. The objective is to investigate the possibility of correctly selecting the preceding vehicle with machine learning algorithms. The proposed methods should be implemented, evaluated, and compared to each other. Furthermore, the proposed solution should be suitable for autonomous heavy-duty vehicles, which means the method should be based on the data from the sensor fusion module in Fig.1.2.
Figure 1.2: The ACC system structure of the truck.
The tasks can be split into three main parts. The first part is to preprocess the data. The data used for this project are log files that contain the information of the ego truck and neighboring vehicles after sensor fusion. The data are time series recorded every ten milliseconds and were labeled manually. The second part is to develop machine learning algorithms, which includes model selection, training, and testing. The third part is to evaluate and compare the performance of the different machine learning algorithms and to visualize the selection process.

1.3 Contributions

A well-functioning model for target selection is a significant module in an adaptive cruise control system. Currently, to the best of the author’s knowledge, there is no existing reliable machine learning based model for adaptive cruise control target selection. The main contribution is the implementation and evaluation of several machine learning methods for ACC target selection. In addition, a novel method of target selection by comparing vehicles is proposed.
The decisions of the primary target selection are based on the information collected through sensors. The most crucial decision-making factors are the states of the surrounding vehicles, but other factors such as the environment and roads also have a certain degree of impact on decision-making. With traditional methods, it is difficult to model and analyze the relation between a variety of complex factors, and when the information obtained by sensors changes, the algorithm also needs to change accordingly. A machine learning based approach avoids the manual analysis of the relationships between various signals, and a complex model can take more factors into account. This is useful since the available information may differ between types of trucks. With machine learning, the process of creating a specific mathematical algorithm for each truck can be avoided. Machine learning models should be applicable to a variety of similar trucks, requiring only the corresponding data to train the model.
The novelty of the proposed method (compare-target model, see Section 3.2) is that it considers all surrounding vehicles together instead of one by one. Scania’s current algorithm considers the situation of each neighboring vehicle separately and makes decisions based on the condition of each vehicle. In reality, however, the target selection should consider all vehicles at the same time, since the state of one vehicle affects the choices made for other vehicles. Making decisions by simultaneously considering all vehicles is therefore more reasonable.
We develop a compare-target network and a compare-target XGBoost model based on this method. Moreover, the performances are evaluated and compared with the shared models (shared network and shared-LSTM network, see Section 3.2).
Answering the research questions listed in Section 1.1 can provide insight into how to properly build up a decision model for target selection and which features and situations are of more importance.

1.4 Ethical Considerations

Machine learning and deep learning models are deployed in increasingly large parts of the world. Deep learning, as an essential branch of artificial intelligence, is also receiving more and more attention.
Deep learning has an increasing impact on human life, but the internal operations of deep learning systems are often opaque. After a deep learning system is trained, it is hard to see how it makes a decision. In many cases, this is unacceptable even if the system gets the right answer. The weaknesses exposed by deep learning are increasingly drawing public attention to artificial intelligence. This is especially true in the field of driverless cars, where autonomous driving uses similar deep learning techniques for navigation, which has led to well-known disasters and fatalities [8]. From a legal point of view, GDPR states that individuals have the right to know the reasoning behind a decision that has affected them adversely, even if the reasoning is purely algorithmic. Thus, the topics of model interpretability and explainability are becoming more important.
Still, undeniably, deep learning is a powerful tool. Deep learning has made it common to deploy applications such as facial recognition and speech recognition at a level that was almost impossible to achieve ten years ago. Thus, it is hard to imagine that deep learning will be abandoned at this time. It is more likely that deep learning methods will be modified or enhanced, for example by combining deep learning with traditional methods to improve the interpretability of the model. We hope that a better understanding of machine learning models will provide more insight into high-dimensional data and contribute to addressing vulnerabilities in the current decision models.

1.5 Thesis Outline

This thesis follows the typical structure of a degree report. Chapter 2 provides readers with the background and theory necessary for understanding the problems and proposed solutions, and reviews the literature on adaptive cruise control target selection, machine learning, and time series to identify a promising approach. Chapter 3 describes the details of the dataset used and the preprocessing of the dataset. Moreover, this chapter presents the approach for target selection and the methods used for performance evaluation. The results and evaluations of target selection are presented and analyzed in Chapter 4, and discussed in Chapter 5. Finally, Chapter 6 summarizes the work in this thesis together with suggestions for future work.


This chapter presents the relevant background of this thesis project and the related work. Section 2.1 describes the concept of adaptive cruise control, presents the strengths and prerequisites of ACC, and briefly introduces the current algorithm. The theory relevant to the machine learning methods used in this thesis is presented in Section 2.2, including the concepts of artificial neural networks, XGBoost, and LSTM. Section 2.3 presents the related work in this field, including some schemes of primary target selection (other than the one used at Scania).

2.1 Adaptive Cruise Control (ACC)

Adaptive cruise control is based on conventional cruise control. ACC improves the comfort and energy efficiency of vehicle driving while ensuring safety, and overcomes some limitations of human drivers. The distance and relative speed between the ego vehicle and the preceding vehicle are measured and fused in real time by different sensors. Appropriate control signals are calculated to adjust the ego vehicle speed and thereby control the distance automatically. The control system of adaptive cruise control needs to be provided with the information of a primary target. For this reason, some research focuses on the primary target selection algorithm.
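As a small worked example of the quantity the longitudinal controller regulates, the time gap is the distance to the primary target divided by the ego speed, and the desired distance is the set time gap multiplied by the ego speed. The numbers and function names below are illustrative assumptions, not values from Scania’s system.

```python
def time_gap_s(distance_m: float, ego_speed_mps: float) -> float:
    """Time gap to the primary target: distance divided by ego speed."""
    if ego_speed_mps <= 0:
        raise ValueError("time gap is undefined for a stationary ego vehicle")
    return distance_m / ego_speed_mps


def desired_distance_m(set_time_gap_s: float, ego_speed_mps: float) -> float:
    """Distance that corresponds to the driver-selected time gap."""
    return set_time_gap_s * ego_speed_mps


ego_speed = 25.0   # m/s, i.e. 90 km/h (assumed)
distance = 40.0    # m to the primary target (assumed)

gap = time_gap_s(distance, ego_speed)
wanted = desired_distance_m(2.0, ego_speed)  # 2.0 s set time gap (assumed)
distance_error = distance - wanted           # negative: the truck is too close

print(f"time gap {gap:.2f} s, desired distance {wanted:.1f} m, error {distance_error:.1f} m")
```

If the wrong vehicle is selected as the primary target, `distance` refers to the wrong object and the controller regulates the wrong gap, which is why target selection matters for both safety and comfort.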
Human limitations, strengths of ACC, and target selection algorithms are discussed in the following subsections.

2.1.1 Human Limitations

The limitation of the driver’s judgment and reaction on the surrounding traffic conditions (e.g., the driving states of neighboring vehicles) is a critical cause of traffic flow instability, traffic congestion, and traffic accidents [9]. Scholars from all over the world generally believe that if the traffic flow characteristics cannot be effectively improved, it is difficult to obtain a fundamental breakthrough in optimizing road capacity and traffic safety [10]. Automated driving technology is expected to improve the traditional traffic flow characteristics from the microscopic vehicle level, thus providing an effective way to solve traffic problems.

2.1.2 Strengths of ACC

Adaptive cruise control is an essential type of Advanced Driver-Assistance Systems (ADAS) in the study of autonomous vehicles. ACC vehicles can obtain the driving status of preceding vehicles in real time through onboard detection devices and have a more timely and accurate traffic condition perception capability than average drivers.
ACC systems can automatically adjust the vehicle speed to ensure safety and improve driving comfort and energy saving [11]:
• Safety: The average time for a normal driver to become aware of a situation and react is about 1.0 to 1.3 seconds [12]. The response time of ACC is shorter than that of human drivers. Thus, ACC could avoid most traffic accidents more effectively.
• Comfort: The driver needs to concentrate during driving and constantly operate the vehicle to maintain a safe braking time. In traffic congestion, the vehicle repeatedly moves forward and stops, and the driver needs to coordinate hands and feet continuously. This is the main cause of driver fatigue, and ACC can free the driver from this repetitive and stressful task.
• Energy efficiency: Low-carbon living and energy conservation are themes and trends of social development. More emissions are generated when the driving speed changes frequently, and ACC can keep the vehicle driving more smoothly. Further, ACC can maintain a proper distance [13], so it can effectively improve the road capacity and ease traffic congestion, which would also bring economic benefits. Recent studies have shown that if the proportion of vehicles equipped with ACC systems on the highway reaches 25%, highway congestion can be eliminated [14].
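To put the reaction time quoted in the safety item into perspective, the distance travelled before a driver reacts can be computed directly. The speed of 80 km/h is an assumed, illustrative highway speed for a heavy truck; only the 1.0 to 1.3 s range comes from the text.

```python
def reaction_distance_m(speed_kmh: float, reaction_time_s: float) -> float:
    """Distance covered at constant speed during the reaction time."""
    speed_mps = speed_kmh / 3.6
    return speed_mps * reaction_time_s


for reaction_time in (1.0, 1.3):
    distance = reaction_distance_m(80.0, reaction_time)
    print(f"{reaction_time:.1f} s at 80 km/h: {distance:.1f} m")
```

At this speed the truck covers roughly 22 to 29 m before a human driver starts to react, which is the margin a faster-responding ACC system can partly recover.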

2.1.3 Target Selection

Since there is more than one vehicle in real highway situations and various transitions between the ego truck and the surrounding vehicles occur, it is necessary to set up a proper target selection strategy to apply the adaptive cruise control system to multi-vehicle scenarios. To this end, the algorithm needs to determine which surrounding vehicle is the best target for the ACC system based on the current traffic situation. After determining the target within the lane, the primary target selection module forwards the information to the longitudinal controller to navigate the ego truck smoothly and guarantee safety in complex traffic situations. The existing primary target selection algorithms are based on classic programming and consider several significant situations. The truck obtains the surrounding information through sensors, and after data fusion and calculation, some states of the surrounding vehicles and the ego truck can be obtained. Possible targets are determined using the position, speed, lane placement, confidence, and other information. Finally, based on a series of criteria, the algorithm chooses the most appropriate of these possible targets as the primary target.

2.2 Machine Learning

The purpose is to use machine learning methods to address the target selection of adaptive cruise control. Artificial neural networks (ANN) are computational models that mimic the structure and function of biological neural networks and have been used to solve a wide variety of problems. Extreme gradient boosting (XGBoost) has shown outstanding performance in many machine learning competitions. The data set has problems with imbalance and noise, and XGBoost is a tree-based approach that is expected to be less sensitive to such data problems. The data set also contains time information, so LSTM is used for the time sequences. The concepts of artificial neural networks, extreme gradient boosting, and long short-term memory are introduced in this section.

2.2.1 Artificial Neural Networks

In computational science, artificial neural network models are inspired by and abstracted from the biological central nervous system and are now commonly used in pattern recognition and machine learning. The nervous system of animals is made up of a large number of neurons, or nerve cells. The stimulation signals of peripheral neurons are continuously transmitted for communication. Similarly, artificial neural networks are computational models consisting of a large number of connected nodes (or neurons). Each node represents a specific output function called the activation function. Starting from certain input data, each node receives its input and produces the corresponding output, which serves as the input of the next node. In this way, the information is continuously transmitted from node to node until the output is obtained. The connection between every two nodes carries a weighting value for transmitting the signal, called the weight, which is equivalent to the memory of the artificial neural network. Just as the animal’s nervous system constantly adjusts the synaptic connections between neurons while learning, the artificial neural network adjusts the weights of the connections between nodes. The output of the network varies depending on the connections in the network, the weight values, and the activation function [15].
The network mainly refers to the connections between neurons at different levels in the system. Take a typical three-layer artificial neural network as an example. The first layer has only input neurons, which transmit the input signals to the neurons in the second layer through synapses (connections between neurons). The signals are then transmitted to the third layer, which consists of the output neurons. More complex networks may contain more levels of neurons, and there may be more input neurons and output neurons. A multi-layer model has a stronger ability to describe and handle nonlinear problems.
A multi-layer artificial neural network model uses a back-propagation feedback strategy to adjust the weight [16, 17, 18] by computing the gradient of the cost function. In this way, the input data can be repeatedly fed into the artificial neural network for calculation, and the artificial neural network is adjusted. An error can be obtained by comparing the actual output of each calculation with the desired output. According to this error, the weights in the artificial neural network can be adjusted so that the next time the same data is input to the artificial neural network, the output has a smaller error. This repetitive process is called learning (training).
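The training loop described above can be illustrated with a minimal three-layer network (two inputs, two hidden neurons, one output) trained by back-propagation. This is a toy sketch in plain Python under assumed settings (sigmoid activations, squared error, logical OR as data, hand-picked initial weights); it is not the network used in this thesis.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Input layer -> hidden layer -> single output neuron."""
    # Each hidden row is [weight_x0, weight_x1, bias].
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    # w_out holds one weight per hidden neuron followed by a bias.
    z_out = sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1]
    return hidden, sigmoid(z_out)


def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)


def train_epoch(data, w_hidden, w_out, lr):
    """One pass of per-sample gradient descent with back-propagation."""
    for x, target in data:
        hidden, y = forward(x, w_hidden, w_out)
        # Error signal at the output for the loss 0.5 * (y - target)^2.
        delta_out = (y - target) * y * (1.0 - y)
        # Propagate the error back through the not-yet-updated output weights.
        delta_hidden = [delta_out * w_out[j] * h * (1.0 - h) for j, h in enumerate(hidden)]
        for j, h in enumerate(hidden):
            w_out[j] -= lr * delta_out * h
        w_out[-1] -= lr * delta_out
        for w, d in zip(w_hidden, delta_hidden):
            w[0] -= lr * d * x[0]
            w[1] -= lr * d * x[1]
            w[2] -= lr * d


# Toy data set: logical OR of two inputs.
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]
w_hidden = [[0.1, -0.2, 0.0], [0.3, 0.2, 0.0]]
w_out = [0.2, -0.1, 0.0]

loss_before = mean_squared_error(data, w_hidden, w_out)
for _ in range(2000):
    train_epoch(data, w_hidden, w_out, lr=0.5)
loss_after = mean_squared_error(data, w_hidden, w_out)
print(f"MSE before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Each call to `train_epoch` feeds the same data again, compares the actual output with the desired output, and adjusts the weights so that the error becomes smaller on the next pass.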

2.2.2 Extreme Gradient Boosting (XGBoost)

EXtreme Gradient Boosting (XGBoost) is a boosting algorithm based on regression trees that offers fast running speed and good performance. It is an improvement of the Gradient Boosting Decision Tree (GBDT) [19]. Its effectiveness has been verified in a large number of machine learning and data mining competitions [20, 21].
Boosting is a kind of ensemble learning algorithm, which is based on the concepts of strongly learnable and weakly learnable proposed by Kearns and Valiant. It has been proved that a problem is strongly learnable if and only if it is weakly learnable [22].
First, a number of basic weak models are generated and trained. After obtaining the results of these models, the boosting method weights them to produce the final result. In the boosting framework, the accuracy of each weak classifier may be low, but after weighted fusion, the final result is greatly improved [23].
Gradient Boosting Decision Tree (GBDT) is an improvement of the boosting algorithm. GBDT uses tree models as basic models. The final result is obtained by iteration, where each iteration reduces the residual of the previous iteration [24]. When the data set is large and complex, the GBDT algorithm is computationally intensive and inefficient. In 2015, Tianqi Chen proposed XGBoost to address this shortcoming of the GBDT algorithm [20].
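The idea that each iteration reduces the residual of the previous one can be sketched with one-split regression trees (stumps) on a one-dimensional toy problem. Note that this is plain gradient boosting with squared error, written from scratch for illustration; it is not XGBoost itself, which adds regularization, second-order information, and system-level optimizations. All names and data are assumptions.

```python
def fit_stump(xs, residuals):
    """Find the split (threshold, left_value, right_value) with the lowest squared error."""
    best = None
    # Exclude the largest value so that the right side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict(x, base, stumps, lr):
    """Base value plus the shrunken sum of all stump outputs."""
    return base + lr * sum(left if x <= thr else right for thr, left, right in stumps)


def boost(xs, ys, rounds, lr):
    base = sum(ys) / len(ys)  # start from the mean prediction
    stumps = []
    for _ in range(rounds):
        # Each new stump is fitted to what the current ensemble still gets wrong.
        residuals = [y - predict(x, base, stumps, lr) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return base, stumps


def training_mse(xs, ys, base, stumps, lr):
    return sum((y - predict(x, base, stumps, lr)) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [float(i) for i in range(8)]
ys = [x * x for x in xs]  # toy nonlinear target
base, stumps = boost(xs, ys, rounds=20, lr=0.5)
mse_start = training_mse(xs, ys, base, [], 0.5)
mse_end = training_mse(xs, ys, base, stumps, 0.5)
print(f"training MSE: {mse_start:.3f} -> {mse_end:.3f}")
```

Each stump alone is a weak model, but the weighted sum of many stumps fits the target much more closely than the initial mean prediction.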

2.2.3 Long Short-Term Memory (LSTM)

Different from the traditional feedforward neural network, the signal feedback structure in the recurrent neural network (RNN) makes the output state of the network at a certain moment related to the historical signal before this moment, thus showing certain dynamic characteristics and memory ability.
In recent years, RNNs have achieved remarkable success in speech recognition, text analysis, and other tasks [25, 26]. One important reason is that the long short-term memory (LSTM) unit has been used in the structure of the RNN. LSTM uses different gates to enhance the memory of the network, thus solving the problem of the vanishing gradient [27]. When the RNN is unrolled in time, as shown in Fig.2.1, its structure is similar to a model with multiple layers of the same network. However, vanilla recurrent neural networks have limited memory because the gradient eventually vanishes or explodes when it propagates through time.
Figure 2.1: Unrolling a recurrent neural network [28]. A is a chunk of neural network, X_t is the input, and h_t is the output.
The LSTM-based RNN addresses this issue by improving the internal structure of this chain. Gate architectures are used in the LSTM unit to keep the error constant when propagating. Among these gates, the forget gate determines which information needs to be discarded from the cell state of the previous moment. The update gate (also called the input gate) determines which cell states need to be updated. The output gate filters information based on the cell state. These three gates are implemented using the sigmoid function. As shown in Fig.2.2, the most prominent characteristic is the use of three sigmoid layers and point-wise multiplication operations to strengthen the control of the information transmission. For more details about the structure of LSTM and the functionality of the gates, readers are referred to [28, 29].
Figure 2.2: The structure of a network with three LSTM cells [28]
In summary, the LSTM-RNN uses gates to control the memory of the RNN model. In the training phase, the weights and biases are learned from the historical data, and the characteristics of the historical states are identified and memorized.
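The role of the three gates can be made concrete with a single step of a scalar LSTM cell, following the standard LSTM equations. The sketch below is only illustrative: real cells use weight matrices instead of scalars, and the parameter values are hypothetical, chosen to show how a nearly open forget gate and a nearly closed update gate preserve the cell state.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a scalar LSTM cell.

    params maps a gate name to (weight_for_x, weight_for_h_prev, bias).
    """
    def pre_activation(name):
        w_x, w_h, bias = params[name]
        return w_x * x + w_h * h_prev + bias

    forget = sigmoid(pre_activation("forget"))       # what to discard from c_prev
    update = sigmoid(pre_activation("update"))       # how much new content to write
    candidate = math.tanh(pre_activation("candidate"))
    c = forget * c_prev + update * candidate         # point-wise cell state update
    output = sigmoid(pre_activation("output"))       # what to expose from the cell
    h = output * math.tanh(c)
    return h, c


# Hypothetical parameters: forget gate ~1 (keep memory), update gate ~0 (ignore input).
keep_memory = {
    "forget": (0.0, 0.0, 20.0),
    "update": (0.0, 0.0, -20.0),
    "candidate": (1.0, 1.0, 0.0),
    "output": (0.0, 0.0, 0.0),
}

h, c = 0.0, 0.7
for x in (1.0, -2.0, 0.5):
    h, c = lstm_step(x, h, c, keep_memory)
print(f"cell state after three steps: {c:.6f}")
```

With these settings the cell state stays close to its initial value regardless of the inputs, which is the mechanism that lets an LSTM carry information over many time steps.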

2.3 Related Work

This section overviews the previous related research on adaptive cruise control target selection. Seungwuk, Hyoung-Jin, and Kyongsu [5] studied the target selection strategy through simulations of several driving scenarios, using the information from sensors. A driving area of the ego vehicle is determined using the curvature estimated by the yaw rate and the speed, and the width of the driving area is calculated by the vehicle width. A monitoring area is defined to detect the neighboring vehicle inside the ego vehicle driving area.
The in-lane vehicles are determined by considering the relative distance and velocity of detected vehicles. Three indexes of longitudinal motion, lateral motion, and warning are calculated to determine the driving status of each target by comparison. The closest target with representative driving status is determined as the primary target.
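The driving-area idea can be illustrated with a small geometric sketch. The curvature of the ego path is the yaw rate divided by the speed; the rest of the code is an assumption made for illustration and is not taken from [5]: the path is approximated by a parabola (valid for small heading changes), positive values mean "to the left", and the area width is a hypothetical parameter.

```python
def path_curvature(yaw_rate_rad_s: float, speed_mps: float) -> float:
    """Curvature (1/m) of the ego path: yaw rate divided by speed."""
    if speed_mps <= 0:
        raise ValueError("curvature cannot be estimated at standstill")
    return yaw_rate_rad_s / speed_mps


def in_driving_area(x_m: float, y_m: float, curvature: float, area_width_m: float) -> bool:
    """Check whether a vehicle at (x ahead, y to the left) lies inside the driving area.

    Assumption: the circular path is approximated by y = 0.5 * curvature * x^2.
    """
    path_offset = 0.5 * curvature * x_m ** 2
    return abs(y_m - path_offset) <= area_width_m / 2.0


curvature = path_curvature(yaw_rate_rad_s=0.02, speed_mps=20.0)  # about 0.001 1/m

# Vehicle A follows the curve; vehicle B is straight ahead in the sensor frame.
print(in_driving_area(60.0, 1.8, curvature, area_width_m=3.0))   # vehicle A
print(in_driving_area(60.0, -1.0, curvature, area_width_m=3.0))  # vehicle B
```

In this example vehicle A lies on the predicted path although it is 1.8 m off-centre in the sensor frame, while vehicle B lies outside it; with a straight-line assumption (zero curvature) the result would be reversed.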
Il-ki et al. [30] analyzed several essential situations and made corresponding decisions for each situation under the assumptions that the ego vehicle is in the middle of the lane and that the width of the lane is known. For instance, when the sign of the lateral position is opposite to the sign of the lateral velocity, the target is approaching the lane of the host vehicle. When this happens, a weighted lateral position is adopted in the primary target decision process. The weighted position function makes a rapid decision possible, while also considering the fact that the cut-in vehicle is not yet in the ego vehicle lane. The primary target selection algorithm in [30] integrates additional information from multiple models, which allows the ACC system to adapt to another target more intelligently and in a way more reminiscent of human drivers. Because the vehicle does not always follow the curvature of the lane exactly, and the radar cannot estimate the change in curvature, the target is sometimes confused.
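The sign test for an approaching vehicle can be written down directly. The weighting itself is not specified in this text, so the function below uses a hypothetical stand-in (shifting the position by the lateral velocity times a look-ahead time) purely to show how a weighted position can trigger an earlier decision; it should not be read as the function used in [30].

```python
def is_approaching_ego_lane(lateral_position_m: float, lateral_velocity_mps: float) -> bool:
    """True when position and velocity have opposite signs (moving toward the lane centre)."""
    return lateral_position_m * lateral_velocity_mps < 0


def weighted_lateral_position(lateral_position_m: float,
                              lateral_velocity_mps: float,
                              look_ahead_s: float = 1.0) -> float:
    """Hypothetical weighting: use the predicted position for approaching vehicles only."""
    if is_approaching_ego_lane(lateral_position_m, lateral_velocity_mps):
        return lateral_position_m + look_ahead_s * lateral_velocity_mps
    return lateral_position_m


half_lane_width_m = 1.75  # assumes a known lane width of 3.5 m

# A vehicle 2.5 m to the left, drifting toward the ego lane at 1.0 m/s.
y, vy = 2.5, -1.0
y_weighted = weighted_lateral_position(y, vy)

print(abs(y) <= half_lane_width_m)           # raw position: not yet in the lane
print(abs(y_weighted) <= half_lane_width_m)  # weighted position: treated as a candidate
```

The raw position still places the vehicle in the adjacent lane, whereas the weighted position already falls inside the ego lane, which is what allows a faster reaction to cut-in maneuvers.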
Eskandarian and Azim [31] introduced two methods for target selection to improve the performance. Using the steering angle or yaw rate sensor on the ego vehicle to estimate the curvature is a common approach, but this method is only valid when the curvature is constant. Aided visual sensors are utilized as another approach for detection. However, these sensors work unsatisfactorily and further raise the cost. In this thesis project, lane placement is used as an input feature, which can represent the in-lane condition.
Shifeng et al. [32] developed a method that takes the operational behavior of the ego vehicle into consideration and provides sufficient qualitative analysis of trajectories. Shifeng et al. use both a preliminary classification and a final classification to decide between different operating conditions, and combine the result with the trajectory to determine the primary target. This approach overcomes some typical shortcomings of traditional ACC, such as confusion between lane changes and curve entries of preceding vehicles. In this project, the variable path placement is one of the input features and includes the information of trajectories.
Jianqiang et al. [33] proposed a method to improve the accuracy of the primary target selection based on multi-feature fusion. The data is preprocessed by distance compensation factor (DCF) correction to correct the in-lane probability provided by the lidar. Kalman filtering is adopted to track and predict the distance and velocity of neighboring vehicles [34]. In addition, a two-layer artificial neural network is designed to obtain the importance weights of the feature variables. The training output is finally used for target selection. The method in [33] adopted an artificial neural network to predict the lane probability of a vehicle. In this thesis project, an artificial neural network is instead adopted to predict the selected target.
A video and radar data fusion system was developed in [35] for improved target selection. This framework is capable of fusing target information captured from a camera system and a radar system. The information from the two sensor systems is kept at as low a level as possible to facilitate exchange. An improved path selection and fused target states are used to identify targets that perform cut-in or cut-out maneuvers. This information is utilized in the primary target selection scheme. The methods of this thesis are data-driven. Thus, the accuracy of the data has a large impact on the results. More sensors could be utilized to improve the accuracy of sensor fusion.


This chapter presents the methods used to select a target for adaptive cruise control. The data collection, annotation, and preprocessing are introduced in Section 3.1, together with some details about the data. Section 3.2 introduces the problem-solving strategies and the architectures of the machine learning models, including shared models and compare-target models. Section 3.3 covers the case in which no target should be selected, and Section 3.4 describes the evaluation method.

3.1 Data Preparation

The experiments in this project are based on data collected and fused by Scania’s test vehicles. Because supervised learning is used, the data need to be annotated. Several methods are introduced in Section 3.2, and different models require different pre-processing of the data.
The details of the data, the annotations, and the pre-processing methods are described in the following subsections.

3.1.1 Data Introduction

The raw data for this thesis project was extracted from Scania’s database. Each of Scania’s test trucks carries one radar and one camera. The data is recorded by these sensors and stored in log files. The input to the target selection algorithm comes from a sensor fusion algorithm applied to the raw data. The trucks were driven on several different highways in Europe.
The data are time series recorded at 100 Hz. Of the data in the database, a total of 26 logs have been labeled in the way described in Section 3.1.2. Each log file covers around 4 minutes. The duration and the number of target changes of each log are presented in Appendix A.1. After sensor fusion and a series of calculations, many features of the ego truck and the neighboring vehicles can be obtained. However, to allow a comparison with the current algorithm, the same inputs as the current algorithm are used. These inputs contain the states of the ego truck and the neighboring vehicles, such as position, velocity, path placement, lane placement, path confidence, lane confidence, type, and other path information. The parameters of interest in this project and their descriptions are listed in Appendix A.2.
One of the main challenges comes from the data set. The position, velocity, width, and other parameters detected by the sensors are not always accurate, so the path placement, lane placement, and confidence values are not fully reliable either.
The estimation of lane placement is one example. By inspecting the parameters and the raw video of the data set, it can be observed that a vehicle is sometimes in the same lane as the ego truck while, due to this inaccuracy, its lane placement value indicates that it is still in another lane. Vehicles in other lanes are usually not selected as the primary target. In Fig. 3.1, for instance, part of the white vehicle on the right side is in the same lane as the ego truck.

Figure 3.1: An example of inaccurate estimation. The value of the lane placement of the white vehicle is -2.16, and the lane confidence is 1.00. The definition of the lane placement is illustrated in Fig. 3.2.
Lane placement is defined by the position of a vehicle relative to the driving lane. For example, as Fig. 3.2 indicates, when the vehicle is exactly on the left of the right side lane, the value of lane placement is -2.
Figure 3.2: Lane placement definition. The value of the vehicle’s lane placement depends on where the vehicle is located.
Based on this definition, the lane placement value of the white vehicle should be greater than -2 (because the vehicle is on the right side, the value should be between -1 and -2). However, the estimated value is -2.16, and the confidence is 1.
Another challenge is data imbalance caused by rare situations. When there is a lane change, a vehicle cut-in, etc., the target selection needs to be handled more cautiously. However, since the data is collected in the real world, lane changes and vehicle cut-ins are much less frequent than other typical situations, and they make up only a small percentage of the data set.

3.1.2 Data Annotation

The data set is labeled manually. The annotation is the ID of the vehicle that should be selected at each moment. The test truck that collected this data set can detect 22 surrounding vehicles simultaneously, so the annotation ranges from 0 to 22 (0 means that no vehicle is selected as the primary target at the current time). An example of the ground truth is shown in Fig. 3.3. From step 20000 to step 25000 (200 s to 250 s), the ID of the best target is 6. At approximately step 27000, the primary target changes from 6 to 13.

Figure 3.3: The ground truth of target selection of log file 30 on each time step.

3.1.3 Data Preprocessing

Normalization preprocesses the data so that the values fall into a uniform range, such as [0, 1]. It eliminates the influence of different units and scales on modeling, as well as the importance bias caused by differences in magnitude, which can speed up training and promote convergence of the algorithm.
In this project, all features are scaled to the range [0, 1] over the whole data set according to Equation 3.1, where X is a feature before normalization and X′ is the feature after normalization. Fig. 3.4 shows an example of normalized data.
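Equation 3.1 is not reproduced in this text. Assuming it is the standard min-max scaling, X′ = (X - X_min) / (X_max - X_min), computed per feature over the whole data set, the normalization could be sketched as follows. The function name `min_max_scale` and the sample values are hypothetical, and the handling of a constant feature is an assumption rather than something stated in the thesis.

```python
def min_max_scale(column):
    """Scale a list of numbers to [0, 1] using the column's own min and max."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, map it to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical velocity values for one feature over the whole data set.
velocities = [10.0, 20.0, 25.0, 30.0]
scaled = min_max_scale(velocities)
```

Because the minimum and maximum are taken over the whole data set, the same two constants would have to be reused when new data are scaled.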
At each moment, the data set contains the features of up to 22 neighboring vehicles around the ego truck, but 22 vehicles are not always present. When fewer than 22 vehicles are detected, the features of the nonexistent vehicles are padded with zeros. This padding is only used for the input data of the shared models in Section 3.2.
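A minimal sketch of this padding is given below. It assumes that the detected vehicles occupy the first slots and that the empty slots follow them; the actual slot layout in the data set may differ (for example, a slot may be tied to a vehicle ID). The names `pad_vehicles` and `MAX_VEHICLES` are illustrative.

```python
MAX_VEHICLES = 22  # maximum number of simultaneously detected vehicles (from the text)


def pad_vehicles(vehicle_features, num_features, max_vehicles=MAX_VEHICLES):
    """Return a fixed-size list of feature vectors, zero-filled for absent vehicles."""
    if len(vehicle_features) > max_vehicles:
        raise ValueError("more vehicles than available slots")
    missing = max_vehicles - len(vehicle_features)
    padding = [[0.0] * num_features for _ in range(missing)]
    return [list(v) for v in vehicle_features] + padding


# Hypothetical frame: two detected vehicles with three normalized features each.
frame = [[0.4, 0.7, 0.1], [0.9, 0.2, 0.5]]
padded = pad_vehicles(frame, num_features=3)
```

With a fixed number of slots, every time step has the same input shape, which is what a network with a fixed input layer requires.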

Figure 3.4: An example of normalized data.
Moreover, before the normalized data are converted to time-series samples, zeros are added at the beginning of each sequence so that all sequences have the same length, as shown in Fig. 3.5.

Figure 3.5: Using zero to pad before each sequence.
Time Series
For the shared-LSTM network explained in Section 3.2.1, time series data is used as input. The network makes decisions based on the timing information of each vehicle. Therefore, the original single-time-step samples need to be converted into time series samples.
The main parameters of this process are the slice window size and the sampling time. The sampling frequency of the original data is 100 Hz, which means the sampling time is 0.01 s. To reduce the difficulty of network training and the time it requires, a larger sampling time is set and each window is resampled, which shortens the sequences. For example, in Fig. 3.6 the slice window size is 5 and the sampling time is 0.02 s. In Fig. 3.6(a), the blue shaded part is the first time series sample. The window is then slid forward to obtain more samples, as the red shaded part in Fig. 3.6(b) illustrates.

Figure 3.6: An example of generating time series data. Slice window size is 5, the sampling time is 0.02 second, and the sequence length is 3.
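The windowing in Fig. 3.6 can be sketched as follows. The sketch assumes that the window advances by one raw time step between samples, which the text does not state, and it expresses the resampling as "keep every `step`-th element" (a step of 2 turns 0.01 s into 0.02 s). The name `make_sequences` is hypothetical.

```python
def make_sequences(samples, window_size, step, stride=1):
    """Slice `samples` into windows and keep every `step`-th element of each window."""
    sequences = []
    for start in range(0, len(samples) - window_size + 1, stride):
        window = samples[start:start + window_size]
        sequences.append(window[::step])
    return sequences


# Stand-in for 8 consecutive time steps recorded at 100 Hz.
raw = list(range(8))
# Window of 5 raw steps, resampled from 0.01 s to 0.02 s (every 2nd step).
seqs = make_sequences(raw, window_size=5, step=2)
```

With a window of 5 and a step of 2, each sample keeps the elements at offsets 0, 2, and 4, which gives the sequence length of 3 stated in the caption of Fig. 3.6.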
One-Hot Encoding
The data set is labeled with the ID of the vehicle that should be selected at each moment, and the IDs take values from 0 to 22. Machine learning tutorials usually suggest preparing data in a specific way. A typical instance is applying integer encoding or one-hot encoding to categorical data, mainly so that machine learning algorithms can be implemented efficiently [36].
If there is a natural order relationship between the classes, machine learning algorithms can exploit it under integer encoding. Otherwise, integer encoding is not enough. In this project, there is no such relationship between the vehicle IDs, so the integer encoding needs to be converted to one-hot encoding. For example, when the maximum integer value is 22, the one-hot representation of the integer value 5 is shown in Fig. 3.7.

Figure 3.7: One-hot encoding representation of the integer encoding value 5, and the maximum integer value is 22.
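As a small illustration of this conversion, the sketch below maps a label to a vector with one entry per possible ID. It assumes that the vector has 23 entries (IDs 0 to 22 inclusive) and that the position of the 1 equals the ID; the exact layout in Fig. 3.7 is not visible in this text. The name `one_hot` is illustrative.

```python
def one_hot(label, max_label=22):
    """Return a list of length max_label + 1 with a single 1 at index `label`."""
    if not 0 <= label <= max_label:
        raise ValueError("label out of range")
    vector = [0] * (max_label + 1)
    vector[label] = 1
    return vector


encoded = one_hot(5)
```

In this representation no ID is numerically "closer" to another, so the model cannot read an ordering into the labels.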
Concatenation of Features
For the compare-target network and the tree model explained in Section 3.2.2, the most suitable vehicle is selected as the primary target by comparing the neighboring vehicles with each other. To make this comparison possible, the data need to be concatenated and relabeled. The features of the primary target are concatenated with those of each other surrounding vehicle, and the pair is labeled 1 or 0 according to the order of the features.
Take the situation in Fig. 3.8 as an example. Vehicle-1 (within the red frame) and vehicle-2 (within the yellow frame) are surrounding vehicles, and vehicle-1 is labeled as the primary target.
In Fig. 3.9(a), the features of vehicle-1 (target vehicle) are followed by the features of vehicle-2 (non-target vehicle), and the annotation is 1. In Fig. 3.9(b), conversely, the features of vehicle-2 (non-target vehicle) are followed by the features of vehicle-1 (target vehicle), and the annotation is 0.
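The relabeling described above could look like the following sketch for a single time step. It only covers what the text states, namely pairs that contain the primary target; how the ego truck features are attached and whether pairs of two non-target vehicles are used is not specified here. The name `build_pairs` and the feature values are hypothetical.

```python
def build_pairs(features_by_id, target_id):
    """Create (concatenated_features, label) samples for one time step.

    features_by_id maps a vehicle ID to its feature list; target_id is the
    annotated primary target.
    """
    target = features_by_id[target_id]
    samples = []
    for vehicle_id, other in features_by_id.items():
        if vehicle_id == target_id:
            continue
        samples.append((target + other, 1))  # target first -> label 1
        samples.append((other + target, 0))  # target second -> label 0
    return samples


# Two vehicles with four features each, as in Fig. 3.9; vehicle 1 is the target.
features = {1: [0.1, 0.2, 0.3, 0.4], 2: [0.5, 0.6, 0.7, 0.8]}
pairs = build_pairs(features, target_id=1)
```

Each target and non-target pair yields one positive and one negative sample, so the binary labels produced this way are balanced by construction.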

3.2 Model Architecture

This section mainly introduces the theories and model structures of implemented methods. Two types of models are introduced, the shared model and the compare-target model. The compare-target model is an original method.
Figure 3.8: An example of the driving situation [5]. Vehicles within the red and yellow frames are surrounding vehicles, the blue vehicle is the ego vehicle.
Figure 3.9: An example of concatenating features for vehicle comparison. In this example, each vehicle has four features.
With shared models, all vehicles share the same model. When LSTM is combined with the shared network, time series can be used as input, which takes the time factor into account. The compare-target model can take the interactions between two vehicles into consideration. Moreover, the tree-based method XGBoost is adopted because the samples are imbalanced and tree-based models are less sensitive to data imbalance. A summary of the pros and cons is listed in Table 3.1.

3.2.1 Shared Model

Shared Model Structure
The decision at every moment depends on the states of the neighboring vehicles and the ego truck. Therefore, during training, each sample should include information about neighboring vehicles and ego truck, and neighboring vehicles should share one model.
Consider, for instance, an artificial neural network that evaluates the semantic similarity between two sentences. The model has two inputs, the two sentences to compare, and it outputs a score between 0 and 1 that represents their similarity. In this scenario, semantic similarity is a symmetric relationship, and the positions of the two input sentences can be exchanged. It is therefore unreasonable to train two separate models to handle the two inputs. Instead, both inputs should be processed by the same model [37, 38]. Similarly, in this project, the inputs (information on neighboring vehicles) are exchangeable. For this reason, it would not make sense to learn several independent models to process each neighboring vehicle.
Using the same model for every vehicle is achieved by the shared model structure shown in Fig. 3.10. The signals of the neighboring vehicles are fed into the same model, and the corresponding outputs are obtained. The loss function is based on these outputs.

Figure 3.10: Shared model Structure.
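The idea of sharing one model across exchangeable inputs can be illustrated with a deliberately small sketch. A single linear scorer stands in for the shared layers; it is not the network used in this project, and the weights, features, and function names are hypothetical. The point is only that the same parameters are applied to every vehicle, so exchanging two vehicles exchanges their outputs.

```python
import math


def shared_score(weights, bias, ego, vehicle):
    """Score one neighbor with parameters that are shared by all neighbors."""
    inputs = ego + vehicle
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def select_probabilities(weights, bias, ego, vehicles):
    """Apply the same scorer to each vehicle, then normalize with softmax."""
    scores = [shared_score(weights, bias, ego, v) for v in vehicles]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


weights = [0.5, -1.0, 2.0]  # hypothetical: 1 ego feature + 2 vehicle features
ego = [0.3]
vehicles = [[0.2, 0.9], [0.8, 0.1]]
probs = select_probabilities(weights, 0.0, ego, vehicles)
swapped = select_probabilities(weights, 0.0, ego, vehicles[::-1])
```

Because there is only one set of parameters, the number of parameters does not grow with the number of vehicle slots.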
Shared Network
The structure of the shared network proposed for target selection is shown in Fig. 3.11. The input layer of the model has two parts: the left side takes the features of the ego truck, and the right side takes the features of the neighboring vehicles. The features are described in Section 3.1.3. In the hidden layers, the features of the ego truck are passed through two fully connected layers. The features of the neighboring vehicles are passed through two shared fully connected layers with ReLU as the activation function. The output of the shared layers for each surrounding vehicle is concatenated with the output for the ego truck features. The concatenated signals are then fed into a shared fully connected layer with a sigmoid activation function. The outputs of this layer are concatenated and finally passed to a fully connected layer with a softmax activation function.

Figure 3.11: The model structure of shared network
Shared-LSTM Network
The raw data contains information over time. We therefore consider taking a time series as input and making decisions with the timing information of each vehicle. The time sequence data is introduced in Section 3.1.3. The structure of the shared-LSTM network is similar to that of the shared network, see Fig. 3.12. The difference is that the first fully connected layer after the input layer is replaced with an LSTM layer, with tanh as the activation function.

Figure 3.12: The model structure of shared-LSTM network.

3.2.2 Compare-Target Model

Compare Target
When making decisions, the state of one neighboring vehicle affects the decision about other vehicles, as the example in Fig. 3.13 shows. In Fig. 3.13(a), the probability of selecting vehicle-1 (within the red box) as the primary target is higher than the probability of selecting vehicle-2 (within the yellow box). In Fig. 3.13(b), the state (position, speed, etc.) of vehicle-1 is almost unchanged, but because the state of vehicle-2 has changed, the probability of selecting vehicle-1 as the primary target is very low. Therefore, it is necessary to consider the relationship between vehicles. As described in Section 3.1.3, the features are concatenated to learn the effects of features between vehicles. Ideally, the relationships between all vehicles should be considered together. Due to computational limitations and model complexity, we only consider the relationship between two vehicles at a time.

Figure 3.13: An example for influences between vehicles [5]. See the main text for detailed explanation.

For the target selection strategy, the features of all detected vehicles are concatenated in pairs, and the output for each pair should be 1 or 0. The primary target selected at each moment is the vehicle with the largest sum of outputs over the pairs in which its features come first. This method is expressed by Equation 3.2, where F stands for the machine learning model applied to the concatenation of the features of two vehicles, x stands for the features, and i, j are the IDs of vehicles.
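Equation 3.2 is not reproduced in this text. Reading the description as "choose the vehicle i that maximizes the sum over all j ≠ i of F applied to the concatenation of x_i and x_j", the selection could be sketched as below. The stand-in `toy_model` is hypothetical and only prefers the smaller first feature; in this project F would be the trained compare-target network or XGBoost model.

```python
def select_target(features_by_id, pair_model):
    """Return the ID whose 'first in pair' outputs sum to the largest value."""
    totals = {}
    for i, x_i in features_by_id.items():
        totals[i] = sum(
            pair_model(x_i + x_j)
            for j, x_j in features_by_id.items()
            if j != i
        )
    return max(totals, key=totals.get), totals


def toy_model(pair):
    """Hypothetical stand-in for F: prefers the vehicle with the smaller distance."""
    half = len(pair) // 2
    first, second = pair[:half], pair[half:]
    return 1.0 if first[0] < second[0] else 0.0


# One hypothetical feature per vehicle: normalized distance to the ego truck.
features = {3: [0.6], 6: [0.2], 13: [0.9]}
best, totals = select_target(features, toy_model)
```

For n detected vehicles this evaluates F on n(n - 1) ordered pairs per time step, which is the price of only comparing two vehicles at a time.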
In this project, a multi-layer artificial neural network and a tree model were used to perform the comparison. It should be noted that the data used for the models in this section is not a time sequence but a single moment. The reason is that, when comparing the results of the shared network and the shared-LSTM network in Chapter 4, using time sequences as input does not improve the result noticeably, while the difficulty of training and the time required increase.
Compare-Target Network
The structure of the compare-target network is shown in Fig. 3.14. The purpose of this model is binary classification. The input layer of this model also has two parts. The inputs on the left side are the features of the ego truck, and the inputs on the other side are the features of the neighboring vehicles. The features are described in Section 3.1.3. In the hidden layers, the features are passed through two fully connected layers. The activation function of the first layer is ReLU, and the activation function of the second layer is sigmoid.
Figure 3.14: The model structure for compare-target.
Compare Tree
In addition to using artificial neural networks, XGBoost is also adopted. The principles of XGBoost and the reasons for using this algorithm are described in Section 2.2.2.

3.2.3 Network Fine-Tuning

A common problem in machine learning applications is data imbalance, in which some classes or situations have a significantly lower number of samples [39]. Data imbalance is also a problem in this project. Although the amount of data is large, only a few samples exist for some cases. In the data set, vehicle lane change situations are particularly important, because the primary target usually needs to change in these situations. However, the number of samples for these situations is small; the number of target changes is shown in Appendix A.1. Compared to the total data set this number is particularly small, which means the imbalance is severe. This problem leads to unsatisfactory performance in rare situations (lane changes). Among the methods studied for coping with imbalance in machine learning models, random majority under-sampling [40] is adopted here.
The specific implementation process is divided into the following steps:

  1. Extract the data of target-change situations, together with the data before and after each target change within a limited time window (a target change can be inferred from a change of the ID).
  2. Sample randomly from the whole data set so that the number of samples of common situations is balanced with the number of samples of target-change situations.
  3. First train the model on the original data set and save it. Then, after adjusting the learning rate, fine-tune the saved model on the balanced data.

It should be emphasized that training first uses the whole training data set and the model is then fine-tuned with the extracted balanced data set, instead of training directly on the extracted subset. There are two main reasons:

  1. The whole data set contains many different kinds of situations, and training should cover as many of them as possible. However, the data are not labeled by situation.
  2. The number of samples in the extracted subset is too small, and it is unreasonable to train directly on such a small set.
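The first two steps of the procedure above, extracting the samples around target changes and drawing an equally sized random set of common samples, could be sketched as follows. The window length, the random seed, and the choice to draw the common samples only from outside the extracted windows are assumptions made for illustration; the function names are hypothetical.

```python
import random


def target_change_indices(labels, half_window):
    """Indices within half_window steps of any change in the annotated ID."""
    selected = set()
    for t in range(1, len(labels)):
        if labels[t] != labels[t - 1]:
            lo = max(0, t - half_window)
            hi = min(len(labels), t + half_window + 1)
            selected.update(range(lo, hi))
    return sorted(selected)


def balanced_indices(labels, half_window, seed=0):
    """Target-change indices plus an equal number of randomly drawn other indices."""
    rare = target_change_indices(labels, half_window)
    rare_set = set(rare)
    common = [t for t in range(len(labels)) if t not in rare_set]
    rng = random.Random(seed)
    sampled = rng.sample(common, min(len(rare), len(common)))
    return rare, sorted(sampled)


labels = [6] * 10 + [13] * 10  # one target change at step 10
rare, common = balanced_indices(labels, half_window=2)
```

The resulting index sets would select the balanced subset used for fine-tuning, after the model has first been trained on the full training data.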

3.3 Exceptional Situations

While driving, it is not always necessary to select a vehicle as the primary target. When no suitable vehicle exists, there is no need to use adaptive control to keep a time gap between the ego truck and other vehicles. Therefore, judging whether or not to choose a target is an important step.
For the shared network and the shared-LSTM network, a threshold is set to determine whether a target should be selected. For these two model structures, shown in Fig. 3.11 and Fig. 3.12, the outputs of the shared layers (purple blocks) represent the probability of selecting each vehicle as the primary target. If this value is smaller than the threshold, the corresponding vehicle should not be selected.
For the compare-target model, the threshold cannot be applied directly, since the output value of the compare-target model indicates which of two vehicles is more suitable as the primary target. We therefore reuse the shared layers of the shared network. As indicated in Fig. 3.15, the most suitable target is first determined by the compare-target model. The features of this target and of the ego truck are then fed into the shared network, and the output value is calculated. If the output value is smaller than the threshold, no target is selected at this time.

Figure 3.15: The strategy to determine that no target should be selected.
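The threshold rule for the shared models can be illustrated with a short sketch. The threshold value of 0.5 and the probabilities are hypothetical; the thesis does not state the value that was used. The label 0 for "no target" follows the annotation convention of Section 3.1.2.

```python
NO_TARGET = 0  # annotation value meaning "no primary target" (Section 3.1.2)


def apply_threshold(probabilities_by_id, threshold):
    """Pick the most probable vehicle, or NO_TARGET if it falls below threshold."""
    if not probabilities_by_id:
        return NO_TARGET
    best_id = max(probabilities_by_id, key=probabilities_by_id.get)
    if probabilities_by_id[best_id] < threshold:
        return NO_TARGET
    return best_id


confident = apply_threshold({6: 0.91, 13: 0.40}, threshold=0.5)
uncertain = apply_threshold({6: 0.31, 13: 0.22}, threshold=0.5)
```

For the compare-target model the same check would be applied to a single vehicle, namely the one that won the pairwise comparison, using the output of the reused shared layers.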

3.4 Evaluation

In this project, the evaluation consists of two parts: the accuracy of the selection and the cost of selection.
Accuracy is the first evaluation criterion. It is defined in Equation 3.3:
\[ \text{Accuracy} = \frac{N_{\text{correct}}}{N_{\text{total}}} \qquad (3.3) \]
Here $N_{\text{correct}}$ is the number of correct selections and $N_{\text{total}}$ is the total number of time steps. A selection is correct when the target selected by the machine learning model is the same as the human label.
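Reading Equation 3.3 as the ratio of correct selections to the total number of time steps, the accuracy could be computed as in the sketch below. The function name and the example sequences are hypothetical.

```python
def selection_accuracy(predicted, labeled):
    """Fraction of time steps where the predicted ID equals the human label."""
    if len(predicted) != len(labeled):
        raise ValueError("sequences must have the same length")
    if not labeled:
        raise ValueError("no time steps to evaluate")
    correct = sum(p == y for p, y in zip(predicted, labeled))
    return correct / len(labeled)


# Four time steps; the third prediction differs from the label.
acc = selection_accuracy([6, 6, 13, 0], [6, 6, 6, 0])
```

Note that a time step labeled 0 (no target) counts as correct only if the model also selects no target.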
Besides accuracy, a collision avoidance acceleration (CAA) cost and a distance cost were also adopted as evaluation criteria for target selection performance. These costs are used because, when the selected target is not the correct one, the impact depends on which vehicle was chosen instead. The total cost is 0 when the selected vehicle is the same as the labeled vehicle. If the impact of selecting a vehicle is similar to that of selecting the correct vehicle, the cost is low. However, if the chosen vehicle leads to a bad result (one that affects safety), the cost is high. This cost is based on Scania’s algorithm, so its details are not described in this report. It is only used in Chapter 4 to evaluate the performance of the machine learning models.


This project’s objective is to explore machine learning methods for adaptive cruise control target selection. The methods introduced in Chapter 3 (shared network, shared-LSTM network, compare-target network, and XGBoost model) are implemented, analyzed, and compared in Section 4.1. A case study is presented in Section 4.2.

4.1 Comparison and Analysis

Every model in the experiment used the same data set. The final evaluations were based on the same test set and show the best performance achieved by each model. The experimental settings and details are given in Appendix B.1 and Appendix B.2. The accuracy and cost of the models introduced in Section 3.2 are listed in Table 4.1.
In the time series model, LSTM is used to extract important information from the time series. As Table 4.1 indicates, the best accuracy of the shared network is 90.16%, and the best accuracy of the shared-LSTM network is 91.94%. Since the decision mainly depends on the current moment, the time series adds little information, so the improvement is modest. In addition, because a recurrent neural network is used, training is more difficult and consumes more resources and time. For these two reasons, the compare-target model uses only data at a single moment as input instead of time sequences.
Comparing the shared network with the compare-target model shows that the compare-target model performs better. The accuracy of both the compare-target network and the compare-target XGBoost exceeds 94%. A likely reason is that the compare-target methods take the interrelationships between the vehicles into consideration. At any moment, the decision on a vehicle is not based on the information of this vehicle alone, because the vehicles influence one another.
The result of the compare-target network (fine-tuned) is obtained by fine-tuning the compare-target network with the method described in Section 3.2.3. The result improves slightly after fine-tuning. Fig. 4.1 shows an example of the performance of the compare-target network on log file 78. From step 5000 to step 8000, the selection is not stable and smooth enough. Between step 10000 and step 10500, a vehicle tries to cut in (see Fig. 4.2), but the correct vehicle is not selected in time. One of the factors behind this behavior is data imbalance and noise. After fine-tuning (see Fig. 4.3), the performance improves. In particular, between step 6000 and step 8000 the selection is smoother, and the selection in the cut-in situation is also better.

4.2 Case Study

In this section, the performance of the algorithm is analyzed for specific situations in the test set and some target selection results are displayed. The results of the algorithms in this section are based on the compare-target XGBoost.
According to the results of the previous section, the machine learning based approaches achieve a high overall accuracy. That is, the selected target is usually the same as the target selected by the human driver. For example, in Fig. 4.4, the target selected by the algorithm (the vehicle in the yellow rectangle) is the same as the choice of human drivers. However, in some cases the results of machine learning differ from what we expect. As shown in Fig. 4.5, another vehicle is a better choice for adaptive cruise control, but the machine learning algorithm still chooses the original vehicle as the primary target. One possible cause is error from target detection and sensor fusion. The figure shows that the estimate of the lane placement differs from the real value (which should be between -1 and -2). The estimates of the yaw rate, trajectory, etc. of this vehicle are also inaccurate. This is just one example, and the parameters affect one another.
Figure 4.1: The performance of target selection of compare-target network on log file 78.
Figure 4.2: An example of cut-in situation on log file 78. The vehicle in the orange box tries to cut-in.
Figure 4.3: The performance of target selection of compare-target network (fine-tuned) on log file 78.
Figure 4.4: An example of target selection, explained in the main text (example-1).
By analyzing the state at the next moment (see Fig. 4.6), we found that the estimated lane placement of the white vehicle is between -1 and -2 (-1.80), and this vehicle is selected as the primary target. From the results of these three moments, one conclusion that can be initially drawn is that the model learns the impact of the lane placement on the target selection. However, due to the existence of errors and noise, the most suitable choice cannot be made at all times.
Except for these rare situations, the model performs well. The performance of the algorithm in two real highway environments is shown in Fig. 4.7 and Fig. 4.8.
Figure 4.5: An example of target selection, explained in the main text (example-2).

Figure 4.6: An example of target selection, explained in the main text (example-3).

Figure 4.7: An example of the performance of the compare-target XGBoost on real highways (log file 32). The red lines represent the lane. The green cuboids represent the detected vehicles. The yellow rectangle represents the target selected by the machine learning algorithm.

Figure 4.8: An example of the performance of the compare-target XGBoost on real highways (log file 53). The red lines represent the lane. The green cuboids represent the detected vehicles. The yellow rectangle represents the target selected by the machine learning algorithm.


This chapter discusses the results presented in Chapter 4. Some problems are also analyzed, such as data imbalance and the case in which no target should be selected, together with possible directions for future work.
Data imbalance is a very severe problem. Even after fine-tuning with a balanced data set, the performance in some important situations is still not ideal. Some possible reasons can be found by analyzing the existing data. There are 26 log files, each lasting about 4-5 minutes, for a total duration of more than 100 minutes. For machine learning, however, this amount of data may still be insufficient. Moreover, the distribution of situations is imbalanced, which is unfavorable for machine learning. Appendix A.1 shows that there are only 137 target changes in the data set (some of which are not vehicle cut-in or cut-out situations). Thus, the amount of data for these rare cases is insufficient. When time sequences are used as input, the network is expected to detect cut-in situations, but due to the lack of data the improvement is not obvious. To address the imbalance, more data need to be collected, especially of vehicle cut-in situations. Data augmentation could also be adopted to increase the amount of training data.
Since vehicle cut-in and lane change situations are essential, these situations could be labeled explicitly. In the future, decisions could be made by predicting the lane change behavior of a vehicle.
While driving, a suitable neighboring vehicle cannot always be found as the primary target. In Section 3.3, a method is proposed for the shared models and the compare-target models to decide whether or not to choose any target. However, this method requires a threshold hyper-parameter, which is difficult to set, and the chosen value is not necessarily suitable. A possible solution is to label the vehicles that can be selected as targets and then train a model to determine whether a target can be chosen.
The best result is obtained by the compare-target model, which considers the relationship between any two vehicles. However, the relationships between all vehicles are not considered at the same time, because doing so would make the dimension of the data extremely large and increase the difficulty of model training. The data need to be presented in a suitable format. A possible way to consider all vehicles together is to convert the original features to images. As shown in Fig. 5.1, different colors and shapes are used to represent vehicles, trucks, lanes, and paths. In this way, convolutional neural networks can be used to extract features and consider the spatial relationship between vehicles. However, it is difficult to convert the raw data to images without losing information while still extracting the correct features. The convolutional neural network approach finally reached an accuracy of 86.34%. Because the training requires much time and the results are not satisfactory, it is not presented in the main work. However, this method is worth exploring.
The objective of this thesis is to explore machine learning based methods for adaptive cruise control target selection. It is not necessary to use machine learning in all scenarios, even though machine learning has achieved excellent results in many applications. Making decisions by modeling the motion of surrounding vehicles and the road conditions could make the results more reliable and controllable.
Figure 5.1: Convert the raw data to images. Using a white rectangle to represent the ego vehicle, green rectangles to represent the neighbor vehicles, white lines to represent lanes and paths. Gaussian filters are applied to represent confidence.


The objective is to develop a machine learning based solution for adaptive cruise control target selection that selects a primary target similar to the choice of a human driver. Data collected on roads are fused for target selection. To apply machine learning algorithms to primary target selection for the adaptive cruise control system, the data is labeled and preprocessed. In this thesis, two main methods are developed. The shared network shares the same hidden layers across neighboring vehicles. Based on the shared network, LSTM units replace the first fully connected layer in order to train on time sequence data. The performance improves when the time series is considered, but the difference is not significant since the decisions mainly depend on the current condition. We proposed a compare-target method to consider the relationship between vehicles. The compare-target network performs better than the shared models. Because the data set is imbalanced and noisy, the tree-based method XGBoost is also adopted. A fine-tuning strategy is adopted to reduce the impact of data imbalance. Comparing the four machine learning methods with respect to test accuracy and cost, the compare-target XGBoost model performed best, with an accuracy of 94.85% on the test set.




A.1 Log Data Information


A.2 Signal Description



B.1 Hardware and Software Settings

Experiments were run on an HP ZBook Studio G5 Mobile Workstation with an Intel® Core™ i7-8850H CPU (2.60 GHz), a Quadro P1000 GPU (4 GB), and 16 GB of RAM. MATLAB and Simulink were used to simulate and unpack the data set. pandas, a Python-based library, was used for data access. The artificial neural networks and tree models were implemented with the Python-based frameworks and libraries TensorFlow, Keras, xgboost, deepctr, and scikit-learn. The details of the platforms and libraries used in this project are shown in Table B.1.

B.2 Experiment Details

• Data Split
15% of the data is held out as testing data. The remaining data is divided into training data and validation data. Specifically, 4 log files are used for testing, and the other 22 log files are used for training and validation.
• K-fold Cross-validation
K-fold cross-validation is applied. The original data samples are randomly divided into k equal-sized subsamples. Of these k subsamples, a single subsample is used as the validation data, and the remaining k-1 subsamples are used as training data. The cross-validation process is then repeated k times, so that each subsample serves as the validation data exactly once [41]. Here k is 10.
• Optimization Algorithm and Hyper-parameters
The Adam optimization algorithm [42] was adopted in all neural network methods. The networks are trained for 20 epochs with a batch size of 100 and an initial learning rate of 0.001.
• Initialization
The weights of the neural networks are initialized with He normal initialization.
• Regularization
An L2 regularization term is applied in the neural networks as weight decay to reduce over-fitting. The regularization coefficient is 0.00001.
• Learning Rate Decay
The learning rate is reduced when the metric has stopped improving for 3 epochs. The learning rate decay factor is 0.1, so that lr_new = factor × lr.
• Early Stopping
Training is stopped when the metric has not improved for 10 epochs.
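
The cross-validation, learning rate decay, and early stopping settings listed above can be illustrated with a small, framework-free sketch. It is not the thesis implementation: the function names `kfold_indices` and `plateau_schedule` are hypothetical, the monitored metric is assumed to be a validation loss (the text only says "the metric"), and the counter logic is a simplified approximation of plateau-based callbacks such as those available in Keras.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Randomly partition range(n_samples) into k near-equal folds and
    yield one (train_idx, val_idx) pair per fold."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # fold sizes differ by at most 1
    for i in range(k):
        val = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, val


def plateau_schedule(val_losses, lr=0.001, factor=0.1,
                     decay_patience=3, stop_patience=10):
    """Replay per-epoch validation losses (assumed metric, lower is better).

    Returns (learning rate used in each epoch, number of epochs run).
    """
    best = float("inf")
    since_best = 0   # epochs without improvement, for early stopping
    since_decay = 0  # epochs without improvement since the last decay
    lrs = []
    for loss in val_losses:
        lrs.append(lr)
        if loss < best:
            best = loss
            since_best = 0
            since_decay = 0
        else:
            since_best += 1
            since_decay += 1
        if since_decay >= decay_patience:
            lr = factor * lr  # lr_new = factor × lr
            since_decay = 0
        if since_best >= stop_patience:
            break  # early stopping
    return lrs, len(lrs)


if __name__ == "__main__":
    # 10-fold split over 25 hypothetical samples
    for train, val in kfold_indices(25, k=10):
        print(len(train), len(val))
    # A loss curve that plateaus after the second epoch
    print(plateau_schedule([1.0, 0.9, 0.95, 0.95, 0.95, 0.95]))
```

With the values from this section (factor 0.1, patience 3 for decay, patience 10 for stopping, at most 20 epochs), a run whose validation loss never improves after the first epoch would see its learning rate cut after every third stagnant epoch and would stop after the tenth.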







