Ultra-Fast-Lane-Detection-v2 Explained

Latest recommended article published on 2024-08-21 08:24:53

zhu_linx

Article link: https://blog.csdn.net/zhu_linx/article/details/127229551

v2 vs. v1

v1 uses row anchors, while v2 uses hybrid anchors. The reason is given in the paper's figure caption:
(a) shows the definition of lanes by the CULane dataset.
(b) is the lane-wise accuracy with the row anchor system.
(c) illustrates the lane-wise accuracy with the column anchor system.
We can see that for ego lanes, the row anchor system gains better performance, while the column anchor system gains better performance for side lanes.
See Section 3.1 of the paper for a more detailed explanation.

[Figure: (a) lane definitions in CULane, (b) lane-wise accuracy with row anchors, (c) lane-wise accuracy with column anchors; image not available]

Overall design of the v2 algorithm

Illustration of the network architecture. The input image is first sent to a backbone network to get the deep feature. Then the deep feature is flattened and fed into a classifier, which has two output branches. The first localization branch is to learn the coordinates on the hybrid anchors with classification-based representation. The second existence branch is to predict the existence of each coordinate on the hybrid anchors. After obtaining the localization output, we use expectation instead of argmax to get the coordinates of lanes.
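To make the "expectation instead of argmax" step concrete, here is a minimal sketch in plain Python. It assumes the localization output for one anchor is a list of per-cell scores. The function names are hypothetical and the indices are 0-based; this is an illustration of the idea, not the repository's inference code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax_location(logits):
    """Index of the highest-scoring grid cell (integer resolution only)."""
    return max(range(len(logits)), key=lambda k: logits[k])

def expected_location(logits):
    """Probability-weighted mean of the cell indices (can be fractional)."""
    probs = softmax(logits)
    return sum(k * p for k, p in enumerate(probs))

# Two neighbouring cells score equally, so the lane likely lies between them.
logits = [0.0, 5.0, 5.0, 0.0]
print(argmax_location(logits))    # 1: the first of the two tied cells
print(expected_location(logits))  # close to 1.5: between cells 1 and 2
```

The argmax can only return a whole cell index, while the expectation can fall between cells, which is why it gives finer-grained coordinates for the same number of classes.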

Anchor-driven network design

Notation table

Label generation

If the lane has no intersection between certain anchors, the coordinates will be set to -1. Suppose the number of lanes assigned to row anchors is N^r_lane and the one for column anchors is N^c_lane. The lanes in an image can be represented by a fixed-size target T where every element is either the coordinate of lane or -1, and its length is N_row × N^r_lane + N_col × N^c_lane. T can be divided into two parts T^r and T^c, which correspond to the parts on row and column anchors, and the sizes are N_row × N^r_lane and N_col × N^c_lane respectively.
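The fixed-size target can be sketched as follows. This assumes each lane is given as a mapping from anchor index to coordinate, which is a hypothetical input format chosen for illustration. `build_target` is not a function from the repository, and the toy sizes are made up.

```python
NO_LANE = -1.0

def build_target(lanes, num_anchors):
    """Return a num_anchors x num_lanes grid of coordinates, -1 where absent.

    `lanes` holds one dict per lane, mapping an anchor index to the lane's
    coordinate on that anchor.
    """
    return [
        [lane.get(anchor, NO_LANE) for lane in lanes]
        for anchor in range(num_anchors)
    ]

# Toy sizes: 4 row anchors with 2 lanes, 3 column anchors with 1 lane.
row_lanes = [{0: 0.30, 1: 0.32, 2: 0.35}, {1: 0.70, 2: 0.68, 3: 0.66}]
col_lanes = [{0: 0.50, 2: 0.55}]

t_row = build_target(row_lanes, num_anchors=4)  # N_row x N^r_lane
t_col = build_target(col_lanes, num_anchors=3)  # N_col x N^c_lane

# Total length: N_row * N^r_lane + N_col * N^c_lane = 4*2 + 3*1
total = sum(len(r) for r in t_row) + sum(len(r) for r in t_col)
print(total)  # 11
```

Whatever the number of visible lane points, the target always has the same size, which is what lets a fixed-width classifier head learn it.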
With the help of lane representation with hybrid anchor, our goal of designing networks is to learn the fixed-size targets T^r and T^c with classification. To learn T^r and T^c with classification, we map different coordinates in T^r and T^c to distinct classes. Suppose T^r and T^c are normalized (the elements of T^r and T^c range from 0 to 1 or equal -1, i.e., the "no lane" case), and the numbers of classes are N^r_dim and N^c_dim. The mapping can be written as:
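The equation image that followed this sentence is missing from this copy. Based on the definitions in the text (normalized coordinates, N^r_dim and N^c_dim classes, and the floor operation named in the accompanying note), a consistent reconstruction is the following, applied to every entry (i, j) of T^r and (m, n) of T^c:

```latex
T^{r}_{cls_{i,j}} = \left\lfloor T^{r}_{i,j} \, N^{r}_{dim} \right\rfloor,
\qquad
T^{c}_{cls_{m,n}} = \left\lfloor T^{c}_{m,n} \, N^{c}_{dim} \right\rfloor
```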

Note: [in which T^r_cls and T^c_cls are the mapped class labels of the coordinates, ⌊·⌋ is the floor operation (that is, it converts a normalized coordinate into one of the N_dim class indices), and T^r_{cls_i,j} is the element in the i-th row, j-th column of T^r_cls. In this way, we could convert the learning of coordinate on the hybrid anchor to two classification problems with dimensions of N^r_dim and N^c_dim, respectively. For the no lane case, i.e., T^r_i,j or T^c_m,n equals -1, we use an additional two-way classification to indicate:]
The actual processing code can be found in the inference_culane_tusimple function under utils/common.
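The two label mappings can be sketched in a few lines of plain Python. Several things here are assumptions of this sketch and are not confirmed by the source: the helper names, the encoding of existence as 1 for "lane present" and 0 for "no lane", and the clamp that keeps a coordinate of exactly 1.0 inside the last class. This is not the repository's code.

```python
import math

NO_LANE = -1.0

def class_label(coord, num_classes):
    """Map a normalized coordinate in [0, 1] to a class index via floor."""
    if coord == NO_LANE:
        return -1  # no class; the existence label carries the "no lane" case
    # Clamp so coord == 1.0 stays in the last class (an assumption of this sketch).
    return min(math.floor(coord * num_classes), num_classes - 1)

def existence_label(coord):
    """1 if the lane crosses this anchor, 0 for the 'no lane' case (assumed encoding)."""
    return 0 if coord == NO_LANE else 1

coords = [0.0, 0.254, 0.999, 1.0, NO_LANE]
print([class_label(c, 100) for c in coords])  # [0, 25, 99, 99, -1]
print([existence_label(c) for c in coords])   # [1, 1, 1, 1, 0]
```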

Note: [in which T^r_ext is the class label of the coordinates' existence, and T^r_{ext_i,j} is the element in the i-th row, j-th column of T^r_ext. The existence target for the column anchors, T^c_ext, is similar:]
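The equation images around this passage are missing. The text only states that existence is a two-way classification, so the encoding below (1 for a valid coordinate, 0 for the -1 case) is an assumption. Under it, the column-anchor existence target would read:

```latex
T^{c}_{ext_{m,n}} =
\begin{cases}
1, & \text{if } T^{c}_{m,n} \neq -1 \\
0, & \text{otherwise}
\end{cases}
```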

Note: [With the above derivation, the whole network is to learn T^r_cls, T^c_cls, T^r_ext and T^c_ext with two branches, which are the localization and existence branches. Suppose the deep feature of an input image is X, the network can be written as:]
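The equation image is missing here. From the symbols named in the surrounding notes (branches P and E, classifier f, and the flatten operation applied to the deep feature X), a consistent reconstruction is:

```latex
P, E = f\left(\mathrm{flatten}(X)\right)
```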

Note: [in which P and E are the localization and existence branches, f is the classifier, and flatten(·) is the flatten operation. The outputs of P and E are all composed of two parts (P^r, P^c, E^r and E^c), which correspond to the row and column anchors respectively. The sizes of P^r and P^c are N^r_lane × N_row × N^r_dim and N^c_lane × N_col × N^c_dim respectively, in which N^r_dim and N^c_dim are the mapped classification dimensions for row and column anchors. The sizes of E^r and E^c are N^r_lane × N_row × 2 and N^c_lane × N_col × 2 respectively.]
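As a quick check of these sizes, the sketch below computes the four output shapes and their element counts. `branch_shapes` is a hypothetical helper, and the numbers are made up for illustration; they are not the repository's configuration.

```python
from math import prod

def branch_shapes(n_row, n_col, n_lane_r, n_lane_c, n_dim_r, n_dim_c):
    """Shapes of the localization (P) and existence (E) outputs."""
    return {
        "P_r": (n_lane_r, n_row, n_dim_r),
        "P_c": (n_lane_c, n_col, n_dim_c),
        "E_r": (n_lane_r, n_row, 2),
        "E_c": (n_lane_c, n_col, 2),
    }

# Hypothetical sizes chosen only for illustration.
shapes = branch_shapes(n_row=18, n_col=10, n_lane_r=2, n_lane_c=2,
                       n_dim_r=100, n_dim_c=50)
for name, shape in shapes.items():
    print(name, shape, prod(shape))

# Width of a single flat output vector that holds all four parts.
print(sum(prod(s) for s in shapes.values()))  # 4712
```

The localization parts dominate the output width because each anchor carries N_dim scores, while each existence entry carries only two.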

Note: [we directly flatten the deep features from the backbone and feed them to the classifier. In comparison, conventional classification networks [54], [55], [56], [57] use global average pooling (GAP). The reason why we use flatten instead of GAP is that we find the spatial information is crucial for the classification-based lane detection network. Using GAP would eliminate the spatial information and result in poor performance.]
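A toy example shows why pooling loses what the classifier needs. The two single-channel feature maps below are made up, and the helper names are hypothetical: the same activation sits at different positions, yet GAP gives the same value for both, while the flattened vectors differ.

```python
def global_average_pool(feature_map):
    """Average all spatial positions of one channel into a single number."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)

def flatten(feature_map):
    """Keep every spatial position, in row-major order."""
    return [v for row in feature_map for v in row]

# The same activation at two different positions of a 2x3 single-channel map.
lane_on_left = [[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0]]
lane_on_right = [[0.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]]

print(global_average_pool(lane_on_left) == global_average_pool(lane_on_right))  # True
print(flatten(lane_on_left) == flatten(lane_on_right))                          # False
```

After GAP the classifier cannot tell a lane on the left from one on the right, whereas the flattened feature still encodes where the activation was.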

Ordinal classification loss

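As the formulas above show, an essential property is that the classes in this classification network have an ordinal relationship. Unlike in conventional classification, adjacent classes are defined to be closely related in order. To make better use of this ordinal prior, the authors propose to use a base classification loss and an expectation loss.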

The base classification loss is defined as follows:
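The equation image is missing at this point. The accompanying note describes a cross-entropy between each prediction and its class label, for row and column anchors alike. A reconstruction consistent with that description is given below; the one-hot encoding of the label and the plain, unnormalized sum over all lanes and anchors are assumptions.

```latex
L_{cls} = \sum_{i,j} L_{CE}\!\left(P^{r}_{i,j},\; \mathrm{onehot}\!\left(T^{r}_{cls_{i,j}}\right)\right)
        + \sum_{m,n} L_{CE}\!\left(P^{c}_{m,n},\; \mathrm{onehot}\!\left(T^{c}_{cls_{m,n}}\right)\right)
```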

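Note: [In the formula above, L_CE(·) is the cross-entropy loss, P^r_i,j is the localization prediction for the i-th lane and j-th row anchor, and T^r_{cls_i,j} is the corresponding ground-truth label. The column-anchor loss is defined in the same way as the row-anchor loss.]
The labels here contain -1, which a conventional cross-entropy cannot handle directly. For the actual loss computation in the source code, see the soft_nll function in utils/loss.py.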
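One common way to handle the -1 entries is to leave them out of the loss altogether. The sketch below shows that idea in plain Python. It is a simplified stand-in written for this article, not a reproduction of the repository's soft_nll, and the function names are hypothetical.

```python
import math

IGNORE = -1

def cross_entropy(logits, label):
    """Negative log-softmax probability of the labelled class."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_norm - logits[label]

def masked_mean_cross_entropy(batch_logits, labels):
    """Average cross-entropy over entries whose label is not -1."""
    losses = [
        cross_entropy(logits, label)
        for logits, label in zip(batch_logits, labels)
        if label != IGNORE
    ]
    return sum(losses) / len(losses) if losses else 0.0

batch_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 9.0, 0.0]]
labels = [0, IGNORE, 1]  # the middle anchor has no lane
print(masked_mean_cross_entropy(batch_logits, labels))
```

Without the mask, indexing a Python list with -1 would silently select the last class, so the "no lane" anchors would be trained toward a wrong target instead of being skipped.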

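Because the classes are ordered, the expectation of the prediction can be viewed as an average voting result. For convenience, we denote the expectation as follows: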
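The equation image is missing here as well. Using the standard definition of an expectation over softmax probabilities, the row-anchor case can be written as below; the symbol names Prob and Exp are assumptions, and the column-anchor case is analogous with N^c_dim classes.

```latex
Prob^{r}_{i,j} = \mathrm{softmax}\!\left(P^{r}_{i,j}\right),
\qquad
Exp^{r}_{i,j} = \sum_{k=1}^{N^{r}_{dim}} k \cdot Prob^{r}_{i,j}[k]
```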