


(a) shows how the CULane dataset defines lanes.
(b) is the lane-wise accuracy with the row anchor system.
(c) illustrates the lane-wise accuracy with the column anchor system.
For ego lanes, the row anchor system performs better, while for side lanes the column anchor system performs better.

Illustration of the network architecture. The input image is first sent to a backbone network to get the deep feature. Then the deep feature
is flattened and fed into a classifier, which has two output branches. The first localization branch is to learn the coordinates on the hybrid anchors
with classification-based representation. The second existence branch is to predict the existence of each coordinate on the hybrid anchors. After
obtaining the localization output, we use expectation instead of argmax to get the coordinates of lanes.
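
The difference between taking the argmax and taking the expectation of the localization output can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation: the helper names (`softmax`, `expected_class`, `argmax_class`) and the zero-based class indices are assumptions made for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_class(logits):
    """Probability-weighted mean of the class indices (a continuous location)."""
    probs = softmax(logits)
    return sum(k * p for k, p in enumerate(probs))

def argmax_class(logits):
    """Index of the largest logit (a discrete location)."""
    return max(range(len(logits)), key=lambda k: logits[k])

# Two neighboring classes are equally likely for one anchor.
logits = [0.0, 2.0, 2.0, 0.0]
discrete = argmax_class(logits)      # snaps to one of the two classes
continuous = expected_class(logits)  # falls between the two classes
```

When two neighboring classes are equally strong, the expectation lands between them, whereas argmax has to pick one of the two bins.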



If a lane has no intersection with a certain anchor, the coordinate is set to -1. Suppose the number of lanes assigned to row anchors is Nrlane and the number assigned to column anchors is Nclane. The lanes in an image can then be represented by a fixed-size target T in which every element is either the coordinate of a lane or -1, and whose length is Nrow × Nrlane + Ncol × Nclane. T can be divided into two parts, Tr and Tc, which correspond to the row and column anchors and have sizes Nrow × Nrlane and Ncol × Nclane respectively.
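
The fixed-size target described above can be sketched like this. The input format (one dict per lane, mapping an anchor index to a normalized coordinate), the function name `build_target`, and the tiny sizes are all assumptions chosen for illustration.

```python
NO_LANE = -1  # marker used in the text for "lane does not cross this anchor"

def build_target(lanes, num_anchors):
    """Build a num_anchors x num_lanes grid of coordinates.

    `lanes` is a hypothetical input format: one dict per lane that maps an
    anchor index to the normalized coordinate where the lane crosses it.
    """
    return [[lane.get(a, NO_LANE) for lane in lanes] for a in range(num_anchors)]

# Illustrative sizes only: 3 row anchors with 2 lanes, 2 column anchors with 1 lane.
row_lanes = [{0: 0.40, 1: 0.45, 2: 0.50}, {1: 0.60, 2: 0.70}]
col_lanes = [{1: 0.30}]

t_r = build_target(row_lanes, num_anchors=3)  # size Nrow x Nrlane
t_c = build_target(col_lanes, num_anchors=2)  # size Ncol x Nclane

# Total length of T is Nrow * Nrlane + Ncol * Nclane.
total_length = sum(len(row) for row in t_r) + sum(len(row) for row in t_c)
```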
With the help of lane representation with hybrid anchor, our goal of designing networks is to learn the fixed-size targets Tr and Tc with classification. To learn Tr and Tc with classification, we map different coordinates in Tr and Tc to distinct classes. Suppose Tr and Tc are normalized (the elements of Tr and Tc range from 0 to 1 or equal -1, i.e., the “no lane” case), and the numbers of classes are Nrdim and Ncdim. The mapping can be written as:

Note: [in which Trcls and Tccls are the mapped class labels of the coordinates, [·] is the floor operation, and Trcls_i,j is the element in the i-th row, j-th column of Trcls. In this way, learning the coordinates on the hybrid anchors is converted into two classification problems with Nrdim and Ncdim classes, respectively. For the no-lane case, i.e., when Tr_i,j or Tc_m,n equals -1, an additional two-way classification is used to indicate it:]
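
A small sketch of the coordinate-to-class mapping, assuming it is the floor of the normalized coordinate multiplied by the number of classes, as the description of the floor operation suggests. The function name `coord_to_class`, the clamping of the upper edge, and the value of `n_rdim` are assumptions for the example.

```python
import math

def coord_to_class(t, n_dim):
    """Map a normalized coordinate to a class index with the floor operation."""
    if t == -1:
        return None  # "no lane": left to the existence classification
    # Assumption: clamp so that t == 1.0 still yields a valid class index.
    return min(math.floor(t * n_dim), n_dim - 1)

n_rdim = 100  # illustrative number of classes, not taken from the text
classes = [coord_to_class(t, n_rdim) for t in (0.0, 0.5, 1.0, -1)]
```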

Note: [in which Trext is the class label of the coordinates' existence, and Trext_i,j is the element in the i-th row, j-th column of Trext. The existence target for column anchors, Tcext, is defined similarly:]
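
The existence target can be derived from the coordinate target in one pass. The sketch below assumes label 1 means "a lane coordinate exists" and label 0 means "no lane"; the function name `existence_labels` is hypothetical.

```python
def existence_labels(target):
    """Two-way existence label per element: 0 where the coordinate is -1, else 1."""
    return [[0 if t == -1 else 1 for t in row] for row in target]

# Illustrative Tr with 2 row anchors and 2 lanes; -1 marks the no-lane case.
t_r = [[0.40, -1], [0.45, 0.60]]
t_r_ext = existence_labels(t_r)
```

The same function applies unchanged to Tc to obtain Tcext.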

Note: [With the above derivation, the network learns Trcls, Tccls, Trext and Tcext with two branches, the localization branch and the existence branch. Suppose the deep feature of an input image is X; the network can then be written as:]

Note: [in which P and E are the localization and existence branches, f is the classifier, and flatten(·) is the flatten operation. The outputs of P and E each consist of two parts (Pr and Pc, Er and Ec), which correspond to the row and column anchors respectively. The sizes of Pr and Pc are Nrlane × Nrow × Nrdim and Nclane × Ncol × Ncdim respectively, in which Nrdim and Ncdim are the mapped classification dimensions for row and column anchors. The sizes of Er and Ec are Nrlane × Nrow × 2 and Nclane × Ncol × 2 respectively.]
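
To make the output sizes concrete, the following sketch computes the four shapes from the six size parameters. The numeric values are purely illustrative, and treating the classifier output as one flat vector whose width is the sum of the four parts is an assumption of this example rather than something stated in the text.

```python
from math import prod

def output_shapes(n_rlane, n_row, n_rdim, n_clane, n_col, n_cdim):
    """Shapes of the localization (Pr, Pc) and existence (Er, Ec) outputs."""
    return {
        "Pr": (n_rlane, n_row, n_rdim),
        "Pc": (n_clane, n_col, n_cdim),
        "Er": (n_rlane, n_row, 2),
        "Ec": (n_clane, n_col, 2),
    }

# Illustrative values only, not a configuration from the paper or the source code.
shapes = output_shapes(n_rlane=2, n_row=18, n_rdim=100,
                       n_clane=2, n_col=40, n_cdim=50)

# Assumption: one flat classifier output that is later split into the four parts.
classifier_width = sum(prod(shape) for shape in shapes.values())
```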

Note: [We directly flatten the deep features from the backbone and feed them to the classifier. In comparison, conventional classification networks [54], [55], [56], [57] use global average pooling (GAP). We use flatten instead of GAP because spatial information turns out to be crucial for the classification-based lane detection network. Using GAP would eliminate the spatial information and result in poor performance.]
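
Why GAP loses spatial information can be seen on a toy single-channel feature map. This is a hedged pure-Python sketch with hypothetical helper names, not the network code: two maps with the same activation at different positions pool to the same value but flatten to different vectors.

```python
def flatten(feature_map):
    """Row-major flattening: every position keeps its own slot in the vector."""
    return [v for row in feature_map for v in row]

def global_average_pool(feature_map):
    """Average over all spatial positions: one number per channel."""
    flat = flatten(feature_map)
    return sum(flat) / len(flat)

# The same activation appears on the left in one map and on the right in the other.
activation_left = [[1.0, 0.0],
                   [0.0, 0.0]]
activation_right = [[0.0, 1.0],
                    [0.0, 0.0]]

same_after_gap = (global_average_pool(activation_left)
                  == global_average_pool(activation_right))
same_after_flatten = flatten(activation_left) == flatten(activation_right)
```

After pooling, the classifier can no longer tell whether the activation came from the left or the right of the image, which is exactly the information a lane coordinate depends on.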




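Note: [In the formula above, LCE(·) is the cross-entropy loss, Pr_i,j is the predicted localization output for the i-th lane at the j-th row anchor, and Trcls_i,j is the corresponding ground-truth class label. The loss for column anchors is defined in the same way as the one for row anchors.]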
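
The row-anchor part of the classification loss can be sketched as a sum of cross-entropy terms over lanes and anchors. This is an illustration under assumptions, not the source-code loss: the function names are hypothetical, the terms are summed rather than averaged, and entries labeled -1 (no lane) are skipped here.

```python
import math

def cross_entropy(logits, target_class):
    """Cross-entropy between softmax(logits) and a one-hot target class."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target_class]

def localization_loss(pred, target_cls):
    """Sum of cross-entropy terms; pred[i][j] holds the logits for lane i, anchor j."""
    total = 0.0
    for lane_logits, lane_targets in zip(pred, target_cls):
        for logits, cls in zip(lane_logits, lane_targets):
            if cls == -1:  # assumption: no-lane entries do not contribute
                continue
            total += cross_entropy(logits, cls)
    return total

# One lane, two row anchors, four classes; the second anchor has no lane.
pred_r = [[[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]]
t_r_cls = [[2, -1]]
loss = localization_loss(pred_r, t_r_cls)  # uniform prediction over 4 classes
```

The column-anchor term would be computed the same way from Pc and Tccls and added to this value.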








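Note: [α and β are the coefficients of the loss terms. In addition, the loss actually used in the source code is not limited to the terms above; it also includes the loss functions from v1.]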



