SCNN - Spatial As Deep: Spatial CNN for Traffic Scene Understanding: Paper Reading + Code Reproduction (Lane Detection)

This post covers the SCNN model applied to lane detection and compares the characteristics of the CULane, TuSimple, and Caltech Lanes datasets. By propagating messages as residuals, in the spirit of deep residual learning, SCNN achieves better performance, especially on challenging scenes such as those in CULane. In the experiments, the model is based on VGG16 pretrained on ImageNet and implemented in the Torch7 framework; TensorFlow and PyTorch versions are now also available. The training strategy includes a "poly" learning rate policy and batch normalization. The evaluation metric is IoU, and ablation studies verify the effect of each component. Training and testing code is also provided, making the results easy to reproduce and build on.

Datasets

CULane: focuses on the four-lane setting, and lanes on the far side of an obstacle are not annotated.
The scenes in TuSimple and the Caltech Lanes Dataset are relatively simple; by comparison, CULane is of much greater practical relevance.

[Figure: Eq. (1) of the paper, the slice-wise message-passing update]
where f is a nonlinear activation function such as ReLU.

However, deep residual learning (He et al. 2016) has shown its capability to ease the training of very deep neural networks. Similarly, in our deep SCNN, messages are propagated as residuals, i.e., the output of the ReLU in Eq. (1).
Such residuals could also be viewed as a kind of modification to the original neurons.
As our experiments will show, such a message-passing scheme achieves better results than LSTM-based methods.
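To make this concrete, below is a minimal PyTorch sketch of the downward (top-to-bottom) pass only: each row of the feature map is updated by adding the ReLU of a 1-D convolution over the previously updated row. The channel count and kernel width here are placeholders, and this is an illustration of the scheme rather than the authors' Torch7 implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SCNNDown(nn.Module):
    """Minimal sketch of SCNN's top-to-bottom message passing.

    Each row of the feature map receives a message from the row above:
    the previous (already updated) row is convolved with a 1-D kernel of
    width `kernel_w`, passed through ReLU, and added as a residual to the
    current row, i.e., the residual scheme described around Eq. (1).
    """

    def __init__(self, channels=128, kernel_w=9):
        super().__init__()
        # C -> C convolution applied along the width dimension only.
        self.conv = nn.Conv2d(channels, channels, kernel_size=(1, kernel_w),
                              padding=(0, kernel_w // 2), bias=False)

    def forward(self, x):                      # x: (N, C, H, W)
        rows = list(torch.split(x, 1, dim=2))  # H slices of shape (N, C, 1, W)
        for i in range(1, len(rows)):
            # Residual message: ReLU of the convolved previous (updated) row.
            rows[i] = rows[i] + F.relu(self.conv(rows[i - 1]))
        return torch.cat(rows, dim=2)
```

The full SCNN module applies the same pass sequentially in four directions (downward, upward, rightward, and leftward); the other directions follow by iterating over the rows in reverse or over columns instead of rows.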

Experiment

CULane
Cityscapes

SGD with batch size 12,
base learning rate 0.01,
momentum 0.9,
weight decay 0.0001.
The learning rate policy is "poly" with power and iteration number set to 0.9 and 60K respectively.
Our models are modified based on the LargeFOV model in (Chen et al. 2017).
The initial weights of the first 13 convolution layers are copied from VGG16 (Simonyan and Zisserman 2015) trained on ImageNet (Deng et al. 2009).

All experiments are implemented on the Torch7 (Collobert, Kavukcuoglu, and Farabet 2011) framework. TensorFlow and PyTorch implementations are now also available.
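The hyperparameters above map directly onto a PyTorch training loop. The following is a minimal sketch of the SGD setup and the "poly" learning rate policy, with a stand-in `model` and the data pipeline omitted; it is not taken from the released code.

```python
import torch
import torch.nn as nn

# Stand-in for the actual SCNN model; only here so the sketch runs.
model = nn.Conv2d(3, 5, kernel_size=3, padding=1)

# Hyperparameters reported above.
base_lr, power, max_iter, batch_size = 0.01, 0.9, 60000, 12

optimizer = torch.optim.SGD(model.parameters(), lr=base_lr,
                            momentum=0.9, weight_decay=1e-4)

def poly_lr(iteration):
    """'poly' policy: base_lr * (1 - iter / max_iter) ** power."""
    return base_lr * (1.0 - iteration / max_iter) ** power

for it in range(max_iter):
    for group in optimizer.param_groups:
        group['lr'] = poly_lr(it)
    # ... fetch a batch of `batch_size` images, compute the loss,
    # loss.backward(), optimizer.step(), optimizer.zero_grad()
```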

Lane detection model

Instead of first doing binary segmentation and then clustering, the lane markings are split into four separate classes, and the probability maps (probmaps) are fed into a small network that predicts whether each lane marking exists.

During testing, we still need to convert the probability maps into curves, as shown in the figure below. For lane markings whose existence value is greater than 0.5, we search the corresponding probmap every 20 rows for the position with the strongest response, and these positions are connected by cubic splines to form the final result.

[Figure: probability maps converted into lane curves]
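A rough NumPy/SciPy sketch of this post-processing step is shown below. The probability cutoff `prob_thr` used to skip weak rows and the minimum number of points required for the spline are assumptions for illustration, not values from the paper.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def probmaps_to_curves(prob_maps, exist, row_step=20, exist_thr=0.5, prob_thr=0.3):
    """Sketch of the test-time post-processing described above.

    prob_maps: (4, H, W) per-lane probability maps.
    exist:     (4,) existence scores from the small classification branch.
    Returns a list of (x, y) point arrays, one per detected lane.
    prob_thr is an assumed cutoff for ignoring weak rows (not from the paper).
    """
    lanes = []
    H, W = prob_maps.shape[1:]
    for lane_map, e in zip(prob_maps, exist):
        if e <= exist_thr:
            continue
        ys, xs = [], []
        for y in range(0, H, row_step):       # sample every `row_step` rows
            row = lane_map[y]
            x = int(row.argmax())              # strongest response in the row
            if row[x] > prob_thr:
                ys.append(y)
                xs.append(x)
        if len(ys) >= 4:                       # need a few points for a cubic spline
            spline = CubicSpline(ys, xs)       # x as a cubic function of y
            y_dense = np.arange(ys[0], ys[-1] + 1)
            lanes.append(np.stack([spline(y_dense), y_dense], axis=1))
    return lanes
```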

  • As shown in Fig. 5 (a), the detailed differences between our baseline model and LargeFOV are:
    (1) the output channel number of the 'fc7' layer is set to 128,
    (2) the 'rate' for the atrous convolution layer of 'fc6' is set to 4,
    (3) batch normalization (Ioffe and Szegedy 2015) is added before each ReLU layer,
    (4) a small network is added to predict the existence of lane markings.
    During training, the line width of the targets is set to 16 pixels, and the input and target images are rescaled to 800 × 288. Considering the imbalanced labels between background and lane markings, the loss of the background is multiplied by 0.4 (a sketch of this loss setup follows the list).
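A hedged PyTorch sketch of the loss setup implied by item (4) and the 0.4 background weight above; the tensor shapes and the existence-loss coefficient `exist_coef` are assumptions for illustration, not values taken from the released code.

```python
import torch
import torch.nn as nn

# 5-class segmentation (background + 4 lane classes) with the background
# loss down-weighted to 0.4, plus a binary loss for the existence branch.
seg_weight = torch.tensor([0.4, 1.0, 1.0, 1.0, 1.0])  # background weight = 0.4
seg_loss_fn = nn.CrossEntropyLoss(weight=seg_weight)
exist_loss_fn = nn.BCEWithLogitsLoss()

def total_loss(seg_logits, seg_target, exist_logits, exist_target, exist_coef=0.1):
    """seg_logits: (N, 5, H, W); seg_target: (N, H, W) with labels 0..4;
    exist_logits / exist_target: (N, 4). exist_coef is an assumed weighting."""
    return seg_loss_fn(seg_logits, seg_target) + \
           exist_coef * exist_loss_fn(exist_logits, exist_target.float())
```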

Evaluation

In order to judge whether a lane marking is successfully detected, we view lane markings as lines with widths equal to 30 pixels.
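A minimal NumPy/OpenCV sketch of this criterion: each lane is rasterized as a 30-pixel-wide polyline and the IoU of the two masks is computed. The image size used here is CULane's 1640 × 590; treat it as an assumption when adapting this to other data.

```python
import cv2
import numpy as np

def lane_iou(pred_pts, gt_pts, img_shape=(590, 1640), width=30):
    """Draw each lane as a 30-pixel-wide line and compute the IoU of the masks.

    pred_pts / gt_pts: (K, 2) arrays of (x, y) points along one lane.
    img_shape: (H, W) of the rasterization canvas (CULane size assumed here).
    """
    pred_mask = np.zeros(img_shape, dtype=np.uint8)
    gt_mask = np.zeros(img_shape, dtype=np.uint8)
    cv2.polylines(pred_mask, [np.int32(pred_pts)], isClosed=False,
                  color=1, thickness=width)
    cv2.polylines(gt_mask, [np.int32(gt_pts)], isClosed=False,
                  color=1, thickness=width)
    inter = np.logical_and(pred_mask, gt_mask).sum()
    union = np.logical_or(pred_mask, gt_mask).sum()
    return inter / union if union else 0.0
```

Predictions whose IoU with a ground-truth lane exceeds a threshold (0.5 in the paper) are counted as true positives, from which precision, recall, and the F1 measure are computed.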
