SCNN - Spatial As Deep: Spatial CNN for Traffic Scene Understanding: Paper Reading + Code Reproduction (Lane Detection)

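Datasets

CULane: focuses on the four-lane problem; lanes on the other side of a barrier are not annotated.
The scenes in TuSimple and the Caltech Lanes Dataset are relatively simple, so by comparison CULane is of greater practical relevance.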

[Image: Eq. (1) of the paper, the SCNN message-passing formula]
where f is a nonlinear activation function such as ReLU.

However, deep residual learning (He et al. 2016) has shown its capability to ease the training of very deep neural networks. Similarly, in our deep SCNN messages are propagated as residual, which is the output of ReLU in Eq. (1).
Such residual could also be viewed as a kind of modification to the original neuron.
As our experiments will show, such a message-passing scheme achieves better results than LSTM-based methods.
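
To make the residual message passing concrete, here is a minimal pure-Python sketch of the downward (top-to-bottom) pass only. The function name `scnn_downward`, the nested-list layout `[channel][row][column]`, and the centred, zero-padded kernel are assumptions made for illustration; none of them is taken from the released code.

```python
def relu(v):
    """ReLU, used here as the nonlinearity f."""
    return v if v > 0 else 0.0


def scnn_downward(x, kernel):
    """Top-to-bottom message passing over a feature map.

    x:      nested lists indexed [channel][row][column]
    kernel: nested lists indexed [in_channel][out_channel][tap];
            the number of taps (kernel width w) is assumed to be odd
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    taps = len(kernel[0][0])
    pad = taps // 2
    # Row 0 is copied unchanged; later rows receive a residual message.
    out = [[row[:] for row in plane] for plane in x]
    for j in range(1, height):
        for i in range(channels):
            for k in range(width):
                message = 0.0
                for m in range(channels):
                    for n in range(taps):
                        col = k + n - pad
                        if 0 <= col < width:  # zero padding at the borders
                            # Uses the already updated row j-1, so information
                            # keeps flowing down the whole feature map.
                            message += out[m][j - 1][col] * kernel[m][i][n]
                out[i][j][k] = x[i][j][k] + relu(message)
    return out


feature = [[[1, 2, 3], [0, 0, 0], [-5, 1, 1]]]
identity_kernel = [[[0, 1, 0]]]  # one channel, width-3 kernel passing the centre tap
print(scnn_downward(feature, identity_kernel))
```

With the identity kernel, each row simply receives the ReLU of the updated row above it, so the first row's values reach the last row even though the two rows are not adjacent.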

Experiment

CULane
Cityscapes

SGD with batch size 12,
base learning rate 0.01,
momentum 0.9,
weight decay 0.0001.
The learning rate policy is "poly" with power and iteration number set to 0.9 and 60K respectively.
Our models are modified based on the LargeFOV model in (Chen et al. 2017).
The initial weights of the first 13 convolution layers are copied from VGG16 (Simonyan and Zisserman 2015)
trained on ImageNet (Deng et al. 2009).
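
For reference, the "poly" policy decays the learning rate as base_lr * (1 - iter / max_iter) ^ power. Below is a small sketch using the values quoted above; the function name `poly_lr` and the clamping of the iteration count to `max_iter` are my own choices, not something stated in the paper.

```python
def poly_lr(iteration, base_lr=0.01, max_iter=60000, power=0.9):
    """Learning rate under the "poly" policy at a given iteration."""
    # Clamp so that iterations past max_iter do not produce a negative base.
    progress = min(iteration, max_iter) / max_iter
    return base_lr * (1.0 - progress) ** power


for it in (0, 30000, 60000):
    print(it, poly_lr(it))
```

The rate starts at the base value and falls smoothly to zero at the final iteration.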

All experiments are implemented on the Torch7 (Collobert, Kavukcuoglu, and Farabet 2011)
framework. TensorFlow and PyTorch versions are now available as well.

Lane detection model

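Instead of binary segmentation followed by clustering, the lane markings are treated as four separate classes, and the probmaps are fed to a small network that predicts whether each lane marking exists.

At test time we still need to turn the probability maps into curves, as in the figure below. For each lane marking whose existence value is larger than 0.5, we search the corresponding probmap every 20 rows for the position with the highest response. These positions are then connected by cubic splines to form the final result.

[Image: probability maps converted into lane curves]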

  • As shown in Fig. 5 (a), the detailed differences between our baseline model and LargeFOV are:
    (1) the output channel number of the 'fc7' layer is set to 128,
    (2) the 'rate' for the atrous convolution layer of 'fc6' is set to 4,
    (3) batch normalization (Ioffe and Szegedy 2015) is added before each ReLU layer,
    (4) a small network is added to predict the existence of lane markings.
    During training, the line width of the targets is set to 16 pixels, and the input and target images are rescaled to 800 × 288. Considering the imbalanced label between background and lane markings, the loss of background is multiplied by 0.4.
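
The background down-weighting can be illustrated with a class-weighted cross-entropy. This is only a sketch under assumptions: five classes (background plus four lanes), a plain mean over pixels, and the hypothetical names `CLASS_WEIGHTS` and `weighted_cross_entropy`. The normalisation used in the actual training code may differ.

```python
import math

# Assumed class order: index 0 is background, 1-4 are the four lane markings.
CLASS_WEIGHTS = [0.4, 1.0, 1.0, 1.0, 1.0]


def weighted_cross_entropy(probs, labels, weights=CLASS_WEIGHTS):
    """Mean per-pixel cross-entropy with a weight per ground-truth class.

    probs:  list of per-pixel class-probability lists (softmax outputs)
    labels: list of ground-truth class ids, one per pixel
    """
    total = 0.0
    for pixel_probs, label in zip(probs, labels):
        total += -weights[label] * math.log(pixel_probs[label])
    return total / len(labels)


pixel = [[0.5, 0.5, 0.0, 0.0, 0.0]]
print(weighted_cross_entropy(pixel, [0]))  # background pixel, loss scaled by 0.4
print(weighted_cross_entropy(pixel, [1]))  # lane pixel, full loss
```

An equally uncertain prediction costs less on a background pixel than on a lane pixel, which keeps the far more numerous background pixels from dominating the gradient.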

Evaluation

In order to judge whether a lane marking is successfully detected, we view lane markings as lines with widths equal to 30 pixels and calculate the intersection-over-union (IoU) between the ground truth and the prediction.

Predictions whose IoUs are larger than a certain threshold are viewed as true positives (TP), as shown in Fig. 6.
Here we consider 0.3 and 0.5 thresholds corresponding to loose and strict evaluations.
The harmonic mean of precision and recall (F1-measure) is used as the final evaluation metric.
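
As a quick reminder of how the metric is assembled from the TP/FP/FN counts, here is a small sketch. The function name `f1_measure` and the handling of empty denominators are assumptions; the official evaluation is done by the tools shipped with the dataset.

```python
def f1_measure(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# A prediction counts as a TP when its IoU with a ground-truth lane exceeds
# the chosen threshold (0.3 for loose, 0.5 for strict evaluation).
print(f1_measure(tp=8, fp=2, fn=2))
```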

Ablation Study

(1) Effectiveness of multidirectional SCNN
(2) Effects of kernel width w.
(3) Spatial CNN on different positions
(4) Effectiveness of sequential propagation: in SCNN, a pixel is influenced not only by nearby pixels but can also receive information from distant pixels.
(5) Comparison with state-of-the-art methods.
(6) Computational efficiency over other methods.

Semantic Segmentation on Cityscapes (only skimmed; for now I am focusing on lane detection)

SCNN-Tensorflow

code: https://github.com/cardwing/Codes-for-Lane-Detection#SCNN-Tensorflow
code: https://github.com/XingangPan/SCNN

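Environment setup and file preparation

  • I reused the environment I had already set up for LaneNet and have not run into any conflicts so far.
  • Download vgg.npy and put it in SCNN-Tensorflow/lane-detection-model/data.
  • Download the pretrained model provided by the author.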

Testing

Split across lines for readability, to show what each argument does:

CUDA_VISIBLE_DEVICES="0" python tools/test_lanenet.py \
  --weights_path /home/stone/disk/Lane_detection/Codes-for-Lane-Detection-master/SCNN-Tensorflow/lane-detection-model/model_weights/culane_lanenet_vgg_2018-12-01-14-38-37.ckpt-10000 \
  --image_path demo_file/test_img.txt \
  --save_dir savedir

One-line version for copy and paste:

CUDA_VISIBLE_DEVICES="0" python tools/test_lanenet.py   --weights_path /home/stone/disk/Lane_detection/Codes-for-Lane-Detection-master/SCNN-Tensorflow/lane-detection-model/model_weights/culane_lanenet_vgg_2018-12-01-14-38-37.ckpt-10000  --image_path demo_file/test_img.txt   --save_dir savedir


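  • The probability maps are now saved in savedir. Use the official MATLAB scripts to turn the probability maps into curves, then calculate precision, recall and F1-measure.

  • I changed the file paths listed in CULane->list->test.txt by prepending /CULane to every line (for training, the other files under list need the same change).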

sed 's/^/\/CULane/g' test.txt  >test.out     

Then manually copy the contents of test.out back into test.txt.
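
If you would rather avoid the manual copy step, the same prefixing can be done in place with a few lines of Python. This is a sketch, not part of the repository: `add_prefix` and `prefix_list_file` are hypothetical helper names, and the example path is a placeholder.

```python
from pathlib import Path


def add_prefix(lines, prefix="/CULane"):
    """Prepend the prefix to every line, like sed 's/^/\\/CULane/'."""
    return [prefix + line for line in lines]


def prefix_list_file(list_file, prefix="/CULane"):
    """Rewrite a list file in place with the prefix added to each line."""
    path = Path(list_file)
    lines = path.read_text().splitlines()
    path.write_text("".join(line + "\n" for line in add_prefix(lines, prefix)))


# Placeholder path used only to show the effect.
print(add_prefix(["/some_dir/00000.jpg"]))
```

Run `prefix_list_file` only once per file; a second run would add the prefix again.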

  • Set the paths in the MATLAB file
  • The value of show controls whether the results are visualized
% Experiment name
exp = 'vgg_SCNN_DULR_w9';
% Data root
data = '/home/stone/disk/Lane_detection/Codes-for-Lane-Detection-master/SCNN-Tensorflow/lane-detection-model';
% Directory where prob imgs generated by CNN are saved.
probRoot = strcat('/home/stone/disk/Lane_detection/Codes-for-Lane-Detection-master/SCNN-Tensorflow/lane-detection-model/savedir/', exp);
% Directory to save fitted lanes.
output = strcat('./output/', exp);

testList = strcat(data, '/CULane/list/test.txt');
show = true;  % set to true to visualize

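[Image: points sampled along the lanes]
In effect, one point is taken every 20 rows of the original 590-pixel-high image, which is about every 9.76 rows of the 288-pixel-high probmap (20/590*288), for 18 rows in total. Taking the column with the maximum response in each sampled row gives the positions of the points shown in the figure above.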

function [ coordinate ] = getLane( score )
% Calculate lane position from a probmap.
thr = 0.3;
coordinate = zeros(1,18);
for i=1:18
    lineId = uint16(288-(i-1)*20/590*288);
    line = score(lineId,:);
    [value, id] = max(line);
    if double(value)/255 > thr
        coordinate(i) = id;
    end
end
if sum(coordinate>0)<2
    coordinate = zeros(1,18);
end
end
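
For readers who do not have MATLAB at hand, here is a rough pure-Python port of getLane. It is a sketch, not the official tool: the name `get_lane` and its keyword arguments are mine, the probmap is assumed to be a 288-row array-like of 0-255 values, and MATLAB's 1-based row and column indices are kept so the output matches the original.

```python
def get_lane(score, thr=0.3, num_points=18, img_h=590, map_h=288, step=20):
    """Pick the strongest column in each sampled row of a lane probmap.

    score: 2-D array-like with map_h rows holding values in 0..255.
    Returns num_points 1-based column indices (0 means no point found).
    """
    coordinate = [0] * num_points
    for i in range(num_points):
        # Same rows as the MATLAB code; uint16() rounds to nearest.
        line_id = int(map_h - i * step / img_h * map_h + 0.5)
        row = list(score[line_id - 1])  # convert the 1-based row to 0-based
        value = max(row)
        if value / 255 > thr:
            coordinate[i] = row.index(value) + 1  # first maximum, 1-based
    # A lane with fewer than two points cannot be fitted, so discard it.
    if sum(c > 0 for c in coordinate) < 2:
        coordinate = [0] * num_points
    return coordinate


probmap = [[0] * 10 for _ in range(288)]
probmap[287][3] = 200  # row 288 in MATLAB terms
probmap[277][5] = 255  # row 278 in MATLAB terms
print(get_lane(probmap)[:3])
```

The rows visited are 288, 278, 268, and so on down to 122, which is the every-20-rows sampling of the 590-pixel-high image mapped onto the 288-row probmap.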

Training

Modify the parameters in config/global_config.py:

__C.TRAIN.BATCH_SIZE
__C.TRAIN.VAL_BATCH_SIZE
__C.TRAIN.GPU_NUM
__C.TRAIN.CPU_NUM
__C.TEST.BATCH_SIZE
__C.TEST.CPU_NUM

Command-line arguments used by the training script:

parser.add_argument('--dataset_dir', type=str, help='The training dataset dir path')
parser.add_argument('--net', type=str, help='Which base net work to use', default='vgg')
parser.add_argument('--weights_path', type=str, help='The pretrained weights path')

Split across lines for readability:

CUDA_VISIBLE_DEVICES="0" python tools/train_lanenet.py \
  --net vgg \
  --dataset_dir /home/stone/disk/Lane_detection/Codes-for-Lane-Detection-master/SCNN-Tensorflow/lane-detection-model/CULane/list

One-line version for copying:

CUDA_VISIBLE_DEVICES="0" python tools/train_lanenet.py --net vgg --dataset_dir /home/stone/disk/Lane_detection/Codes-for-Lane-Detection-master/SCNN-Tensorflow/lane-detection-model/CULane/list

Code structure

1. lanenet_data_processor.py

DataSet class

  • __init__
  • process_img
  • process_label_instance
  • process_label_existence
  • _init_dataset
  • next_batch

2. lanenet_data_processor_test.py

DataSet class

  • __init__
  • process_img
  • _init_dataset
  • next_batch

3. global_config.py

  • config: stores shared configuration variables

4. lanenet_merge_model.py

class LaneNet

  • inference
  • test_inference
  • loss

Unfamiliar statements

img_decoded = tf.image.decode_jpeg(img_raw, channels=3)  # decode the JPEG-encoded image into its 3-D pixel array

img_resized = tf.image.resize_images(img_decoded, [CFG.TRAIN.IMG_HEIGHT, CFG.TRAIN.IMG_WIDTH],
                                             method=tf.image.ResizeMethod.BICUBIC)
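method = 0: bilinear interpolation
method = 1: nearest-neighbor interpolation
method = 2: bicubic interpolation
method = 3: area interpolation

tf.subtract: element-wise subtraction

input_queue = tf.train.slice_input_producer([image_tensor, label_instance_tensor, label_existence_tensor])  # queue-based input pipeline that yields one slice of each tensor at a time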