Lane Detection Paper Notes: Learning Lightweight Lane Detection CNNs by Self Attention Distillation

ICCV2019

code: https://github.com/cardwing/Codes-for-Lane-Detection
paper: https://arxiv.org/abs/1908.00821

Abstract

  1. Presents a novel knowledge distillation approach, Self Attention Distillation (SAD), which allows a model to learn from itself and gain substantial improvement without any additional supervision or labels.
  2. The contextual information encoded in attention maps can be used as a form of "free" supervision.
  3. The network: ENet-SAD.

1.Introduction

 1. SAD allows a network to exploit attention maps derived from its own layers as distillation targets for its lower layers.
 2. SAD is used only in the training phase, so it brings no computational cost during deployment.
 3. With SAD added, a preceding block learns to mimic the attention maps of a deeper block.

2.Method

1.The aim is to perform layer-wise, top-down attention distillation to enhance the representation learning process.

Only activation-based attention distillation is used.

2.Activation-based attention distillation

Let $A_m \in \mathbb{R}^{C_m \times H_m \times W_m}$ denote the activation map of the $m$-th block, where $C_m$, $H_m$ and $W_m$ are its channel, height and width.
The attention generator aggregates over the channel dimension:
$g: \mathbb{R}^{C_m \times H_m \times W_m} \rightarrow \mathbb{R}^{H_m \times W_m}$
Three implementations of this mapping are considered:
1) Sum of absolute values: $g_{sum}(A_m) = \sum_{i=1}^{C_m} |A_{mi}|$
2) Sum of $p$-th powers of absolute values: $g_{sum}^{p}(A_m) = \sum_{i=1}^{C_m} |A_{mi}|^{p}$
3) Max of $p$-th powers of absolute values: $g_{max}^{p}(A_m) = \max_{i=1,\dots,C_m} |A_{mi}|^{p}$
where $A_{mi}$ denotes the $i$-th channel slice of $A_m$. Comparatively, $g_{sum}^{p}(A_m)$ gives better results (the network below uses $p=2$); a minimal code sketch follows.
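
A minimal PyTorch sketch of these channel-aggregation maps; the function names and the toy shapes are illustrative, not taken from the authors' code:

```python
import torch

def g_sum(A: torch.Tensor, p: int = 1) -> torch.Tensor:
    """Sum of |A_mi|^p over the channel dimension: (C, H, W) -> (H, W)."""
    return A.abs().pow(p).sum(dim=0)

def g_max(A: torch.Tensor, p: int = 1) -> torch.Tensor:
    """Element-wise max of |A_mi|^p over the channel dimension."""
    return A.abs().pow(p).max(dim=0).values

# Toy activation map: 64 channels on a 36x100 spatial grid
A_m = torch.randn(64, 36, 100)
att = g_sum(A_m, p=2)      # the paper's preferred choice, g_sum^2
print(att.shape)           # torch.Size([36, 100])
```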

3.Network Architecture

Lane segmentation is performed by ENet-SAD. The attention generator (AT-GEN) works as follows (see the sketch after this list):
1) A spatial softmax operation $\Phi(\cdot)$ is applied to $g_{sum}^{2}(A_m)$.
2) Bilinear upsampling $B(\cdot)$ is added before the softmax operation if the size of the original attention map differs from that of the target.
3) AT-GEN is thus represented by the function $\Psi = \Phi(B(g_{sum}^{2}(A_m)))$.
(Figure omitted: ENet-SAD architecture with AT-GEN attention distillation paths between successive blocks.)
4) Total loss (as given in the paper): $L = L_{seg} + \alpha L_{IoU} + \beta L_{exist} + \gamma L_{distill}$, i.e., segmentation cross-entropy, IoU loss, lane-existence loss, and the SAD term $L_{distill} = \sum_{m} \|\Psi(A_{m}) - \Psi(A_{m+1})\|_{2}^{2}$.
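
A hedged PyTorch sketch of AT-GEN and the inter-layer distillation term; the `at_gen`/`sad_loss` names and the `.detach()` choice are my assumptions, not the official implementation:

```python
import torch
import torch.nn.functional as F

def at_gen(A: torch.Tensor, target_size) -> torch.Tensor:
    """Psi = Phi(B(g_sum^2(A))) for a batched activation map A of shape (N, C, H, W)."""
    att = A.pow(2).sum(dim=1, keepdim=True)                    # g_sum^2 over channels
    att = F.interpolate(att, size=target_size,
                        mode='bilinear', align_corners=False)  # B(.): match target size
    n, _, h, w = att.shape
    att = F.softmax(att.view(n, -1), dim=1).view(n, 1, h, w)   # Phi(.): spatial softmax
    return att

def sad_loss(block_feats, target_size):
    """Each block mimics the attention map of the next deeper block (L2 distance)."""
    maps = [at_gen(f, target_size) for f in block_feats]
    # .detach() treats the deeper map as a fixed target; whether the official
    # code stops this gradient is an assumption here.
    return sum(F.mse_loss(maps[m], maps[m + 1].detach())
               for m in range(len(maps) - 1))
```

The total loss then combines this distillation term with the segmentation, IoU and existence losses, weighted by the $\alpha$, $\beta$, $\gamma$ above.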

4.Lane Post-Processing

1) Network output: multi-channel probability maps + a lane existence vector.
2) Post-processing (see the sketch after this list):
(1) Smooth each probability map with a 9×9 kernel;
(2) for each lane whose existence probability is larger than 0.5, search the corresponding probability map every 20 rows for the position with the highest probability value;
(3) fit the lane points with cubic splines.
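
A rough NumPy/SciPy sketch of this decoding pipeline; the mean filter, the per-point confidence threshold, and the minimum-point check are my stand-ins for details the note leaves unspecified:

```python
import numpy as np
from scipy.ndimage import uniform_filter
from scipy.interpolate import CubicSpline

def decode_lane(prob_map, exist_prob, row_stride=20, thresh=0.5):
    """prob_map: (H, W) probability map of one lane; returns x = f(y) or None."""
    if exist_prob <= thresh:                           # step (2): existence gate
        return None
    smoothed = uniform_filter(prob_map, size=9)        # step (1): 9x9 smoothing kernel
    xs, ys = [], []
    for r in range(0, prob_map.shape[0], row_stride):  # step (2): every 20 rows
        c = int(smoothed[r].argmax())
        if smoothed[r, c] > thresh:                    # keep confident positions only
            xs.append(c)
            ys.append(r)
    if len(ys) < 4:                                    # need enough points for a cubic fit
        return None
    return CubicSpline(ys, xs)                         # step (3): cubic-spline lane curve
```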

5.Other Details

1) A small network P1 is added to predict the existence of lanes (a speculative sketch follows this list).
2) Dilated convolutions replace the original convolutions.
3) The outputs of E3 and E4 are concatenated for the output.
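
A speculative PyTorch sketch of such an existence branch; the layer sizes, the placement of the dilated convolution, and the head structure are illustrative guesses, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class ExistenceBranch(nn.Module):
    """Small head (in the spirit of P1) mapping encoder features to per-lane existence scores."""
    def __init__(self, in_ch: int, num_lanes: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_ch, 32, kernel_size=3, stride=2,
                      padding=4, dilation=4),          # dilated conv in place of a plain conv
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, num_lanes),
            nn.Sigmoid(),                              # one existence probability per lane
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Usage: probs = ExistenceBranch(128)(torch.randn(1, 128, 36, 100))  # -> shape (1, 4)
```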

6.Training Notes and Caveats

1) SAD works well in the middle- and high-level layers;
2) adding SAD to low-level layers hurts performance;
3) mimicking the attention maps of the immediately succeeding block brings more performance gains than mimicking non-adjacent blocks (P23 + P34 outperforms P24 + P34);
4) distillation between high-level and low-level layers degrades the metrics, because the information captured by layers at different depths differs greatly;
5) adding SAD in the later phase of training is effective; early in training the deeper layers are not yet stable, so their attention maps are of low quality.

Appendix: Original Abstract

Training deep models for lane detection is challenging due to the very subtle and sparse supervisory signals inherent in lane annotations. Without learning from much richer context, these models often fail in challenging scenarios, e.g., severe occlusion, ambiguous lanes, and poor lighting conditions. In this paper, we present a novel knowledge distillation approach, i.e., Self Attention Distillation (SAD), which allows a model to learn from itself and gains substantial improvement without any additional supervision or labels. Specifically, we observe that attention maps extracted from a model trained to a reasonable level would encode rich contextual information. The valuable contextual information can be used as a form of 'free' supervision for further representation learning through performing top-down and layer-wise attention distillation within the network itself. SAD can be easily incorporated in any feedforward convolutional neural networks (CNN) and does not increase the inference time. We validate SAD on three popular lane detection benchmarks (TuSimple, CULane and BDD100K) using lightweight models such as ENet, ResNet-18 and ResNet-34. The lightest model, ENet-SAD, performs comparatively or even surpasses existing algorithms. Notably, ENet-SAD has 20× fewer parameters and runs 10× faster compared to the state-of-the-art SCNN [16], while still achieving compelling performance in all benchmarks. Our code is available at https://github.com/cardwing/Codes-for-Lane-Detection.