Paper: Real-Time Freespace Segmentation
Abstract
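The main goal of the paper is to let an indoor robot correctly detect free space (the region where the robot can drive safely) in images from an RGB camera.
Detecting and segmenting the obstacles in the image lets the robot avoid them safely.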
The method is meant to detect all types of obstacles, for example negative obstacles, ledges, overhangs, glass surfaces, etc.
The proposed network runs at 55 fps on an embedded NVIDIA Jetson TX2's GPU.
Pipeline:
- a frame is captured from the camera and simple edge detection techniques are applied;
- the frame and detected edges are processed by our semantic segmentation neural network to produce a freespace map;
- this freespace map is converted to a 3D pointcloud.
This post only covers the first two steps.
1. Input Preprocessing
We add auxiliary inputs. These allow the network to more easily learn to refine edges of obstacles and speed up the neural network's convergence.
Because the edges of the freespace map (segmentation output) generally correspond with edges of objects, explicitly adding the image gradient at this point helps refine segmentation results at object boundaries.
We compute the discrete Sobel (1st order derivative) and Laplacian (2nd order derivative) gradients of the image.
The Sobel X, Sobel Y, and Laplacian maps are concatenated and then downsampled to a 112x112x3 map, which serves as the auxiliary input.
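The preprocessing step above can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's code: the 3x3 kernels, the grayscale conversion by channel mean, the 2x2 average-pool downsampling, and the peak normalization are all assumptions of mine, and the function names `filter3x3` and `auxiliary_input` are hypothetical.

```python
import numpy as np

# 3x3 kernels for the discrete Sobel (1st-order) and Laplacian (2nd-order) operators.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float32)
SOBEL_Y = SOBEL_X.T
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float32)


def filter3x3(gray, kernel):
    """Correlate a 2D image with a 3x3 kernel, replicating border pixels."""
    h, w = gray.shape
    padded = np.pad(gray, 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float32)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def auxiliary_input(rgb):
    """Build an (H/2, W/2, 3) edge map from an (H, W, 3) uint8 frame (H, W even)."""
    # Assumption: grayscale is the plain mean of the three channels.
    gray = rgb.astype(np.float32).mean(axis=2)
    edges = np.stack(
        [filter3x3(gray, SOBEL_X), filter3x3(gray, SOBEL_Y), filter3x3(gray, LAPLACIAN)],
        axis=-1,
    )
    h, w, c = edges.shape
    # Assumption: 2x downsampling by averaging each 2x2 block.
    half = edges.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))
    # Assumption: normalize into [-1, 1] by the largest absolute response.
    peak = np.abs(half).max()
    return half / peak if peak > 0 else half


frame = np.zeros((224, 224, 3), dtype=np.uint8)
frame[:, 112:] = 255                 # synthetic frame with one vertical edge
aux = auxiliary_input(frame)         # shape (112, 112, 3)
```

On the synthetic frame only the Sobel X and Laplacian channels respond, since the image has no intensity change along the vertical direction.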
2. Network Architecture
Our specific task of segmenting free space (binary classification) requires much less contextual information than multiclass segmentation.
The main input to the network is a 224x224 RGB image, and the auxiliary input is a 112x112x3 feature map, scaled to [-1.0, 1.0].
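The notes do not say how the scaling to [-1.0, 1.0] is done or what the raw value range is. A common choice, shown here only as an assumption, is a linear mapping from 8-bit pixel values; the helper name `scale_to_unit_range` is hypothetical.

```python
import numpy as np


def scale_to_unit_range(image_u8):
    """Map uint8 pixel values in [0, 255] linearly onto [-1.0, 1.0]."""
    return image_u8.astype(np.float32) / 127.5 - 1.0


# Stand-in for a 224x224 RGB camera frame.
rgb = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
main_input = scale_to_unit_range(rgb)   # shape (224, 224, 3), float32
```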
Only the first 13 inverted residual bottleneck blocks from MobileNetV2 are used.
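To see where a cut after 13 blocks lands, the small calculation below walks through the stock MobileNetV2 bottleneck table (expansion, channels, repeats, stride) from the MobileNetV2 paper. It assumes the standard configuration with unmodified strides and a stride-2 stem convolution; the freespace network may change strides, for instance in favor of atrous convolutions, so treat the result as a reference point only. The function name `truncated_backbone` is hypothetical.

```python
# Standard MobileNetV2 bottleneck table:
# (expansion t, output channels c, repeats n, stride of the first repeat s).
MOBILENET_V2_STAGES = [
    (1, 16, 1, 1),
    (6, 24, 2, 2),
    (6, 32, 3, 2),
    (6, 64, 4, 2),
    (6, 96, 3, 1),
    (6, 160, 3, 2),
    (6, 320, 1, 1),
]


def truncated_backbone(num_blocks, input_size=224, stem_stride=2):
    """Return (channels, spatial size, total stride) after `num_blocks` blocks."""
    stride, channels, used = stem_stride, 32, 0  # state after the stem convolution
    for _t, c, n, s in MOBILENET_V2_STAGES:
        for i in range(n):
            if used == num_blocks:
                return channels, input_size // stride, stride
            stride *= s if i == 0 else 1  # only the first repeat of a stage strides
            channels = c
            used += 1
    return channels, input_size // stride, stride


print(truncated_backbone(13))  # (96, 14, 16)
```

Under these assumptions the first 13 blocks end exactly at the last 96-channel block (1 + 2 + 3 + 4 + 3 = 13), so a 224x224 input would yield a 14x14x96 feature map.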
The Atrous Pyramid module does not use BN (batch normalization).
Network output: channel 0 represents obstacles and background while channel 1 represents free space.
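A two-channel output like this is typically turned into a binary mask with a per-pixel argmax. The sketch below assumes a channel-last (H, W, 2) score map; the layout and the helper name `freespace_mask` are my assumptions, not something stated in the paper notes.

```python
import numpy as np


def freespace_mask(scores):
    """Turn an (H, W, 2) score map into a boolean mask that is True for free space."""
    # Channel 0: obstacles and background, channel 1: free space.
    return np.argmax(scores, axis=-1) == 1


scores = np.array([[[2.0, 1.0], [0.1, 0.9]],
                   [[0.5, 0.5], [-1.0, 3.0]]], dtype=np.float32)
mask = freespace_mask(scores)
```

With two classes the argmax is equivalent to thresholding the softmax probability of channel 1 at 0.5. On an exact tie `np.argmax` returns index 0, so the pixel is treated as an obstacle, which is the conservative choice for navigation.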