[Paper Reading] Online Inference and Detection of Curbs in Partially Occluded Scenes with Sparse LIDAR

paper: link to the original paper

1. Main Idea

What approach is used, and what problem does it solve?

  • Proposes a framework for annotating 3D point clouds: the clouds are projected into a bird's-eye view and curb masks are labelled, covering both occluded and non-occluded curbs.
  • The authors build their own curb-detection dataset and learn to predict both occluded and visible curb data (the fitted curb lines are presumably obtained with traditional algorithms). They propose a network that fits occluded and visible curbs separately and uses an anchor-line mechanism, which improves prediction accuracy (the same idea as anchors in object detection).

2. Method Details

How the problem is solved, what the concrete design is, and any inspiring ideas (the authors' innovations).

Related Work

  • [11] uses range and intensity information from 3D LIDAR to detect
    visible curbs on elevation data, which fails in the presence of occluding obstacles.
  • [12] presents a LIDAR-based method to detect visible curbs using sliding-beam segmentation followed by segment-specific curb detection, but fails to detect curbs behind obstacles.

How the curb curves are generated

  • In this work, we used images acquired by a Point Grey Bumblebee XB3 camera, mounted on the front of the platform facing towards the direction of motion. In particular, our implementation of VO uses FAST corners [16] combined with BRIEF descriptors [17], RANSAC [18] for outlier rejection, and nonlinear least-squares refinement. A simplified VO sketch is given after this list.

  • Point heights are limited to within 3.55 m, to prevent water on the ground from producing abnormally low point-cloud points.
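The quoted VO pipeline (FAST corners + BRIEF descriptors + RANSAC) can be sketched with OpenCV as below. Note that this is a monocular simplification that only recovers a relative rotation and an up-to-scale translation; the paper's VO is stereo and adds nonlinear least-squares refinement, which is omitted here. `cv2.xfeatures2d` requires the opencv-contrib package, and the camera matrix `K` is assumed known.

```python
# Simplified frame-to-frame VO sketch: FAST corners [16] + BRIEF descriptors [17],
# matched by Hamming distance, with RANSAC [18] rejecting outliers inside
# essential-matrix estimation. Monocular and up-to-scale; refinement omitted.
import cv2
import numpy as np

def relative_pose(img_prev, img_curr, K):
    """img_prev, img_curr: grayscale images; K: 3x3 camera intrinsics."""
    fast = cv2.FastFeatureDetector_create()
    brief = cv2.xfeatures2d.BriefDescriptorExtractor_create()  # opencv-contrib

    kp1 = fast.detect(img_prev, None)
    kp1, des1 = brief.compute(img_prev, kp1)
    kp2 = fast.detect(img_curr, None)
    kp2, des2 = brief.compute(img_curr, kp2)

    # Brute-force Hamming matching for the binary BRIEF descriptors.
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # RANSAC inside essential-matrix estimation rejects outlier matches.
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)
    return R, t   # rotation and unit-norm translation between the two frames
```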

Separating visible lines from occluded lines

To determine which points are visible and which are occluded we use the hidden point removal operator as described in [20]. The operator determines all visible points in a pointcloud when observed from a given viewpoint. This is achieved by extracting all points residing on the convex hull of a transformed pointcloud. These points resemble the visible points, all other (labeled) points are considered as hidden (or occluded). We take the previously trimmed pointclouds and create binary bird’s-eye view images by taking the height of points from the ground into account. The points that are within a predefined height difference from the LIDAR roughly correspond to the points (obstacles) that are
blocking the view. By putting together raw labels and binary masks of obstacles, obtained by running the hidden point removal algorithm, we obtain separate masks for visible and occluded road boundaries. (To be looked into further.)
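As a concrete illustration of this step, the sketch below uses Open3D's `hidden_point_removal` (an implementation of the operator from [20]) to split a height-trimmed point cloud into visible and hidden points, and then rasterises each subset into a binary bird's-eye view mask. The grid shape, resolution and HPR radius here are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: separate visible from occluded points with the hidden point
# removal operator [20] (via Open3D), then rasterise binary bird's-eye view masks.
import numpy as np
import open3d as o3d

def visible_and_hidden_bev(points_xyz, lidar_origin=(0.0, 0.0, 0.0),
                           grid_shape=(480, 960), resolution_m=0.05,
                           hpr_radius=1000.0):
    """points_xyz: (N, 3) array of already height-trimmed LIDAR points.
    hpr_radius should be large relative to the scene extent."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.asarray(points_xyz, dtype=np.float64))

    # Hidden point removal: indices of points visible from the sensor origin.
    _, visible_idx = pcd.hidden_point_removal(list(lidar_origin), hpr_radius)
    visible = np.zeros(len(points_xyz), dtype=bool)
    visible[np.asarray(visible_idx, dtype=int)] = True

    def rasterise(pts):
        """Project points onto a binary bird's-eye view grid (sensor at the centre)."""
        mask = np.zeros(grid_shape, dtype=np.uint8)
        rows = (pts[:, 0] / resolution_m).astype(int) + grid_shape[0] // 2
        cols = (pts[:, 1] / resolution_m).astype(int) + grid_shape[1] // 2
        keep = (rows >= 0) & (rows < grid_shape[0]) & (cols >= 0) & (cols < grid_shape[1])
        mask[rows[keep], cols[keep]] = 1
        return mask

    return rasterise(points_xyz[visible]), rasterise(points_xyz[~visible])
```

Intersecting the raw curb label mask with such visibility/obstacle masks would then give the separate masks for visible and occluded road boundaries described in the quote above.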

Network Architecture

  • Analysis of why a U-Net-like model cannot detect occluded curbs well: first, the network's limited receptive field, which is not big enough to capture context around large obstacles and estimate the position of the curbs behind them; and second, the lack of structure (the model-free formulation), which prevents the network from inferring the very thin curves of occluded road boundaries within an image.
  • Visible curbs are detected directly by a U-Net-style network;
  • Occluded curbs (i.e. curbs with no actual point features; the label only marks a point as belonging to an occluded curb) are handled with anchor lines (a set of prior lines): prior lines at 4 angles are defined, and the closest one is used to predict the target line.
  • How are these lines predicted in each grid cell? Lines in each grid cell are parameterised in a discrete-continuous form: first, fitted lines are assigned to one of four types of anchor lines, and second, offsets between the fitted and anchor lines are calculated. Anchor lines pass through the centre of a grid cell at different angles (22.5°, 67.5°, 112.5° and 157.5°). During fitting, lines are assigned to the closest anchor line. Once a fitted line is discretised, two continuous parameters are calculated: (1) an angle offset between the fitted line and the respective anchor line ($w^k_{i,j,gt}$), and (2) a distance from the centre of the cell to the fitted line ($\beta^k_{i,j,gt}$). As a result, we obtain 16 numbers for each grid cell, 4 numbers ($w$, $\beta$, and a class indicating whether a line is present) for each line category. A minimal encoding sketch is given after this list.
  • To increase the receptive field of the model, we added intra-layer convolutions [23] before the multi-scale parameter estimation layers. Traditional layer-by-layer convolutions are applied between feature maps, whereas intra-layer convolutions are slice-by-slice convolutions within a feature map. Hence, intra-layer convolutions propagate information across the whole image and can thereby capture spatial relationships over longer distances. For example, there is a strong correlation between the length of the occluded curbs and the size of the objects obstructing the view (ranging from 10-15 pixels for occlusions by traffic cones to 200-300 pixels for occlusions by several parked cars). A sketch of such a slice-by-slice convolution is given after this list.
  • Cross-entropy loss is used to classify whether a line is present, and smooth L1 loss is used to regress $w$ and $\beta$; a loss sketch is given after this list.
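A minimal sketch of the discrete-continuous encoding described above, assuming each fitted line in a cell is given by its orientation in [0°, 180°) and its perpendicular distance to the cell centre; the function names and the exact per-anchor target layout are my assumptions.

```python
# Sketch of the per-cell anchor-line encoding: assign a fitted line to the
# closest of the four anchor angles, then store the angle offset w and the
# centre-to-line distance beta as continuous residuals.
import numpy as np

ANCHOR_ANGLES_DEG = np.array([22.5, 67.5, 112.5, 157.5])

def encode_line(theta_deg, dist_to_centre):
    """theta_deg: orientation of the fitted line in [0, 180).
    dist_to_centre: perpendicular distance from the cell centre to the line."""
    # Angular difference on the half-circle (a line and its 180-degree flip are identical).
    diff = (theta_deg - ANCHOR_ANGLES_DEG + 90.0) % 180.0 - 90.0
    k = int(np.argmin(np.abs(diff)))   # index of the closest anchor line
    w = float(diff[k])                 # angle offset w.r.t. that anchor
    beta = float(dist_to_centre)       # distance offset from the cell centre
    return k, w, beta

def encode_cell(fitted_lines):
    """fitted_lines: list of (theta_deg, dist) tuples fitted within one grid cell.
    Returns a (4, 3) target holding [line present, w, beta] per anchor; the paper
    reports 4 numbers per anchor (16 per cell), so one field is not reproduced here."""
    target = np.zeros((4, 3), dtype=np.float32)
    for theta_deg, dist in fitted_lines:
        k, w, beta = encode_line(theta_deg, dist)
        target[k] = [1.0, w, beta]
    return target
```

Decoding at inference time would invert this: for each anchor with a high presence score, rotate the anchor by w and shift it by beta to recover the predicted line.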
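For the intra-layer (slice-by-slice) convolutions mentioned above, the sketch below shows a generic SCNN-style downward pass in PyTorch: each row of the feature map receives a convolved message from the previous row, so context can travel across the full height of the image. This is only a sketch in the spirit of [23], not the paper's exact layer.

```python
# Slice-by-slice ("intra-layer") convolution: rows are updated sequentially,
# letting features propagate across the whole map instead of only within a
# local receptive field.
import torch
import torch.nn as nn
import torch.nn.functional as F

class IntraLayerConvDown(nn.Module):
    def __init__(self, channels, kernel_w=9):
        super().__init__()
        # A 1 x kernel_w convolution applied to a single row (slice) at a time.
        self.conv = nn.Conv2d(channels, channels, kernel_size=(1, kernel_w),
                              padding=(0, kernel_w // 2), bias=False)

    def forward(self, x):                        # x: (B, C, H, W)
        rows = list(torch.split(x, 1, dim=2))    # H slices of shape (B, C, 1, W)
        for i in range(1, len(rows)):
            # Each row adds a message computed from the already-updated row above it.
            rows[i] = rows[i] + F.relu(self.conv(rows[i - 1]))
        return torch.cat(rows, dim=2)
```

Analogous passes in the other three directions (up, left, right) would be stacked so that every cell can see context from the whole image.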
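Finally, a minimal sketch of the loss just described, assuming the network outputs a presence logit plus (w, β) regressions for each of the four anchors in every grid cell; the tensor shapes and the masking of the regression terms are my assumptions.

```python
# Classification (is there a curb line on this anchor?) + regression (w, beta).
import torch
import torch.nn.functional as F

def curb_line_loss(pred_logit, pred_w, pred_beta,
                   gt_present, gt_w, gt_beta, reg_weight=1.0):
    """All tensors are shaped (B, 4, H_cells, W_cells); gt_present is a 0/1 float tensor."""
    # Cross-entropy on line presence per anchor.
    cls_loss = F.binary_cross_entropy_with_logits(pred_logit, gt_present)
    # Smooth L1 on the offsets, evaluated only where a ground-truth line exists.
    mask = gt_present > 0.5
    if mask.any():
        reg_loss = (F.smooth_l1_loss(pred_w[mask], gt_w[mask]) +
                    F.smooth_l1_loss(pred_beta[mask], gt_beta[mask]))
    else:
        reg_loss = pred_logit.new_zeros(())
    return cls_loss + reg_weight * reg_loss
```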

Post-processing

  • Temporal information is used, i.e. detections are tracked across consecutive frames. This has two benefits: filtering out false positives and tracking true positives.
  • VO: the relationship between curb lines in consecutive frames is represented by a rotation and translation matrix; using visual odometry (VO), the result of the previous frame is mapped into the current frame and then fused with the current frame's detections;
  • Filtering: we transform the last three output masks of detected road boundaries into a common reference frame attached to the current frame. Then we construct a histogram of the output mask size (480×960) by counting the number of overlapping pixels with a value greater than a threshold of 0.7 (determined experimentally). Presumably, if all three frames respond at the same location, the histogram count there is high and the pixel is kept; otherwise it is discarded.
  • Tracking: in the second step, we perform a similar procedure as outlined above. However, this time we consider the road boundary masks from the last three frames that were generated by the first step (as shown in Figure 9). By taking the union of these masks we track the detected road boundaries over time. Integrating temporal information helps to close gaps between boundary segments.
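A minimal sketch of the filtering step, assuming the last three 480×960 detection masks and 2D pixel-space affine transforms (rotation + translation from VO) that map each past frame into the current one; the voting rule (`min_votes`) is my assumption for "keep only locations supported by several frames".

```python
# Temporal filtering sketch: warp the last three detection masks into the
# current frame, count pixels above the 0.7 threshold, and keep only
# locations that are consistent across enough frames.
import numpy as np
import cv2

def filter_masks(masks, poses_to_current, threshold=0.7, min_votes=2):
    """masks: list of three float arrays (480, 960) with values in [0, 1].
    poses_to_current: list of 2x3 float affine matrices (pixel-space rotation +
    translation) mapping each past frame into the current one."""
    h, w = masks[0].shape
    votes = np.zeros((h, w), dtype=np.int32)
    for mask, pose in zip(masks, poses_to_current):
        warped = cv2.warpAffine(mask.astype(np.float32), pose, (w, h))
        votes += (warped > threshold).astype(np.int32)
    # Keep only road-boundary pixels that are supported over time.
    return (votes >= min_votes).astype(np.uint8)
```

The tracking step would then repeat the warping but take the union of the filtered masks of the last three frames, which is what closes gaps between boundary segments.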

3. Experimental Support

Notes on the analysis and conclusions of key experiments, plus any inspiring experiments and findings.

  • Summary performance figure: comparison between visible-curb and occluded-curb detection
  • Effect of adding the post-processing step

4. Summary and Takeaways

A summary of, and extended reflection on, the central idea and the experimental conclusions.
Extended thinking: re-explain the paper with one's own prior knowledge or in one's own plain words (the essence of the Feynman learning method: easier to remember and to apply by analogy).

  • The approach of directly predicting curb lines.
  • Using anchor lines to predict occluded lines.
  • The line-fitting pipeline: FAST corners [16] → BRIEF descriptors [17] → RANSAC [18] for outlier rejection, followed by nonlinear least-squares refinement.
  • The idea of splitting curbs into visible and occluded categories is a good one.
  • Predicting visible and occluded curbs separately is also worth borrowing.

5. Related Literature

The most closely related and key references.
