Apollo Perception Module Study Notes [ADS]

https://github.com/ApolloAuto/apollo/blob/master/docs/specs/perception_apollo_2.5.md

Coordinate systems: vehicle frame origin at the rear-axle center; the HD map uses the Local Frame with East-North-Up coordinates.


Apollo 2.5

Introduction

Apollo 2.5 aims for Level-2 autonomous driving with low-cost sensors. An autonomous vehicle stays in its lane and keeps a distance from the closest in-path vehicle (CIPV) using a single front-facing camera and a front radar. Apollo 2.5 supports high-speed autonomous driving on highways without any map. The deep network is trained to process image data, and its performance will improve over time as more data is collected.

Safety alert

Apollo 2.5 does not support high-curvature roads or roads without lane markings, including local roads and intersections. The perception module is based on visual detection using a deep network trained with limited data. Therefore, until a better network is released, the driver should drive carefully and always be ready to disengage autonomous driving by steering in the appropriate direction. Please perform test drives in a safe, restricted area.

  • Recommended road

    • Road with clear white lane lines on both sides (lane lines on only one side are not supported!)
  • Avoid

    • High curvature road
    • Road without lane line marks
    • Intersections
    • Botts' dots or dotted lane lines
    • Public road

Perception modules

The flow chart of each module is shown below.


Online calibration: dynamically adjusted using vanishing-point information, etc.? TBD; see the pose calibration in Apollo 3.0. -- Estimates the camera extrinsics.

Tailgating -- following -- marks the trajectory of the leading vehicle! Used so that when lane lines are lost, the ego vehicle can follow the leading vehicle's trajectory, e.g., during TJA (traffic jam assist).

Figure 1: Flow diagram of lane keeping system

Deep network

Lane line and vehicle detection: using two separate CNNs performs better but costs more resources and processing time. -- Currently the most widely used approach.

Merging them into one co-trained network costs fewer resources, at some compromise in performance.

Apollo 2.5 uses YOLO for lane detection (LD) and object detection (OD).

Object classification output: type (vehicle, truck, cyclist, pedestrian), a 2D box, plus orientation.

The deep network ingests an image and provides two detection outputs for Apollo 2.5: lane lines and objects. There is an ongoing debate on individual-task versus co-trained multi-task networks. Individual networks such as a lane detection network or an object detection network usually perform better than one co-trained multi-task network. However, with limited resources, multiple individual networks are costly and consume more processing time. Therefore, for an economical design, co-training is inevitable, with some compromise in performance. In Apollo 2.5, YOLO [1][2] was used as the base network for object and lane detection. Objects belong to the vehicle, truck, cyclist, and pedestrian categories and are represented by a 2-D bounding box with orientation information. Lane lines are detected by segmentation using the same network with some modification.
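As a rough illustration of this co-trained design, the sketch below (PyTorch, with made-up layer sizes and head layouts, not the actual Apollo YOLO configuration) shows one shared backbone feeding both an object-detection head with class and orientation channels and a lane-line segmentation head.

```python
# Minimal sketch (not the actual Apollo network): one shared backbone with two
# heads, a YOLO-style object-detection head and a lane-line segmentation head.
import torch
import torch.nn as nn

class CoTrainedPerceptionNet(nn.Module):
    def __init__(self, num_anchors=5, num_classes=4):  # vehicle, truck, cyclist, pedestrian
        super().__init__()
        # Shared convolutional backbone (sizes are illustrative only).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Detection head: per anchor, box (4) + objectness (1) + classes + orientation (1).
        self.det_head = nn.Conv2d(64, num_anchors * (4 + 1 + num_classes + 1), 1)
        # Lane head: per-pixel heatmap, later thresholded into a binary segmentation.
        self.lane_head = nn.Conv2d(64, 1, 1)

    def forward(self, image):
        features = self.backbone(image)
        return self.det_head(features), torch.sigmoid(self.lane_head(features))

net = CoTrainedPerceptionNet()
detections, lane_heatmap = net(torch.zeros(1, 3, 416, 416))
```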

Network optimization

In the literature, there are multiple approaches to network optimization for real-time processing of high-framerate images. Rather than using 32-bit floats, a network with INT8 precision was implemented to achieve real-time performance. TensorRT may be used to optimize the network.
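The gist of the INT8 idea can be shown with a toy symmetric post-training quantization of a weight tensor; real deployment would rely on TensorRT calibration, which is not reproduced here.

```python
# Toy illustration of symmetric INT8 quantization of a weight tensor.
# It only shows the idea of trading 32-bit floats for 8-bit integers plus
# a scale factor; TensorRT's calibration machinery is not shown.
import numpy as np

def quantize_int8(w):
    scale = np.abs(w).max() / 127.0                      # map the largest magnitude to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale                  # approximate reconstruction

w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", np.abs(dequantize(q, s) - w).max())
```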

Object detection/tracking

In a traffic scene, there are two kinds of objects: stationary objects and dynamic objects. Stationary objects include lane lines, traffic lights, and thousands of traffic signs written in different languages. Apart from those used for driving, there are multiple landmarks on the road used mostly for visual localization, including streetlamps, barriers, bridges over the road, or any skyline. Among stationary objects, Apollo 2.5 detects only lane lines. (Curbs, road boundaries, etc. are not supported.)

Among dynamic objects, we care about passenger vehicles, trucks, cyclists, pedestrians, and any other object on the road, including animals or body parts. We can also categorize objects based on which lane they are in. The most important object is the CIPV (closest in-path vehicle). The next most important objects are those in neighboring lanes.

2D-to-3D bounding box   TBD 

Given a 2D box, with its 3D size and orientation in the camera, this module searches the 3D position in the camera coordinate system and estimates an accurate 3D distance using either the width, the height, or the 2D area of that 2D box. The module works without accurate extrinsic camera parameters.

Note: the width and height used here are known prior information for the object class.
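A minimal pinhole-model sketch of the height-based distance estimate follows; it assumes a known class-prior height and a known focal length, and omits the 3D position/orientation search that the module performs.

```python
# Minimal pinhole-model sketch of 2D-to-3D distance estimation.
# Assumes a known class prior for physical size; the real module also
# searches over 3D position/orientation, which is omitted here.
def distance_from_box_height(focal_px, box_height_px, object_height_m):
    # Similar triangles: h_pixels / f = H_meters / Z  =>  Z = f * H / h
    return focal_px * object_height_m / box_height_px

# Example: 1400 px focal length, a 2D box 70 px tall, a ~1.5 m tall car prior.
z = distance_from_box_height(focal_px=1400.0, box_height_px=70.0, object_height_m=1.5)
print(f"estimated longitudinal distance: {z:.1f} m")  # -> 30.0 m
```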

Object tracking

The object tracking module utilizes multiple cues such as 3D position, 2D image patches, 2D boxes, and deep-learning ROI features. The tracking problem is formulated as multiple-hypothesis data association, combining the cues efficiently to provide the most correct association between tracks and detected objects, thus obtaining the correct ID association for each object.
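A simplified, single-frame sketch of cue-based association is given below: a cost matrix built from 3D distance and appearance similarity (weights and gating threshold are illustrative) solved with the Hungarian algorithm. The actual multiple-hypothesis association is more involved.

```python
# Simplified data-association sketch: combine cues into one cost matrix and
# solve a single-frame assignment. Apollo's multiple-hypothesis association
# is richer; weights and gating threshold below are illustrative only.
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks_xyz, dets_xyz, appearance_sim, w_pos=1.0, w_app=2.0, gate=5.0):
    # Positional cost: Euclidean distance between predicted track and detection.
    pos_cost = np.linalg.norm(tracks_xyz[:, None, :] - dets_xyz[None, :, :], axis=-1)
    cost = w_pos * pos_cost + w_app * (1.0 - appearance_sim)
    rows, cols = linear_sum_assignment(cost)
    # Keep only pairs that pass the gating threshold.
    return [(r, c) for r, c in zip(rows, cols) if pos_cost[r, c] < gate]

tracks = np.array([[10.0, 0.0, 0.0], [25.0, 3.5, 0.0]])
dets = np.array([[10.5, 0.2, 0.0], [40.0, -3.5, 0.0]])
sim = np.array([[0.9, 0.1], [0.2, 0.3]])   # e.g., cosine similarity of ROI features
print(associate(tracks, dets, sim))         # -> [(0, 0)]
```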

Lane detection/tracking

Among static objects, we will handle lane lines only in Apollo 2.5. The lane is for both longitudinal and lateral control. A lane itself guides lateral control and an object in the lane guides longitudinal control.

Lane lines

The lane can be represented by multiple sets of polylines, such as the next left lane line, left line, right line, and next right line. Given a heatmap of lane lines from the deep network, a segmented binary image is generated by thresholding. The method first finds the connected components and detects the inner contours. It then generates lane marker points from the contour edges in the ground space of the ego-vehicle coordinate system. After that, it associates these lane markers into several lane line objects with the corresponding relative spatial labels (e.g., left (L0), right (R0), next left (L1), next right (R1)).
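A condensed OpenCV sketch of this post-processing chain is shown below; the inverse-perspective projection into the ego ground frame and the L1/L0/R0/R1 association are only stubbed.

```python
# Condensed sketch of the lane-line post-processing described above:
# threshold the heatmap, find connected components and contours, and emit
# marker points. Projection to the ego ground frame is stubbed out.
import cv2
import numpy as np

def image_to_ground(u, v):
    # Placeholder for the inverse-perspective projection into the
    # ego-vehicle ground plane using camera intrinsics/extrinsics.
    return float(u), float(v)

def extract_lane_markers(heatmap, thresh=0.5):
    binary = (heatmap > thresh).astype(np.uint8)
    num_labels, labels = cv2.connectedComponents(binary)
    markers = []
    for label in range(1, num_labels):                   # label 0 is background
        component = (labels == label).astype(np.uint8)
        contours, _ = cv2.findContours(component, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
        for contour in contours:
            for (u, v) in contour.reshape(-1, 2):
                markers.append(image_to_ground(u, v))    # project each contour edge point
    return markers                                       # later associated into L1/L0/R0/R1
```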

 

CIPV (Closest-In Path Vehicle)

The CIPV is the closest vehicle in our ego lane. An object is represented by a 3D bounding box, and its 2D projection from the top-down view localizes the object on the ground. Each object is then checked to see whether it is in the ego lane. Among the objects in the ego lane, the closest one is selected as the CIPV.
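A minimal sketch of that selection step, assuming helper callables that give the ego-lane-line lateral offsets at a given longitudinal distance:

```python
# Minimal CIPV-selection sketch: keep objects whose ground-projected center
# falls inside the ego lane, then take the nearest one ahead of the vehicle.
def select_cipv(objects, left_line, right_line):
    """objects: list of dicts with 'x' (longitudinal, m) and 'y' (lateral, m).
    left_line/right_line: callables giving lane-line lateral offset at x."""
    in_lane = [o for o in objects
               if o["x"] > 0.0 and right_line(o["x"]) < o["y"] < left_line(o["x"])]
    return min(in_lane, key=lambda o: o["x"]) if in_lane else None

# Example with straight lane lines +/- 1.75 m from the ego center line.
objs = [{"id": 1, "x": 30.0, "y": 0.3}, {"id": 2, "x": 18.0, "y": -0.5},
        {"id": 3, "x": 12.0, "y": 4.0}]   # the last one is in the next lane, ignored
cipv = select_cipv(objs, left_line=lambda x: 1.75, right_line=lambda x: -1.75)
print(cipv["id"])   # -> 2
```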

Radar + camera fusion (asynchronous) -- unclear whether OOSM (out-of-sequence measurements) handling is used

Given multiple sensors, their outputs should be combined in a synergistic fashion. Apollo 2.5 introduces a sensor set with a radar and a camera. For this process, both sensors need to be calibrated; each is calibrated using the same method introduced in Apollo 2.0. After calibration, the outputs are represented in 3-D world coordinates and fused by their similarity in location, size, and time, and by the utility of each sensor. After learning the utility function of each sensor, the camera contributes more to the lateral distance and the radar contributes more to the longitudinal distance measurement.

Velocity information is not mentioned; by convention, in a 1R1V (one radar, one camera) setup the radar dominates the velocity weighting.
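A toy sketch of the weighted combination idea follows; the weights are illustrative placeholders, not Apollo's learned utility functions, and the radar-dominant velocity weighting reflects the assumption in the note above.

```python
# Toy sensor-fusion sketch: weighted combination of an associated camera
# measurement and radar measurement. The weights below are illustrative,
# not Apollo's learned utility functions.
def fuse(camera_meas, radar_meas,
         w_lat_cam=0.8, w_lon_cam=0.2, w_vel_cam=0.1):
    return {
        # Camera contributes more to lateral position...
        "y":  w_lat_cam * camera_meas["y"] + (1 - w_lat_cam) * radar_meas["y"],
        # ...radar contributes more to longitudinal distance and velocity.
        "x":  w_lon_cam * camera_meas["x"] + (1 - w_lon_cam) * radar_meas["x"],
        "vx": w_vel_cam * camera_meas["vx"] + (1 - w_vel_cam) * radar_meas["vx"],
    }

print(fuse({"x": 31.0, "y": 0.4, "vx": 2.5}, {"x": 30.2, "y": 0.9, "vx": 2.1}))
```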

Virtual lane

All lane detection results are combined spatially and temporally to produce the virtual lane, which is fed to planning and control. Some lane lines may be incorrect or missing in a given frame. To provide smooth lane line output, a history of lane lines based on vehicle odometry is used. As the vehicle moves, the odometry of each frame is saved, and lane lines from previous frames are also saved in the history buffer. A detected lane line that does not match the history lane lines is removed; the history output replaces it and is provided to the planning module.

How?? TBD. For a lane line object in the history, if the detection at the next time step does not match the historical value, it is removed. Odometry measures how far the vehicle has traveled (corresponding to the lane line length).
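A rough sketch of that history check, with a simplified rigid-transform odometry compensation and an illustrative lateral-offset threshold:

```python
# Rough sketch of the virtual-lane history check: move buffered lane points
# into the current ego frame using odometry, then keep a new detection only
# if it stays close to the history. Thresholds are illustrative.
import numpy as np

def to_current_frame(points, dx, dy, dyaw):
    # Rigid transform of old ego-frame points given ego motion (dx, dy, dyaw).
    c, s = np.cos(-dyaw), np.sin(-dyaw)
    shifted = points - np.array([dx, dy])
    return shifted @ np.array([[c, -s], [s, c]]).T

def accept_detection(new_line, history_line, dx, dy, dyaw, max_offset=0.5):
    hist = to_current_frame(history_line, dx, dy, dyaw)
    # Compare lateral offsets at matching longitudinal samples.
    offset = np.abs(np.interp(new_line[:, 0], hist[:, 0], hist[:, 1]) - new_line[:, 1])
    return bool(offset.mean() < max_offset)   # otherwise the history replaces it

history = np.column_stack([np.linspace(0, 40, 9), np.full(9, 1.75)])
new_det = np.column_stack([np.linspace(0, 40, 9), np.full(9, 1.6)])
print(accept_detection(new_det, history, dx=1.0, dy=0.0, dyaw=0.0))   # -> True
```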

Output of perception

The input of PnC (planning and control) will be quite different from that of the previous lidar-based system.

  • Lane line output

    • Polyline and/or a polynomial curve
    • Lane type by position: L1(next left lane line), L0(left lane line), R0(right lane line), R1(next right lane line)
  • Object output

    • 3D rectangular cuboid
    • Relative velocity and direction
    • Type: CIPV, PIHP, others
    • Classification type: car, truck, bike, pedestrian

The world coordinates will be 3-D ego coordinates with the rear-axle center as the origin.
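A hedged sketch of what this output could look like as plain data structures is given below; the field names are for illustration only, since the real interface is a set of Apollo protobuf messages.

```python
# Illustrative data structures for the perception output listed above.
# Field names are for this sketch only; Apollo's real interface is a set
# of protobuf messages.
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class LaneLine:
    position: str                     # "L1", "L0", "R0", "R1"
    poly_coeffs: Tuple[float, ...]    # polynomial curve in the ego frame
    points: List[Tuple[float, float]] = field(default_factory=list)  # polyline

@dataclass
class PerceivedObject:
    object_type: str                  # "car", "truck", "bike", "pedestrian"
    role: str                         # "CIPV", "PIHP", "other"
    cuboid_center: Tuple[float, float, float]   # ego frame, rear-axle origin
    cuboid_size: Tuple[float, float, float]     # length, width, height (m)
    heading: float                    # yaw in the ego frame (rad)
    velocity: Tuple[float, float]     # relative velocity (m/s)
```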

References

[1] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," CVPR 2016.

[2] J. Redmon, A. Farhadi, "YOLO9000: Better, Faster, Stronger," arXiv preprint.


Apollo 3.0

https://github.com/ApolloAuto/apollo/blob/master/docs/specs/perception_apollo_3.0_cn.md


+ When lane lines are unreliable, the leading vehicle's trajectory can be followed (tailgating), e.g., for TJA.

+ Apollo 2.5 fused only radar-and-camera (RV) information; 3.0 fuses Lidar, Radar, and Camera.

+Online pose estimation: This new feature estimates the pose of an ego-vehicle for every single frame. This feature helps to drive through bumps or slopes on the road with more accurate 3D scene understanding.

Replaces the online calibration of 2.5. -- Its actual function is to better estimate the ego vehicle's pose (via IMU?) and feed the result (= the extrinsics change!) to the camera distance estimation module.

Ultrasonic sensors: Perception in Apollo 3.0 now works with ultrasonic sensors. The output can be used for Automatic Emergency Braking (AEB) (disappointing -- only low-speed AEB!) and vertical/perpendicular parking.

 

Whole lane line: Unlike the previous lane line segments, the whole-lane-line feature provides more accurate, longer-range detection of lane lines. TBD??? For whole lane lines there is a separate network that provides longer lane lines, whether the lines are broken or solid. There are two types of lane lines: lane mark segments and whole lane lines. Lane mark segments are used for visual localization, and whole lane lines are used to keep the vehicle in its lane.

 

Visual localization: Cameras are currently being tested to aid and enhance localization. TBD???

16 beam LiDAR support

Additions to the output:

  • Drops: the trajectory of an object -- presumably the tailgating trajectory (like a snail trail)

Tailgating is a maneuver to follow the vehicle or object in front of the autonomous car. From the tracked objects and the ego-vehicle motion, the trajectories of objects are estimated. These trajectories indicate how the objects are moving as a group on the road, and future trajectories can be predicted. There are two kinds of tailgating: pure tailgating, which follows a specific car, and CIPV-guided tailgating, in which the ego vehicle follows the CIPV's trajectory when no lane line is detected.
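A small sketch of CIPV-guided tailgating under these assumptions: buffer the CIPV's past positions as drops, re-express them in the current ego frame each cycle using ego motion, and use the buffered trajectory as the path when lane lines are missing.

```python
# Small sketch of CIPV-guided tailgating: keep a buffer of the CIPV's past
# positions ("drops"), shift them into the current ego frame each cycle, and
# use the buffered trajectory as the path when lane lines are missing.
import numpy as np
from collections import deque

class TailgatingTrajectory:
    def __init__(self, max_drops=50):
        self.drops = deque(maxlen=max_drops)     # (x, y) in the current ego frame

    def update(self, ego_motion, cipv_xy):
        dx, dy, dyaw = ego_motion
        c, s = np.cos(-dyaw), np.sin(-dyaw)
        # Re-express previously stored drops in the new ego frame.
        self.drops = deque(
            (((x - dx) * c - (y - dy) * s, (x - dx) * s + (y - dy) * c)
             for x, y in self.drops),
            maxlen=self.drops.maxlen)
        if cipv_xy is not None:
            self.drops.append(cipv_xy)           # newest drop from the tracked CIPV

    def path(self):
        return list(self.drops)                  # fed to planning when lanes are lost

traj = TailgatingTrajectory()
traj.update((0.5, 0.0, 0.0), (20.0, 0.2))
traj.update((0.6, 0.0, 0.0), (20.5, 0.1))
print(traj.path())
```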

A snapshot of the output visualization is shown in the figure below.

The figure above depicts the visualization of the perception output in Apollo 3.0. The top-left image shows the image-based output. The bottom-left image shows the 3D bounding boxes of objects. The left image shows a 3-D top-down view of lane lines and objects. The CIPV is marked with a red bounding box. The yellow lines depict the trajectory of each vehicle.


Apollo 5.0

https://github.com/ApolloAuto/apollo/blob/master/docs/specs/perception_apollo_5.0.md

Supports Caffe and PaddlePaddle

  • Online sensor calibration service

  • Manual camera calibration
  • Closest In-Path Object (CIPO) Detection
  • Vanishing Point Detection

Safety alert

Apollo 5.0 does not support high-curvature roads or roads without lane lines, including local roads and intersections.

Perception module

Quite a lot has changed compared with 3.0:

+ lidar, + traffic lights, + adjusted output of the online calibration (2D-3D), and fusion is now configurable.

Still missing: light-source detection and speed-limit sign detection.

The fusion appears to be track-to-track fusion.


Supports PaddlePaddle

The Apollo platform's perception module has so far depended on Caffe for its modeling, but now also supports PaddlePaddle, an open-source deep learning platform developed by Baidu. Some features include:

  • PCNNSeg: Object detection from a 128-channel lidar or a fusion of three 16-channel lidars using PaddlePaddle

  • PCameraDetector: Object detection from a camera

  • PlaneDetector: Lane line detection from a camera

Using PaddlePaddle Features

  1. To use the PaddlePaddle model for Camera Obstacle Detector, set camera_obstacle_perception_conf_file to obstacle_paddle.pt in the following configuration file

  2. To use the PaddlePaddle model for LiDAR Obstacle Detector, set use_paddle to true in the following configuration file

Manual Camera Calibration -- a verification tool for extrinsic calibration.

In Apollo 5.0, Perception launched a manual camera calibration tool for camera extrinsic parameters. This tool is simple, reliable, and user-friendly. It comes with a visualizer, and the calibration can be performed using the keyboard. It helps estimate the camera's orientation (pitch, yaw, roll) and provides a vanishing point, horizon, and top-down view as guidelines. Users change the three angles to align the horizon and make the lane lines parallel.
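As a sketch of the guideline math, the vanishing point of straight, parallel lane lines is the projection of the forward ground direction through the camera rotation; the intrinsics, angle conventions, and signs below are assumptions, not the tool's actual implementation.

```python
# Sketch of the guideline computation behind manual calibration: the vanishing
# point of straight, parallel lane lines is the projection of the forward ground
# direction through the camera rotation. Intrinsics and angle conventions are
# assumptions for this sketch only.
import numpy as np

def rotation_from_euler(pitch, yaw, roll):
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    cr, sr = np.cos(roll), np.sin(roll)
    Rx = np.array([[1, 0, 0], [0, cp, -sp], [0, sp, cp]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

K = np.array([[1400.0, 0.0, 960.0],        # assumed intrinsics (fx, fy, cx, cy)
              [0.0, 1400.0, 540.0],
              [0.0, 0.0, 1.0]])

def vanishing_point(pitch, yaw, roll):
    forward = np.array([0.0, 0.0, 1.0])    # lane direction in camera axes at zero angles
    ray = rotation_from_euler(pitch, yaw, roll) @ forward
    uvw = K @ ray                          # point at infinity projects to K * R * d
    return uvw[:2] / uvw[2]                # adjust the angles until it sits on the horizon

print(vanishing_point(pitch=0.02, yaw=-0.01, roll=0.0))
```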

The process of manual calibration can be seen below:

Closest In-Path Object (CIPO) Detection -- exact purpose unclear to me. RGF?

The CIPO includes detection of key objects on the road for longitudinal control. It utilizes the object and ego-lane line detection output. It creates a virtual ego lane line using the vehicle's ego motion prediction. Any vehicle model including Sphere model, Bicycle model and 4-wheel tire model can be used for the ego motion prediction. Based on the vehicle model using the translation of velocity and angular velocity, the length and curvature of the pseudo lanes are determined. Some examples of CIPO using Pseudo lane lines can be seen below:

  1. CIPO used for curved roads 

  2. CIPO for a street with no lane lines 
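The pseudo-lane construction described above can be sketched as follows, assuming a bicycle-model style relation in which the path curvature is yaw rate divided by speed; the lane half-width and preview length are illustrative.

```python
# Minimal sketch of a pseudo ego lane built from ego motion: with speed v and
# yaw rate w, the path curvature is roughly kappa = w / v, and the pseudo lane
# lines are lateral offsets of that arc. Numbers are illustrative only.
import numpy as np

def pseudo_lane(v, yaw_rate, half_width=1.75, length=40.0, step=1.0):
    kappa = yaw_rate / max(v, 0.1)                # avoid division by zero at low speed
    s = np.arange(0.0, length, step)               # arc-length samples along the path
    heading = kappa * s
    if abs(kappa) > 1e-6:
        x = np.sin(heading) / kappa
        y = (1 - np.cos(heading)) / kappa
    else:                                          # straight-ahead case
        x, y = s, np.zeros_like(s)
    # Offset the center path to the left/right to get pseudo lane lines.
    left = np.column_stack([x - half_width * np.sin(heading),
                            y + half_width * np.cos(heading)])
    right = np.column_stack([x + half_width * np.sin(heading),
                             y - half_width * np.cos(heading)])
    return left, right

left, right = pseudo_lane(v=15.0, yaw_rate=0.05)   # gentle curve at 15 m/s
```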

Vanishing Point Detection -- TBC -- uses a CNN to find the vanishing point in the image

In Apollo 5.0, an additional branch of the network is attached to the end of the lane encoder to detect the vanishing point. This branch is composed of convolutional layers and fully connected layers: the convolutional layers translate lane features for the vanishing point task, and the fully connected layers make a global summary of the whole image to output the vanishing point location. Instead of giving the output directly in xy coordinates, the vanishing point output is in the form (dx, dy), its distance from the image center in xy coordinates. This new branch is trained separately using pre-trained lane features directly, with the model weights of the lane line network fixed. The flow diagram is included below; note that red denotes the flow of the vanishing point detection algorithm.

Two challenging visual examples of our vanishing point detection with lane network output are shown below:

  1. Illustrates the case where the vanishing point can be detected even when an obstacle blocks the view:

  2. Illustrates the case of a turning road with altitude changes:

Key Features

  • Regressing to (dx, dy) rather than (x, y) reduces the search space
  • An additional convolution layer is needed for feature translation, casting CNN features for the vanishing point task
  • A fully connected layer is applied for a holistic spatial summary of information, which is required for vanishing point estimation
  • The branch design supports diverse training strategies, e.g., fine-tuning the pre-trained lane line model, training only the subnet with direct use of lane line features, or co-training the multi-task network
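A hedged PyTorch sketch of such a branch is shown below; the feature-map size, channel counts, and layer widths are assumptions, not the actual Apollo network.

```python
# Hedged sketch of the vanishing-point branch: conv layers translate the
# (frozen) lane features, FC layers summarize the whole map and regress
# (dx, dy) offsets from the image center. All sizes here are assumptions.
import torch
import torch.nn as nn

class VanishingPointBranch(nn.Module):
    def __init__(self, in_channels=128, feat_hw=(12, 20)):
        super().__init__()
        self.translate = nn.Sequential(               # cast lane features for the VP task
            nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(),
        )
        self.summary = nn.Sequential(                 # holistic spatial summary
            nn.Flatten(),
            nn.Linear(32 * feat_hw[0] * feat_hw[1], 256), nn.ReLU(),
            nn.Linear(256, 2),                        # (dx, dy) from the image center
        )

    def forward(self, lane_features):
        return self.summary(self.translate(lane_features))

branch = VanishingPointBranch()
dxdy = branch(torch.zeros(1, 128, 12, 20))            # lane features are frozen upstream
vp_xy = dxdy + torch.tensor([[960.0, 540.0]])         # convert offsets to pixel coordinates
```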

 

 

 
