[Paper Notes] Visual Odometry: Part II - Matching, Robustness, and Applications

C ．Lee

Category: SLAM. Tags: slam, computer vision

Article link: https://blog.csdn.net/weixin_44832149/article/details/105942096

1. Feature selection and matching

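There are two approaches to finding feature points and their correspondences. The first is feature detection followed by feature tracking: features are detected in one image and tracked in the following images with a local search. The second is feature detection followed by feature matching: features are detected independently in every image and matched by the similarity of their descriptors. The first approach suits sequences with small viewpoint changes; the second is better when large motion or viewpoint change is expected.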

A. Feature Detection

1) Overview

Feature detection. In VO, feature detectors generally look for corners and blobs, because the position of these features in the image can be measured accurately.

A corner is defined as a point at the intersection of two or more edges.
A blob is an image pattern that differs from its immediate neighborhood in terms of intensity, color, and texture. It is not an edge, nor a corner.
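Note that a blob is neither an edge nor a corner; it is a connected image region whose pixels share similar color, texture, or other properties. The representative information extracted from such a region is called a blob feature.

A good feature detector should have the following properties: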

localization accuracy (both in position and scale),
repeatability (i.e., a large number of features should be redetected in the next images),
computational efficiency,
robustness (to noise, compression artifacts, blur),
distinctiveness (so that features can be matched accurately across different images),
invariance (to both photometric changes [e.g., illumination] and geometric changes [rotation, scale (zoom), perspective distortion]).

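Corner detectors include Moravec, Forstner, Harris, Shi-Tomasi, and FAST.
Blob detectors include SIFT, SURF, and CENSURE.

The figure below compares the properties and performance of different feature detectors.
(Figure: properties and performance of different feature detectors)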

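2) Basic principle

Every feature detector consists of two stages:

First, a feature-response function is applied to the entire image, e.g., the corner response function in the Harris detector or the difference-of-Gaussian (DoG) operator in SIFT.
Second, non-maxima suppression is applied to the output of the first stage; the local extrema that survive are the detected features.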
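The two stages can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions that are not from the paper: gradients are central differences, the structure tensor is summed over an unweighted 3x3 window, and `k = 0.04` is a commonly used Harris constant. The function names `harris_response` and `non_maxima_suppression` are made up for this note.

```python
def harris_response(img, k=0.04):
    """Stage 1: Harris response R = det(M) - k * trace(M)^2 for every pixel.

    M is the structure tensor summed over a 3x3 window (assumption: no
    Gaussian weighting). Pixels within 2 px of the border are left at 0.
    """
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    resp = [[0.0] * w for _ in range(h)]
    for y in range(2, h - 2):
        for x in range(2, w - 2):
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp


def non_maxima_suppression(resp, threshold):
    """Stage 2: keep pixels that exceed the threshold and all 8 neighbours."""
    h, w = len(resp), len(resp[0])
    features = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            r = resp[y][x]
            if r <= threshold:
                continue
            neighbours = [resp[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            if all(r > n for n in neighbours):
                features.append((y, x))
    return features


# Synthetic test image: a bright 8x8 square on a dark 20x20 background.
img = [[1.0 if 6 <= y <= 13 and 6 <= x <= 13 else 0.0 for x in range(20)]
       for y in range(20)]
print(non_maxima_suppression(harris_response(img), threshold=0.1))
```

On this synthetic image the response is positive near the four corners of the square, negative along its edges, and zero in flat regions, so only the corner pixels survive the suppression step.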

B. Feature Descriptor

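The region around each detected feature is converted into a compact descriptor that can be matched against other descriptors.

The simplest descriptor is appearance, i.e., the intensities of all pixels in a patch around the feature point. This is often not good enough.
SIFT descriptor: a histogram of local gradient orientations. It can be computed on both blobs and corners, but its performance degrades on corners.
BRIEF descriptor: a binary descriptor.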
ORB
BRISK
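To make the idea of a binary descriptor concrete, here is a minimal BRIEF-style sketch. It is not the published BRIEF implementation: the real method smooths the patch first and uses a specific sampling pattern, whereas this sketch simply draws random pixel pairs from a seeded generator. The names `make_test_pairs`, `binary_descriptor`, and `hamming_distance` are hypothetical.

```python
import random


def make_test_pairs(patch_size, n_bits=64, seed=0):
    """Draw n_bits random pixel pairs inside a patch_size x patch_size patch."""
    rng = random.Random(seed)
    return [((rng.randrange(patch_size), rng.randrange(patch_size)),
             (rng.randrange(patch_size), rng.randrange(patch_size)))
            for _ in range(n_bits)]


def binary_descriptor(patch, pairs):
    """One bit per pair: 1 if the first pixel is darker than the second."""
    bits = 0
    for (y1, x1), (y2, x2) in pairs:
        bits = (bits << 1) | (1 if patch[y1][x1] < patch[y2][x2] else 0)
    return bits


def hamming_distance(d1, d2):
    """Number of differing bits; the usual distance for binary descriptors."""
    return bin(d1 ^ d2).count("1")


pairs = make_test_pairs(8)
patch = [[(7 * y + 3 * x) % 11 for x in range(8)] for y in range(8)]
brighter = [[v + 10 for v in row] for row in patch]
# A constant brightness offset leaves every intensity comparison unchanged.
print(hamming_distance(binary_descriptor(patch, pairs),
                       binary_descriptor(brighter, pairs)))
```

Because the descriptor only stores intensity comparisons, a uniform brightness change does not alter it, and comparing two descriptors reduces to an XOR and a bit count.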

C. Feature Matching

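1) Basic method

The simplest way to match features between two images is to compare every feature descriptor in the first image with every feature descriptor in the second image. The comparison uses either a similarity measure or a distance measure.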

If the descriptor is the local appearance of the feature, then a good measure is the SSD or the NCC.
For SIFT descriptors, this is the Euclidean distance.
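The three measures mentioned here are short enough to write out. The sketch below assumes patches and descriptors are given as flat lists of numbers; the function names are chosen for this note only.

```python
import math


def ssd(a, b):
    """Sum of squared differences: 0 for identical patches, larger is worse."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def ncc(a, b):
    """Normalized cross-correlation in [-1, 1]; 1 means a perfect match."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    den = math.sqrt(sum((p - mean_a) ** 2 for p in a)
                    * sum((q - mean_b) ** 2 for q in b))
    return num / den if den > 0 else 0.0


def euclidean(d1, d2):
    """Euclidean distance between two descriptor vectors (e.g., SIFT)."""
    return math.sqrt(ssd(d1, d2))


patch = [10.0, 20.0, 35.0, 40.0, 15.0, 5.0]
relit = [2.0 * v + 5.0 for v in patch]   # gain and offset change
print(ssd(patch, relit), ncc(patch, relit))
```

SSD is a distance and changes strongly under an illumination change, while NCC is a similarity that is unaffected by a gain and offset applied to one of the patches, which is why it is often the more robust choice for appearance patches.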

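2) Improvements

Mutual consistency check: a feature in the second image may match more than one feature in the first image. The mutual consistency check resolves this: only pairs of features that each have the other as their preferred match are accepted as correct.
Constrained matching: exhaustive matching is quadratic in the number of features, so it becomes very time-consuming when there are many features.
- An indexing structure, such as a multidimensional search tree or a hash table, can speed up the search.
- Potential correspondences can be searched for only in the region of the second image where they are predicted to be. These regions are predicted from a motion model and the estimated 3D feature position, as in 3D-to-2D motion estimation. The motion model can come from an additional sensor such as an IMU, or it can be a constant-velocity model.
- If only the motion model is known and the 3D feature position is not, epipolar matching can be used: as the figure below shows, the correspondence is searched for along the epipolar line in the second image. This is also how features are matched in stereo vision. In the monocular case, the epipolar line additionally has to be computed from the 2D feature and the relative camera motion.

(Figure: epipolar matching, searching for the correspondence along the epipolar line in the second image)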
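The mutual consistency check can be sketched as two brute-force nearest-neighbour passes followed by a cross-check. Squared Euclidean distance is assumed as the descriptor distance, and the names `nearest` and `mutual_matches` are hypothetical.

```python
def nearest(desc, candidates):
    """Index of the candidate descriptor closest to desc (squared L2)."""
    best_index, best_dist = -1, float("inf")
    for i, cand in enumerate(candidates):
        dist = sum((p - q) ** 2 for p, q in zip(desc, cand))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def mutual_matches(descs1, descs2):
    """Keep (i, j) only if j is i's best match AND i is j's best match."""
    forward = [nearest(d, descs2) for d in descs1]
    backward = [nearest(d, descs1) for d in descs2]
    return [(i, j) for i, j in enumerate(forward)
            if j >= 0 and backward[j] == i]


descs1 = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
descs2 = [[0.1, 0.0], [10.0, 10.5]]
print(mutual_matches(descs1, descs2))
```

In this toy example, features 0 and 1 of the first image both prefer feature 0 of the second image, but that feature prefers feature 0 in return, so the pair involving feature 1 is discarded.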

D. Feature Tracking

This detect-then-track approach is suitable for VO applications where images are taken at nearby locations, where the amount of motion and appearance deformation between adjacent frames is small. For this particular application, SSD and NCC can work well.
However, if features are tracked over long image sequences, their appearance can undergo larger changes. In this case, the solution is to apply an affine-distortion model to each feature. The resulting tracker is often called the Kanade-Lucas-Tomasi (KLT) tracker [12].

2. Outlier removal

Outliers are wrong data associations, i.e., mismatches. Possible causes are image noise, occlusions, blur, and changes in viewpoint and illumination that the mathematical model of the feature detector or descriptor does not account for.

Removing outliers is essential for improving the accuracy of camera motion estimation.

A. RANSAC

Outliers are usually removed by exploiting the geometric constraints introduced by the motion model.

1) Core idea

The idea behind RANSAC is to compute model hypotheses from randomly-sampled sets of data points and then verify these hypotheses on the other data points. The hypothesis that shows the highest consensus with the other data is selected as solution.

In short: hypothesize a model from a random sample, verify it against the remaining data, and keep the hypothesis with the largest consensus as the final model.
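The hypothesize-and-verify loop is easiest to see on a toy problem. The sketch below fits a 2D line rather than a camera motion, which is a substitution made for this note; the structure of the loop is the same. The function names, the seeded random generator, and the threshold value are all assumptions.

```python
import math
import random


def line_through(p, q):
    """Normalized line (a, b, c) with a*x + b*y + c = 0 through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0.0:
        return None  # degenerate sample: both points coincide
    return a / norm, b / norm, -(a * x1 + b * y1) / norm


def ransac_line(points, n_iterations, threshold, seed=0):
    """Return the line hypothesis with the largest consensus set."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iterations):
        p, q = rng.sample(points, 2)      # minimal sample: s = 2 points
        model = line_through(p, q)        # hypothesis
        if model is None:
            continue
        a, b, c = model
        inliers = [pt for pt in points    # verification on all data
                   if abs(a * pt[0] + b * pt[1] + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


true_points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outliers = [(2.0, 30.0), (5.0, -15.0), (8.0, 40.0), (11.0, 0.0),
            (14.0, 55.0), (17.0, 10.0), (3.0, -20.0), (16.0, 60.0)]
model, inliers = ransac_line(true_points + outliers, 200, threshold=0.1)
print(len(inliers), -model[0] / model[1])  # consensus size and slope
```

In VO the minimal sample would be a set of feature correspondences, the hypothesis a relative motion, and the point-to-line test an epipolar error.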

2) Application in VO

model hypotheses: the relative motion (R, t)
data points: candidate feature correspondences
Inlier points to a hypothesis are found by computing the point-to-epipolar-line distance (e.g., the Sampson distance) or the directional error.
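As an illustration of the inlier test, the following sketch evaluates the squared Sampson distance of a correspondence with respect to a 3x3 matrix `F` (a fundamental matrix, or an essential matrix when normalized image coordinates are used). The convention `x2^T F x1 = 0` and the function names are assumptions of this sketch.

```python
def mat_vec(m, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def sampson_distance(F, x1, x2):
    """Squared Sampson distance of the correspondence x1 <-> x2.

    x1, x2 are homogeneous points [x, y, 1]; the epipolar constraint is
    assumed to be x2^T F x1 = 0.
    """
    Ft = [list(col) for col in zip(*F)]
    Fx1 = mat_vec(F, x1)       # epipolar line of x1 in image 2
    Ftx2 = mat_vec(Ft, x2)     # epipolar line of x2 in image 1
    err = sum(x2[i] * Fx1[i] for i in range(3))
    denom = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    return err * err / denom if denom > 0 else float("inf")


# Essential matrix of a pure translation along x (R = I, t = (1, 0, 0)),
# so epipolar lines are horizontal in normalized coordinates.
E = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
x1 = [0.2, 0.1, 1.0]
print(sampson_distance(E, x1, [0.1, 0.1, 1.0]))  # same row: on the line
print(sampson_distance(E, x1, [0.1, 0.3, 1.0]))  # 0.2 off the line
```

A correspondence would be counted as an inlier when this value is below a chosen threshold.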

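(Figure: outline of the RANSAC algorithm)
The number of iterations $N$ is computed as
$N=\frac{\log (1-p)}{\log \left(1-(1-\varepsilon)^{s}\right)} \qquad (1)$
where $s$ is the number of data points from which the model is instantiated, $\varepsilon$ is the fraction of outliers among the data points, and $p$ is the required probability of success. In practice, for robustness, $N$ is usually multiplied by a factor of 10.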
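Equation (1) follows from a short argument: a random sample of $s$ points contains only inliers with probability $(1-\varepsilon)^{s}$, so all $N$ samples are contaminated with probability $\left(1-(1-\varepsilon)^{s}\right)^{N}$; setting this equal to $1-p$ and solving for $N$ gives (1). A small helper, named `ransac_iterations` for this note, evaluates it:

```python
import math


def ransac_iterations(s, outlier_fraction, p=0.99):
    """Number of RANSAC iterations N from equation (1), as a real number."""
    return math.log(1.0 - p) / math.log(1.0 - (1.0 - outlier_fraction) ** s)


for s in (8, 7, 6, 5, 4, 2, 1):
    print(s, round(ransac_iterations(s, 0.5), 1))
```

The function returns a real number; whether to round it to the nearest integer or up is left to the caller.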

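Note: what exactly is the "probability of success"? It is the probability that at least one of the $N$ random samples contains only inliers.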

B. Minimal Model Parameterizations: 8, 7, 6, 5, 4, 2, and 1-point RANSAC

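As the figure shows, $N$ grows exponentially with the number of data points $s$ needed to estimate the model. There is therefore great interest in minimal parameterizations of the model.
(Figure: number of RANSAC iterations $N$ for different numbers of points $s$)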

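The 5-point algorithm is the minimal solver for the 6-DOF motion of a calibrated camera.
The 4-, 3-, 2-, and 1-point methods require additional cues or constraints on the camera motion.
In Gao Xiang's "14 Lectures on Visual SLAM", the essential matrix is solved with the 8-point algorithm. This is a practical engineering choice, since the 5-point solver is considerably more complex.

In summary, if the camera motion is unconstrained, the minimum number of points needed to estimate the motion is five, so 5-point RANSAC (or 6-, 7-, or 8-point RANSAC) should be used. Of these, 5-point RANSAC needs the fewest iterations and therefore the least time.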

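The table below lists the minimum number of RANSAC iterations as a function of the number of model parameters $s$ for the 8-, 7-, 5-, 4-, 2-, and 1-point solvers. The values are obtained from equation (1), assuming a probability of success $p$ = 99% and an outlier fraction $\varepsilon$ = 50%.

(Table: minimum number of RANSAC iterations for each solver)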

C. Reducing the Iterations of RANSAC

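In practice there are usually many outliers, and more iterations than listed in the table above are often run in order to obtain more inliers. Much research has therefore been devoted to speeding up RANSAC, e.g., MLESAC, PROSAC, preemptive RANSAC, uncertainty RANSAC, and deterministic RANSAC; another approach is sampling the hypotheses from a proposal distribution of the vehicle motion model.

Of these variants, preemptive RANSAC is the most popular, because the number of iterations can be fixed in advance, which helps real-time operation.

Note: the title of this subsection does not quite match its content.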

D. Is it Really Better to Use a Minimal Set in RANSAC?

Not necessarily, especially when the image correspondences are very noisy.

If one is concerned with certain speed requirements, using a minimal point set is definitely better than using a non-minimal set. However, even the 5-point RANSAC might not be the best idea if the image correspondences are very noisy. In this case, using more points than a minimal set has been shown to give better performance (and more inliers).

3. Error propagation

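The figure illustrates how the uncertainty of the camera motion estimate propagates. The uncertainty of the transformation $T$ between adjacent frames has two sources: the camera geometry and the image features. As the relative transformations $T$ are accumulated into the absolute pose $C$, the uncertainty of $C$ keeps growing.
(Figure: propagation of uncertainty from the relative transformations $T$ to the camera pose $C$)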
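The growth of the pose uncertainty can be illustrated with first-order (Jacobian-based) covariance propagation. The sketch below makes a strong simplification that is not in the paper: poses are planar, $(x, y, \theta)$, instead of full 6-DOF, and the covariance of each relative motion is a made-up diagonal matrix. The function names are hypothetical.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def compose(pose, cov, motion, motion_cov):
    """Concatenate a relative motion onto a pose and propagate covariance.

    new_cov = J_pose * cov * J_pose^T + J_motion * motion_cov * J_motion^T
    """
    x, y, th = pose
    dx, dy, dth = motion
    c, s = math.cos(th), math.sin(th)
    new_pose = (x + c * dx - s * dy, y + s * dx + c * dy, th + dth)
    j_pose = [[1.0, 0.0, -s * dx - c * dy],
              [0.0, 1.0, c * dx - s * dy],
              [0.0, 0.0, 1.0]]
    j_motion = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    from_pose = matmul(matmul(j_pose, cov), transpose(j_pose))
    from_motion = matmul(matmul(j_motion, motion_cov), transpose(j_motion))
    new_cov = [[from_pose[i][j] + from_motion[i][j] for j in range(3)]
               for i in range(3)]
    return new_pose, new_cov


pose = (0.0, 0.0, 0.0)
cov = [[0.0] * 3 for _ in range(3)]
step = (1.0, 0.0, 0.0)                       # drive 1 unit straight ahead
step_cov = [[0.01, 0.0, 0.0], [0.0, 0.01, 0.0], [0.0, 0.0, 0.001]]
for k in range(2):
    pose, cov = compose(pose, cov, step, step_cov)
print(pose, [cov[i][i] for i in range(3)])
```

After two steps the lateral variance is already larger than twice the per-step value, because heading uncertainty from the first step leaks into position in the second.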

4. Camera pose optimization

VO computes the camera poses by concatenating the transformations, in most cases from two subsequent views at times $k$ and $k-1$ (see Part I of this tutorial). However, it might also be possible to compute the transformations between the current time $k$ and the $n$ last time steps $T_{k, k-2}, \ldots, T_{k, k-n}$, or even for any time step $T_{i, j}$. If these transformations are known, they can be used to improve the camera poses by using them as additional constraints in a pose-graph optimization.

A. Pose-Graph Optimization

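A sequence of camera poses can be represented as a pose graph, in which the nodes are camera poses and the edges are transformations between poses. The edge constraints $e_{ij}$ define the following cost function:
$\sum_{e_{i j}}\left\|C_{i}-T_{e_{i j}} C_{j}\right\|^{2} \qquad (4)$
where $T_{e_{i j}}$ is the transformation between poses $i$ and $j$, which need not be adjacent. In theory $C_{i}=T_{e_{i j}} C_{j}$, but in practice the estimates contain errors, so pose-graph optimization seeks the camera pose parameters that minimize the cost function. Because the cost function is nonlinear, a nonlinear optimization algorithm such as Levenberg-Marquardt (L-M) is used.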
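A toy version of cost function (4) shows how an extra edge redistributes error over the graph. The sketch below is deliberately reduced to one dimension, with poses as scalars and edges as measured offsets, so the problem becomes linear and a single Gauss-Newton step solves it exactly; the real problem needs an iterative solver such as Levenberg-Marquardt. The function names and the example measurements are invented for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def optimize_pose_graph(n_poses, edges):
    """Minimize sum over edges of (x_i - x_j - t_ij)^2 with x_0 fixed at 0.

    edges is a list of (i, j, t_ij), the 1-D analogue of C_i = T_ij C_j.
    """
    n = n_poses - 1                      # pose 0 is the fixed reference
    H = [[0.0] * n for _ in range(n)]    # normal-equation matrix J^T J
    g = [0.0] * n                        # gradient J^T r at x = 0
    for i, j, t in edges:
        r = -t                           # residual evaluated at x = 0
        for a, sa in ((i, 1.0), (j, -1.0)):
            if a == 0:
                continue
            g[a - 1] += sa * r
            for b, sb in ((i, 1.0), (j, -1.0)):
                if b != 0:
                    H[a - 1][b - 1] += sa * sb
    return [0.0] + solve_linear(H, [-v for v in g])


# Three drifting odometry edges (1.1 each) plus one edge measuring that
# pose 3 is only 3.0 away from pose 0.
edges = [(1, 0, 1.1), (2, 1, 1.1), (3, 2, 1.1), (3, 0, 3.0)]
poses = optimize_pose_graph(4, edges)
print([round(p, 3) for p in poses])
```

With equal weights, the 0.3 disagreement between the odometry chain and the direct edge is spread evenly over all four edges instead of accumulating at the last pose.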

1) Loop Constraints for Pose-Graph Optimization:

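Loop constraints can remove the error accumulated over a long trajectory; the details are not repeated here.

Appearance-based loop detection. After loop detection (i.e., selecting candidate loop-closing images), a geometric verification using the epipolar constraint is usually performed. Next, for the image pairs that pass verification, the transformation between the poses is computed from wide-baseline feature matches. Finally, this transformation is added to the pose graph as an additional loop constraint.

Note: my guess is that the geometric verification with the epipolar constraint uses epipolar matching and passes if there are enough inliers. This is pure speculation.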

B. Windowed (or Local) Bundle Adjustment

Local BA optimizes the 3D landmarks and the camera poses at the same time, over a window of the most recent frames.
