Loop closure detection paper translations: LDSO, Visual Place Recognition, and others

zhengbq_seu

Categories: SLAM algorithm notes, loop closure detection. Tags: slam

Article link: https://blog.csdn.net/zhengbq_seu/article/details/81873857

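This article looks at how LDSO (Direct Sparse Odometry with Loop Closure) performs loop closure detection and global optimization in dynamic environments. LDSO builds on DSO by favoring repeatable corner features, which makes loop closure detection more reliable. Loop candidates are verified geometrically, Sim(3) relative pose constraints are estimated and fused, and pose graph optimization then reduces drift. The article also compares loop closure approaches such as those in ORB-SLAM and LSD-SLAM, and covers the roles of pure image retrieval, topological maps, and topological-metric maps in visual localization. It emphasizes the challenges of describing and remembering places in changing environments, and how learning-based methods can adapt to environmental change and select invariant feature descriptors.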

LDSO——Direct Sparse Odometry with Loop Closure

Introduction (adds pose graph optimization and a DBoW vocabulary)

LOOP CLOSING IN DSO (important!)

A. Framework

B. Point Selection with Repeatable Features

Sections C and D: not read

Conclusion (including outlook)

Summary

A comparison of loop closing techniques in monocular SLAM

Introduction

The monocular SLAM system

Detecting loop closure

Map-to-map matching: Clemente et al

Image-to-image matching: Cummins et al.

Image-to-map matching: Williams et al

Some hybrid algorithms

Visual Place Recognition: A Survey

What is place？

Describing Places (local and global descriptors)

Remembering Places (incremental)

A. Pure Image Retrieval

B. Topological Maps

C. Topological-Metric Maps

Recognizing Places (recognition algorithms and their underlying principles)

Visual Place Recognition in Changing Environments (dynamic scenes)

A. Describing Places in Changing Environments

B. Remembering Places in Changing Environments

LDSO——Direct Sparse Odometry with Loop Closure
Introduction (adds pose graph optimization and a DBoW vocabulary)

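This introduction is worth borrowing from when writing a paper: it lays out the advantages of direct methods and the drawbacks of a SLAM system that has no loop closure detection.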

we present an extension of Direct Sparse Odometry (DSO) to a monocular visual SLAM system with loop closure detection and pose-graph optimization (LDSO). In other words, loop closure detection and pose graph optimization are added on top of DSO; loop closure and global map refinement are based on BoW and pose graph optimization.

LDSO retains this robustness, while at the same time ensuring repeatability of some of these points by favoring corner features in the tracking frontend. This repeatability allows to reliably detect loop closure candidates with a conventional feature-based bag-of-words (BoW) approach.
Loop closure candidates are verified geometrically and Sim(3) relative pose constraints are estimated by jointly minimizing 2D and 3D geometric error terms. These constraints are fused with a co-visibility graph of relative poses extracted from DSO’s sliding window optimization.

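In short: LDSO keeps DSO's robustness while favoring corner features in the tracking frontend, so that some of the selected points are repeatable. That repeatability is what lets a conventional feature-based bag-of-words (BoW) approach detect loop closure candidates reliably.
Loop closure candidates are then verified geometrically, and Sim(3) relative pose constraints are estimated by jointly minimizing 2D and 3D geometric error terms. These constraints are fused with the co-visibility graph of relative poses extracted from DSO's sliding-window optimization.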
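The BoW candidate search described above can be sketched roughly as follows. This is a minimal illustration, not LDSO's implementation: it assumes each keyframe's corner descriptors have already been quantized into visual word ids by some vocabulary (DBoW-style), and it scores pairs with the L1 similarity commonly used with such vocabularies. The function names, `min_score`, and `min_gap` are hypothetical.

```python
from collections import Counter


def bow_vector(word_ids):
    """Build an L1-normalized sparse BoW vector {word_id: weight}."""
    counts = Counter(word_ids)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def l1_score(v1, v2):
    """Similarity in [0, 1]: 1 - 0.5 * ||v1 - v2||_1 for L1-normalized vectors."""
    if not v1 or not v2:
        return 0.0
    words = set(v1) | set(v2)
    dist = sum(abs(v1.get(w, 0.0) - v2.get(w, 0.0)) for w in words)
    return 1.0 - 0.5 * dist


def loop_candidates(query, database, min_score=0.3, min_gap=2):
    """Return (keyframe_id, score) pairs sorted by descending score.

    database: list of (keyframe_id, bow_vector) in temporal order.
    min_gap:  number of most recent keyframes to skip, since neighbours
              in time look alike without being loop closures.
    """
    usable = database[:-min_gap] if min_gap > 0 else database
    scored = [(kf_id, l1_score(query, vec)) for kf_id, vec in usable]
    hits = [(kf_id, s) for kf_id, s in scored if s >= min_score]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Toy example with made-up word ids.
db = [
    (0, bow_vector([1, 2, 3, 3])),
    (1, bow_vector([7, 8, 9])),
    (2, bow_vector([4, 5])),
    (3, bow_vector([4, 5, 6])),
]
query = bow_vector([1, 2, 3, 4])
print(loop_candidates(query, db))
```

Candidates returned this way are only proposals; the geometric verification and Sim(3) estimation mentioned in the text would still have to accept or reject each one.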

The frontend may localize the camera globally against the current map [4], [5], track the camera locally with visual (keyframe) odometry (VO) [6], [7], or use a combination of both [8], [9], [10].

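A typical SLAM system: the frontend can localize the camera globally against the current map [4], [5] (ORB-SLAM, PTAM), track the camera locally with visual (keyframe) odometry (VO) [6], [7] (LSD-SLAM), or combine the two. (I did not fully follow this. Presumably the former gives global localization while the latter is only VO, and a combination may include both local and global optimization.)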

For example, in order to evaluate the photometric error, images of past keyframes would have to be kept in memory (this refers to DSO), and when incorporating measurements from previous keyframes, it is challenging to ensure estimator consistency, since information from these keyframes that is already contained in the marginalization prior should not be reused. We therefore propose to adapt DSO as our SLAM frontend to estimate visual odometry with local consistency and correct its drift with loop closure detection and pose graph optimization in the backend. Note that DSO itself consists also of a camera-tracking frontend and a backend that optimizes keyframes and point depths. (Loop closure detection and pose graph optimization are added on top of DSO.)
However, in this work we refer to the whole of DSO as our odometry frontend. (All of DSO is treated as the frontend, even though DSO itself has a frontend and a backend.)
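Why the photometric error forces past keyframe images to stay in memory can be seen from a simplified residual. The form below is a sketch that assumes brightness constancy and leaves out DSO's affine brightness parameters, robust norm, and pixel-neighbourhood weighting; the symbols are introduced here for illustration and do not come from the excerpt above.

$$
r_{\mathbf{p}} \;=\; I_j\!\left(\pi\!\left(\mathbf{T}_{ji}\,\pi^{-1}(\mathbf{p},\, d_{\mathbf{p}})\right)\right) \;-\; I_i(\mathbf{p})
$$

Here $I_i$ is the host keyframe image in which pixel $\mathbf{p}$ was selected, $I_j$ is the target keyframe image, $d_{\mathbf{p}}$ is the inverse depth of $\mathbf{p}$, $\pi$ and $\pi^{-1}$ are the camera projection and back-projection, and $\mathbf{T}_{ji} \in \mathrm{SE}(3)$ is the relative pose from frame $i$ to frame $j$. Evaluating $r_{\mathbf{p}}$ needs the raw intensities of both $I_i$ and $I_j$, whereas a relative pose constraint in a pose graph only needs the two keyframe poses and the measured transform between them.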

If we detect and match features independently from the frontend, we might not have depth estimates for those points, which we need to efficiently estimate Sim(3) pose-constraints, and if instead we attempt to reuse the points from the frontend and compute descriptors for those, they likely do not correspond to repeatable features and lead to poor loop closure detection.
The key insight here is that direct VO does not care about the repeatability of the selected (or tracked) pixels. Thus, direct VO systems have in the past been extended to SLAM either by using only keyframe proximity for loop closure detection [6] or by computing features for loop closure detection independently from frontend tracking and constraint computation [7]. Direct image alignment is then used to estimate relative pose constraints [6], [7], which requires images of keyframes to be kept available. We propose instead to gear point selection towards repeatable features and use geometric techniques to estimate constraints. In summary, our contributions are:

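In short: if features are detected and matched independently of the frontend, those points may have no depth estimates, and depth is needed to estimate Sim(3) pose constraints efficiently. If instead the frontend's points are reused and descriptors are computed for them, the points are unlikely to be repeatable features, which leads to poor loop closure detection.
The key insight is that direct VO does not care whether the selected (or tracked) pixels are repeatable. Direct VO systems have therefore been extended to SLAM either by using only keyframe proximity for loop closure detection [6], or by computing loop closure features independently of frontend tracking and constraint computation [7] (LSD-SLAM). Relative pose constraints are then estimated by direct image alignment [6], [7], which requires keeping the keyframe images available. The authors instead steer point selection towards repeatable features and use geometric techniques to estimate the constraints. Their contributions are listed next.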

1）We adapt DSO’s point selection strategy to favor repeatable corner features, while retaining its robustness against feature-poor environments. The selected corner features are then used for loop closure detection with conventional BoW.

2） We utilize the depth estimates of matched feature points to compute Sim(3) pose constraints with a combination of pose-only bundle adjustment and point cloud alignment, and — in parallel to the odometry frontend —fuse them with a co-visibility graph of relative poses extracted from DSO’s sliding window optimization.

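That is, the depth estimates of the matched feature points (??) are used to compute Sim(3) pose constraints by combining pose-only bundle adjustment with point cloud alignment; in parallel with the odometry frontend, these constraints are fused with the co-visibility graph of relative poses extracted from DSO's sliding-window optimization.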

3）We demonstrate on publicly available real-world datasets that the point selection retains the tracking frontend’s accuracy and robustness, and the pose graph optimization significantly reduces the odometry’s drift and results in overall performance comparable to state-of-the-art feature-based methods, even without global bundle adjustment.
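Since the contributions lean on Sim(3) relative pose constraints, a small sketch of what a Sim(3) element does may help. This is an illustration under assumptions, not code from LDSO: the class `Sim3`, the helper names, and the convention that a keyframe transform maps keyframe coordinates into the world frame are all hypothetical choices made for this example.

```python
import math


def mat_vec(R, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]


def mat_mul(A, B):
    """Multiply two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(R):
    return [[R[j][i] for j in range(3)] for i in range(3)]


def rot_z(angle):
    """Rotation about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


class Sim3:
    """Similarity transform x -> s * R x + t (rotation, translation, scale: 7 DoF)."""

    def __init__(self, s, R, t):
        self.s, self.R, self.t = s, R, t

    def apply(self, x):
        Rx = mat_vec(self.R, x)
        return [self.s * Rx[i] + self.t[i] for i in range(3)]

    def compose(self, other):
        """Return self * other: apply `other` first, then `self`."""
        # s1 R1 (s2 R2 x + t2) + t1 = (s1 s2)(R1 R2) x + (s1 R1 t2 + t1)
        return Sim3(self.s * other.s,
                    mat_mul(self.R, other.R),
                    self.apply(other.t))

    def inverse(self):
        # From y = s R x + t it follows that x = (1/s) R^T (y - t).
        Rt = transpose(self.R)
        Rt_t = mat_vec(Rt, self.t)
        return Sim3(1.0 / self.s, Rt, [-v / self.s for v in Rt_t])


def relative(S_i, S_j):
    """Relative constraint S_ij = S_i^{-1} * S_j (frame-j coordinates into frame i)."""
    return S_i.inverse().compose(S_j)


# Two keyframes that should describe the same place after a loop.
S_i = Sim3(1.0, rot_z(0.0), [0.0, 0.0, 0.0])
S_j = Sim3(2.0, rot_z(math.pi / 2), [1.0, 0.0, 0.0])
S_ij = relative(S_i, S_j)
print(S_ij.s)  # relative scale between the two keyframes
```

The point of the extra scale parameter is that monocular odometry drifts in scale as well as in rotation and translation, so a relative scale different from 1 between two keyframes of the same place is something an SE(3) constraint could not express.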

Related Work:

Similar to ORB-SLAM and our work, loop closure and global map refinement are based on BoW and pose graph optimization, but with help of the inertial sensors, it suffices to use non-rotation-invariant BRIEF descriptors and do pose graph optimization in 4 degrees-of-freedom. (That is, with an IMU, rotation-invariant descriptors are unnecessary and the pose graph needs only 4 DoF.) While in ORB-SLAM the feature extraction step costs almost half of the running time, the frontend tracking in VINS-Mono is based on KLT features and thus is capable of running in real-time on low-cost embedded systems. This however means, that for loop closure detection additional feature points and descriptors have to be computed for keyframes. (So loop closure detection requires extra feature points and descriptors to be computed for the keyframes.)

As a direct monocular SLAM system and predecessor of DSO, LSD-SLAM [7] employs FAB-MAP [15], an appearance-only loop detection algorithm (what exactly does this mean? Presumably loop detection that uses no topological or metric information), to propose candidates for large loop closures. However, FAB-MAP needs to extract its own features and cannot re-use any information from the VO frontend, and the constraint computation in turn does not re-use the feature matches, but relies on direct image alignment using the semi-dense depth maps of candidate frames in both directions and a statistical test to verify the validity of the loop closure, which also means that images of all previous keyframes need to be kept available.
This is LSD-SLAM's loop closure scheme. My guess at the main point: the frontend and the loop closure stage are decoupled so heavily that they share nothing. Let us see what this paper does differently.

LOOP CLOSING IN DSO (important!)
A. Framework

A global optimization pipeline is needed in order to close long-term loops for DSO. Ideally global bundle adjustment using photometric error should be used, which nicely would match the original formulation of DSO. However, in that case all the images would need to be saved, since the photometric error is computed on images. (Ideally, global photometric BA would be best, but it would require storing every image, which is impractical.) Moreover, nowadays it is still impractical to perform global photometric bundle adjustment for the amount of points selected by DSO. (Running BA over all the points is impractical too.) To avoid these problems we turn to the idea of using pose graph optimization, which leaves us several other challenges: (i) How to combine the result of global pose graph optimization with that of the windowed optimization? One step further, how to set up the pose graph constraints using the information in the sliding window, considering that pose graph optimization minimizes Sim(3) geometry error between keyframes while in the sliding window we minimize the photometric error? (ii) How to propose loop candidates? (Compare keyframes using BoW.) While the mainstream of loop detection is based on image descriptors, shall we simply add another thread to perform those feature related computations? (iii) Once loop candidates are proposed