Paper reading notes: the DSO family of visual SLAM

Link to this post: https://blog.csdn.net/chaoqinyou/article/details/123032478

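I want to use this post to read through the DSO papers as one connected series. (Continuously updated.)

The DSO papers are the representative line of work on sparse direct visual SLAM. Most of them were published in top robotics or computer vision journals and conferences, and quite a few have open-source implementations, so they are well worth studying.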

TUM Computer Vision Group, Visual SLAM page (quoting its summary: "In Simultaneous Localization And Mapping, we track the pose of the sensor while creating a map of the environment. In particular, our group has a strong focus on direct methods, where, contrary to the classical pipeline of feature extraction and matching, we directly optimize intensity errors."): https://vision.in.tum.de/research/vslam

I. Paper overview

(If I have missed any, please add them in the comments.)

DSO series papers:

Paper	Open-source implementation
(2016.06 TPAMI) Direct Sparse Odometry	GitHub - JakobEngel/dso: Direct Sparse Odometry
(2017.06, ICCV) Stereo DSO: Large-Scale Direct Sparse Visual Odometry with Stereo Cameras	GitHub - RonaldSun/VI-Stereo-DSO: Direct sparse odometry combined with stereo cameras and IMU
(2018.04 ICRA) Direct Sparse Visual-Inertial Odometry using Dynamic Marginalization	GitHub - RonaldSun/VI-Stereo-DSO: Direct sparse odometry combined with stereo cameras and IMU
(2018.06, ECCV oral) Direct Sparse Odometry with Rolling Shutter	None
(2018.07, ECCV oral) Deep Virtual Stereo Odometry: Leveraging Deep Depth Prediction for Monocular Direct Sparse Odometry	None
(2018.08, IROS) LDSO: Direct Sparse Odometry with Loop Closure	https://github.com/tum-vision/LDSO
(2018.08, IROS) Omnidirectional DSO: Direct Sparse Odometry with Fisheye Cameras	None
(2019.04 T-RO) Direct Sparse Mapping	https://github.com/jzubizarreta/dsm
(2020.03, CVPR oral) D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry	None
(2022.01, RA-L) DM-VIO: Delayed Marginalization Visual-Inertial Odometry	https://github.com/lukasvst/dm-vio

Photometric calibration:

Paper	Open-source implementation
(2016) A Photometrically Calibrated Benchmark For Monocular Visual Odometry	None
(2018, RA-L) Online Photometric Calibration of Auto Exposure Video for Realtime Visual Odometry and SLAM	Computer Vision Group - Visual SLAM - Photometric Calibration

II. Reading notes

--------------------------------------------------------------------------------------------------------------------------

(2016.06 TPAMI) Direct Sparse Odometry

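Dr. Xiang Gao's detailed walkthrough of DSO: DSO详解 (Zhihu)

Dr. Gao also gives a detailed introduction to DSO and LDSO in the doc folder of the LDSO code: note_on_dso.pdf. Highly recommended reading!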

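1. Why sparse and direct:

Direct vs Indirect: direct methods minimize a photometric error; indirect (feature-based) methods minimize a geometric error.

Sparse vs Dense: the essential difference is that dense methods impose a geometry prior between (pixel) points. (Dense approaches exploit the connectedness of the used image region to formulate a geometry prior, typically favouring smoothness; in sparse approaches, neighbouring regions, such as pixels or patches, are conditionally independent given the camera poses and intrinsics.)

Why sparse: with a dense formulation, the point-to-point constraints make real-time joint optimization hard, and the geometry priors available today have limited expressive complexity and introduce bias, which lowers accuracy. Why direct: no feature points need to be extracted, so the method copes better with low-texture regions.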

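2. Calibration:

2.1 Geometric calibration (determines which pixel a 3D point projects to): covers distortion, intrinsics, and so on. DSO assumes by default that the images are already undistorted.

2.2 Photometric calibration (determines how bright a 3D point appears in the image): covers attenuation (vignetting), exposure time, and sensor response. DSO needs the sensor response (G) and the attenuation (V) calibrated in advance; the exposure time (t) is an online, per-frame parameter that enters the optimization.

Put plainly, the brightness of a point comes from the irradiance (B), attenuated by the lens and so on (V), collected on the CMOS sensor for a time t, and turned into a response according to G.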
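To make the chain B, V, t, G concrete, here is a minimal Python sketch for a single pixel. It assumes the image formation model from the DSO paper, I = G(t * V * B), and the corresponding correction I' = G^{-1}(I) / V = t * B. The gamma-style response curve, the function names, and the numbers are my own placeholders, not DSO's calibration.

```python
GAMMA = 2.2  # hypothetical response curve, for illustration only


def G(e):
    """Assumed sensor response: maps collected energy in [0, 1] to 0..255."""
    return 255.0 * min(max(e, 0.0), 1.0) ** (1.0 / GAMMA)


def G_inv(i):
    """Inverse of the assumed response."""
    return (i / 255.0) ** GAMMA


def observe(B, V, t):
    """Forward model for one pixel: I = G(t * V * B)."""
    return G(t * V * B)


def photometric_correct(I, V):
    """Undo response and attenuation: I' = G_inv(I) / V, which equals t * B."""
    return G_inv(I) / V


B, V, t = 0.6, 0.8, 0.5          # irradiance, attenuation, exposure time
I = observe(B, V, t)             # what the camera records
I_corrected = photometric_correct(I, V)
print(I_corrected, t * B)        # the two values agree up to rounding
```

After this correction the only remaining photometric factor is the exposure time, which is why a known t can be used directly in the error term.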

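3. Error terms

DSO assumes all images have already been corrected for G and V and that the exposure time t is known. The total error then consists of the photometric error E_photo and the prior error E_prior.

Here a and b are per-frame affine brightness parameters that model the remaining photometric disturbances.

E_photo is the sum of the photometric errors obtained by projecting every point selected in every keyframe into all frames that observe it.

E_prior can be seen as a regularizer that keeps a_i and b_i from growing too large.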
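As I read the DSO paper, the residual of one pixel p hosted in frame i and observed at p' in frame j is r = (I_j[p'] - b_j) - (t_j e^{a_j}) / (t_i e^{a_i}) * (I_i[p] - b_i), summed over a small pixel pattern under a Huber norm, and the prior is lambda_a * a_i^2 + lambda_b * b_i^2. The sketch below evaluates both for already-sampled intensities. It leaves out the per-pixel gradient weights and the warping itself; the function names and the Huber threshold are placeholders of mine.

```python
import math


def huber(r, delta):
    """Huber norm: quadratic for small residuals, linear for large ones."""
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)


def photometric_residual(I_i, I_j, t_i, t_j, a_i, b_i, a_j, b_j):
    """Residual for one pixel after affine brightness transfer from i to j."""
    scale = (t_j * math.exp(a_j)) / (t_i * math.exp(a_i))
    return (I_j - b_j) - scale * (I_i - b_i)


def point_energy(patch_i, patch_j, t_i, t_j,
                 a_i=0.0, b_i=0.0, a_j=0.0, b_j=0.0, delta=9.0):
    """Sum of Huber-normed residuals over the pixel pattern of one point."""
    return sum(
        huber(photometric_residual(pi, pj, t_i, t_j, a_i, b_i, a_j, b_j), delta)
        for pi, pj in zip(patch_i, patch_j)
    )


def prior_energy(a, b, lam_a, lam_b):
    """Regularizer that pulls the affine parameters of one frame towards zero."""
    return lam_a * a * a + lam_b * b * b


# Frame j was exposed twice as long, so a perfect match has doubled intensities.
print(point_energy([100.0, 120.0, 140.0], [200.0, 240.0, 280.0], t_i=1.0, t_j=2.0))
```

E_photo is then the sum of `point_energy` over all keyframes, their points, and the frames observing each point.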

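4. Sliding-window optimization

For the basics of least-squares fitting, see my separate reading notes: Nonlinear least squares, BA (Bundle Adjustment) (chaoqinyou's CSDN blog)

This part is fairly involved and I am still working through it myself, so for now please refer to Dr. Gao's article: DSO详解 (Zhihu) https://zhuanlan.zhihu.com/p/29177540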

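5. Visual odometry front end

The front end has three jobs: 1) select the frames and points that enter the sliding-window optimization and thus make up E_photo; 2) provide initial values for optimizing E_photo (as a rule of thumb, a linearization of the image is only valid in a 1~2 pixel radius); 3) decide which frames and points get marginalized.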

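5.1 Frame management

The DSO keyframe window holds at most 7 frames.

5.1.1 Initial Frame Tracking

The map points in the keyframe window are projected into the latest keyframe. Every new image uses these points to solve for its pose relative to the latest keyframe. If that fails, tracking is retried in a RANSAC-like fashion from up to 27 small rotations in different directions.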

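5.1.2 Keyframe Creation

Whether a frame becomes a keyframe is decided by a weighted sum of three quantities: the field-of-view change f (mean optical flow since the last keyframe, i.e. how much the view has moved overall), the occlusion change f_t (mean flow with the rotation removed, i.e. how much the camera has translated), and the exposure change a.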
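A small sketch of that decision, following my reading of the paper: f and f_t are root mean squared pixel displacements (f_t computed with the rotation set to identity), a = |log(e^{a_j - a_i} * t_j / t_i)|, and a keyframe is created when w_f * f + w_ft * f_t + w_a * a exceeds a threshold T_kf. The weights and threshold below are placeholders, and the warped point positions are assumed to be given.

```python
import math


def rms_flow(points, warped):
    """Root mean squared pixel displacement between matching point lists."""
    sq = sum((x2 - x1) ** 2 + (y2 - y1) ** 2
             for (x1, y1), (x2, y2) in zip(points, warped))
    return math.sqrt(sq / len(points))


def brightness_change(a_i, a_j, t_i, t_j):
    """Relative brightness factor between two frames, as an absolute log ratio."""
    return abs(math.log(math.exp(a_j - a_i) * t_j / t_i))


def needs_keyframe(f, f_t, a, w_f=1.0, w_ft=1.0, w_a=1.0, T_kf=1.0):
    """Weighted sum of the three cues compared against a threshold."""
    return w_f * f + w_ft * f_t + w_a * a > T_kf


pts = [(0.0, 0.0), (10.0, 0.0)]
warped = [(3.0, 4.0), (13.0, 4.0)]   # full warp: every point moved by 5 px
warped_t = pts                       # rotation-free warp: no motion, pure rotation
f = rms_flow(pts, warped)
f_t = rms_flow(pts, warped_t)
a = brightness_change(0.0, 0.0, 1.0, 1.0)
print(needs_keyframe(f, f_t, a, w_f=0.3, w_ft=0.3, w_a=0.3))
```

In this toy case the decision is driven by f alone, since the rotation-free flow and the brightness change are both zero.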

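5.1.3 Keyframe Marginalization

1) The two latest keyframes are always kept; 2) a keyframe with fewer than 5% of its points visible in the latest keyframe is marginalized; 3) when the number of keyframes exceeds the threshold N_f, the keyframe (excluding the two latest) with the largest distance score s(I_i) is marginalized.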
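The three rules can be sketched as below. I use the distance score as I understand it from the paper, s(I_i) = sqrt(d(i, 1)) * sum over j in [3, n] without i of 1 / (d(i, j) + eps), with I_1 the newest keyframe and d the Euclidean distance between camera centers. The list layout (index 0 is the newest keyframe), the function names, and the value of eps are my assumptions.

```python
import math


def distance_score(i, centers, eps=1e-5):
    """High for keyframes far from the newest one but close to other old ones."""
    d_newest = math.dist(centers[i], centers[0])
    inv_sum = sum(1.0 / (math.dist(centers[i], centers[j]) + eps)
                  for j in range(2, len(centers)) if j != i)
    return math.sqrt(d_newest) * inv_sum


def keyframes_to_marginalize(centers, visible_ratio, max_frames=7, min_visible=0.05):
    """Return indices to drop; indices 0 and 1 (the two newest) are never dropped."""
    candidates = range(2, len(centers))
    low_overlap = [i for i in candidates if visible_ratio[i] < min_visible]
    if low_overlap:                        # rule 2: too few points still visible
        return low_overlap
    if len(centers) > max_frames:          # rule 3: window too large
        return [max(candidates, key=lambda i: distance_score(i, centers))]
    return []


centers = [(10.0, 0, 0), (9.0, 0, 0), (8.0, 0, 0), (7.0, 0, 0), (6.0, 0, 0),
           (5.0, 0, 0), (0.2, 0, 0), (0.1, 0, 0), (0.0, 0, 0)]
visible = [1.0] * len(centers)
print(keyframes_to_marginalize(centers, visible))
```

In this example three old keyframes sit almost on top of each other far from the newest one; the middle one of the cluster gets the highest score, which matches the intent of keeping the remaining keyframes well spread out.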

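5.2 Point management

Step 1: Candidate Point Selection:

Selection criteria: 1) well-distributed in the image; 2) sufficient gradient relative to the surrounding pixels.

Selection procedure: 1) split the image into 32x32 blocks and compute a region-adaptive gradient threshold g + g_th for each, where g is the median image gradient in the block and g_th is a manually set hyperparameter; 2) then split the image into d x d blocks and, in each, pick the pixel with the largest gradient if it exceeds the region threshold. For regions with small gradients, increase d and lower the threshold.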
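A single pass of this procedure can be sketched on a precomputed gradient-magnitude image. This is a simplified illustration, not the DSO implementation: it runs one pass only (no repeat with larger d and lower threshold), uses plain nested lists, and the default g_th is a placeholder.

```python
from statistics import median


def block_thresholds(grad, block=32, g_th=7.0):
    """Per-block threshold: median gradient magnitude in the block plus g_th."""
    h, w = len(grad), len(grad[0])
    thresholds = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            vals = [grad[y][x]
                    for y in range(by, min(by + block, h))
                    for x in range(bx, min(bx + block, w))]
            thresholds[(by // block, bx // block)] = median(vals) + g_th
    return thresholds


def select_candidates(grad, d, block=32, g_th=7.0):
    """From each d x d cell keep the strongest pixel if it beats its region threshold."""
    h, w = len(grad), len(grad[0])
    thresholds = block_thresholds(grad, block, g_th)
    picked = []
    for cy in range(0, h, d):
        for cx in range(0, w, d):
            g, y, x = max((grad[y][x], y, x)
                          for y in range(cy, min(cy + d, h))
                          for x in range(cx, min(cx + d, w)))
            if g > thresholds[(y // block, x // block)]:
                picked.append((y, x))
    return picked


grad = [[0.0] * 64 for _ in range(64)]
grad[10][10] = 50.0   # strong gradient: selected
grad[40][40] = 5.0    # weak gradient: below median + g_th, rejected
print(select_candidates(grad, d=16))
```

Picking at most one pixel per d x d cell is what gives the even spatial distribution; the per-block median is what lets low-contrast regions still contribute points once the threshold is lowered.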

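Step 2: Candidate Point Tracking: the candidates from step 1 are tracked in subsequent frames by minimizing the photometric error along the epipolar line, which yields a depth estimate and its variance. If a candidate is activated later, this depth serves as its initial value in the sliding-window optimization.

Step 3: Candidate Point Activation: the principle is again to keep the activated points as evenly spread over the image as possible, so a candidate should be as far as possible from the already activated points when it is activated.

Outlier and Occlusion Detection: the photometric error must be small and the minimum must be distinct (when searching along the epipolar line during candidate tracking, points for which the minimum is not sufficiently distinct are permanently discarded, greatly reducing the number of false matches in repetitive areas).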
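One way to picture the "distinct minimum" test is a ratio check between the best error along the epipolar line and the best error found away from it. The sketch below is my own illustration of that idea; the ratio, the exclusion radius, and the function name are hypothetical and not taken from the DSO code.

```python
def is_distinct_minimum(errors, min_ratio=2.0, exclude_radius=2):
    """True if the best error clearly beats every sample outside its neighbourhood.

    errors: photometric errors sampled at consecutive positions on the epipolar line.
    """
    best = min(range(len(errors)), key=errors.__getitem__)
    rivals = [e for k, e in enumerate(errors) if abs(k - best) > exclude_radius]
    if not rivals:
        return True  # nothing outside the neighbourhood competes with the minimum
    return min(rivals) >= min_ratio * errors[best]


print(is_distinct_minimum([9.0, 8.0, 1.0, 8.0, 9.0, 10.0, 9.0]))   # single clear valley
print(is_distinct_minimum([9.0, 1.0, 9.0, 9.0, 9.0, 1.2, 9.0]))    # repetitive texture
```

The second profile has two nearly equal valleys, which is the repetitive-structure case the paper says should be discarded.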

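6. Experimental results

6.1 DSO is sensitive to geometric noise; ORB-SLAM is sensitive to photometric noise

6.2 Failure cases: 1) pure rotation, where in monocular mode no points can be triangulated; 2) drastic illumination changes

6.3 Experiment on kitti_odometry_gray_seq_00:

Run with the LDSO code, including loop closure. Yellow is the trajectory after loop closure and red is the trajectory before it; as you can see, loop closure makes a big difference!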

--------------------------------------------------------------------------------------------------------------------------

(2018.08, IROS) LDSO: Direct Sparse Odometry with Loop Closure

My own reading notes on this paper: Paper reading notes: (2018.08, IROS) LDSO: Direct Sparse Odometry with Loop Closure (chaoqinyou's CSDN blog) https://blog.csdn.net/chaoqinyou/article/details/124361765

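Some necessary background:

Pose graph: see Section 8.3, Pose-Graph Relaxation, of "State Estimation for Robotics"

Sim(3): similarity transformation, which adds one dimension for scale. Here Sim(3) is mainly used for: a. computing the constraints at loop closures; b. optimizing the pose graph (convert SE(3) -> Sim(3), optimize, then turn the translation into s*t). See: (2010, RSS) Scale Drift-Aware Large Scale Monocular SLAM

BoW: see the loop-closure chapter of Dr. Gao's book 《视觉SLAM十四讲》 (14 Lectures on Visual SLAM)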
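For the Sim(3) item above, here is a minimal sketch of the group itself: an element is a triple (s, R, t) acting on a point as p' = s * R * p + t. It only shows the action, composition, and inverse; it says nothing about how LDSO parameterizes or optimizes these, and all names are my own.

```python
import math


def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]


def sim3_apply(S, p):
    """p' = s * R * p + t"""
    s, R, t = S
    Rp = matvec(R, p)
    return [s * Rp[k] + t[k] for k in range(3)]


def sim3_compose(S1, S2):
    """S1 after S2: (s1*s2, R1*R2, s1*R1*t2 + t1)"""
    s1, R1, t1 = S1
    s2, R2, t2 = S2
    R1t2 = matvec(R1, t2)
    return (s1 * s2, matmul(R1, R2), [s1 * R1t2[k] + t1[k] for k in range(3)])


def sim3_inverse(S):
    """(1/s, R^T, -R^T * t / s)"""
    s, R, t = S
    Rt = transpose(R)
    Rt_t = matvec(Rt, t)
    return (1.0 / s, Rt, [-Rt_t[k] / s for k in range(3)])


def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


S = (2.0, rot_z(math.pi / 2), [1.0, 0.0, 0.0])   # scale 2, 90 degrees about z
q = sim3_apply(S, [1.0, 0.0, 0.0])
print(q)                                  # close to [1, 2, 0]
print(sim3_apply(sim3_inverse(S), q))     # close to the original point
```

With s fixed to 1 this reduces to SE(3); the extra scale is what lets a monocular pose graph absorb scale drift at loop closures.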

--------------------------------------------------------------------------------------------------------------------------

(2018.04 ICRA) Direct Sparse Visual-Inertial Odometry using Dynamic Marginalization

Some necessary background on VIO:

IMU preintegration:

Its main purpose: "The preintegration allows us to accurately summarize hundreds of inertial measurements into a single relative motion constraint (between two frames)."

See: (2016) On-Manifold Preintegration for Real-Time Visual-Inertial Odometry

【Momenta Paper Reading】第七期预积分（Preintegration） (bilibili): https://www.bilibili.com/video/BV1FW411C7R3
简明预积分推导 (Zhihu): https://zhuanlan.zhihu.com/p/388859808
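To show what "summarizing many inertial measurements into one constraint" means in practice, here is a stripped-down sketch of the preintegrated rotation, velocity, and position deltas between two frames. It follows the standard discrete update (dp uses the old dv and dR, then dv, then dR) but assumes constant biases and leaves out noise propagation and bias Jacobians, which the real method needs. All function names are my own.

```python
import math


def matvec(M, v):
    return [sum(M[i][k] * v[k] for k in range(3)) for i in range(3)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def so3_exp(w):
    """Rodrigues formula: rotation matrix for the rotation vector w."""
    theta = math.sqrt(sum(c * c for c in w))
    K = [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]
    if theta < 1e-12:
        a, b = 1.0, 0.5                      # small-angle limits
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / (theta * theta)
    K2 = matmul(K, K)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def preintegrate(gyro, accel, dt, bias_g=(0.0, 0.0, 0.0), bias_a=(0.0, 0.0, 0.0)):
    """Accumulate (dR, dv, dp) in the body frame of the first sample; no gravity."""
    dR = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    dv = [0.0, 0.0, 0.0]
    dp = [0.0, 0.0, 0.0]
    for w, a in zip(gyro, accel):
        acc = matvec(dR, [a[k] - bias_a[k] for k in range(3)])
        dp = [dp[k] + dv[k] * dt + 0.5 * acc[k] * dt * dt for k in range(3)]
        dv = [dv[k] + acc[k] * dt for k in range(3)]
        dR = matmul(dR, so3_exp([(w[k] - bias_g[k]) * dt for k in range(3)]))
    return dR, dv, dp


# 200 samples at 200 Hz: one second of constant acceleration along x, no rotation.
n, dt = 200, 0.005
dR, dv, dp = preintegrate([(0.0, 0.0, 0.0)] * n, [(1.0, 0.0, 0.0)] * n, dt)
print(dv, dp)
```

The 200 measurements collapse into one (dR, dv, dp) triple that depends only on the IMU data and the biases, so it does not have to be recomputed when the optimizer changes the poses of the two frames.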