Paper link: https://arxiv.org/pdf/1901.07223.pdf
GitHub link:
Year: 2019
Source:
Team:
Abstract
As the foundation of driverless vehicle and intelligent robots, Simultaneous Localization and Mapping (SLAM) has attracted much attention these days. However, non-geometric modules of traditional SLAM algorithms are limited by data association tasks and have become a bottleneck preventing the development of SLAM. To deal with such problems, many researchers seek to Deep Learning for help. But most of these studies are limited to virtual datasets or specific environments, and even sacrifice efficiency for accuracy. Thus, they are not practical enough. We propose DF-SLAM system that uses deep local feature descriptors obtained by the neural network as a substitute for traditional hand-made features. Experimental results demonstrate its improvements in efficiency and stability. DF-SLAM outperforms popular traditional SLAM systems in various scenes, including challenging scenes with intense illumination changes. Its versatility and mobility fit well into the need for exploring new environments. Since we adopt a shallow network to extract local descriptors and remain others the same as original SLAM systems, our DF-SLAM can still run in real-time on GPU.
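1. Introduction
The introduction first explains that traditional SLAM relies mainly on multi-view geometry and has always operated at the pixel level, with every other step built on top of pixels. It then describes the strength of deep learning, which is most powerful at classification and matching. Some researchers have brought deep learning into SLAM systems to replace certain components, mostly working at the level of semantic information, but without outstanding results. Deep learning is also a heavily data-driven approach, so it often fails in unfamiliar environments. The authors of this paper propose using deep learning to enhance the SLAM system rather than to replace one of its parts. They propose a shallow neural network to extract image features and compare it with traditional feature descriptors such as SURF, SIFT, and ORB.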
2. Related Work
2.1 Deep learning enhanced SLAM
The authors stress one idea: use deep learning to strengthen the SLAM system rather than to replace some of its functions.
2.2 Local feature descriptor
3. System Overview
3.1 system framework
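This section introduces a new component, the TFeat network (GitHub link: https://github.com/vbalnt/tfeat; paper link: http://www.bmva.org/bmvc/2016/papers/paper119/paper119.pdf). Its defining property is that it is shallow, and it is used to extract visual features from images. (My understanding when reading the paper on the evening of December 2, 2020: these features play a role similar to ORB or SIFT, but are extracted with a deep learning method. The concrete procedure is still unclear to me; I need to read the actual code.)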
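To make the "similar to ORB or SIFT" reading concrete, here is a minimal sketch of how learned float descriptors could be matched between two frames. It is not taken from the paper or from the TFeat repository: the function names, the toy 4-D descriptors, and the 0.8 ratio threshold are all assumptions for illustration. The only point it shows is that a network-produced descriptor is a float vector, so a matcher would compare it with Euclidean distance where a binary descriptor such as ORB would use Hamming distance.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length float descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(query, train, ratio=0.8):
    """Brute-force matching with a nearest / second-nearest ratio test.

    Returns (query_index, train_index) pairs. The ratio value is an
    assumed, illustrative threshold, not one reported by the paper.
    """
    matches = []
    for qi, q in enumerate(query):
        # Rank all candidates in the other frame by distance to this descriptor.
        ranked = sorted(range(len(train)), key=lambda ti: l2_distance(q, train[ti]))
        if len(ranked) < 2:
            continue
        best, second = ranked[0], ranked[1]
        # Keep the match only if the best candidate is clearly better than the runner-up.
        if l2_distance(q, train[best]) < ratio * l2_distance(q, train[second]):
            matches.append((qi, best))
    return matches


# Toy 4-D descriptors standing in for network outputs (hypothetical values).
frame_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.5, 0.5, 0.5, 0.5]]
frame_b = [[0.0, 0.9, 0.1, 0.0], [0.9, 0.1, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
matches = match_descriptors(frame_a, frame_b)
print(matches)
```

The third descriptor in `frame_a` is equally close to two candidates, so the ratio test rejects it; ambiguous features are dropped rather than matched arbitrarily.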
3.2 Feature Design
4. Experiments
4.1 preprocess
The two most complex parts of the preprocessing are creating the datasets for model training and building the visual vocabulary.
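As a rough illustration of what building a visual vocabulary involves, the sketch below clusters descriptors into "visual words" with plain k-means and then maps a descriptor to its nearest word. This is an assumption-laden simplification, not the paper's procedure: bag-of-words libraries used in SLAM usually build a hierarchical vocabulary tree, while this version is flat, uses a naive evenly spaced initialization, and runs on made-up 2-D descriptors. All names (`build_vocabulary`, `quantize`) are hypothetical.

```python
import math


def l2(a, b):
    """Euclidean distance between two float vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def build_vocabulary(descriptors, k, iterations=10):
    """Flat k-means over descriptors; returns k cluster centers (visual words)."""
    n = len(descriptors)
    # Naive deterministic initialization: evenly spaced picks from the input.
    words = [list(descriptors[i * n // k]) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            nearest = min(range(k), key=lambda i: l2(d, words[i]))
            clusters[nearest].append(d)
        # Recompute each center; keep the old one if its cluster is empty.
        words = [mean(c) if c else w for c, w in zip(clusters, words)]
    return words


def quantize(descriptor, words):
    """Index of the visual word closest to the descriptor."""
    return min(range(len(words)), key=lambda i: l2(descriptor, words[i]))


# Toy 2-D descriptors forming two well-separated groups (hypothetical data).
toy_descriptors = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
]
vocabulary = build_vocabulary(toy_descriptors, k=2)
word_ids = [quantize(d, vocabulary) for d in toy_descriptors]
print(word_ids)
```

An image can then be summarized by the histogram of word ids of its descriptors, which is what makes vocabulary-based place recognition cheap compared with matching every descriptor pair.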