ORB-SLAM2逐行读代码（一）

最新推荐文章于 2023-06-12 09:39:12 发布

xuanyuan10086

最新推荐文章于 2023-06-12 09:39:12 发布

阅读量2.6k

点赞数 5

分类专栏：视觉SLAM ORB-SLAM2 文章标签：视觉SLAM ORB-SLAM2

本文链接：https://blog.csdn.net/xuanyuan10086/article/details/81779052

版权

视觉SLAM 同时被 2 个专栏收录

1 篇文章 0 订阅

订阅专栏

ORB-SLAM2

1 篇文章 0 订阅

订阅专栏

ORB-SLAM2逐行读代码（一）

第一次写博客，先说下写博客的原因吧，最近在读ORB-SLAM2的代码，代码量还是蛮大的，读着后面的忘着前面的。于是决定记录下来，以后还能随时翻阅。但是奈何本人是菜鸟，很多地方还理解不到位，如果大家有发现不对的地方，欢迎留言指正。因为本人也是菜鸟，所以就按照代码的执行流程，把代码分析一遍，主要是分析关键函数。从双目入手，也就是stereo_kitti.cpp。废话不多说，直接上代码。`

#include<iostream>
#include<algorithm>
#include<fstream>
#include<iomanip>
#include<chrono>

#include<opencv2/core/core.hpp>

#include<System.h>

using namespace std;

void LoadImages(const string &strPathToSequence, vector<string> &vstrImageLeft,
                vector<string> &vstrImageRight, vector<double> &vTimestamps);

int main(int argc, char **argv)
{
    if(argc != 4)
    {
        cerr << endl << "Usage: ./stereo_kitti path_to_vocabulary path_to_settings path_to_sequence" << endl;
        return 1;
    }

    // Retrieve paths to images
    vector<string> vstrImageLeft;       // 左目图像序列号
    vector<string> vstrImageRight;      // 右目图像序列号
    vector<double> vTimestamps;         // 图像时间序列
    LoadImages(string(argv[3]), vstrImageLeft, vstrImageRight, vTimestamps);

    const int nImages = vstrImageLeft.size();   // 图像个数

    // Create SLAM system. It initializes all system threads and gets ready to process frames.
    ORB_SLAM2::System SLAM(argv[1],argv[2],ORB_SLAM2::System::STEREO,true);

    // Vector for tracking time statistics
    vector<float> vTimesTrack;
    vTimesTrack.resize(nImages);

    cout << endl << "-------" << endl;
    cout << "Start processing sequence ..." << endl;
    cout << "Images in the sequence: " << nImages << endl << endl;

    // Main loop
    cv::Mat imLeft, imRight;
    for(int ni=0; ni<nImages; ni++)
    {
        // Read left and right images from file
        imLeft = cv::imread(vstrImageLeft[ni],CV_LOAD_IMAGE_UNCHANGED);
        imRight = cv::imread(vstrImageRight[ni],CV_LOAD_IMAGE_UNCHANGED);
        double tframe = vTimestamps[ni];

        if(imLeft.empty())
        {
            cerr << endl << "Failed to load image at: "
                 << string(vstrImageLeft[ni]) << endl;
            return 1;
        }

        std::chrono::steady_clock::time_point t1 = std::chrono::steady_clock::now();

        // Pass the images to the SLAM system
        SLAM.TrackStereo(imLeft,imRight,tframe);        

        std::chrono::steady_clock::time_point t2 = std::chrono::steady_clock::now();

        double ttrack= std::chrono::duration_cast<std::chrono::duration<double> >(t2 - t1).count();

        vTimesTrack[ni]=ttrack;

        // Wait to load the next frame
        double T=0;
        if(ni<nImages-1)
            T = vTimestamps[ni+1]-tframe;
        else if(ni>0)
            T = tframe-vTimestamps[ni-1];

        if(ttrack<T)
            this_thread::sleep_for(std::chrono::microseconds((int)((T-ttrack)*1e6)));
    }

    // Stop all threads
    SLAM.Shutdown();

    // Tracking time statistics
    sort(vTimesTrack.begin(),vTimesTrack.end());
    float totaltime = 0;
    for(int ni=0; ni<nImages; ni++)
    {
        totaltime+=vTimesTrack[ni];
    }
    cout << "-------" << endl << endl;
    cout << "median tracking time: " << vTimesTrack[nImages/2] << endl;
    cout << "mean tracking time: " << totaltime/nImages << endl;

    // Save camera trajectory
    SLAM.SaveTrajectoryKITTI("CameraTrajectory.txt");

    return 0;
}

void LoadImages(const string &strPathToSequence, vector<string> &vstrImageLeft,
                vector<string> &vstrImageRight, vector<double> &vTimestamps)
{
    ifstream fTimes;
    string strPathTimeFile = strPathToSequence + "/times.txt";
    fTimes.open(strPathTimeFile.c_str());
    while(!fTimes.eof())
    {
        string s;
        getline(fTimes,s);
        if(!s.empty())
        {
            stringstream ss;
            ss << s;
            double t;
            ss >> t;
            vTimestamps.push_back(t);
        }
    }

    string strPrefixLeft = strPathToSequence + "/image_0/";
    string strPrefixRight = strPathToSequence + "/image_1/";

    const int nTimes = vTimestamps.size();
    vstrImageLeft.resize(nTimes);
    vstrImageRight.resize(nTimes);

    for(int i=0; i<nTimes; i++)
    {
        stringstream ss;
        ss << setfill('0') << setw(6) << i;
        vstrImageLeft[i] = strPrefixLeft + ss.str() + ".png";
        vstrImageRight[i] = strPrefixRight + ss.str() + ".png";
    }
}

这个是stereo_kitti.cpp的代码，从main函数进行读。首先执行的是LoadImages(string(argv[3]), vstrImageLeft, vstrImageRight, vTimestamps)函数，四个参数的分别是：argv[3] kitti数据集的路径、vstrImageLeft 存储左目图像、vstrImageRight 存储右目图像、vTimestamps 存储图像帧的时间。这个函数的目的是把左右目图像的位置找出来，并记录图像帧的时间。
然后就是实例化一个ORB_SLAM2::System的类对象SLAM，具体代码为

// Create SLAM system. It initializes all system threads and gets ready to process frames.
    ORB_SLAM2::System SLAM(argv[1],argv[2],ORB_SLAM2::System::STEREO,true);

构建这个对象自然就要执行他相应的构造函数了，那么我们可以进这个构造函数内部看看。为了方便在ubuntu下看代码，可以使用一个IDE集成开发环境，我用的是CLion，代码跳转很方便。构造函数代码如下：

// Initialize the SLAM system. It launches the Local Mapping, Loop Closing and Viewer threads.
// strVocFile 字典文件路径
// strSettingsFile 相机相关参数文件，比如内参，双目基线，帧率等等
System::System(const string &strVocFile, const string &strSettingsFile, const eSensor sensor,
               const bool bUseViewer):mSensor(sensor),mbReset(false),mbActivateLocalizationMode(false),
               mbDeactivateLocalizationMode(false)
{
    // Output welcome message
    cout << endl <<
    "ORB-SLAM2 Copyright (C) 2014-2016 Raul Mur-Artal, University of Zaragoza." << endl <<
    "This program comes with ABSOLUTELY NO WARRANTY;" << endl  <<
    "This is free software, and you are welcome to redistribute it" << endl <<
    "under certain conditions. See LICENSE.txt." << endl << endl;

    cout << "Input sensor was set to: ";

    if(mSensor==MONOCULAR)
        cout << "Monocular" << endl;
    else if(mSensor==STEREO)
        cout << "Stereo" << endl;
    else if(mSensor==RGBD)
        cout << "RGB-D" << endl;

    //Check settings file
    cv::FileStorage fsSettings(strSettingsFile.c_str(), cv::FileStorage::READ);
    if(!fsSettings.isOpened())
    {
       cerr << "Failed to open settings file at: " << strSettingsFile << endl;
       exit(-1);
    }


    //Load ORB Vocabulary
    cout << endl << "Loading ORB Vocabulary. This could take a while..." << endl;

    mpVocabulary = new ORBVocabulary();
    bool bVocLoad = false; // chose loading method based on file extension
    if (has_suffix(strVocFile, ".txt"))
      bVocLoad = mpVocabulary->loadFromTextFile(strVocFile);
    else if(has_suffix(strVocFile, ".bin"))
      bVocLoad = mpVocabulary->loadFromBinaryFile(strVocFile);
    else
      bVocLoad = false;
    if(!bVocLoad)
    {
        cerr << "Wrong path to vocabulary. " << endl;
        cerr << "Failed to open at: " << strVocFile << endl;
        exit(-1);
    }
    cout << "Vocabulary loaded!" << endl << endl;

    //Create KeyFrame Database
    mpKeyFrameDatabase = new KeyFrameDatabase(*mpVocabulary);

    //Create the Map
    mpMap = new Map();

    //Create Drawers. These are used by the Viewer
    mpFrameDrawer = new FrameDrawer(mpMap);
    mpMapDrawer = new MapDrawer(mpMap, strSettingsFile);

    //Initialize the Tracking thread
    //(it will live in the main thread of execution, the one that called this constructor)
    mpTracker = new Tracking(this, mpVocabulary, mpFrameDrawer, mpMapDrawer,
                             mpMap, mpKeyFrameDatabase, strSettingsFile, mSensor);

    //Initialize the Local Mapping thread and launch
    mpLocalMapper = new LocalMapping(mpMap, mSensor==MONOCULAR);
    mptLocalMapping = new thread(&ORB_SLAM2::LocalMapping::Run, mpLocalMapper);

    //Initialize the Loop Closing thread and launch
    mpLoopCloser = new LoopClosing(mpMap, mpKeyFrameDatabase, mpVocabulary, mSensor!=MONOCULAR);
    mptLoopClosing = new thread(&ORB_SLAM2::LoopClosing::Run, mpLoopCloser);

    //Initialize the Viewer thread and launch
    mpViewer = new Viewer(this, mpFrameDrawer,mpMapDrawer,mpTracker,strSettingsFile);
    if(bUseViewer)
        mptViewer = new thread(&Viewer::Run, mpViewer);

    mpTracker->SetViewer(mpViewer);

    //Set pointers between threads
    mpTracker->SetLocalMapper(mpLocalMapper);
    mpTracker->SetLoopClosing(mpLoopCloser);

    mpLocalMapper->SetTracker(mpTracker);
    mpLocalMapper->SetLoopCloser(mpLoopCloser);

    mpLoopCloser->SetTracker(mpTracker);
    mpLoopCloser->SetLocalMapper(mpLocalMapper);
}

这个函数是重重之重，接下来将主要针对此函数进行分析。那就让我们进入这个函数吧。注意这里只是分析比较重要的代码，容易看懂的就自己看吧！首先是创建一个ORBVocabulary类的对象，并把指针赋值给SLAM对象的成员mpVocabulary，具体代码如下：

mpVocabulary = new ORBVocabulary();

构造类对象就会执行其相应的构造函数，但是这里ORBVocabulary是一个模板类实例化的结果，具体定义如下：

typedef DBoW2::TemplatedVocabulary<DBoW2::FORB::TDescriptor, DBoW2::FORB> ORBVocabulary;

这里TDescriptor其实就是cv::Mat，FORB是一个类，主要定义对于ORB描述子的各种操作，先不展开分析，用到再说。那么我们就要进到模板类TemplatedVocabulary相应的构造函数进行分析了，具体代码如下：

/**
   * Initiates an empty vocabulary
   * @param k branching factor
   * @param L depth levels
   * @param weighting weighting type
   * @param scoring scoring type
   */
  template<class TDescriptor, class F>
TemplatedVocabulary<TDescriptor,F>::TemplatedVocabulary
  (int k, int L, WeightingType weighting, ScoringType scoring)
  : m_k(k), m_L(L), m_weighting(weighting), m_scoring(scoring),
  m_scoring_object(NULL)
{
  createScoringObject();
}

可以看到又调用了createScoringObject（）函数，其具体代码如下：

/**
   * Creates an instance of the scoring object accoring to m_scoring
   */
template<class TDescriptor, class F>
void TemplatedVocabulary<TDescriptor,F>::createScoringObject()
{
  delete m_scoring_object;
  m_scoring_object = NULL;

  switch(m_scoring)
  {
    case L1_NORM: 
      m_scoring_object = new L1Scoring;
      break;

    case L2_NORM:
      m_scoring_object = new L2Scoring;
      break;

    case CHI_SQUARE:
      m_scoring_object = new ChiSquareScoring;
      break;

    case KL:
      m_scoring_object = new KLScoring;
      break;

    case BHATTACHARYYA:
      m_scoring_object = new BhattacharyyaScoring;
      break;

    case DOT_PRODUCT:
      m_scoring_object = new DotProductScoring;
      break;

  }
}

这里执行的是

case L1_NORM: 
      m_scoring_object = new L1Scoring;
      break;

因为执行函数TemplatedVocabulary(int k = 10, int L = 5, WeightingType weighting = TF_IDF, ScoringType scoring = L1_NORM);的时候有一个默认实参，使m_scoring = L1_NORM；那么就会执行L1Scoring的构造函数，看下代码就会知道，这个类自己并没有定义构造函数，而是使用编译器合成的构造函数，相当于什么都没做。于是ORBVocabulary类内m_scoring_object成员赋值完成，ORBVocabulary类实例化完成，system类内的mpVocabulary成员赋值完成。继续推进……
接下来的代码是

if (has_suffix(strVocFile, ".txt"))
      bVocLoad = mpVocabulary->loadFromTextFile(strVocFile);
    else if(has_suffix(strVocFile, ".bin"))
      bVocLoad = mpVocabulary->loadFromBinaryFile(strVocFile);
    else
      bVocLoad = false;

这里执行的是bVocLoad = mpVocabulary->loadFromTextFile(strVocFile);那么就让我们进去看下。代码如下：

/**
   * Loads the vocabulary from a text file
   * @param filename
   */
template<class TDescriptor, class F>
bool TemplatedVocabulary<TDescriptor,F>::loadFromTextFile(const std::string &filename)
{
    ifstream f;
    f.open(filename.c_str());   // 打开ORBvoc.txt文件

    if(f.eof())
    return false;

    // 清空相应vector
    m_words.clear();
    m_nodes.clear();

    string s;
    getline(f,s);   // 读取文件的第一行
    stringstream ss;
    ss << s;
    ss >> m_k;  // 聚的类数
    ss >> m_L;  // 树的深度
    int n1, n2;
    ss >> n1;   // TF_IDF
    ss >> n2;   // L1_norm

    if(m_k<0 || m_k>20 || m_L<1 || m_L>10 || n1<0 || n1>5 || n2<0 || n2>3)
    {
        std::cerr << "Vocabulary loading failure: This is not a correct text file!" << endl;
        return false;
    }

    m_scoring = (ScoringType)n1;
    m_weighting = (WeightingType)n2;
    createScoringObject();

    // nodes 节点数的计算其实就是一个等比数列
    int expected_nodes =
    (int)((pow((double)m_k, (double)m_L + 1) - 1)/(m_k - 1));
    m_nodes.reserve(expected_nodes);

    // 单词数
    m_words.reserve(pow((double)m_k, (double)m_L + 1));

    m_nodes.resize(1);
    m_nodes[0].id = 0;  //根节点
    while(!f.eof()) // 读取文件的第二行到最后一行，行号减一就是节点ID
    {
        string snode;
        getline(f,snode);   // 读取一行字符
        stringstream ssnode;
        ssnode << snode;

        int nid = m_nodes.size();   // 计算节点ID，第一个节点是根节点，ID为0
        m_nodes.resize(m_nodes.size()+1);
        m_nodes[nid].id = nid;  // 填充节点ID

        int pid ;
        ssnode >> pid;  // 读取每行的第一个数字，父节点ID
        m_nodes[nid].parent = pid;  //存储父节点ID
        m_nodes[pid].children.push_back(nid);   // 同时，当前节点的ID也是父节点的孩子

        int nIsLeaf;
        ssnode >> nIsLeaf;  // 读取每行的第二个数字，这个数字表示是否是叶子，也就是word

        stringstream ssd;
        for(int iD=0;iD<F::L;iD++) // F::L 这里L为32，也就是读取32个字节（8位描述子组成一个字节），以字符串的形式存储在ssd
        {
            string sElement;
            ssnode >> sElement;
            ssd << sElement << " ";
        }
        F::fromString(m_nodes[nid].descriptor, ssd.str());  //注意F是模板参数，这里也就是FORB

        ssnode >> m_nodes[nid].weight;  // 存储节点的权重

        if(nIsLeaf>0)   // nIsLeaf : 1 是word，0 不是word
        {
            int wid = m_words.size();
            m_words.resize(wid+1);  // 大家一定要清楚vector的resize()和reserve()的区别

            m_nodes[nid].word_id = wid; // 记录word ID，每个word的word_id也是唯一的
            m_words[wid] = &m_nodes[nid];   // 构建std::vector<Node*> m_words;也就是存储相应word_id的node指针
        }
        else
        {
            m_nodes[nid].children.reserve(m_k); // 否则就说明这个节点有m_k个孩子
        }
    }

    return true;

}

这段代码的作用是从ORBvoc.txt文件加载字典、构建字典。那么先让我们看看这个文件的数据吧！

10 6 0 0 
#表示上面的k，L，s，w 
0 0 252 188 188 242 169 109 85 143 187 191 164 25 222 255 72 27 129 215 237 16 58 111 219 51 219 211 85 127 192 112 134 34 0 
... 
# 0表示节点的父节点；0表示是否是叶节点，是的话为1，否则为0；252-34表示orb特征，也就是描述子；最后一位是权重

这部分参考的这个链接https://www.jianshu.com/p/cfcdf12a3bb6 。每个特征点的描述子一共是256位，这里是按位存储，一个字节存储8位描述子，所以一共用32个字节表示。这个函数主要就是构建k-d树，那么自然就会有节点，下面我们就来看看节点的定义，代码如下：

/// Tree node
  struct Node 
  {
    /// Node id
    NodeId id;
    /// Weight if the node is a word
    WordValue weight;
    /// Children 
    vector<NodeId> children;
    /// Parent node (undefined in case of root)
    NodeId parent;
    /// Node descriptor
    TDescriptor descriptor;

    /// Word id if the node is a word
    WordId word_id;

    /**
     * Empty constructor
     */
    Node(): id(0), weight(0), parent(0), word_id(0){}

    /**
     * Constructor
     * @param _id node id
     */
    Node(NodeId _id): id(_id), weight(0), parent(0), word_id(0){}

    /**
     * Returns whether the node is a leaf node
     * @return true iff the node is a leaf
     */
    inline bool isLeaf() const { return children.empty(); }
  };

可以看到主要有这么几个重要的成员：节点ID，区别不同节点的标识；weight，权值，如果是word_node的话就会有权值；children，节点的孩子（通过这个vector变量是否为空可以判断该节点是不是单词word_node）；parent，该节点的父节点；descriptor，该节点的描述子，为了加速图像特征点到单词的匹配；word_id，区别word的ID；
这个结构体说完了，我们接着继续分析loadFromTextFile函数，这个函数我都进行了注释，基本就是读取ID什么的一些操作，里面调用了F::fromString(m_nodes[nid].descriptor, ssd.str());函数，注意F是模板参数，这里也就是FORB，那么也就是FORB::fromString(m_nodes[nid].descriptor, ssd.str());进这个函数（温馨提示，如果函数跳转不过去，可以进行全局搜索，也就是在工程内搜索），这个函数的代码如下：

/**
   * Returns a descriptor from a string
   * @param a descriptor
   * @param s string version
   */
void FORB::fromString(FORB::TDescriptor &a, const std::string &s)
{
  a.create(1, FORB::L, CV_8U);  // a为1行32列、CV_8U
  unsigned char *p = a.ptr<unsigned char>();

  stringstream ss(s);
  for(int i = 0; i < FORB::L; ++i, ++p)
  {
    int n;
    ss >> n;

    if(!ss.fail()) 
      *p = (unsigned char)n;  // 将描述子数字读入a
  }

}

这个函数也就是将描述子读入一个node的descriptor变量，也就是cv::Mat变量。好了，fromString这个函数执行完就返回到loadFromTextFile继续执行。接下来主要就是判断是不是叶子节点也就是word，并做相应处理，相信通过看我加的注释应该不难看懂。现在我也是刚刚读代码，最后会把注释完整的代码分享下。loadFromTextFile这个函数执行完之后，主要就是构建了一个vocabulary tree，说白了就是创建node，最后组成字典树，但是大家一定要注意，这个字典树很重要，后面的回环检测全靠他了。也就是大家脑海里一定要有这个树的结构图，一定要在脑子里把代码执行一遍，对这个结构非常了解，能回忆起代码和这个字典树结构。主要就是记住两个vector，std::vector m_nodes和std::vector

//Create KeyFrame Database
    mpKeyFrameDatabase = new KeyFrameDatabase(*mpVocabulary);

这个代码就是创建关键帧数据库，但是具体什么作用我也还不知道，让我们一起学习吧。执行KeyFrameDatabase类相应的构造函数，具体代码如下：

//构造函数，传入的是ORBVocabulary
//关键帧数据库通过预先训练好的词典，维护一个向量 mvInvertedFile
KeyFrameDatabase::KeyFrameDatabase (const ORBVocabulary &voc):
    mpVoc(&voc)
{
    mvInvertedFile.resize(voc.size()); // number of words
}

然后就会执行mvInvertedFile.resize(voc.size())这个语句，跳转到voc.size()就会发现其返回值其实就是word的个数，也就是说mvInvertedFile这个vector的有效容量就是单词个数。其实这个比较重要，希望大家有个印象，后续我会对其进行分析。执行完以后就会返回System.cpp继续往下执行。接下来的代码如下：

//Create the Map
    mpMap = new Map();

    //Create Drawers. These are used by the Viewer
    mpFrameDrawer = new FrameDrawer(mpMap);
    mpMapDrawer = new MapDrawer(mpMap, strSettingsFile);

主要就是创建地图、帧显示、地图显示。大家可以自己进去相应的构造函数看下，很容易看懂。
发现已经写了很多了，虽然大部分贴的是代码。先到这吧，接下来我也得再分析下代码。让我们记住现在还在System.cpp里面，也就是从main.cpp里面实例化System类的一个SLAM对象，还在System的构造函数里面。