视觉SLAM十四讲——ch7

最新推荐文章于 2024-09-11 22:47:39 发布

weixin_58021155

最新推荐文章于 2024-09-11 22:47:39 发布

阅读量4.3k

点赞数 2

分类专栏：视觉SLAM 文章标签：计算机视觉几何学视觉检测

本文链接：https://blog.csdn.net/weixin_58021155/article/details/123496372

版权

在这里插入图片描述

视觉SLAM十四讲——ch7

ch7视觉里程计

本章目标：

1.理解图像特征点的意义，并掌握在单副图像中提取出特征点及多副图像中匹配特征点的方法
2.理解对极几何的原理，利用对极几何的约束，恢复出图像之间的摄像机的三维运动
3.理解PNP问题，以及利用已知三维结构与图像的对应关系求解摄像机的三维运动
4.理解ICP问题，以及利用点云的匹配关系求解摄像机的三维运动
5.理解如何通过三角化获得二维图像上对应点的三维结构

本章目的：基于特征点法的vo，将介绍什么是特征点，如何提取和匹配特征点，以及如何根据配对的特征点估计运动和场景结构，从而实现两帧间视觉里程计。

1 特征点

角点、SIFT（尺度不变特征变换）、SURF、ORB（后三者为人为设计，具有更多优点）。

1.1 特征点的组成

1 关键点：指特征点在图像里的位置

2 描述子：通常是一个向量，按照某种人为设计的方式，描述了该关键点周围像素的信息。相似的特征应该有相似的描述子（即当两个特征点的描述子在向量空间上的距离相近，认为这两个特征点是一样的）。

1.2 ORB特征

ORB特征：OrientedFAST关键点+BRIEF描述子。提取ORB特征的步骤：

1 .2.1提取FAST角点：

找出图中的角点，计算特征点的主方向，为后续BRIEF描述子增加了旋转不变特性。（FAST角点主要检测局部像素灰度变化明显的地方）。它的检测过程如下：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-C6zkqAV4-1647311685760)(视觉SLAM十四讲——ch7.assets/14d5abad89ab841faeebd7534c42e42-16462760426531.png)]

特点：速度快
缺点： 1).FAST特征点数量很大且不确定，但是我们希望对图像提取固定数量的特征
2).FAST角点不具有方向信息，并且存在尺度问题

解决方式：1).指定要提取的角点数量N，对原始FAST角点分别计算Harris响应值，然后选取前N个具有最大响应值的角点作为最终的角点集合
2).添加尺度和旋转的描述

尺度不变性的实现：构建图像金字塔，并在金字塔的每一层上检测角点（金字塔：指对图像进行不同层次的降采样，以获得不同分辨率的图像）
特征旋转的实现：灰度质心法（质心：指以图像块灰度值作为权重的中心）

1.2.2.计算BRIEF描述子

对前一步提取出的特征点周围图像区域进行扫描
特点：使用随机选点的比较，速度非常快，由于使用了二进制表达，存储起来也十分方便，适用于实时的图像匹配。

在不同图像之间进行特征匹配的方法：
1.暴力匹配：浮点类型的描述子，使用欧式距离度量
二进制类型的描述子（比如本例中的BRIEF描述子），使用汉明距离度量
缺点：当特征点数量很大时，暴力匹配法的运算量会变得很大

2.快速近似最近邻（FLANN）：适合匹配特征点数量极多的情况。

1.2.3 图像特征提取代码详解

#include<iostream>
#include<opencv2/core/core.hpp>
#include<opencv2/features2d/features2d.hpp>
#include<opencv2/highgui/highgui.hpp>
#include<chrono>

using namespace std;
using namespace cv;

int main(int argc, char **argv)
{
   
   /*
    if (argc != 3)
    {
        cout <<"usage:feature_extraction img1 img2 "<< endl;
        return 1;
    }
    */
    //读取图像
    Mat img_1 = imread("/home/tgc/slambook2/CH7/1.png",1);
    Mat img_2 = imread("/home/tgc/slambook2/CH7/2.png",1);

    //断言，用于代码的检查
    assert(img_1.data != nullptr && img_2.data != nullptr); 
    
    //初始化
    std::vector<KeyPoint> keypoint_1, keypoint_2; //声明关键点名字和类型
    Mat description_1, description_2;//声明描述子的名字和类型
    Ptr<FeatureDetector> detector = ORB::create();  //ORB创建关键点类
    Ptr<DescriptorExtractor> descriptor = ORB::create();//ORB创建描述子类
    Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");//用于匹配关键点描述符的抽象基类

    //第一步，检测oriented fast角点位置
    chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
    detector->detect(img_1, keypoint_1);
    detector->detect(img_2, keypoint_2);

    //第二步，根据角点位置计算brief描述子
    descriptor->compute(img_1, keypoint_1, description_1);
    descriptor->compute(img_2, keypoint_2, description_2);
    chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
    chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2-t1);
    cout<<"time cost:"<< time_used.count()<<endl;

    Mat outimg1;
    drawKeypoints(img_1,keypoint_1,outimg1,Scalar::all(-1),DrawMatchesFlags::DEFAULT);
    imshow("orb_features",outimg1);

    //第三步。对两幅图像中的brief描述子进行匹配，使用hamming距离
    vector<DMatch> matches;
    chrono::steady_clock::time_point t3 = chrono::steady_clock::now();
    matcher->match(description_1,description_2,matches);//描述子匹配，
    chrono::steady_clock::time_point t4 = chrono::steady_clock::now();
    chrono::duration<double> time_used_match = chrono::duration_cast<chrono::duration<double>>(t4-t3);
    cout<<"match time cost :"<< time_used_match.count()<<endl;

    //第四步，匹配点对筛选
    //计算最小和最大距离
    auto min_max = minmax_element(matches.begin(),matches.end());//指向最小和最大的迭代器，begin返回指向的第一个读/写迭代器
    double min_dist = min_max.first->distance;
    double max_dist = min_max.second->distance;

    printf("--max dist:%f\n",max_dist);
    printf("--min dist:%f\n",min_dist);

    //当描述子之间的距离大于两倍最小的距离时，即认为匹配有误、
    //但有时候最小距离会非常小。设置一个经验值30作为下限
    vector<DMatch> good_match;
    for (int i = 0; i < description_1.rows; i++)
    {
   
        if (matches[i].distance <= max(2* min_dist,30.0))
        {
   
            good_match.push_back(matches[i]);
        }
    }

    //第五步。绘制匹配结果
    Mat img_match;
    Mat img_goodmatch;
    drawMatches(img_1,keypoint_1,img_2,keypoint_2,matches,img_match);
    drawMatches(img_1,keypoint_1,img_2,keypoint_2,good_match,img_goodmatch);
    imshow("match",img_match);
    imshow("good_match",img_goodmatch);
    waitKey(0);
    
    return 0;
    
}

1.2.4 结果分析

图片1：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-UecP7jPi-1647311685761)(视觉SLAM十四讲——ch7.assets/1372381-20180727153341182-1525732735-16462766616633.png)]

图片2

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-gR0Rrr8H-1647311685762)(视觉SLAM十四讲——ch7.assets/1372381-20180727153414203-226154175-16462766676635.png)]

暴力匹配：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-V4DLqjl6-1647311685762)(视觉SLAM十四讲——ch7.assets/1372381-20180727153506919-387915385-16462767832818.png)]

设置30为下限的经验值匹配：

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-370Tx8F1-1647311685762)(视觉SLAM十四讲——ch7.assets/1372381-20180727153635299-359755985-16462767875569.png)]

结果分析：筛选的依据是汉明距离小于最小距离的两倍，这是工程经验方法，但还是有误匹配，所以后面在估计运动时，需要去除误匹配的算法。

2 2D-2D ：对极几何约束

从两张图片中得到一对匹配好的特征点，如果有若干对这样的匹配点，就可以通过二维图像对应的点，计算出两帧之间相机的运动（R、t）。

2.1单目相机——对极几何求解

如果是单目相机：只能得到2D的像素坐标->根据两组2D点估计运动，方法：对极几何

步骤如下： 1).根据配对点的像素位置求出本质矩阵E或基础矩阵F

2).根据E或F求出R，t。

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-y4DJKzn0-1647311685763)(视觉SLAM十四讲——ch7.assets/c1407a05f0d0e6a9ed05d717a2ab762-164627824431811-164627824612812.png)]

2.1.1本质矩阵E：

1).E在不同尺度下是等价的

2).本质矩阵的内在性质：E的奇异值一定是[σ,σ,0]T的形式

3).由于尺度等价性，E实际上有5个自由度。

求解方法：八点法(Eight-point -algorithm)。求出E后，对其分解->R,t

2.1.2 单应矩阵（Homograohy）H：

描述了两个平面之间的映射关系。如果场景中的特征点都落在同一个平面上（墙、地面等），可通过单应性来估计运动

求解方法：直接线性变换法（Direct Linear Transform）。求出H后，对其分解（分解方法：数值法、解析法）->R,t

2.2 对极约束求解相机运动代码详解

#include<iostream>
#include<opencv2/core/core.hpp>
#include<opencv2/features2d/features2d.hpp>
#include<opencv2/highgui/highgui.hpp>
#include<opencv2/calib3d/calib3d.hpp>

using namespace std;
using namespace cv;
//声明特征匹配函数
void find_feature_matches(
    const Mat &img_1, const Mat &img_2,
    std::vector<KeyPoint> &keyponts_1,
    std::vector<KeyPoint> &keyponts_2,
    std::vector<DMatch> &matches
);
//声明位姿估计函数
void pose_estimation_2d2d(
    std::vector<KeyPoint> keypoints_1,
    std::vector<KeyPoint> keypoints_2,
    std::vector<DMatch> matches,
    Mat &R, Mat &t
);
//像素坐标公式，公式5.5变换一下就ok
// 像素坐标转相机归一化坐标
Point2d pixel2cam(const Point2d &p, const Mat &K); //p是像素坐标，k是相机内参

int main(int argc, char **argv) {
   
  if (argc != 3) {
   
    cout << "usage: pose_estimation_2d2d img1 img2" << endl;
    return 1;
  }
  //-- 读取图像
  Mat img_1 = imread(argv[1], CV_LOAD_IMAGE_COLOR);//读取图片的路径，及彩色图
  Mat img_2 = imread(argv[2], CV_LOAD_IMAGE_COLOR);
  assert(img_1.data && img_2.data && "Can not load images!");
//定义要用到的变量 ，如keypoints_1 matches
  vector<KeyPoint> keypoints_1, keypoints_2;
  vector<DMatch> matches;
  //计算两幅图片之间的匹配特征点
  find_feature_matches(img_1, img_2, keypoints_1, keypoints_2, matches);
  cout << "一共找到了" << matches.size() << "组匹配点" << endl;

  //-- 估计两张图像间运动
  Mat R, t;//定义旋转和平移
  pose_estimation_2d2d(keypoints_1, keypoints_2, matches, R, t);//计算旋转和平移
  
  //当然还需要验证以下R和t是否满足对极约束
  //-- 验证E=t^R*scale
  Mat t_x =//t_x是反对称矩阵，公式3.3
    (Mat_<double>(3, 3) << 0, -t.at<double>(2, 0), t.at<double>(1, 0),
      t.at<double>(2, 0), 0, -t.at<double>(0, 0),
      -t.at<double>(1, 0), t.at<double>(0, 0), 0);

  cout << "t^R=" << endl << t_x * R << endl;//E=t^R,公式7.10

  //-- 验证对极约束
  //queryIdx是查询图像的描述子和特征点的下标，
  //trainIdx是训练图像的描述子和特征点的下标。
  Mat K = (Mat_<double>(3, 3) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1);//定义内参矩阵
  for (DMatch m: matches) {
   
    Point2d pt1 = pixel2cam(keypoints_1[m.queryIdx].pt, K);
    Mat y1 = (Mat_<double>(3, 1) << pt1.x, pt1.y, 1);
    Point2d pt2 = pixel2cam(keypoints_2[m.trainIdx].pt, K);
    Mat y2 = (Mat_<double>(3, 1) << pt2.x, pt2.y, 1);
    Mat d = y2.t() * t_x * R * y1;//y2.t()是转置矩阵，公式7.8，若d很趋近于零，则说明没啥问题
    cout << "epipolar constraint = " << d << endl;
  }
  return 0;
}
//函数实现
//匹配函数
void find_feature_matches(const Mat &img_1, const Mat &img_2,
                          std::vector<KeyPoint> &keypoints_1,
                          std::vector<KeyPoint> &keypoints_2,
                          std::vector<DMatch> &matches) {
   
  //先初始化，创建咱们要用到的对象
    //定义两个关键点对应的描述子，同时创建检测keypoints的检测器
  Mat descriptors_1, descriptors_2;
  Ptr<FeatureDetector> detector = ORB::create();//keypoints检测器
  Ptr<DescriptorExtractor> descriptor = ORB::create();//描述子提取器
  Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");//描述子匹配器（方法：暴力匹配）
  //-- 第一步:检测 Oriented FAST 角点位置
  detector->detect(img_1, keypoints_1);//得到图1的关键点（keypoints_1）
  detector->detect(img_2, keypoints_2);//得到图2的关键点（keypoints_2）

  //-- 第二步:根据角点位置计算 BRIEF 描述子
  descriptor->compute(img_1, keypoints_1, descriptors_1);//得到descriptors_1
  descriptor->compute(img_2, keypoints_2, descriptors_2);//得到descriptors_2

  //-- 第三步:对两幅图像中的BRIEF描述子进行匹配，使用 Hamming 距离
  vector<DMatch> match;//暂时存放匹配点,因为后面还要进行筛选
  matcher->match(descriptors_1, descriptors_2, match);

  //-- 第四步:对match进行筛选，得到好的匹配点，把好的匹配点放在matches中
  //先定义两个变量，一个是最大距离，一个是最小距离
  double min_dist = 10000, max_dist = 0;

  //找出所有匹配之间的最小距离和最大距离, 即是最相似的和最不相似的两组点之间的距离
  for (int i = 0; i < descriptors_1.rows; i++) {
   //描述子本质是由 0,1 组成的向量
    double dist = match[i].distance;
    if (dist < min_dist) min_dist = dist;
    if (dist > max_dist) max_dist = dist;
  }

  printf("-- Max dist : %f \n", max_dist);
  printf("-- Min dist : %f \n", min_dist);

  //当描述子之间的距离大于两倍的最小距离时,即认为匹配有误.但有时候最小距离会非常小,设置一个经验值30作为下限.
  for (int i = 0; i < descriptors_1.rows; i++) {
   
    if (match[i].distance <= max(2 * min_dist, 30.0)) {
   
      matches.push_back(match[i]);
    }
  }
}

Point2d pixel2cam(const Point2d &p, const Mat &k)//p是像素坐标，k是相机内参
{
   
    return Point2d(
        (p.x - k.at<double>(0,2))/ k.at<double>(0,0),//世界坐标系的X
        (p.y - k.at<double>(1,2) )/ k.at<double>(1,1)//世界坐标系的Y，其中Z被归一化
    );
}
//实现位姿估计函数
void pose_estimation_2d2d(std::vector<KeyPoint> keypoints_1, 
                                                            std::vector<KeyPoint> keypoints_2,
                                                            std::vector<DMatch> matches,
                                                            Mat &R, Mat &t)
{
   
    //相机内参
    Mat K =(Mat_<double> (3,3)<<520.9,0,325.1,0,521.0,249.7,0,0,1);

    //把匹配点转换为float点形式，咱们要把关键点的像素点拿出来 ,定义两个容器接受两个图关键点的像素位置
    vector<Point2f> points1;  // 计算本质矩阵和基础矩阵需要浮点类型的数据
    vector<Point2f> points2; 

    for (int i = 0; i < (int)matches.size(); i++)
    {
   
        //queryIdx是图1中匹配的关键点的对应编号
        //trainIdx是图2中匹配的关键点的对应编号
        //pt可以把关键点的像素位置取出来
        points1.push_back(keypoints_1[matches[i].queryIdx].pt);
        points2.push_back(keypoints_2[matches[i].trainIdx].pt);
    }

    //计算基础矩阵
    Mat fundamental_matrix;
    fundamental_matrix = findFundamentalMat(points1, points2,CV_FM_8POINT);
    cout<<"fundenmatal matrix" <<fundamental_matrix<<endl;

    //计算本质矩阵
    Point2d principal_point(325.1, 249.7); //相机光心，//把内参K中的cx ,cy放进一个向量里面 =相机的光心
    double focal_length = 521; // 相机焦距
    //之所以取上面的principal_point、focal_length是因为计算本质矩阵的函数要用
    //得到本质矩阵essential_matrix
    Mat essential_matrix;
    essential_matrix = findEssentialMat(points1,points2,focal_length,principal_point);
    cout<<"essential matrix" <<essential_matrix<<endl;
   
   //计算单应矩阵
   Mat homography_matrix;
   homography_matrix = findHomography(points1,points2,RANSAC,3);
    cout<<"homography matrix" <<homography_matrix<<endl;
    
    //从本质矩阵中恢复旋转和平移信息
    //此函数仅在opencv3中提供
    recoverPose(essential_matrix,points1,points2,R,t,focal_length,principal_point);
    cout<< "R is"<< R << endl;
    cout<< "t is"<< t << endl;
}

2.3 结果分析

-- Max dist : 95.000000 
-- Min dist : 4.000000 
一共找到了79组匹配点
fundamental_matrix is 
[4.844484382466111e-06, 0.0001222601840188731, -0.01786737827487386;
 -0.0001174326832719333, 2.122888800459598e-05, -0.01775877156212593;
 0.01799658210895528, 0.008143605989020664, 1]
essential_matrix is 
[-0.02036185505234771, -0.4007110038118444, -0.033240742498241;
 0.3939270778216368, -0.03506401846698084, 0.5857110303721015;
 -0.006788487241438231, -0.5815434272915687, -0.01438258684486259]
homography_matrix is 
[0.9497129583105288, -0.143556453147626, 31.20121878625771;
 0.04154536627445031, 0.9715568969832015, 5.306887618807696;
 -2.81813676978796e-05, 4.353702039810921e-05, 1]
R is 
[0.9985961798781877, -0.05169917220143662, 0.01152671359827862;
 0.05139607508976053, 0.9983603445075083, 0.02520051547522452;
 -0.01281065954813537, -0.02457271064688494, 0.9996159607036126]
t is 
[-0.8220841067933339;
 -0.0326974270640541;
 0.5684264241053518]
t^R=
[-0.02879601157010514, -0.5666909361828475, -0.0470095088643642;
 0.5570970160413605, -0.04958801046730488, 0.8283204827837457;
 -0.009600370724838811, -0.8224266019846685, -0.02034004937801358]
epipolar constraint = [0.002528128704106514]
epipolar constraint = [-0.001663727901710814]
epipolar constraint = [-0.0008009088410885212]
epipolar constraint = [0.0001705869410470254]
epipolar constraint = [-0.0003338015008984979]
epipolar constraint = [0.0003385525272308065]
epipolar constraint = [0.0001729349818584552]
epipolar constraint = [-9.552408320477601e-06]
epipolar constraint = [-0.0008834408754688165]
epipolar constraint = [-0.000444586092781381]
epipolar constraint = [-0.000156067923593961]
epipolar constraint = [0.001512967129777068]
epipolar constraint = [-0.0002644334828964742]
epipolar constraint = [-3.514414351252562e-06]
epipolar constraint = [-0.0004170632811044614]
epipolar constraint = [-0.0007749589896892117]
epipolar constraint = [0.002091463454860276]
epipolar constraint = [-0.001016195254389909]
epipolar constraint = [0.0005870797511206075]
epipolar constraint = [0.0002701337927295891]
epipolar constraint = [-0.0008153290073634545]
epipolar constraint = [0.005595329855570208]
epipolar constraint = [0.004456949384590653]
epipolar constraint = [-0.00109225844630996]
epipolar constraint = [-0.00122027527435923]
epipolar constraint = [0.0001206258081121597]
epipolar constraint = [-0.0007167735646266809]
epipolar constraint = [0.002034045481378033]
epipolar constraint = [-0.001256283161205782]
epipolar constraint = [-2.841028832301085e-06]
epipolar constraint = [-0.0009452266349239957]
epipolar co