[Learning SLAM] Recovering R and t from F or E: a comparison of ORB_SLAM2 and OpenCV

苏源流

Categories: Visual SLAM, Motion Estimation, SLAM

Article link: https://blog.csdn.net/KYJL888/article/details/104491732

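The main purpose of this article is to compare how R and t are recovered in SLAM (ORB_SLAM2) and in OpenCV, and to analyze how the two differ and how they relate, in both algorithm and code.

The ORB_SLAM2 code that computes F21: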

// Estimate F21 from N matched points with the linear eight-point method.
cv::Mat Initializer::ComputeF21(const vector<cv::Point2f> &vP1,const vector<cv::Point2f> &vP2)
{
    const int N = vP1.size();

    // One row per match; each row encodes x2^T * F21 * x1 = 0.
    cv::Mat A(N,9,CV_32F);

    for(int i=0; i<N; i++)
    {
        const float u1 = vP1[i].x;
        const float v1 = vP1[i].y;
        const float u2 = vP2[i].x;
        const float v2 = vP2[i].y;

        A.at<float>(i,0) = u2*u1;
        A.at<float>(i,1) = u2*v1;
        A.at<float>(i,2) = u2;
        A.at<float>(i,3) = v2*u1;
        A.at<float>(i,4) = v2*v1;
        A.at<float>(i,5) = v2;
        A.at<float>(i,6) = u1;
        A.at<float>(i,7) = v1;
        A.at<float>(i,8) = 1;
    }

    cv::Mat u,w,vt;

    // The solution is the right singular vector of A with the smallest singular value.
    cv::SVDecomp(A,w,u,vt,cv::SVD::MODIFY_A | cv::SVD::FULL_UV);

    // Reshape the 9-vector (row-major) into a 3x3 matrix.
    cv::Mat Fpre = vt.row(8).reshape(0, 3);

    // Enforce rank 2 by zeroing the smallest singular value of Fpre.
    cv::SVDecomp(Fpre,w,u,vt,cv::SVD::MODIFY_A | cv::SVD::FULL_UV);

    w.at<float>(2)=0;

    return  u*cv::Mat::diag(w)*vt;
}

What F21 actually means:
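
Reading the rows of A in ComputeF21 above, each match (u1, v1) ↔ (u2, v2) contributes one linear equation in the nine entries of F21. The expansion below is reconstructed from that code rather than quoted from anywhere, so treat it as my reading of it: F21 takes a point of image 1 to its epipolar line in image 2.

```latex
\begin{aligned}
\begin{bmatrix} u_2 & v_2 & 1 \end{bmatrix}
F_{21}
\begin{bmatrix} u_1 \\ v_1 \\ 1 \end{bmatrix} &= 0, \\[4pt]
\begin{bmatrix} u_2u_1 & u_2v_1 & u_2 & v_2u_1 & v_2v_1 & v_2 & u_1 & v_1 & 1 \end{bmatrix}
\mathbf{f} &= 0,
\qquad
\mathbf{f} = (f_{11}, f_{12}, f_{13}, f_{21}, f_{22}, f_{23}, f_{31}, f_{32}, f_{33})^{T}.
\end{aligned}
```

Stacking one such row per match gives A f = 0, which is why the code takes the last row of vt and reshapes it row by row into a 3x3 matrix.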

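Summary

ORB_SLAM2::Initializer is used for initialization in the monocular case.

The Initializer constructor receives the first image, called the reference frame (rFrame). When the second image arrives it is passed in as well and is called the current frame (cFrame). The code that passes them in can be found in ORB_SLAM2::Tracking::MonocularInitialization(), which requires rFrame and cFrame to each have at least 101 feature points, and the coarse matching between cFrame and rFrame to yield no fewer than 100 point pairs. The coarse matching is interesting in its own right; see ORBmatcher::SearchForInitialization(). For every feature point in rFrame it picks a window of a given size centered on that point's rFrame coordinates, collects all cFrame feature points in the grid cells the window covers, and computes the ORB descriptor distance to each; a sufficiently small distance counts as a match.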
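
To make the coarse matching idea concrete, here is a minimal standard-library sketch. It is not the ORB_SLAM2 code: the names Feature, hammingDistance and coarseMatch are made up for illustration, the window search is a brute-force loop instead of a grid lookup, and the window size and distance limit are left to the caller.

```cpp
#include <array>
#include <bitset>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// A 256-bit binary descriptor stored as four 64-bit words (hypothetical layout).
using Descriptor = std::array<std::uint64_t, 4>;

struct Feature {
    float x;
    float y;
    Descriptor desc;
};

// Hamming distance: the number of bits in which two descriptors differ.
int hammingDistance(const Descriptor& a, const Descriptor& b) {
    int dist = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dist += static_cast<int>(std::bitset<64>(a[i] ^ b[i]).count());
    }
    return dist;
}

// For each reference feature, look at the current-frame features that fall
// inside a square window centered on the reference pixel position and keep
// the closest descriptor, provided its distance is at most maxDistance.
// Returns, per reference feature, an index into cur or -1 for "no match".
std::vector<int> coarseMatch(const std::vector<Feature>& ref,
                             const std::vector<Feature>& cur,
                             float windowSize, int maxDistance) {
    std::vector<int> matches(ref.size(), -1);
    for (std::size_t i = 0; i < ref.size(); ++i) {
        int best = maxDistance + 1;
        for (std::size_t j = 0; j < cur.size(); ++j) {
            if (std::fabs(cur[j].x - ref[i].x) > windowSize ||
                std::fabs(cur[j].y - ref[i].y) > windowSize) {
                continue;  // outside the search window
            }
            const int d = hammingDistance(ref[i].desc, cur[j].desc);
            if (d < best) {
                best = d;
                matches[i] = static_cast<int>(j);
            }
        }
    }
    return matches;
}
```

A real matcher would also reject ambiguous candidates; this sketch only shows the window plus descriptor-distance part.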

Initializer::Initialize()

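Initialization is completed in this function. It first generates mvSets, the collection of minimal subsets that RANSAC needs. It then starts two threads that run FindHomography and FindFundamental in parallel. These two functions return the scores SH and SF respectively, which are used to decide whether H or F is the better initial model.

Once SH and SF have settled the choice between H and F, R and t are recovered from H (ReconstructH()) or from F (ReconstructF()), together with the matched points that can be triangulated, and these are used to build the initial map.

ReconstructH() uses the method of "Motion and structure from motion in a piecewise planar environment" to generate 8 candidate solutions, then uses CheckRT() to determine which one fits best.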

Initializer::FindHomography()

Each iteration uses 8 points and computes H21 through SVD. Note that the points are normalized (in the function Normalize()) before the homography is computed.

Initializer::Normalize()

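Normalization applies an affine transformation to all KeyPoints so that the transformed KeyPoints have mean 0 (the origin) and unit spread along each axis (loosely, identity covariance I; the ORB_SLAM2 code scales by the mean absolute deviation rather than the standard deviation).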
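
A minimal sketch of this normalization, written against the standard library only. The names Point2, Normalized and normalizePoints are hypothetical, and the choice of the mean absolute deviation as the scale follows my reading of the ORB_SLAM2 source; it assumes the points do not all share the same x or the same y.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

struct Point2 {
    float x;
    float y;
};

struct Normalized {
    std::vector<Point2> points;             // normalized coordinates
    std::array<std::array<float, 3>, 3> T;  // normalized = T * [x, y, 1]^T
};

// Shift the points to zero mean, then scale each axis so that the mean
// absolute deviation along that axis becomes 1.
Normalized normalizePoints(const std::vector<Point2>& pts) {
    assert(!pts.empty());
    const float n = static_cast<float>(pts.size());

    float meanX = 0.f, meanY = 0.f;
    for (const Point2& p : pts) {
        meanX += p.x;
        meanY += p.y;
    }
    meanX /= n;
    meanY /= n;

    Normalized out;
    out.points.reserve(pts.size());
    float devX = 0.f, devY = 0.f;
    for (const Point2& p : pts) {
        const Point2 centered{p.x - meanX, p.y - meanY};
        out.points.push_back(centered);
        devX += std::fabs(centered.x);
        devY += std::fabs(centered.y);
    }
    devX /= n;
    devY /= n;

    const float sX = 1.f / devX;
    const float sY = 1.f / devY;
    for (Point2& q : out.points) {
        q.x *= sX;
        q.y *= sY;
    }

    // The same operation written as a 3x3 matrix, needed later to undo it.
    out.T[0] = {{sX, 0.f, -meanX * sX}};
    out.T[1] = {{0.f, sY, -meanY * sY}};
    out.T[2] = {{0.f, 0.f, 1.f}};
    return out;
}
```

The matrix T is kept because a model estimated on normalized points has to be mapped back to pixel coordinates: with T1 and T2 the transforms of the two images, H21 = T2^-1 * Hn * T1 and F21 = T2^T * Fn * T1.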

Initializer::ComputeH21()

After normalization, the homography is computed from the normalized coordinates.

There is not much to explain here: it is the Direct Linear Transformation (DLT); see MVG, page 88.
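
For readers without the book at hand, this is the standard DLT setup as I understand it, with h the nine entries of H21 in row-major order. Each match (u1, v1) ↔ (u2, v2) gives two independent equations, obtained from the first two components of x2 × (H21 x1) = 0:

```latex
\begin{bmatrix}
0 & 0 & 0 & -u_1 & -v_1 & -1 & v_2u_1 & v_2v_1 & v_2 \\
u_1 & v_1 & 1 & 0 & 0 & 0 & -u_2u_1 & -u_2v_1 & -u_2
\end{bmatrix}
\mathbf{h} = \mathbf{0},
\qquad
\mathbf{h} = (h_{11}, h_{12}, h_{13}, h_{21}, h_{22}, h_{23}, h_{31}, h_{32}, h_{33})^{T}.
```

Stacking the rows of all matches gives A h = 0, and h is taken as the right singular vector of A with the smallest singular value, the same SVD step used for F21.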

Initializer::CheckHomography()

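H21 and H12 are used to transfer the feature points of rFrame (1) and of cFrame (2) into the other image, and the distance error between matched points is computed. The distance error is converted to a chi-square distance. A chi-square distance below 5.991 corresponds to a 5% significance level, so the pair is accepted as a match; otherwise the match fails and the pair is marked false. Note that there are two degrees of freedom here.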

https://en.wikipedia.org/wiki/Chi-squared_distribution#Table_of_.CF.872_values_vs_p-values

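For each successful match, a value tied to the significance is added to the score; the higher the score, the more accurate the homography computed from these eight points. This is the currentScore returned by CheckHomography(), and the homography with the highest score over all RANSAC iterations is selected as the final one.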

This score is passed to Initialize() as SH and used to compute SH/(SH+SF), which decides whether to use the homography or the fundamental matrix.
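
The decision itself is tiny. A sketch, where InitModel and selectModel are hypothetical names and the 0.40 default is the threshold I remember from the ORB_SLAM2 source, so treat that number as an assumption:

```cpp
enum class InitModel { Homography, Fundamental };

// Pick the model from the two RANSAC scores. Assumes SH + SF > 0.
// The default threshold is an assumption, not a value stated in this article.
InitModel selectModel(double SH, double SF, double ratioThreshold = 0.40) {
    const double RH = SH / (SH + SF);
    return RH > ratioThreshold ? InitModel::Homography : InitModel::Fundamental;
}
```

For example, selectModel(60.0, 40.0) picks the homography and selectModel(30.0, 70.0) picks the fundamental matrix.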

Initializer::ReconstructH()

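The homography found by RANSAC in FindHomography() is decomposed. The decomposition yields 8 possible (R, t) solutions, and CheckRT() is used to decide which one to select.

This looks slightly questionable to me: the homography should arguably be re-estimated from all inlier matches, and that more reliable homography then decomposed to obtain R and t.

ReconstructH() also outputs the successfully triangulated 3D points at the end.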

Initializer::FindFundamental()

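FindFundamental() works much like FindHomography(); both require the normalization step.

ComputeF21() computes the fundamental matrix with the eight-point method. The raw result is projected onto the space of valid (rank-2) fundamental matrices by setting its smallest singular value to 0, and that is what the function returns.

CheckFundamental() uses the point-to-epipolar-line distance as the error and converts it to a chi-square distance. Note that there is one degree of freedom here, so the chi-square threshold used for the significance test is 3.84.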

The rest is much the same, so there is little more to add.

Initializer::CheckRT()

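This function matters a great deal: decomposing H or F produces many candidate solutions, and this function tells which of them are reliable.

CheckRT() takes R, t and a set of successful matches. It reports how many of those matches can be triangulated correctly under this R, t (that is, with Z greater than 0 in both cameras), and outputs the triangulated 3D points.

If a triangulated 3D point has Z less than or equal to 0 and the triangulation parallax angle (whose cosine is cosParallax) is not too small, the point was triangulated incorrectly and is discarded.

A point that passes the Z test is then projected into both images and the reprojection error in pixels is computed; if the error exceeds twice the standard error, the point is discarded.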
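
The depth-and-parallax part of that test can be sketched like this. It assumes the convention X2 = R * X1 + t, so camera 1 sits at the origin and camera 2 at -R^T t; the names Vec3, Mat3 and hasValidDepth are hypothetical, and the 0.99998 default is the cosine limit I recall from the ORB_SLAM2 source, so treat it as an assumption.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// M * v
Vec3 multiply(const Mat3& M, const Vec3& v) {
    return Vec3{{dot(M[0], v), dot(M[1], v), dot(M[2], v)}};
}

// M^T * v
Vec3 multiplyTransposed(const Mat3& M, const Vec3& v) {
    Vec3 out{{0.0, 0.0, 0.0}};
    for (int r = 0; r < 3; ++r)
        for (int c = 0; c < 3; ++c)
            out[c] += M[r][c] * v[r];
    return out;
}

// X1 is a triangulated point in the camera 1 frame. The point is rejected
// when its depth is non-positive in either camera AND the parallax is large
// enough for that depth to be trustworthy.
bool hasValidDepth(const Mat3& R, const Vec3& t, const Vec3& X1,
                   double maxCosParallax = 0.99998) {
    const Vec3 RtT = multiplyTransposed(R, t);
    const Vec3 O2{{-RtT[0], -RtT[1], -RtT[2]}};  // camera 2 center in frame 1
    const Vec3 ray1 = X1;                         // from camera 1 (origin)
    const Vec3 ray2{{X1[0] - O2[0], X1[1] - O2[1], X1[2] - O2[2]}};

    const double cosParallax =
        dot(ray1, ray2) / std::sqrt(dot(ray1, ray1) * dot(ray2, ray2));
    const bool enoughParallax = cosParallax < maxCosParallax;

    if (X1[2] <= 0.0 && enoughParallax) return false;

    const double z2 = multiply(R, X1)[2] + t[2];  // depth in camera 2
    if (z2 <= 0.0 && enoughParallax) return false;

    return true;
}
```

Note the consequence of the "and" in the condition: a point with almost no parallax is not rejected here even if its depth comes out negative, because its depth is too poorly determined to judge. In this sketch such a point would have to be filtered by the reprojection test instead.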

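Summary

ORB_SLAM2 is very careful with any result chosen by taking a "maximum": it generally requires that maximum to be outstanding. For example, the ORBmatcher constructor takes a value in (0,1) and stores it in the member variable mfNNratio; a match is accepted only if the smallest distance is less than mfNNratio times the second-smallest distance. Similar candidates are not tolerated, and a match that is only marginally better than the runner-up is not taken as the result.

In Initializer::ReconstructH(), among the final 8 candidate solutions, the best model's inlier count must exceed 1/0.75 times that of the second-best model.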
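
Both checks in this summary follow the same pattern: the winner must clearly beat the runner-up. A minimal sketch of the two variants, with hypothetical function names (bestDistinctMatch, bestModelIsDistinct); only the ratio idea comes from the text above.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Descriptor matching, smaller is better: accept the best candidate only if
// its distance is below nnRatio times the second-best distance.
// Returns the index of the accepted candidate, or -1.
int bestDistinctMatch(const std::vector<int>& distances, float nnRatio) {
    int best = std::numeric_limits<int>::max();
    int second = std::numeric_limits<int>::max();
    int bestIdx = -1;
    for (std::size_t i = 0; i < distances.size(); ++i) {
        const int d = distances[i];
        if (d < best) {
            second = best;
            best = d;
            bestIdx = static_cast<int>(i);
        } else if (d < second) {
            second = d;
        }
    }
    if (bestIdx < 0) return -1;
    const bool distinct =
        static_cast<float>(best) < nnRatio * static_cast<float>(second);
    return distinct ? bestIdx : -1;
}

// Model selection, larger is better: the runner-up must have fewer than
// `ratio` times the inliers of the best model (0.75 in the text above).
bool bestModelIsDistinct(int bestInliers, int secondBestInliers,
                         float ratio = 0.75f) {
    return static_cast<float>(secondBestInliers) <
           ratio * static_cast<float>(bestInliers);
}
```

With distances {40, 90, 70} and a ratio of 0.6 the first candidate is accepted (40 < 0.6 * 70); with {40, 45, 70} nothing is accepted because the two best candidates are too similar.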

OpenCV: the 8-point and 5-point methods

OpenCV findFundamentalMat (8-point method)

/** @brief Calculates a fundamental matrix from the corresponding points in two images.

@param points1 Array of N points from the first image. The point coordinates should be
floating-point (single or double precision).
@param points2 Array of the second image points of the same size and format as points1 .
@param method Method for computing a fundamental matrix.
-   **CV_FM_7POINT** for a 7-point algorithm. \f$N = 7\f$
-   **CV_FM_8POINT** for an 8-point algorithm. \f$N \ge 8\f$
-   **CV_FM_RANSAC** for the RANSAC algorithm. \f$N \ge 8\f$
-   **CV_FM_LMEDS** for the LMedS algorithm. \f$N \ge 8\f$
@param param1 Parameter used for RANSAC. It is the maximum distance from a point to an epipolar
line in pixels, beyond which the point is considered an outlier and is not used for computing the
final fundamental matrix. It can be set to something like 1-3, depending on the accuracy of the
point localization, image resolution, and the image noise.
@param param2 Parameter used for the RANSAC or LMedS methods only. It specifies a desirable level
of confidence (probability) that the estimated matrix is correct.
@param mask

The epipolar geometry is described by the following equation:

\f[[p_2; 1]^T F [p_1; 1] = 0\f]

where \f$F\f$ is a fundamental matrix, \f$p_1\f$ and \f$p_2\f$ are corresponding points in the first and the
second images, respectively.

The function calculates the fundamental matrix using one of four methods listed above and returns
the found fundamental matrix. Normally just one matrix is found. But in case of the 7-point
algorithm, the function may return up to 3 solutions ( \f$9 \times 3\f$ matrix that stores all 3
matrices sequentially).

The calculated fundamental matrix may be passed further to computeCorrespondEpilines that finds the
epipolar lines corresponding to the specified points. It can also be passed to
stereoRectifyUncalibrated to compute the rectification transformation. :
@code
    // Example. Estimation of fundamental matrix using the RANSAC algorithm
    int point_count = 100;
    vector<Point2f> points1(point_count);
    vector<Point2f> points2(point_count);

    // initialize the points here ...
    for( int i = 0; i < point_count; i++ )
    {
        points1[i] = ...;
        points2[i] = ...;
    }

    Mat fundamental_matrix =
     findFundamentalMat(points1, points2, FM_RANSAC, 3, 0.99);
@endcode
 */
CV_EXPORTS_W Mat findFundamentalMat( InputArray points1, InputArray points2,
                                     int method = FM_RANSAC,
                                     double param1 = 3., double param2 = 0.99,
                                     OutputArray mask = noArray() );

/** @overload */
CV_EXPORTS Mat findFundamentalMat( InputArray points1, InputArray points2,
                                   OutputArray mask, int method = FM_RANSAC,
                                   double param1 = 3., double param2 = 0.99 );

OpenCV findEssentialMat (5-point method, N >= 5)

E = cv::findEssentialMat(points1, points2, focal, pp, cv::RANSAC, 0.999, 1.0, mask);
cv::recoverPose(E, points1, points2, R, t, focal, pp, mask);

Is the rotation and translation calculated from points1 to points2 or points2 to points1?
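
One way to reason about this question is to start from the epipolar constraint quoted in the findEssentialMat documentation below. Suppose a 3D point has coordinates X1 in the first camera frame and X2 in the second, related by X2 = R X1 + t. The short derivation here is my own working, not a quotation from OpenCV:

```latex
\begin{aligned}
X_2 &= R X_1 + t \\
[t]_\times X_2 &= [t]_\times R X_1 && \text{(since } [t]_\times t = 0\text{)} \\
X_2^{T} [t]_\times X_2 &= X_2^{T} [t]_\times R X_1 = 0 && \text{(since } X_2 \perp t \times X_2\text{)} \\
\Rightarrow\quad \hat{x}_2^{T} E \,\hat{x}_1 &= 0,
\qquad E = [t]_\times R,
\qquad \hat{x}_i = K^{-1} [p_i; 1] \propto X_i .
\end{aligned}
```

This has exactly the form of the documented constraint with p1 on the right and p2 on the left. Under that reading, which also matches my understanding of the recoverPose documentation, R and t take coordinates from the first camera's frame (points1) to the second camera's frame (points2), and t is known only up to scale.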

/** @brief Calculates an essential matrix from the corresponding points in two images.

@param points1 Array of N (N \>= 5) 2D points from the first image. The point coordinates should
be floating-point (single or double precision).
@param points2 Array of the second image points of the same size and format as points1 .
@param cameraMatrix Camera matrix \f$K = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ .
Note that this function assumes that points1 and points2 are feature points from cameras with the
same camera matrix.
@param method Method for computing an essential matrix.
-   **RANSAC** for the RANSAC algorithm.
-   **LMEDS** for the LMedS algorithm.
@param prob Parameter used for the RANSAC or LMedS methods only. It specifies a desirable level of
confidence (probability) that the estimated matrix is correct.
@param threshold Parameter used for RANSAC. It is the maximum distance from a point to an epipolar
line in pixels, beyond which the point is considered an outlier and is not used for computing the
final fundamental matrix. It can be set to something like 1-3, depending on the accuracy of the
point localization, image resolution, and the image noise.
@param mask Output array of N elements, every element of which is set to 0 for outliers and to 1
for the other points. The array is computed only in the RANSAC and LMedS methods.

This function estimates essential matrix based on the five-point algorithm solver in @cite Nister03 .
@cite SteweniusCFS is also a related. The epipolar geometry is described by the following equation:

\f[[p_2; 1]^T K^{-T} E K^{-1} [p_1; 1] = 0\f]

where \f$E\f$ is an essential matrix, \f$p_1\f$ and \f$p_2\f$ are corresponding points in the first and the
second images, respectively. The result of this function may be passed further to
decomposeEssentialMat or recoverPose to recover the relative pose between cameras.
 */
CV_EXPORTS_W Mat findEssentialMat( InputArray points1, InputArray points2,
                                 InputArray cameraMatrix, int method = RANSAC,
                                 double prob = 0.999, double threshold = 1.0,
                                 OutputArray mask = noArray() );

/** @overload
@param points1 Array of N (N \>= 5) 2D points from the first image. The point coordinates should
be floating-point (single or double precision).
@param points2 Array of the second image points of the same size and format as points1 .
@param focal focal length of the camera. Note that this function assumes that points1 and points2
are feature points from cameras with same focal length and principal point.
@param pp principal point of the camera.
@param method Method for computing a fundamental matrix.
-   **RANSAC** for the RANSAC algorithm.
-   **LMEDS** for the LMedS algorithm.
@param threshold Parameter used for RANSAC. It is the maximum distance from a point to an epipolar
line in pixels, beyond which the point is considered an outlier and is not used for computing the
final fundamental matrix. It can be set to something like 1-3, depending on the accuracy of the
point localization, image resolution, and the image noise.
@param prob Parameter used for the RANSAC or LMedS methods only. It specifies a desirable level of
confidence (probability) that the estimated matrix is correct.
@param mask Output array of N elements, every element of which is set to 0 for outliers and to 1
for the other points. The array is computed only in the RANSAC and LMedS methods.

This function differs from the one above that it computes camera matrix from focal length and
principal point:

\f[K =
\begin{bmatrix}
f & 0 & x_{pp}  \\
0 & f & y_{pp}  \\
0 & 0 & 1
\end{bmatrix}\f]
 */
CV_EXPORTS_W Mat findEssentialMat( InputArray points1, InputArray points2,
                                 double focal = 1.0, Point2d pp = Point2d(0, 0),
                                 int method = RANSAC, double prob = 0.999,
                                 double threshold = 1.0, OutputArray mask = noArray() );

1 findEssentialMat: the result of this function may be passed on to decomposeEssentialMat or recoverPose to recover the relative pose between cameras.

2 findEssentialMat: this function estimates the essential matrix with the five-point algorithm solver in @cite Nister03 .

OpenCV decomposeEssentialMat

/** @brief Decompose an essential matrix to possible rotations and translation.

@param E The input essential matrix.
@param R1 One possible rotation matrix.
@param R2 Another possible rotation matrix.
@param t One possible translation.

This function decompose an essential matrix E using svd decomposition @cite HartleyZ00 . Generally 4
possible poses exists for a given E. They are \f$[R_1, t]\f$, \f$[R_1, -t]\f$, \f$[R_2, t]\f$, \f$[R_2, -t]\f$. By
decomposing E, you can only get the direction of the translation, so the function returns unit t.
 */
CV_EXPORTS_W void decomposeEssentialMat( InputArray E, OutputArray R1, OutputArray R2, OutputArray t );
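
For reference, this is the textbook SVD decomposition that the comment points to (Hartley and Zisserman), written from memory; OpenCV's internal sign and transpose conventions may differ, but the resulting set of four poses is the same:

```latex
\begin{aligned}
E &= U \,\mathrm{diag}(1, 1, 0)\, V^{T},
\qquad
W = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \\[4pt]
R_1 &= U W V^{T},
\qquad
R_2 = U W^{T} V^{T},
\qquad
t = u_3 \ \ (\text{third column of } U,\ \lVert t \rVert = 1).
\end{aligned}
```

Here U and V are chosen with positive determinant (flip the sign of either if needed) so that R1 and R2 are proper rotations. The four candidates [R1, t], [R1, -t], [R2, t], [R2, -t] cannot be told apart from E alone, which is why a depth test on triangulated points is needed afterwards.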

OpenCV int recoverPose


/** @brief Recover relative camera rotation and translation from an estimated essential matrix and the
corresponding points in two images, using cheirality check. Returns the number of inliers which pass
the check.

@param E The input essential matrix.
@param points1 Array of N 2D points from the first image. The point coordinates should be
floating-point (single or double precision).
@param points2 Array of the second image points of the same size and format as points1 .
@param cameraMatrix Camera matrix \f$K = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ .
Note that this function assumes that points1 and points2 are feature points from cameras with the
same camera matrix.
@param R Recovered relative rotation.
@param t Recovered relative translation.
@param mask Input/output mask for inliers in points1 and points2.
:   If it is not empty, then it marks inliers in points1 and points2 for the given essential
matrix E. Only these inliers will be used to recover pose. In the output mask only inliers
which pass the cheirality check.
This function decomposes an essential matrix using decomposeEssentialMat and then verifies possible
pose hypotheses by doing cheirality check. The cheirality check basically means that the
triangulated 3D points should have positive depth. Some details can be found in @cite Nister03 .

This function can be used to process output E and mask from findEssentialMat. In this scenario,
points1 and points2 are the same input for findEssentialMat. :
@code
    // Example. Estimation of the essential matrix using the RANSAC algorithm, then pose recovery
    int point_count = 100;
    vector<Point2f> points1(point_count);
    vector<Point2f> points2(point_count);

    // initialize the points here ...
    for( int i = 0; i < point_count; i++ )
    {
        points1[i] = ...;
        points2[i] = ...;
    }

    // camera matrix with both focal lengths = 1, and principal point = (0, 0)
    Mat cameraMatrix = Mat::eye(3, 3, CV_64F);

    Mat E, R, t, mask;

    E = findEssentialMat(points1, points2, cameraMatrix, RANSAC, 0.999, 1.0, mask);
    recoverPose(E, points1, points2, cameraMatrix, R, t, mask);
@endcode
 */
CV_EXPORTS_W int recoverPose( InputArray E, InputArray points1, InputArray points2,
                            InputArray cameraMatrix, OutputArray R, OutputArray t,
                            InputOutputArray mask = noArray() );

/** @overload
@param E The input essential matrix.
@param points1 Array of N 2D points from the first image. The point coordinates should be
floating-point (single or double precision).
@param points2 Array of the second image points of the same size and format as points1 .
@param R Recovered relative rotation.
@param t Recovered relative translation.
@param focal Focal length of the camera. Note that this function assumes that points1 and points2
are feature points from cameras with same focal length and principal point.
@param pp principal point of the camera.
@param mask Input/output mask for inliers in points1 and points2.
:   If it is not empty, then it marks inliers in points1 and points2 for the given essential
matrix E. Only these inliers will be used to recover pose. In the output mask only inliers
which pass the cheirality check.

This function differs from the one above that it computes camera matrix from focal length and
principal point:

\f[K =
\begin{bmatrix}
f & 0 & x_{pp}  \\
0 & f & y_{pp}  \\
0 & 0 & 1
\end{bmatrix}\f]
 */
CV_EXPORTS_W int recoverPose( InputArray E, InputArray points1, InputArray points2,
                            OutputArray R, OutputArray t,
                            double focal = 1.0, Point2d pp = Point2d(0, 0),
                            InputOutputArray mask = noArray() );

/** @overload
@param E The input essential matrix.
@param points1 Array of N 2D points from the first image. The point coordinates should be
floating-point (single or double precision).
@param points2 Array of the second image points of the same size and format as points1.
@param cameraMatrix Camera matrix \f$K = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}\f$ .
Note that this function assumes that points1 and points2 are feature points from cameras with the
same camera matrix.
@param R Recovered relative rotation.
@param t Recovered relative translation.
@param distanceThresh threshold distance which is used to filter out far away points (i.e. infinite points).
@param mask Input/output mask for inliers in points1 and points2.
:   If it is not empty, then it marks inliers in points1 and points2 for the given essential
matrix E. Only these inliers will be used to recover pose. In the output mask only inliers
which pass the cheirality check.
@param triangulatedPoints 3d points which were reconstructed by triangulation.
 */

CV_EXPORTS_W int recoverPose( InputArray E, InputArray points1, InputArray points2,
                            InputArray cameraMatrix, OutputArray R, OutputArray t, double distanceThresh, InputOutputArray mask = noArray(),
                            OutputArray triangulatedPoints = noArray());

1 recoverPose: uses a cheirality check and returns the number of inliers which pass the check.

2 recoverPose: this function decomposes an essential matrix using decomposeEssentialMat and then verifies possible pose hypotheses by doing a cheirality check. The cheirality check basically means that the triangulated 3D points should have positive depth. Some details can be found in @cite Nister03 .

The five-point method and the cheirality check


/* This is a 5-point algorithm contributed to OpenCV by the author, Bo Li.
   It implements the 5-point algorithm solver from Nister's paper:
   Nister, An efficient solution to the five-point relative pose problem, PAMI, 2004.
*/