14 Lectures on Visual SLAM: Homework Exercises (6)

I. Optical Flow

2. LK Optical Flow

2.0 Reading the Literature

Lucas-Kanade 20 Years On: A Unifying Framework

That's as far as I got. Reading English papers is a real slog, and I couldn't make much sense of the algorithms on my own.

So I'll lean on someone else's write-up instead:

http://t.csdn.cn/4mzHg

2.1 Literature Summary

1) Classification of optical flow algorithms

| Category | Example | Complexity | Applicable warps |
| --- | --- | --- | --- |
| Forwards Additive | Lucas-Kanade (1981) | $O(n^2N+n^3)$ | Any warp |
| Forwards Compositional | Shum-Szeliski (2000) | $O(n^2N+n^3)$ | Any semi-group |
| Inverse Additive | Hager-Belhumeur (1998) | $O(nN+n^3)$ | Simple linear 2D + |
| Inverse Compositional | Baker-Matthews (2001) | $O(nN+nk+k^3)$ | Any group |

(Here $N$ is the number of pixels in the template and $n$ the number of warp parameters.)

2) In the compositional formulation, why do we sometimes need to warp the original image first? What is the physical meaning of that warp?

The goal of the algorithm: given a template image $T(x)$ and an input image $I(x)$, find a warp $W(x;p)$ to apply to the input image such that the warped image $I(W(x;p))$ differs as little as possible from the template $T$. In the compositional setting the incremental warp is composed with the current one, so the input image must first be warped by the current estimate $W(x;p)$; physically, the warp is exactly this geometric transformation (translation, affine, etc.) that aligns the input image with the template. Written as an optimization problem:

$$\arg\min_p \sum_x \left\| I(W(x;p)) - T(x) \right\|^2$$
3) What is the difference between forward and inverse methods?

If the template contains more noise, use the forward algorithm; if the input image is noisier, use the inverse algorithm. The inverse algorithm computes the Hessian matrix only once, whereas the forward algorithm must recompute it on every iteration, so the inverse algorithm is more efficient and cheaper to run.

The forward method evaluates the image gradient on the second image $I_2$, while the inverse method substitutes the gradient of the first image $I_1$. The two also optimize different objective functions.
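For reference, the two objectives (following the notation of the Baker-Matthews paper) can be written side by side. The forwards additive method iterates on

$$\min_{\Delta p} \sum_x \left[ I(W(x;\, p + \Delta p)) - T(x) \right]^2,$$

while the inverse compositional method swaps the roles of template and image in the incremental step:

$$\min_{\Delta p} \sum_x \left[ T(W(x;\, \Delta p)) - I(W(x;\, p)) \right]^2.$$

Since the incremental warp is evaluated on the fixed template $T$, its gradient and Hessian can be precomputed once, which is exactly the efficiency argument above.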

2.2, 2.3 Implementing Optical Flow

1) From a least-squares viewpoint, how is the error of each pixel defined?

$$error = I_1(x_i, y_i) - I_2(x_i + \Delta x_i,\; y_i + \Delta y_i)$$
2) How is the derivative of the error with respect to the variables defined?

The variables to be estimated are $\Delta x_i$ and $\Delta y_i$, so:

$$\frac{\partial e}{\partial \Delta x_i} = -\frac{\partial I_2}{\partial \Delta x_i}, \qquad \frac{\partial e}{\partial \Delta y_i} = -\frac{\partial I_2}{\partial \Delta y_i}$$

Because pixel values in an image are discrete, we cannot differentiate the image directly; instead we replace the derivative with a finite difference, using the central difference scheme. Write

$$f(x+1, y) = I_2(x_i + \Delta x_i + 1,\, y_i + \Delta y_i), \qquad f(x-1, y) = I_2(x_i + \Delta x_i - 1,\, y_i + \Delta y_i).$$

First-order Taylor expansion:

$$f(x+1, y) = f(x, y) + f'(x), \qquad f(x-1, y) = f(x, y) - f'(x),$$

which gives

$$f'(x) = \frac{f(x+1, y) - f(x-1, y)}{2}.$$

Similarly,

$$f'(y) = \frac{f(x, y+1) - f(x, y-1)}{2}.$$

Therefore:

$$\frac{\partial e}{\partial \Delta x_i} = -\frac{\partial I_2}{\partial \Delta x_i} = -\frac{I_2(x_i + \Delta x_i + 1,\, y_i + \Delta y_i) - I_2(x_i + \Delta x_i - 1,\, y_i + \Delta y_i)}{2}$$

$$\frac{\partial e}{\partial \Delta y_i} = -\frac{\partial I_2}{\partial \Delta y_i} = -\frac{I_2(x_i + \Delta x_i,\, y_i + \Delta y_i + 1) - I_2(x_i + \Delta x_i,\, y_i + \Delta y_i - 1)}{2}$$

3) Code
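The functions below rely on a bilinear-interpolation helper GetPixelValue that this post doesn't show. A minimal sketch (following the convention of the slambook2 sample code, assumed here rather than copied from the post):

// sample a grayscale (CV_8U) image at a sub-pixel location with bilinear
// interpolation, clamping to the image border; a sketch, not the original helper
inline float GetPixelValue(const cv::Mat &img, float x, float y) {
    if (x < 0) x = 0;
    if (y < 0) y = 0;
    if (x >= img.cols - 1) x = img.cols - 2;
    if (y >= img.rows - 1) y = img.rows - 2;
    int x0 = (int)x, y0 = (int)y;
    float xx = x - x0, yy = y - y0;
    return (1 - xx) * (1 - yy) * img.at<uchar>(y0, x0)
         + xx * (1 - yy) * img.at<uchar>(y0, x0 + 1)
         + (1 - xx) * yy * img.at<uchar>(y0 + 1, x0)
         + xx * yy * img.at<uchar>(y0 + 1, x0 + 1);
}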

void OpticalFlowSingleLevel(
        const Mat &img1,
        const Mat &img2,
        const vector<KeyPoint> &kp1,
        vector<KeyPoint> &kp2,
        vector<bool> &success,
        bool inverse
) {
    // parameters
    int half_patch_size = 4;
    int iterations = 10;
    bool have_initial = !kp2.empty();

    for (size_t i = 0; i < kp1.size(); i++) {
        auto kp = kp1[i];
        double dx = 0, dy = 0; // dx,dy need to be estimated
        if (have_initial) {
            dx = kp2[i].pt.x - kp.pt.x;
            dy = kp2[i].pt.y - kp.pt.y;
        }

        double cost = 0, lastCost = 0;
        bool succ = true; // indicate if this point succeeded

        // Gauss-Newton iterations
        // H and b live outside the iteration loop: in inverse mode the Hessian
        // is accumulated only once (at iter == 0) and reused afterwards
        Eigen::Matrix2d H = Eigen::Matrix2d::Zero();
        Eigen::Vector2d b = Eigen::Vector2d::Zero();
        for (int iter = 0; iter < iterations; iter++) {
            if (inverse == false) {
                H = Eigen::Matrix2d::Zero();
                b = Eigen::Vector2d::Zero();
            } else {
                // inverse mode keeps H from the first iteration; only reset b
                b = Eigen::Vector2d::Zero();
            }
            cost = 0;

            // check that the patch stays inside the image
            if (kp.pt.x + dx <= half_patch_size || kp.pt.x + dx >= img1.cols - half_patch_size ||
                kp.pt.y + dy <= half_patch_size || kp.pt.y + dy >= img1.rows - half_patch_size) {   // go outside
                succ = false;
                break;
            }

            // compute cost and Jacobian; x and y both range over [-4, 3]
            for (int x = -half_patch_size; x < half_patch_size; x++)
                for (int y = -half_patch_size; y < half_patch_size; y++) {
                    // TODO START YOUR CODE HERE (~8 lines)
                    double error = GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y) -
                                   GetPixelValue(img2, kp.pt.x + x + dx, kp.pt.y + y + dy);
                    Eigen::Vector2d J;  // Jacobian
                    if (inverse == false) {
                        // Forward Jacobian: gradient of img2 at the shifted patch position
                        J = -1.0 * Eigen::Vector2d(
                            0.5 * (GetPixelValue(img2, kp.pt.x + dx + x + 1, kp.pt.y + dy + y) -
                                   GetPixelValue(img2, kp.pt.x + dx + x - 1, kp.pt.y + dy + y)),
                            0.5 * (GetPixelValue(img2, kp.pt.x + dx + x, kp.pt.y + dy + y + 1) -
                                   GetPixelValue(img2, kp.pt.x + dx + x, kp.pt.y + dy + y - 1))
                        );
                    } else {
                        // Inverse Jacobian: the gradient is taken on img1 at the
                        // template position (no dx, dy), so it genuinely stays
                        // constant while dx, dy are updated
                        J = -1.0 * Eigen::Vector2d(
                            0.5 * (GetPixelValue(img1, kp.pt.x + x + 1, kp.pt.y + y) -
                                   GetPixelValue(img1, kp.pt.x + x - 1, kp.pt.y + y)),
                            0.5 * (GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y + 1) -
                                   GetPixelValue(img1, kp.pt.x + x, kp.pt.y + y - 1))
                        );
                    }
                    if (inverse == false || iter == 0){
                        H += J * J.transpose();
                    }
                    b += - J * error;
                    cost += error * error;
                    // TODO END YOUR CODE HERE
                }

            // compute update
            // TODO START YOUR CODE HERE (~1 lines)
            Eigen::Vector2d update = H.ldlt().solve(b);
            // TODO END YOUR CODE HERE
            
            if (isnan(update[0])) {
                // sometimes occurred when we have a black or white patch and H is irreversible
                cout << "update is nan" << endl;
                succ = false;
                break;
            }
            if (iter > 0 && cost > lastCost) {
                cout << "cost increased: " << cost << ", " << lastCost << endl;
                break;
            }

            // update dx, dy
            dx += update[0];
            dy += update[1];
            lastCost = cost;
            succ = true;
            
        }
        success.push_back(succ);

        // set kp2
        if (have_initial) {
            kp2[i].pt = kp.pt + Point2f(dx, dy);
        } else {
            KeyPoint tracked = kp;
            tracked.pt += cv::Point2f(dx, dy);
            kp2.push_back(tracked);
        }
    }
}

The outermost loop iterates over every keypoint of the first image and estimates its displacement (dx, dy) into the second image.

The middle loop runs the Gauss-Newton iterations.

The innermost loop walks over the patch around each keypoint, accumulating the Jacobian, the Hessian, and so on.

Finally kp2 is filled with the tracked positions. A usage sketch follows below.

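A minimal way to call it (my own sketch, not from the assignment; it assumes img1 and img2 are grayscale cv::Mat already loaded, and the usual using namespace std / cv of the course code):

// detect corners in the first image, then track them into the second
vector<KeyPoint> kp1, kp2;
vector<bool> success;
Ptr<GFTTDetector> detector = GFTTDetector::create(500, 0.01, 20); // max corners, quality, min distance
detector->detect(img1, kp1);
OpticalFlowSingleLevel(img1, img2, kp1, kp2, success, false); // false = forward mode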
2.4 Extending to Image Pyramids
void OpticalFlowMultiLevel(
        const Mat &img1,
        const Mat &img2,
        const vector<KeyPoint> &kp1,
        vector<KeyPoint> &kp2,
        vector<bool> &success,
        bool inverse) {

    // parameters
    int pyramids = 4;               // number of pyramid levels
    double pyramid_scale = 0.3;     // scale factor between adjacent levels
    double scales[] = {1.0, 0.3, 0.09, 0.027};

    // create pyramids
    vector<Mat> pyr1, pyr2; // image pyramids; note each one is a vector of Mat
    // TODO START YOUR CODE HERE (~8 lines)
    for (int i = 0; i < pyramids; i++)
    {
        if (i == 0)
        {   
            pyr1.push_back(img1);
            pyr2.push_back(img2);
        }
        else
        {
            Mat img1_pyr, img2_pyr;
            // cv::resize rescales the image to the given size
            cv::resize(pyr1[i -1], img1_pyr, cv::Size(pyr1[i - 1].cols * pyramid_scale, pyr1[i - 1].rows * pyramid_scale));
            cv::resize(pyr2[i -1], img2_pyr, cv::Size(pyr2[i - 1].cols * pyramid_scale, pyr2[i - 1].rows * pyramid_scale));

            pyr1.push_back(img1_pyr);
            pyr2.push_back(img2_pyr);
        }
    }
    // TODO END YOUR CODE HERE
    // coarse-to-fine LK tracking in pyramids
    // the key point: the estimate from the coarser level initializes the next finer level
    // TODO START YOUR CODE HERE
    vector<KeyPoint> kp1_pyr, kp2_pyr;
    for (auto &kp:kp1)
    {
        auto kp_top = kp;
        kp_top.pt *= scales[pyramids -1];
        kp1_pyr.push_back(kp_top);
        kp2_pyr.push_back(kp_top);
    }
    for (int level = pyramids -1; level >= 0; level--)
    {
        success.clear();
        OpticalFlowSingleLevel(pyr1[level], pyr2[level], kp1_pyr, kp2_pyr, success, inverse);
        if (level > 0)
        {
            for (auto &kp:kp1_pyr){
                kp.pt /= pyramid_scale;
            }
            for (auto &kp:kp2_pyr){
                kp.pt /= pyramid_scale;
            }
        }
    }

    for (auto &kp:kp2_pyr)
    {
        kp2.push_back(kp);
    }
    // TODO END YOUR CODE HERE
    // don't forget to set the results into kp2
}
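The point of this coarse-to-fine scheme is to keep the initial displacement within the convergence basin of the small patch. With the scale factor 0.3 used here, a 20-pixel motion in the original image shrinks at the coarsest level to

$$20 \times 0.3^{3} = 0.54 \text{ pixels},$$

which an 8x8 patch absorbs easily; each finer level then only has to refine the scaled-up estimate from the level above.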

II. 3 Direct Method

1) Single-layer direct method

Algorithm steps:

1. Randomly pick 1000 points in the reference image (with cv::RNG) and read their depth from the disparity image ($depth = \frac{f \cdot b}{d}$, where $f$ is the focal length, $b$ the stereo baseline, and $d$ the disparity, at least one pixel). Store the depths together with the 1000 sampled points.

2. Take the pixel coordinates from the reference image; from $p_1 = \frac{1}{Z_1} K P_1$ we get $P_1 = Z_1 K^{-1} p_1$, i.e. the pixels are back-projected into the camera frame. The transform from the first camera to the second is $R, t$, so the point in the current camera frame is $P_2 = T_{21} P_1$. $P_2$ is then projected into the second image's pixel coordinates.

3. Compute the error and the Jacobian (written out below), build $H$ and $b$, solve for the perturbation, and update $T$.

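For step 3, the per-pixel error is the photometric residual $e = I_1(p_1) - I_2(p_2)$, and the chain rule gives $J = -\frac{\partial I_2}{\partial u}\,\frac{\partial u}{\partial \delta\xi}$. The $2 \times 6$ block $\frac{\partial u}{\partial \delta\xi}$ used in the code below is the standard one from the book:

$$\frac{\partial u}{\partial \delta\xi} =
\begin{bmatrix}
\frac{f_x}{Z} & 0 & -\frac{f_x X}{Z^2} & -\frac{f_x X Y}{Z^2} & f_x + \frac{f_x X^2}{Z^2} & -\frac{f_x Y}{Z} \\[4pt]
0 & \frac{f_y}{Z} & -\frac{f_y Y}{Z^2} & -f_y - \frac{f_y Y^2}{Z^2} & \frac{f_y X Y}{Z^2} & \frac{f_y X}{Z}
\end{bmatrix}$$

where $(X, Y, Z)$ are the coordinates of the point in the second camera's frame.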
Code implementation:

void DirectPoseEstimationSingleLayer(
        const cv::Mat &img1,
        const cv::Mat &img2,
        const VecVector2d &px_ref,
        const vector<double> depth_ref,
        Sophus::SE3d &T21
) {

    // parameters
    int half_patch_size = 4;
    int iterations = 100;

    double cost = 0, lastCost = 0;
    int nGood = 0;  // number of good (in-image) projections; used to average the cost at the end
    VecVector2d goodProjection;
    

    for (int iter = 0; iter < iterations; iter++) {
        nGood = 0;
        cost = 0;  // reset the accumulated cost every iteration
        goodProjection.clear();
        // use a (-1, -1) sentinel so untracked entries are recognizable when plotting
        goodProjection.resize(px_ref.size(), Eigen::Vector2d(-1, -1));
        // Define Hessian and bias
        Matrix6d H = Matrix6d::Zero();  // 6x6 Hessian
        Vector6d b = Vector6d::Zero();  // 6x1 bias

        for (size_t i = 0; i < px_ref.size(); i++) {

            // compute the projection in the second image
            // TODO START YOUR CODE HERE
            Eigen::Vector3d point_ref = depth_ref[i] * Eigen::Vector3d((px_ref[i][0] - cx)/fx, (px_ref[i][1] - cy)/fy, 1);
            Eigen::Vector3d point_cur = T21 * point_ref; 

            if (point_cur[2] < 0)   // point is behind the camera; skip it
            {
                continue;
            }
                
            float u = fx * point_cur[0]/point_cur[2] + cx;
            float v = fy * point_cur[1]/point_cur[2] + cy;

            if (u < half_patch_size || u > img2.cols - half_patch_size || v < half_patch_size || v > img2.rows - half_patch_size)
            {
                continue;
            }    
            goodProjection[i] = Eigen::Vector2d(u, v);
            double X = point_cur[0];
            double Y = point_cur[1];
            double Z = point_cur[2];
            double Z_inv = 1.0 / Z;          // the Jacobian uses 1/Z
            double Z_2inv = Z_inv * Z_inv;   // and 1/Z^2
            nGood++;

            for (int x = -half_patch_size; x < half_patch_size; x++)
            {
                for (int y = -half_patch_size; y < half_patch_size; y++)
                {
                    double error;
                    error = GetPixelValue(img1, px_ref[i][0] + x, px_ref[i][1] + y) - GetPixelValue(img2, u + x, v + y);
                    
                    Matrix26d J_img_pixel;  // du/d(delta xi): 2x6 Jacobian of pixel coords w.r.t. the se(3) perturbation

                    J_img_pixel(0, 0) = fx * Z_inv;
                    J_img_pixel(0, 1) = 0;
                    J_img_pixel(0, 2) = -fx * X * Z_2inv;
                    J_img_pixel(0, 3) = -fx * X * Y * Z_2inv;
                    J_img_pixel(0, 4) = fx + fx * X * X * Z_2inv;
                    J_img_pixel(0, 5) = -fx * Y * Z_inv;

                    J_img_pixel(1, 0) = 0;
                    J_img_pixel(1, 1) = fy * Z_inv;
                    J_img_pixel(1, 2) = -fy * Y * Z_2inv;
                    J_img_pixel(1, 3) = -fy - fy * Y * Y * Z_2inv;
                    J_img_pixel(1, 4) = fy * X * Y * Z_2inv;
                    J_img_pixel(1, 5) = fy * X * Z_inv;

                    // dI2/du: image gradient at the projected pixel, by central differences
                    Eigen::Vector2d J_img_descent;
                    J_img_descent = Eigen::Vector2d(0.5 * (GetPixelValue(img2, u + 1 + x, v + y) - GetPixelValue(img2, u - 1 + x, v + y)), 
                                                    0.5 * (GetPixelValue(img2, u + x, v + 1 + y) - GetPixelValue(img2, u + x, v - 1 + y)));
                    Vector6d J = (J_img_descent.transpose() * J_img_pixel).transpose();
                    J = -J;
                    H += J * J.transpose();
                    b += -error * J;
                    cost += error * error;
                }
            }
        
            // END YOUR CODE HERE
        }

        // solve update and put it into estimation
        // TODO START YOUR CODE HERE
        Vector6d update = H.ldlt().solve(b);
        T21 = Sophus::SE3d::exp(update) * T21;
        // END YOUR CODE HERE
        if (nGood > 0) cost /= nGood;  // average photometric error over good points
        if (isnan(update[0])) {
            // sometimes occurred when we have a black or white patch and H is irreversible
            cout << "update is nan" << endl;
            break;
        }
        if (iter > 0 && cost > lastCost) {
            cout << "cost increased: " << cost << ", " << lastCost << endl;
            break;
        }
        
        lastCost = cost;
        cout << "cost = " << cost << ", good = " << nGood << endl;
    }
    cout << "good projection: " << nGood << endl;
    cout << "T21 = \n" << T21.matrix() << endl;
    
    // in order to help you debug, we plot the projected pixels here
    cv::Mat img1_show, img2_show;
    cv::cvtColor(img1, img1_show, COLOR_GRAY2BGR);
    cv::cvtColor(img2, img2_show, COLOR_GRAY2BGR);
    for (auto &px: px_ref) {
        cv::rectangle(img1_show, cv::Point2f(px[0] - 2, px[1] - 2), cv::Point2f(px[0] + 2, px[1] + 2),
                      cv::Scalar(0, 250, 0));
    }
    /*
    for (auto &px: goodProjection) {
        cv::rectangle(img2_show, cv::Point2f(px[0] - 2, px[1] - 2), cv::Point2f(px[0] + 2, px[1] + 2),
                      cv::Scalar(0, 250, 0));
    }
    cv::imshow("reference", img1_show);
    cv::imshow("current", img2_show);
    cv::waitKey();
    */
    for (size_t i = 0; i < px_ref.size(); ++i){
        Eigen::Vector2d p_ref = px_ref[i];
        Eigen::Vector2d p_cur = goodProjection[i];
        if (p_cur(0) > 0 && p_cur(1) > 0 && p_ref(0) > 0 && p_ref(1) > 0){
            cv::rectangle(img2_show, cv::Point2f(p_cur[0] - 2, p_cur[1] - 2), 
                      cv::Point2f(p_cur[0] + 2, p_cur[1] + 2),
                      cv::Scalar(0, 250, 0));
            cv::line(img2_show, cv::Point2f(p_ref(0), p_ref(1)), cv::Point2f(p_cur(0), p_cur(1)), cv::Scalar(0,255,0));
        }
    }

    cv::imshow("reference", img1_show);
    cv::imshow("current", img2_show);
    cv::waitKey(1000);
}
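A minimal driver sketch for step 1 above (my own, not from the assignment; left_img, disparity_img, img2 and the globals fx, baseline are assumed to follow the course code's naming):

// sample 1000 random pixels away from the border and convert the
// ground-truth disparity to depth via depth = fx * baseline / disparity
cv::RNG rng;
int nPoints = 1000, border = 40;
VecVector2d pixels_ref;
vector<double> depth_ref;
for (int i = 0; i < nPoints; i++) {
    int x = rng.uniform(border, left_img.cols - border);
    int y = rng.uniform(border, left_img.rows - border);
    int disp = disparity_img.at<uchar>(y, x);  // disparity is at least 1 px
    depth_ref.push_back(fx * baseline / disp);
    pixels_ref.push_back(Eigen::Vector2d(x, y));
}
Sophus::SE3d T21;  // identity as the initial guess
DirectPoseEstimationSingleLayer(left_img, img2, pixels_ref, depth_ref, T21);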

Tips: 1) If the build fails with EIGEN_STATIC_ASSERT_SAME_MATRIX_SIZE, some matrix or vector dimension is most likely wrong.

2) A "segmentation fault (core dumped)" at runtime usually means a plain coding mistake; re-read the code carefully.

3) If the result looks poor, the algorithm itself is probably implemented incorrectly; re-check the math.

2) Multi-layer direct method

void DirectPoseEstimationMultiLayer(
        const cv::Mat &img1,
        const cv::Mat &img2,
        const VecVector2d &px_ref,
        const vector<double> depth_ref,
        Sophus::SE3d &T21
) {

    // parameters
    int pyramids = 4;
    double pyramid_scale = 0.5;
    double scales[] = {1.0, 0.5, 0.25, 0.125};

    // create pyramids
    vector<cv::Mat> pyr1, pyr2; // image pyramids
    // TODO START YOUR CODE HERE
    for (int i = 0; i < pyramids; i++)
    {
        if (i == 0)
        {
            pyr1.push_back(img1);
            pyr2.push_back(img2);
        }
        else
        {
            cv::Mat img1_pyr, img2_pyr;
            cv::resize(pyr1[i - 1], img1_pyr, cv::Size(pyr1[i - 1].cols * pyramid_scale, pyr1[i - 1].rows * pyramid_scale));
            cv::resize(pyr2[i - 1], img2_pyr, cv::Size(pyr2[i - 1].cols * pyramid_scale, pyr2[i - 1].rows * pyramid_scale));
            pyr1.push_back(img1_pyr);
            pyr2.push_back(img2_pyr);
        }
    }
    // END YOUR CODE HERE

    double fxG = fx, fyG = fy, cxG = cx, cyG = cy;  // backup the old values
    for (int level = pyramids - 1; level >= 0; level--) {
        VecVector2d px_ref_pyr; // set the keypoints in this pyramid level
        for (auto &px: px_ref) {
            px_ref_pyr.push_back(scales[level] * px);
        }

        // TODO START YOUR CODE HERE
        // scale fx, fy, cx, cy in different pyramid levels
        fx = fxG * scales[level];
        fy = fyG * scales[level];
        cx = cxG * scales[level];
        cy = cyG * scales[level];
        // END YOUR CODE HERE
        DirectPoseEstimationSingleLayer(pyr1[level], pyr2[level], px_ref_pyr, depth_ref, T21);
    }
}
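Under the same assumptions as the sketch above, the multi-layer version is a drop-in replacement and copes much better with large inter-frame motion:

Sophus::SE3d T21_ml;  // identity initial guess
DirectPoseEstimationMultiLayer(left_img, img2, pixels_ref, depth_ref, T21_ml);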

III. 4* Computing Disparity with Optical Flow

Disparity map: an image referenced to one of the two views, with the same size as that reference image, whose pixel values are disparity values.

So, given the disparity image disparity, the pixel value read from it is the ground-truth disparity.

The estimated disparity is the difference between the horizontal coordinates of the same tracked point in the two images.
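For a rectified stereo pair the usual convention, consistent with the depth formula in Part II, is

$$d_i = x_i^{\text{left}} - x_i^{\text{right}} \geq 0, \qquad Z_i = \frac{f\,b}{d_i},$$

so the sign of the estimated disparity depends on which image kp1 was sampled from; it has to match the convention of the ground-truth disparity map.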

double AnalyzeDisparityRMSE(const cv::Mat &left_img, 
                            const cv::Mat &right_img,
                            cv::Mat &img2, 
                            const cv::Mat &disparity_img, 
                            const vector<KeyPoint> &kp1, 
                            const vector<KeyPoint> &kp2, 
                            const vector<bool> &success)
{
    vector<double> disparity_est;
    vector<double> cost;
    // analyze disparity root mean square error
    cv::cvtColor(right_img, img2, COLOR_GRAY2BGR);
    for (int i = 0; i < kp2.size(); i++) {
        if (success[i]) {
            // draw disparity results
            cv::circle(img2, kp2[i].pt, 2, cv::Scalar(0, 250, 0), 2);
            cv::line(img2, kp1[i].pt, cv::Point2f(kp2[i].pt.x, kp1[i].pt.y), cv::Scalar(0, 250, 0));

            // estimated disparity; note that for a left-image kp1 and right-image
            // kp2 the usual convention is kp1.pt.x - kp2.pt.x, so flip the sign
            // if needed to match the ground-truth disparity map
            double dis_est = kp2[i].pt.x - kp1[i].pt.x;
            disparity_est.push_back(dis_est);
            // ground truth disparity
            float u = kp1[i].pt.x;
            float v = kp1[i].pt.y;
            double disparity_gt = GetPixelValue(disparity_img, u, v);
            cost.push_back(dis_est-disparity_gt);

        }
    }
    // root mean square error: square the per-point errors before averaging
    double sum_sq = 0;
    for (double c : cost) sum_sq += c * c;
    double rmse = cost.empty() ? 0.0 : sqrt(sum_sq / cost.size());

    return rmse;
}
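A possible driver, again only a sketch under the same naming assumptions (left_img and right_img loaded as grayscale, disparity_img the ground-truth map):

// track left-image corners into the right image with pyramidal LK,
// then compare the estimated horizontal shifts against the disparity map
vector<KeyPoint> kp1, kp2;
vector<bool> success;
Ptr<GFTTDetector> detector = GFTTDetector::create(500, 0.01, 20);
detector->detect(left_img, kp1);
OpticalFlowMultiLevel(left_img, right_img, kp1, kp2, success, true);

cv::Mat img2_show;
double rmse = AnalyzeDisparityRMSE(left_img, right_img, img2_show,
                                   disparity_img, kp1, kp2, success);
cout << "disparity RMSE: " << rmse << endl;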

P.S.: I suddenly realized I've forgotten some of the earlier material...
