本文参考博客:视觉SLAM十四讲(第二版)第8讲习题解答。2、4题与参考博文一致,1、3做了补充。
本文3、4题代码参考:HW-of-SLAMBOOK2
1、除了LK光流,还有哪些光流方法?它们各有什么特点?
光流方法 | 原理 | 优点 | 缺点 |
LK | 属于稀疏光流,通过两帧之间差分计算光流。 | 计算简单且快速,适合实时应用。 对小范围运动的估计精度较高。 | 对大运动或快速运动的鲁棒性差。 对光照变化敏感,噪声影响较大。 |
KLT | Lucas-Kanade 方法基础上,使用角点(特征点)的基本检测和跟踪。 属于稀疏光流,通过设置5x5 的窗口并假设串口内的像奈点保持相同移动距离进行最小二乘法求解。 | 提供特征点的鲁棒跟踪,适合在场景中跟踪特定物体。 能够在一定程度上处理遮挡。 | 需要良好角点的初始选择,特征点稀疏。 对于快速移动的特征点,可能会丢失跟踪。 |
HS | 属于稠密光流,通过最小化运动速度的二阶导数满足其平滑性限制(即邻域的像素点的速度相近),对目标导数求偏导从而求解速度。 | 适用于全局场景,可以生成稠密光流场。 对光照变化和噪声的鲁棒性相对较强。 | 计算成本高,不适合实时处理。 对快速运动或大的变形不够鲁棒。 |
Farneback | 使用二次多项式展开来估计光流,获得密集光流场。通过对图像的高斯金字塔进行下采样和特征匹配。 属于稠密光流,对像素邻域信息进行最小二乘加权拟合,并利用多项式对每个像素的邻域信息进行拟合。 | 生成稠密光流,能够估计每个像素的光流。 在一定程度上处理快速移动物体。 | 计算复杂度高,相较于稀疏方法速度较慢。 对于大量视频噪声,可能会产生不准确的结果。 |
利用OpenCV库对LK、HS、Farneback进行测试
Code
#include <opencv2/opencv.hpp>
#include <iostream>
using namespace cv;
using namespace std;
void showOpticalFlow(const Mat& flow) {
Mat hsv(flow.size(), CV_32FC3);
for (int y = 0; y < flow.rows; y++) {
for (int x = 0; x < flow.cols; x++) {
const Point2f& fxy = flow.at<Point2f>(y, x);
float angle = atan2(fxy.y, fxy.x);
float magnitude = norm(fxy);
hsv.at<Vec3f>(y, x)[0] = angle * 180 / CV_PI / 2 + 90; // 转换到色相范围
hsv.at<Vec3f>(y, x)[1] = 1; // 饱和度
hsv.at<Vec3f>(y, x)[2] = std::min(magnitude / 10.0f, 1.0f); // 明度
}
}
Mat bgr;
cvtColor(hsv, bgr, COLOR_HSV2BGR);
imshow("Flow", bgr);
}
void LKOpticalFlow(const Mat& prevImg, const Mat& nextImg) {
// 确定要跟踪的点
std::vector<Point2f> pointsA, pointsB;
goodFeaturesToTrack(prevImg, pointsA, 100, 0.01, 10);
vector<KeyPoint> kp1;
Ptr<GFTTDetector> detector = GFTTDetector::create(500, 0.01, 20); // maximum 500 keypoints
detector->detect(prevImg, kp1);
for (auto& kp : kp1) pointsA.push_back(kp.pt);
// 计算光流
std::vector<uchar> status;
std::vector<float> err;
calcOpticalFlowPyrLK(prevImg, nextImg, pointsA, pointsB, status, err);
int errSum = 0;
for (int i = 0; i < err.size(); i++)
errSum += err[i];
cout << "err: " << errSum << endl;
// 绘制结果
for (size_t i = 0; i < pointsB.size(); i++) {
if (status[i]) {
line(nextImg, pointsA[i], pointsB[i], Scalar(0, 255, 0), 2);
circle(nextImg, pointsB[i], 5, Scalar(0, 0, 255), -1);
}
}
}
void HSOpticalFlow(const Mat& prevImg, const Mat& nextImg) {
Mat flow;
calcOpticalFlowFarneback(prevImg, nextImg, flow, 0.5, 3, 15, 3, 5, 1.2, 0);
showOpticalFlow(flow);
}
void FarnebackOpticalFlow1(const Mat& prevImg, const Mat& nextImg) {
Mat flow;
calcOpticalFlowFarneback(prevImg, nextImg, flow, 0.5, 3, 15, 3, 5, 1.2, 0);
// 可视化光流
Mat flowImg;
showOpticalFlow(flow);
}
int main(int argc, char** argv) {
// 读取视频或图像序列
VideoCapture cap(0);
if (!cap.isOpened()) {
cerr << "Error opening video stream or file" << endl;
return -1;
}
Mat prevImg, nextImg;
cap >> prevImg;
cvtColor(prevImg, prevImg, COLOR_BGR2GRAY);
while (true) {
cap >> nextImg;
if (nextImg.empty()) break;
cvtColor(nextImg, nextImg, COLOR_BGR2GRAY);
Mat nextImgClone = nextImg.clone();
// 使用不同光流方法
LKOpticalFlow(prevImg, nextImgClone);
//HSOpticalFlow(prevImg, nextImg);
//FarnebackOpticalFlow1(prevImg, nextImgClone);
imshow("Optical Flow", nextImgClone);
// 退出条件
if (waitKey(30) >= 0) break;
prevImg = nextImg.clone();
}
cap.release();
destroyAllWindows();
return 0;
}
运行结果:
LK:
HS:
Farneback:
Note:函数 showOpticalFlow 将光流数据转换为 HSV 图像,然后再转换为 BGR 显示。
2、在本节程序的求图像梯度过程中,我们简单地求了u + 1和 u + 2的灰度之差除以2,作为u方向上的梯度值。这种做法有什么缺点?提示:对于距离较近的特征,变化应该较快;而距离较远的特征在图像中变化较慢,求梯度时能否利用此信息?
由提示可以知道,当特征距离较近时,变化较快,像素跨度较大;反之距离较远时,像素跨度较小。仅通过对前后两个像素作差除以2作为梯度值无法较好地代表该像素对应的梯度值,该做法并未考虑深度值对梯度的影响,使得在目标函数的优化计算中梯度指导的下降方向可能不准确。
因此,在求梯度的时候,可以根据不同特征的深度值设置一个加权系数,当距离较近时,加权系数较小;距离较远时,加权系数较大。从而平衡不同距离的特征增量,一定程度上减少陷入局部最优的情况。
3、直接法是否能和光流一样,提出“反向法”的概念?即,使用原始图像的梯度代替目标图像的梯度?
可以。通过设置雅克比矩阵 为 ,即对J_img_pixel中img2修改为img1即可。
Code
#include <opencv2/opencv.hpp>
#include <sophus/se3.hpp>
#include <boost/format.hpp>
#include <pangolin/pangolin.h>
using namespace std;
typedef vector<Eigen::Vector2d, Eigen::aligned_allocator<Eigen::Vector2d>> VecVector2d;
// Camera intrinsics
double fx = 718.856, fy = 718.856, cx = 607.1928, cy = 185.2157;
// baseline
double baseline = 0.573;
// paths
string left_file = "./left.png";
string disparity_file = "./disparity.png";
boost::format fmt_others("./%06d.png"); // other files
// useful typedefs
typedef Eigen::Matrix<double, 6, 6> Matrix6d;
typedef Eigen::Matrix<double, 2, 6> Matrix26d;
typedef Eigen::Matrix<double, 6, 1> Vector6d;
/// class for accumulator jacobians in parallel
class JacobianAccumulator {
public:
JacobianAccumulator(
const cv::Mat &img1_,
const cv::Mat &img2_,
const VecVector2d &px_ref_,
const vector<double> depth_ref_,
Sophus::SE3d &T21_) :
img1(img1_), img2(img2_), px_ref(px_ref_), depth_ref(depth_ref_), T21(T21_) {
projection = VecVector2d(px_ref.size(), Eigen::Vector2d(0, 0));
}
/// accumulate jacobians in a range
void accumulate_jacobian(const cv::Range &range);
/// get hessian matrix
Matrix6d hessian() const { return H; }
/// get bias
Vector6d bias() const { return b; }
/// get total cost
double cost_func() const { return cost; }
/// get projected points
VecVector2d projected_points() const { return projection; }
/// reset h, b, cost to zero
void reset() {
H = Matrix6d::Zero();
b = Vector6d::Zero();
cost = 0;
}
private:
const cv::Mat &img1;
const cv::Mat &img2;
const VecVector2d &px_ref;
const vector<double> depth_ref;
Sophus::SE3d &T21;
VecVector2d projection; // projected points
std::mutex hessian_mutex;
Matrix6d H = Matrix6d::Zero();
Vector6d b = Vector6d::Zero();
double cost = 0;
};
/**
* pose estimation using direct method
* @param img1
* @param img2
* @param px_ref
* @param depth_ref
* @param T21
*/
void DirectPoseEstimationMultiLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::SE3d &T21
);
/**
* pose estimation using direct method
* @param img1
* @param img2
* @param px_ref
* @param depth_ref
* @param T21
*/
void DirectPoseEstimationSingleLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::SE3d &T21
);
// bilinear interpolation
inline float GetPixelValue(const cv::Mat &img, float x, float y) {
// boundary check
if (x < 0) x = 0;
if (y < 0) y = 0;
if (x >= img.cols) x = img.cols - 1;
if (y >= img.rows) y = img.rows - 1;
uchar *data = &img.data[int(y) * img.step + int(x)];
float xx = x - floor(x);
float yy = y - floor(y);
return float(
(1 - xx) * (1 - yy) * data[0] +
xx * (1 - yy) * data[1] +
(1 - xx) * yy * data[img.step] +
xx * yy * data[img.step + 1]
);
}
int main(int argc, char **argv) {
cv::Mat left_img = cv::imread(left_file, 0);
cv::Mat disparity_img = cv::imread(disparity_file, 0);
// let's randomly pick pixels in the first image and generate some 3d points in the first image's frame
cv::RNG rng;
int nPoints = 2000;
int boarder = 20;
VecVector2d pixels_ref;
vector<double> depth_ref;
// generate pixels in ref and load depth data
for (int i = 0; i < nPoints; i++) {
int x = rng.uniform(boarder, left_img.cols - boarder); // don't pick pixels close to boarder
int y = rng.uniform(boarder, left_img.rows - boarder); // don't pick pixels close to boarder
int disparity = disparity_img.at<uchar>(y, x);
double depth = fx * baseline / disparity; // you know this is disparity to depth
depth_ref.push_back(depth);
pixels_ref.push_back(Eigen::Vector2d(x, y));
}
// estimates 01~05.png's pose using this information
Sophus::SE3d T_cur_ref;
for (int i = 1; i < 6; i++) { // 1~10
cv::Mat img = cv::imread((fmt_others % i).str(), 0);
// try single layer by uncomment this line
// DirectPoseEstimationSingleLayer(left_img, img, pixels_ref, depth_ref, T_cur_ref);
DirectPoseEstimationMultiLayer(left_img, img, pixels_ref, depth_ref, T_cur_ref);
}
return 0;
}
void DirectPoseEstimationSingleLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::SE3d &T21) {
const int iterations = 10;
double cost = 0, lastCost = 0;
auto t1 = chrono::steady_clock::now();
JacobianAccumulator jaco_accu(img1, img2, px_ref, depth_ref, T21);
for (int iter = 0; iter < iterations; iter++) {
jaco_accu.reset();
cv::parallel_for_(cv::Range(0, px_ref.size()),
std::bind(&JacobianAccumulator::accumulate_jacobian, &jaco_accu, std::placeholders::_1));
Matrix6d H = jaco_accu.hessian();
Vector6d b = jaco_accu.bias();
// solve update and put it into estimation
Vector6d update = H.ldlt().solve(b);;
T21 = Sophus::SE3d::exp(update) * T21;
cost = jaco_accu.cost_func();
if (std::isnan(update[0])) {
// sometimes occurred when we have a black or white patch and H is irreversible
cout << "update is nan" << endl;
break;
}
if (iter > 0 && cost > lastCost) {
cout << "cost increased: " << cost << ", " << lastCost << endl;
break;
}
if (update.norm() < 1e-3) {
// converge
break;
}
lastCost = cost;
cout << "iteration: " << iter << ", cost: " << cost << endl;
}
cout << "T21 = \n" << T21.matrix() << endl;
auto t2 = chrono::steady_clock::now();
auto time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
cout << "direct method for single layer: " << time_used.count() << endl;
// plot the projected pixels here
cv::Mat img2_show;
cv::cvtColor(img2, img2_show, CV_GRAY2BGR);
VecVector2d projection = jaco_accu.projected_points();
for (size_t i = 0; i < px_ref.size(); ++i) {
auto p_ref = px_ref[i];
auto p_cur = projection[i];
if (p_cur[0] > 0 && p_cur[1] > 0) {
cv::circle(img2_show, cv::Point2f(p_cur[0], p_cur[1]), 2, cv::Scalar(0, 250, 0), 2);
cv::line(img2_show, cv::Point2f(p_ref[0], p_ref[1]), cv::Point2f(p_cur[0], p_cur[1]),
cv::Scalar(0, 250, 0));
}
}
cv::imshow("current", img2_show);
cv::waitKey();
}
void JacobianAccumulator::accumulate_jacobian(const cv::Range &range) {
// parameters
const int half_patch_size = 1;
int cnt_good = 0;
Matrix6d hessian = Matrix6d::Zero();
Vector6d bias = Vector6d::Zero();
double cost_tmp = 0;
for (size_t i = range.start; i < range.end; i++) {
// compute the projection in the second image
Eigen::Vector3d point_ref =
depth_ref[i] * Eigen::Vector3d((px_ref[i][0] - cx) / fx, (px_ref[i][1] - cy) / fy, 1);
Eigen::Vector3d point_cur = T21 * point_ref;
if (point_cur[2] < 0) // depth invalid
continue;
float u = fx * point_cur[0] / point_cur[2] + cx, v = fy * point_cur[1] / point_cur[2] + cy;
if (u < half_patch_size || u > img2.cols - half_patch_size || v < half_patch_size ||
v > img2.rows - half_patch_size)
continue;
projection[i] = Eigen::Vector2d(u, v);
double X = point_cur[0], Y = point_cur[1], Z = point_cur[2],
Z2 = Z * Z, Z_inv = 1.0 / Z, Z2_inv = Z_inv * Z_inv;
cnt_good++;
// and compute error and jacobian
for (int x = -half_patch_size; x <= half_patch_size; x++)
for (int y = -half_patch_size; y <= half_patch_size; y++) {
double error = GetPixelValue(img1, px_ref[i][0] + x, px_ref[i][1] + y) -
GetPixelValue(img2, u + x, v + y);
Matrix26d J_pixel_xi;
Eigen::Vector2d J_img_pixel;
J_pixel_xi(0, 0) = fx * Z_inv;
J_pixel_xi(0, 1) = 0;
J_pixel_xi(0, 2) = -fx * X * Z2_inv;
J_pixel_xi(0, 3) = -fx * X * Y * Z2_inv;
J_pixel_xi(0, 4) = fx + fx * X * X * Z2_inv;
J_pixel_xi(0, 5) = -fx * Y * Z_inv;
J_pixel_xi(1, 0) = 0;
J_pixel_xi(1, 1) = fy * Z_inv;
J_pixel_xi(1, 2) = -fy * Y * Z2_inv;
J_pixel_xi(1, 3) = -fy - fy * Y * Y * Z2_inv;
J_pixel_xi(1, 4) = fy * X * Y * Z2_inv;
J_pixel_xi(1, 5) = fy * X * Z_inv;
//使用img1的梯度作为第二个图像的梯度
J_img_pixel = Eigen::Vector2d(
0.5 * (GetPixelValue(img1, px_ref[i][0] + 1 + x, px_ref[i][1] + y) - GetPixelValue(img1, px_ref[i][0] - 1 + x, px_ref[i][1] + y)),
0.5 * (GetPixelValue(img1, px_ref[i][0] + x, px_ref[i][1] + 1 + y) - GetPixelValue(img1, px_ref[i][0] + x, px_ref[i][1] - 1 + y))
);
// total jacobian
Vector6d J = -1.0 * (J_img_pixel.transpose() * J_pixel_xi).transpose();
hessian += J * J.transpose();
bias += -error * J;
cost_tmp += error * error;
}
}
if (cnt_good) {
// set hessian, bias and cost
unique_lock<mutex> lck(hessian_mutex); //对该进程加锁,保证数据正确性
H += hessian;
b += bias;
cost += cost_tmp / cnt_good;
}
}
void DirectPoseEstimationMultiLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::SE3d &T21) {
// parameters
int pyramids = 4;
double pyramid_scale = 0.5;
double scales[] = {1.0, 0.5, 0.25, 0.125};
// create pyramids
vector<cv::Mat> pyr1, pyr2; // image pyramids
for (int i = 0; i < pyramids; i++) {
if (i == 0) {
pyr1.push_back(img1);
pyr2.push_back(img2);
} else {
cv::Mat img1_pyr, img2_pyr;
cv::resize(pyr1[i - 1], img1_pyr,
cv::Size(pyr1[i - 1].cols * pyramid_scale, pyr1[i - 1].rows * pyramid_scale));
cv::resize(pyr2[i - 1], img2_pyr,
cv::Size(pyr2[i - 1].cols * pyramid_scale, pyr2[i - 1].rows * pyramid_scale));
pyr1.push_back(img1_pyr);
pyr2.push_back(img2_pyr);
}
}
double fxG = fx, fyG = fy, cxG = cx, cyG = cy; // backup the old values
for (int level = pyramids - 1; level >= 0; level--) {
VecVector2d px_ref_pyr; // set the keypoints in this pyramid level
for (auto &px: px_ref) {
px_ref_pyr.push_back(scales[level] * px);
}
// scale fx, fy, cx, cy in different pyramid levels
fx = fxG * scales[level];
fy = fyG * scales[level];
cx = cxG * scales[level];
cy = cyG * scales[level];
DirectPoseEstimationSingleLayer(pyr1[level], pyr2[level], px_ref_pyr, depth_ref, T21);
}
}
原理:
反向法会用原始图像的梯度信息来估计运动,而不是目标图像的梯度。这意味着你将使用当前帧的梯度来指导如何从前一帧到达当前位置。
优点:
可能在某些情况下减少计算复杂度,因为不需要实时计算目标图像的梯度。
可以利用已知的图像特征来增强跟踪稳定性。
缺点:
可能导致估计不准确,因为原始图像与目标图像之间的变化(如形变、光照变化等)会影响结果。
对于大运动或快速变化的场景,反向法可能会失效。
追踪结果如下:
可以发现与原结果相比相差较小。
4、使用Ceres或g2o实现稀疏直接法和半稠密直接法。
以下代码分别是使用Ceres和g2o实现的稀疏直接法。
Ceres
#include <iostream>
#include <opencv2/opencv.hpp>
#include <ceres/ceres.h>
#include <Eigen/Core>
#include "sophus/se3.hpp"
#include <chrono>
#include <boost/format.hpp>
using namespace std;
using namespace Eigen;
using namespace cv;
typedef vector<Eigen::Vector2d, Eigen::aligned_allocator<Eigen::Vector2d>> VecVector2d;
typedef Eigen::Matrix<double, 2, 6> Matrix26d;
// Camera intrinsics
double fx = 718.856, fy = 718.856, cx = 607.1928, cy = 185.2157;
// baseline
double baseline = 0.573;
// paths
string left_file = "./left.png";
string disparity_file = "./disparity.png";
boost::format fmt_others("./%06d.png"); // other files
/**
* pose estimation using direct method
* @param img1
* @param img2
* @param px_ref
* @param depth_ref
* @param se3
*/
void DirectPoseEstimationMultiLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::Vector6d &se3
);
/**
* pose estimation using direct method
* @param img1
* @param img2
* @param px_ref
* @param depth_ref
* @param se3
*/
void DirectPoseEstimationSingleLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::Vector6d &se3
);
// bilinear interpolation
inline float GetPixelValue(const cv::Mat &img, float x, float y) {
// boundary check
if (x < 0) x = 0;
if (y < 0) y = 0;
if (x >= img.cols) x = img.cols - 1;
if (y >= img.rows) y = img.rows - 1;
uchar *data = &img.data[int(y) * img.step + int(x)];
float xx = x - floor(x);
float yy = y - floor(y);
return float(
(1 - xx) * (1 - yy) * data[0] +
xx * (1 - yy) * data[1] +
(1 - xx) * yy * data[img.step] +
xx * yy * data[img.step + 1]
);
}
class DirectMethodLuminaError : public ceres::SizedCostFunction<1, 6> {
public:
EIGEN_MAKE_ALIGNED_OPERATOR_NEW
DirectMethodLuminaError(Vector2d px_ref, double depth_ref, Mat img1, Mat img2) :
_px_ref(px_ref), _depth_ref(depth_ref), _img1(img1), _img2(img2) {}
virtual ~DirectMethodLuminaError() {}
virtual bool Evaluate(
double const* const* parameters, double *residuals, double **jacobians) const {
Eigen::Map<const Eigen::Matrix<double,6,1>> se3(*parameters);
Sophus::SE3d T = Sophus::SE3d::exp(se3);
Eigen::Vector3d point_ref =_depth_ref * Eigen::Vector3d((_px_ref[0] - cx) / fx, (_px_ref[1] - cy) / fy, 1);
Eigen::Vector3d point_cur = T * point_ref;
double u = fx * point_cur[0] / point_cur[2] + cx, v = fy * point_cur[1] / point_cur[2] + cy;
residuals[0] = GetPixelValue(_img1, _px_ref[0], _px_ref[1]) - GetPixelValue(_img2, u, v);
if(jacobians != NULL) {
if(jacobians[0] != NULL) {
Eigen::Map<Eigen::Matrix<double, 6, 1>> J(jacobians[0]);
double X = point_cur[0], Y = point_cur[1], Z = point_cur[2];
double Z2 = Z * Z, Z_inv = 1.0 / Z, Z2_inv = Z_inv * Z_inv;
Matrix26d J_pixel_xi;
Eigen::Vector2d J_img_pixel;
J_pixel_xi(0, 0) = fx * Z_inv;
J_pixel_xi(0, 1) = 0;
J_pixel_xi(0, 2) = -fx * X * Z2_inv;
J_pixel_xi(0, 3) = -fx * X * Y * Z2_inv;
J_pixel_xi(0, 4) = fx + fx * X * X * Z2_inv;
J_pixel_xi(0, 5) = -fx * Y * Z_inv;
J_pixel_xi(1, 0) = 0;
J_pixel_xi(1, 1) = fy * Z_inv;
J_pixel_xi(1, 2) = -fy * Y * Z2_inv;
J_pixel_xi(1, 3) = -fy - fy * Y * Y * Z2_inv;
J_pixel_xi(1, 4) = fy * X * Y * Z2_inv;
J_pixel_xi(1, 5) = fy * X * Z_inv;
J_img_pixel = Eigen::Vector2d(
0.5 * (GetPixelValue(_img2, u + 1 , v ) - GetPixelValue(_img2, u - 1 , v )),
0.5 * (GetPixelValue(_img2, u , v + 1 ) - GetPixelValue(_img2, u , v - 1 ))
);
J = -1.0 * (J_img_pixel.transpose() * J_pixel_xi).transpose();
}
}
return true;
}
private:
const Vector2d _px_ref;
const double _depth_ref;
const Mat _img1;
const Mat _img2;
};
int main(int argc, char **argv) {
cv::Mat left_img = cv::imread(left_file, 0);
cv::Mat disparity_img = cv::imread(disparity_file, 0);
// let's randomly pick pixels in the first image and generate some 3d points in the first image's frame
cv::RNG rng;
int nPoints = 2000;
int boarder = 20;
VecVector2d pixels_ref;
vector<double> depth_ref;
// generate pixels in ref and load depth data
for (int i = 0; i < nPoints; i++) {
int x = rng.uniform(boarder, left_img.cols - boarder); // don't pick pixels close to boarder
int y = rng.uniform(boarder, left_img.rows - boarder); // don't pick pixels close to boarder
int disparity = disparity_img.at<uchar>(y, x);
double depth = fx * baseline / disparity; // you know this is disparity to depth
depth_ref.push_back(depth);
pixels_ref.push_back(Eigen::Vector2d(x, y));
}
Sophus::Vector6d se3;
for (int i = 1; i < 6; i++) { // 1~10
cv::Mat img = cv::imread((fmt_others % i).str(), 0);
// try single layer by uncomment this line
// DirectPoseEstimationSingleLayer(left_img, img, pixels_ref, depth_ref, se3);
cout<< "现在是第" << i <<"次"<<endl;
DirectPoseEstimationMultiLayer(left_img, img, pixels_ref, depth_ref, se3);
}
return 0;
}
void DirectPoseEstimationSingleLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::Vector6d &se3
){
ceres::Problem problem;
for(int j=0; j<px_ref.size(); ++j) {
ceres::CostFunction *cost_function;
cost_function = new DirectMethodLuminaError(px_ref[j], depth_ref[j], img1, img2);
problem.AddResidualBlock(cost_function, NULL, se3.data());
}
auto t1 = chrono::steady_clock::now();
ceres::Solver::Options options;
options.dynamic_sparsity = true;
options.max_num_iterations = 100;
options.sparse_linear_algebra_library_type = ceres::SUITE_SPARSE;
options.minimizer_type = ceres::TRUST_REGION;
options.linear_solver_type = ceres::SPARSE_NORMAL_CHOLESKY;
options.trust_region_strategy_type = ceres::DOGLEG;
options.minimizer_progress_to_stdout = false;
options.dogleg_type = ceres::SUBSPACE_DOGLEG;
ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);
//std::cout << summary.BriefReport() << "\n";
std::cout << "estimated pose: \n" << Sophus::SE3d::exp(se3).matrix() << std::endl;
auto t2 = chrono::steady_clock::now();
auto time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
cout << "solve pnp in ceres cost time: " << time_used.count() << " seconds." << endl;
}
void DirectPoseEstimationMultiLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::Vector6d &se3
){
// parameters
int pyramids = 4;
double pyramid_scale = 0.5;
double scales[] = {1.0, 0.5, 0.25, 0.125};
// create pyramids
vector<cv::Mat> pyr1, pyr2; // image pyramids
for (int i = 0; i < pyramids; i++) {
if (i == 0) {
pyr1.push_back(img1);
pyr2.push_back(img2);
} else {
cv::Mat img1_pyr, img2_pyr;
cv::resize(pyr1[i - 1], img1_pyr,
cv::Size(pyr1[i - 1].cols * pyramid_scale, pyr1[i - 1].rows * pyramid_scale));
cv::resize(pyr2[i - 1], img2_pyr,
cv::Size(pyr2[i - 1].cols * pyramid_scale, pyr2[i - 1].rows * pyramid_scale));
pyr1.push_back(img1_pyr);
pyr2.push_back(img2_pyr);
}
}
double fxG = fx, fyG = fy, cxG = cx, cyG = cy; // backup the old values
for (int level = pyramids - 1; level >= 0; level--) {
VecVector2d px_ref_pyr; // set the keypoints in this pyramid level
for (auto &px: px_ref) {
px_ref_pyr.push_back(scales[level] * px);
}
// scale fx, fy, cx, cy in different pyramid levels
fx = fxG * scales[level];
fy = fyG * scales[level];
cx = cxG * scales[level];
cy = cyG * scales[level];
cout<<"pyramid"<< level+1 <<'\t';
DirectPoseEstimationSingleLayer(pyr1[level], pyr2[level], px_ref_pyr, depth_ref, se3);
}
}
G2O
#include <iostream>
#include <opencv2/opencv.hpp>
#include <Eigen/Core>
#include <g2o/core/base_vertex.h>
#include <g2o/core/base_unary_edge.h>
#include <g2o/core/block_solver.h>
#include <g2o/core/optimization_algorithm_gauss_newton.h>
#include <g2o/core/optimization_algorithm_levenberg.h>
#include <g2o/solvers/dense/linear_solver_dense.h>
#include <chrono>
#include <sophus/se3.hpp>
#include <boost/format.hpp>
using namespace std;
using namespace cv;
// Camera intrinsics
double fx = 718.856, fy = 718.856, cx = 607.1928, cy = 185.2157;
// baseline
double baseline = 0.573;
// paths
string left_file = "./left.png";
string disparity_file = "./disparity.png";
boost::format fmt_others("./%06d.png"); // other files
// g2o
typedef vector<Eigen::Vector2d, Eigen::aligned_allocator<Eigen::Vector2d>> VecVector2d;
typedef Eigen::Matrix<double, 2, 6> Matrix26d;
/**
* pose estimation using direct method
* @param img1
* @param img2
* @param px_ref
* @param depth_ref
* @param T
*/
void DirectPoseEstimationMultiLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::SE3d &T
);
/**
* pose estimation using direct method
* @param img1
* @param img2
* @param px_ref
* @param depth_ref
* @param T
*/
void DirectPoseEstimationSingleLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::SE3d &T
);
// bilinear interpolation
inline double GetPixelValue(const cv::Mat &img, float x, float y) {
// boundary check
if (x < 0) x = 0;
if (y < 0) y = 0;
if (x >= img.cols) x = img.cols - 1;
if (y >= img.rows) y = img.rows - 1;
uchar *data = &img.data[int(y) * img.step + int(x)];
float xx = x - floor(x);
float yy = y - floor(y);
return double(
(1 - xx) * (1 - yy) * data[0] +
xx * (1 - yy) * data[1] +
(1 - xx) * yy * data[img.step] +
xx * yy * data[img.step + 1]
);
}
/// vertex and edges used in g2o ba
class VertexPose : public g2o::BaseVertex<6, Sophus::SE3d> {
public:
EIGEN_MAKE_ALIGNED_OPERATOR_NEW;
virtual void setToOriginImpl() override {
_estimate = Sophus::SE3d();
}
/// left multiplication on SE3
virtual void oplusImpl(const double *update) override {
Eigen::Matrix<double, 6, 1> update_eigen;
update_eigen << update[0], update[1], update[2], update[3], update[4], update[5];
_estimate = Sophus::SE3d::exp(update_eigen) * _estimate;
}
virtual bool read(istream &in) override {}
virtual bool write(ostream &out) const override {}
};
//1元边,测量值维度是1,对应测量值类型为灰度差,顶点对应的数据类型都是VertexPose
class EdgeProjection : public g2o::BaseUnaryEdge<1, double, VertexPose> {
public:
EIGEN_MAKE_ALIGNED_OPERATOR_NEW;
EdgeProjection(const Eigen::Vector2d &px_ref, const double depth_ref, const Mat img1, const Mat img2)
: _px_ref(px_ref), _depth_ref(depth_ref), _img1(img1), _img2(img2) {}
virtual void computeError() override {
const VertexPose *v = static_cast<VertexPose *> (_vertices[0]);
Sophus::SE3d T = v->estimate();
Eigen::Vector3d point_ref =_depth_ref * Eigen::Vector3d((_px_ref[0] - cx) / fx, (_px_ref[1] - cy) / fy, 1);
Eigen::Vector3d point_cur = T * point_ref;
double u1 = fx * point_cur[0] / point_cur[2] + cx, v1 = fy * point_cur[1] / point_cur[2] + cy;
_error(0, 0) = _measurement - GetPixelValue(_img2, u1, v1);
}
virtual void linearizeOplus() override {
const VertexPose *v = static_cast<VertexPose *> (_vertices[0]);
Sophus::SE3d T = v->estimate();
Eigen::Vector3d point_ref =_depth_ref * Eigen::Vector3d((_px_ref[0] - cx) / fx, (_px_ref[1] - cy) / fy, 1);
Eigen::Vector3d point_cur = T * point_ref;
double u1 = fx * point_cur[0] / point_cur[2] + cx, v1 = fy * point_cur[1] / point_cur[2] + cy;
Eigen::Matrix<double, 6, 1> J;
double X = point_cur[0], Y = point_cur[1], Z = point_cur[2];
double Z2 = Z * Z, Z_inv = 1.0 / Z, Z2_inv = Z_inv * Z_inv;
Matrix26d J_pixel_xi;
Eigen::Vector2d J_img_pixel;
J_pixel_xi(0, 0) = fx * Z_inv;
J_pixel_xi(0, 1) = 0;
J_pixel_xi(0, 2) = -fx * X * Z2_inv;
J_pixel_xi(0, 3) = -fx * X * Y * Z2_inv;
J_pixel_xi(0, 4) = fx + fx * X * X * Z2_inv;
J_pixel_xi(0, 5) = -fx * Y * Z_inv;
J_pixel_xi(1, 0) = 0;
J_pixel_xi(1, 1) = fy * Z_inv;
J_pixel_xi(1, 2) = -fy * Y * Z2_inv;
J_pixel_xi(1, 3) = -fy - fy * Y * Y * Z2_inv;
J_pixel_xi(1, 4) = fy * X * Y * Z2_inv;
J_pixel_xi(1, 5) = fy * X * Z_inv;
J_img_pixel = Eigen::Vector2d(
0.5 * (GetPixelValue(_img2, u1 + 1 , v1 ) - GetPixelValue(_img2, u1 - 1 , v1 )),
0.5 * (GetPixelValue(_img2, u1 , v1 + 1 ) - GetPixelValue(_img2, u1 , v1 - 1 ))
);
// total jacobian
J = -1.0 * (J_img_pixel.transpose() * J_pixel_xi).transpose();
_jacobianOplusXi << J[0], J[1], J[2], J[3], J[4], J[5];
}
virtual bool read(istream &in) override {}
virtual bool write(ostream &out) const override {}
private:
const Eigen::Vector2d _px_ref;
const double _depth_ref;
const Mat _img1;
const Mat _img2;
};
int main(int argc, char **argv) {
cv::Mat left_img = cv::imread(left_file, 0);
cv::Mat disparity_img = cv::imread(disparity_file, 0);
// let's randomly pick pixels in the first image and generate some 3d points in the first image's frame
cv::RNG rng;
int nPoints = 2000;
int boarder = 20;
VecVector2d pixels_ref;
vector<double> depth_ref;
// generate pixels in ref and load depth data
for (int i = 0; i < nPoints; i++) {
int x = rng.uniform(boarder, left_img.cols - boarder); // don't pick pixels close to boarder
int y = rng.uniform(boarder, left_img.rows - boarder); // don't pick pixels close to boarder
int disparity = disparity_img.at<uchar>(y, x);
double depth = fx * baseline / disparity; // you know this is disparity to depth
depth_ref.push_back(depth);
pixels_ref.push_back(Eigen::Vector2d(x, y));
}
Sophus::SE3d T;
for (int i = 1; i < 6; i++) { // 1~10
cv::Mat img = cv::imread((fmt_others % i).str(), 0);
// try single layer by uncomment this line
// DirectPoseEstimationSingleLayer(left_img, img, pixels_ref, depth_ref, T_cur_ref);
cout<< "现在是第" << i <<"次"<<endl;
DirectPoseEstimationMultiLayer(left_img, img, pixels_ref, depth_ref, T);
}
return 0;
}
void DirectPoseEstimationSingleLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::SE3d &T
){
// 构建图优化,先设定g2o
typedef g2o::BlockSolver<g2o::BlockSolverTraits<6, 1>> BlockSolverType;
typedef g2o::LinearSolverDense<BlockSolverType::PoseMatrixType> LinearSolverType; // 线性求解器类型
// 梯度下降方法,可以从GN, LM, DogLeg 中选
auto solver = new g2o::OptimizationAlgorithmGaussNewton(
g2o::make_unique<BlockSolverType>(g2o::make_unique<LinearSolverType>()));
g2o::SparseOptimizer optimizer; // 图模型
optimizer.setAlgorithm(solver); // 设置求解器
optimizer.setVerbose(true); // 打开调试输出
VertexPose *vertex_pose = new VertexPose(); // camera vertex_pose
vertex_pose->setId(0);
vertex_pose->setEstimate(T);
optimizer.addVertex(vertex_pose);
// edges
int index = 1;
//新增部分:第一个相机作为顶点连接的边
for (size_t i = 0; i < px_ref.size(); ++i) {
EdgeProjection *edge = new EdgeProjection(px_ref[i], depth_ref[i], img1, img2);
edge->setId(index);
edge->setVertex(0, vertex_pose);
edge->setMeasurement(GetPixelValue(img1, px_ref[i][0], px_ref[i][1]));
edge->setInformation(Eigen::Matrix<double, 1, 1>::Identity());
optimizer.addEdge(edge);
index++;
}
chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
optimizer.setVerbose(false);
optimizer.initializeOptimization();
optimizer.optimize(10);
chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
cout << "optimization costs time: " << time_used.count() << " seconds." << endl;
cout << "pose estimated =\n" << vertex_pose->estimate().matrix() << endl;
T = vertex_pose->estimate();
}
void DirectPoseEstimationMultiLayer(
const cv::Mat &img1,
const cv::Mat &img2,
const VecVector2d &px_ref,
const vector<double> depth_ref,
Sophus::SE3d &T
){
// parameters
int pyramids = 4;
double pyramid_scale = 0.5;
double scales[] = {1.0, 0.5, 0.25, 0.125};
// create pyramids
vector<cv::Mat> pyr1, pyr2; // image pyramids
for (int i = 0; i < pyramids; i++) {
if (i == 0) {
pyr1.push_back(img1);
pyr2.push_back(img2);
} else {
cv::Mat img1_pyr, img2_pyr;
cv::resize(pyr1[i - 1], img1_pyr,
cv::Size(pyr1[i - 1].cols * pyramid_scale, pyr1[i - 1].rows * pyramid_scale));
cv::resize(pyr2[i - 1], img2_pyr,
cv::Size(pyr2[i - 1].cols * pyramid_scale, pyr2[i - 1].rows * pyramid_scale));
pyr1.push_back(img1_pyr);
pyr2.push_back(img2_pyr);
}
}
double fxG = fx, fyG = fy, cxG = cx, cyG = cy; // backup the old values
for (int level = pyramids - 1; level >= 0; level--) {
VecVector2d px_ref_pyr; // set the keypoints in this pyramid level
for (auto &px: px_ref) {
px_ref_pyr.push_back(scales[level] * px);
}
// scale fx, fy, cx, cy in different pyramid levels
fx = fxG * scales[level];
fy = fyG * scales[level];
cx = cxG * scales[level];
cy = cyG * scales[level];
cout<<"pyramid"<< level+1 <<'\t';
DirectPoseEstimationSingleLayer(pyr1[level], pyr2[level], px_ref_pyr, depth_ref, T);
}
}
与示例程序的直接法不同,这里没有考虑窗口内的其他像素点,雅克比矩阵以及残差的计算仅考虑当前像素点,因此Ceres计算得到的平移向量和示例程序结果相差较大,但G2O结果比较接近,一定程度上说明灰度不变这一假设很强,输入点的数量与位置对计算结果有着较大影响。且因为没有进行并行计算加速,计算速度比示例程序慢不少。
运行结果:
G2O
Ceres
输出日志solve pnp 仅打印有误,程序无误。