计算机视觉大型攻略 —— 视觉里程计(2) libviso2

最新推荐文章于 2023-10-30 22:29:56 发布

linusyue

最新推荐文章于 2023-10-30 22:29:56 发布

阅读量2.9k

点赞数 8

分类专栏：视觉定位文章标签： libviso 视觉里程计 visual odometry 计算机视觉 viso

本文链接：https://blog.csdn.net/plateros/article/details/103402793

版权

视觉定位专栏收录该内容

9 篇文章

订阅专栏

参考文献:

[1] Visual Odometry based on Stereo Image Sequences with RANSAC-based Outlier Rejection Scheme

[2] StereoScan: Dense 3d Reconstruction in Real-time

libviso2是AVG(Autonomous Vision Group)小组的开源项目。项目网址：http://www.cvlibs.net/software/libviso/

该项目实现了文献[1], [2]的视觉里程计和3D重建算法。本文主要分析视觉里程计(Visual Odometry, 以下简称VO)的部分，前半部分算法说明思路，后半部分给出了一些关键代码的注释。

更多关于VO的基础知识，请参考下面这篇文章。

传送门：视觉里程计综述

通常，VO算法分为如下几个步骤，

输入图像序列
特征点检测与匹配
运动估计
本地优化

文献[1], [2]设计了一种基于双目摄像机的算法，因此计算VO时需要四张图片——当前帧与前一帧的左，右图，而且这四张图已经做了立体校正。算法首先在这四张图的范围里寻找特征点并做匹配，再通过3D投影关系做运动估计，最后通过Kalman滤波做简单优化。libviso2实现了上述算法，对于单目相机，也给出了实现。以下内容主要解释了双目相机算法。

传送门：双目相机标定与校正

特征与匹配

如下图，libviso2采用了5*5的blob和corner掩码寻找特征点。

图(a)为Blob掩码。这是一个高斯拉普拉斯算子。图（b）是角点掩码。算法保留两种算子的局部最大和最小值作为特征点(blob max, blob min, corner max, corner min)。图(c)是特征点的特征描述符，保留了图示布局的梯度值(Sobel滤波)。

与SIFT，SURF相比, libviso2算法的特征与特征描述符看上去比较简单，并没有着重关注旋转/尺度不变性。这是因为作者认为相机自运动是平滑的刚体运动，不同图像之间特征的尺度，旋转变化不明显。更多关于特征值的内容，可移步我的专栏文章。

传送门: 特征与匹配专栏

特征匹配采用SAD算法。计算图(c)所示11*11窗口内特定位置的Sobel值的差的和。

匹配算法采用了"circle"策略，即首先选取当前帧左图图像的特征点，去前一帧的左图图像搜索best match，再去前一帧右图搜索best match，之后去当前帧右图搜索best match，再回到当前帧左图搜索特征点。

如果最终找到的特征点，与初始特征点满足"一致"的条件，作为内点保留，然后进一步计算sub-pixel坐标。

运动估计

在做运动估计计算之前，首先采用Bucket策略进一步减少或者说均衡特征点。图像被均分为很多长方形区域。每个区域限制特征点个数的最大值。

Bucketing策略

如上图，黄色实线为长方形的分割线。这样做会带来几个优点，

减少特征点提高效率。
均衡z轴(深度方向)的特征点分布，远近特征点均有机会被运动估计算法采用，从而提高了估计精度。
图像特征分布更均匀。考虑在动态场景中，大量的特征点可能位于独立的运动物体上。Bucketing之后，可以保证总有特征点不在这些运动的物体上，~~除非满眼都是运动物体~~。

运动估计算法

考虑相机模型，

上面的公式已知相机坐标系的3D坐标(x, y, z)转成2D的图像坐标(u, v)

相机已经矫正过了，所以可以认为f= fx = fy，斜角参数为0。s 为基线。由于采用左侧相机为坐标原点，因此右侧相机需要减掉基线长度。求解最优RT，可以建模为最优化如下的代价函数。

公式(2)，前半部分是左图的代价函数，后半部分为右图的代价函数。以前半部分为例，解释一下。设 $P_{i}, P_{i}^{'}$ 分别为当前帧和前一帧左图上的匹配点。 $P_{i}$ 的2维图像坐标为 $x_{i}^{(l)}$ ， $X_{i}$ 是 $P_{i}^{'}$ （前一帧的特征点）在相机坐标系下的三维坐标(x, y, z)，通过(r, t)和投影函数π，可以计算得到该坐标投影到当前帧上的图像坐标。该坐标与原先的坐标 $P_{i}$ 的差异，构成了代价函数。需要优化一个最优的(r, t)使这个差异最小。作者采用了牛顿高斯法来解决这个问题。

局部优化

由于匹配算法本身存在的误差，运动物体的干扰，VO算法会采用RASAC算法去除外点。libviso2同样采用了此算法。

RANSAC算法

从所有匹配点中，随机挑选3组点对。
计算刚体运动变换和内点个数，保留内点。
重复N次
采用含最多内点的组重新估计运动。

之所以随机挑选3对，是因为刚体运动条件下，3对是最小需要的独立点对数。

最后，该算法采用经典Kalman滤波，使用匀加速模型进一步优化最终结果。

完整的程序框架，伪代码形式如下。

//伪代码

[imgL0, imgR0, imgL1, imgR1] = getIimages(); //输入图像序列

FeaturePoints = matching(imgL0, imgR0, imgL1, imgR1);  //特征匹配
FeaturePoints = buckeingFeatures(FeaturePoints ); //Bucketing特征点

//ransac
for ransac_max //最大迭代次数
{
    candiPoints= randomSelect(FeaturePoints )  //随机挑选3个点
    [r, t] = minimize(candiPoints);     //牛顿高斯迭代优化
    [inliers, nums] = getInliers //计算内点个数，并保留内点
}

maxInliers = max_inlier(inliers, nums) //挑选内点最多的那一组
[r t] = minimize(maxInliers)  //用这些内点重新计算R, T

libviso2为保证实时性，并没有采用任何Pose-Graph或者BA优化处理，对积累误差的处理可能是不够的。采用卡尔曼滤波确实可以根据运动模型减少误差，但是对于积累误差的改善应该不大。另外，考虑一个极限情况，在某一帧，满屏都是运动物体（其实这种情况很常见)。此时，无论怎么优化，都很难得到正确的结果，因此还需要Keyframe的介入。因此libviso2算法的结果应该说是性能和效果的一个平衡。

代码

程序完整代码可以由项目网站下载。http://www.cvlibs.net/download.php?file=libviso2.zip

VO算法主体由下面process函数实现，

 33 bool VisualOdometryStereo::process (uint8_t *I1,uint8_t *I2,int32_t* dims,bool replace) {
 34 
 35   // push back images
 36   matcher->pushBack(I1,I2,dims,replace);
 37 
 38   // bootstrap motion estimate if invalid
 39   if (!Tr_valid) {
 40     matcher->matchFeatures(2);
 41     matcher->bucketFeatures(param.bucket.max_features,param.bucket.bucket_width,param.bucket.buck    et_height);
 42     p_matched = matcher->getMatches();
 43     updateMotion();
 44   }
 45 
 46   // match features and update motion
 47   if (Tr_valid) matcher->matchFeatures(2,&Tr_delta);
 48   else          matcher->matchFeatures(2);
 49   matcher->bucketFeatures(param.bucket.max_features,param.bucket.bucket_width,param.bucket.bucket    _height);
 50   p_matched = matcher->getMatches();
 51   return updateMotion();
 52 }

特征点计算computeFeatures在第36行 pushBack 中调用。

  95 void Matcher::pushBack (uint8_t *I1,uint8_t* I2,int32_t* dims,const bool replace) {
  96 
  97   // image dimensions
  98   int32_t width  = dims[0];
  99   int32_t height = dims[1];
 100   int32_t bpl    = dims[2];
       ...
       
 177   // compute new features for current frame
 178   computeFeatures(I1c,dims_c,m1c1,n1c1,m1c2,n1c2,I1c_du,I1c_dv,I1c_du_full,I1c_dv_full);
 179   if (I2!=0)
 180     computeFeatures(I2c,dims_c,m2c1,n2c1,m2c2,n2c2,I2c_du,I2c_dv,I2c_du_full,I2c_dv_full);
 181 }

computeFeatures函数对输入图像做sobel, blob, corner滤波。挑出blob(I_f1)和corner(I_f2)响应函数的极值，用其sobel值结果I_du, I_dv)计算特征描述符。

 649 void Matcher::computeFeatures (uint8_t *I,const int32_t* dims,int32_t* &max1,int32_t &num1,int32     _t* &max2,int32_t &num2,uint8_t* &I_du,uint8_t* &I_dv,uint8_t* &I_du_full,uint8_t* &I_dv_full) {
 650 
 651   int16_t *I_f1;
 652   int16_t *I_f2;
 653 
 654   int32_t dims_matching[3];
 655   memcpy(dims_matching,dims,3*sizeof(int32_t));
 656 
 657   // allocate memory for sobel images and filter images
 658   if (!param.half_resolution) {
 659     I_du = (uint8_t*)_mm_malloc(dims[2]*dims[1]*sizeof(uint8_t*),16);
 660     I_dv = (uint8_t*)_mm_malloc(dims[2]*dims[1]*sizeof(uint8_t*),16);
 661     I_f1 = (int16_t*)_mm_malloc(dims[2]*dims[1]*sizeof(int16_t),16);
 662     I_f2 = (int16_t*)_mm_malloc(dims[2]*dims[1]*sizeof(int16_t),16);
 663     filter::sobel5x5(I,I_du,I_dv,dims[2],dims[1]);
 664     filter::blob5x5(I,I_f1,dims[2],dims[1]);
 665     filter::checkerboard5x5(I,I_f2,dims[2],dims[1]);
 666   } else {
         ...
 680   }
 681 
 682   // extract sparse maxima (1st pass) via non-maximum suppression
 683   vector<Matcher::maximum> maxima1;
 684   if (param.multi_stage) {
 685     int32_t nms_n_sparse = param.nms_n*3;
 686     if (nms_n_sparse>10)
 687       nms_n_sparse = max(param.nms_n,10);
 688     nonMaximumSuppression(I_f1,I_f2,dims_matching,maxima1,nms_n_sparse);
 689     computeDescriptors(I_du,I_dv,dims_matching[2],maxima1);
 690   }
 691
 692   // extract dense maxima (2nd pass) via non-maximum suppression
 693   vector<Matcher::maximum> maxima2;
 694   nonMaximumSuppression(I_f1,I_f2,dims_matching,maxima2,param.nms_n);
 695   computeDescriptors(I_du,I_dv,dims_matching[2],maxima2);

matchFeatures实现了特征匹配。match函数实现了“circle"匹配策略。

 183 void Matcher::matchFeatures(int32_t method, Matrix *Tr_delta) {
...
 219   if (param.multi_stage) {                                                                       
 220     
 221     // 1st pass (sparse matches)
 222     matching(m1p1,m2p1,m1c1,m2c1,n1p1,n2p1,n1c1,n2c1,p_matched_1,method,false,Tr_delta);         
 223     removeOutliers(p_matched_1,method);                                                          
 224     
 225     // compute search range prior statistics (used for speeding up 2nd pass)                     
 226     computePriorStatistics(p_matched_1,method);                                                  
 227     
 228     // 2nd pass (dense matches)
 229     matching(m1p2,m2p2,m1c2,m2c2,n1p2,n2p2,n1c2,n2c2,p_matched_2,method,true,Tr_delta);          
 230     if (param.refinement>0)
 231       refinement(p_matched_2,method);
 232     removeOutliers(p_matched_2,method);                                                          
 233   
 234   // single pass matching                                                                        
 235   } else {
 236     matching(m1p2,m2p2,m1c2,m2c2,n1p2,n2p2,n1c2,n2c2,p_matched_2,method,false,Tr_delta);         
 237     if (param.refinement>0)
 238       refinement(p_matched_2,method);
 239     removeOutliers(p_matched_2,method);                                                          
 240   }

matcher->bucketFeatures 实现了Bucketing均匀采样。注意，Bucket大小固定，但是并不是在原图上划分Bucket，而是在(0, 0) 到(vmax, umax)的区域里划分。vmax, umax分别为匹配点的最大行数，列数。为进一步保证随机性，函数在选择那些特征点保留时，采用了random_shuffle，随机取。

 243 void Matcher::bucketFeatures(int32_t max_features,float bucket_width,float bucket_height) {
 244 
 245   // find max values
 246   float u_max = 0;
 247   float v_max = 0;
 248   for (vector<p_match>::iterator it = p_matched_2.begin(); it!=p_matched_2.end(); it++) {
 249     if (it->u1c>u_max) u_max=it->u1c;
 250     if (it->v1c>v_max) v_max=it->v1c;
 251   }
 252 
 253   // allocate number of buckets needed
 254   int32_t bucket_cols = (int32_t)floor(u_max/bucket_width)+1;
 255   int32_t bucket_rows = (int32_t)floor(v_max/bucket_height)+1;
 256   vector<p_match> *buckets = new vector<p_match>[bucket_cols*bucket_rows];
 257 
 258   // assign matches to their buckets
 259   for (vector<p_match>::iterator it=p_matched_2.begin(); it!=p_matched_2.end(); it++) {
 260     int32_t u = (int32_t)floor(it->u1c/bucket_width);
 261     int32_t v = (int32_t)floor(it->v1c/bucket_height);
 262     buckets[v*bucket_cols+u].push_back(*it);
 263   }
 264  
 265   // refill p_matched from buckets
 266   p_matched_2.clear();
 267   for (int32_t i=0; i<bucket_cols*bucket_rows; i++) {
 268 
 269     // shuffle bucket indices randomly
 270     std::random_shuffle(buckets[i].begin(),buckets[i].end());
 271 
 272     // add up to max_features features from this bucket to p_matched
 273     int32_t k=0;
 274     for (vector<p_match>::iterator it=buckets[i].begin(); it!=buckets[i].end(); it++) {
 275       p_matched_2.push_back(*it);
 276       k++;
 277       if (k>=max_features)
 278         break;
 279     }
 280   }

updateMotion调用estimateMotion(p_matched)，迭代计算R, T。该函数有两个循环，第一个循环是RANSAC迭代次数的循环，

第二个循环（110行）是通过牛顿高斯法优化上文的代价函数，根据三个点求r ,t。getInlier函数获得此r, t的所有内点。RANSAC次数结束后，取最多内点的r, t。用其所有内点重新优化，求得最终的r, t。

 54 vector<double> VisualOdometryStereo::estimateMotion (vector<Matcher::p_match> p_matched) {        
  ....                                                            
 81   // project matches of previous image into 3d                                                    
 82   for (int32_t i=0; i<N; i++) {
 83     double d = max(p_matched[i].u1p - p_matched[i].u2p,0.0001f);                                  
 84     X[i] = (p_matched[i].u1p-param.calib.cu)*param.base/d;
 85     Y[i] = (p_matched[i].v1p-param.calib.cv)*param.base/d;                                        
 86     Z[i] = param.calib.f*param.base/d;
 87   } 
....
 97   // initial RANSAC estimate                                                                      
 98   for (int32_t k=0;k<param.ransac_iters;k++) {                                                    
 99   
100     // draw random sample set
101     vector<int32_t> active = getRandomSample(N,3);
...
107     // minimize reprojection errors
108     VisualOdometryStereo::result result = UPDATED;
109     int32_t iter=0;
110     while (result==UPDATED) {
111       result = updateParameters(p_matched,active,tr_delta_curr,1,1e-6);
112       if (iter++ > 20 || result==CONVERGED)
113         break;
114     }
115 
116     // overwrite best parameters if we have more inliers
117     if (result!=FAILED) {
118       vector<int32_t> inliers_curr = getInlier(p_matched,tr_delta_curr);
119       if (inliers_curr.size()>inliers.size()) {
120         inliers = inliers_curr;
121         tr_delta = tr_delta_curr;
122       }
123     }
124   }
125 
126   // final optimization (refinement)
127   if (inliers.size()>=6) {
128     int32_t iter=0;
129     VisualOdometryStereo::result result = UPDATED;
130     while (result==UPDATED) {
131       result = updateParameters(p_matched,inliers,tr_delta,1,1e-8);
132       if (iter++ > 100 || result==CONVERGED)
133         break;
134     }
135

updateParameters函数实现了高斯牛顿法迭代优化。其中函数computeResidualAndJacobian计算高斯牛顿法所需的参差矩阵和雅克比矩阵。

179 VisualOdometryStereo::result VisualOdometryStereo::updateParameters(vector<Matcher::p_match> &p_matched,vector<int32_t> &active,vector<double> &tr,double step_size,double eps) {                       
180   
181   // we need at least 3 observations
182   if (active.size()<3)                                                                            
183     return FAILED;
184   
185   // extract observations and compute predictions                                                 
186   computeObservations(p_matched,active);                                                          
187   computeResidualsAndJacobian(tr,active);
188   
189   // init
190   Matrix A(6,6);                                                                                  
191   Matrix B(6,1);                                                                                  
192   
193   // fill matrices A and B                                                                        
194   for (int32_t m=0; m<6; m++) {                                                                   
195     for (int32_t n=0; n<6; n++) {                                                                 
196       double a = 0; 
197       for (int32_t i=0; i<4*(int32_t)active.size(); i++) {                                        
198         a += J[i*6+m]*J[i*6+n];                                                                   
199       }
200       A.val[m][n] = a;
201     } 
202     double b = 0; 
203     for (int32_t i=0; i<4*(int32_t)active.size(); i++) {                                          
204       b += J[i*6+m]*(p_residual[i]);                                                              
205     }
206     B.val[m][0] = b;
207   }