A Detailed Guide to Using the Ceres Optimization Library
Main reference: http://ceres-solver.org/nnls_modeling.html
Part 1
Ceres solves bounds-constrained robustified non-linear least-squares problems of the form

$$\begin{array}{cl}{\min _{\mathbf{x}}} & {\frac{1}{2} \sum_{i} \rho_{i}\left(\left\|f_{i}\left(x_{i_{1}}, \ldots, x_{i_{k}}\right)\right\|^{2}\right)} \\ {\text { s.t. }} & {l_{j} \leq x_{j} \leq u_{j}}\end{array}$$
$\rho_{i}\left(\left\|f_{i}\left(x_{i_{1}}, \dots, x_{i_{k}}\right)\right\|^{2}\right)$ is a ResidualBlock.
$f_i(\cdot)$ is a CostFunction.
$\left[x_{i_1}, \ldots, x_{i_k}\right]$ is a ParameterBlock.
$l_j$ and $u_j$ are the bounds on the parameter blocks.
$\rho_i$ is a LossFunction, such as the Huber kernel, the Cauchy kernel, and other robust kernels.
If $\rho_i(x) = x$, $l_j = -\infty$, and $u_j = \infty$, we obtain the very common unconstrained non-linear least-squares problem

$$\frac{1}{2}\sum_{i} \left\|f_i\left(x_{i_1}, \ldots, x_{i_k}\right)\right\|^2.$$
Example 1.1:

$$\frac{1}{2}(10 - x)^2.$$

The minimum of this problem is at x = 10.
(1) The first step is to write a functor (the CostFunction) that evaluates $f(x) = 10 - x$:

struct CostFunctor {
  template <typename T>
  bool operator()(const T* const x, T* residual) const {
    residual[0] = T(10.0) - x[0];
    return true;
  }
};
(2) Once we have a way to compute the residual, we can use it to construct a non-linear least-squares problem and let Ceres solve it.
int main(int argc, char** argv) {
  google::InitGoogleLogging(argv[0]);

  // The variable to solve for with its initial value.
  double initial_x = 5.0;
  double x = initial_x;

  // Build the problem.
  Problem problem;

  // Set up the only cost function (also known as residual). This uses
  // auto-differentiation to obtain the derivative (jacobian).
  CostFunction* cost_function =
      new AutoDiffCostFunction<CostFunctor, 1, 1>(new CostFunctor);
  problem.AddResidualBlock(cost_function, NULL, &x);

  // Run the solver!
  Solver::Options options;
  options.linear_solver_type = ceres::DENSE_QR;
  options.minimizer_progress_to_stdout = true;
  Solver::Summary summary;
  Solve(options, &problem, &summary);

  std::cout << summary.BriefReport() << "\n";
  std::cout << "x : " << initial_x << " -> " << x << "\n";
  return 0;
}
AutoDiffCostFunction automatically differentiates the CostFunctor. Besides automatic differentiation, Ceres also supports numeric and analytic differentiation for objective functions with arbitrary parameters.
- Numeric differentiation:

struct NumericDiffCostFunctor {
  bool operator()(const double* const x, double* residual) const {
    residual[0] = 10.0 - x[0];
    return true;
  }
};

// NumericDiff
CostFunction* cost_function =
    new NumericDiffCostFunction<NumericDiffCostFunctor, ceres::CENTRAL, 1, 1>(
        new NumericDiffCostFunctor);
problem.AddResidualBlock(cost_function, NULL, &x);
- Analytic differentiation:

class QuadraticCostFunction : public ceres::SizedCostFunction<1, 1> {
 public:
  virtual ~QuadraticCostFunction() {}
  virtual bool Evaluate(double const* const* parameters,
                        double* residuals,
                        double** jacobians) const {
    const double x = parameters[0][0];
    residuals[0] = 10 - x;

    // Compute the Jacobian if asked for.
    if (jacobians != NULL && jacobians[0] != NULL) {
      jacobians[0][0] = -1;
    }
    return true;
  }
};

// AnalyticDiff
// QuadraticCostFunction derives from SizedCostFunction, which in turn derives
// from CostFunction, so this construction differs slightly from the two
// CostFunction constructions above.
CostFunction* cost_function = new QuadraticCostFunction;
problem.AddResidualBlock(cost_function, NULL, &x);
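As a quick sanity check (this is plain C++, not part of the Ceres API), the hand-written Jacobian of −1 can be compared against a central-difference approximation, the same scheme `ceres::CENTRAL` is based on:

```cpp
#include <cassert>
#include <cmath>

// Residual f(x) = 10 - x and its analytic derivative df/dx = -1.
double Residual(double x) { return 10.0 - x; }
double AnalyticJacobian(double /*x*/) { return -1.0; }

// Central-difference approximation of df/dx.
double NumericJacobian(double x, double h = 1e-6) {
  return (Residual(x + h) - Residual(x - h)) / (2.0 * h);
}
```

If the analytic and numeric values disagree, the hand-written Jacobian in Evaluate() is almost certainly wrong.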
Example 1.2: Powell's Function
For the variable $x = \left[x_1, x_2, x_3, x_4 \right]$, minimize the objective function

$$\frac{1}{2}\|F(x)\|^2$$
where

$$\begin{aligned} f_1(x) &= x_1 + 10x_2 \\ f_2(x) &= \sqrt{5}\,(x_3 - x_4)\\ f_3(x) &= (x_2 - 2x_3)^2\\ f_4(x) &= \sqrt{10}\,(x_1 - x_4)^2\\ F(x) &= \left[f_1(x),\ f_2(x),\ f_3(x),\ f_4(x) \right] \end{aligned}$$
(1) The first step is to define four functors that evaluate the individual terms of the objective. Here is F4:

struct F4 {
  template <typename T>
  bool operator()(const T* const x1, const T* const x4, T* residual) const {
    residual[0] = T(sqrt(10.0)) * (x1[0] - x4[0]) * (x1[0] - x4[0]);
    return true;
  }
};
F1, F2, and F3 can be defined in the same way.
(2) Construct the optimization problem:

double x1 = 3.0; double x2 = -1.0; double x3 = 0.0; double x4 = 1.0;

Problem problem;
// Add residual terms to the problem using the autodiff
// wrapper to get the derivatives automatically.
problem.AddResidualBlock(
    new AutoDiffCostFunction<F1, 1, 1, 1>(new F1), NULL, &x1, &x2);
problem.AddResidualBlock(
    new AutoDiffCostFunction<F2, 1, 1, 1>(new F2), NULL, &x3, &x4);
problem.AddResidualBlock(
    new AutoDiffCostFunction<F3, 1, 1, 1>(new F3), NULL, &x2, &x3);
problem.AddResidualBlock(
    new AutoDiffCostFunction<F4, 1, 1, 1>(new F4), NULL, &x1, &x4);
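To see what the solver starts from, the four residuals can be evaluated by hand at the initial point (3, −1, 0, 1). The following is only an illustrative plain-C++ sketch with no Ceres types:

```cpp
#include <cmath>

double f1(double x1, double x2) { return x1 + 10.0 * x2; }
double f2(double x3, double x4) { return std::sqrt(5.0) * (x3 - x4); }
double f3(double x2, double x3) { return (x2 - 2.0 * x3) * (x2 - 2.0 * x3); }
double f4(double x1, double x4) {
  return std::sqrt(10.0) * (x1 - x4) * (x1 - x4);
}

// 0.5 * ||F(x)||^2 at the initial point x = (3, -1, 0, 1).
double InitialCost() {
  double r1 = f1(3.0, -1.0);  // 3 - 10 = -7
  double r2 = f2(0.0, 1.0);   // -sqrt(5)
  double r3 = f3(-1.0, 0.0);  // (-1)^2 = 1
  double r4 = f4(3.0, 1.0);   // sqrt(10) * 4
  return 0.5 * (r1 * r1 + r2 * r2 + r3 * r3 + r4 * r4);
}
```

The initial cost is ½(49 + 5 + 1 + 160) = 107.5, which the solver should drive toward 0.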
Example 1.3: Curve Fitting
The function has the form

$$y = e^{mx + c}.$$
(1) Generate the data: for example, sample the function with m = 0.3, c = 0.1 and add Gaussian noise with standard deviation $\sigma = 0.2$ to produce the observations.
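The sampling step can be sketched in plain C++; the interleaved layout data[2i] = x, data[2i+1] = y matches the loop further below. The function name and interval are my own choices, not from the Ceres example:

```cpp
#include <cmath>
#include <random>
#include <vector>

// Generate n noisy samples of y = exp(m*x + c) with x in [0, 5).
// Layout: data[2*i] = x_i, data[2*i + 1] = y_i.
std::vector<double> GenerateData(int n, double m, double c, double sigma,
                                 unsigned seed = 42) {
  std::mt19937 rng(seed);
  // Guard against sigma == 0, which violates normal_distribution's precondition.
  std::normal_distribution<double> noise(0.0, sigma > 0.0 ? sigma : 1.0);
  std::vector<double> data;
  data.reserve(2 * n);
  for (int i = 0; i < n; ++i) {
    double x = 5.0 * i / n;
    data.push_back(x);
    data.push_back(std::exp(m * x + c) + (sigma > 0.0 ? noise(rng) : 0.0));
  }
  return data;
}
```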
(2) Construct the cost functor that computes the residual:

struct ExponentialResidual {
  ExponentialResidual(double x, double y)
      : x_(x), y_(y) {}

  template <typename T>
  bool operator()(const T* const m, const T* const c, T* residual) const {
    residual[0] = T(y_) - exp(m[0] * T(x_) + c[0]);
    return true;
  }

 private:
  // Observations for a sample.
  const double x_;
  const double y_;
};
(3) The n observation pairs (x, y) are stored as 2n values in data; each pair creates one cost function (residual block):

double m = 0.0;
double c = 0.0;

Problem problem;
for (int i = 0; i < kNumObservations; ++i) {
  CostFunction* cost_function =
      new AutoDiffCostFunction<ExponentialResidual, 1, 1, 1>(
          new ExponentialResidual(data[2 * i], data[2 * i + 1]));
  problem.AddResidualBlock(cost_function, NULL, &m, &c);
}
(Figure: curve-fitting result)
- Robust Curve Fitting
If the sampled data contain outliers, i.e. values that do not follow the noise model, the fitted curve will deviate from the ground truth, as the figure below shows.
To handle this, add a robust kernel (loss) function. In the program above, replace
problem.AddResidualBlock(cost_function, NULL, &m, &c);
with
problem.AddResidualBlock(cost_function, new CauchyLoss(0.5), &m, &c);
The fit obtained with the Cauchy kernel is shown below.
Example 1.4: Bundle Adjustment
Below is a more realistic bundle adjustment (BA) optimization problem.
(1) The first step is to construct the cost functor:

struct SnavelyReprojectionError {
  SnavelyReprojectionError(double observed_x, double observed_y)
      : observed_x(observed_x), observed_y(observed_y) {}

  template <typename T>
  bool operator()(const T* const camera,
                  const T* const point,
                  T* residuals) const {
    // camera[0,1,2] are the angle-axis rotation.
    T p[3];
    ceres::AngleAxisRotatePoint(camera, point, p);

    // camera[3,4,5] are the translation.
    p[0] += camera[3]; p[1] += camera[4]; p[2] += camera[5];

    // Compute the center of distortion. The sign change comes from
    // the camera model that Noah Snavely's Bundler assumes, whereby
    // the camera coordinate system has a negative z axis.
    T xp = -p[0] / p[2];
    T yp = -p[1] / p[2];

    // Apply second and fourth order radial distortion.
    const T& l1 = camera[7];
    const T& l2 = camera[8];
    T r2 = xp * xp + yp * yp;
    T distortion = T(1.0) + r2 * (l1 + l2 * r2);

    // Compute final projected point position.
    const T& focal = camera[6];
    T predicted_x = focal * distortion * xp;
    T predicted_y = focal * distortion * yp;

    // The error is the difference between the predicted and observed position.
    residuals[0] = predicted_x - T(observed_x);
    residuals[1] = predicted_y - T(observed_y);
    return true;
  }

  // Factory to hide the construction of the CostFunction object from
  // the client code.
  static ceres::CostFunction* Create(const double observed_x,
                                     const double observed_y) {
    return (new ceres::AutoDiffCostFunction<SnavelyReprojectionError, 2, 9, 3>(
        new SnavelyReprojectionError(observed_x, observed_y)));
  }

  double observed_x;
  double observed_y;
};
Here observed_x and observed_y are the observations (the projection of a 3D point onto the image plane), camera holds the camera pose plus some of the intrinsics, and point is the 3D point.
(2) Construct the BA optimization problem:

ceres::Problem problem;
for (int i = 0; i < bal_problem.num_observations(); ++i) {
  ceres::CostFunction* cost_function =
      SnavelyReprojectionError::Create(
          bal_problem.observations()[2 * i + 0],
          bal_problem.observations()[2 * i + 1]);
  problem.AddResidualBlock(cost_function,
                           NULL /* squared loss */,
                           bal_problem.mutable_camera_for_observation(i),
                           bal_problem.mutable_point_for_observation(i));
}
(3) Set the solver options and solve:

ceres::Solver::Options options;
options.linear_solver_type = ceres::DENSE_SCHUR;  // dense Schur-complement solver
options.minimizer_progress_to_stdout = true;
ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);
std::cout << summary.FullReport() << "\n";
Part 2
- CostFunction
The CostFunction is responsible for computing each term $f_i\left(x_{1}, \ldots, x_{k}\right)$ of the objective as well as the Jacobian matrices $J_i$, where

$$J_i = \frac{\partial}{\partial x_i} f(x_1, \ldots, x_k) \quad \forall i \in \{1, \ldots, k\}$$
The CostFunction class:

class CostFunction {
 public:
  virtual bool Evaluate(double const* const* parameters,
                        double* residuals,
                        double** jacobians) const = 0;
  const vector<int32>& parameter_block_sizes();
  int num_residuals() const;

 protected:
  vector<int32>* mutable_parameter_block_sizes();
  void set_num_residuals(int num_residuals);
};
CostFunction::Evaluate computes the residual vector and the Jacobian matrices:

bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians) const

Parameters:
(1) parameters is an array of size CostFunction::parameter_block_sizes_.size(); parameters[i] is an array of size parameter_block_sizes_[i] holding the i-th parameter block that the CostFunction depends on.
(2) residuals is an array of size num_residuals_.
(3) jacobians is an array of size CostFunction::parameter_block_sizes_.size(); jacobians[i] is a row-major array of size num_residuals $\times$ parameter_block_sizes_[i].
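The double-pointer layout is easy to get wrong; the following Ceres-free sketch mimics Evaluate() for a toy cost with residuals (x0 + x1, 2·y0) and two parameter blocks of sizes 2 and 1, filling each jacobians[i] in row-major order:

```cpp
#include <cassert>

// Mimics CostFunction::Evaluate for residuals (x0 + x1, 2*y0):
// two residuals, parameter blocks of size 2 (x) and 1 (y).
bool ToyEvaluate(double const* const* parameters,
                 double* residuals,
                 double** jacobians) {
  const double* x = parameters[0];
  const double* y = parameters[1];
  residuals[0] = x[0] + x[1];
  residuals[1] = 2.0 * y[0];
  if (jacobians != nullptr) {
    if (jacobians[0] != nullptr) {  // d(residuals)/dx: 2x2, row-major
      jacobians[0][0] = 1.0;  // d r0 / d x0
      jacobians[0][1] = 1.0;  // d r0 / d x1
      jacobians[0][2] = 0.0;  // d r1 / d x0
      jacobians[0][3] = 0.0;  // d r1 / d x1
    }
    if (jacobians[1] != nullptr) {  // d(residuals)/dy: 2x1
      jacobians[1][0] = 0.0;  // d r0 / d y0
      jacobians[1][1] = 2.0;  // d r1 / d y0
    }
  }
  return true;
}
```

Note that jacobians or any jacobians[i] may be null, meaning the caller does not need that derivative; the code above checks both, just like a real Evaluate() must.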
- SizedCostFunction
If the number of residuals and the sizes of the parameter blocks are known at compile time, they can be specified as template parameters; the user then only needs to implement CostFunction::Evaluate():

template<int kNumResiduals,
         int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
         int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
class SizedCostFunction : public CostFunction {
 public:
  virtual bool Evaluate(double const* const* parameters,
                        double* residuals,
                        double** jacobians) const = 0;
};
- AutoDiffCostFunction
Defining a CostFunction or SizedCostFunction by hand is tedious and error-prone, especially when it comes to computing derivatives, so Ceres provides an automatic-differentiation interface.
template <typename CostFunctor,
          int kNumResiduals,  // Number of residuals, or ceres::DYNAMIC.
          int N0,             // Number of parameters in block 0.
          int N1 = 0,         // Number of parameters in block 1.
          int N2 = 0,         // Number of parameters in block 2.
          int N3 = 0,         // Number of parameters in block 3.
          int N4 = 0,         // Number of parameters in block 4.
          int N5 = 0,         // Number of parameters in block 5.
          int N6 = 0,         // Number of parameters in block 6.
          int N7 = 0,         // Number of parameters in block 7.
          int N8 = 0,         // Number of parameters in block 8.
          int N9 = 0>         // Number of parameters in block 9.
class AutoDiffCostFunction : public
    SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
 public:
  explicit AutoDiffCostFunction(CostFunctor* functor);

  // Ignore the template parameter kNumResiduals and use
  // num_residuals instead.
  AutoDiffCostFunction(CostFunctor* functor, int num_residuals);
};
For a BA problem, the reprojection error is 2-dimensional, so kNumResiduals = 2 is fixed in advance; parameter block size N0 = 7 is the camera pose (a rotation quaternion plus translation), and N1 = 3 is the dimension of the 3D feature point.
Example 2.1:
Objective function:

$$\frac{1}{2}\left\|k - x^\top y\right\|^2$$

where x and y are 2-dimensional vectors.
(1) First step:

class MyScalarCostFunctor {
 public:
  MyScalarCostFunctor(double k) : k_(k) {}

  template <typename T>
  bool operator()(const T* const x, const T* const y, T* e) const {
    e[0] = k_ - x[0] * y[0] - x[1] * y[1];
    return true;
  }

 private:
  double k_;
};

The squaring of the error is performed implicitly by the optimization framework.
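For instance, with k = 1, x = (1, 2), y = (3, 4), the functor returns e = 1 − (1·3 + 2·4) = −10 and the framework forms the cost ½e² = 50. A Ceres-free check of that arithmetic:

```cpp
// e = k - x^T y for 2-D x and y; Ceres squares and halves this internally.
double DotResidual(double k, const double* x, const double* y) {
  return k - (x[0] * y[0] + x[1] * y[1]);
}

double HalfSquaredCost(double e) { return 0.5 * e * e; }
```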
(2) Second step:

CostFunction* cost_function
    = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
        new MyScalarCostFunctor(1.0));              ^  ^  ^
                                                    |  |  |
                        Dimension of residual ------+  |  |
                        Dimension of x ----------------+  |
                        Dimension of y -------------------+
AutoDiffCostFunction also supports a number of residuals determined at run time:

CostFunction* cost_function
    = new AutoDiffCostFunction<MyScalarCostFunctor, DYNAMIC, 2, 2>(
        new CostFunctorWithDynamicNumResiduals(1.0),   ^     ^  ^
        runtime_number_of_residuals); <----+           |     |  |
                                           |           |     |  |
          Actual number of residuals ------+           |     |  |
          Indicate dynamic number of residuals --------+     |  |
          Dimension of x ------------------------------------+  |
          Dimension of y ---------------------------------------+
Note: when first using AutoDiffCostFunction, a common mistake is to get the sizes wrong. In particular, there is a tendency to set the template parameters to (dimension of residual, number of parameters) instead of passing one dimension per parameter block. In the example above, <MyScalarCostFunctor, 1, 2> would be missing the last template argument 2.
Part 3
LossFunction
Common robust kernel (loss) functions:
- TrivialLoss: $\rho(s) = s$
- HuberLoss: $\rho(s)=\left\{\begin{array}{ll}{s} & {s \leq 1} \\ {2 \sqrt{s}-1} & {s>1}\end{array}\right.$
- SoftLOneLoss: $\rho(s) = 2 (\sqrt{1+s} - 1)$
- CauchyLoss: $\rho(s) = \log(1 + s)$
- ArctanLoss: $\rho(s) = \arctan(s)$
- TolerantLoss: $\rho(s,a,b) = b \log(1 + e^{(s - a) / b}) - b \log(1 + e^{-a / b})$
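These ρ(s) curves are easy to tabulate. Below is a plain C++ sketch of the unscaled forms listed above (note that the actual Ceres classes additionally take a scale a; e.g. CauchyLoss(a) uses a² log(1 + s/a²)):

```cpp
#include <cmath>

double TrivialRho(double s) { return s; }
double HuberRho(double s) { return s <= 1.0 ? s : 2.0 * std::sqrt(s) - 1.0; }
double SoftLOneRho(double s) { return 2.0 * (std::sqrt(1.0 + s) - 1.0); }
double CauchyRho(double s) { return std::log(1.0 + s); }
double ArctanRho(double s) { return std::atan(s); }
double TolerantRho(double s, double a, double b) {
  return b * std::log(1.0 + std::exp((s - a) / b)) -
         b * std::log(1.0 + std::exp(-a / b));
}
```

Plotting these against s makes the robustification effect visible: for large s every curve grows slower than the trivial (identity) loss, which is what suppresses the influence of outliers.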
Part 4
LocalParameterization
class LocalParameterization {
 public:
  virtual ~LocalParameterization() {}
  virtual bool Plus(const double* x,
                    const double* delta,
                    double* x_plus_delta) const = 0;
  virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
  virtual bool MultiplyByJacobian(const double* x,
                                  const int num_rows,
                                  const double* global_matrix,
                                  double* local_matrix) const;
  virtual int GlobalSize() const = 0;
  virtual int LocalSize() const = 0;
};
This part is explained in more detail in these blog posts: https://blog.csdn.net/hzwwpgmwy/article/details/86490556 and https://blog.csdn.net/sanshixionglueluelue/article/details/81037791
Simply put, during pose optimization the rotation cannot be updated by plain addition, so a custom "plus" operation must be defined; that is exactly what this parameterization provides. GlobalSize is the ambient dimension of the parameter (e.g., a camera pose is 7-dimensional), while LocalSize is the dimension the parameter actually represents, which is 6 (because the 4-dimensional quaternion encodes only a 3-DoF rotation).
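A minimal sketch of what Plus() does for the rotation part, assuming a unit quaternion stored as (w, x, y, z) and a left-multiplied update (this is an illustration of the idea, not Ceres's QuaternionParameterization implementation): the 3-vector delta is mapped through the exponential map into a small quaternion and composed with q, which is why GlobalSize = 4 while LocalSize = 3.

```cpp
#include <cmath>

// q, result: unit quaternions stored as (w, x, y, z).
// delta: 3-D axis-angle increment in the tangent space.
void QuaternionPlus(const double q[4], const double delta[3],
                    double result[4]) {
  const double norm = std::sqrt(delta[0] * delta[0] + delta[1] * delta[1] +
                                delta[2] * delta[2]);
  double dq[4] = {1.0, 0.0, 0.0, 0.0};  // exp(delta): identity if delta == 0
  if (norm > 0.0) {
    const double s = std::sin(norm / 2.0) / norm;
    dq[0] = std::cos(norm / 2.0);
    dq[1] = s * delta[0];
    dq[2] = s * delta[1];
    dq[3] = s * delta[2];
  }
  // Hamilton product: result = dq * q (update applied on the left).
  result[0] = dq[0]*q[0] - dq[1]*q[1] - dq[2]*q[2] - dq[3]*q[3];
  result[1] = dq[0]*q[1] + dq[1]*q[0] + dq[2]*q[3] - dq[3]*q[2];
  result[2] = dq[0]*q[2] - dq[1]*q[3] + dq[2]*q[0] + dq[3]*q[1];
  result[3] = dq[0]*q[3] + dq[1]*q[2] - dq[2]*q[1] + dq[3]*q[0];
}
```

Because Plus() composes unit quaternions, the result stays on the unit sphere, whereas naive component-wise addition would leave it and stop representing a rotation.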