Lecture 9: Backend, Part 1
1 Overview
1.1 Derivation of the Bayes Filter
1.1.1 Conclusion
$bel(x_t)$ is the posterior probability and $\overline{bel}(x_t)$ is the predicted probability (used as the prior); $z$ is the observation, $u$ is the odometry input, and the parameters $x$ (poses, landmarks) are estimated from the distribution of the current events.
$$bel(x_t)=p(x_t|z_{1:t},u_{1:t})=\eta\,p(z_t|x_t)\,\overline{bel}(x_t)=\eta\,p(z_t|x_t)\int p(x_t|x_{t-1},u_t)\,p(x_{t-1}|z_{1:t-1},u_{1:t-1})\,dx_{t-1}$$
Function: given the probability distribution of the state at time $t-1$ and the data $(z_t,u_t)$ at time $t$, estimate the probability distribution of the state at time $t$.
(Figure: Bayes filter workflow)
1.1.2 Detailed derivation
Step 1: Bayes' rule
$$p(x_t|z_t)=\frac{p(z_t|x_t)\,p(x_t)}{p(z_t)}=\eta\,p(z_t|x_t)\,p(x_t)$$
Step 2: introduce the conditions $z_{1:t-1},u_{1:t}$ (conditional Bayes' rule)
$$\begin{aligned} p(x_t|z_{1:t},u_{1:t})&=\frac{p(z_t|x_t,z_{1:t-1},u_{1:t})\,p(x_t|z_{1:t-1},u_{1:t})}{p(z_t|z_{1:t-1},u_{1:t})}\\ &=\eta\,p(z_t|x_t,z_{1:t-1},u_{1:t})\,p(x_t|z_{1:t-1},u_{1:t})\\ &=\eta\,p(z_t|x_t)\,p(x_t|z_{1:t-1},u_{1:t}) \end{aligned}$$
Here the current observation $z_t$ is related only to the current pose $x_t$ and is independent of the introduced conditions (the earlier observations and inputs $u$), while the current pose $x_t$ does depend on all of them:

$$p(z_t|x_t,z_{1:t-1},u_{1:t})=p(z_t|x_t)$$
Step 3: the law of total probability, $p(x_t)=\int p(x_t|x_{t-1})\,p(x_{t-1})\,dx_{t-1}$; adding the conditions gives the expression below.
$$\begin{aligned}p(x_t|z_{1:t-1},u_{1:t})&=\int p(x_t|x_{t-1},z_{1:t-1},u_{1:t})\,p(x_{t-1}|z_{1:t-1},u_{1:t})\,dx_{t-1}\\&=\int p(x_t|x_{t-1},u_t)\,p(x_{t-1}|z_{1:t-1},u_{1:t-1})\,dx_{t-1}\end{aligned}$$
Note that $p(x_t|x_{t-1})$, the relation between the poses at the two instants, is independent of the earlier observations $z_{1:t-1}$ and of $u_{1:t-1}$, but does depend on $u_t$, because it describes the motion between the two instants. In contrast, $p(x_{t-1})$ depends on all earlier observations and inputs $1:t-1$, because the current pose is the result of continual accumulation.
Step 4: let $bel(x_t)$ be the posterior distribution and $\overline{bel}(x_t)$ the predicted distribution (the prior).
$$\begin{aligned}bel(x_t)&=p(x_t|z_{1:t},u_{1:t})=\eta\,p(z_t|x_t)\,\overline{bel}(x_t)\\ \overline{bel}(x_t)&=p(x_t|z_{1:t-1},u_{1:t})=\int p(x_t|x_{t-1},u_t)\,p(x_{t-1}|z_{1:t-1},u_{1:t-1})\,dx_{t-1}\end{aligned}$$
Step 5: the final formula
$$bel(x_t)=p(x_t|z_{1:t},u_{1:t})=\eta\,p(z_t|x_t)\,\overline{bel}(x_t)=\eta\,p(z_t|x_t)\int p(x_t|x_{t-1},u_t)\,p(x_{t-1}|z_{1:t-1},u_{1:t-1})\,dx_{t-1}$$
Summary: first use the pose distribution at time $t-1$ to predict the pose distribution at time $t$, which serves as the prior; then combine it with the laser/vision observation model to obtain the distribution of the pose (and other parameters) at the current time $t$.
1.2 A Probabilistic View of State Estimation
The SLAM process can be described by a motion equation and an observation equation. In general there are far more observation equations than motion equations. Every equation's variables are affected by noise, so the poses $x$ and landmarks $y$ cannot be treated as single numbers; they follow probability distributions.
$$\begin{cases}x_k=f(x_{k-1},u_k)+w_k\\ z_{k,j}=h(y_j,x_k)+v_{k,j}\end{cases}\quad k=1,\ldots,N,\; j=1,\ldots,M.$$
In the book, $x_k$ is redefined to collect all the unknowns at time $k$, i.e. the camera pose together with the landmark variables, and the observation at time $k$ is written $z_k$; this yields the new equations below:
$$x_k\overset{\mathrm{def}}{=}\{x_k,y_1,\ldots,y_m\}.$$
$$\begin{cases}x_k=f(x_{k-1},u_k)+w_k\\ z_k=h(x_k)+v_k\end{cases}\quad k=1,\ldots,N.$$
Now consider time $k$: we want to use the data from times 0 through $k$ to estimate the current state distribution (incrementally):
$$P(x_k|x_0,u_{1:k},z_{1:k}).$$
Applying the Bayes filter derived in 1.1 gives:
$$P(x_k|x_0,u_{1:k},z_{1:k})\propto P(z_k|x_k)\,P(x_k|x_0,u_{1:k},z_{1:k-1}).$$
where (this is the book's form; it can be simplified as in the derivation of 1.1):
$$P(x_k|x_0,u_{1:k},z_{1:k-1})=\int P(x_k|x_{k-1},x_0,u_{1:k},z_{1:k-1})\,P(x_{k-1}|x_0,u_{1:k},z_{1:k-1})\,\mathrm{d}x_{k-1}.$$
If we considered states even further back, we could keep expanding this expression, but here we only care about times $k$ and $k-1$.
There are two choices for how to proceed from here:
One is to assume the Markov property; the simple first-order Markov assumption says the state at time $k$ depends only on the state at time $k-1$ and not on anything earlier. Under this assumption we obtain the filtering methods, represented by the **Extended Kalman Filter (EKF)**, which propagate a state estimate from one time to the next.
The other is to keep the dependence of the state at time $k$ on all earlier states, which leads to the optimization framework built around nonlinear optimization.
1.3 Linear Systems and the KF
1.3.1 Linear systems
By the Markov property, the current state depends only on the previous one, so the right-hand side of the book's prior (prediction) formula can be simplified:
Following the derivation in 1.1 again: first, the transition from the state at $k-1$ to the state at $k$ is independent of the earlier inputs $u$ and observations (conditioning on the state at $k-1$):
$$P(x_k|x_{k-1},x_0,u_{1:k},z_{1:k-1})=P(x_k|x_{k-1},u_k).$$
Second, the state at $k-1$ is influenced by all earlier poses and inputs, but not by $u_k$, so $u_k$ can be dropped (prior probability):

$$P(x_{k-1}|x_0,u_{1:k},z_{1:k-1})=P(x_{k-1}|x_0,u_{1:k-1},z_{1:k-1}).$$
These equations show that what we are really doing is propagating the state distribution from time $k-1$ to time $k$.
In other words, while the program runs we only need to maintain one state variable and keep iterating and updating it. If the state is assumed Gaussian, we only need to maintain its mean and covariance.
We now derive the linear Gaussian case, which is really a special kind of linear system (covered in detail in linear system theory). A linear Gaussian system is one whose motion and observation equations are linear:
$$\begin{cases}x_k=A_k x_{k-1}+u_k+w_k\\ z_k=C_k x_k+v_k\end{cases}\quad k=1,\ldots,N.$$
The noise variables $w_k$ and $v_k$ are Gaussian:

$$w_k\sim N(0,R),\quad v_k\sim N(0,Q).$$
The book drops some subscripts; below, a hat $\hat{x}_k$ denotes the posterior distribution and a check $\check{x}_k$ denotes the prior distribution.
1.3.2 The Kalman Filter (KF)
Kalman filter, step 1: use the motion equation to obtain the prior distribution of $x_k$, i.e. the prediction:
$$P(x_k|x_0,u_{1:k},z_{1:k-1})=N\!\left(A_k\hat{x}_{k-1}+u_k,\; A_k\hat{P}_{k-1}A_k^{\top}+R\right).$$
This is a linear Gaussian transform, $N(\text{mean},\text{covariance})$; write:
$$\check{x}_k=A_k\hat{x}_{k-1}+u_k,\quad \check{P}_k=A_k\hat{P}_{k-1}A_k^{\mathrm T}+R.$$
Kalman filter, step 2: use the observation equation to compute the observation a given state should produce:
$$P(z_k|x_k)=N(C_k x_k,\,Q).$$
Kalman filter, step 3: compute the posterior distribution $x_k\sim N(\hat{x}_k,\hat{P}_k)$ = prior × observation (likelihood):
$$N(\hat{x}_k,\hat{P}_k)=\eta\,N(C_k x_k,Q)\cdot N(\check{x}_k,\check{P}_k).$$
The detailed derivation of the posterior follows. Equating the exponents on both sides:
$$(x_k-\hat{x}_k)^{\mathrm T}\hat{P}_k^{-1}(x_k-\hat{x}_k)=(z_k-C_k x_k)^{\mathrm T}Q^{-1}(z_k-C_k x_k)+(x_k-\check{x}_k)^{\mathrm T}\check{P}_k^{-1}(x_k-\check{x}_k).$$
To solve for the $\hat{x}_k$ and $\hat{P}_k$ on the left, expand both sides and compare the quadratic and linear coefficients in $x_k$. For the quadratic coefficients:

$$\hat{P}_k^{-1}=C_k^{\mathrm T}Q^{-1}C_k+\check{P}_k^{-1}.$$
This gives the covariance update. To simplify the later expressions, define an intermediate variable $K$:
$$K=\hat{P}_k C_k^{\mathrm T}Q^{-1}.$$
$$I=\hat{P}_k C_k^{\mathrm T}Q^{-1}C_k+\hat{P}_k\check{P}_k^{-1}=KC_k+\hat{P}_k\check{P}_k^{-1}.$$
$$\hat{P}_k=(I-KC_k)\check{P}_k.$$
Comparing the linear coefficients:
$$-2\hat{x}_k^{\mathrm T}\hat{P}_k^{-1}x_k=-2z_k^{\mathrm T}Q^{-1}C_k x_k-2\check{x}_k^{\mathrm T}\check{P}_k^{-1}x_k.$$
$$\hat{P}_k^{-1}\hat{x}_k=C_k^{\mathrm T}Q^{-1}z_k+\check{P}_k^{-1}\check{x}_k.$$
Multiplying both sides by $\hat{P}_k$ and substituting:
$$\begin{aligned}\hat{x}_k&=\hat{P}_k C_k^{\mathrm T}Q^{-1}z_k+\hat{P}_k\check{P}_k^{-1}\check{x}_k\\ &=Kz_k+(I-KC_k)\check{x}_k=\check{x}_k+K(z_k-C_k\check{x}_k).\end{aligned}$$
1.3.3 Summary
Prediction: use the posterior at time $k-1$ to obtain the prior at time $k$.
Update: posterior at $k$ = observation × prior at $k$ (posterior = prior + Kalman gain $K$ × innovation).
1.4 Nonlinear Systems and the EKF
The motion and observation equations in SLAM are usually nonlinear; in particular the camera model in visual SLAM, which involves the camera intrinsics and poses expressed with Lie algebra, is certainly not a linear system. A Gaussian pushed through a nonlinear transform is generally no longer Gaussian, so in the nonlinear case we must take some approximation and force a non-Gaussian distribution into a Gaussian one.
We would like to extend the Kalman filter result to nonlinear systems; the result is called the Extended Kalman Filter. The usual approach is to take a first-order Taylor expansion of the motion and observation equations around some point, keep only the first-order (linear) part, and then follow the linear derivation. Let the mean and covariance at time $k-1$ be $\hat{x}_{k-1},\hat{P}_{k-1}$. At time $k$ we linearize the motion and observation equations there (a first-order Taylor expansion):
First-order Taylor approximation of the motion equation:
$$x_k\approx f(\hat{x}_{k-1},u_k)+\left.\frac{\partial f}{\partial x_{k-1}}\right|_{\hat{x}_{k-1}}(x_{k-1}-\hat{x}_{k-1})+w_k,\qquad F=\left.\frac{\partial f}{\partial x_{k-1}}\right|_{\hat{x}_{k-1}}.$$
First-order Taylor approximation of the observation equation:
$$z_k\approx h(\check{x}_k)+\left.\frac{\partial h}{\partial x_k}\right|_{\check{x}_k}(x_k-\check{x}_k)+n_k,\qquad H=\left.\frac{\partial h}{\partial x_k}\right|_{\check{x}_k}.$$
From these first-order Taylor approximations, the prior at time $k$ and the likelihood in the EKF become:
$$P(x_k|x_0,u_{1:k},z_{1:k-1})=N\!\left(f(\hat{x}_{k-1},u_k),\; F\hat{P}_{k-1}F^{\mathrm T}+R_k\right)\\ P(z_k|x_k)=N\!\left(h(\check{x}_k)+H(x_k-\check{x}_k),\; Q_k\right).$$
where the prior (mean and covariance) at time $k$ is:
$$\check{x}_k=f(\hat{x}_{k-1},u_k),\quad \check{P}_k=F\hat{P}_{k-1}F^{\mathrm T}+R_k.$$
From the same approximations, the posterior at time $k$ follows.
Define the Kalman gain $K_k$:

$$K_k=\check{P}_k H^{\mathrm T}(H\check{P}_k H^{\mathrm T}+Q_k)^{-1}.$$
$$\hat{x}_k=\check{x}_k+K_k\left(z_k-h(\check{x}_k)\right),\quad \hat{P}_k=(I-K_k H)\check{P}_k.$$
2 BA and Graph Optimization
Bundle Adjustment (BA) refers to extracting the optimal 3D model and camera parameters (intrinsics and extrinsics) from visual images.
2.1 Projection Model and the BA Cost Function
The projection model is just the observation equation $e=z-h(T,p)$: it projects a point in the world frame into the pixel frame.
The result computed by the observation equation inevitably differs from the real observation; this is unavoidable for any mathematical model:

$$e=z-h(T,p)$$
In general, optimization adjusts the overall error, not just a single landmark. As below, let $z_{ij}$ be the data produced by observing landmark $p_j$ from pose $T_i$; the accumulated cost function is:
$$\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n}\|e_{ij}\|^2=\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n}\|z_{ij}-h(T_i,p_j)\|^2$$
2.2 Solving BA
Define the optimization variables as:
$$x=[T_1,\ldots,T_m,p_1,\ldots,p_n]^{\mathrm T}$$
In nonlinear optimization we solve for an increment $\Delta x$ of the whole variable. In the expression below, $F_{ij}$ is the partial derivative of the cost function at the current state with respect to the camera pose, and $E_{ij}$ is the partial derivative with respect to the landmark.
$$\frac{1}{2}\|f(x+\Delta x)\|^2\approx\frac{1}{2}\sum_{i=1}^{m}\sum_{j=1}^{n}\left\|e_{ij}+F_{ij}\Delta\xi_i+E_{ij}\Delta p_j\right\|^2$$
Writing the pose and landmark parameters separately gives a new form of the cost function:
$$x_c=[\xi_1,\xi_2,\ldots,\xi_m]^{\mathrm T}\in\mathbb{R}^{6m},\qquad x_p=[p_1,p_2,\ldots,p_n]^{\mathrm T}\in\mathbb{R}^{3n}$$
$$\frac{1}{2}\|f(x+\Delta x)\|^2=\frac{1}{2}\|e+F\Delta x_c+E\Delta x_p\|^2$$
The Gauss-Newton normal equation:
$$H\Delta x=g$$
Since the variables are of two kinds, poses and space points, the Jacobian can be partitioned into blocks:
$$J=\begin{bmatrix}F & E\end{bmatrix}$$
$$H=J^{\mathrm T}J=\begin{bmatrix}F^{\mathrm T}F & F^{\mathrm T}E\\ E^{\mathrm T}F & E^{\mathrm T}E\end{bmatrix}$$
2.3 Sparsity and Marginalization
2.3.1 Sparsity
In SLAM the numbers of camera poses and feature points are inevitably large, so the matrix H has a very large dimension. Solving the equation directly would be expensive, but the sparse structure of H simplifies the task.
The following example introduces the sparse matrix H. Suppose a scene contains 2 camera poses (C1, C2) and 6 landmarks (P1, P2, P3, P4, P5, P6), corresponding to poses $T_i,\,i=1,2$ and landmarks $p_j,\,j=1,\ldots,6$. If camera $i$ can observe point $j$, we connect their nodes with an edge.
From the graph, the current cost function is:
$$\frac{1}{2}\left(\|e_{11}\|^2+\|e_{12}\|^2+\|e_{13}\|^2+\|e_{14}\|^2+\|e_{23}\|^2+\|e_{24}\|^2+\|e_{25}\|^2+\|e_{26}\|^2\right)$$
Take $e_{11}$ as an example: it expresses that $P_1$ is seen from $C_1$ and has nothing to do with the camera poses and landmarks at other times. Therefore its Jacobian is nonzero only at pose 1 and landmark 1, and zero everywhere else:
$$J_{11}=\frac{\partial e_{11}}{\partial x}=\left(\frac{\partial e_{11}}{\partial\xi_1},\; 0_{2\times6},\; \frac{\partial e_{11}}{\partial p_1},\; 0_{2\times3},\; 0_{2\times3},\; 0_{2\times3},\; 0_{2\times3},\; 0_{2\times3}\right)$$
Similarly for the other Jacobians:

$$J_{ij}(x)=\left(0_{2\times6},\ldots,\frac{\partial e_{ij}}{\partial T_i},\ldots,0_{2\times6},\; 0_{2\times3},\ldots,\frac{\partial e_{ij}}{\partial p_j},\ldots,0_{2\times3}\right)$$

This can be visualized as an image of the matrix structure.
In the $8\times8$ block matrix $H$, a nonzero entry marks the presence of an error edge $e_{ij}$. A nonzero off-diagonal block means there is a constraint between the two corresponding variables.
Extending to arbitrary dimensions and partitioning into blocks: $H_{11}$ and $H_{22}$ are both block-diagonal, with nonzero blocks only on the diagonal. $H_{12}$ and $H_{21}$ may be sparse or dense, depending on the observation data.
$$H=\sum_{i,j}J_{ij}^{\mathrm T}J_{ij}=\begin{bmatrix}H_{11}&H_{12}\\H_{21}&H_{22}\end{bmatrix}$$
2.3.2 Marginalization
For an H with this sparse structure, the book introduces the most common technique for solving the linear system $H\Delta x=g$: Schur elimination, also called marginalization.
To describe the matrix conveniently, partition it into four blocks, corresponding to the linear system $H\Delta x=g$:
$$\begin{bmatrix}B&E\\E^{\mathrm T}&C\end{bmatrix}\begin{bmatrix}\Delta x_c\\\Delta x_p\end{bmatrix}=\begin{bmatrix}v\\w\end{bmatrix}$$
Note that B and C are block-diagonal, each with diagonal blocks of a fixed size. Because block-diagonal matrices are easy to invert, we perform Gaussian elimination on the system (row operations that reduce the augmented matrix to triangular form, here eliminating the E block in the top-right corner):
$$\begin{gathered}\begin{bmatrix}I&-EC^{-1}\\0&I\end{bmatrix}\begin{bmatrix}B&E\\E^{\mathrm T}&C\end{bmatrix}\begin{bmatrix}\Delta x_c\\\Delta x_p\end{bmatrix}=\begin{bmatrix}I&-EC^{-1}\\0&I\end{bmatrix}\begin{bmatrix}v\\w\end{bmatrix}\\ \begin{bmatrix}B-EC^{-1}E^{\mathrm T}&0\\E^{\mathrm T}&C\end{bmatrix}\begin{bmatrix}\Delta x_c\\\Delta x_p\end{bmatrix}=\begin{bmatrix}v-EC^{-1}w\\w\end{bmatrix}\end{gathered}$$
① Take the first row and solve for the pose increment $\Delta x_c$:
$$\left[B-EC^{-1}E^{\mathrm T}\right]\Delta x_c=v-EC^{-1}w$$
② Using the computed $\Delta x_c$, obtain the landmark increment $\Delta x_p$:

$$\Delta x_p=C^{-1}(w-E^{\mathrm T}\Delta x_c)$$
③ The coefficient matrix $\left[B-EC^{-1}E^{\mathrm T}\right]$ of the equation in ① is denoted S. A nonzero off-diagonal block of S indicates that the two corresponding camera variables share observations of a common landmark; conversely, a zero block means the two cameras have no common observation.
④ From a probabilistic point of view this step is called marginalization, because we have turned the problem of solving for $(\Delta x_c,\Delta x_p)$ jointly into first fixing $\Delta x_p$ and solving for $\Delta x_c$, then solving for $\Delta x_p$. This amounts to the conditional factorization:

$$P(x_c,x_p)=P(x_c|x_p)\,P(x_p).$$
2.4 Robust Kernels
To suppress the interference of edges with very large errors, robust kernel functions are introduced.
- Huber kernel
$$H(e)=\begin{cases}\frac{1}{2}e^2 & |e|\leqslant\delta\\ \delta\left(|e|-\frac{1}{2}\delta\right) & \text{otherwise}\end{cases}$$
- Cauchy kernel
$$K(x)=\frac{1}{1+x^2}$$
- Tukey's biweight kernel
$$K(x)=(1-x^2)^2\cdot I(|x|\leq 1)$$
3 Practice
3.1 The BAL Dataset
3.1.1 Available Datasets
3.1.2 Camera Model
- BAL's camera intrinsics consist of a focal length f and distortion parameters k1, k2. f plays the role of the fx and fy we used earlier; since image pixels are essentially square, fx is very close to fy in many practical settings and using a single value is acceptable. The model also has no cx, cy, because the stored data already has them removed.
- Because BAL assumes the projection plane lies behind the camera's optical center, computing with our earlier model requires multiplying the projection by -1. Most datasets, however, still use a projection plane in front of the optical center.
We use a pinhole camera model; the parameters we estimate for each camera are a rotation R, a translation t, a focal length f, and two radial distortion parameters k1 and k2. The formula for projecting a 3D point X into a camera R,t,f,k1,k2 (the observation/projection equation) is:
P = R * X + t (conversion from world to camera coordinates)
p = -P / P.z (perspective division; X and Y are divided by the depth, converting to normalized coordinates)
p' = f * r(p) * p (conversion to pixel coordinates)
where P.z is the third (z) coordinate of P. In the last equation, r(p) is a function that computes a scaling factor to undo the radial distortion:
r(p) = 1.0 + k1 * ||p||^2 + k2 * ||p||^4. Note that ||p||^2 = p.x^2 + p.y^2.
This gives a projection in pixels, where the origin of the image is the center of the image (hence no cx, cy), the positive x-axis points right, and the positive y-axis points up (in addition, in the camera coordinate system, the positive z-axis points backwards, so the camera is looking down the negative z-axis, as in OpenGL).
3.1.3 Data Format
The data format is described on the dataset website. The experimental data in the book come from the first dataset shown there: 16 images, 22106 landmarks, and 83718 observations. Observations far outnumber landmarks because many points are co-visible across images. The site also lists the other datasets and analyzes the data layout.
Each problem is provided as a bzip2 compressed text file in the following format (every compressed text file follows the format below).
# number of image frames, landmarks, observations
<num_cameras> <num_points> <num_observations>
# camera index, point index, observed u, observed v
<camera_index_1> <point_index_1> <x_1> <y_1> # row 1
...
<camera_index_num_observations> <point_index_num_observations> <x_num_observations> <y_num_observations> # last row
<camera_1>
... # camera pose information; per the note below, each camera has 9 parameters, i.e. 9 rows per camera
<camera_num_cameras>
<point_1> # the coordinates of a point; 3 rows per point
...
<point_num_points>
Where the camera and point indices start from 0. Each camera is a set of 9 parameters: R, t, f, k1 and k2. The rotation R is specified as a Rodrigues (angle-axis) vector.
For the experimental data here, the header `16 22106 83718` gives 83718 + 16*9 + 22106*3 = 150180 rows of data.
3.2 Notes on Environment Configuration
This concerns the Sophus library. On Ubuntu 18.04 it suffices to install it from the third-party source given in the repository. On Ubuntu 20, the default Sophus requires the fmt library; to avoid version incompatibilities, use `apt search fmt` to find the packaged version for the current Ubuntu, then install it:
sudo apt install libfmt-dev
Given this dependency, the fmt library must also be linked in CMakeLists.txt (this is not needed on Ubuntu 18).
cmake_minimum_required(VERSION 2.8)
project(bundle_adjustment)
set(CMAKE_BUILD_TYPE "Release")
set(CMAKE_CXX_FLAGS "-O3 -std=c++11") # enable the C++11 standard
LIST(APPEND CMAKE_MODULE_PATH ${PROJECT_SOURCE_DIR}/cmake) # add the project's cmake modules directory to the cmake search path
Find_Package(G2O REQUIRED)
Find_Package(Eigen3 REQUIRED)
Find_Package(Ceres REQUIRED)
Find_Package(Sophus REQUIRED)
Find_Package(CSparse REQUIRED)
SET(G2O_LIBS g2o_csparse_extension g2o_stuff g2o_core cxsparse)
include_directories(${PROJECT_SOURCE_DIR} # include the project source dir, e.g. for random.h
${EIGEN3_INCLUDE_DIR}
${CSPARSE_INCLUDE_DIR})
add_library(bal_common common.cpp) # add the library target
add_executable(bundle_adjustment_g2o bundle_adjustment_g2o.cpp)
add_executable(bundle_adjustment_ceres bundle_adjustment_ceres.cpp)
target_link_libraries(bundle_adjustment_ceres ${CERES_LIBRARIES} bal_common fmt::fmt) # link the fmt library
target_link_libraries(bundle_adjustment_g2o ${G2O_LIBS} bal_common fmt::fmt)
Additionally, the following software needs to be installed:
sudo apt-get install meshlab
# The build may fail to find the Eigen library; rather than using the author's FindEigen3 module, the system one is usually found, or simply delete that file
find_package(Eigen3 3.1.0 REQUIRED NO_MODULE)
3.3 Basic Functions
1 common.h
#pragma once
#include <string>
/// Reads the BAL dataset from a file
class BALProblem {
public:
    /// load bal data from text file
    // explicit: marks the constructor as unavailable for implicit conversions
    explicit BALProblem(const std::string &filename, bool use_quaternions = false);
    ~BALProblem() {
        delete[] point_index_;
        delete[] camera_index_;
        delete[] observations_;
        delete[] parameters_;
    }
    /// save results to text file
    void WriteToFile(const std::string &filename) const;
    /// save results to ply pointcloud
    void WriteToPLYFile(const std::string &filename) const;
    void Normalize();
    void Perturb(const double rotation_sigma,
                 const double translation_sigma,
                 const double point_sigma);
    // number of camera parameters: 10 with quaternions, 9 with angle-axis
    int camera_block_size() const { return use_quaternions_ ? 10 : 9; }
    // number of landmark parameters: 3
    int point_block_size() const { return 3; }
    // number of image frames: 16
    int num_cameras() const { return num_cameras_; }
    // number of landmarks: 22106
    int num_points() const { return num_points_; }
    // number of observations: 83718
    int num_observations() const { return num_observations_; }
    // num_parameters_ = 9 * num_cameras_ + 3 * num_points_; total number of camera and landmark parameters (rows)
    int num_parameters() const { return num_parameters_; }
    // address of the point index for each observation
    const int *point_index() const { return point_index_; }
    // address of the camera index for each observation
    const int *camera_index() const { return camera_index_; }
    // u v
    const double *observations() const { return observations_; }
    // address of all the parameters: 9 * num_cameras_ + 3 * num_points_
    const double *parameters() const { return parameters_; }
    // same address: the camera parameters come first
    const double *cameras() const { return parameters_; }
    // address of the first point
    const double *points() const { return parameters_ + camera_block_size() * num_cameras_; }
    /// starting address of the camera parameters
    double *mutable_cameras() { return parameters_; }
    // starting address of the landmarks
    double *mutable_points() { return parameters_ + camera_block_size() * num_cameras_; }
    // camera start address + 9 * camera_index_[i]
    double *mutable_camera_for_observation(int i) {
        return mutable_cameras() + camera_index_[i] * camera_block_size();
    }
    // landmark start address + 3 * point_index_[i]
    double *mutable_point_for_observation(int i) {
        return mutable_points() + point_index_[i] * point_block_size();
    }
    const double *camera_for_observation(int i) const {
        return cameras() + camera_index_[i] * camera_block_size();
    }
    const double *point_for_observation(int i) const {
        return points() + point_index_[i] * point_block_size();
    }
private:
    void CameraToAngelAxisAndCenter(const double *camera,
                                    double *angle_axis,
                                    double *center) const;
    void AngleAxisAndCenterToCamera(const double *angle_axis,
                                    const double *center,
                                    double *camera) const;
    int num_cameras_;       // number of image frames: 16
    int num_points_;        // number of landmarks: 22106
    int num_observations_;  // number of observations: 83718
    int num_parameters_;    // 9 * num_cameras_ + 3 * num_points_; total parameter (row) count
    bool use_quaternions_;  // whether quaternions are used
    int *point_index_;      // point index for each observation
    int *camera_index_;     // camera index for each observation
    double *observations_;  // u v
    double *parameters_;    // all the parameters: 9 * num_cameras_ + 3 * num_points_
};
2 common.cpp
Reads and writes the data, writes the point-cloud visualization file, converts between camera parameters and (rotation, camera-center) form, and adds noise to the space points, rotation, and translation.
#include <algorithm>
#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <Eigen/Core>
#include <Eigen/Dense>
#include "common.h"
#include "rotation.h"
#include "random.h"
typedef Eigen::Map<Eigen::VectorXd> VectorRef;
typedef Eigen::Map<const Eigen::VectorXd> ConstVectorRef;
/**
 * @brief Template function for reading data from a file
 * @param[in] fptr   pointer to the file to read from
 * @param[in] format format string telling the function how to parse the data
 * @param[in] value  stores the value read from the file
 */
template<typename T>
void FscanfOrDie(FILE *fptr, const char *format, T *value) {
    int num_scanned = fscanf(fptr, format, value); // the FILE pointer advances after each successful read, so successive calls read sequentially
    if (num_scanned != 1)
        std::cerr << "Invalid UW data file. ";
}
/**
 * @brief Randomly perturbs each coordinate component (x, y, z) of a point; the magnitude is controlled by sigma
 * @param[in] sigma controls the amount of perturbation
 * @param[in] point coordinates (x, y, z)
 */
void PerturbPoint3(const double sigma, double *point) {
    for (int i = 0; i < 3; ++i)
        point[i] += RandNormal() * sigma;
}
/**
 * @brief Partially sorts the data so the median lands in its correct position, and returns the median
 * @param[in] data dynamic array container
 */
double Median(std::vector<double> *data) {
    int n = data->size();
    std::vector<double>::iterator mid_point = data->begin() + n / 2;
    std::nth_element(data->begin(), mid_point, data->end()); // nth_element partially sorts the data so the middle element is in place
    return *mid_point;
}
/**
 * @brief Class constructor; reads the file
 * @param[in] filename file name
 * @param[in] use_quaternions whether to store rotations as quaternions
 */
BALProblem::BALProblem(const std::string &filename, bool use_quaternions) {
FILE *fptr = fopen(filename.c_str(), "r");
if (fptr == NULL) {
std::cerr << "Error: unable to open file " << filename;
return;
};
    // This will die horribly on invalid files. Them's the breaks.
    FscanfOrDie(fptr, "%d", &num_cameras_); // read the number of image frames, landmarks, and observations in turn
    FscanfOrDie(fptr, "%d", &num_points_);
    FscanfOrDie(fptr, "%d", &num_observations_);
    std::cout << "Header: " << num_cameras_
              << " " << num_points_
              << " " << num_observations_;
    point_index_ = new int[num_observations_]; // one entry per observation, so the pose and landmark index arrays have this size
    camera_index_ = new int[num_observations_];
    observations_ = new double[2 * num_observations_]; // u v
    num_parameters_ = 9 * num_cameras_ + 3 * num_points_;
    parameters_ = new double[num_parameters_]; // array of all parameters
for (int i = 0; i < num_observations_; ++i) {
FscanfOrDie(fptr, "%d", camera_index_ + i);
FscanfOrDie(fptr, "%d", point_index_ + i);
for (int j = 0; j < 2; ++j) {
FscanfOrDie(fptr, "%lf", observations_ + 2 * i + j);
}
}
for (int i = 0; i < num_parameters_; ++i) {
FscanfOrDie(fptr, "%lf", parameters_ + i);
}
fclose(fptr);
use_quaternions_ = use_quaternions;
if (use_quaternions) {
// Switch the angle-axis rotations to quaternions.
// A quaternion stores the rotation with 4 numbers instead of 3, so each camera block grows from 9 to 10 parameters; the observation data is unchanged.
num_parameters_ = 10 * num_cameras_ + 3 * num_points_;
double *quaternion_parameters = new double[num_parameters_];
double *original_cursor = parameters_;
double *quaternion_cursor = quaternion_parameters;
for (int i = 0; i < num_cameras_; ++i) {
// convert angle-axis to quaternion
AngleAxisToQuaternion(original_cursor, quaternion_cursor);
// advance both cursors
quaternion_cursor += 4;
original_cursor += 3;
// the remaining 6 parameters (t, f, k1, k2) are copied unchanged
for (int j = 4; j < 10; ++j) {
*quaternion_cursor++ = *original_cursor++;
}
}
// Copy the rest of the points.
for (int i = 0; i < 3 * num_points_; ++i) {
*quaternion_cursor++ = *original_cursor++;
}
// Swap in the quaternion parameters.
delete[] parameters_;
parameters_ = quaternion_parameters;
}
}
// The trailing const marks a member function that promises not to modify any member variables.
/**
 * @brief Writes the problem back to a BAL-format text file
 * @param[in] filename output file name
 */
void BALProblem::WriteToFile(const std::string &filename) const {
FILE *fptr = fopen(filename.c_str(), "w");
if (fptr == NULL) {
std::cerr << "Error: unable to open file " << filename;
return;
}
fprintf(fptr, "%d %d %d\n", num_cameras_, num_points_, num_observations_); // header must match the three counts read by the constructor
for (int i = 0; i < num_observations_; ++i) {
fprintf(fptr, "%d %d", camera_index_[i], point_index_[i]);
for (int j = 0; j < 2; ++j) {
fprintf(fptr, " %g", observations_[2 * i + j]);
}
fprintf(fptr, "\n");
}
for (int i = 0; i < num_cameras(); ++i) {
double angleaxis[9];
if (use_quaternions_) {
// Output in angle-axis format.
QuaternionToAngleAxis(parameters_ + 10 * i, angleaxis);
memcpy(angleaxis + 3, parameters_ + 10 * i + 4, 6 * sizeof(double));
} else {
memcpy(angleaxis, parameters_ + 9 * i, 9 * sizeof(double));
}
for (int j = 0; j < 9; ++j) {
fprintf(fptr, "%.16g\n", angleaxis[j]);
}
}
const double *points = parameters_ + camera_block_size() * num_cameras_;
for (int i = 0; i < num_points(); ++i) {
const double *point = points + i * point_block_size();
for (int j = 0; j < point_block_size(); ++j) {
fprintf(fptr, "%.16g\n", point[j]);
}
}
fclose(fptr);
}
// Write the problem to a PLY file for inspection in Meshlab or CloudCompare
/**
 * @brief Writes the cameras and points to a PLY (Polygon File Format) file
 * @param[in] filename output file name
 */
void BALProblem::WriteToPLYFile(const std::string &filename) const {
std::ofstream of(filename.c_str());
of << "ply"
<< '\n' << "format ascii 1.0"
<< '\n' << "element vertex " << num_cameras_ + num_points_
<< '\n' << "property float x"
<< '\n' << "property float y"
<< '\n' << "property float z"
<< '\n' << "property uchar red"
<< '\n' << "property uchar green"
<< '\n' << "property uchar blue"
<< '\n' << "end_header" << std::endl;
// Export extrinsic data (i.e. camera centers) as green points.
double angle_axis[3];
double center[3];
for (int i = 0; i < num_cameras(); ++i) {
const double *camera = cameras() + camera_block_size() * i;
CameraToAngelAxisAndCenter(camera, angle_axis, center);
of << center[0] << ' ' << center[1] << ' ' << center[2]
<< " 0 255 0" << '\n';
}
// Export the structure (i.e. 3D points) as white points.
const double *points = parameters_ + camera_block_size() * num_cameras_; // points start right after the camera blocks
for (int i = 0; i < num_points(); ++i) {
const double *point = points + i * point_block_size();
for (int j = 0; j < point_block_size(); ++j) {
of << point[j] << ' ';
}
of << " 255 255 255\n";
}
of.close();
}
/**
 * @brief Computes the camera center Ow = -Rwc * tcw (i.e. twc)
 * @param[in]  camera     pointer to a camera parameter block
 * @param[out] angle_axis rotation as an angle-axis vector
 * @param[out] center     camera center in world coordinates
 */
void BALProblem::CameraToAngelAxisAndCenter(const double *camera,
double *angle_axis,
double *center) const {
VectorRef angle_axis_ref(angle_axis, 3);
if (use_quaternions_) {
QuaternionToAngleAxis(camera, angle_axis);
} else {
angle_axis_ref = ConstVectorRef(camera, 3); // the first three parameters are the angle-axis rotation
}
// camera center: c = -R^T * t, i.e. Ow = -Rwc * tcw (= twc)
Eigen::VectorXd inverse_rotation = -angle_axis_ref;
AngleAxisRotatePoint(inverse_rotation.data(),
camera + camera_block_size() - 6, // tcw
center);
VectorRef(center, 3) *= -1.0;
}
/**
 * @brief Inverse of CameraToAngelAxisAndCenter: recovers tcw = -R * center
 * @param[in]  angle_axis rotation as an angle-axis vector
 * @param[in]  center     camera center in world coordinates
 * @param[out] camera     pointer to a camera parameter block
 */
void BALProblem::AngleAxisAndCenterToCamera(const double *angle_axis,
const double *center,
double *camera) const {
ConstVectorRef angle_axis_ref(angle_axis, 3);
if (use_quaternions_) {
AngleAxisToQuaternion(angle_axis, camera);
} else {
VectorRef(camera, 3) = angle_axis_ref;
}
// t = -R * c
AngleAxisRotatePoint(angle_axis, center, camera + camera_block_size() - 6);
VectorRef(camera + camera_block_size() - 6, 3) *= -1.0;
}
/**
 * @brief Recenters the reconstruction at the median of the points and rescales it
 *        so their median absolute deviation becomes 100
 */
void BALProblem::Normalize() {
// Compute the marginal median of the geometry
std::vector<double> tmp(num_points_);
Eigen::Vector3d median;
double *points = mutable_points();
for (int i = 0; i < 3; ++i) {
for (int j = 0; j < num_points_; ++j) {
tmp[j] = points[3 * j + i];
}
median(i) = Median(&tmp); // per-axis median of the landmark coordinates
}
for (int i = 0; i < num_points_; ++i) {
VectorRef point(points + 3 * i, 3);
tmp[i] = (point - median).lpNorm<1>(); // L1 norm of each point after subtracting the median
}
const double median_absolute_deviation = Median(&tmp);
// Scale so that the median absolute deviation of the resulting
// reconstruction is 100
const double scale = 100.0 / median_absolute_deviation;
// X = scale * (X - median)
for (int i = 0; i < num_points_; ++i) {
VectorRef point(points + 3 * i, 3);
point = scale * (point - median);
}
double *cameras = mutable_cameras();
double angle_axis[3];
double center[3];
for (int i = 0; i < num_cameras_; ++i) {
double *camera = cameras + camera_block_size() * i;
CameraToAngelAxisAndCenter(camera, angle_axis, center);
// center = scale * (center - median)
VectorRef(center, 3) = scale * (VectorRef(center, 3) - median);
AngleAxisAndCenterToCamera(angle_axis, center, camera);
}
}
/**
 * @brief Adds random noise to the data: point coordinates, camera rotations, and camera translations
 * @param[in] rotation_sigma    rotation noise level
 * @param[in] translation_sigma translation noise level
 * @param[in] point_sigma       point coordinate noise level
 */
void BALProblem::Perturb(const double rotation_sigma,
const double translation_sigma,
const double point_sigma) {
assert(point_sigma >= 0.0);
assert(rotation_sigma >= 0.0);
assert(translation_sigma >= 0.0);
double *points = mutable_points();
if (point_sigma > 0) {
for (int i = 0; i < num_points_; ++i) {
PerturbPoint3(point_sigma, points + 3 * i);
}
}
for (int i = 0; i < num_cameras_; ++i) {
double *camera = mutable_cameras() + camera_block_size() * i;
double angle_axis[3];
double center[3];
// Perturb in the rotation of the camera in the angle-axis
// representation
CameraToAngelAxisAndCenter(camera, angle_axis, center);
if (rotation_sigma > 0.0) {
PerturbPoint3(rotation_sigma, angle_axis);
}
AngleAxisAndCenterToCamera(angle_axis, center, camera);
if (translation_sigma > 0.0)
PerturbPoint3(translation_sigma, camera + camera_block_size() - 6);
}
}
3 rotation.h: rotation transforms
#ifndef ROTATION_H
#define ROTATION_H
#include <algorithm>
#include <cmath>
#include <limits>
//
// 1. Vector math helpers: dot product a.b and cross product a x b
template<typename T>
inline T DotProduct(const T x[3], const T y[3]) {
return (x[0] * y[0] + x[1] * y[1] + x[2] * y[2]); // dot product
}
template<typename T> // cross product
inline void CrossProduct(const T x[3], const T y[3], T result[3]) {
result[0] = x[1] * y[2] - x[2] * y[1];
result[1] = x[2] * y[0] - x[0] * y[2];
result[2] = x[0] * y[1] - x[1] * y[0];
}
//
// 2. Hand-written angle-axis to quaternion conversion.
// An angle-axis vector theta*n encodes a rotation by angle theta about a unit axis n.
// Input: angle_axis = [a0, a1, a2]; output: quaternion.
// The axis is the unit vector n = [a0, a1, a2] / theta,
// where theta = sqrt(a0*a0 + a1*a1 + a2*a2), the norm of the angle-axis vector, is the rotation angle.
template<typename T>
inline void AngleAxisToQuaternion(const T *angle_axis, T *quaternion) {
const T &a0 = angle_axis[0];
const T &a1 = angle_axis[1];
const T &a2 = angle_axis[2];
const T theta_squared = a0 * a0 + a1 * a1 + a2 * a2; // θ^2
// Guard against theta == 0, since theta appears as a denominator below.
// std::numeric_limits<double>::epsilon() is the gap between 1.0 and the next representable double, used here as a "practically zero" threshold
if (theta_squared > T(std::numeric_limits<double>::epsilon())) {
const T theta = sqrt(theta_squared);
const T half_theta = theta * T(0.5);
const T k = sin(half_theta) / theta;
quaternion[0] = cos(half_theta); // equation (3.44) in the book
quaternion[1] = a0 * k; // a0/theta is the axis component n_x, so q_x = n_x * sin(theta/2)
quaternion[2] = a1 * k;
quaternion[3] = a2 * k;
}
// theta ~= 0: use the limit sin(theta/2)/theta -> 1/2
else { // in case theta_squared is zero
const T k(0.5); // direct initialization, i.e. k = 0.5
quaternion[0] = T(1.0);
quaternion[1] = a0 * k;
quaternion[2] = a1 * k;
quaternion[3] = a2 * k;
}
}
// 3. Quaternion to angle-axis
template<typename T>
inline void QuaternionToAngleAxis(const T *quaternion, T *angle_axis) {
const T &q1 = quaternion[1];
const T &q2 = quaternion[2];
const T &q3 = quaternion[3];
const T sin_squared_theta = q1 * q1 + q2 * q2 + q3 * q3;
/* Quaternion [q0,q1,q2,q3] to angle-axis [a0,a1,a2]:
[a0,a1,a2] = [q1,q2,q3] * theta / sin(theta/2),
since [q1,q2,q3] = n * sin(theta/2); taking its norm gives
sin(theta/2) = sqrt(q1*q1 + q2*q2 + q3*q3). */
// For quaternions representing non-zero rotation, the conversion
// is numerically stable
if (sin_squared_theta > T(std::numeric_limits<double>::epsilon())) {
const T sin_theta = sqrt(sin_squared_theta);
const T &cos_theta = quaternion[0];
// If cos_theta is negative, theta is greater than pi/2, which
// means that angle for the angle_axis vector which is 2 * theta
// would be greater than pi...
// recover the true theta
const T two_theta = T(2.0) * ((cos_theta < 0.0)
? atan2(-sin_theta, -cos_theta)
: atan2(sin_theta, cos_theta));
const T k = two_theta / sin_theta; // k = theta / sin(theta/2)
// atan2(y, x) returns the angle, in radians, between the ray to (x, y) and the positive x axis; unlike atan(y/x) it resolves the quadrant, and its range is (-pi, pi]
angle_axis[0] = q1 * k;
angle_axis[1] = q2 * k;
angle_axis[2] = q3 * k;
} else {
// For zero rotation, sqrt() will produce NaN in derivative since
// the argument is zero. By approximating with a Taylor series,
// and truncating at one term, the value and first derivatives will be
// computed correctly when Jets are used..
const T k(2.0);
angle_axis[0] = q1 * k;
angle_axis[1] = q2 * k;
angle_axis[2] = q3 * k;
}
}
// 4. Rotate a point pt = [x, y, z] by an angle-axis rotation
template<typename T>
// Parameters: angle-axis rotation, input point, rotated result
inline void AngleAxisRotatePoint(const T angle_axis[3], const T pt[3], T result[3]) {
// Used above e.g. to compute Ow = -Rwc * tcw: the angle-axis acts as the rotation R (via the Rodrigues formula) applied directly to tcw.
// The dot product of angle_axis with itself is its squared norm, i.e. theta^2
const T theta2 = DotProduct(angle_axis, angle_axis);
// theta != 0
if (theta2 > T(std::numeric_limits<double>::epsilon())) {
// Away from zero, use the Rodrigues formula:
// R*p = cos(theta)*p + (1 - cos(theta))*n*n^T*p + sin(theta)*(n x p)
// result = pt costheta +
// (w x pt) * sintheta +
// w (w . pt) (1 - costheta)
//
// We want to be careful to only evaluate the square root if the
// norm of the angle_axis vector is greater than zero. Otherwise
// we get a division by zero.
//
const T theta = sqrt(theta2); // θ
const T costheta = cos(theta); // cos(θ)
const T sintheta = sin(theta); // sin(θ)
const T theta_inverse = 1.0 / theta; // 1/θ
// unit rotation axis w = angle_axis / theta
const T w[3] = {angle_axis[0] * theta_inverse,
angle_axis[1] * theta_inverse,
angle_axis[2] * theta_inverse};
// Explicitly inlined evaluation of the cross product for
// performance reasons.
/*const T w_cross_pt[3] = { w[1] * pt[2] - w[2] * pt[1],
w[2] * pt[0] - w[0] * pt[2],
w[0] * pt[1] - w[1] * pt[0] };*/
// The Rodrigues formula is evaluated term by term below
T w_cross_pt[3];
CrossProduct(w, pt, w_cross_pt); // cross product w x pt (w is the unit rotation axis)
const T tmp = DotProduct(w, pt) * (T(1.0) - costheta);
// (w[0] * pt[0] + w[1] * pt[1] + w[2] * pt[2]) * (T(1.0) - costheta);
result[0] = pt[0] * costheta + w_cross_pt[0] * sintheta + w[0] * tmp;
result[1] = pt[1] * costheta + w_cross_pt[1] * sintheta + w[1] * tmp;
result[2] = pt[2] * costheta + w_cross_pt[2] * sintheta + w[2] * tmp;
} else {
// Near zero, the first order Taylor approximation of the rotation
// matrix R corresponding to a vector w and angle w is
//
// R = I + hat(w) * sin(theta)
//
// But sintheta ~ theta and theta * w = angle_axis, which gives us
//
// R = I + hat(w)
//
// and actually performing multiplication with the point pt, gives us
// R * pt = pt + w x pt.
//
// Switching to the Taylor expansion near zero provides meaningful
// derivatives when evaluated using Jets.
//
// Explicitly inlined evaluation of the cross product for
// performance reasons.
/*const T w_cross_pt[3] = { angle_axis[1] * pt[2] - angle_axis[2] * pt[1],
angle_axis[2] * pt[0] - angle_axis[0] * pt[2],
angle_axis[0] * pt[1] - angle_axis[1] * pt[0] };*/
T w_cross_pt[3];
CrossProduct(angle_axis, pt, w_cross_pt);
result[0] = pt[0] + w_cross_pt[0];
result[1] = pt[1] + w_cross_pt[1];
result[2] = pt[2] + w_cross_pt[2];
}
}
#endif // rotation.h
4 random.h: random number helpers
#ifndef RAND_H
#define RAND_H
#include <math.h>
#include <stdlib.h>
inline double RandDouble()
{
double r = static_cast<double>(rand()); // rand() takes no arguments and returns an int in [0, RAND_MAX]; static_cast<double> converts it to double
return r / RAND_MAX; // RAND_MAX is implementation-defined (2147483647 with glibc)
}
/*
Marsaglia polar method: produces samples from the standard normal N(0,1).
(x1, x2) is a uniform random point inside the unit disc x*x + y*y <= 1;
after the transform below, x1 * w is a standard normal sample.
*/
inline double RandNormal()
{
double x1, x2, w;
do{
x1 = 2.0 * RandDouble() - 1.0; // uniform in [-1, 1] via RandDouble
x2 = 2.0 * RandDouble() - 1.0;
w = x1 * x1 + x2 * x2;
}while( w >= 1.0 || w == 0.0); // reject points outside the unit disc (or exactly at the origin)
w = sqrt((-2.0 * log(w))/w); // here 0 < w < 1
return x1 * w;
}
#endif // random.h
5 SnavelyReprojectionError.h
#ifndef SnavelyReprojection_H
#define SnavelyReprojection_H
#include <iostream>
#include "ceres/ceres.h"
#include "rotation.h"
class SnavelyReprojectionError {
public:
SnavelyReprojectionError(double observation_x, double observation_y) : observed_x(observation_x),
observed_y(observation_y) {}
template<typename T>
bool operator()(const T *const camera,
const T *const point,
T *residuals) const {
// camera[0,1,2] are the angle-axis rotation
T predictions[2];
CamProjectionWithDistortion(camera, point, predictions);
// residuals in pixel coordinates (u, v)
residuals[0] = predictions[0] - T(observed_x);
residuals[1] = predictions[1] - T(observed_y);
return true;
}
// camera : 9-dim array
// [0-2] : angle-axis rotation
// [3-5] : translation
// [6-8] : camera parameters: [6] focal length, [7-8] second and fourth order radial distortion coefficients
// point : 3D location.
// predictions : 2D projection with origin at the center of the image plane (pixel coordinates).
template<typename T>
static inline bool CamProjectionWithDistortion(const T *camera, const T *point, T *predictions) {
// Rodrigues' formula
T p[3];
AngleAxisRotatePoint(camera, point, p);
// camera[3,4,5] are the translation
p[0] += camera[3]; // x
p[1] += camera[4]; // y
p[2] += camera[5]; // z
// Compute the center of distortion
T xp = -p[0] / p[2]; // -x/z
T yp = -p[1] / p[2]; // -y/z
// Apply second and fourth order radial distortion
const T &l1 = camera[7]; // k1
const T &l2 = camera[8]; // k2
T r2 = xp * xp + yp * yp;
T distortion = T(1.0) + r2 * (l1 + l2 * r2);
const T &focal = camera[6];
predictions[0] = focal * distortion * xp; // cx and cy are omitted (BAL convention: origin at the image center)
predictions[1] = focal * distortion * yp; // u = f*distortion*x, v = f*distortion*y
return true;
}
/* Create an auto-differentiated cost function via Ceres's AutoDiffCostFunction, whose template arguments are:
SnavelyReprojectionError: the error model type.
2: residual dimension (errors in x and y).
9 and 3: dimensions of the parameter blocks (camera and 3D point). */
static ceres::CostFunction *Create(const double observed_x, const double observed_y) {
return (new ceres::AutoDiffCostFunction<SnavelyReprojectionError, 2, 9, 3>(
new SnavelyReprojectionError(observed_x, observed_y)));
}
private:
double observed_x;
double observed_y;
};
#endif // SnavelyReprojection.h
3.4 bundle_adjustment_ceres.cpp
#include <iostream>
#include <ceres/ceres.h>
#include "common.h"
#include "SnavelyReprojectionError.h"
using namespace std;
void SolveBA(BALProblem &bal_problem);
int main(int argc, char **argv) {
if (argc != 2) {
cout << "usage: bundle_adjustment_ceres bal_data.txt" << endl;
return 1;
}
BALProblem bal_problem(argv[1]);
bal_problem.Normalize();
bal_problem.Perturb(0.1, 0.5, 0.5);
bal_problem.WriteToPLYFile("initial.ply");
SolveBA(bal_problem);
bal_problem.WriteToPLYFile("final.ply");
return 0;
}
void SolveBA(BALProblem &bal_problem) {
const int point_block_size = bal_problem.point_block_size();
const int camera_block_size = bal_problem.camera_block_size();
double *points = bal_problem.mutable_points();
double *cameras = bal_problem.mutable_cameras();
// Observations is 2 * num_observations long array observations
// [u_1, u_2, ... u_n], where each u_i is two dimensional, the x
// and y position of the observation.
const double *observations = bal_problem.observations();
ceres::Problem problem;
for (int i = 0; i < bal_problem.num_observations(); ++i) {
ceres::CostFunction *cost_function;
// Each Residual block takes a point and a camera as input
// and outputs a 2 dimensional Residual
cost_function = SnavelyReprojectionError::Create(observations[2 * i + 0], observations[2 * i + 1]);
// If enabled use Huber's loss function.
ceres::LossFunction *loss_function = new ceres::HuberLoss(1.0);
// Each observation corresponds to a pair of a camera and a point
// which are identified by camera_index()[i] and point_index()[i]
// respectively.
double *camera = cameras + camera_block_size * bal_problem.camera_index()[i];
double *point = points + point_block_size * bal_problem.point_index()[i];
problem.AddResidualBlock(cost_function, loss_function, camera, point);
}
// show some information here ...
std::cout << "bal problem file loaded..." << std::endl;
std::cout << "bal problem have " << bal_problem.num_cameras() << " cameras and "
<< bal_problem.num_points() << " points. " << std::endl;
std::cout << "Forming " << bal_problem.num_observations() << " observations. " << std::endl;
std::cout << "Solving ceres BA ... " << endl;
ceres::Solver::Options options;
options.linear_solver_type = ceres::LinearSolverType::SPARSE_SCHUR;
options.minimizer_progress_to_stdout = true;
ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);
std::cout << summary.FullReport() << "\n";
}
3.5 bundle_adjustment_g2o.cpp
#include <g2o/core/base_vertex.h>
#include <g2o/core/base_binary_edge.h>
#include <g2o/core/block_solver.h>
#include <g2o/core/optimization_algorithm_levenberg.h>
#include <g2o/solvers/csparse/linear_solver_csparse.h>
#include <g2o/core/robust_kernel_impl.h>
#include <iostream>
#include "common.h"
#include "sophus/se3.hpp"
using namespace Sophus;
using namespace Eigen;
using namespace std;
// A struct representing camera pose plus intrinsics
struct PoseAndIntrinsics {
PoseAndIntrinsics() {}
/// set from given data address
explicit PoseAndIntrinsics(double *data_addr) {
rotation = SO3d::exp(Vector3d(data_addr[0], data_addr[1], data_addr[2]));
translation = Vector3d(data_addr[3], data_addr[4], data_addr[5]);
focal = data_addr[6];
k1 = data_addr[7];
k2 = data_addr[8];
}
/// write the estimate back to the given data address
void set_to(double *data_addr) {
auto r = rotation.log();
for (int i = 0; i < 3; ++i) data_addr[i] = r[i];
for (int i = 0; i < 3; ++i) data_addr[i + 3] = translation[i];
data_addr[6] = focal;
data_addr[7] = k1;
data_addr[8] = k2;
}
SO3d rotation; // rotation stored as an SO(3) Lie group element
Vector3d translation = Vector3d::Zero();
double focal = 0;
double k1 = 0, k2 = 0;
};
/// Vertex for pose plus intrinsics, 9 dims: the first three are so(3), then t, f, k1, k2
// g2o::BaseVertex<9, PoseAndIntrinsics>: vertex dimension and estimate type
class VertexPoseAndIntrinsics : public g2o::BaseVertex<9, PoseAndIntrinsics> {
public:
EIGEN_MAKE_ALIGNED_OPERATOR_NEW;
VertexPoseAndIntrinsics() {}
virtual void setToOriginImpl() override {
_estimate = PoseAndIntrinsics();
}
virtual void oplusImpl(const double *update) override {
_estimate.rotation = SO3d::exp(Vector3d(update[0], update[1], update[2])) * _estimate.rotation;
_estimate.translation += Vector3d(update[3], update[4], update[5]);
_estimate.focal += update[6];
_estimate.k1 += update[7];
_estimate.k2 += update[8];
}
/// project a point using the current estimate
Vector2d project(const Vector3d &point) {
Vector3d pc = _estimate.rotation * point + _estimate.translation; // to the camera frame
pc = -pc / pc[2]; // to the normalized plane (BAL cameras look down -z, hence the sign)
double r2 = pc.squaredNorm() - 1; // xp^2 + yp^2: subtract the z component, which is 1 after normalization
double distortion = 1.0 + r2 * (_estimate.k1 + _estimate.k2 * r2);
return Vector2d(_estimate.focal * distortion * pc[0],
_estimate.focal * distortion * pc[1]); // predicted measurement
}
virtual bool read(istream &in) { return false; } // not used
virtual bool write(ostream &out) const { return false; } // not used
};
// Landmark vertex: dimension 3, estimate type Vector3d
class VertexPoint : public g2o::BaseVertex<3, Vector3d> {
public:
EIGEN_MAKE_ALIGNED_OPERATOR_NEW;
VertexPoint() {}
virtual void setToOriginImpl() override {
_estimate = Vector3d(0, 0, 0);
}
// apply the additive update
virtual void oplusImpl(const double *update) override {
_estimate += Vector3d(update[0], update[1], update[2]);
}
virtual bool read(istream &in) { return false; } // not used
virtual bool write(ostream &out) const { return false; } // not used
};
// Projection edge: <error dimension 2, measurement type Vector2d, the two vertex types>
class EdgeProjection :
public g2o::BaseBinaryEdge<2, Vector2d, VertexPoseAndIntrinsics, VertexPoint> {
public:
EIGEN_MAKE_ALIGNED_OPERATOR_NEW;
virtual void computeError() override {
auto v0 = (VertexPoseAndIntrinsics *) _vertices[0];
auto v1 = (VertexPoint *) _vertices[1];
auto proj = v0->project(v1->estimate());
_error = proj - _measurement; // prediction minus measurement
}
// use numeric derivatives
virtual bool read(istream &in) { return false; } // not used
virtual bool write(ostream &out) const { return false; } // not used
};
void SolveBA(BALProblem &bal_problem);
int main(int argc, char **argv) {
if (argc != 2) {
cout << "usage: bundle_adjustment_g2o bal_data.txt" << endl;
return 1;
}
BALProblem bal_problem(argv[1]);
bal_problem.Normalize();
bal_problem.Perturb(0.1, 0.5, 0.5);
bal_problem.WriteToPLYFile("initial.ply");
SolveBA(bal_problem);
bal_problem.WriteToPLYFile("final.ply");
return 0;
}
void SolveBA(BALProblem &bal_problem) {
const int point_block_size = bal_problem.point_block_size();
const int camera_block_size = bal_problem.camera_block_size();
double *points = bal_problem.mutable_points();
double *cameras = bal_problem.mutable_cameras();
// pose dimension 9, landmark is 3
typedef g2o::BlockSolver<g2o::BlockSolverTraits<9, 3>> BlockSolverType;
typedef g2o::LinearSolverCSparse<BlockSolverType::PoseMatrixType> LinearSolverType;
// use LM
auto solver = new g2o::OptimizationAlgorithmLevenberg(
g2o::make_unique<BlockSolverType>(g2o::make_unique<LinearSolverType>()));
g2o::SparseOptimizer optimizer;
optimizer.setAlgorithm(solver);
optimizer.setVerbose(true);
/// build g2o problem
const double *observations = bal_problem.observations();
// vertex
vector<VertexPoseAndIntrinsics *> vertex_pose_intrinsics;
vector<VertexPoint *> vertex_points;
for (int i = 0; i < bal_problem.num_cameras(); ++i) {
VertexPoseAndIntrinsics *v = new VertexPoseAndIntrinsics();
double *camera = cameras + camera_block_size * i;
v->setId(i);
v->setEstimate(PoseAndIntrinsics(camera)); // initialize from the stored camera parameters
optimizer.addVertex(v);
vertex_pose_intrinsics.push_back(v);
}
for (int i = 0; i < bal_problem.num_points(); ++i) {
VertexPoint *v = new VertexPoint();
double *point = points + point_block_size * i;
v->setId(i + bal_problem.num_cameras());
v->setEstimate(Vector3d(point[0], point[1], point[2]));
// In g2o's BA, the vertices to be marginalized (via the Schur complement) must be marked manually
v->setMarginalized(true);
optimizer.addVertex(v);
vertex_points.push_back(v);
}
// edge
for (int i = 0; i < bal_problem.num_observations(); ++i) {
EdgeProjection *edge = new EdgeProjection;
edge->setVertex(0, vertex_pose_intrinsics[bal_problem.camera_index()[i]]);
edge->setVertex(1, vertex_points[bal_problem.point_index()[i]]);
edge->setMeasurement(Vector2d(observations[2 * i + 0], observations[2 * i + 1]));
edge->setInformation(Matrix2d::Identity());
edge->setRobustKernel(new g2o::RobustKernelHuber());
optimizer.addEdge(edge);
}
optimizer.initializeOptimization();
optimizer.optimize(40);
// set to bal problem
for (int i = 0; i < bal_problem.num_cameras(); ++i) {
double *camera = cameras + camera_block_size * i;
auto vertex = vertex_pose_intrinsics[i];
auto estimate = vertex->estimate();
estimate.set_to(camera);
}
for (int i = 0; i < bal_problem.num_points(); ++i) {
double *point = points + point_block_size * i;
auto vertex = vertex_points[i];
for (int k = 0; k < 3; ++k) point[k] = vertex->estimate()[k];
}
}