第七讲:视觉里程计2
文章目录
2 2D-2D对极几何
作者应该是基于ORB-SLAM写的后续内容。
2.1 对极约束
在第一帧坐标系下,三维点P的坐标如下:
P = [ X , Y , Z ] T \boldsymbol{P}=[X, Y, Z]^{\mathrm{T}} P=[X,Y,Z]T
K是相机内参矩阵,把相机系得点转换为像素坐标系点。这里我们以第一帧坐标系为参考坐标系(世界系),把第1帧投影到第2帧当中。
s 1 p 1 = K P , s 2 p 2 = K ( R 21 P + t 21 ) s_1 \boldsymbol{p}_1=\boldsymbol{K} \boldsymbol{P}, \quad s_2 \boldsymbol{p}_2=\boldsymbol{K}(\boldsymbol{R}_{21} \boldsymbol{P}+\boldsymbol{t}_{21}) s1p1=KP,s2p2=K(R21P+t21)
在相机模型中,s一般都是指三维点P得深度Z,也就是一个常数,
s
p
和
p
sp和p
sp和p形成了投影关系,它们在尺度意义下是相等的!
s
p
≃
p
sp\simeq p
sp≃p
把投影方程改为尺度意义下等式:
p
1
≃
K
P
,
p
2
≃
K
(
R
P
+
t
)
p_1\simeq KP,\quad p_2\simeq K\left(RP+t\right)
p1≃KP,p2≃K(RP+t)
p是像素点坐标,乘以
K
−
1
K^{-1}
K−1后即为相机坐标系下得归一化坐标
x
x
x,
x
1
=
K
−
1
p
1
,
x
2
=
K
−
1
p
2
.
\begin{aligned} &\boldsymbol{x}_1=\boldsymbol{K}^{-1} \boldsymbol{p}_1, \quad \boldsymbol{x}_2=\boldsymbol{K}^{-1} \boldsymbol{p}_2 . \end{aligned}
x1=K−1p1,x2=K−1p2.
把第1帧下得归一化坐标投影到第二帧下,得到尺度意义下等式(或利用上面得式子带入到尺度意义下投影方程)
x 2 ≃ R x 1 + t \boldsymbol{x}_2 \simeq \boldsymbol{R} \boldsymbol{x}_1+\boldsymbol{t} x2≃Rx1+t
注意向量t反对称矩阵也即向量与向量得叉积,对上述式子左乘向量t,可得:
t ∧ x 2 ≃ t ∧ R x 1 \begin{aligned} &\boldsymbol{t}^{\wedge} \boldsymbol{x}_2 \simeq \boldsymbol{t}^{\wedge} \boldsymbol{R} \boldsymbol{x}_1 \end{aligned} t∧x2≃t∧Rx1
向量
t
∧
x
{t}^{\wedge}x
t∧x即叉积,得到得结果是一个垂直于t和x得向量,所以在乘以x,那么向量得点击必然为0.
x
2
T
t
∧
x
2
≃
x
2
T
t
∧
R
x
1
=
0
\boldsymbol{x}_2^{\mathrm{T}} \boldsymbol{t}^{\wedge} \boldsymbol{x}_2 \simeq \boldsymbol{x}_2^{\mathrm{T}} \boldsymbol{t}^{\wedge} \boldsymbol{R} \boldsymbol{x}_1 =0
x2Tt∧x2≃x2Tt∧Rx1=0
此时,得到得下面这两个式子就是对极约束,分别由本质矩阵E和基础矩阵F组成
x
2
T
t
∧
R
x
1
=
0
p
2
T
K
−
T
t
∧
R
K
−
1
p
1
=
0
\begin{aligned} &\boldsymbol{x}_2^{\mathrm{T}} \boldsymbol{t}^{\wedge} \boldsymbol{R} \boldsymbol{x}_1=0\\ &\boldsymbol{p}_2^{\mathrm{T}} \boldsymbol{K}^{-\mathrm{T}} \boldsymbol{t}^{\wedge} \boldsymbol{R} \boldsymbol{K}^{-1} \boldsymbol{p}_1=0 \end{aligned}
x2Tt∧Rx1=0p2TK−Tt∧RK−1p1=0
也能够推断出,本质矩阵E和基础矩阵F之间相差一个内参矩阵K
E
=
t
∧
R
,
F
=
K
−
T
E
K
−
1
,
x
2
T
E
x
1
=
p
2
T
F
p
1
=
0
\boldsymbol{E}=\boldsymbol{t}^{\wedge} \boldsymbol{R}, \quad \boldsymbol{F}=\boldsymbol{K}^{-\mathrm{T}} \boldsymbol{E} \boldsymbol{K}^{-1}, \quad \boldsymbol{x}_2^{\mathrm{T}} \boldsymbol{E} \boldsymbol{x}_1=\boldsymbol{p}_2^{\mathrm{T}} \boldsymbol{F} \boldsymbol{p}_1=0
E=t∧R,F=K−TEK−1,x2TEx1=p2TFp1=0
2.2 本质矩阵
2.2.1 本质矩阵性质
E = t ∧ R \boldsymbol{E}=\boldsymbol{t}^{\wedge} \boldsymbol{R} E=t∧R是一个3*3矩阵
- 对极约束得条件是 0 0 0,所以该等式乘以任意得常数后仍然为 0 0 0,也即对E乘以任意的常数后,对极约束依然成立。即 E E E在不同尺度下是等价的。
- 本质矩阵 E E E经过 S V D SVD SVD分解后的奇异值必然是 [ σ , σ , 0 ] T [\sigma,\sigma,0]^\mathrm{T} [σ,σ,0]T.
- 本质矩阵实际只有5个自由端(旋转3+偏移3-尺度1)
2.2.2 八点法求解本质矩阵
考虑一对匹配点,它们的归一化坐标为 x 1 = [ u 1 , v 1 , 1 ] T , x 2 = [ u 2 , v 2 , 1 ] T \boldsymbol{x}_1=[u_1,v_1,1]^\mathrm{T},\boldsymbol{x}_2=[u_2,v_2,1]^\mathrm{T} x1=[u1,v1,1]T,x2=[u2,v2,1]T。根据对极约束,有
( u 2 , v 2 , 1 ) ( e 1 e 2 e 3 e 4 e 5 e 6 e 7 e 8 e 9 ) ( u 1 v 1 1 ) = 0. \left(u_2, v_2, 1\right)\left(\begin{array}{ccc} e_1 & e_2 & e_3 \\ e_4 & e_5 & e_6 \\ e_7 & e_8 & e_9 \end{array}\right)\left(\begin{array}{c} u_1 \\ v_1 \\ 1 \end{array}\right)=0 . (u2,v2,1) e1e4e7e2e5e8e3e6e9 u1v11 =0.
把本质矩阵E写成一个向量的形式
e
=
[
e
1
,
e
2
,
e
3
,
e
4
,
e
5
,
e
6
,
e
7
,
e
8
,
e
9
]
T
\boldsymbol{e}=[e_1,e_2,e_3,e_4,e_5,e_6,e_7,e_8,e_9]^\mathrm{T}
e=[e1,e2,e3,e4,e5,e6,e7,e8,e9]T
则对极约束可以写成于e有关的线性形式:
[
u
2
u
1
,
u
2
v
1
,
u
2
,
v
2
u
1
,
v
2
v
1
,
v
2
,
u
1
,
v
1
,
1
]
⋅
e
=
0
[u_2u_1,u_2v_1,u_2,v_2u_1,v_2v_1,v_2,u_1,v_1,1]\cdot\boldsymbol{e}=0
[u2u1,u2v1,u2,v2u1,v2v1,v2,u1,v1,1]⋅e=0
就是说,一对点可以得到一个上面的式子,我们利用八对点就可得到下面的方程。这个方程可以看成9*9的矩阵,最后一行为0,就是这个矩阵不满秩,那么它一定有非0解且唯一!
(
u
2
1
u
1
1
u
2
1
v
1
1
u
2
1
v
2
1
u
1
1
v
2
1
v
1
1
v
2
1
u
1
1
v
1
1
1
u
2
2
u
1
2
u
2
2
v
1
2
u
2
2
v
2
2
u
1
2
v
2
2
v
1
2
v
2
2
u
1
2
v
1
2
1
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
u
2
8
u
1
8
u
2
8
v
1
8
u
2
8
v
2
8
u
1
8
v
2
8
v
1
8
v
2
8
u
1
8
v
1
8
1
)
(
e
1
e
2
e
3
e
4
e
5
e
6
e
7
e
8
e
9
)
=
0.
\left(\begin{array}{ccccccccc} u_2^1 u_1^1 & u_2^1 v_1^1 & u_2^1 & v_2^1 u_1^1 & v_2^1 v_1^1 & v_2^1 & u_1^1 & v_1^1 & 1 \\ u_2^2 u_1^2 & u_2^2 v_1^2 & u_2^2 & v_2^2 u_1^2 & v_2^2 v_1^2 & v_2^2 & u_1^2 & v_1^2 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \\ u_2^8 u_1^8 & u_2^8 v_1^8 & u_2^8 & v_2^8 u_1^8 & v_2^8 v_1^8 & v_2^8 & u_1^8 & v_1^8 & 1 \end{array}\right)\left(\begin{array}{l} e_1 \\ e_2 \\ e_3 \\ e_4 \\ e_5 \\ e_6 \\ e_7 \\ e_8 \\ e_9 \end{array}\right)=0 .
u21u11u22u12⋮u28u18u21v11u22v12⋮u28v18u21u22⋮u28v21u11v22u12⋮v28u18v21v11v22v12⋮v28v18v21v22⋮v28u11u12⋮u18v11v12⋮v18111
e1e2e3e4e5e6e7e8e9
=0.
2.2.3 怎么利用E来求解位姿
我们最关系的是利用
E
E
E来估计相机运动
R
,
t
R,t
R,t,通过奇异值分解计算
E
=
U
Σ
V
T
E=U\Sigma V^{\mathrm{T}}
E=UΣVT
参考计算机视觉中的多视图几何200页结论
t
1
∧
=
U
R
Z
(
π
2
)
Σ
U
T
,
R
1
=
U
R
Z
T
(
π
2
)
V
T
t
2
∧
=
U
R
Z
(
−
π
2
)
Σ
U
T
,
R
2
=
U
R
Z
T
(
−
π
2
)
V
T
\begin{aligned} \boldsymbol{t}_1^{\wedge} & =\boldsymbol{U} \boldsymbol{R}_Z\left(\frac{\pi}{2}\right) \boldsymbol{\Sigma} \boldsymbol{U}^{\mathrm{T}}, \quad \boldsymbol{R}_1=\boldsymbol{U} \boldsymbol{R}_Z^{\mathrm{T}}\left(\frac{\boldsymbol{\pi}}{2}\right) \boldsymbol{V}^{\mathrm{T}} \\ \boldsymbol{t}_2^{\wedge} & =\boldsymbol{U} \boldsymbol{R}_Z\left(-\frac{\pi}{2}\right) \boldsymbol{\Sigma} \boldsymbol{U}^{\mathrm{T}}, \quad \boldsymbol{R}_2=\boldsymbol{U} \boldsymbol{R}_Z^{\mathrm{T}}\left(-\frac{\pi}{2}\right) \boldsymbol{V}^{\mathrm{T}} \end{aligned}
t1∧t2∧=URZ(2π)ΣUT,R1=URZT(2π)VT=URZ(−2π)ΣUT,R2=URZT(−2π)VT
2.3 单应矩阵H
单应矩阵H是描述两个特征点对之间的关系
2.3.1 使用场景
在初始化中,以下两种情况更适合使用单应矩阵进行初始化位姿估计:
1 相机看到的场景是一个平面
2 连续两帧间没发生平移,只发生旋转
2.3.2 数学推导
特征匹配点分别为 p 1 = [ u 1 , v 1 , 1 ] T , p 2 = [ u 2 , v 2 , 1 ] T p1=[u_1,v_1,1]^T,p2=[u_2,v_2,1]^T p1=[u1,v1,1]T,p2=[u2,v2,1]T,三维空间点(地图点) P P P落在平面上,平面满足方程如下。之前忘记了平面方程写法,所以不知道 d d d是什么。 d d d可以理解为空间平面内的任意一点与法向量的乘积。
n
T
P
+
d
=
0.
n^\mathrm{T}P+d=0.
nTP+d=0.
上式中,
n
n
n是平面的法向量,
R
R
R 和
t
t
t 表示旋转和平移向量
−
n
T
P
d
=
1
-\frac{n^\mathrm{T}P}{d}=1
−dnTP=1
注意
P
P
P是相机坐标系下的点,从参考帧(1)投影到当前帧(2),
p
1
,
p
2
p_1,p_2
p1,p2是像素坐标系下的点
p
1
≃
K
P
,
p
2
≃
K
(
R
21
P
+
t
)
p_1\simeq KP,\quad p_2\simeq K\left(R_{21}P+t\right)
p1≃KP,p2≃K(R21P+t)
p 2 ≃ K ( R 21 P + t ) ≃ K ( R 21 P + t ⋅ ( − n T P d ) ) ≃ K ( R 21 − t n T d ) P ≃ K ( R 21 − t n T d ) K − 1 p 1 . \begin{aligned} p_2& \simeq\boldsymbol{K}(\boldsymbol{R_{21}P}+\boldsymbol{t}) \\ &\simeq K\left(R_{21}P+\boldsymbol{t}\cdot(-\frac{\boldsymbol{n}^\mathrm{T}\boldsymbol{P}}d)\right) \\ &\simeq K\left(R_{21}-\frac{t\boldsymbol{n^\mathrm{T}}}d\right)\boldsymbol{P} \\ &\simeq K\left(R_{21}-\frac{tn^\mathrm{T}}d\right)K^{-1}\boldsymbol{p}_1. \end{aligned} p2≃K(R21P+t)≃K(R21P+t⋅(−dnTP))≃K(R21−dtnT)P≃K(R21−dtnT)K−1p1.
用单应矩阵
H
21
H_{21}
H21描述特征点对之间的变换关系如下,我们也能看出单应矩阵
H
H
H的尺度不变性(尺度不变性—等式两边乘以非0的数,等式恒成立)。这里就是利用单应矩阵简化了上面的等式,注意并不是绝对相等。后面要利用单应矩阵求解位姿
T
T
T也是通过这里的等式!
p
2
=
H
21
p
1
p_{2}=H_{21}p_{1}
p2=H21p1
写为矩阵形式:
[
u
2
v
2
1
]
=
[
h
1
h
2
h
3
h
4
h
5
h
6
h
7
h
8
h
9
]
[
u
1
v
1
1
]
\begin{bmatrix}u_2\\v_2\\1\end{bmatrix}=\begin{bmatrix}h_1&h_2&h_3\\h_4&h_5&h_6\\h_7&h_8&h_9\end{bmatrix}\begin{bmatrix}u_1\\v_1\\1\end{bmatrix}
u2v21
=
h1h4h7h2h5h8h3h6h9
u1v11
上面等式坐标叉乘一个
p
2
p_2
p2
p
2
×
H
21
p
1
=
0
p_{2}\times H_{21}p_{1}=0
p2×H21p1=0
我们知道,叉积可以表示为反对称矩阵,所以就可以得到下面的矩阵乘法
[
0
−
1
v
2
1
0
−
u
2
−
v
2
u
2
0
]
[
h
1
h
2
h
3
h
4
h
5
h
6
h
7
h
8
h
9
]
[
u
1
v
1
1
]
=
0
\begin{bmatrix}0&-1&v_2\\1&0&-u_2\\-v_2&u_2&0\end{bmatrix}\begin{bmatrix}h_1&h_2&h_3\\h_4&h_5&h_6\\h_7&h_8&h_9\end{bmatrix}\begin{bmatrix}u_1\\v_1\\1\end{bmatrix}=0
01−v2−10u2v2−u20
h1h4h7h2h5h8h3h6h9
u1v11
=0
展开整理:
v
2
=
(
h
4
u
1
+
h
5
v
1
+
h
6
)
/
(
h
7
u
1
+
h
8
v
1
+
h
9
)
u
2
=
(
h
1
u
1
+
h
2
v
1
+
h
3
)
/
(
h
7
u
1
+
h
8
v
1
+
h
9
)
\begin{aligned}v_2&=(h_4u_1+h_5v_1+h_6)/(h_7u_1+h_8v_1+h_9)\\u_2&=(h_1u_1+h_2v_1+h_3)/(h_7u_1+h_8v_1+h_9)\end{aligned}
v2u2=(h4u1+h5v1+h6)/(h7u1+h8v1+h9)=(h1u1+h2v1+h3)/(h7u1+h8v1+h9)
写为齐次方程
−
(
h
4
u
1
+
h
5
v
1
+
h
6
)
+
(
h
7
u
1
v
2
+
h
8
v
1
v
2
+
h
9
v
2
)
=
0
h
1
u
1
+
h
2
v
1
+
h
3
−
(
h
7
u
1
u
2
+
h
8
v
1
u
2
+
h
9
u
2
)
=
0
\begin{aligned}-(h_4u_1+h_5v_1+h_6)+(h_7u_1v_2+h_8v_1v_2+h_9v_2)&=0\\h_1u_1+h_2v_1+h_3-(h_7u_1u_2+h_8v_1u_2+h_9u_2)&=0\end{aligned}
−(h4u1+h5v1+h6)+(h7u1v2+h8v1v2+h9v2)h1u1+h2v1+h3−(h7u1u2+h8v1u2+h9u2)=0=0
转化为矩阵形式:
[
0
0
0
−
u
1
−
v
1
−
1
u
1
v
2
v
1
v
2
v
2
u
1
v
1
1
0
0
0
−
u
1
u
2
−
v
1
u
2
−
u
2
]
[
h
1
h
2
h
3
h
4
h
5
h
6
h
7
h
8
h
9
]
=
0
\begin{bmatrix}0&0&0&-u_1&-v_1&-1&u_1v_2&v_1v_2&v_2\\u_1&v_1&1&0&0&0&-u_1u_2&-v_1u_2&-u_2\end{bmatrix}\begin{bmatrix}h_1\\h_2\\h_3\\h_4\\h_5\\h_6\\h_7\\h_8\\h_9\end{bmatrix}=0
[0u10v101−u10−v10−10u1v2−u1u2v1v2−v1u2v2−u2]
h1h2h3h4h5h6h7h8h9
=0
A h = 0 Ah=0 Ah=0
实际上,上面等式并不是严格为0的。所以我们可以等价定义(求最接近于0,最小值,利用最小二乘法)一个代价函数:
f
(
h
)
=
1
2
(
A
h
)
⊤
(
A
h
)
=
1
2
h
⊤
A
⊤
A
h
f(\boldsymbol{h})=\frac{1}{2}(\boldsymbol{A}\boldsymbol{h})^\top(\boldsymbol{A}\boldsymbol{h})=\frac{1}{2}\boldsymbol{h}^\top\boldsymbol{A}^\top\boldsymbol{A}\boldsymbol{h}
f(h)=21(Ah)⊤(Ah)=21h⊤A⊤Ah
等式**(矩阵)求导(分母布局):
d
f
(
h
)
d
h
=
0
A
⊤
A
h
=
0
\begin{gathered}\frac{\mathrm{d}f(\boldsymbol{h})}{\mathrm{d}\boldsymbol{h}}=0\\A^{\top}A\boldsymbol{h}=0\end{gathered}
dhdf(h)=0A⊤Ah=0
这个时候把
A
T
A
A^TA
ATA看作一个矩阵,再乘以一个列向量,我们可以联想到矩阵的特征值**,类似下面形式的
A
T
A
h
=
λ
h
A^TAh=\lambda h
ATAh=λh
也就是说,我们可以转化问题为求
A
T
A
A^TA
ATA的最小特征值对应的特征向量h
为了简化问题,对矩阵
A
A
A进行SVD分解(注意
U
、
V
U、V
U、V都是正交矩阵,
D
D
D是对角矩阵)
A
⊤
A
=
(
U
D
V
⊤
)
⊤
(
U
D
V
⊤
)
=
V
D
⊤
U
⊤
U
D
V
⊤
=
V
D
⊤
D
V
⊤
A^\top A=(UDV^\top)^\top(UDV^\top)=VD^\top U^\top UDV^\top=VD^\top DV^\top
A⊤A=(UDV⊤)⊤(UDV⊤)=VD⊤U⊤UDV⊤=VD⊤DV⊤
其中
D
T
D
D^TD
DTD后还是对角阵,一个矩阵可以由下面的特征分解表示,
X
X
X是特征向量组成矩阵。
B
=
X
Σ
X
−
1
B= XΣX^{-1}
B=XΣX−1
所以这里我们可以得到
A
T
A
A^TA
ATA的特征向量矩阵
V
V
V。然后前面已经推出,找到其最小特征值对应的特征向量就是最优解!而且,SVD分解的对角矩阵
D
D
D的奇异值是从左向右从大到小排列的,所以V的最后一列(对应最小奇异值)就是我们要找的最优解!
这里用到了很多的矩阵理论中知识,请参考这里,具体的代码实现可参考ORB-SLAM.
2.3.3 代码实现
/**
* @brief 用DLT方法求解单应矩阵H————DLT即直接线性变换
* 这里最少用4对点就能够求出来,不过这里为了统一还是使用了8对点求最小二乘解
*
* @param[in] vP1 参考帧中归一化后的特征点
* @param[in] vP2 当前帧中归一化后的特征点
* @return cv::Mat 计算的单应矩阵H
*/
cv::Mat Initializer::ComputeH21(
const vector<cv::Point2f> &vP1, //归一化后的点, in reference frame
const vector<cv::Point2f> &vP2) //归一化后的点, in current frame
{
//获取参与计算的特征点的数目
const int N = vP1.size();
// 构造用于计算的矩阵 A
cv::Mat A(2*N, //行,注意每一个点的数据对应两行
9, //列
CV_32F); //float数据类型
// 构造矩阵A,一个点对组成矩阵两行,一共8对,所以是16行,A_16*9
for(int i=0; i<N; i++)
{
//获取特征点对的像素坐标
const float u1 = vP1[i].x; // 参考帧点
const float v1 = vP1[i].y;
const float u2 = vP2[i].x; // 当前帧
const float v2 = vP2[i].y;
//生成这个点的第一行
A.at<float>(2*i,0) = 0.0;
A.at<float>(2*i,1) = 0.0;
A.at<float>(2*i,2) = 0.0;
A.at<float>(2*i,3) = -u1;
A.at<float>(2*i,4) = -v1;
A.at<float>(2*i,5) = -1;
A.at<float>(2*i,6) = v2*u1;
A.at<float>(2*i,7) = v2*v1;
A.at<float>(2*i,8) = v2;
//生成这个点的第二行
A.at<float>(2*i+1,0) = u1;
A.at<float>(2*i+1,1) = v1;
A.at<float>(2*i+1,2) = 1;
A.at<float>(2*i+1,3) = 0.0;
A.at<float>(2*i+1,4) = 0.0;
A.at<float>(2*i+1,5) = 0.0;
A.at<float>(2*i+1,6) = -u2*u1;
A.at<float>(2*i+1,7) = -u2*v1;
A.at<float>(2*i+1,8) = -u2;
}
// 定义输出变量,u是左边的正交矩阵U, w为奇异矩阵,vt中的t表示是右正交矩阵V的转置
cv::Mat u,w,vt;
//使用opencv提供的进行奇异值分解的函数
cv::SVDecomp(A, //输入,待进行奇异值分解的矩阵
w, //输出,奇异值矩阵
u, //输出,矩阵U
vt, //输出,矩阵V^T
cv::SVD::MODIFY_A | //输入,MODIFY_A是指允许计算函数可以修改待分解的矩阵,官方文档上说这样可以加快计算速度、节省内存
cv::SVD::FULL_UV); //FULL_UV=把U和VT补充成单位正交方阵
// 返回最小奇异值所对应的右奇异向量
// 注意前面说的是右奇异值矩阵的最后一列,但是在这里因为是vt,转置后了,所以是行;由于A有9列数据,故最后一列的下标为8(0-8)
return vt.row(8).reshape(0, // 转换后的通道数,这里设置为0表示是与前面相同
3); // 原来是1行9列 现在行设置为3 就变成3*3矩阵,即我们要求解的单应矩阵!!
// 真的很妙
}
- 对极约束
#include <iostream>
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/calib3d/calib3d.hpp>
// 特征匹配函数
void find_feature_matches(const cv::Mat &img_1, const cv::Mat &img_2,
std::vector<cv::KeyPoint>&keypoint_1,std::vector<cv::KeyPoint> &keypoint_2,
std::vector<cv::DMatch> &matches);
// 2D2D对极约束
void pose_estimation_2d2d(std::vector<cv::KeyPoint> keypoint_1, std::vector<cv::KeyPoint> keypoint_2,
std::vector<cv::DMatch> matches, cv::Mat &R, cv::Mat &t);
//
cv::Point2d pixel2cam(const cv::Point2d &p, const cv::Mat &K);
int main(int argc, char** argv)
{
if(argc != 3)
{
std::cout << "usage: pose_estimation_2d2d img1 img2" << std::endl;
return 1;
}
// 初始化图像矩阵
cv::Mat img_1 = cv::imread(argv[1], CV_LOAD_IMAGE_COLOR);
cv::Mat img_2 = cv::imread(argv[2], CV_LOAD_IMAGE_COLOR);
assert(img_1.data && img_2.data && "Can not load images!");
// 初始化关键点向量、匹配向量
std::vector<cv::KeyPoint> keypoint_1,keypoint_2;
std::vector<cv::DMatch> matches;
find_feature_matches(img_1, img_2, keypoint_1, keypoint_2, matches);
std::cout << "一共找到了" << matches.size() << "组匹配点" << std::endl;
// 初始化位姿变量
cv::Mat R, t;
pose_estimation_2d2d(keypoint_1, keypoint_2, matches, R, t);
// 验证本质矩阵正确与否
cv::Mat t_x = (cv::Mat_<double>(3, 3) << 0,-t.at<double>(2, 0), t.at<double>(1, 0), t.at<double>(2,0), 0, -t.at<double>(0, 0),
-t.at<double>(1, 0), t.at<double>(0, 0), 0 );
std::cout << "t^R=" << std::endl << t_x * R << std::endl;
//-- 验证对极约束
Mat K = (Mat_<double>(3, 3) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1);
for (DMatch m: matches) {
Point2d pt1 = pixel2cam(keypoints_1[m.queryIdx].pt, K);
Mat y1 = (Mat_<double>(3, 1) << pt1.x, pt1.y, 1);
Point2d pt2 = pixel2cam(keypoints_2[m.trainIdx].pt, K);
Mat y2 = (Mat_<double>(3, 1) << pt2.x, pt2.y, 1);
Mat d = y2.t() * t_x * R * y1;
cout << "epipolar constraint = " << d << endl;
}
return 0;
}
// 一、特征匹配函数
void find_feature_matches(const cv::Mat &img_1, const cv::Mat &img_2,
std::vector<cv::KeyPoint> &keypoints_1,
std::vector<cv::KeyPoint> &keypoints_2,
std::vector<cv::DMatch> &matches)
{
//初始化描述子矩阵 关键点指针、描述子指针、匹配指针
cv::Mat descriptors_1,descriptors_2;
cv::Ptr<cv::FeatureDetector> detector = cv::ORB::create();
cv::Ptr<cv::DescriptorExtractor> descriptor = cv::ORB::create();
cv::Ptr<cv::DescriptorMatcher> matcher = cv::DescriptorMatcher::create("BruteForce-Hamming");
//1.检测 Oriented FAST 角点位置
detector->detect(img_1, keypoints_1);
detector->detect(img_2, keypoints_2);
//2.:根据角点位置计算 BRIEF 描述子
descriptor->compute(img_1, keypoints_1, descriptors_1);
descriptor->compute(img_2, keypoints_2, descriptors_2);
//3.对两幅图像中的BRIEF描述子进行匹配,使用 Hamming 距离
std::vector<cv::DMatch> match;
//BFMatcher matcher ( NORM_HAMMING );
matcher->match(descriptors_1, descriptors_2, match);
//4.匹配点对筛选
double min_dist = 10000, max_dist = 0;
// 选出上面匹配结果中,两帧图像匹配成功的特征点的的汉明距离 的最大与最小值(在所有特征点中)
for(int i = 0;i < descriptors_1.rows; i++)
{
double dist = match[i].distance;
if(dist < min_dist) {min_dist = dist;}
if(dist > max_dist) {max_dist = dist;}
}
printf("-- Max dist : %f \n", max_dist);
printf("-- Min dist : %f \n", min_dist);
// 排除匹配失败的特征点对
for(int i = 0; i < descriptors_1.rows; i ++)
{
if(match[i].distance <= std::max(2*min_dist, 30.0))
{
matches.push_back(match[i]);
}
}
}
// 二、像素坐标转相机归一化坐标
cv::Point2d pixel2cam(const cv::Point2d &p, const cv::Mat &K)
{
// 因为函数是Point类型,所以返回值也必须是Point2d,也就是坐标索引。
return cv::Point2d
(
// 根据公式推,X = (u -cx)/fx fx即内参矩阵K(0,0) fy即K(1,1)
(p.x - K.at<double>(0, 2)) / K.at<double>(0, 0),
(p.y - K.at<double>(1, 2)) / K.at<double>(1, 1)
);
}
void pose_estimation_2d2d(std::vector<cv::KeyPoint> keypoints_1,
std::vector<cv::KeyPoint> keypoints_2,
std::vector<cv::DMatch> matches,
cv::Mat &R, cv::Mat &t)
{
// 注意这里 Mat_ <<重载,输入参数 相机内参K
cv::Mat k = (cv::Mat_<double>(3, 3) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1);
// 把匹配点转换为vector<Point2f>的形式
std::vector<cv::Point2f> points1;
std::vector<cv::Point2f> points2;
for(int i =0; i < (int) matches.size(); i++)
{
points1.push_back(keypoints_1[matches[i].queryIdx].pt);
points2.push_back(keypoints_2[matches[i].trainIdx].pt);
}
//-- 计算基础矩阵
cv::Mat fundamental_matrix; // 只需要关键点向量
fundamental_matrix = findFundamentalMat(points1, points2, CV_FM_8POINT);
std::cout << "fundamental_matrix is " << std::endl << fundamental_matrix << std::endl;
//-- 计算本质矩阵
cv::Point2d principal_point(325.1, 249.7); //相机光心, TUM dataset标定值
double focal_length = 521; //相机焦距, TUM dataset标定值
cv::Mat essential_matrix;
essential_matrix = findEssentialMat(points1, points2, focal_length, principal_point);
std::cout << "essential_matrix is " << std::endl << essential_matrix << std::endl;
//-- 计算单应矩阵
//-- 但是本例中场景不是平面,单应矩阵意义不大
cv::Mat homography_matrix;
homography_matrix = findHomography(points1, points2, cv::RANSAC, 3);
std::cout << "homography_matrix is " << std::endl << homography_matrix << std::endl;
//-- 从本质矩阵中恢复旋转和平移信息.
// 此函数仅在Opencv3中提供
recoverPose(essential_matrix, points1, points2, R, t, focal_length, principal_point);
std::cout << "R is " << std::endl << R << std::endl;
std::cout << "t is " << std::endl << t << std::endl;
}
cv::Mat k = (cv::Mat_<double>(3, 3) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1);
3 3D - 2D PnP
PnP是求解3D到2D点对运动的方法。即当知道n个3D空间点及其投影位置时,如何估计相机的位姿。如果两张图像中的一张特征点的3D位置已知,那么最少只需3个点对(以及至少一个额外点验证结果)就可以估计相机运动。
在双目和RGB-D相机中,特征点的3D位置可以由三角化或者RGB-D相机的深度图确定,所以可以直接使用PnP来估计相机运动;在单目相机中,深度未知,所以必须初始化!
3.1 直接线性变化DLT
空间点P的齐次坐标 P = ( X , Y , Z , 1 ) T \boldsymbol{P}=(X,Y,Z,1)^\mathrm{T} P=(X,Y,Z,1)T,投影点的像素(齐次)坐标 x 1 = ( u 1 , v 1 , 1 ) T x_1=(u_1,v_1,1)^\mathrm{T} x1=(u1,v1,1)T
s ( u 1 v 1 1 ) = ( t 1 t 2 t 3 t 4 t 5 t 6 t 7 t 8 t 9 t 10 t 11 t 12 ) ( X Y Z 1 ) . s\left(\begin{array}{l} u_1 \\ v_1 \\ 1 \end{array}\right)=\left(\begin{array}{cccc} t_1 & t_2 & t_3 & t_4 \\ t_5 & t_6 & t_7 & t_8 \\ t_9 & t_{10} & t_{11} & t_{12} \end{array}\right)\left(\begin{array}{c} X \\ Y \\ Z \\ 1 \end{array}\right) . s u1v11 = t1t5t9t2t6t10t3t7t11t4t8t12 XYZ1 .
按行展开,消除s
u 1 = t 1 X + t 2 Y + t 3 Z + t 4 t 9 X + t 10 Y + t 11 Z + t 12 , v 1 = t 5 X + t 6 Y + t 7 Z + t 8 t 9 X + t 10 Y + t 11 Z + t 12 u_1=\frac{t_1 X+t_2 Y+t_3 Z+t_4}{t_9 X+t_{10} Y+t_{11} Z+t_{12}}, \quad v_1=\frac{t_5 X+t_6 Y+t_7 Z+t_8}{t_9 X+t_{10} Y+t_{11} Z+t_{12}} u1=t9X+t10Y+t11Z+t12t1X+t2Y+t3Z+t4,v1=t9X+t10Y+t11Z+t12t5X+t6Y+t7Z+t8
简化变换矩阵T
t 1 = ( t 1 , t 2 , t 3 , t 4 ) T , t 2 = ( t 5 , t 6 , t 7 , t 8 ) T , t 3 = ( t 9 , t 10 , t 11 , t 12 ) T , \boldsymbol{t}_1=\left(t_1, t_2, t_3, t_4\right)^{\mathrm{T}}, \boldsymbol{t}_2=\left(t_5, t_6, t_7, t_8\right)^{\mathrm{T}}, \boldsymbol{t}_3=\left(t_9, t_{10}, t_{11}, t_{12}\right)^{\mathrm{T}} \text {, } t1=(t1,t2,t3,t4)T,t2=(t5,t6,t7,t8)T,t3=(t9,t10,t11,t12)T,
可以得到
t 1 T P − t 3 T P u 1 = 0 , t 2 T P − t 3 T P v 1 = 0. \begin{aligned} & \boldsymbol{t}_1^{\mathrm{T}} \boldsymbol{P}-\boldsymbol{t}_3^{\mathrm{T}} \boldsymbol{P} u_1=0, \\ & \boldsymbol{t}_2^{\mathrm{T}} \boldsymbol{P}-\boldsymbol{t}_3^{\mathrm{T}} \boldsymbol{P} v_1=0 . \end{aligned} t1TP−t3TPu1=0,t2TP−t3TPv1=0.
一个路标点可以得到两个方程,如果有 N N N个点,那么就有 2 N 2N 2N个方程
( P 1 T 0 − u 1 P 1 T 0 P 1 T − v 1 P 1 T ⋮ ⋮ ⋮ P N T 0 − u N P N T 0 P N T − v N P N T ) ( t 1 t 2 t 3 ) = 0 \left(\begin{array}{ccc} \boldsymbol{P}_1^{\mathrm{T}} & 0 & -u_1 \boldsymbol{P}_1^{\mathrm{T}} \\ 0 & \boldsymbol{P}_1^{\mathrm{T}} & -v_1 \boldsymbol{P}_1^{\mathrm{T}} \\ \vdots & \vdots & \vdots \\ \boldsymbol{P}_N^{\mathrm{T}} & 0 & -u_N \boldsymbol{P}_N^{\mathrm{T}} \\ 0 & \boldsymbol{P}_N^{\mathrm{T}} & -v_N \boldsymbol{P}_N^{\mathrm{T}} \end{array}\right)\left(\begin{array}{l} \boldsymbol{t}_1 \\ \boldsymbol{t}_2 \\ \boldsymbol{t}_3 \end{array}\right)=0 P1T0⋮PNT00P1T⋮0PNT−u1P1T−v1P1T⋮−uNPNT−vNPNT t1t2t3 =0
t
t
t一共有12维,因此最少通过
6
6
6对匹配,点即可实现矩阵
T
T
T的线性求解,这种方法称为
D
L
T
DLT
DLT。当匹配点大于6对时,也可以使用SVD等方法对超定方程求最小二乘解。
在
D
L
T
DLT
DLT求解中,我们直接将T矩阵看成了12个未知数,忽略了它们之间的联系。因为旋转矩阵
R
∈
S
O
(
3
)
R∈SO(3)
R∈SO(3),用
D
L
T
DLT
DLT求出的解不一定满足该约束,它是一个一般矩阵。对于旋转矩阵
R
R
R,我们必须针对
D
L
T
DLT
DLT估计的
T
T
T左边3×3的矩阵块,寻找一个最好的旋转矩阵对它进行近似。可以这样来计算
R
←
(
R
R
T
)
−
1
2
R
.
\boldsymbol{R} \leftarrow\left(\boldsymbol{R R}^{\mathrm{T}}\right)^{-\frac{1}{2}} \boldsymbol{R} .
R←(RRT)−21R.
3.2 P3P
输入:3对3D-2D匹配点
输出:3D-3D点
本质:PnP问题转换为了ICP问题
缺点:易受噪声影响,容易误匹配,不能利用更多的点,ORB-SLAM采用了EPnP改进
△ O a b − △ O A B , △ O b c − △ O B C , △ O a c − △ O A C . \triangle O a b-\triangle O A B, \quad \triangle O b c-\triangle O B C, \quad \triangle O a c-\triangle O A C . △Oab−△OAB,△Obc−△OBC,△Oac−△OAC.
3.3 最小化重投影误差求解PnP
3.3.1 数学模型
根据投影模型,空间点
P
i
=
[
X
i
,
Y
i
,
Z
i
]
T
P_i=[X_i,Y_i,Z_i]^{\mathrm{T}}
Pi=[Xi,Yi,Zi]T投影到像素坐标系下
u
i
=
[
u
i
,
v
i
]
T
u_i=[u_i,v_i]^\mathrm{T}
ui=[ui,vi]T为:
s
i
[
u
i
v
i
1
]
=
K
T
[
X
i
Y
i
Z
i
1
]
.
s_i\left[\begin{array}{c} u_i \\ v_i \\ 1 \end{array}\right]=\boldsymbol{K} \boldsymbol{T}\left[\begin{array}{c} X_i \\ Y_i \\ Z_i \\ 1 \end{array}\right] .
si
uivi1
=KT
XiYiZi1
.
写为矩阵形式:
s i u i = K T P i s_i \boldsymbol{u}_i=\boldsymbol{K} \boldsymbol{T} \boldsymbol{P}_i siui=KTPi
相机的观测方程必然会存在误差,而且一般来讲,我们优化是对整体的优化,所以把误差求和,构建最小二乘问题,找到最优的位姿使误差最小:
T
∗
=
arg
min
T
1
2
∑
i
=
1
n
∥
u
i
−
1
s
i
K
T
P
i
∥
2
2
\boldsymbol{T}^*=\arg \min _{\boldsymbol{T}} \frac{1}{2} \sum_{i=1}^n\left\|\boldsymbol{u}_i-\frac{1}{s_i} \boldsymbol{K} \boldsymbol{T} \boldsymbol{P}_i\right\|_2^2
T∗=argTmin21i=1∑n
ui−si1KTPi
22
下面的图可以很形象的描述重投影问题。通过特征匹配已经知道
p
1
p_1
p1和
p
2
p_2
p2是同一个空间点P的投影,但是两帧间准确位姿关系未知,通过不断调整位姿T
来使得
p
^
2
\hat p_2
p^2和
p
2
p_2
p2的距离缩小。当然,优化是对整体来讲的,要兼顾每个点的误差来考虑。
3.3.2 模型推导—误差项对优化量的导数
e ( x + Δ x ) ≈ e ( x ) + J T Δ x \boldsymbol{e}(\boldsymbol{x}+\Delta x) \approx \boldsymbol{e}(\boldsymbol{x})+J^{\mathrm{T}} \Delta \boldsymbol{x} e(x+Δx)≈e(x)+JTΔx
这里的 J T J^T JT的形式是值得讨论的,甚至可以说是关键所在。我们固然可以使用数值导数,但如果能够推导出解析形式,则优先考虑解析导数。
① 位姿李代数的雅可比
相机的投影模型如下:
P
′
=
(
T
P
)
1
:
3
=
[
X
′
,
Y
′
,
Z
′
]
T
\boldsymbol{P}^{\prime}=(\boldsymbol{T} \boldsymbol{P})_{1: 3}=\left[X^{\prime}, Y^{\prime}, Z^{\prime}\right]^{\mathrm{T}}
P′=(TP)1:3=[X′,Y′,Z′]T
s u = K P ′ . [ s u s v s ] = [ f x 0 c x 0 f y c y 0 0 1 ] [ X ′ Y ′ Z ′ ] . \begin{aligned} &s \boldsymbol{u}=\boldsymbol{K} \boldsymbol{P}^{\prime} .\\ &\left[\begin{array}{c} s u \\ s v \\ s \end{array}\right]=\left[\begin{array}{ccc} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{array}\right]\left[\begin{array}{l} X^{\prime} \\ Y^{\prime} \\ Z^{\prime} \end{array}\right] . \end{aligned} su=KP′. susvs = fx000fy0cxcy1 X′Y′Z′ .
s
可以通过矩阵第三行计算,代入其他两行消除:
u = f x X ′ Z ′ + c x , v = f y Y ′ Z ′ + c y u=f_x \frac{X^{\prime}}{Z^{\prime}}+c_x, \quad v=f_y \frac{Y^{\prime}}{Z^{\prime}}+c_y u=fxZ′X′+cx,v=fyZ′Y′+cy
对位姿求导,这里也就是对李代数求导,使用左扰动,利用链式法则,如下:
∂ e ∂ δ ξ = lim δ ξ → 0 e ( δ ξ ⊕ ξ ) − e ( ξ ) δ ξ = ∂ e ∂ P ′ ∂ P ′ ∂ δ ξ \frac{\partial \boldsymbol{e}}{\partial \delta \boldsymbol{\xi}}=\lim _{\delta \boldsymbol{\xi} \rightarrow 0} \frac{\boldsymbol{e}(\delta \boldsymbol{\xi} \oplus \boldsymbol{\xi})-\boldsymbol{e}(\boldsymbol{\xi})}{\delta \boldsymbol{\xi}}=\frac{\partial \boldsymbol{e}}{\partial \boldsymbol{P}^{\prime}} \frac{\partial \boldsymbol{P}^{\prime}}{\partial \delta \boldsymbol{\xi}} ∂δξ∂e=δξ→0limδξe(δξ⊕ξ)−e(ξ)=∂P′∂e∂δξ∂P′
第一项:误差关于投影点P‘的导数,Δe = 观测 - 预测,所以求导后有个负号,然后根据相机模型中u、v的式子分别求导
∂ e ∂ P ′ = − [ ∂ u ∂ X ′ ∂ u ∂ Y ′ ∂ u ∂ Z ′ ∂ v ∂ X ′ ∂ v ∂ Y ′ ∂ v ∂ Z ′ ] = − [ f x Z ′ 0 − f x X ′ Z ′ 2 0 f y Z ′ − f y Y ′ Z ′ 2 ] . \frac{\partial \boldsymbol{e}}{\partial \boldsymbol{P}^{\prime}}=-\left[\begin{array}{ccc} \frac{\partial u}{\partial X^{\prime}} & \frac{\partial u}{\partial Y^{\prime}} & \frac{\partial u}{\partial Z^{\prime}} \\ \frac{\partial v}{\partial X^{\prime}} & \frac{\partial v}{\partial Y^{\prime}} & \frac{\partial v}{\partial Z^{\prime}} \end{array}\right]=-\left[\begin{array}{ccc} \frac{f_x}{Z^{\prime}} & 0 & -\frac{f_x X^{\prime}}{Z^{\prime 2}} \\ 0 & \frac{f_y}{Z^{\prime}} & -\frac{f_y Y^{\prime}}{Z^{\prime 2}} \end{array}\right] \text {. } ∂P′∂e=−[∂X′∂u∂X′∂v∂Y′∂u∂Y′∂v∂Z′∂u∂Z′∂v]=−[Z′fx00Z′fy−Z′2fxX′−Z′2fyY′].
第二项:变换后的点关于李代数的求导,见第四章SE(3)扰动求导,右上角是变换后点的反对称矩阵
∂ ( T P ) ∂ δ ξ = ( T P ) ⊙ = [ I − P ′ ∧ 0 T 0 T ] \frac{\partial(\boldsymbol{T} \boldsymbol{P})}{\partial \delta \boldsymbol{\xi}}=(\boldsymbol{T} \boldsymbol{P})^{\odot}=\left[\begin{array}{cc} \boldsymbol{I} & -\boldsymbol{P}^{\prime \wedge} \\ \mathbf{0}^{\mathrm{T}} & \mathbf{0}^{\mathrm{T}} \end{array}\right] ∂δξ∂(TP)=(TP)⊙=[I0T−P′∧0T]
取出前三维(行维,下面都是0,没有用),即3*6的矩阵
∂
P
′
∂
δ
ξ
=
[
I
,
−
P
′
∧
]
\frac{\partial \boldsymbol{P}^{\prime}}{\partial \delta \boldsymbol{\xi}}=\left[\boldsymbol{I},-\boldsymbol{P}^{\prime \wedge}\right]
∂δξ∂P′=[I,−P′∧]
两项之积,2*6矩阵,可自行检验
∂ e ∂ δ ξ = − [ f x Z ′ 0 − f x X ′ Z ′ 2 − f x X ′ Y ′ Z ′ 2 f x + f x X ′ 2 Z ′ 2 − f x Y ′ Z ′ 0 f y Z ′ − f y Y ′ Z ′ 2 − f y − f y Y ′ 2 Z ′ 2 f y X ′ Y ′ Z ′ 2 f y X ′ Z ′ ] \frac{\partial \boldsymbol{e}}{\partial \delta \boldsymbol{\xi}}=-\left[\begin{array}{cccccc} \frac{f_x}{Z^{\prime}} & 0 & -\frac{f_x X^{\prime}}{Z^{\prime 2}} & -\frac{f_x X^{\prime} Y^{\prime}}{Z^{\prime 2}} & f_x+\frac{f_x X^{\prime 2}}{Z^{\prime 2}} & -\frac{f_x Y^{\prime}}{Z^{\prime}} \\ 0 & \frac{f_y}{Z^{\prime}} & -\frac{f_y Y^{\prime}}{Z^{\prime 2}} & -f_y-\frac{f_y Y^{\prime 2}}{Z^{\prime 2}} & \frac{f_y X^{\prime} Y^{\prime}}{Z^{\prime 2}} & \frac{f_y X^{\prime}}{Z^{\prime}} \end{array}\right] ∂δξ∂e=−[Z′fx00Z′fy−Z′2fxX′−Z′2fyY′−Z′2fxX′Y′−fy−Z′2fyY′2fx+Z′2fxX′2Z′2fyX′Y′−Z′fxY′Z′fyX′]
注意,这里的李代数se(3)形式 ξ = [ ρ ϕ ] \boldsymbol{\xi}=\begin{bmatrix}\rho\\\\\phi\end{bmatrix} ξ= ρϕ ,所以如果se(3)中旋转在前,平移在后,那么结果矩阵前三行核后三列对调
② 空间点P的雅可比
∂ e ∂ P = ∂ e ∂ P ′ ∂ P ′ ∂ P \frac{\partial \boldsymbol{e}}{\partial \boldsymbol{P}}=\frac{\partial \boldsymbol{e}}{\partial \boldsymbol{P}^{\prime}} \frac{\partial \boldsymbol{P}^{\prime}}{\partial \boldsymbol{P}} ∂P∂e=∂P′∂e∂P∂P′
第一项:误差关于投影点P‘的导数,Δe = 观测 - 预测,所以求导后有个负号,然后根据相机模型中u、v的式子分别求导
∂ e ∂ P ′ = − [ ∂ u ∂ X ′ ∂ u ∂ Y ′ ∂ u ∂ Z ′ ∂ v ∂ X ′ ∂ v ∂ Y ′ ∂ v ∂ Z ′ ] = − [ f x Z ′ 0 − f x X ′ Z ′ 2 0 f y Z ′ − f y Y ′ Z ′ 2 ] \frac{\partial \boldsymbol{e}}{\partial \boldsymbol{P}^{\prime}}=-\left[\begin{array}{ccc} \frac{\partial u}{\partial X^{\prime}} & \frac{\partial u}{\partial Y^{\prime}} & \frac{\partial u}{\partial Z^{\prime}} \\ \frac{\partial v}{\partial X^{\prime}} & \frac{\partial v}{\partial Y^{\prime}} & \frac{\partial v}{\partial Z^{\prime}} \end{array}\right]=-\left[\begin{array}{ccc} \frac{f_x}{Z^{\prime}} & 0 & -\frac{f_x X^{\prime}}{Z^{\prime 2}} \\ 0 & \frac{f_y}{Z^{\prime}} & -\frac{f_y Y^{\prime}}{Z^{\prime 2}} \end{array}\right] \text { } ∂P′∂e=−[∂X′∂u∂X′∂v∂Y′∂u∂Y′∂v∂Z′∂u∂Z′∂v]=−[Z′fx00Z′fy−Z′2fxX′−Z′2fyY′]
第二项:很明显求导后只有R
P ′ = ( T P ) 1 : 3 = R P + t \boldsymbol{P}^{\prime}=(\boldsymbol{T P})_{1: 3}=\boldsymbol{R P}+\boldsymbol{t} P′=(TP)1:3=RP+t
两项之积
∂ e ∂ P = − [ f x Z ′ 0 − f x X ′ Z ′ 2 0 f y Z ′ − f y Y ′ Z ′ 2 ] R \frac{\partial \boldsymbol{e}}{\partial \boldsymbol{P}}=-\left[\begin{array}{ccc} \frac{f_x}{Z^{\prime}} & 0 & -\frac{f_x X^{\prime}}{Z^{\prime 2}} \\ 0 & \frac{f_y}{Z^{\prime}} & -\frac{f_y Y^{\prime}}{Z^{\prime 2}} \end{array}\right] \boldsymbol{R} ∂P∂e=−[Z′fx00Z′fy−Z′2fxX′−Z′2fyY′]R
3.3.2 实验程序
- pose_estimation_3d2d.cpp
// 函数声明1:特征匹配
void find_feature_matches(
const Mat &img_1, const Mat &img_2,
std::vector<KeyPoint> &keypoints_1,
std::vector<KeyPoint> &keypoints_2,
std::vector<DMatch> &matches);
// 函数声明2:像素坐标转相机归一化坐标
Point2d pixel2cam(const Point2d &p, const Mat &K);
// BA by g2o
typedef vector<Eigen::Vector2d, Eigen::aligned_allocator<Eigen::Vector2d>> VecVector2d;
typedef vector<Eigen::Vector3d, Eigen::aligned_allocator<Eigen::Vector3d>> VecVector3d;
// 函数声明3:调用G2o库的BA法 光束法平差 最小化重投影误差PnP
void bundleAdjustmentG2O(
const VecVector3d &points_3d,
const VecVector2d &points_2d,
const Mat &K,
Sophus::SE3d &pose
);
// 函数声明4:手写高斯牛顿的BA法
void bundleAdjustmentGaussNewton(
const VecVector3d &points_3d,
const VecVector2d &points_2d,
const Mat &K,
Sophus::SE3d &pose
);
int main(int argc, char **argv) {
if (argc != 5) {
cout << "usage: pose_estimation_3d2d img1 img2 depth1 depth2" << endl;
return 1;
}
//1.读取图像
Mat img_1 = imread(argv[1], CV_LOAD_IMAGE_COLOR);
Mat img_2 = imread(argv[2], CV_LOAD_IMAGE_COLOR);
assert(img_1.data && img_2.data && "Can not load images!");
//2.特征匹配
vector<KeyPoint> keypoints_1, keypoints_2;
vector<DMatch> matches;
find_feature_matches(img_1, img_2, keypoints_1, keypoints_2, matches);
cout << "一共找到了" << matches.size() << "组匹配点" << endl;
//3.建立3D点
Mat d1 = imread(argv[3], CV_LOAD_IMAGE_UNCHANGED); // 深度图为16位无符号数,单通道图像
Mat K = (Mat_<double>(3, 3) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1);
vector<Point3f> pts_3d;
vector<Point2f> pts_2d;
// queryIdx代表的特征点序列是keypoints1中的,trainIdx代表的特征点序列是keypoints2中的
for (DMatch m:matches) {
ushort d = d1.ptr<unsigned short>(int(keypoints_1[m.queryIdx].pt.y))[int(keypoints_1[m.queryIdx].pt.x)];
if (d == 0) // bad depth
continue;
float dd = d / 5000.0;
Point2d p1 = pixel2cam(keypoints_1[m.queryIdx].pt, K);
pts_3d.push_back(Point3f(p1.x * dd, p1.y * dd, dd));
pts_2d.push_back(keypoints_2[m.trainIdx].pt);
}
cout << "3d-2d pairs: " << pts_3d.size() << endl;
// 4. 直接调用opencv库
chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
Mat r, t;
solvePnP(pts_3d, pts_2d, K, Mat(), r, t, false); // 调用OpenCV 的 PnP 求解,可选择EPNP,DLS等方法
Mat R;
cv::Rodrigues(r, R); // r为旋转向量形式,用Rodrigues公式转换为矩阵
chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
cout << "solve pnp in opencv cost time: " << time_used.count() << " seconds." << endl;
cout << "R=" << endl << R << endl;
cout << "t=" << endl << t << endl;
// 5. 自己写G2o库和直接高斯牛顿
// 为了与上面3D2D作区别,我们换个名字,然后将pts_3d,pts_2d数组的值一一拷贝到这个新的数组当中去
VecVector3d pts_3d_eigen;
VecVector2d pts_2d_eigen;
for (size_t i = 0; i < pts_3d.size(); ++i) {
pts_3d_eigen.push_back(Eigen::Vector3d(pts_3d[i].x, pts_3d[i].y, pts_3d[i].z));
pts_2d_eigen.push_back(Eigen::Vector2d(pts_2d[i].x, pts_2d[i].y));
}
// 手写高斯牛顿函数迭代
Sophus::SE3d pose_gn; // 位姿T
bundleAdjustmentGaussNewton(pts_3d_eigen, pts_2d_eigen, K, pose_gn);
// 图优化 调用G2o库
Sophus::SE3d pose_g2o;
bundleAdjustmentG2O(pts_3d_eigen, pts_2d_eigen, K, pose_g2o);
return 0;
}
// 1 特征匹配函数
void find_feature_matches(const Mat &img_1, const Mat &img_2,
std::vector<KeyPoint> &keypoints_1,
std::vector<KeyPoint> &keypoints_2,
std::vector<DMatch> &matches) {
Mat descriptors_1, descriptors_2;
Ptr<FeatureDetector> detector = ORB::create();
Ptr<DescriptorExtractor> descriptor = ORB::create();
// use this if you are in OpenCV2
// Ptr<FeatureDetector> detector = FeatureDetector::create ( "ORB" );
// Ptr<DescriptorExtractor> descriptor = DescriptorExtractor::create ( "ORB" );
Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");
//-- 1.检测 Oriented FAST 角点位置
detector->detect(img_1, keypoints_1);
detector->detect(img_2, keypoints_2);
//-- 2.根据角点位置计算 BRIEF 描述子
descriptor->compute(img_1, keypoints_1, descriptors_1);
descriptor->compute(img_2, keypoints_2, descriptors_2);
//-- 3.对两幅图像中的BRIEF描述子进行匹配,使用 Hamming 距离
vector<DMatch> match;
// BFMatcher matcher ( NORM_HAMMING );
matcher->match(descriptors_1, descriptors_2, match);
//-- 4.匹配点对筛选
double min_dist = 10000, max_dist = 0;
//找出所有匹配之间的最小距离和最大距离, 即是最相似的和最不相似的两组点之间的距离
for (int i = 0; i < descriptors_1.rows; i++) {
double dist = match[i].distance;
if (dist < min_dist) min_dist = dist;
if (dist > max_dist) max_dist = dist;
}
printf("-- Max dist : %f \n", max_dist);
printf("-- Min dist : %f \n", min_dist);
//当描述子之间的距离大于两倍的最小距离时,即认为匹配有误.但有时候最小距离会非常小,设置一个经验值30作为下限.
for (int i = 0; i < descriptors_1.rows; i++) {
if (match[i].distance <= max(2 * min_dist, 30.0)) {
matches.push_back(match[i]);
}
}
}
// 2 像素转归一化函数
Point2d pixel2cam(const Point2d &p, const Mat &K) {
return Point2d
(
// 公式,第5章、2d2d 都有这个过程
(p.x - K.at<double>(0, 2)) / K.at<double>(0, 0),
(p.y - K.at<double>(1, 2)) / K.at<double>(1, 1)
);
}
// 3 手写高斯牛顿函数
void bundleAdjustmentGaussNewton(
const VecVector3d &points_3d,
const VecVector2d &points_2d,
const Mat &K,
Sophus::SE3d &pose)
{
typedef Eigen::Matrix<double, 6, 1> Vector6d;
const int iterations = 10;
double cost = 0, lastCost = 0;
// 内参K
double fx = K.at<double>(0, 0);
double fy = K.at<double>(1, 1);
double cx = K.at<double>(0, 2);
double cy = K.at<double>(1, 2);
for (int iter = 0; iter < iterations; iter++) {
Eigen::Matrix<double, 6, 6> H = Eigen::Matrix<double, 6, 6>::Zero();
Vector6d b = Vector6d::Zero();
cost = 0;
// compute cost
for (int i = 0; i < points_3d.size(); i++) {
Eigen::Vector3d pc = pose * points_3d[i];
double inv_z = 1.0 / pc[2];
double inv_z2 = inv_z * inv_z;
Eigen::Vector2d proj(fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy);
Eigen::Vector2d e = points_2d[i] - proj;
cost += e.squaredNorm();
Eigen::Matrix<double, 2, 6> J;
J << -fx * inv_z,
0,
fx * pc[0] * inv_z2,
fx * pc[0] * pc[1] * inv_z2,
-fx - fx * pc[0] * pc[0] * inv_z2,
fx * pc[1] * inv_z,
0,
-fy * inv_z,
fy * pc[1] * inv_z2,
fy + fy * pc[1] * pc[1] * inv_z2,
-fy * pc[0] * pc[1] * inv_z2,
-fy * pc[0] * inv_z;
H += J.transpose() * J;
b += -J.transpose() * e;
}
Vector6d dx;
dx = H.ldlt().solve(b);
if (isnan(dx[0])) {
cout << "result is nan!" << endl;
break;
}
if (iter > 0 && cost >= lastCost) {
// cost increase, update is not good
cout << "cost: " << cost << ", last cost: " << lastCost << endl;
break;
}
// update your estimation
pose = Sophus::SE3d::exp(dx) * pose;
lastCost = cost;
cout << "iteration " << iter << " cost=" << std::setprecision(12) << cost << endl;
if (dx.norm() < 1e-6) {
// converge
break;
}
}
cout << "pose by g-n: \n" << pose.matrix() << endl;
}
// 4.优化参数 -- 点 g2o点函数重定义
class VertexPose : public g2o::BaseVertex<6, Sophus::SE3d> {
public:
EIGEN_MAKE_ALIGNED_OPERATOR_NEW;
virtual void setToOriginImpl() override {
_estimate = Sophus::SE3d();
}
/// left multiplication on SE3
virtual void oplusImpl(const double *update) override {
Eigen::Matrix<double, 6, 1> update_eigen;
update_eigen << update[0], update[1], update[2], update[3], update[4], update[5];
_estimate = Sophus::SE3d::exp(update_eigen) * _estimate;
}
virtual bool read(istream &in) override {}
virtual bool write(ostream &out) const override {}
};
// 5.优化参数 -- 边 g2o边函数重定义
class EdgeProjection : public g2o::BaseUnaryEdge<2, Eigen::Vector2d, VertexPose> {
public:
EIGEN_MAKE_ALIGNED_OPERATOR_NEW;
EdgeProjection(const Eigen::Vector3d &pos, const Eigen::Matrix3d &K) : _pos3d(pos), _K(K) {}
virtual void computeError() override {
const VertexPose *v = static_cast<VertexPose *> (_vertices[0]);
Sophus::SE3d T = v->estimate();
Eigen::Vector3d pos_pixel = _K * (T * _pos3d);
pos_pixel /= pos_pixel[2];
_error = _measurement - pos_pixel.head<2>();
}
virtual void linearizeOplus() override {
const VertexPose *v = static_cast<VertexPose *> (_vertices[0]);
Sophus::SE3d T = v->estimate();
Eigen::Vector3d pos_cam = T * _pos3d;
double fx = _K(0, 0);
double fy = _K(1, 1);
double cx = _K(0, 2);
double cy = _K(1, 2);
double X = pos_cam[0];
double Y = pos_cam[1];
double Z = pos_cam[2];
double Z2 = Z * Z;
_jacobianOplusXi
<< -fx / Z, 0, fx * X / Z2, fx * X * Y / Z2, -fx - fx * X * X / Z2, fx * Y / Z,
0, -fy / Z, fy * Y / (Z * Z), fy + fy * Y * Y / Z2, -fy * X * Y / Z2, -fy * X / Z;
}
virtual bool read(istream &in) override {}
virtual bool write(ostream &out) const override {}
private:
Eigen::Vector3d _pos3d;
Eigen::Matrix3d _K;
};
// 6.BA算法
void bundleAdjustmentG2O(
const VecVector3d &points_3d,
const VecVector2d &points_2d,
const Mat &K,
Sophus::SE3d &pose) {
// 构建图优化,先设定g2o
typedef g2o::BlockSolver<g2o::BlockSolverTraits<6, 3>> BlockSolverType; // pose is 6, landmark is 3
typedef g2o::LinearSolverDense<BlockSolverType::PoseMatrixType> LinearSolverType; // 线性求解器类型
// 梯度下降方法,可以从GN, LM, DogLeg 中选
auto solver = new g2o::OptimizationAlgorithmGaussNewton(
g2o::make_unique<BlockSolverType>(g2o::make_unique<LinearSolverType>()));
g2o::SparseOptimizer optimizer; // 图模型
optimizer.setAlgorithm(solver); // 设置求解器
optimizer.setVerbose(true); // 打开调试输出
// vertex
VertexPose *vertex_pose = new VertexPose(); // camera vertex_pose
vertex_pose->setId(0);
vertex_pose->setEstimate(Sophus::SE3d());
optimizer.addVertex(vertex_pose);
// K
Eigen::Matrix3d K_eigen;
K_eigen <<
K.at<double>(0, 0), K.at<double>(0, 1), K.at<double>(0, 2),
K.at<double>(1, 0), K.at<double>(1, 1), K.at<double>(1, 2),
K.at<double>(2, 0), K.at<double>(2, 1), K.at<double>(2, 2);
// edges
int index = 1;
for (size_t i = 0; i < points_2d.size(); ++i) {
auto p2d = points_2d[i];
auto p3d = points_3d[i];
EdgeProjection *edge = new EdgeProjection(p3d, K_eigen);
edge->setId(index);
edge->setVertex(0, vertex_pose);
edge->setMeasurement(p2d);
edge->setInformation(Eigen::Matrix2d::Identity());
optimizer.addEdge(edge);
index++;
}
chrono::steady_clock::time_point t1 = chrono::steady_clock::now();
optimizer.setVerbose(true);
optimizer.initializeOptimization();
optimizer.optimize(10);
chrono::steady_clock::time_point t2 = chrono::steady_clock::now();
chrono::duration<double> time_used = chrono::duration_cast<chrono::duration<double>>(t2 - t1);
cout << "optimization costs time: " << time_used.count() << " seconds." << endl;
cout << "pose estimated by g2o =\n" << vertex_pose->estimate().matrix() << endl;
pose = vertex_pose->estimate();
}