[Notes] Revisiting a Dimensionality-Reduction Method in Machine Learning: Principal Component Analysis (PCA)
Overview
Principal Component Analysis (PCA) is the most widely used dimensionality-reduction method. It also appears in the laserMapping part of the LOAM code analyzed earlier. So what kind of algorithm is it, exactly?
Dimensionality Reduction
To explain what the principal components of a dataset are, let us start from dimensionality reduction.
What does dimensionality reduction mean?
Suppose we have a set of points in 3D space lying on a slanted plane through the origin. Representing them in the Cartesian axes $x, y, z$ requires three dimensions, yet the points actually lie on a two-dimensional plane. So where does the redundancy come from?
Can we rotate the $x, y, z$ coordinate system so that the plane containing the data coincides with the $xOy$ plane?
Denote the rotated coordinate system by $x', y', z'$. Then the data can be represented with just the two coordinates $x'$ and $y'$. Of course, to recover the original representation, the transformation matrix between the two coordinate systems must be stored as well. This is how the dimensionality gets reduced.
But look at the essence of this process: if we stack the points into a matrix by rows or by columns, that matrix has rank 2. The data are correlated; a maximal linearly independent subset of the vectors from the origin to these points contains only two vectors, which is exactly why we assumed at the outset that the plane passes through the origin. What if the plane does not pass through the origin? That is what data centering is for: translate the origin to the data centroid, and data that appeared uncorrelated become correlated in the new coordinate system. Interestingly, three points are always coplanar, so any three points in 3D space are linearly dependent after centering; in general, $n$ points in an $n$-dimensional space can always be analyzed in an $(n-1)$-dimensional subspace. So do not say your data are uncorrelated — you may simply have chosen the wrong coordinates!
In the example above, the reduction discards nothing, because every point's component along the third dimension is zero. Now suppose the data jitter slightly along the $z'$ axis. We can still use the two-dimensional representation, on the grounds that the information along those two axes forms the principal components of the data and suffices for our analysis. The jitter along $z'$ is very likely noise: the data were originally correlated, and the noise broke the correlation only slightly. Since the spread along $z'$ subtends a tiny angle at the origin, the data remain strongly correlated in that direction. Putting all this together, the projections onto the $x'$ and $y'$ axes can be taken as the principal components of the data.
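This intuition is easy to check numerically. Below is a minimal numpy sketch (the data, rotation angle, and noise scale are all made up for illustration): points generated on a plane with a tiny jitter in the third direction have one near-zero covariance eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)
# Points on the plane z' = 0, plus a tiny jitter along z' (the "noise" axis).
xy = rng.normal(size=(200, 2))
z = rng.normal(scale=1e-3, size=(200, 1))
pts = np.hstack([xy, z])

# Rotate so the plane is no longer axis-aligned, like the slanted plane above.
theta = np.pi / 6
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(theta), -np.sin(theta)],
              [0.0, np.sin(theta), np.cos(theta)]])
data = pts @ R.T

# Eigenvalues of the centered covariance (ascending): one is essentially zero,
# so the data are, up to noise, two-dimensional.
centered = data - data.mean(axis=0)
eigvals = np.linalg.eigvalsh(centered.T @ centered / len(data))
print(eigvals)
```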
Derivation of PCA
We derive PCA from two viewpoints:
- Minimum reconstruction error: every sample point should lie close to the projection hyperplane;
- Maximum separability: the projections of the sample points onto the hyperplane should be spread out as much as possible.
These two viewpoints lead to two equivalent formulations of PCA.
(Unless stated otherwise, all data vectors in the derivation below are column vectors.)
Given a dataset of $n$ samples:
$$X=\begin{bmatrix} x_1^T\\ x_2^T\\ \vdots \\ x_n^T \end{bmatrix},\qquad x_i\in \mathbb{R}^p$$
Suppose the new coordinate system after the projection is
$$W=\begin{bmatrix} \omega_1^T\\ \omega_2^T\\ \vdots \\ \omega_q^T \end{bmatrix}\in \mathbb{R}^{q \times p}$$
where the $\omega_i$ are orthonormal basis vectors: $\left \| \omega_i\right \|_2=1$ and $\omega_i^T\omega_j=0$ for $i\neq j$.
If we discard some of the coordinates in the new system, i.e., reduce the dimensionality to $q<p$, then the projection of a sample point $x_i$ into the low-dimensional system is obtained by a matrix multiplication:
$$y_i=Wx_i,\qquad y_i\in\mathbb{R}^{q},\; q<p$$
The dimensionality-reduction problem thus becomes: how do we find $W$? First, expand $y$:
$$y=Wx=\begin{bmatrix} w_1^Tx\\ w_2^Tx\\ \vdots \\ w_q^Tx \end{bmatrix}$$
Each $w_i$, like $x$, is a $p$-dimensional vector, and finding all of them gives us $W$. Since each component of $y$ is the dot product of one $w_i$ with the sample, the same criterion should determine every $w_i$.
Now take one $w_i$ and expand its dot product with a sample:
$$w_i^Tx=w_i\cdot x = \left | w_i \right |\left | x\right |\cos \theta =\left | x\right |\cos \theta$$
$$\text{s.t.}\quad \left \| \omega\right \|_2=1,\ \text{i.e.}\ \omega^T\omega=1$$
With $\|w_i\|=1$, the dot product of $w_i$ with a sample is exactly the (scalar) projection of the sample onto $w_i$.
Before the derivation, define a few quantities.
Sample mean:
$$\bar{x}=\frac{1}{n}\sum_{i=1}^{n}{x_i}$$
Mean of the projected samples:
$$\mu = \frac{1}{n}\sum_{i=1}^n w^Tx_i$$
Sample covariance matrix:
$$\mathrm{cov}(X)=\Sigma =\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})^T$$
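In code these definitions are one-liners; a small numpy sketch (the data and variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))            # n = 100 samples, p = 3, one row each

x_bar = X.mean(axis=0)                   # sample mean
centered = X - x_bar
Sigma = centered.T @ centered / len(X)   # (1/n) * sum_i (x_i - x_bar)(x_i - x_bar)^T

# np.cov with bias=True uses the same 1/n normalization.
assert np.allclose(Sigma, np.cov(X.T, bias=True))
```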
Next, expand and simplify the variance of the samples projected onto $w$:
$$\sigma^2=\frac{1}{n}\sum_{i=1}^{n}(\omega^Tx_i-\mu)^2$$
$$=\frac{1}{n}\sum_{i=1}^{n}\Big(\omega^Tx_i-\frac{1}{n}\sum_{j=1}^n\omega^Tx_j\Big)^2$$
$$=\frac{1}{n}\sum_{i=1}^{n}\Big(\omega^Tx_i-\omega^T\Big(\frac{1}{n}\sum_{j=1}^nx_j\Big)\Big)^2$$
$$=\frac{1}{n}\sum_{i=1}^{n}\big(\omega^T(x_i-\bar{x})\big)^2$$
$$=\frac{1}{n}\sum_{i=1}^{n}\big(\omega^T(x_i-\bar{x})\big)\big(\omega^T(x_i-\bar{x})\big)^T$$
$$=\frac{1}{n}\sum_{i=1}^{n}\omega^T(x_i-\bar{x})(x_i-\bar{x})^T\omega$$
$$=\omega^T\Big(\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})^T\Big)\omega$$
$$=\omega^T\Sigma\omega$$
The transpose step is valid because $(\omega^T(x_i-\bar{x}))^2$ is a scalar: a $1\times p$ row vector times a $p\times 1$ column vector is a $1\times 1$ matrix, which equals its own transpose.
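As a sanity check on the final identity, the variance of the scalar projections really does equal the quadratic form $w^T\Sigma w$ (a sketch with made-up data and an arbitrary unit vector):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
w = np.array([1.0, 2.0, 2.0]) / 3.0      # an arbitrary unit vector, ||w|| = 1

centered = X - X.mean(axis=0)
Sigma = centered.T @ centered / len(X)

# Variance of the scalar projections w^T x_i ...
proj = X @ w
var_direct = np.mean((proj - proj.mean()) ** 2)

# ... equals the quadratic form w^T Sigma w.
assert np.isclose(var_direct, w @ Sigma @ w)
```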
Deriving from minimum reconstruction error gives:
$$\hat{w}=\arg\min_{w}\; -\,(w^T\Sigma w)\qquad \text{s.t.}\ w^Tw=1 \tag{1}$$
Deriving from maximum separability gives:
$$\hat{w}=\arg\max_{w}\; w^T\Sigma w\qquad \text{s.t.}\ w^Tw=1 \tag{2}$$
Clearly, formulations (1) and (2) are equivalent.
Construct an objective function with a Lagrange multiplier:
$$L(w,\lambda)=w^T\Sigma w+\lambda(1-w^Tw)$$
Differentiate it with respect to $w$:
$$\frac{\partial L(w,\lambda)}{\partial w}=2\Sigma w - 2\lambda w$$
Setting the derivative to zero gives:
$$\Sigma w=\lambda w$$
So $\lambda$ is an eigenvalue of $\Sigma$ and $w$ is a corresponding eigenvector. Sorting all eigenvalues and taking the eigenvectors of the $q$ largest yields $W$.
Why take the largest eigenvalues?
Recall the objective: maximize the variance after projection,
$$\sigma^2=w^T\Sigma w$$
Substituting $\Sigma w=\lambda w$ from the stationarity condition above:
$$\sigma^2=w^T\Sigma w=\lambda w^Tw=\lambda$$
So the variance after projection is exactly an eigenvalue of the covariance matrix, and maximizing the variance means choosing the largest eigenvalues.
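This, too, can be checked numerically: projecting centered data onto the top eigenvector yields a sample variance equal to the top eigenvalue (synthetic data for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 3)) @ np.diag([3.0, 1.0, 0.3])   # anisotropic data

centered = X - X.mean(axis=0)
Sigma = centered.T @ centered / len(X)

# np.linalg.eigh returns eigenvalues of a symmetric matrix in ascending order.
eigvals, eigvecs = np.linalg.eigh(Sigma)
w = eigvecs[:, -1]          # eigenvector of the largest eigenvalue

# Variance of the data projected onto w equals that eigenvalue.
proj = centered @ w
assert np.isclose(proj.var(), eigvals[-1])
```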
Moreover, since $\Sigma$ is a symmetric matrix, it has the eigendecomposition
$$\Sigma = G\Lambda G^T$$
where $G$ is a $p\times p$ orthogonal matrix whose columns are unit-norm eigenvectors:
$$G=[g_1,g_2,\dots,g_p]$$
Taking the $q$ columns with the largest eigenvalues gives
$$W=[g_1,g_2,\dots,g_q]^T$$
and the reduced data are obtained as
$$Y^T=WX^T$$
From the derivation above, dimensionality reduction essentially boils down to an eigendecomposition.
Pseudocode:
Input: sample set X (data matrix, one sample per row);
target dimension q (q must be smaller than the number of columns of X).
Procedure:
1. Center all samples;
2. Compute the sample covariance matrix Sigma;
3. Eigendecompose Sigma;
4. Take the eigenvectors w1, w2, ..., wq of the q largest eigenvalues.
Output: projection matrix W = (w1, w2, ..., wq)^T
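The pseudocode translates almost line for line into numpy; this is a sketch, not a reference implementation:

```python
import numpy as np

def pca(X, q):
    """Eigendecomposition-based PCA. X: (n, p) with one sample per row."""
    centered = X - X.mean(axis=0)               # 1. center the samples
    Sigma = centered.T @ centered / len(X)      # 2. covariance matrix
    eigvals, eigvecs = np.linalg.eigh(Sigma)    # 3. eigendecomposition (ascending)
    W = eigvecs[:, ::-1][:, :q].T               # 4. top-q eigenvectors, as rows
    return W, centered @ W.T                    # projection matrix, reduced data

rng = np.random.default_rng(4)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))
W, Y = pca(X, 2)
print(W.shape, Y.shape)   # (2, 5) (100, 2)
```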
SVD (Singular Value Decomposition)
Next, we perform the reduction via singular value decomposition (SVD is available as a function in both Eigen and OpenCV, and a manual derivation is given at the end of this article). First, unpack the covariance matrix:
$$\Sigma=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})^T = \frac{1}{n}[x_1-\bar{x},\,x_2-\bar{x},\,\cdots,\,x_n-\bar{x}]\;[x_1-\bar{x},\,x_2-\bar{x},\,\cdots,\,x_n-\bar{x}]^T$$
Pull out the first factor and simplify it:
$$[x_1-\bar{x},\,x_2-\bar{x},\,\cdots,\,x_n-\bar{x}]=[x_1,x_2,\cdots,x_n]-\bar{x}\,[1]_n^T$$
$$=X^T-\bar{x}\,[1]_n^T$$
$$=X^T-\frac{1}{n}X^T[1]_n[1]_n^T$$
$$=X^T\Big(I_n-\frac{1}{n}[1]_n[1]_n^T\Big)$$
$$=X^TH$$
Here $[1]_n$ denotes the length-$n$ column vector of ones, so $\bar{x}\,[1]_n^T$ is the matrix whose every column equals $\bar{x}$, and $H=I_n-\frac{1}{n}[1]_n[1]_n^T$ is the centering matrix.
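The centering matrix is easy to verify numerically (a small sketch; the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 6, 3
X = rng.normal(size=(n, p))          # one sample per row, so X^T is p x n

ones = np.ones((n, 1))
H = np.eye(n) - ones @ ones.T / n    # H = I_n - (1/n) [1]_n [1]_n^T

# Right-multiplying X^T by H subtracts the column mean: it centers the samples.
assert np.allclose(X.T @ H, (X - X.mean(axis=0)).T)
# H is symmetric and idempotent (centering twice changes nothing).
assert np.allclose(H @ H, H)
```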
Substituting this back into the covariance formula gives
$$\Sigma=\frac{1}{n}X^TH(X^TH)^T$$
Now apply SVD to the matrix $X^TH$:
$$X^TH=UKV^T$$
and construct the matrix $S$:
$$S=X^TH(X^TH)^T$$
$$=UKV^TVKU^T$$
$$=UK^2U^T$$
From the relationship between $S$ and $\Sigma$:
$$S=n\Sigma$$
$$UK^2U^T=G\,n\Lambda\,G^T$$
which gives
$$U=G$$
And $U$ is exactly the left singular matrix of $X^TH$.
By the earlier derivation, $G$ is a $p\times p$ orthogonal matrix whose columns are unit-norm eigenvectors, so
$$G=[g_1,g_2,\dots,g_p]=U=[u_1,u_2,\dots,u_p]$$
Taking the $q$ columns with the largest singular values gives
$$W=[u_1,u_2,\dots,u_q]^T$$
and the reduced data follow as
$$Y^T=WX^T$$
Pseudocode:
Input: sample set X (data matrix, one sample per row);
target dimension q (q must be smaller than the number of columns of X).
Procedure:
1. Center all samples: XH_T = X - meanX;
2. The sample covariance matrix would be Sigma = XH_T.T * XH_T / n (it need not be formed explicitly);
3. Apply SVD to the transposed centered matrix XH_T.T and take the left singular matrix U;
4. Take the left singular vectors w1, w2, ..., wq of the q largest singular values.
Output: projection matrix W = (w1, w2, ..., wq)^T
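The SVD route can be sketched the same way; the check at the end confirms it spans the same principal subspace as the eigendecomposition route (synthetic data; `pca_svd` is my own name):

```python
import numpy as np

def pca_svd(X, q):
    """SVD-based PCA. X: (n, p) with one sample per row."""
    centered = X - X.mean(axis=0)                  # 1. center the samples
    # 3. SVD of the p x n centered matrix; columns of U are the principal axes.
    U, K, Vt = np.linalg.svd(centered.T, full_matrices=False)
    W = U[:, :q].T                                 # 4. top-q left singular vectors
    return W, centered @ W.T

rng = np.random.default_rng(6)
X = rng.normal(size=(80, 4)) @ rng.normal(size=(4, 4))
W, Y = pca_svd(X, 2)

# Same principal subspace as the eigendecomposition route: the singular values
# of W @ top are the cosines of the principal angles, all equal to 1.
centered = X - X.mean(axis=0)
Sigma = centered.T @ centered / len(X)
eigvals, eigvecs = np.linalg.eigh(Sigma)
top = eigvecs[:, ::-1][:, :2]
cosines = np.linalg.svd(W @ top, compute_uv=False)
assert np.allclose(cosines, 1.0, atol=1e-4)
```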
Derivation of the Singular Value Decomposition
Unlike the eigendecomposition, which does not exist for every matrix, the singular value decomposition exists for any matrix $A$:
$$A=U\Sigma V^T$$
For $A\in \mathbb{R}^{m \times n}$, $U\in \mathbb{R}^{m \times m}$ and $V\in \mathbb{R}^{n \times n}$ are orthogonal matrices, called the left and right singular matrices respectively, and $\Sigma \in \mathbb{R}^{m \times n}$ is a (rectangular) diagonal matrix whose diagonal entries are the singular values of $A$.
The SVD can be computed as follows. First construct the two matrices
$$S=A^TA,\qquad T=AA^T$$
Both are symmetric, so both can be eigendecomposed:
$$S=P\Lambda_S P^T,\qquad T=Q\Lambda_T Q^T$$
Next, expand these two matrices using the SVD of $A$:
$$S=A^TA=(U\Sigma V^T)^T(U\Sigma V^T)$$
$$=V\Sigma^T U^TU\Sigma V^T$$
$$=V\Sigma^T\Sigma V^T$$
$$=P\Lambda_S P^T$$
which shows
$$P=V,\qquad \Lambda_S= \Sigma^T\Sigma$$
Similarly, expand $T$:
$$T=AA^T=(U\Sigma V^T)(U\Sigma V^T)^T$$
$$=U\Sigma V^TV\Sigma^T U^T$$
$$=U\Sigma\Sigma^T U^T$$
$$=Q\Lambda_T Q^T$$
which shows
$$Q=U,\qquad \Lambda_T= \Sigma\Sigma^T$$
So the left singular vectors of $A$ are eigenvectors of $AA^T$, the right singular vectors are eigenvectors of $A^TA$, and the nonzero eigenvalues of both are the squared singular values.
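These relations are easy to confirm with a random matrix (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.normal(size=(4, 3))
U, s, Vt = np.linalg.svd(A)

# Eigenvalues of A^T A are the squared singular values ...
evals_S = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(evals_S, s ** 2)

# ... and so are the nonzero eigenvalues of A A^T (plus one zero, since m > n).
evals_T = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]
assert np.allclose(evals_T[:3], s ** 2)
assert abs(evals_T[3]) < 1e-10
```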
PCA in the LOAM Source Code
Use the KD-tree to find the 5 nearest neighbors of the current point, then run principal component analysis on the covariance matrix of that small point cloud:
- If the five points lie along a line, one eigenvalue of the covariance matrix is significantly larger than the other two, and the corresponding eigenvector gives the direction of the line;
- If the five points lie on a plane, one eigenvalue is significantly smaller than the other two, and the corresponding eigenvector gives the normal of the plane.
So from the eigenvector we can easily pick two points on the line and build the optimization problem with the point-to-line distance formula from the paper; planar features follow the same idea.
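The eigenvalue pattern for a line feature can be reproduced in a few lines (the direction and noise scale below are made up; numpy stands in for the OpenCV call):

```python
import numpy as np

rng = np.random.default_rng(8)
d = np.array([1.0, 2.0, 2.0]) / 3.0           # unit direction of the line

# 5 points roughly on a line through the origin: t * d plus small noise.
t = rng.uniform(-1.0, 1.0, size=(5, 1))
pts = t * d + rng.normal(scale=1e-3, size=(5, 3))

c = pts.mean(axis=0)
Sigma = (pts - c).T @ (pts - c) / 5
eigvals, eigvecs = np.linalg.eigh(Sigma)      # ascending order

# One dominant eigenvalue; its eigenvector recovers the line direction (up to sign).
assert eigvals[2] > 3 * eigvals[1]
assert abs(np.dot(eigvecs[:, 2], d)) > 0.99
```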
- Average the 5 points:
float cx = 0;
float cy = 0;
float cz = 0;
for (int j = 0; j < 5; j++)
{
cx += laserCloudCornerFromMap->points[pointSearchInd[j]].x;
cy += laserCloudCornerFromMap->points[pointSearchInd[j]].y;
cz += laserCloudCornerFromMap->points[pointSearchInd[j]].z;
}
cx /= 5;
cy /= 5;
cz /= 5;
- Center the points and build the covariance matrix
//accumulate the centered second moments
float a11 = 0;
float a12 = 0;
float a13 = 0;
float a22 = 0;
float a23 = 0;
float a33 = 0;
for (int j = 0; j < 5; j++)
{
float ax = laserCloudCornerFromMap->points[pointSearchInd[j]].x - cx;
float ay = laserCloudCornerFromMap->points[pointSearchInd[j]].y - cy;
float az = laserCloudCornerFromMap->points[pointSearchInd[j]].z - cz;
a11 += ax * ax;
a12 += ax * ay;
a13 += ax * az;
a22 += ay * ay;
a23 += ay * az;
a33 += az * az;
}
a11 /= 5;
a12 /= 5;
a13 /= 5;
a22 /= 5;
a23 /= 5;
a33 /= 5;
//assemble the covariance matrix
matA1.at<float>(0, 0) = a11;
matA1.at<float>(0, 1) = a12;
matA1.at<float>(0, 2) = a13;
matA1.at<float>(1, 0) = a12;
matA1.at<float>(1, 1) = a22;
matA1.at<float>(1, 2) = a23;
matA1.at<float>(2, 0) = a13;
matA1.at<float>(2, 1) = a23;
matA1.at<float>(2, 2) = a33;
- Eigendecomposition with an OpenCV function
//eigendecomposition
cv::eigen(matA1, matD1, matV1);
OpenCV's eigendecomposition and SVD routines are relatively slow; they can be replaced with the Eigen API.
matD1 holds the eigenvalues $\Lambda$ in descending order, and matV1 the corresponding eigenvectors $G$, one per row.
- Select and process the largest eigenvalue
if (matD1.at<float>(0, 0) > 3 * matD1.at<float>(0, 1))
{//if the largest eigenvalue is more than 3x the second largest
//the current point
float x0 = pointSel.x;
float y0 = pointSel.y;
float z0 = pointSel.z;
//two points on the feature line: centroid +/- 0.1 * principal direction
float x1 = cx + 0.1 * matV1.at<float>(0, 0);
float y1 = cy + 0.1 * matV1.at<float>(0, 1);
float z1 = cz + 0.1 * matV1.at<float>(0, 2);
float x2 = cx - 0.1 * matV1.at<float>(0, 0);
float y2 = cy - 0.1 * matV1.at<float>(0, 1);
float z2 = cz - 0.1 * matV1.at<float>(0, 2);
//point-to-line distance computation
float a012 = sqrt(((x0 - x1)*(y0 - y2) - (x0 - x2)*(y0 - y1))
* ((x0 - x1)*(y0 - y2) - (x0 - x2)*(y0 - y1))
+ ((x0 - x1)*(z0 - z2) - (x0 - x2)*(z0 - z1))
* ((x0 - x1)*(z0 - z2) - (x0 - x2)*(z0 - z1))
+ ((y0 - y1)*(z0 - z2) - (y0 - y2)*(z0 - z1))
* ((y0 - y1)*(z0 - z2) - (y0 - y2)*(z0 - z1)));
float l12 = sqrt((x1 - x2)*(x1 - x2) + (y1 - y2)*(y1 - y2) + (z1 - z2)*(z1 - z2));
float la = ((y1 - y2)*((x0 - x1)*(y0 - y2) - (x0 - x2)*(y0 - y1))
+ (z1 - z2)*((x0 - x1)*(z0 - z2) - (x0 - x2)*(z0 - z1))) / a012 / l12;
float lb = -((x1 - x2)*((x0 - x1)*(y0 - y2) - (x0 - x2)*(y0 - y1))
- (z1 - z2)*((y0 - y1)*(z0 - z2) - (y0 - y2)*(z0 - z1))) / a012 / l12;
float lc = -((x1 - x2)*((x0 - x1)*(z0 - z2) - (x0 - x2)*(z0 - z1))
+ (y1 - y2)*((y0 - y1)*(z0 - z2) - (y0 - y2)*(z0 - z1))) / a012 / l12;
float ld2 = a012 / l12;
//unused
pointProj = pointSel;
pointProj.x -= la * ld2;
pointProj.y -= lb * ld2;
pointProj.z -= lc * ld2;
//weight computation
float s = 1 - 0.9 * fabs(ld2);
coeff.x = s * la;
coeff.y = s * lb;
coeff.z = s * lc;
coeff.intensity = s * ld2;
if (s > 0.1)
{//use the point only if the distance is small enough
laserCloudOri->push_back(pointOri);
coeffSel->push_back(coeff);
}
- The complete code for this part:
if (laserCloudCornerFromMapNum > 10 && laserCloudSurfFromMapNum > 100)
{
kdtreeCornerFromMap->setInputCloud(laserCloudCornerFromMap);//build the kd-tree
kdtreeSurfFromMap->setInputCloud(laserCloudSurfFromMap);
for (int iterCount = 0; iterCount < 10; iterCount++)
{//at most 10 iterations
laserCloudOri->clear();
coeffSel->clear();
for (int i = 0; i < laserCloudCornerStackNum; i++)
{
pointOri = laserCloudCornerStack->points[i];
//transform back to the world (map) frame
pointAssociateToMap(&pointOri, &pointSel);
kdtreeCornerFromMap->nearestKSearch(pointSel, 5, pointSearchInd, pointSearchSqDis);//find the 5 nearest points
if (pointSearchSqDis[4] < 1.0)
{//process only if the farthest of the 5 points is within distance 1
//sum and average the coordinates of the five nearest points
float cx = 0;
float cy = 0;
float cz = 0;
for (int j = 0; j < 5; j++)
{
cx += laserCloudCornerFromMap->points[pointSearchInd[j]].x;
cy += laserCloudCornerFromMap->points[pointSearchInd[j]].y;
cz += laserCloudCornerFromMap->points[pointSearchInd[j]].z;
}
cx /= 5;
cy /= 5;
cz /= 5;
//accumulate the centered second moments
float a11 = 0;
float a12 = 0;
float a13 = 0;
float a22 = 0;
float a23 = 0;
float a33 = 0;
for (int j = 0; j < 5; j++)
{
float ax = laserCloudCornerFromMap->points[pointSearchInd[j]].x - cx;
float ay = laserCloudCornerFromMap->points[pointSearchInd[j]].y - cy;
float az = laserCloudCornerFromMap->points[pointSearchInd[j]].z - cz;
a11 += ax * ax;
a12 += ax * ay;
a13 += ax * az;
a22 += ay * ay;
a23 += ay * az;
a33 += az * az;
}
a11 /= 5;
a12 /= 5;
a13 /= 5;
a22 /= 5;
a23 /= 5;
a33 /= 5;
//assemble the covariance matrix
matA1.at<float>(0, 0) = a11;
matA1.at<float>(0, 1) = a12;
matA1.at<float>(0, 2) = a13;
matA1.at<float>(1, 0) = a12;
matA1.at<float>(1, 1) = a22;
matA1.at<float>(1, 2) = a23;
matA1.at<float>(2, 0) = a13;
matA1.at<float>(2, 1) = a23;
matA1.at<float>(2, 2) = a33;
//eigendecomposition
cv::eigen(matA1, matD1, matV1);
if (matD1.at<float>(0, 0) > 3 * matD1.at<float>(0, 1))
{//if the largest eigenvalue is more than 3x the second largest
float x0 = pointSel.x;
float y0 = pointSel.y;
float z0 = pointSel.z;
float x1 = cx + 0.1 * matV1.at<float>(0, 0);
float y1 = cy + 0.1 * matV1.at<float>(0, 1);
float z1 = cz + 0.1 * matV1.at<float>(0, 2);
float x2 = cx - 0.1 * matV1.at<float>(0, 0);
float y2 = cy - 0.1 * matV1.at<float>(0, 1);
float z2 = cz - 0.1 * matV1.at<float>(0, 2);
float a012 = sqrt(((x0 - x1)*(y0 - y2) - (x0 - x2)*(y0 - y1))
* ((x0 - x1)*(y0 - y2) - (x0 - x2)*(y0 - y1))
+ ((x0 - x1)*(z0 - z2) - (x0 - x2)*(z0 - z1))
* ((x0 - x1)*(z0 - z2) - (x0 - x2)*(z0 - z1))
+ ((y0 - y1)*(z0 - z2) - (y0 - y2)*(z0 - z1))
* ((y0 - y1)*(z0 - z2) - (y0 - y2)*(z0 - z1)));
float l12 = sqrt((x1 - x2)*(x1 - x2) + (y1 - y2)*(y1 - y2) + (z1 - z2)*(z1 - z2));
float la = ((y1 - y2)*((x0 - x1)*(y0 - y2) - (x0 - x2)*(y0 - y1))
+ (z1 - z2)*((x0 - x1)*(z0 - z2) - (x0 - x2)*(z0 - z1))) / a012 / l12;
float lb = -((x1 - x2)*((x0 - x1)*(y0 - y2) - (x0 - x2)*(y0 - y1))
- (z1 - z2)*((y0 - y1)*(z0 - z2) - (y0 - y2)*(z0 - z1))) / a012 / l12;
float lc = -((x1 - x2)*((x0 - x1)*(z0 - z2) - (x0 - x2)*(z0 - z1))
+ (y1 - y2)*((y0 - y1)*(z0 - z2) - (y0 - y2)*(z0 - z1))) / a012 / l12;
float ld2 = a012 / l12;
//unused
pointProj = pointSel;
pointProj.x -= la * ld2;
pointProj.y -= lb * ld2;
pointProj.z -= lc * ld2;
//weight computation
float s = 1 - 0.9 * fabs(ld2);
coeff.x = s * la;
coeff.y = s * lb;
coeff.z = s * lc;
coeff.intensity = s * ld2;
if (s > 0.1)
{//use the point only if the distance is small enough
laserCloudOri->push_back(pointOri);
coeffSel->push_back(coeff);
}
}
}
}
for (int i = 0; i < laserCloudSurfStackNum; i++)
{
pointOri = laserCloudSurfStack->points[i];
pointAssociateToMap(&pointOri, &pointSel);
kdtreeSurfFromMap->nearestKSearch(pointSel, 5, pointSearchInd, pointSearchSqDis);
if (pointSearchSqDis[4] < 1.0)
{
//stack the coordinates of the five nearest points into a matrix
for (int j = 0; j < 5; j++)
{
matA0.at<float>(j, 0) = laserCloudSurfFromMap->points[pointSearchInd[j]].x;
matA0.at<float>(j, 1) = laserCloudSurfFromMap->points[pointSearchInd[j]].y;
matA0.at<float>(j, 2) = laserCloudSurfFromMap->points[pointSearchInd[j]].z;
}
//solve matA0 * matX0 = matB0
cv::solve(matA0, matB0, matX0, cv::DECOMP_QR);
float pa = matX0.at<float>(0, 0);
float pb = matX0.at<float>(1, 0);
float pc = matX0.at<float>(2, 0);
float pd = 1;
float ps = sqrt(pa * pa + pb * pb + pc * pc);
pa /= ps;
pb /= ps;
pc /= ps;
pd /= ps;
bool planeValid = true;
for (int j = 0; j < 5; j++)
{
if (fabs(pa * laserCloudSurfFromMap->points[pointSearchInd[j]].x +
pb * laserCloudSurfFromMap->points[pointSearchInd[j]].y +
pc * laserCloudSurfFromMap->points[pointSearchInd[j]].z + pd) > 0.2)
{
planeValid = false;
break;
}
}
if (planeValid)
{
float pd2 = pa * pointSel.x + pb * pointSel.y + pc * pointSel.z + pd;
//unused
pointProj = pointSel;
pointProj.x -= pa * pd2;
pointProj.y -= pb * pd2;
pointProj.z -= pc * pd2;
float s = 1 - 0.9 * fabs(pd2) / sqrt(sqrt(pointSel.x * pointSel.x
+ pointSel.y * pointSel.y + pointSel.z * pointSel.z));
coeff.x = s * pa;
coeff.y = s * pb;
coeff.z = s * pc;
coeff.intensity = s * pd2;
if (s > 0.1)
{
laserCloudOri->push_back(pointOri);
coeffSel->push_back(coeff);
}
}
}
}
float srx = sin(transformTobeMapped[0]);
float crx = cos(transformTobeMapped[0]);
float sry = sin(transformTobeMapped[1]);
float cry = cos(transformTobeMapped[1]);
float srz = sin(transformTobeMapped[2]);
float crz = cos(transformTobeMapped[2]);
int laserCloudSelNum = laserCloudOri->points.size();
if (laserCloudSelNum < 50)
{//skip this iteration if there are too few feature points
continue;
}
cv::Mat matA(laserCloudSelNum, 6, CV_32F, cv::Scalar::all(0));
cv::Mat matAt(6, laserCloudSelNum, CV_32F, cv::Scalar::all(0));
cv::Mat matAtA(6, 6, CV_32F, cv::Scalar::all(0));
cv::Mat matB(laserCloudSelNum, 1, CV_32F, cv::Scalar::all(0));
cv::Mat matAtB(6, 1, CV_32F, cv::Scalar::all(0));
cv::Mat matX(6, 1, CV_32F, cv::Scalar::all(0));
for (int i = 0; i < laserCloudSelNum; i++)
{
pointOri = laserCloudOri->points[i];
coeff = coeffSel->points[i];
float arx = (crx*sry*srz*pointOri.x + crx*crz*sry*pointOri.y - srx*sry*pointOri.z) * coeff.x
+ (-srx*srz*pointOri.x - crz*srx*pointOri.y - crx*pointOri.z) * coeff.y
+ (crx*cry*srz*pointOri.x + crx*cry*crz*pointOri.y - cry*srx*pointOri.z) * coeff.z;
float ary = ((cry*srx*srz - crz*sry)*pointOri.x
+ (sry*srz + cry*crz*srx)*pointOri.y + crx*cry*pointOri.z) * coeff.x
+ ((-cry*crz - srx*sry*srz)*pointOri.x
+ (cry*srz - crz*srx*sry)*pointOri.y - crx*sry*pointOri.z) * coeff.z;
float arz = ((crz*srx*sry - cry*srz)*pointOri.x + (-cry*crz-srx*sry*srz)*pointOri.y)*coeff.x
+ (crx*crz*pointOri.x - crx*srz*pointOri.y) * coeff.y
+ ((sry*srz + cry*crz*srx)*pointOri.x + (crz*sry-cry*srx*srz)*pointOri.y)*coeff.z;
matA.at<float>(i, 0) = arx;
matA.at<float>(i, 1) = ary;
matA.at<float>(i, 2) = arz;
matA.at<float>(i, 3) = coeff.x;
matA.at<float>(i, 4) = coeff.y;
matA.at<float>(i, 5) = coeff.z;
matB.at<float>(i, 0) = -coeff.intensity;
}
cv::transpose(matA, matAt);
matAtA = matAt * matA;
matAtB = matAt * matB;
cv::solve(matAtA, matAtB, matX, cv::DECOMP_QR);
//degenerate-scene detection and handling
if (iterCount == 0)
{
cv::Mat matE(1, 6, CV_32F, cv::Scalar::all(0));
cv::Mat matV(6, 6, CV_32F, cv::Scalar::all(0));
cv::Mat matV2(6, 6, CV_32F, cv::Scalar::all(0));
cv::eigen(matAtA, matE, matV);
matV.copyTo(matV2);
isDegenerate = false;
float eignThre[6] = {100, 100, 100, 100, 100, 100};
for (int i = 5; i >= 0; i--)
{
if (matE.at<float>(0, i) < eignThre[i])
{
for (int j = 0; j < 6; j++)
{
matV2.at<float>(i, j) = 0;
}
isDegenerate = true;
}
else
{
break;
}
}
matP = matV.inv() * matV2;
}
if (isDegenerate)
{
cv::Mat matX2(6, 1, CV_32F, cv::Scalar::all(0));
matX.copyTo(matX2);
matX = matP * matX2;
}
//accumulate the per-iteration increments
transformTobeMapped[0] += matX.at<float>(0, 0);
transformTobeMapped[1] += matX.at<float>(1, 0);
transformTobeMapped[2] += matX.at<float>(2, 0);
transformTobeMapped[3] += matX.at<float>(3, 0);
transformTobeMapped[4] += matX.at<float>(4, 0);
transformTobeMapped[5] += matX.at<float>(5, 0);
float deltaR = sqrt(
pow(rad2deg(matX.at<float>(0, 0)), 2) +
pow(rad2deg(matX.at<float>(1, 0)), 2) +
pow(rad2deg(matX.at<float>(2, 0)), 2));
float deltaT = sqrt(
pow(matX.at<float>(3, 0) * 100, 2) +
pow(matX.at<float>(4, 0) * 100, 2) +
pow(matX.at<float>(5, 0) * 100, 2));
//stop iterating once the rotation and translation updates are small enough
if (deltaR < 0.05 && deltaT < 0.05)
{
break;
}
}
//after the iterations, update the related transform matrices
transformUpdate();
}