ICP(Iterative Closest Point)算法推導
前言
本篇整理自深藍學院三維點雲處理課程的Lecture 9 – Registration。推導過程中大量地用到了矩陣跡(trace)及Frobenius norm的性質,建議與矩陣的跡(matrix trace)的特性以及證明及Frobenius norm的特性以及證明對照參看。
算法流程
- 尋找(最近鄰)點對,拒絕當中距離過大者
- 估計旋轉矩陣 R R R和平移向量 t t t,使得 轉換後的原始點雲 與 目標點雲 距離最小化(此即Procrustes Transformation Problem)
- 檢查 損失函數 f f f是否夠小 或 R , t R,t R,t的變換量是否夠小,若非,則再次執行上述所有步驟
損失函數
兩個已配對的 m m m維空間中的點集, A A A和 B B B中各有 N N N個點:
A
=
[
a
1
,
.
.
.
a
N
]
∈
R
m
×
N
A = [a_1,...a_N] \in \R^{m\times N}
A=[a1,...aN]∈Rm×N
B
=
[
b
1
,
.
.
.
b
N
]
∈
R
m
×
N
B = [b_1,...b_N] \in \R^{m\times N}
B=[b1,...bN]∈Rm×N
並定義如下矩陣和向量:
1
=
[
1
,
.
.
.
,
1
]
T
∈
R
N
×
1
\bold{1} = [1,...,1]^T \in \R^{N \times 1}
1=[1,...,1]T∈RN×1
R
∈
R
m
×
m
m維空間中的旋轉矩陣
\begin{aligned}R \in \R^{m \times m} && \text{m維空間中的旋轉矩陣}\end{aligned}
R∈Rm×mm維空間中的旋轉矩陣
t
∈
R
m
×
1
m維空間中的平移向量
\begin{aligned}t \in \R^{m \times 1} && \text{m維空間中的平移向量}\end{aligned}
t∈Rm×1m維空間中的平移向量
損失函數為:
f ( R , t ) = 1 N ∑ i = 1 N ∥ b i − R a i − t ∥ 2 = ∥ B − ( R A + t 1 T ) ∥ F 2 f(R,t) = \frac{1}{N} \sum\limits_{i=1}^N \|b_i - Ra_i-t\|^2 = \|B-(RA+t\bold{1^T})\|_F^2 f(R,t)=N1i=1∑N∥bi−Rai−t∥2=∥B−(RA+t1T)∥F2
(最後一項好像少了 1 N \frac{1}{N} N1?)
這裡的目標是尋找一組 R R R和 t t t使得 f f f最小化。
求解t
化簡
f ( R , t ) = ∥ B − ( R A + t 1 T ) ∥ F 2 = ∥ ( B − R A ) + ( − t 1 T ) ∥ F 2 = ∥ B − R A ∥ F 2 + ∥ − t 1 T ∥ F 2 + 2 ⟨ B − R A , − t 1 T ⟩ F Frobenius decomposition \begin{aligned}f(R,t) &= \|B-(RA+t\bold{1^T})\|_F^2 \\&= \|(B-RA)+(-t\bold{1^T})\|^2_F \\&= \|B-RA\|^2_F + \|-t\bold{1^T}\|^2_F + 2\langle B-RA,-t\bold{1^T}\rangle_F && \text{Frobenius decomposition} \end{aligned} f(R,t)=∥B−(RA+t1T)∥F2=∥(B−RA)+(−t1T)∥F2=∥B−RA∥F2+∥−t1T∥F2+2⟨B−RA,−t1T⟩FFrobenius decomposition
單獨化簡第二項:
∥ − t 1 T ∥ F 2 = ∥ t 1 T ∥ F 2 = ∥ t [ 1 1 . . . 1 ] T T ∥ F 2 = ∥ [ t t . . . t ] ∥ F 2 = tr ( [ t t . . . t ] T [ t t . . . t ] ) Forbenius norm: ∥ A ∥ F = tr ( A A T ) = tr ( A T A ) = tr ( [ t T t t T t . . . t T t t T t t T t . . . t T t . . . t T t t T t . . . t T t ] ) = N t T t \begin{aligned}\|-t\bold{1^T}\|^2_F &= \|t\bold{1^T}\|^2_F \\&= \|t{\begin{bmatrix}1 & 1 & ... & 1 \end{bmatrix}^T}^T\|^2_F \\&= \|\begin{bmatrix}t & t & ... & t \end{bmatrix}\|_F^2 \\&= \text{tr}(\begin{bmatrix}t & t & ... & t \end{bmatrix}^T \begin{bmatrix}t & t & ... & t \end{bmatrix}) && \text{Forbenius norm:}\|A\|_F = \sqrt{\text{tr}(A A^T)} = \sqrt{\text{tr}(A^T A)} \\&= \text{tr}(\begin{bmatrix}t^Tt & t^Tt & ... & t^Tt \\ t^Tt & t^Tt & ... & t^Tt \\ ... \\t^Tt & t^Tt & ... & t^Tt \\ \end{bmatrix})\\&= Nt^Tt\end{aligned} ∥−t1T∥F2=∥t1T∥F2=∥t[11...1]TT∥F2=∥[tt...t]∥F2=tr([tt...t]T[tt...t])=tr(⎣⎢⎢⎡tTttTt...tTttTttTttTt.........tTttTttTt⎦⎥⎥⎤)=NtTtForbenius norm:∥A∥F=tr(AAT)=tr(ATA)
代回損失函數:
f ( R , t ) = ∥ B − R A ∥ F 2 + ∥ − t 1 T ∥ F 2 + 2 ⟨ B − R A , − t 1 T ⟩ F = ∥ B − R A ∥ F 2 + N t T t − 2 ⟨ B − R A , t 1 T ⟩ F = ∥ B − R A ∥ F 2 + N t T t − 2 tr ( ( B − R A ) 1 t T ) Frobenius inner product \begin{aligned}f(R,t) &= \|B-RA\|^2_F + \|-t\bold{1^T}\|^2_F + 2\langle B-RA,-t\bold{1^T}\rangle_F \\&= \|B-RA\|^2_F + Nt^Tt - 2\langle B-RA,t\bold{1^T}\rangle_F \\&= \|B-RA\|^2_F + Nt^Tt -2\text{tr}((B-RA)\bold{1}t^T) && \text{Frobenius inner product} \end{aligned} f(R,t)=∥B−RA∥F2+∥−t1T∥F2+2⟨B−RA,−t1T⟩F=∥B−RA∥F2+NtTt−2⟨B−RA,t1T⟩F=∥B−RA∥F2+NtTt−2tr((B−RA)1tT)Frobenius inner product
令導數為0
將 f f f對 t t t微分:
∂ f ∂ t = ∂ ( ∥ B − R A ∥ F 2 + N t T t − 2 tr ( ( B − R A ) 1 t T ) ) ∂ t = ∂ ∥ B − R A ∥ F 2 ∂ t + ∂ N t T t ∂ t − 2 ∂ tr ( ( B − R A ) 1 t T ) ∂ t = 0 + ∂ N t T t ∂ t − 2 ∂ tr ( ( B − R A ) 1 t T ) ∂ t = 2 N t − 2 ∂ tr ( ( B − R A ) 1 t T ) ∂ t ∂ x T x ∂ x = 2 x = 2 N t − 2 ( B − R A ) 1 ∂ tr ( A X T ) ∂ X = A \begin{aligned}\frac{\partial f}{\partial t} &=\frac{\partial (\|B-RA\|^2_F + Nt^Tt -2\text{tr}((B-RA)\bold{1}t^T))}{\partial t} \\&=\frac{\partial \|B-RA\|^2_F}{\partial t} + \frac{\partial Nt^Tt}{\partial t} -2\frac{\partial \text{tr}((B-RA)\bold{1}t^T)}{\partial t} \\&= 0 + \frac{\partial Nt^Tt}{\partial t} -2\frac{\partial \text{tr}((B-RA)\bold{1}t^T)}{\partial t} \\&= 2Nt -2\frac{\partial \text{tr}((B-RA)\bold{1}t^T)}{\partial t} && \frac{\partial x^Tx}{\partial x} = 2x \\&= 2Nt -2(B-RA)\bold{1} && \frac{\partial \text{tr}(\bold{A}\bold{X}^T)}{\partial \bold{X}} = \bold{A}\end{aligned} ∂t∂f=∂t∂(∥B−RA∥F2+NtTt−2tr((B−RA)1tT))=∂t∂∥B−RA∥F2+∂t∂NtTt−2∂t∂tr((B−RA)1tT)=0+∂t∂NtTt−2∂t∂tr((B−RA)1tT)=2Nt−2∂t∂tr((B−RA)1tT)=2Nt−2(B−RA)1∂x∂xTx=2x∂X∂tr(AXT)=A
令 ∂ f ∂ t = 2 N t − 2 ( B − R A ) 1 \frac{\partial f}{\partial t} = 2Nt -2(B-RA)\bold{1} ∂t∂f=2Nt−2(B−RA)1為0。
可以得知取 t = 1 N ( B − R A ) 1 t = \frac{1}{N}(B-RA) \bold{1} t=N1(B−RA)1時能使得 f f f最小化。
註一: ∂ x T x ∂ x \frac{\partial x^Tx}{\partial x} ∂x∂xTx可參考x^Tx對x微分公式證明。
註二: ∂ tr ( A X T ) ∂ X = A \frac{\partial \text{tr}(\bold{A}\bold{X}^T)}{\partial \bold{X}} = \bold{A} ∂X∂tr(AXT)=A可參考兩矩陣乘積的跡對矩陣微分公式證明。
求解R
回代t
之前化簡得到 f ( R , t ) = ∥ B − R A ∥ F 2 + N t T t − 2 tr ( ( B − R A ) 1 t T ) f(R,t)= \|B-RA\|^2_F + Nt^Tt -2\text{tr}((B-RA)\bold{1}t^T) f(R,t)=∥B−RA∥F2+NtTt−2tr((B−RA)1tT)。
將剛剛計算出來的 t = 1 N ( B − R A ) 1 t = \frac{1}{N}(B-RA) \bold{1} t=N1(B−RA)1代入第二項:
N t T t = N ( 1 N 1 T ( B − R A ) T ) ( 1 N ( B − R A ) 1 ) = 1 N 1 T ( B − R A ) T ( B − R A ) 1 = 1 N [ 1 . . . 1 ] [ ( b 1 − R a 1 ) T . . . ( b N − R a N ) T ] [ ( b 1 − R a 1 ) . . . ( b N − R a N ) ] [ 1 . . . 1 ] = 1 N ( ( b 1 − R a 1 ) T + . . . + ( b N − R a N ) T ) ( ( b 1 − R a 1 ) + . . . + ( b N − R a N ) ) 矩陣乘法的結合律 = 1 N ∑ i = 1 N ( b i − R a i ) T ∑ i = 1 N ( b i − R a i ) = 1 N N ( μ b − R μ a ) T N ( μ b − R μ a ) = N ( μ b − R μ a ) T ( μ b − R μ a ) = N ∥ μ b − R μ a ∥ 2 \begin{aligned}Nt^Tt &= N(\frac{1}{N}\bold{1^T}(B-RA)^T)(\frac{1}{N}(B-RA)\bold{1}) \\&= \frac{1}{N}\bold{1^T}(B-RA)^T(B-RA)\bold{1} \\ &= \frac{1}{N}\begin{bmatrix}1 & ... & 1\end{bmatrix}\begin{bmatrix}(b_1 - Ra_1)^T \\ ... \\ (b_N - Ra_N)^T\end{bmatrix}\begin{bmatrix}(b_1 - Ra_1) & ... & (b_N - Ra_N)\end{bmatrix}\begin{bmatrix}1 \\ ... \\ 1\end{bmatrix} \\&= \frac{1}{N}((b_1 - Ra_1)^T + ... + (b_N - Ra_N)^T)((b_1 - Ra_1) + ... + (b_N - Ra_N)) && \text{矩陣乘法的結合律} \\&= \frac{1}{N}\sum\limits_{i=1}^N(b_i-Ra_i)^T\sum\limits_{i=1}^N(b_i-Ra_i) \\&= \frac{1}{N}N(\mu_b-R\mu_a)^TN(\mu_b-R\mu_a) \\&= N(\mu_b-R\mu_a)^T(\mu_b-R\mu_a) \\&= N \|\mu_b-R\mu_a\|^2\end{aligned} NtTt=N(N11T(B−RA)T)(N1(B−RA)1)=N11T(B−RA)T(B−RA)1=N1[1...1]⎣⎡(b1−Ra1)T...(bN−RaN)T⎦⎤[(b1−Ra1)...(bN−RaN)]⎣⎡1...1⎦⎤=N1((b1−Ra1)T+...+(bN−RaN)T)((b1−Ra1)+...+(bN−RaN))=N1i=1∑N(bi−Rai)Ti=1∑N(bi−Rai)=N1N(μb−Rμa)TN(μb−Rμa)=N(μb−Rμa)T(μb−Rμa)=N∥μb−Rμa∥2矩陣乘法的結合律
將
t
=
1
N
(
B
−
R
A
)
1
t = \frac{1}{N}(B-RA) \bold{1}
t=N1(B−RA)1代入第三項:
2
tr
(
(
B
−
R
A
)
1
t
T
)
=
2
tr
(
(
B
−
R
A
)
1
1
N
1
T
(
B
−
R
A
)
T
)
=
2
N
tr
(
[
b
1
−
R
a
1
.
.
.
b
N
−
R
a
N
]
[
1
.
.
.
1
]
[
1
.
.
.
1
]
[
(
b
1
−
R
a
1
)
T
.
.
.
(
b
N
−
R
a
N
)
T
]
)
=
2
N
tr
(
(
(
b
1
−
R
a
1
)
+
.
.
.
+
(
b
N
−
R
a
N
)
)
(
(
b
1
−
R
a
1
)
T
+
.
.
.
+
(
b
N
−
R
a
N
)
T
)
)
=
2
N
tr
(
(
∑
i
=
1
N
b
i
−
R
a
i
)
(
∑
i
=
1
N
b
i
−
R
a
i
)
T
)
=
2
N
tr
(
N
(
μ
b
−
R
μ
a
)
N
(
μ
b
−
R
μ
a
)
T
)
=
2
N
tr
(
(
μ
b
−
R
μ
a
)
(
μ
b
−
R
μ
a
)
T
)
tr
(
c
A
)
=
c
tr
(
A
)
=
2
N
∥
μ
b
−
R
μ
a
∥
2
tr
(
v
1
×
v
2
)
=
v
1
⋅
v
2
\begin{aligned}2\text{tr}((B-RA)\bold{1}t^T) &= 2\text{tr}((B-RA)\bold{1}\frac{1}{N}\bold{1^T}(B-RA)^T) \\&= \frac{2}{N}\text{tr}(\begin{bmatrix}b_1-Ra_1 & ... & b_N-Ra_N\end{bmatrix}\begin{bmatrix}1 \\ ... \\1\end{bmatrix}\begin{bmatrix}1 & ... &1\end{bmatrix}\begin{bmatrix}(b_1-Ra_1)^T \\ ... \\ (b_N-Ra_N)^T\end{bmatrix}) \\ &= \frac{2}{N}\text{tr}(((b_1-Ra_1)+...+(b_N-Ra_N))((b_1-Ra_1)^T+...+(b_N-Ra_N)^T)) \\&= \frac{2}{N}\text{tr}((\sum\limits_{i=1}^Nb_i-Ra_i)(\sum\limits_{i=1}^Nb_i-Ra_i)^T) \\&= \frac{2}{N}\text{tr}(N(\mu_b-R\mu_a)N(\mu_b-R\mu_a)^T) \\&= 2N\text{tr}((\mu_b-R\mu_a)(\mu_b-R\mu_a)^T) && \text{tr}(c\bold{A}) = c \text{tr}(\bold{A})\\&= 2N\|\mu_b-R\mu_a\|^2 && \text{tr}(v_1 \times v_2) = v_1 \cdot v_2\end{aligned}
2tr((B−RA)1tT)=2tr((B−RA)1N11T(B−RA)T)=N2tr([b1−Ra1...bN−RaN]⎣⎡1...1⎦⎤[1...1]⎣⎡(b1−Ra1)T...(bN−RaN)T⎦⎤)=N2tr(((b1−Ra1)+...+(bN−RaN))((b1−Ra1)T+...+(bN−RaN)T))=N2tr((i=1∑Nbi−Rai)(i=1∑Nbi−Rai)T)=N2tr(N(μb−Rμa)N(μb−Rμa)T)=2Ntr((μb−Rμa)(μb−Rμa)T)=2N∥μb−Rμa∥2tr(cA)=ctr(A)tr(v1×v2)=v1⋅v2
將f轉化為x2+y2-2xy的形式
為了接下來推導方便,將第一項轉化成如下形式:
∥
B
−
R
A
∥
F
2
=
∑
i
=
1
m
∑
j
=
1
N
(
B
−
R
A
)
i
j
2
=
∑
j
=
1
N
∑
i
=
1
m
(
B
−
R
A
)
i
j
2
=
∑
j
=
1
N
∥
b
j
−
R
a
j
∥
2
=
∑
i
=
1
N
∥
b
i
−
R
a
i
∥
2
\begin{aligned}\|B-RA\|^2_F &= \sum\limits_{i=1}^m\sum\limits_{j=1}^N(B-RA)_{ij}^2 \\&= \sum\limits_{j=1}^N\sum\limits_{i=1}^m(B-RA)_{ij}^2 \\&= \sum\limits_{j=1}^N\|b_j-Ra_j\|^2 \\&= \sum\limits_{i=1}^N\|b_i-Ra_i\|^2\end{aligned}
∥B−RA∥F2=i=1∑mj=1∑N(B−RA)ij2=j=1∑Ni=1∑m(B−RA)ij2=j=1∑N∥bj−Raj∥2=i=1∑N∥bi−Rai∥2
第二項由 N ∥ μ b − R μ a ∥ 2 N \|\mu_b-R\mu_a\|^2 N∥μb−Rμa∥2化為 ∑ i = 1 N ∥ μ b − R μ a ∥ 2 \sum\limits_{i=1}^N\|\mu_b-R\mu_a\|^2 i=1∑N∥μb−Rμa∥2。
第三項由
2
N
∥
μ
b
−
R
μ
a
∥
2
2N\|\mu_b-R\mu_a\|^2
2N∥μb−Rμa∥2轉化為:
2
N
∥
μ
b
−
R
μ
a
∥
2
=
2
N
(
μ
b
−
R
μ
a
)
T
(
μ
b
−
R
μ
a
)
=
2
∑
i
=
1
N
(
b
i
−
R
a
i
)
T
(
μ
b
−
R
μ
a
)
\begin{aligned}2N\|\mu_b-R\mu_a\|^2 &= 2N(\mu_b-R\mu_a)^T(\mu_b-R\mu_a) \\&= 2\sum\limits_{i=1}^N(b_i-Ra_i)^T(\mu_b-R\mu_a)\end{aligned}
2N∥μb−Rμa∥2=2N(μb−Rμa)T(μb−Rμa)=2i=1∑N(bi−Rai)T(μb−Rμa)
至此, f ( R , t ) = ∑ i = 1 N ∥ b i − R a i ∥ 2 + ∑ i = 1 N ∥ μ b − R μ a ∥ 2 − 2 ∑ i = 1 N ( b i − R a i ) T ( μ b − R μ a ) f(R,t) = \sum\limits_{i=1}^N\|b_i-Ra_i\|^2 + \sum\limits_{i=1}^N\|\mu_b-R\mu_a\|^2 - 2\sum\limits_{i=1}^N(b_i-Ra_i)^T(\mu_b-R\mu_a) f(R,t)=i=1∑N∥bi−Rai∥2+i=1∑N∥μb−Rμa∥2−2i=1∑N(bi−Rai)T(μb−Rμa)
化簡並引入去除均值的點雲
f ( R , t ) = ∑ i = 1 N ∥ b i − R a i ∥ 2 + ∑ i = 1 N ∥ μ b − R μ a ∥ 2 − 2 ∑ i = 1 N ( b i − R a i ) T ( μ b − R μ a ) = ∑ i = 1 N ( ∥ b i − R a i ∥ 2 + ∥ μ b − R μ a ∥ 2 − 2 ( b i − R a i ) T ( μ b − R μ a ) ) = ∑ i = 1 N ( ∥ b i − R a i ∥ 2 + ∥ μ b − R μ a ∥ 2 − 2 tr ( ( b i − R a i ) T ( μ b − R μ a ) ) ) ( b i − R a i ) T ( μ b − R μ a ) 是純量 = ∑ i = 1 N ( ∥ b i − R a i ∥ 2 + ∥ μ b − R μ a ∥ 2 − 2 ⟨ b i − R a i , μ b − R μ a ⟩ F ) Frobenius inner product = ∑ i = 1 N ∥ ( b i − R a i ) − ( μ b − R μ a ) ∥ 2 反向套用Frobenius decomposition = ∑ i = 1 N ∥ ( b i − μ b ) − R ( a i − μ a ) ∥ 2 = ∑ i = 1 N ∥ b i ′ − R a i ′ ∥ 2 b i ′ ≜ b i − μ b , a i ′ ≜ a i − μ a = ∥ B ′ − R A ′ ∥ F 2 A ′ ≜ A ( I N − 1 N 1 1 T ) , B ′ ≜ B ( I N − 1 N 1 1 T ) \begin{aligned}f(R,t) &= \sum\limits_{i=1}^N\|b_i-Ra_i\|^2 + \sum\limits_{i=1}^N\|\mu_b-R\mu_a\|^2 - 2\sum\limits_{i=1}^N(b_i-Ra_i)^T(\mu_b-R\mu_a) \\&= \sum\limits_{i=1}^N(\|b_i-Ra_i\|^2 + \|\mu_b-R\mu_a\|^2 -2(b_i-Ra_i)^T(\mu_b-R\mu_a)) \\&= \sum\limits_{i=1}^N(\|b_i-Ra_i\|^2 + \|\mu_b-R\mu_a\|^2 -2\text{tr}((b_i-Ra_i)^T(\mu_b-R\mu_a))) && (b_i-Ra_i)^T(\mu_b-R\mu_a)\text{是純量}\\&= \sum\limits_{i=1}^N(\|b_i-Ra_i\|^2 + \|\mu_b-R\mu_a\|^2 -2\langle b_i-Ra_i,\mu_b-R\mu_a\rangle_F ) && \text{Frobenius inner product} \\&= \sum\limits_{i=1}^N\|(b_i-Ra_i)-(\mu_b-R\mu_a)\|^2 && \text{反向套用Frobenius decomposition} \\&= \sum\limits_{i=1}^N \|(b_i-\mu_b)-R(a_i-\mu_a)\|^2\\&= \sum\limits_{i=1}^N \|b_i'-Ra_i'\|^2 && b_i' \triangleq b_i-\mu_b, a_i' \triangleq a_i-\mu_a\\&= \|B'-RA'\|^2_F && A' \triangleq A(I_N - \frac{1}{N}\bold{1}\bold{1}^T), B' \triangleq B(I_N - \frac{1}{N}\bold{1}\bold{1}^T)\end{aligned} f(R,t)=i=1∑N∥bi−Rai∥2+i=1∑N∥μb−Rμa∥2−2i=1∑N(bi−Rai)T(μb−Rμa)=i=1∑N(∥bi−Rai∥2+∥μb−Rμa∥2−2(bi−Rai)T(μb−Rμa))=i=1∑N(∥bi−Rai∥2+∥μb−Rμa∥2−2tr((bi−Rai)T(μb−Rμa)))=i=1∑N(∥bi−Rai∥2+∥μb−Rμa∥2−2⟨bi−Rai,μb−Rμa⟩F)=i=1∑N∥(bi−Rai)−(μb−Rμa)∥2=i=1∑N∥(bi−μb)−R(ai−μa)∥2=i=1∑N∥bi′−Rai′∥2=∥B′−RA′∥F2(bi−Rai)T(μb−Rμa)是純量Frobenius inner product反向套用Frobenius decompositionbi′≜bi−μb,ai′≜ai−μaA′≜A(IN−N111T),B′≜B(IN−N111T)
關於 A ′ A' A′的推導:
A ′ = A ( I n − 1 N 1 1 T ) = A − 1 N A 1 1 T = [ A 11 A 12 . . . A 1 N . . . A m 1 A m 2 . . . A m N ] − 1 N [ A 11 A 12 . . . A 1 N . . . A m 1 A m 2 . . . A m N ] [ 1 1 . . . 1 1 1 . . . 1 . . . 1 1 . . . 1 ] = [ A 11 A 12 . . . A 1 N . . . A m 1 A m 2 . . . A m N ] − 1 N [ ∑ j = 1 N A 1 j ∑ j = 1 N A 1 j . . . ∑ j = 1 N A 1 j . . . ∑ j = 1 N A m j ∑ j = 1 N A m j . . . ∑ j = 1 N A m j ] = [ A 11 A 12 . . . A 1 N . . . A m 1 A m 2 . . . A m N ] − [ μ 1 μ 1 . . . μ 1 . . . μ m μ m . . . μ m ] μ i ≜ A第i個row的平均值 = [ A 11 − μ 1 A 12 − μ 1 . . . A 1 N − μ 1 . . . A m 1 − μ m A m 2 − μ m . . . A m N − μ m ] \begin{aligned}A' &= A(I_n - \frac{1}{N}\bold{1}\bold{1}^T) \\&= A - \frac{1}{N}A\bold{1}\bold{1}^T \\&= \begin{bmatrix}A_{11} & A_{12} & ... & A_{1N} \\ ... \\ A_{m1} & A_{m2} & ... & A_{mN}\end{bmatrix} - \frac{1}{N}\begin{bmatrix}A_{11} & A_{12} & ... & A_{1N} \\ ... \\ A_{m1} & A_{m2} & ... & A_{mN}\end{bmatrix}\begin{bmatrix}1 & 1 & ... & 1 \\ 1 & 1 & ... & 1 \\ ... \\1 & 1 & ... & 1\end{bmatrix} \\&= \begin{bmatrix}A_{11} & A_{12} & ... & A_{1N} \\ ... \\ A_{m1} & A_{m2} & ... & A_{mN}\end{bmatrix} - \frac{1}{N} \begin{bmatrix}\sum\limits_{j=1}^{N}A_{1j} & \sum\limits_{j=1}^{N}A_{1j} & ... & \sum\limits_{j=1}^{N}A_{1j} \\ ... \\ \sum\limits_{j=1}^{N}A_{mj} & \sum\limits_{j=1}^{N}A_{mj} & ... & \sum\limits_{j=1}^{N}A_{mj}\end{bmatrix} \\&= \begin{bmatrix}A_{11} & A_{12} & ... & A_{1N} \\ ... \\ A_{m1} & A_{m2} & ... & A_{mN}\end{bmatrix} - \begin{bmatrix}\mu_1 & \mu_1 & ... & \mu_1 \\ ... \\ \mu_m & \mu_m & ... & \mu_m\end{bmatrix} && \mu_i \triangleq \text{A第i個row的平均值} \\&= \begin{bmatrix}A_{11}-\mu_1 & A_{12}-\mu_1 & ... & A_{1N}-\mu_1 \\ ... \\ A_{m1}-\mu_m & A_{m2}-\mu_m & ... & A_{mN}-\mu_m\end{bmatrix} \end{aligned} A′=A(In−N111T)=A−N1A11T=⎣⎡A11...Am1A12Am2......A1NAmN⎦⎤−N1⎣⎡A11...Am1A12Am2......A1NAmN⎦⎤⎣⎢⎢⎡11...1111.........111⎦⎥⎥⎤=⎣⎡A11...Am1A12Am2......A1NAmN⎦⎤−N1⎣⎢⎢⎢⎢⎡j=1∑NA1j...j=1∑NAmjj=1∑NA1jj=1∑NAmj......j=1∑NA1jj=1∑NAmj⎦⎥⎥⎥⎥⎤=⎣⎡A11...Am1A12Am2......A1NAmN⎦⎤−⎣⎡μ1...μmμ1μm......μ1μm⎦⎤=⎣⎡A11−μ1...Am1−μmA12−μ1Am2−μm......A1N−μ1AmN−μm⎦⎤μi≜A第i個row的平均值
也就是在各個維度(1~m)去除均值的 A A A。
SVD矩陣分解
拆開並做SVD矩陣分解:
f ( R , t ) = ∥ B ′ − R A ′ ∥ F 2 = ∥ B ′ ∥ F 2 + ∥ R A ′ ∥ F 2 − 2 ⟨ B ′ , R A ′ ⟩ F Frobenius decomposition = ∥ B ′ ∥ F 2 + ∥ R A ′ ∥ F 2 − 2 tr ( R A ′ B ′ T ) ⟨ A , B ⟩ F = tr ( A B T ) = tr ( A T B ) = tr ( B A T ) = tr ( B T A ) = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 tr ( R A ′ B ′ T ) 旋轉矩陣R不會改變矩陣的norm = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 tr ( R V Σ U T ) SVD矩陣分解: B ′ A ′ T = U Σ V T , A ′ B ′ T = V Σ U T = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 tr ( U T R V Σ ) tr ( A B C D ) = tr ( B C D A ) = tr ( C D A B ) = tr ( D A B C ) = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 tr ( T Σ ) 在此令 T ≜ U T R V = ∥ B ′ ∥ F 2 + ∥ A ′ ∥ F 2 − 2 ∑ i = 1 m T i i σ i Σ 是 m × m 對角矩陣, Σ = [ σ 1 0 . . . 0 0 σ 2 . . . 0 0 0 . . . σ m ] \begin{aligned}f(R,t) &= \|B'-RA'\|^2_F \\&= \|B'\|^2_F+\|RA'\|^2_F - 2\langle B',RA'\rangle_F && \text{Frobenius decomposition}\\&=\|B'\|^2_F+\|RA'\|^2_F - 2\text{tr}(RA'B'^T) && \langle A,B\rangle_F = \text{tr}(AB^T) = \text{tr}(A^TB) = \text{tr}(BA^T) = \text{tr}(B^TA) \\&=\|B'\|^2_F+\|A'\|^2_F - 2\text{tr}(RA'B'^T) && \text{旋轉矩陣R不會改變矩陣的norm} \\&= \|B'\|^2_F+\|A'\|^2_F- 2\text{tr}(RV\Sigma U^T) && \text{SVD矩陣分解:}B'A'^T = U\Sigma V^T, A'B'^T = V\Sigma U^T \\&= \|B'\|^2_F+\|A'\|^2_F- 2\text{tr}(U^T RV\Sigma ) && \text{tr}(\bold{ABCD}) = \text{tr}(\bold{BCDA}) = \text{tr}(\bold{CDAB}) = \text{tr}(\bold{DABC}) \\ &= \|B'\|^2_F+\|A'\|^2_F - 2\text{tr}(T\Sigma) && \text{在此令}T \triangleq U^TRV \\&= \|B'\|^2_F+\|A'\|^2_F-2\sum\limits_{i=1}^mT_{ii}\sigma_i && \Sigma \text{是}m \times m \text{對角矩陣,}\Sigma = \begin{bmatrix}\sigma_1 & 0 & ... & 0 \\ 0 & \sigma_2 & ... & 0 \\ 0 & 0 & ... & \sigma_m\end{bmatrix}\end{aligned} f(R,t)=∥B′−RA′∥F2=∥B′∥F2+∥RA′∥F2−2⟨B′,RA′⟩F=∥B′∥F2+∥RA′∥F2−2tr(RA′B′T)=∥B′∥F2+∥A′∥F2−2tr(RA′B′T)=∥B′∥F2+∥A′∥F2−2tr(RVΣUT)=∥B′∥F2+∥A′∥F2−2tr(UTRVΣ)=∥B′∥F2+∥A′∥F2−2tr(TΣ)=∥B′∥F2+∥A′∥F2−2i=1∑mTiiσiFrobenius decomposition⟨A,B⟩F=tr(ABT)=tr(ATB)=tr(BAT)=tr(BTA)旋轉矩陣R不會改變矩陣的normSVD矩陣分解:B′A′T=UΣVT,A′B′T=VΣUTtr(ABCD)=tr(BCDA)=tr(CDAB)=tr(DABC)在此令T≜UTRVΣ是m×m對角矩陣,Σ=⎣⎡σ1000σ20.........00σm⎦⎤
尋找最優的R
欲使
f
(
R
,
t
)
=
∥
B
′
∥
F
2
+
∥
A
′
∥
F
2
−
2
∑
i
=
1
m
T
i
i
σ
i
f(R,t) = \|B'\|^2_F+\|A'\|^2_F-2\sum\limits_{i=1}^mT_{ii}\sigma_i
f(R,t)=∥B′∥F2+∥A′∥F2−2i=1∑mTiiσi最小化,等同於求解
max
R
,
t
(
∑
i
=
1
m
T
i
i
σ
i
)
\max\limits_{R,t}(\sum\limits^m_{i=1}T_{ii}\sigma_i)
R,tmax(i=1∑mTiiσi)。
因為
U
U
U,
R
R
R,
V
V
V皆為正交矩陣,故
T
≜
U
T
R
V
T \triangleq U^TRV
T≜UTRV亦為正交矩陣,所以有
T
T
T
=
I
m
TT^T = I_m
TTT=Im且
∣
T
i
j
∣
<
=
1
|T_{ij}| <= 1
∣Tij∣<=1(參考None element of orthogonal matrix can’t have unit modulus larger then 1)。
在 T = I m T=I_m T=Im時 T i i = 1 T_{ii}=1 Tii=1,所以 ∑ i − 1 m T i i σ i \sum\limits^m_{i-1}T_{ii}\sigma_i i−1∑mTiiσi有最大值,此時 T ≜ U T R V = I m T \triangleq U^TRV = I_m T≜UTRV=Im,這等價於取 R = U V T R = UV^T R=UVT。
所以 R = U V T R = UV^T R=UVT, t = 1 N ( B − R A ) 1 t = \frac{1}{N}(B-RA) \bold{1} t=N1(B−RA)1即為Procrustes Transformation Problem的最優解。