Title: Lagrange Multipliers under the Rotation-Matrix Constraint: Mathematical Preliminaries for Deriving the Umeyama Algorithm (II)
I. Background
The problem addressed by the Umeyama algorithm [1]
Given two point sets $\{\mathbf{x}_i\}$ and $\{\mathbf{y}_i\}$ ($i = 1, 2, \ldots, n$) in $m$-dimensional space, find the similarity transformation parameters (rotation $\mathbf{R}$, translation $\mathbf{t}$, and scale $c$) that minimize the sum of squared errors.
The original paper opens with a lemma. In this lemma,
$\mathbf{A}$ and $\mathbf{B}$ are $m\times n$ matrices and $\mathbf{R}$ is an $m\times m$ rotation matrix. The task is to find the rotation matrix $\mathbf{R}$ that minimizes $\| \mathbf{A} - \mathbf{R}\mathbf{B} \|^2$.
In other words,
$$
\begin{aligned}
\min_{\mathbf{R}} \quad & \| \mathbf{A} - \mathbf{R}\mathbf{B} \|^2 \\
\text{s.t.} \quad & {\mathbf{R}^{\small\rm T} \mathbf{R}} = \mathbf{I} \\
& \det(\mathbf{R}) = 1
\end{aligned}
\tag{I-1}
$$
In this post, we construct the Lagrangian function and the Lagrange multipliers for this optimization problem with a rotation-matrix constraint.
II. Derivation
1. The method of Lagrange multipliers
Finding extrema with Lagrange multipliers [2]
To find the extrema of a function $f(x, y)$ subject to the condition $g(x, y) = 0$, first construct the Lagrangian function
$$L(x,y,\lambda) = f(x,y) + \lambda g(x,y)$$
where $\lambda$ is called the Lagrange multiplier. A necessary condition for a constrained extremum is that $L(x,y,\lambda)$ be stationary:
$$\frac{\partial L}{\partial x} = 0, \quad \frac{\partial L}{\partial y} = 0, \quad \frac{\partial L}{\partial \lambda} = 0$$
With two constraints $g_{1} (x, y)= 0$ and $g_2 (x, y)= 0$, the Lagrangian function constructed to find the extrema of $f(x, y)$ is
$$L(x,y,\lambda_1, \lambda_2) = f(x,y) + \lambda_1 g_1 (x,y) + \lambda_2 g_2 (x,y)$$
and the necessary conditions for an extremum become
$$\frac{\partial L}{\partial x} = 0, \quad \frac{\partial L}{\partial y} = 0, \quad \frac{\partial L}{\partial \lambda_1} = 0, \quad \frac{\partial L}{\partial \lambda_2} = 0$$
The pattern extends in the same way to more constraints.
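As a small illustration of the single-constraint recipe, the sketch below works through a toy problem that is not from the original post: minimize $f(x, y) = x^2 + y^2$ subject to $g(x, y) = x + y - 1 = 0$. For this choice the three stationarity conditions happen to be linear in $(x, y, \lambda)$, so they can be solved as a linear system; the names `K` and `rhs` are illustrative only.

```python
import numpy as np

# Toy problem (an assumption for illustration):
#   minimize f(x, y) = x^2 + y^2  subject to  g(x, y) = x + y - 1 = 0.
# Lagrangian: L = x^2 + y^2 + lam * (x + y - 1).
# Stationarity gives a linear system in (x, y, lam):
#   dL/dx   = 2x + lam  = 0
#   dL/dy   = 2y + lam  = 0
#   dL/dlam = x + y - 1 = 0
K = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
rhs = np.array([0.0, 0.0, 1.0])

x, y, lam = np.linalg.solve(K, rhs)
print(x, y, lam)  # the constrained minimizer and its multiplier
```

Solving by hand gives the same point: the first two equations force $x = y$, the constraint then gives $x = y = 1/2$, and $\lambda = -1$.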
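2. The rotation-matrix constraint
We start with the three-dimensional rotation matrix as an example. The rotation-matrix constraint in (I-1) splits into two sets of constraint equations.
A. Orthogonality constraint
The orthogonality constraint is
$$
{\mathbf{R}^{\small\rm T} \mathbf{R}} - \mathbf{I} = \mathbf{0} \tag{II-2-A-1}
$$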
In three dimensions, write
$$
\mathbf{R} \triangleq \begin{bmatrix} R_{11} & R_{12} &R_{13}\\ R_{21} & R_{22} &R_{23}\\ R_{31} & R_{32} &R_{33} \end{bmatrix} \tag{II-2-A-2}
$$
The orthogonality constraint then reads
$$
\begin{bmatrix} R_{11} & R_{12} &R_{13}\\ R_{21} & R_{22} &R_{23}\\ R_{31} & R_{32} &R_{33} \end{bmatrix}^{\small \rm T} \begin{bmatrix} R_{11} & R_{12} &R_{13}\\ R_{21} & R_{22} &R_{23}\\ R_{31} & R_{32} &R_{33} \end{bmatrix} - \begin{bmatrix} 1 &0 &0\\ 0 & 1 &0\\ 0 &0 &1 \end{bmatrix} =\mathbf{0} \tag{II-2-A-3}
$$
Carrying out the multiplication gives
$$
\begin{bmatrix} \sum_{i=1}^{3} R_{i1} R_{i1}-1 & \color{red}{\sum_{i=1}^{3} R_{i1} R_{i2}} & \color{green}{\sum_{i=1}^{3} R_{i1} R_{i3}}\\ \color{red}{\sum_{i=1}^{3} R_{i2} R_{i1}} & \sum_{i=1}^{3} R_{i2} R_{i2} -1 & \color{blue}{\sum_{i=1}^{3} R_{i2} R_{i3}}\\ \color{green}{\sum_{i=1}^{3} R_{i3} R_{i1}} & \color{blue}{\sum_{i=1}^{3} R_{i3} R_{i2}} & \sum_{i=1}^{3} R_{i3} R_{i3}-1 \end{bmatrix} = \mathbf{0} \tag{II-2-A-4}
$$
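If the symmetry of the expression above (the duplicated constraints) is ignored, the orthogonality constraint in three dimensions corresponds to 9 scalar constraint equations;
if the symmetry is taken into account, it corresponds to 6 scalar constraint equations.
Whether these two ways of handling the symmetric constraints are equivalent is the main question discussed below.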
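The count of 9 versus 6 can be checked numerically. The sketch below is only an illustration, not part of the original derivation: it evaluates $\mathbf{R}^{\rm T}\mathbf{R} - \mathbf{I}$ at an arbitrary $3\times 3$ matrix `M` (a stand-in name, not necessarily a rotation) and confirms that the result is symmetric, so the entries below the diagonal repeat the ones above it.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((3, 3))      # arbitrary 3x3 matrix, not necessarily a rotation

C = M.T @ M - np.eye(3)              # constraint residual R^T R - I evaluated at M
is_symmetric = np.allclose(C, C.T)   # entries (j, k) and (k, j) coincide

m = C.shape[0]
n_all = m * m                        # scalar equations if duplicates are kept
n_independent = m * (m + 1) // 2     # scalar equations once (j, k) and (k, j) are merged
print(is_symmetric, n_all, n_independent)
```

The same counting gives $m^2$ versus $m(m+1)/2$ scalar equations in $m$ dimensions.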
B. Determinant constraint
The determinant constraint is by itself a single scalar constraint equation
$$
\det(\mathbf{R}) - 1 = 0 \tag{II-2-B-1}
$$
and needs no further discussion.
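Although the determinant constraint needs no further manipulation, a quick numerical check shows what it adds on top of orthogonality. In the sketch below (an illustration with hand-picked matrices `Rz` and `F`, which are assumptions and not from the original post), both matrices satisfy $\mathbf{R}^{\rm T}\mathbf{R} = \mathbf{I}$, but only the rotation has determinant $+1$; the reflection has determinant $-1$ and is excluded by (II-2-B-1).

```python
import numpy as np

theta = 0.3
# Proper rotation about the z-axis.
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
# Reflection through the xy-plane: orthogonal, but not a rotation.
F = np.diag([1.0, 1.0, -1.0])

for name, Q in (("rotation", Rz), ("reflection", F)):
    orthogonal = np.allclose(Q.T @ Q, np.eye(3))
    print(name, orthogonal, round(float(np.linalg.det(Q)), 6))
```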
C. Constructing the Lagrangian function (3-D)
Ignoring the symmetry, we follow the method of Lagrange multipliers described above and construct the Lagrangian function for the three-dimensional rotation-matrix constraint:
$$
\begin{aligned} L = \| \mathbf{A} - \mathbf{R}\mathbf{B} \|^2 & + \lambda_{11} \left[\sum_{i=1}^{3} R_{i1} R_{i1}-1\right] + \lambda_{12} \left[ \sum_{i=1}^{3} R_{i1} R_{i2} \right] + \lambda_{13}\left[\sum_{i=1}^{3} R_{i1} R_{i3}\right] \\ & + \lambda_{21}\left[ \sum_{i=1}^{3} R_{i2} R_{i1} \right] +\lambda_{22} \left[ \sum_{i=1}^{3} R_{i2} R_{i2} -1 \right] + \lambda_{23}\left[ \sum_{i=1}^{3} R_{i2} R_{i3} \right]\\ &+ \lambda_{31}\left[ \sum_{i=1}^{3} R_{i3} R_{i1} \right] +\lambda_{32} \left[ \sum_{i=1}^{3} R_{i3} R_{i2} \right] + \lambda_{33}\left[ \sum_{i=1}^{3} R_{i3} R_{i3}-1 \right]\\ &+g\left[\det(\mathbf{R}) - 1\right] \end{aligned} \tag{II-2-C-1}
$$
where $(\lambda_{11}, \lambda_{12},\lambda_{13},\lambda_{21}, \lambda_{22},\lambda_{23},\lambda_{31}, \lambda_{32},\lambda_{33}, g)$ are all Lagrange multipliers. We write
$$
{\boldsymbol{\lambda}} \triangleq \begin{bmatrix} \lambda_{11} &\lambda_{12} &\lambda_{13}\\ \lambda_{21}& \lambda_{22} &\lambda_{23}\\ \lambda_{31}& \lambda_{32} &\lambda_{33} \end{bmatrix} \tag{II-2-C-2}
$$
Taking the symmetry into account, that is, merging the duplicated constraint terms, gives
$$
\begin{aligned} L = \| \mathbf{A} - \mathbf{R}\mathbf{B} \|^2 & + \lambda_{11} \left[\sum_{i=1}^{3} R_{i1} R_{i1}-1\right] + \lambda_{12}^{'} \left[ \sum_{i=1}^{3} R_{i1} R_{i2} \right] + \lambda_{13}^{'} \left[\sum_{i=1}^{3} R_{i1} R_{i3}\right] \\ & +\lambda_{22} \left[ \sum_{i=1}^{3} R_{i2} R_{i2} -1 \right] + \lambda_{23}^{'} \left[ \sum_{i=1}^{3} R_{i2} R_{i3} \right]\\ & + \lambda_{33}\left[ \sum_{i=1}^{3} R_{i3} R_{i3}-1 \right]\\ & + g\left[\det(\mathbf{R}) - 1\right] \end{aligned} \tag{II-2-C-3}
$$
where
$$
\begin{aligned}
\lambda_{12}^{'} &\triangleq \lambda_{12} +\lambda_{21}\\
\lambda_{13}^{'} &\triangleq \lambda_{13} +\lambda_{31}\\
\lambda_{23}^{'} &\triangleq \lambda_{23} +\lambda_{32}
\end{aligned}
\tag{II-2-C-4}
$$
We write
$$
{\boldsymbol{\lambda}}^{'} \triangleq \begin{bmatrix} \lambda_{11} &\lambda_{12}^{'} &\lambda_{13}^{'}\\ & \lambda_{22} &\lambda_{23}^{'}\\ & &\lambda_{33} \end{bmatrix} = \begin{bmatrix} \lambda_{11} &\lambda_{12} +\lambda_{21} &\lambda_{13} +\lambda_{31}\\ & \lambda_{22} &\lambda_{23} +\lambda_{32}\\ & &\lambda_{33} \end{bmatrix} \tag{II-2-C-5}
$$
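The merging step can be sanity-checked numerically. The sketch below is illustrative only, with randomly drawn values: it compares the orthogonality penalty written with nine multipliers against the one written with six merged multipliers $\lambda'_{jk} = \lambda_{jk} + \lambda_{kj}$ for $j < k$. The two agree at any trial matrix, not only at rotations, because $\mathbf{R}^{\rm T}\mathbf{R} - \mathbf{I}$ is always symmetric.

```python
import numpy as np

rng = np.random.default_rng(1)
R = rng.standard_normal((3, 3))        # any 3x3 trial matrix
lam = rng.standard_normal((3, 3))      # nine independent multipliers, as in (II-2-C-2)

C = R.T @ R - np.eye(3)                # symmetric constraint residual

# Nine-multiplier penalty: sum over all (j, k).
penalty_9 = np.sum(lam * C)

# Six-multiplier penalty: diagonal entries unchanged, each off-diagonal pair
# merged into lam'_jk = lam_jk + lam_kj for j < k.
lam_prime = np.triu(lam) + np.triu(lam.T, k=1)
penalty_6 = np.sum(lam_prime * C)

print(np.isclose(penalty_9, penalty_6))
```

Here `lam_prime` is upper triangular, matching the layout of $\boldsymbol{\lambda}'$.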
Following the method of Lagrange multipliers, the extremum conditions obtained from (II-2-C-1) are
$$
\left\{ \begin{array}{l} \frac{\partial L}{\partial \mathbf{R}} = \mathbf{0} \\ \frac{\partial L}{\partial \boldsymbol{\lambda}} = \begin{bmatrix} \sum_{i=1}^{3} R_{i1} R_{i1}-1 & \color{red}{\sum_{i=1}^{3} R_{i1} R_{i2}} & \color{green}{\sum_{i=1}^{3} R_{i1} R_{i3}}\\ \color{red}{\sum_{i=1}^{3} R_{i2} R_{i1}} & \sum_{i=1}^{3} R_{i2} R_{i2} -1 & \color{blue}{\sum_{i=1}^{3} R_{i2} R_{i3}}\\ \color{green}{\sum_{i=1}^{3} R_{i3} R_{i1}} & \color{blue}{\sum_{i=1}^{3} R_{i3} R_{i2}} & \sum_{i=1}^{3} R_{i3} R_{i3}-1 \end{bmatrix} = \mathbf{0} \\ \frac{\partial L}{\partial g} = \det(\mathbf{R}) - 1 = 0 \end{array} \right. \tag{II-2-C-6}
$$
Following the method of Lagrange multipliers, the extremum conditions obtained from (II-2-C-3) are
$$
\left\{ \begin{array}{l} \frac{\partial L}{\partial \mathbf{R}} = \mathbf{0} \\ \frac{\partial L}{\partial \boldsymbol{\lambda}^{'}} = \begin{bmatrix} \sum_{i=1}^{3} R_{i1} R_{i1}-1 & \color{red}{\sum_{i=1}^{3} R_{i1} R_{i2}} & \color{green}{\sum_{i=1}^{3} R_{i1} R_{i3}}\\ & \sum_{i=1}^{3} R_{i2} R_{i2} -1 & \color{blue}{\sum_{i=1}^{3} R_{i2} R_{i3}}\\ & & \sum_{i=1}^{3} R_{i3} R_{i3}-1 \end{bmatrix} = \mathbf{0} \\ \frac{\partial L}{\partial g} = \det(\mathbf{R}) - 1 = 0 \end{array} \right. \tag{II-2-C-7}
$$
Because of the symmetry, the system (II-2-C-6) is underdetermined: $\lambda_{jk}$ and $\lambda_{kj}$ multiply the same constraint, so the nine entries of $\boldsymbol{\lambda}$ cannot all be determined. Only the six quantities $(\lambda_{11} ,\,\lambda_{12} +\lambda_{21} ,\,\lambda_{13} +\lambda_{31},\, \lambda_{22},\, \lambda_{23} +\lambda_{32},\, \lambda_{33})$ can be, and these are exactly what (II-2-C-7) yields directly.
Hence, whether or not the symmetry is taken into account when assigning the Lagrange multipliers of the rotation-matrix constraint, the result is the same. That is, (II-2-C-6) and (II-2-C-7) are equivalent.
Since only the sums $(\lambda_{12} +\lambda_{21} ,\,\lambda_{13} +\lambda_{31},\, \lambda_{23} +\lambda_{32})$ can be determined, how each sum is split between, say, $\lambda_{12}$ and $\lambda_{21}$ has no effect on the result. For convenience, we therefore take the Lagrange multiplier matrix to be symmetric as well:
$$
{\boldsymbol{\lambda}} \triangleq \begin{bmatrix} \lambda_{11} &\lambda_{12} &\lambda_{13}\\ \lambda_{12}& \lambda_{22} &\lambda_{23}\\ \lambda_{13}& \lambda_{23} &\lambda_{33} \end{bmatrix} \tag{II-2-C-8}
$$
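Note:
Would it not be more convenient to define the multiplier matrix as in (II-2-C-5)?
Not really. We want to write the method in matrix form, and a symmetric matrix gives a compact expression and lets us use the properties of symmetric matrices.
So, for the three-dimensional rotation-matrix constraint, the Lagrangian function takes the form (II-2-C-1) and the Lagrange multipliers take the form (II-2-C-8).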
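One way to see why the symmetric choice loses nothing is the following supplementary sketch. The symbols $\mathbf{S}$, $\boldsymbol\lambda_s$, and $\boldsymbol\lambda_a$ are introduced only for this illustration and do not appear in the original derivation. Write $\mathbf{S} = \mathbf{R}^{\rm T}\mathbf{R} - \mathbf{I}$, which is symmetric, and split an arbitrary multiplier matrix into its symmetric and antisymmetric parts:

$$
\boldsymbol\lambda = \boldsymbol\lambda_s + \boldsymbol\lambda_a, \qquad
\boldsymbol\lambda_s = \tfrac{1}{2}\left(\boldsymbol\lambda + \boldsymbol\lambda^{\rm T}\right), \qquad
\boldsymbol\lambda_a = \tfrac{1}{2}\left(\boldsymbol\lambda - \boldsymbol\lambda^{\rm T}\right)
$$

Relabeling the summation indices and then using $(\lambda_a)_{kj} = -(\lambda_a)_{jk}$ together with $S_{kj} = S_{jk}$ gives

$$
\sum_{j,k} (\lambda_a)_{jk} S_{jk}
= \sum_{j,k} (\lambda_a)_{kj} S_{kj}
= -\sum_{j,k} (\lambda_a)_{jk} S_{jk}
\quad\Longrightarrow\quad
\sum_{j,k} (\lambda_a)_{jk} S_{jk} = 0
$$

so that

$$
\sum_{j,k} \lambda_{jk} S_{jk} = \sum_{j,k} (\lambda_s)_{jk} S_{jk}
$$

Only the symmetric part of the multiplier matrix ever enters the Lagrangian, which is consistent with choosing $\boldsymbol\lambda$ symmetric from the start.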
D. Matrix form of the Lagrangian function (3-D)
Definition of the standard inner product of matrices
The standard inner product between matrices [3] is
$$\langle \mathbf{X}, \mathbf{Y} \rangle = {\rm tr} (\mathbf{X}^{\small\rm T} \mathbf{Y}) = \sum_{i} \sum_{j} X_{ij} Y_{ij}$$
where $\mathbf{X}, \mathbf{Y} \in \mathbb{R}^{m\times n}$.
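The two expressions for the inner product can be compared directly. The sketch below is a numerical illustration with random matrices (the shapes are arbitrary choices): it computes the trace form and the elementwise-sum form and checks that they agree.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 4))
Y = rng.standard_normal((3, 4))

inner_trace = np.trace(X.T @ Y)        # tr(X^T Y)
inner_sum = np.sum(X * Y)              # sum_i sum_j X_ij Y_ij
print(np.isclose(inner_trace, inner_sum))
```

The elementwise form avoids building the full $n \times n$ product $\mathbf{X}^{\rm T}\mathbf{Y}$, which is why it is often the one used in code.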
The multiplier terms for the orthogonality constraint in (II-2-C-1) can be written as
$$
\begin{aligned} &{\begin{aligned} & \lambda_{11} \left[\sum_{i=1}^{3} R_{i1} R_{i1}-1\right] + \lambda_{12} \left[ \sum_{i=1}^{3} R_{i1} R_{i2} \right] + \lambda_{13}\left[\sum_{i=1}^{3} R_{i1} R_{i3}\right] \\ + & \lambda_{21}\left[ \sum_{i=1}^{3} R_{i2} R_{i1} \right] +\lambda_{22} \left[ \sum_{i=1}^{3} R_{i2} R_{i2} -1 \right] + \lambda_{23}\left[ \sum_{i=1}^{3} R_{i2} R_{i3} \right]\\ + & \lambda_{31}\left[ \sum_{i=1}^{3} R_{i3} R_{i1} \right] +\lambda_{32} \left[ \sum_{i=1}^{3} R_{i3} R_{i2} \right] + \lambda_{33}\left[ \sum_{i=1}^{3} R_{i3} R_{i3}-1 \right] \end{aligned}} \\ = \quad & \langle {\boldsymbol\lambda}, {\mathbf{R}^{\small\rm T} \mathbf{R}} - \mathbf{I} \rangle \\ = \quad &{\rm tr} \left[{\boldsymbol\lambda}^{\small\rm T}\left({\mathbf{R}^{\small\rm T} \mathbf{R}} - \mathbf{I}\right)\right] \\ = \quad &{\rm tr} \left[{\boldsymbol\lambda}\left({\mathbf{R}^{\small\rm T} \mathbf{R}} - \mathbf{I}\right)\right] \end{aligned} \tag{II-2-D-1}
$$
where $\boldsymbol{\lambda}$ is a symmetric matrix. Equation (II-2-C-1) can therefore be written compactly in matrix form as
$$
L = \| \mathbf{A} - \mathbf{R}\mathbf{B} \|^2 + {\rm tr} \left({\boldsymbol\lambda}\left({\mathbf{R}^{\small\rm T} \mathbf{R}} - \mathbf{I}\right)\right) + g\left[\det(\mathbf{R}) - 1\right] \tag{II-2-D-2}
$$
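To check that the matrix form really reproduces the scalar expansion, the sketch below evaluates both (II-2-C-1) and (II-2-D-2) at the same randomly chosen point. It is an illustration under stated assumptions: the norm is taken to be the Frobenius norm, the values of `A`, `B`, `R`, `lam`, and `g` are arbitrary, and `lam` is symmetrized as in (II-2-C-8).

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((3, 5))
R = rng.standard_normal((3, 3))          # arbitrary trial point, not a rotation
lam = rng.standard_normal((3, 3))
lam = 0.5 * (lam + lam.T)                # symmetric multiplier matrix
g = 0.7                                  # scalar multiplier for the determinant constraint

objective = np.linalg.norm(A - R @ B, "fro") ** 2   # assumes the Frobenius norm
det_term = g * (np.linalg.det(R) - 1.0)

# Scalar form (II-2-C-1): loop over all nine (j, k) pairs.
scalar_sum = 0.0
for j in range(3):
    for k in range(3):
        c_jk = sum(R[i, j] * R[i, k] for i in range(3)) - (1.0 if j == k else 0.0)
        scalar_sum += lam[j, k] * c_jk
L_scalar = objective + scalar_sum + det_term

# Matrix form (II-2-D-2).
L_matrix = objective + np.trace(lam @ (R.T @ R - np.eye(3))) + det_term

print(np.isclose(L_scalar, L_matrix))
```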
E. Constructing the Lagrangian function ($m$-D)
Following the three-dimensional derivation, the Lagrangian function for the $m$-dimensional case has the same form:
$$
L = \| \mathbf{A} - \mathbf{R}\mathbf{B} \|^2 + {\rm tr}\left({\boldsymbol\lambda}\left({\mathbf{R}^{\small\rm T} \mathbf{R}} - \mathbf{I}\right)\right) + g\left[\det(\mathbf{R}) - 1\right] \tag{II-2-E-1}
$$
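where $\mathbf{R}$ is an $m\times m$ rotation matrix, $\boldsymbol\lambda$ is an $m \times m$ symmetric matrix serving as the Lagrange multiplier matrix, and $g$ is a scalar Lagrange multiplier.
Equation (II-2-E-1) is the result this post set out to obtain: the Lagrangian function, together with its multipliers, for the rotation-matrix constraint.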
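A minimal sketch of (II-2-E-1) for general $m$ is given below. The function names `lagrangian` and `random_rotation` are hypothetical helpers written for this illustration, and the norm is again assumed to be the Frobenius norm. The check at the end uses a basic property of any Lagrangian: at a point that satisfies both constraints, the multiplier terms vanish and $L$ reduces to the objective, whatever values $\boldsymbol\lambda$ and $g$ take.

```python
import numpy as np

def lagrangian(R, A, B, lam, g):
    """Evaluate (II-2-E-1), taking the norm to be the Frobenius norm (an assumption)."""
    m = R.shape[0]
    fit = np.linalg.norm(A - R @ B, "fro") ** 2
    orth = np.trace(lam @ (R.T @ R - np.eye(m)))
    return fit + orth + g * (np.linalg.det(R) - 1.0)

def random_rotation(m, rng):
    """Hypothetical helper: a rotation from the QR factorization of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]               # flip one column so that det(Q) = +1
    return Q

rng = np.random.default_rng(4)
m, n = 4, 6
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, n))
R = random_rotation(m, rng)              # feasible point: R^T R = I and det(R) = 1
lam = rng.standard_normal((m, m))
lam = 0.5 * (lam + lam.T)                # symmetric multiplier matrix

value = lagrangian(R, A, B, lam, g=1.3)
fit_only = np.linalg.norm(A - R @ B, "fro") ** 2
print(np.isclose(value, fit_only))
```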
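3. Remarks
From the derivation above we can conclude:
For a matrix constraint that is symmetric, the corresponding Lagrange multiplier matrix can also be taken to be symmetric without loss of generality.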
References
[1] S. Umeyama, “Least-squares estimation of transformation parameters between two point patterns,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 4, pp. 376-380, April 1991, doi: 10.1109/34.88573.
[2] Li Dahua et al., 工科数学分析 (下) (Engineering Mathematical Analysis, Vol. 2), 2nd ed., Huazhong University of Science and Technology Press, 2004.
[3] A. A. Ahmadi, “Inner products and norms,” ORF 523 Lecture 2, Spring 2016, Princeton University, https://www.princeton.edu/~aaa/Public/Teaching/ORF523/S16/ORF523_S16_Lec2_gh.pdf