Common Matrix Identities in Robotics, Part II (Proofs of Identities 1~6)

wzf@robotics_notes

Last modified 2023-10-30 21:58:55

First published 2023-08-17 21:37:21

Article link: https://blog.csdn.net/woyaomaishu2/article/details/132351437

# Common Matrix Identities in Robotics (Proofs of [Identity 1] ~ [Identity 6])

I. Common Matrix Identities in Robotics, Part I (Summary)

II. Detailed Statements and Proofs

[Identity 1] Matrix Trace
${\rm tr}(\mathbf{A}\mathbf{B}) = {\rm tr}(\mathbf{B}\mathbf{A})$
where $\mathbf{A}$ is an $n\times m$ matrix, and $\mathbf{B}$ is an $m\times n$ matrix. ^[1]

Proof
${\rm tr}(\mathbf{A}\mathbf{B}) = \sum_{i=1}^{n} \sum_{j=1}^{m} A_{ij} B_{ji} = \sum_{i=1}^{n} \sum_{j=1}^{m} \underline{B_{ji} A_{ij}}$

${\rm tr}(\mathbf{B}\mathbf{A}) = \sum_{p=1}^{m} \sum_{q=1}^{n} B_{pq} A_{qp} = \underline{\sum_{q=1}^{n} \sum_{p=1}^{m}} B_{pq} A_{qp}$

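The underlined parts show that both double sums run over the same set of products (rename $q \to i$, $p \to j$ in the second expression), so comparing the two expressions gives
${\rm tr}(\mathbf{A}\mathbf{B}) = {\rm tr}(\mathbf{B}\mathbf{A})$
Q.E.D.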
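
As a quick numerical sanity check of the index argument above, the following is a minimal sketch in plain Python (lists of rows, so the sums mirror the proof). The sizes `n = 2`, `m = 3`, the random entries, and the helper names `matmul` and `trace` are illustrative assumptions, not part of the original text.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

random.seed(0)
n, m = 2, 3  # illustrative sizes: A is n x m, B is m x n
A = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]
B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]

tr_AB = trace(matmul(A, B))  # trace of an n x n matrix
tr_BA = trace(matmul(B, A))  # trace of an m x m matrix

# The two traces agree up to floating-point rounding,
# even though AB and BA have different sizes.
print(abs(tr_AB - tr_BA) < 1e-12)
```

Note that $\mathbf{A}\mathbf{B}$ is $n\times n$ while $\mathbf{B}\mathbf{A}$ is $m\times m$; only the traces coincide, not the matrices themselves.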

[Identity 2] Derivative of Vector with Respect to Vector

Let $\mathbf{x}$ and $\mathbf{y}$ be vectors of dimensions $n$ and $m$, respectively:
$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots\\ x_n\end{bmatrix}, \qquad \mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots\\ y_m\end{bmatrix}$
where $\mathbf{y}$ is a function of $\mathbf{x}$ , i.e., $\mathbf{y} = \mathbf{y}(\mathbf{x})$ .
In denominator-layout notation, the derivative of the vector $\mathbf{y}$ with respect to the vector $\mathbf{x}$ is the $n\times m$ matrix^[2]
$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} \triangleq \begin{bmatrix} \frac{\partial y_1}{\partial x_1} &\frac{\partial y_2}{\partial x_1} &\cdots &\frac{\partial y_m}{\partial x_1}\\ \frac{\partial y_1}{\partial x_2} &\frac{\partial y_2}{\partial x_2} &\cdots &\frac{\partial y_m}{\partial x_2}\\ \vdots &\vdots & \ddots &\vdots\\ \frac{\partial y_1}{\partial x_n} &\frac{\partial y_2}{\partial x_n} &\cdots &\frac{\partial y_m}{\partial x_n} \end{bmatrix}$

In numerator-layout notation, the derivative of the vector $\mathbf{y}$ with respect to the vector $\mathbf{x}$ is the $m\times n$ matrix^[6]
$\frac{\partial \mathbf{y}}{\partial \mathbf{x}} \triangleq \begin{bmatrix} \frac{\partial y_1}{\partial x_1} &\frac{\partial y_1}{\partial x_2} &\cdots &\frac{\partial y_1}{\partial x_n}\\ \frac{\partial y_2}{\partial x_1} &\frac{\partial y_2}{\partial x_2} &\cdots &\frac{\partial y_2}{\partial x_n}\\ \vdots &\vdots & \ddots &\vdots\\\frac{\partial y_m}{\partial x_1} &\frac{\partial y_m}{\partial x_2} &\cdots &\frac{\partial y_m}{\partial x_n}\end{bmatrix}$
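
To make the two layouts concrete, here is a minimal sketch that approximates both matrices by central differences. The linear map $\mathbf{y} = \mathbf{A}\mathbf{x}$, the specific $2\times 3$ matrix `A`, the point `x0`, and the helper names are assumptions chosen for illustration. For this map the numerator layout should recover $\mathbf{A}$ and the denominator layout $\mathbf{A}^{\rm T}$.

```python
# Hypothetical example: y(x) = A x with m = 2 outputs and n = 3 inputs.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]

def y_of_x(x):
    """Evaluate y = A x for a vector x given as a list."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def numerator_layout(f, x, h=1e-6):
    """m x n matrix whose (i, j) entry approximates d y_i / d x_j."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)  # central difference
    return J

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

x0 = [0.5, -1.0, 2.0]
J_num = numerator_layout(y_of_x, x0)  # m x n, approximately A
J_den = transpose(J_num)              # n x m, approximately A^T

print(len(J_num), len(J_num[0]))  # shape of the numerator-layout matrix
print(len(J_den), len(J_den[0]))  # shape of the denominator-layout matrix
```

The two layouts carry the same partial derivatives and differ only by a transpose, so it matters mainly to stay consistent within one derivation.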

[Identity 3] Derivative of Scalar with Respect to Matrix

Let $\mathbf{X}$ be an $m\times n$ matrix. For a scalar-valued function $f(\mathbf X)$ , the derivative $\partial f/\partial \mathbf X$ has the same size as $\mathbf X$ . That is^[1]
$\frac{\partial f}{\partial \mathbf X} \triangleq \begin{bmatrix} \frac{\partial f}{\partial X_{11}} & \frac{\partial f}{\partial X_{12}} &\cdots & \frac{\partial f}{\partial X_{1n}} \\ \frac{\partial f}{\partial X_{21}} & \frac{\partial f}{\partial X_{22}} &\cdots & \frac{\partial f}{\partial X_{2n}} \\ \vdots &\vdots &\ddots &\vdots\\ \frac{\partial f}{\partial X_{m1}} & \frac{\partial f}{\partial X_{m2}} &\cdots & \frac{\partial f}{\partial X_{mn}} \\ \end{bmatrix}$
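
As a small worked example of this definition (the function below is chosen purely for illustration and is not one of the listed identities), take the sum of squared entries of $\mathbf X$. Each entry $X_{ij}$ appears in exactly one term, so the entrywise derivative is immediate:
$f(\mathbf X) = \sum_{i=1}^{m}\sum_{j=1}^{n} X_{ij}^2, \qquad \frac{\partial f}{\partial X_{ij}} = 2X_{ij} \quad\Longrightarrow\quad \frac{\partial f}{\partial \mathbf X} = 2\mathbf X$
The result is again $m\times n$, the same size as $\mathbf X$.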

[Identity 4] Partial Derivative of a Matrix Trace of the First Order (1)
$\frac{\partial {\rm tr}{\mathbf X}{\mathbf Y}}{\partial \mathbf X} = \frac{\partial {\rm tr}{\mathbf Y} {\mathbf X}}{\partial \mathbf X} = {\mathbf Y}^{\small \rm T}$
where $\mathbf{X}$ is $m\times n$ and $\mathbf{Y}$ is $n\times m$ .^[3]

Proof

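By direct computation,
${\rm tr} \mathbf{X} \mathbf{Y} = \sum_{i=1}^{m} \sum_{j=1}^{n} X_{ij} Y_{ji}$
Only the term containing $X_{ij}$ depends on that entry. Writing the summation indices as $k$, $l$ to keep them distinct from the fixed indices $i$, $j$, we get
$\frac{\partial {\rm tr} \mathbf{X} \mathbf{Y}}{\partial X_{ij}} =\frac{\partial \sum_{k=1}^{m} \sum_{l=1}^{n} X_{kl} Y_{lk}}{\partial X_{ij}} = Y_{ji}$
Since $Y_{ji} = [\mathbf{Y}^{\small \rm T}]_{ij}$ , it follows from [Identity 3] that
$\frac{\partial {\rm tr}{\mathbf X}{\mathbf Y}}{\partial \mathbf X} = {\mathbf Y}^{\small \rm T}$
By “[Identity 1] Matrix Trace”, ${\rm tr}{\mathbf Y}{\mathbf X} = {\rm tr}{\mathbf X}{\mathbf Y}$ , hence
$\frac{\partial {\rm tr}{\mathbf Y} {\mathbf X}}{\partial \mathbf X} = {\mathbf Y}^{\small \rm T}$
Q.E.D.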
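
The result can be checked numerically. The sketch below, under the assumption of small random matrices (`m = 2`, `n = 3`) and a hypothetical helper `grad_fd` that builds the entrywise derivative by central differences, compares the gradients of ${\rm tr}\,\mathbf{X}\mathbf{Y}$ and ${\rm tr}\,\mathbf{Y}\mathbf{X}$ against $\mathbf{Y}^{\rm T}$.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def grad_fd(f, X, h=1e-6):
    """Matrix of the same shape as X whose (i, j) entry approximates d f / d X_ij."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            old = X[i][j]
            X[i][j] = old + h
            fp = f(X)
            X[i][j] = old - h
            fm = f(X)
            X[i][j] = old  # restore the perturbed entry
            G[i][j] = (fp - fm) / (2 * h)
    return G

def max_abs_diff(P, Q):
    """Largest entrywise absolute difference between two equal-shape matrices."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

random.seed(1)
m, n = 2, 3  # illustrative sizes: X is m x n, Y is n x m
X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
Y = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(n)]

G_xy = grad_fd(lambda M: trace(matmul(M, Y)), X)  # gradient of tr(XY)
G_yx = grad_fd(lambda M: trace(matmul(Y, M)), X)  # gradient of tr(YX)

err_xy = max_abs_diff(G_xy, transpose(Y))
err_yx = max_abs_diff(G_yx, transpose(Y))
print(err_xy < 1e-6, err_yx < 1e-6)
```

Because both traces are linear in $\mathbf{X}$, the central difference is exact up to rounding, so a loose tolerance such as `1e-6` is more than sufficient here.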

[Identity 5] Partial Derivative of a Matrix Trace of the First Order (2)
$\frac{\partial {\rm tr}{{\mathbf X}^{\small\rm T}} {\mathbf Y}}{\partial \mathbf X} = \frac{\partial {\rm tr}{\mathbf Y} {{\mathbf X}^{\small\rm T}} } {\partial \mathbf X} = {\mathbf Y}$
where $\mathbf{X}$ is $n\times m$ and $\mathbf{Y}$ is also $n\times m$ .

Proof

The proof is similar to that of “[Identity 4] Partial Derivative of a Matrix Trace”. By direct computation,
${\rm tr} {\mathbf{X}^{\small \rm T}} \mathbf{Y} = \sum_{i=1}^{m} \sum_{j=1}^{n} X_{ji} Y_{ji}$
Writing the summation indices as $k$, $l$ to keep them distinct from the fixed indices $i$, $j$ (with $1\le i\le n$ and $1\le j\le m$), we get
$\frac{\partial {\rm tr} \mathbf{X}^{\small \rm T} \mathbf{Y}}{\partial X_{ij}} =\frac{\partial \sum_{k=1}^{m} \sum_{l=1}^{n} X_{lk} Y_{lk}}{\partial X_{ij}} = Y_{ij}$
By [Identity 3], this gives
$\frac{\partial {\rm tr}{{\mathbf X}^{\small\rm T}} {\mathbf Y}}{\partial \mathbf X} = {\mathbf Y}$
Moreover, by “[Identity 1] Matrix Trace”,
$\frac{\partial {\rm tr}{{\mathbf X}^{\small\rm T}} {\mathbf Y}}{\partial \mathbf X} = \frac{\partial {\rm tr}{\mathbf Y} {{\mathbf X}^{\small\rm T}} } {\partial \mathbf X} = {\mathbf Y}$
Q.E.D.
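
As a small worked example (the $2\times 2$ size is chosen only for illustration), writing out the trace makes the result visible term by term:
${\rm tr}\,{\mathbf X}^{\small\rm T}{\mathbf Y} = X_{11}Y_{11} + X_{21}Y_{21} + X_{12}Y_{12} + X_{22}Y_{22}$
Each $X_{ij}$ multiplies exactly $Y_{ij}$, so differentiating entry by entry gives
$\frac{\partial {\rm tr}\,{\mathbf X}^{\small\rm T}{\mathbf Y}}{\partial \mathbf X} = \begin{bmatrix} Y_{11} & Y_{12}\\ Y_{21} & Y_{22} \end{bmatrix} = {\mathbf Y}$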

[Identity 6] Partial Derivative of a Matrix Trace of the Second Order (1)
$\frac{\partial {\rm tr}{\mathbf X}{\mathbf Z}{\mathbf X}^{\small \rm T}}{\partial \mathbf X} = {\mathbf X}{\mathbf Z}^{\small\rm T} + {\mathbf X}{\mathbf Z}$
where $\mathbf{X}$ is $m\times n$ , and ${\mathbf Z}$ is $n\times n$ .^[3]

If $\mathbf{Z}$ is symmetric, then
$\frac{\partial {\rm tr}{\mathbf X}{\mathbf Z}{\mathbf X}^{\small \rm T}}{\partial \mathbf X} = 2{\mathbf X}{\mathbf Z}$

Proof

Let ${\mathbf X}_{k}$ denote the $k$-th row of $\mathbf{X}$ . The $(k,k)$ entry (on the diagonal) of $\mathbf{X} \mathbf{Z} \mathbf{X}^{\small \rm T}$ is
$\begin{aligned} \left[\mathbf{X} \mathbf{Z} \mathbf{X}^{\small \rm T}\right]_{kk} =\mathbf{X}_{k} \mathbf{Z} \mathbf{X}_{k}^{\small \rm T} &= \begin{bmatrix} X_{k1} &X_{k2} &\cdots & X_{kn} \end{bmatrix} \begin{bmatrix} Z_{11} & Z_{12} &\cdots &Z_{1n}\\ Z_{21} & Z_{22} &\cdots &Z_{2n}\\ \vdots & \vdots &\ddots &\vdots\\ Z_{n1} & Z_{n2} &\cdots &Z_{nn} \end{bmatrix} \begin{bmatrix} X_{k1} \\X_{k2} \\ \vdots \\ X_{kn} \end{bmatrix}\\ &= \begin{bmatrix} \sum_{i=1}^{n} X_{ki} Z_{i1} &\sum_{i=1}^{n} X_{ki} Z_{i2} &\cdots & \sum_{i=1}^{n} X_{ki} Z_{in} \end{bmatrix}\begin{bmatrix} X_{k1} \\X_{k2} \\ \vdots \\ X_{kn} \end{bmatrix}\\ &= \sum_{j=1}^{n}\sum_{i=1}^{n} X_{ki} Z_{ij} X_{kj} \end{aligned}$
By the definition of the trace of a square matrix,
${\rm tr} \left( \mathbf{X} \mathbf{Z} \mathbf{X}^{\small \rm T}\right) = \sum_{k=1}^{m} \mathbf{X}_{k} \mathbf{Z} \mathbf{X}_{k}^{\small \rm T} = \sum_{k=1}^{m} \sum_{j=1}^{n}\sum_{i=1}^{n} X_{ki} Z_{ij} X_{kj}$

By [Identity 3], the $(p,q)$ entry of the derivative is
$\begin{aligned} \left[ \frac{\partial {\rm tr}{\mathbf X}{\mathbf Z}{\mathbf X}^{\small \rm T}}{\partial \mathbf X} \right]_{pq} &= \frac{\partial {\rm tr} \left( \mathbf{X} \mathbf{Z} \mathbf{X}^{\small \rm T}\right)}{\partial X_{pq}}\\ &= \frac{\partial{\sum_{k=1}^{m} \sum_{j=1}^{n}\sum_{i=1}^{n} X_{ki} Z_{ij} X_{kj}}}{\partial X_{pq}} \\ ({\rm product\ rule}\ [fg]'=f'g+fg') \qquad &= \sum_{j=1}^{n} Z_{qj}X_{pj} + \sum_{i=1}^{n} X_{pi} Z_{iq} \\ (Z_{qj}X_{pj} ={X_{pj} Z_{qj}}) \qquad &= \sum_{j=1}^{n}{X_{pj} Z_{qj}} + \sum_{i=1}^{n} X_{pi} Z_{iq} \\ \end{aligned}$

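The first sum is $[{\mathbf X}{\mathbf Z}^{\small\rm T}]_{pq}$ and the second is $[{\mathbf X}{\mathbf Z}]_{pq}$ , so
$\frac{\partial {\rm tr}{\mathbf X}{\mathbf Z}{\mathbf X}^{\small \rm T}}{\partial \mathbf X} = {\mathbf X}{\mathbf Z}^{\small\rm T}+ {\mathbf X}{\mathbf Z}$
Q.E.D.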
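
The second-order result, including the symmetric special case, can also be checked numerically. This is a minimal sketch assuming small random matrices (`m = 2`, `n = 3`); the helper `grad_fd` and the symmetrized matrix `Zs` are hypothetical names introduced here for illustration.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def grad_fd(f, X, h=1e-6):
    """Matrix of the same shape as X whose (p, q) entry approximates d f / d X_pq."""
    G = [[0.0] * len(X[0]) for _ in X]
    for p in range(len(X)):
        for q in range(len(X[0])):
            old = X[p][q]
            X[p][q] = old + h
            fp = f(X)
            X[p][q] = old - h
            fm = f(X)
            X[p][q] = old  # restore the perturbed entry
            G[p][q] = (fp - fm) / (2 * h)
    return G

def max_abs_diff(P, Q):
    """Largest entrywise absolute difference between two equal-shape matrices."""
    return max(abs(a - b) for rp, rq in zip(P, Q) for a, b in zip(rp, rq))

random.seed(2)
m, n = 2, 3  # illustrative sizes: X is m x n, Z is n x n
X = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
Z = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

def f_general(M):
    """tr(M Z M^T) for a general (non-symmetric) Z."""
    return trace(matmul(matmul(M, Z), transpose(M)))

XZt, XZ = matmul(X, transpose(Z)), matmul(X, Z)
expected = [[XZt[p][q] + XZ[p][q] for q in range(n)] for p in range(m)]
err_general = max_abs_diff(grad_fd(f_general, X), expected)

# Symmetric case: Zs = (Z + Z^T) / 2, where the gradient reduces to 2 X Zs.
Zs = [[0.5 * (Z[i][j] + Z[j][i]) for j in range(n)] for i in range(n)]

def f_symmetric(M):
    """tr(M Zs M^T) for the symmetric matrix Zs."""
    return trace(matmul(matmul(M, Zs), transpose(M)))

expected_sym = [[2 * v for v in row] for row in matmul(X, Zs)]
err_sym = max_abs_diff(grad_fd(f_symmetric, X), expected_sym)

print(err_general < 1e-6, err_sym < 1e-6)
```

The trace is quadratic in $\mathbf{X}$, so the central difference has no truncation error and the remaining discrepancy comes from floating-point rounding only.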