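Derivatives via the matrix inner product / matrix derivatives involving the Hadamard root / matrix element-wise square root / derivative of the element-wise square root of a matrix / derivative of the Frobenius norm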

Worked examples of matrix derivatives involving the Hadamard root are fairly rare, so this one is offered for reference only:


1 Problem

Given $\mathbf{X} \in \mathbb{R}^{n}$, $\mathbf{A} \in \mathbb{R}^{n \times n}$ and

$$f(\mathbf{X})=\sum_{i=1}^{n} \sqrt{|\mathbf{A} \mathbf{X}|_{i}^{2}+\delta^{2}},$$

where $\sqrt{(\cdot)}$ denotes the Hadamard root (element-wise square root), i.e. the square root taken entry by entry, find $f^{\prime}(\mathbf{X})$, i.e. $\frac{\partial f}{\partial \mathbf{X}}$.
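
As a concrete reading of the objective, here is a minimal NumPy sketch that evaluates $f$ for a small hand-picked input. The function name `f` and the sample values of `A`, `x` and `delta` are illustrative assumptions, not part of the original problem.

```python
import numpy as np

def f(A, x, delta):
    """Sum over i of sqrt((A x)_i^2 + delta^2), with the sqrt taken element-wise."""
    return np.sum(np.sqrt((A @ x) ** 2 + delta ** 2))

# Illustrative values: A x = [3, 3] and delta = 4, so each term is sqrt(9 + 16) = 5.
A = np.array([[3.0, 0.0],
              [0.0, 3.0]])
x = np.array([1.0, 1.0])
value = f(A, x, delta=4.0)
print(value)  # 10.0
```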

2 Solution

2.1 First remove the square root using the Hadamard product

Let

$$\mathbf{v}=\sqrt{|\mathbf{A} \mathbf{X}|^{2}+\delta^{2} \mathbf{1}}$$

$$\begin{aligned} \therefore \quad \mathbf{v} \odot \mathbf{v} &=|\mathbf{A} \mathbf{X}|^{2}+\delta^{2} \mathbf{1} \\ &=\mathbf{A} \mathbf{X} \odot \mathbf{A} \mathbf{X}+\delta^{2} \mathbf{1} \end{aligned}$$

By the differential rule for the Hadamard product, $d(\mathbf{X} \odot \mathbf{Y})=\mathbf{X} \odot d \mathbf{Y}+d \mathbf{X} \odot \mathbf{Y}$, we have:

$$\begin{aligned} d(\mathbf{v} \odot \mathbf{v}) &=\mathbf{v} \odot d \mathbf{v}+d \mathbf{v} \odot \mathbf{v} \\ &=\mathbf{v} \odot d \mathbf{v}+\mathbf{v} \odot d \mathbf{v} \\ &= 2\mathbf{v} \odot d \mathbf{v} \end{aligned}$$

That is:

$$\begin{aligned} 2 \mathbf{v} \odot d \mathbf{v} &=d\left(\mathbf{A} \mathbf{X} \odot \mathbf{A} \mathbf{X}+\delta^{2} \mathbf{1}\right) \\ &=d(\mathbf{A} \mathbf{X} \odot \mathbf{A} \mathbf{X})+d(\delta^{2} \mathbf{1}) \\ &=2 \mathbf{A} \mathbf{X} \odot d(\mathbf{A} \mathbf{X}) \quad (\delta^{2} \mathbf{1} \text{ is constant, so } d(\delta^{2} \mathbf{1})=\mathbf{0}) \\ &=2 \mathbf{A} \mathbf{X} \odot((d \mathbf{A}) \mathbf{X}+\mathbf{A} d \mathbf{X}) \\ &=2 \mathbf{A} \mathbf{X} \odot \mathbf{A} d \mathbf{X} \quad (\mathbf{A} \text{ is constant, so } d \mathbf{A}=\mathbf{0}) \end{aligned}$$

$$\therefore \quad d \mathbf{v}=\mathbf{A} \mathbf{X} \odot \mathbf{A} d \mathbf{X} \oslash \mathbf{v}$$

Here $\oslash$ denotes Hadamard division (element-wise division), which has properties similar to those of $\odot$. Every entry of $\mathbf{v}$ is positive when $\delta \neq 0$, so the division is well defined. Alternatively, let $\mathbf{p} \odot \mathbf{v} = \mathbf{1}$, i.e. $\mathbf{p}$ is the Hadamard inverse (element-wise inverse) of $\mathbf{v}$; then $d \mathbf{v}=\mathbf{A} \mathbf{X} \odot \mathbf{A} d \mathbf{X} \odot \mathbf{p}$.
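
The expression for $d\mathbf{v}$ can be spot-checked numerically. The sketch below compares it against a central finite difference along a random direction; the helper name `v_of`, the random test data, and the step size `eps` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))   # constant matrix
x = rng.standard_normal(n)        # point at which the differential is taken
dx = rng.standard_normal(n)       # arbitrary perturbation direction
delta = 0.5

def v_of(x):
    """Element-wise square root: v = sqrt((A x)^2 + delta^2)."""
    return np.sqrt((A @ x) ** 2 + delta ** 2)

v = v_of(x)

# Closed form: dv = (A x) ⊙ (A dx) ⊘ v, using element-wise * and /.
dv_formula = (A @ x) * (A @ dx) / v

# Same expression written with the Hadamard inverse p, where p ⊙ v = 1.
p = 1.0 / v
dv_with_p = (A @ x) * (A @ dx) * p

# Central finite difference of v along dx.
eps = 1e-6
dv_numeric = (v_of(x + eps * dx) - v_of(x - eps * dx)) / (2 * eps)

max_err = np.max(np.abs(dv_formula - dv_numeric))
print(max_err)
```

If the formula is right, `max_err` should be tiny compared with the entries of `dv_formula`, and the two closed forms should agree to rounding error.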

2.2 Obtain the result using the Frobenius inner product (matrix inner product) and properties of the trace

From the definition of the Frobenius inner product, $\mathbf{A}:\mathbf{B} = \operatorname{tr}(\mathbf{A}^{T}\mathbf{B})$, we obtain:

$$\begin{aligned} f &=\mathbf{1}: \mathbf{v} \\ \therefore \quad d f &= d(\mathbf{1}: \mathbf{v}) \\ &= d\mathbf{1}:\mathbf{v} + \mathbf{1}:d\mathbf{v} \quad (\text{property: } d(\mathbf{A}: \mathbf{B})=d \mathbf{A}: \mathbf{B}+\mathbf{A}: d \mathbf{B}) \\ &=\mathbf{1}: d \mathbf{v} \quad (\text{since } d\mathbf{1}=\mathbf{0}) \\ &=\mathbf{1}:(\mathbf{A} \mathbf{X} \odot \mathbf{A} d \mathbf{X} \oslash \mathbf{v}) \\ &=\mathbf{1}:(\mathbf{A} \mathbf{X} \oslash \mathbf{v} \odot \mathbf{A} d \mathbf{X} ) \quad (\text{property: } \mathbf{X} \odot \mathbf{Y} = \mathbf{Y} \odot \mathbf{X}) \\ &=(\mathbf{1}\odot(\mathbf{A} \mathbf{X} \oslash \mathbf{v})) : (\mathbf{A} d \mathbf{X} ) \quad (\text{property: } \mathbf{C}:(\mathbf{A} \odot \mathbf{B}) = (\mathbf{C} \odot \mathbf{A}):\mathbf{B})\\ &=(\mathbf{A} \mathbf{X} \oslash \mathbf{v}):(\mathbf{A} d \mathbf{X}) \\ &=\mathbf{A}^{T}(\mathbf{A} \mathbf{X} \oslash \mathbf{v}): d \mathbf{X} \quad (\text{property: } \mathbf{C} \mathbf{A} : \mathbf{B} = \mathbf{A} : \mathbf{C}^T\mathbf{B} = \mathbf{C} : \mathbf{B} \mathbf{A}^T) \\ &=\operatorname{tr}((\mathbf{A}^T (\mathbf{A} \mathbf{X} \oslash \mathbf{v} ))^T d \mathbf{X}) \quad (\text{definition of the matrix inner product})\\ \text{i.e.} \quad \frac{\partial f}{\partial \mathbf{X}} &=\mathbf{A}^{T}(\mathbf{A} \mathbf{X} \oslash \mathbf{v}) \quad \left(\text{property: } d f=\operatorname{tr}\left(\left(\frac{\partial f}{\partial \mathbf{X}}\right)^{T} d \mathbf{X}\right)\right) \end{aligned}$$
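
The two inner-product identities that carry the derivation, $\mathbf{C}:(\mathbf{A} \odot \mathbf{B}) = (\mathbf{C} \odot \mathbf{A}):\mathbf{B}$ and $\mathbf{C}\mathbf{A}:\mathbf{B} = \mathbf{A}:\mathbf{C}^T\mathbf{B} = \mathbf{C}:\mathbf{B}\mathbf{A}^T$, are easy to sanity-check on random matrices. The helper `frob` and the random $3 \times 3$ test matrices below are assumptions chosen only for illustration.

```python
import numpy as np

def frob(P, Q):
    """Frobenius inner product P : Q = tr(P^T Q)."""
    return np.trace(P.T @ Q)

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
C = rng.standard_normal((3, 3))

# tr(P^T Q) equals the sum of the element-wise products of P and Q.
same_as_sum = np.isclose(frob(A, B), np.sum(A * B))

# C : (A ⊙ B) = (C ⊙ A) : B
hadamard_lhs = frob(C, A * B)
hadamard_rhs = frob(C * A, B)

# CA : B = A : C^T B = C : B A^T
shift_1 = frob(C @ A, B)
shift_2 = frob(A, C.T @ B)
shift_3 = frob(C, B @ A.T)

print(same_as_sum, hadamard_lhs - hadamard_rhs, shift_1 - shift_2, shift_1 - shift_3)
```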
With the Hadamard inverse $\mathbf{p}$ defined in Section 2.1, the gradient can equally be written as:

$$\frac{\partial f}{\partial \mathbf{X}} =\mathbf{A}^{T}(\mathbf{A} \mathbf{X} \odot \mathbf{p})$$
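
A finite-difference comparison is a quick way to gain confidence in the closed-form gradient $\mathbf{A}^{T}(\mathbf{A}\mathbf{X} \oslash \mathbf{v})$. The sketch below is one such check; the function names `f` and `grad_f`, the random test data, and the step size `eps` are assumptions made for illustration.

```python
import numpy as np

def f(A, x, delta):
    """Objective: sum over i of sqrt((A x)_i^2 + delta^2)."""
    return np.sum(np.sqrt((A @ x) ** 2 + delta ** 2))

def grad_f(A, x, delta):
    """Closed-form gradient A^T (A x ⊘ v), with v = sqrt((A x)^2 + delta^2)."""
    Ax = A @ x
    v = np.sqrt(Ax ** 2 + delta ** 2)
    return A.T @ (Ax / v)

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
delta = 0.3

g = grad_f(A, x, delta)

# Central differences along each coordinate direction e_k.
eps = 1e-6
g_numeric = np.array([
    (f(A, x + eps * e, delta) - f(A, x - eps * e, delta)) / (2 * eps)
    for e in np.eye(n)
])

grad_err = np.max(np.abs(g - g_numeric))
print(grad_err)
```

Because $\delta \neq 0$ keeps every entry of $\mathbf{v}$ strictly positive, the objective is smooth and the two gradients should agree up to finite-difference error.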

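Summary:
First remove the square root using the Hadamard product, then use the relevant properties to obtain $d\mathbf{v}$, and finally use the matrix inner product and the properties of the trace to obtain the result.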

