Frobenius product

Hadamard product
Main article: Hadamard product (matrices)
For two matrices of the same dimensions, the Hadamard product A ∘ B, also known as the element-wise product, pointwise product, entrywise product, and Schur product,[24] is the matrix of the same dimensions whose (i, j) entry is the product of the (i, j) entries of A and B:

$$ (A \circ B)_{ij} = A_{ij} B_{ij} $$

displayed fully:

$$
A \circ B =
\begin{pmatrix}
A_{11} & A_{12} & \cdots & A_{1m} \\
A_{21} & A_{22} & \cdots & A_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
A_{n1} & A_{n2} & \cdots & A_{nm}
\end{pmatrix}
\circ
\begin{pmatrix}
B_{11} & B_{12} & \cdots & B_{1m} \\
B_{21} & B_{22} & \cdots & B_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
B_{n1} & B_{n2} & \cdots & B_{nm}
\end{pmatrix}
=
\begin{pmatrix}
A_{11}B_{11} & A_{12}B_{12} & \cdots & A_{1m}B_{1m} \\
A_{21}B_{21} & A_{22}B_{22} & \cdots & A_{2m}B_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
A_{n1}B_{n1} & A_{n2}B_{n2} & \cdots & A_{nm}B_{nm}
\end{pmatrix}
$$

This operation amounts to mn independent ordinary multiplications performed at once; consequently the Hadamard product is commutative, associative, and distributive over entrywise addition. It is also a principal submatrix of the Kronecker product. It appears in lossy compression algorithms such as JPEG.
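As a quick sanity check (my own addition, not part of the excerpt above), NumPy's `*` operator on arrays is exactly the Hadamard product, and the commutativity and distributivity claims can be verified directly:

```python
import numpy as np

# Hadamard (elementwise) product of two matrices of the same dimensions.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])

H = A * B  # NumPy's `*` on arrays multiplies entry by entry
print(H)
# [[ 5. 12.]
#  [21. 32.]]

# Commutative and distributive over entrywise addition:
C = np.array([[1.0, 0.0], [0.0, 1.0]])
assert np.array_equal(A * B, B * A)
assert np.array_equal(A * (B + C), A * B + A * C)
```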

Frobenius product
The Frobenius inner product, sometimes denoted A : B, is the component-wise inner product of two matrices as though they are vectors. It is also the sum of the entries of the Hadamard product. Explicitly,

$$ A : B = \sum_{i,j} A_{ij} B_{ij} = \operatorname{vec}(A)^{\mathsf T} \operatorname{vec}(B) = \operatorname{tr}(A^{\mathsf T} B) = \operatorname{tr}(A B^{\mathsf T}), $$

where “tr” denotes the trace of a matrix and vec denotes vectorization. This inner product induces the Frobenius norm.
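The equivalent expressions above are easy to confirm numerically. A minimal NumPy sketch (the random matrices and seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 4))

frob = np.sum(A * B)             # sum of the entries of the Hadamard product
via_vec = A.ravel() @ B.ravel()  # vec(A)^T vec(B) (any consistent flattening)
via_tr1 = np.trace(A.T @ B)      # tr(A^T B)
via_tr2 = np.trace(A @ B.T)      # tr(A B^T)

assert np.allclose([via_vec, via_tr1, via_tr2], frob)
# Induced norm: A : A = ||A||_F^2
assert np.isclose(np.sum(A * A), np.linalg.norm(A, "fro") ** 2)
```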

$$ A : B = B : A $$

$$ A : B = A^{\mathsf T} : B^{\mathsf T} = B^{\mathsf T} : A^{\mathsf T} $$

$$ A : BC = B^{\mathsf T} A : C $$

$$ A : BC = A C^{\mathsf T} : B $$

$$ A : (B + C) = A : B + A : C $$
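These identities can all be checked on random matrices; note the shapes each one requires. A sketch (helper name `frob` and the shapes are my own choices):

```python
import numpy as np

def frob(X, Y):
    """Frobenius inner product X : Y = sum_ij X_ij Y_ij."""
    return np.sum(X * Y)

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))  # n x m
B = rng.standard_normal((3, 4))  # n x k, so B @ C is n x m like A
C = rng.standard_normal((4, 5))  # k x m

assert np.isclose(frob(A, B @ C), frob(B @ C, A))        # A : B = B : A
assert np.isclose(frob(A, B @ C), frob(A.T, (B @ C).T))  # A : B = A^T : B^T
assert np.isclose(frob(A, B @ C), frob(B.T @ A, C))      # A : BC = B^T A : C
assert np.isclose(frob(A, B @ C), frob(A @ C.T, B))      # A : BC = A C^T : B
```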

$$ \nabla(A : B) = \nabla A : B + A : \nabla B $$
(Proof:

$$
\begin{aligned}
\nabla(A : B) &= \nabla\Big(\sum_{i,j} A_{ij} B_{ij}\Big) \\
&= \sum_{i,j} \nabla(A_{ij} B_{ij}) \\
&= \sum_{i,j} \big(\nabla A_{ij}\, B_{ij} + A_{ij}\, \nabla B_{ij}\big) \\
&= \sum_{i,j} \nabla A_{ij}\, B_{ij} + \sum_{i,j} A_{ij}\, \nabla B_{ij} \\
&= \nabla A : B + A : \nabla B
\end{aligned}
$$

)

In addition, for derivatives we have

$$ \frac{d(A : X)}{dX} = \frac{d(X : A)}{dX} = \mathbf{1} \circ A = A $$

Proof:

$$ \frac{d(A : X)}{dX_{ij}} = \frac{d(X : A)}{dX_{ij}} = \frac{d\big(\sum_{k,l} A_{kl} X_{kl}\big)}{dX_{ij}} = A_{ij} \cdot 1 $$
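A finite-difference check of this gradient (my own sketch; the 3×3 shapes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 3))

def f(X):
    return np.sum(A * X)  # f(X) = A : X

# Central finite differences, one entry of X at a time
eps = 1e-6
G = np.zeros_like(X)
for i in range(3):
    for j in range(3):
        E = np.zeros_like(X)
        E[i, j] = eps
        G[i, j] = (f(X + E) - f(X - E)) / (2 * eps)

assert np.allclose(G, A, atol=1e-6)  # d(A : X)/dX = A
```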

Similarly,

$$ \frac{d(A : F(X))}{dX_{ij}} = \frac{d\big(\sum_{k,l} A_{kl} F_{kl}\big)}{dX_{ij}} = A_{ij} \cdot \frac{dF_{ij}}{dX_{ij}}, \qquad \text{so} \qquad \frac{d(A : F)}{dX} = A \circ \frac{dF}{dX} $$

Here both the differentiation and the final combination are elementwise; this step assumes F acts entrywise, i.e. F_ij depends only on X_ij.
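To illustrate, take an entrywise F of my own choosing, F(X)_ij = X_ij², whose entrywise derivative is 2X_ij, and compare against finite differences:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 3))
X = rng.standard_normal((2, 3))

# F acts entrywise: F(X)_ij = X_ij ** 2, so (dF/dX)_ij = 2 X_ij
def g(X):
    return np.sum(A * X**2)  # A : F(X)

eps = 1e-6
G = np.zeros_like(X)
for i in range(2):
    for j in range(3):
        E = np.zeros_like(X)
        E[i, j] = eps
        G[i, j] = (g(X + E) - g(X - E)) / (2 * eps)

assert np.allclose(G, A * (2 * X), atol=1e-5)  # d(A : F(X))/dX = A ∘ dF/dX
```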

For second derivatives, we derive only formula (110) from the Matrix Cookbook here; the others are similar:

$$ \frac{\partial}{\partial X} \operatorname{tr}(X X^{\mathsf T} B) = BX + B^{\mathsf T} X $$

The method: find a way to write each copy of X alone on one side of the ":". To distinguish the two copies of X, I denote them $X_1$ and $X_2$.

$$
\begin{aligned}
\nabla \operatorname{tr}(X_1 X_2^{\mathsf T} B) &= \nabla(X_1 : B^{\mathsf T} X_2) = \nabla X_1 : B^{\mathsf T} X_2 + B X_1 : \nabla X_2 \\
&= \mathbf{1} : B^{\mathsf T} X_2 + B X_1 : \mathbf{1} = B^{\mathsf T} X + B X
\end{aligned}
$$
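The result matches a finite-difference gradient of tr(XXᵀB) (a sketch of my own; shapes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 3))

def h(X):
    return np.trace(X @ X.T @ B)  # tr(X X^T B)

eps = 1e-6
G = np.zeros_like(X)
for i in range(3):
    for j in range(3):
        E = np.zeros_like(X)
        E[i, j] = eps
        G[i, j] = (h(X + E) - h(X - E)) / (2 * eps)

assert np.allclose(G, B @ X + B.T @ X, atol=1e-5)  # matches BX + B^T X
```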

Relation to the Frobenius norm:

$$ \|X\|_F^2 = \operatorname{tr}(X X^{\mathsf T}) = X : X $$
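A one-line check of this relation on a concrete matrix (my own example):

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])
n2 = np.linalg.norm(X, "fro") ** 2        # 1 + 4 + 9 + 16 = 30
assert np.isclose(n2, np.trace(X @ X.T))  # tr(X X^T)
assert np.isclose(n2, np.sum(X * X))      # X : X
```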

If f(X) is a scalar-valued function of the matrix X, then

$$ \nabla f(X) = \frac{\partial f(X)}{\partial X^{\mathsf T}} : \nabla X = \left(\frac{\partial f(X)}{\partial X}\right)^{\mathsf T} : \nabla X $$

Why the transpose? Because when differentiating with respect to a matrix or vector in this convention, the result is indexed transposed relative to the variable:

The derivative of a scalar function y of a matrix X of independent variables, with respect to X, is given (in numerator layout notation) by

$$
\frac{\partial y}{\partial X} =
\begin{bmatrix}
\frac{\partial y}{\partial x_{11}} & \frac{\partial y}{\partial x_{21}} & \cdots & \frac{\partial y}{\partial x_{p1}} \\
\frac{\partial y}{\partial x_{12}} & \frac{\partial y}{\partial x_{22}} & \cdots & \frac{\partial y}{\partial x_{p2}} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial y}{\partial x_{1q}} & \frac{\partial y}{\partial x_{2q}} & \cdots & \frac{\partial y}{\partial x_{pq}}
\end{bmatrix}.
$$

Notice that the indexing of the gradient with respect to X is transposed as compared with the indexing of X.
(quoted from https://en.wikipedia.org/wiki/Matrix_calculus)

So, to compute the derivative of A : B with respect to x, we only need to transform

$$ \nabla(A : B) = \nabla A : B + A : \nabla B $$

into the form

$$ \nabla(A : B) = \nabla A : B + A : \nabla B = \,? : \nabla x, $$

from which we can read off

$$ \frac{\partial f(X)}{\partial X} = \,?. $$

Finally, let us look at an example: f(x) is a function of x, and both f(x) and x are scalars, but they are connected through an m×n matrix A. Find $\frac{df(x)}{dx}$.


Later I consulted a book on tensors,

tensor.pdf

which introduces the "··" (double-dot) operator on page 30.

I think the result here can be written as

$$ \frac{df(x)}{dx} = \frac{\partial f(x)}{\partial A} \cdot\cdot\, \frac{\partial A}{\partial x} $$

Here is my own understanding of the "··" operator:

$$ A \cdot\cdot\, B = \operatorname{tr}(AB) $$
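Under this reading, the operator can be checked numerically; note the shape requirement it implies (the matrices below are my own example):

```python
import numpy as np

# Under the reading A ·· B = tr(A B), the product and trace are only
# defined when A is (m, n) and B is (n, m).
rng = np.random.default_rng(5)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2))

ddot = np.trace(A @ B)
# tr(A B) = sum_ik A_ik B_ki, i.e. the Frobenius product of A with B^T
assert np.isclose(ddot, np.sum(A * B.T))
```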

Update (24 Aug 2018):

$$ \frac{df(x)}{dx} = \frac{\partial f(x)}{\partial A} : \frac{\partial A}{\partial x} $$

Note that the dimensions of $\frac{\partial f(x)}{\partial A}$ should match those of A, and

$$ \frac{d(A : F(X))}{dX} = A \circ \frac{dF(X)}{dX} $$
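The scalar-matrix-scalar chain rule above can be tested end to end on a hypothetical chain of my own construction, x → A(x) = x·M → f(A) = Σ A², against a finite difference in x:

```python
import numpy as np

rng = np.random.default_rng(6)
M = rng.standard_normal((2, 3))

# Hypothetical chain: scalar x -> matrix A(x) = x * M -> scalar f(A) = sum(A**2)
def A_of(x):
    return x * M

def f_of(A):
    return np.sum(A**2)

x = 0.7
dfdA = 2 * A_of(x)           # ∂f/∂A, same shape as A
dAdx = M                     # ∂A/∂x, taken entrywise
chain = np.sum(dfdA * dAdx)  # ∂f/∂A : ∂A/∂x

eps = 1e-6
fd = (f_of(A_of(x + eps)) - f_of(A_of(x - eps))) / (2 * eps)
assert np.isclose(chain, fd, atol=1e-5)  # chain rule matches finite difference
```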
