A Summary of Matrix Multiplication Operations in Python and PyTorch

tanhongweibest

Last modified 2023-12-14 15:36:21

Tags: python, pytorch, matrix

First published 2023-12-14 10:55:48

Article link: https://blog.csdn.net/tanhongweibest/article/details/134989097

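Matrix multiplication is used constantly in data science, and Python offers several ways to do it. This post is a brief summary.
1. Matrix multiplication with the numpy library in Python
(1) Matrix multiplication
Method 1: np.matmul()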

import numpy as np
x=np.array([[1, 2, 3], [4, 5, 6]])
y=np.array([[1, 2], [3, 4], [5, 6]])
np.matmul(x,y)

Method 2: the @ operator

x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([[1, 2], [3, 4], [5, 6]])
x@y

Method 3: np.dot(A, B), which performs matrix multiplication on 2-D arrays (and the dot product on 1-D arrays)

x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([[1, 2], [3, 4], [5, 6]])
np.dot(x,y)

All three of the above can also compute the inner product of two vectors, for example:

x = np.array([1,2,3])
y = np.array([4,5,6])
np.dot(x,y)#32
x@y#32
np.matmul(x,y)#32

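As you can see, the three give the same result for 1-D and 2-D arrays. They are not identical in general, though: for arrays with more than two dimensions np.matmul() and @ broadcast over the leading (batch) axes while np.dot() does not, and np.matmul() and @ reject scalar operands. The NumPy documentation recommends the first two for matrix multiplication; the third is the usual choice for inner products.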
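
The three operations agree on 1-D and 2-D inputs, so the difference only shows up on stacked (3-D) arrays. The following is a minimal sketch of that case; the array names and shapes are chosen purely for illustration.

```python
import numpy as np

# Two stacks of matrices: two 3x4 matrices and two 4x5 matrices.
a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# matmul (and @) treat the leading axis as a batch axis and multiply pairwise.
batched = np.matmul(a, b)
print(batched.shape)  # (2, 3, 5)

# dot contracts the last axis of a with the second-to-last axis of b
# for every combination of the remaining axes.
crossed = np.dot(a, b)
print(crossed.shape)  # (2, 3, 2, 5)
```

For 2-D inputs both calls reduce to the ordinary matrix product, which is why the earlier examples give identical results.
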
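(2) Hadamard product
The Hadamard product multiplies matrices element by element. It is used a lot in practice, for example in the convolution operation of convolutional neural networks (CNNs), where each output value is the elementwise product of the kernel with an input patch followed by a sum (this differs from convolution in the strict mathematical sense). Note that the operands must have the same shape or be broadcastable to a common shape.
Method 1: np.multiply()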

x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([[-1, 2, 3], [-4, 5, 16]])
np.multiply(x,y)

Method 2: the * operator

x = np.array([[1, 2, 3], [4, 5, 6]])
y = np.array([[-1, 2, 3], [-4, 5, 16]])
x*y

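This approach also handles multiplying a matrix or vector by a scalar, because the * operator broadcasts; the @ operator cannot be used this way. For example: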

x = np.array([[1, 2, 3], [4, 5, 6]])
3*x
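
To make the broadcasting behaviour concrete, here is a small sketch that multiplies the same matrix by a row vector with * and then tries a scalar with @. The variable names (`row`, `scaled`, `scalar_matmul_error`) are illustrative only.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])           # shape (3,), broadcast across both rows

# Elementwise product after broadcasting row to shape (2, 3).
scaled = x * row
print(scaled.tolist())  # [[10, 40, 90], [40, 100, 180]]

# A scalar operand is rejected by @, so the error type is recorded instead.
scalar_matmul_error = None
try:
    3 @ x
except (TypeError, ValueError) as err:
    scalar_matmul_error = type(err).__name__
print(scalar_matmul_error)
```
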

Note that in the following case the elementwise product gives the same result as matrix multiplication.

x = np.array([1,2,3])
y = np.array([4,5,6])
np.dot(x[:,None],y[None,:])
x[:,None]*y[None,:]

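The two results agree only because the * operator broadcasts the 3*1 and 1*3 operands to 3*3 before multiplying. Be careful to keep the two apart: the second expression is a Hadamard product, not a matrix product.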
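
As a rough illustration of why the two coincide, the sketch below makes the broadcast step explicit with np.broadcast_arrays and compares the result with the matrix product and with np.outer. The names `col`, `row`, `hadamard`, `matrix_product` and `outer` are introduced here for illustration.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Broadcasting stretches the (3, 1) column and the (1, 3) row to (3, 3) each.
col, row = np.broadcast_arrays(x[:, None], y[None, :])
print(col.tolist())  # [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(row.tolist())  # [[4, 5, 6], [4, 5, 6], [4, 5, 6]]

hadamard = col * row                        # elementwise product of the stretched arrays
matrix_product = x[:, None] @ y[None, :]    # (3, 1) @ (1, 3) matrix product
outer = np.outer(x, y)                      # outer product of the two vectors

print(hadamard.tolist())  # [[4, 5, 6], [8, 10, 12], [12, 15, 18]]
```

All three arrays hold the same values; they simply arrive there by different routes.
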
Incidentally, a matrix can sometimes be built from two vectors in the following way:

x = np.array([1,2,3])
y = np.array([4,5,6])
x[:,None]*y[None,:]

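x[:,None] forms a 3*1 matrix, i.e. a column vector; y[None,:] forms a 1*3 matrix, i.e. a row vector; their product is a 3*3 matrix. This pattern is used a great deal in feature embeddings for neural networks, such as sinusoidal positional embedding and Fourier feature mapping (embedding).

The basic idea is as follows. x=[1,2,3] becomes x[:,None]=[[1],[2],[3]], going from 1-D to 2-D (3*1), which can be read as 3 samples with one feature each. y[None,:] likewise goes from 1-D to 2-D with shape 1*3, which can be read as 3 features. x[:,None]*y[None,:] then has shape 3*3: 3 samples (x) with 3 features each (one feature per column), which amounts to embedding x into a higher-dimensional feature space. In sinusoidal positional embedding, y holds the positional features; in Fourier feature embedding, y is drawn from a Gaussian distribution, which gives a Gaussian kernel embedding. This is the principle behind these feature embeddings: at its core it is just a simple Hadamard product with broadcasting, yet it is very useful.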
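
The embedding idea can be sketched in a few lines. This is only an illustration under stated assumptions: the helper name `sinusoidal_embedding`, its parameters `dim` and `base`, the choice of frequencies, and the decision to concatenate the sine and cosine halves (rather than interleave them) are not taken from the text above.

```python
import numpy as np

def sinusoidal_embedding(positions, dim, base=10000.0):
    """Hypothetical helper: embed 1-D positions into `dim` features (dim must be even)."""
    # One frequency per pair of output features (assumed frequency schedule).
    freqs = 1.0 / base ** (np.arange(0, dim, 2) / dim)   # shape (dim/2,)
    # Broadcast product: (n, 1) * (1, dim/2) -> (n, dim/2)
    angles = positions[:, None] * freqs[None, :]
    # Sine and cosine halves are concatenated along the feature axis.
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

positions = np.arange(4, dtype=float)    # 4 samples, one scalar each
emb = sinusoidal_embedding(positions, dim=6)
print(emb.shape)  # (4, 6)
```

Here `positions` plays the role of x and `freqs` the role of y: each scalar position is spread across several feature columns by the broadcast product before the nonlinearity is applied.
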

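2. Matrix multiplication in PyTorch
(1) Matrix multiplication
Method 1: torch.matmul(), which performs matrix multiplication as well as vector inner products, and supports broadcasting.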

import torch
x = torch.tensor([[1, 2, 3], [4, 5, 6]])
y = torch.tensor([[-1, 2],[ 3, -4], [5, 16]])
torch.matmul(x,y)# also available as the method x.matmul(y)

Method 2: the @ operator, which can also compute inner products

x = torch.tensor([[1, 2, 3], [4, 5, 6]])
y = torch.tensor([[-1, 2],[ 3, -4], [5, 16]])
x@y

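Method 3: torch.mm(), which performs matrix multiplication. Unlike torch.matmul(), it does not support broadcasting: both inputs must be 2-D and satisfy the usual rule that the number of columns of the first equals the number of rows of the second, with the result taking the rows of the first and the columns of the second.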

x = torch.tensor([[1, 2, 3], [4, 5, 6]])
y = torch.tensor([[-1, 2],[ 3, -4], [5, 16]])
torch.mm(x,y)# also available as the method x.mm(y)
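
The difference in broadcasting between torch.matmul() and torch.mm() can be seen on a batched input. The sketch below uses made-up shapes and the illustrative names `batch`, `w` and `mm_failed`.

```python
import torch

batch = torch.ones(10, 3, 4)   # a stack of ten 3x4 matrices
w = torch.ones(4, 5)           # a single 4x5 matrix

# matmul broadcasts w across the batch dimension.
out = torch.matmul(batch, w)
print(out.shape)  # torch.Size([10, 3, 5])

# mm accepts only 2-D tensors, so the same call is rejected.
mm_failed = False
try:
    torch.mm(batch, w)
except RuntimeError:
    mm_failed = True
print(mm_failed)  # True
```
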

Method 4: torch.mv(), which multiplies a matrix (m) by a vector (v) and does not support broadcasting.

x = torch.tensor([[1, 2, 3], [4, 5, 6]])# shape 2*3
y = torch.tensor([3,4,5])# 1-D vector of length 3
torch.mv(x,y)# also available as the method x.mv(y)
# output: tensor([26, 62])

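Note that the shape rule for matrix products cannot be used to work out the dimensions of torch.mv(x,y): the result is a 1-D vector of length 2, not a 2*1 matrix. Strictly speaking, then, this is not a matrix multiplication.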
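
A short sketch of the shape difference, comparing torch.mv() with torch.mm() applied to the same vector reshaped into a column. The names `as_vector` and `as_column` are illustrative.

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
y = torch.tensor([3, 4, 5])                # shape (3,)

as_vector = torch.mv(x, y)                 # 1-D result
as_column = torch.mm(x, y[:, None])        # y reshaped to (3, 1), 2-D result

print(as_vector.shape)  # torch.Size([2])
print(as_column.shape)  # torch.Size([2, 1])
```

The values are the same in both cases; only the number of dimensions differs.
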
Method 5: torch.dot(), which only computes the inner product of two 1-D vectors and cannot perform matrix multiplication. This differs from numpy's np.dot().

x = torch.tensor([1,3,6])
y = torch.tensor([3,4,9])
torch.dot(x,y)#x.dot(y)

In fact, torch.matmul() (Method 1) and the @ operator can both compute inner products, but in PyTorch torch.dot() is the usual choice for that.
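
The contrast with numpy can be checked directly. The following sketch passes the same 2-D data to np.dot() and torch.dot(), then compares the three PyTorch inner-product options on 1-D tensors; the variable names are illustrative.

```python
import numpy as np
import torch

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]

# np.dot on 2-D inputs is a matrix product.
np_result = np.dot(np.array(a), np.array(b))
print(np_result.tolist())  # [[19, 22], [43, 50]]

# torch.dot only accepts 1-D tensors, so 2-D inputs raise an error.
dot_failed = False
try:
    torch.dot(torch.tensor(a), torch.tensor(b))
except RuntimeError:
    dot_failed = True
print(dot_failed)  # True

# For 1-D tensors, dot, matmul and @ agree.
u = torch.tensor([1, 3, 6])
v = torch.tensor([3, 4, 9])
inner = (torch.dot(u, v).item(), torch.matmul(u, v).item(), (u @ v).item())
print(inner)  # (69, 69, 69)
```
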
(2) Hadamard product
Method 1: torch.mul(), which is equivalent to torch.multiply().

x = torch.tensor([1,3,6])
y = torch.tensor([3,4,9])
torch.mul(x,y)# also available as the method x.mul(y)

Method 2: the * operator.

x = torch.tensor([1,3,6])
y = torch.tensor([3,4,9])
x*y

As in numpy, this also handles multiplying a vector or matrix by a scalar, for example:

3*torch.tensor([1,3,6])
3*torch.tensor([[1, 2, 3], [4, 5, 6]])
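
As a quick check of the equivalences described in this subsection, the sketch below compares the four spellings of the elementwise product and shows one broadcast case. The names `m`, `products` and `broadcasted` are illustrative.

```python
import torch

x = torch.tensor([1, 3, 6])
y = torch.tensor([3, 4, 9])

# Four spellings of the same elementwise product.
products = [torch.mul(x, y), torch.multiply(x, y), x * y, x.mul(y)]
print(products[0].tolist())  # [3, 12, 54]

# Broadcasting: a (2, 3) matrix times a length-3 vector, row by row.
m = torch.tensor([[1, 2, 3], [4, 5, 6]])
broadcasted = m * x
print(broadcasted.tolist())  # [[1, 6, 18], [4, 15, 36]]
```
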

3. Summary

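To sum up, Python (numpy) has three ways to do matrix multiplication: np.matmul(), the @ operator, and np.dot(); and two ways to compute the Hadamard product: np.multiply() and the * operator. PyTorch has five matrix multiplication operations: torch.matmul(), @, torch.dot() (inner products only), torch.mm(), and torch.mv() (matrix-vector products only); and three ways to compute the Hadamard product: torch.mul(), torch.multiply(), and the * operator, where the first two are equivalent. Compared with numpy, matrix multiplication and the Hadamard product in PyTorch work in largely the same way, and the Hadamard product operations are identical; for matrix multiplication, PyTorch adds torch.mm() and torch.mv(). Note that all of the PyTorch multiplication functions can also be called as tensor methods.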