PyTorch Basics: Similarities and Differences Between torch.mul, torch.mm, and torch.matmul

名字填充中

Category: Deep Learning Basics. Tags: pytorch, deep learning, python

Article link: https://blog.csdn.net/qq_42388742/article/details/120474434

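This article describes three PyTorch functions, torch.mul, torch.mm, and torch.matmul, which perform element-wise tensor multiplication, matrix multiplication, and flexible matrix products respectively. torch.mul multiplies a tensor by a scalar or by another tensor; torch.mm performs standard matrix multiplication and does not broadcast; torch.matmul supports a wider range of matrix products, including broadcasting over batch dimensions and several combinations of input dimensions. Example code shows how each function is used and what it returns.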


torch.mul

torch.mul(input, other, *, out=None) → Tensor

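Multiplies each element of input by the scalar other and returns a new tensor.
$\text{out}_i = \text{other} \times \text{input}_i$
input is a tensor, and other is the number that multiplies every element of input. The output is a tensor.

The original documentation note states that if input is of type FloatTensor or DoubleTensor, other should be a real number, and otherwise it should be an integer. Recent PyTorch releases apply type promotion instead, so an integer tensor multiplied by a Python float yields a floating-point tensor.

Example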

>>> a = torch.randn(3)
>>> a
tensor([ 0.2015, -0.4255,  2.6087])
>>> torch.mul(a, 100)
tensor([  20.1494,  -42.5491,  260.8663])
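
The dtype behaviour of scalar multiplication can be checked directly. The following is a minimal sketch, assuming a PyTorch release with type promotion (1.5 or later); the variable names are chosen here for illustration only.

```python
import torch

# Integer tensor times a Python int keeps the integer dtype.
ints = torch.tensor([1, 2, 3])
int_result = torch.mul(ints, 2)
print(int_result, int_result.dtype)

# Integer tensor times a Python float is promoted to the default float dtype.
float_result = torch.mul(ints, 1.5)
print(float_result, float_result.dtype)

# Float tensor times a Python int stays floating point.
a = torch.tensor([0.5, -1.0, 2.0])
scaled = torch.mul(a, 100)
print(scaled, scaled.dtype)
```

The scalar is applied to every element, and the result dtype follows PyTorch's promotion rules rather than the dtype of the scalar alone.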

torch.mul(input, other, *, out=None) → Tensor

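Each element of the tensor input is multiplied by the corresponding element of the tensor other, and the result is returned as a tensor.

The shapes of input and other must be broadcastable.
$\text{out}_i = \text{input}_i \times \text{other}_i$
Both input and other are tensors, and the return value is also a tensor.

Example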

>>> a = torch.randn(4, 1)
>>> a
tensor([[ 1.1207],
        [-0.3137],
        [ 0.0700],
        [ 0.8378]])
>>> b = torch.randn(1, 4)
>>> b
tensor([[ 0.5146,  0.1216, -0.5244,  2.2382]])
>>> torch.mul(a, b)
tensor([[ 0.5767,  0.1363, -0.5877,  2.5083],
        [-0.1614, -0.0382,  0.1645, -0.7021],
        [ 0.0360,  0.0085, -0.0367,  0.1567],
        [ 0.4312,  0.1019, -0.4394,  1.8753]])
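
To make the broadcasting step explicit, here is a small sketch with fixed values instead of random ones. It shows that multiplying a (4, 1) tensor by a (1, 4) tensor behaves as if both were first expanded to (4, 4); the tensors and names are illustrative assumptions, not part of the original example.

```python
import torch

a = torch.arange(1.0, 5.0).reshape(4, 1)   # column vector, shape (4, 1)
b = torch.arange(1.0, 5.0).reshape(1, 4)   # row vector, shape (1, 4)

product = torch.mul(a, b)                  # broadcast to shape (4, 4)
print(product.shape)

# Broadcasting behaves as if both operands were expanded to (4, 4) first.
expanded = a.expand(4, 4) * b.expand(4, 4)
print(torch.equal(product, expanded))

# The * operator performs the same element-wise multiplication.
print(torch.equal(product, a * b))
```

Note that the result is an element-wise product of the broadcast operands, not a matrix product, even though the input shapes would also be valid for one.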

torch.mm

torch.mm(input, mat2, *, out=None) → Tensor

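Performs a matrix multiplication of the matrices input and mat2.

If input is a $(n \times m)$ tensor and mat2 is a $(m \times p)$ tensor, the output will be a $(n \times p)$ tensor.

This function does not broadcast. For matrix products with broadcasting, use torch.matmul().

Supports strided and sparse 2-D tensors as inputs, and autograd with respect to strided inputs.

This operator supports TensorFloat32.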

>>> mat1 = torch.randn(2, 3)
>>> mat2 = torch.randn(3, 3)
>>> torch.mm(mat1, mat2)
tensor([[ 0.4851,  0.5037, -0.3633],
        [-0.0760, -3.6705,  2.4784]])

input is the first matrix to be multiplied and mat2 is the second. The output is a tensor.
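
The shape rule and the lack of broadcasting can be checked with a short sketch. The shapes below are arbitrary choices for illustration, and the flag `mm_rejected_batch` is a hypothetical helper variable used only to record the outcome.

```python
import torch

mat1 = torch.randn(2, 3)
mat2 = torch.randn(3, 4)

out = torch.mm(mat1, mat2)
print(out.shape)                      # (2, 3) x (3, 4) -> (2, 4)

# For 2-D inputs the @ operator computes the same product.
print(torch.allclose(out, mat1 @ mat2))

# torch.mm does not broadcast: a 3-D batch is rejected.
batch = torch.randn(10, 2, 3)
try:
    torch.mm(batch, mat2)
    mm_rejected_batch = False
except RuntimeError as err:
    mm_rejected_batch = True
    print("torch.mm failed:", err)

# torch.matmul accepts the same inputs and broadcasts mat2 over the batch.
print(torch.matmul(batch, mat2).shape)
```

When the inputs are guaranteed to be 2-D, torch.mm doubles as a shape check, since anything else raises an error instead of being silently broadcast.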

torch.matmul

torch.matmul(input, other, *, out=None) → Tensor

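Matrix product of two tensors.

The behavior depends on the dimensionality of the tensors as follows:

If both tensors are 1-dimensional, the dot product (a scalar) is returned.
If both arguments are 2-dimensional, the matrix-matrix product is returned.
If the first argument is 1-dimensional and the second argument is 2-dimensional, a 1 is prepended to the first argument's dimensions for the purpose of the matrix multiply. After the matrix multiply, the prepended dimension is removed.
If the first argument is 2-dimensional and the second argument is 1-dimensional, the matrix-vector product is returned.
If both arguments are at least 1-dimensional and at least one argument is N-dimensional (where N > 2), a batched matrix multiply is returned. If the first argument is 1-dimensional, a 1 is prepended to its dimensions for the purpose of the batched matrix multiply and removed afterwards. If the second argument is 1-dimensional, a 1 is appended to its dimensions for the purpose of the batched matrix multiply and removed afterwards.
The non-matrix (i.e. batch) dimensions are broadcast, and thus must be broadcastable.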
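
The dimension rules above can be sketched with small fixed tensors. The 1-D by 2-D case is reproduced by hand with unsqueeze and squeeze to show where the extra dimension is added and dropped; the values and names are illustrative assumptions.

```python
import torch

vec = torch.tensor([1.0, 2.0, 3.0])           # shape (3,)
mat = torch.arange(12.0).reshape(3, 4)        # shape (3, 4)

# 1-D x 1-D: dot product, returned as a 0-dimensional tensor.
dot = torch.matmul(vec, vec)
print(dot.shape, dot.item())

# 1-D x 2-D: vec is treated as shape (1, 3), multiplied, then the 1 is dropped.
vec_mat = torch.matmul(vec, mat)
manual = torch.mm(vec.unsqueeze(0), mat).squeeze(0)
print(vec_mat.shape, torch.allclose(vec_mat, manual))

# 2-D x 1-D: matrix-vector product.
mat_vec = torch.matmul(mat.t(), vec)
print(mat_vec.shape)
```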

Example: if input is a $(j \times 1 \times n \times n)$ tensor and other is a $(k \times n \times n)$ tensor, the output will be a $(j \times k \times n \times n)$ tensor.

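Note that the broadcasting logic looks only at the batch dimensions, not the matrix dimensions, when determining whether the inputs are broadcastable.

For example, if input is a $(j \times 1 \times n \times m)$ tensor and other is a $(k \times m \times p)$ tensor, these inputs are valid for broadcasting even though their last two dimensions (i.e. the matrix dimensions) differ. out will be a $(j \times k \times n \times p)$ tensor.

This operator supports TensorFloat32.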
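
A short sketch of this case, with concrete sizes substituted for j, k, n, m and p (the values are arbitrary and chosen only for illustration). It also uses torch.broadcast_shapes, assumed to be available (PyTorch 1.8 or later), to check the batch dimensions on their own.

```python
import torch

j, k, n, m, p = 2, 3, 4, 5, 6      # arbitrary sizes chosen for illustration

x = torch.randn(j, 1, n, m)
y = torch.randn(k, m, p)

out = torch.matmul(x, y)
print(out.shape)                   # batch dims (j, 1) and (k,) broadcast to (j, k)

# Each output slice is an ordinary 2-D product of the matching matrices.
print(torch.allclose(out[1, 2], torch.mm(x[1, 0], y[2]), atol=1e-5))

# Only the batch dimensions take part in broadcasting.
batch_shape = torch.broadcast_shapes(x.shape[:-2], y.shape[:-2])
print(batch_shape)
```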

  >>> # vector x vector
  >>> tensor1 = torch.randn(3)
  >>> tensor2 = torch.randn(3)
  >>> torch.matmul(tensor1, tensor2).size()
  torch.Size([])
  >>> # matrix x vector
  >>> tensor1 = torch.randn(3, 4)
  >>> tensor2 = torch.randn(4)
  >>> torch.matmul(tensor1, tensor2).size()
  torch.Size([3])
  >>> # batched matrix x broadcasted vector
  >>> tensor1 = torch.randn(10, 3, 4)
  >>> tensor2 = torch.randn(4)
  >>> torch.matmul(tensor1, tensor2).size()
  torch.Size([10, 3])
  >>> # batched matrix x batched matrix
  >>> tensor1 = torch.randn(10, 3, 4)
  >>> tensor2 = torch.randn(10, 4, 5)
  >>> torch.matmul(tensor1, tensor2).size()
  torch.Size([10, 3, 5])
  >>> # batched matrix x broadcasted matrix
  >>> tensor1 = torch.randn(10, 3, 4)
  >>> tensor2 = torch.randn(4, 5)
  >>> torch.matmul(tensor1, tensor2).size()
  torch.Size([10, 3, 5])
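
To put the three functions side by side, the sketch below applies them to the same pair of 2 x 2 matrices. The matrices are made-up values for illustration.

```python
import torch

a = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])
b = torch.tensor([[5.0, 6.0],
                  [7.0, 8.0]])

elementwise = torch.mul(a, b)        # multiplies matching entries
matrix_mm = torch.mm(a, b)           # rows of a times columns of b
matrix_matmul = torch.matmul(a, b)   # same as torch.mm for 2-D inputs

print(elementwise)
print(matrix_mm)
print(torch.allclose(matrix_mm, matrix_matmul))
```

For 2-D inputs torch.mm and torch.matmul agree, while torch.mul gives a different result because it never sums over a shared dimension.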