Multiplication operations
The torch.matmul method
torch.matmul(input, other, *, out=None) → Tensor
- Both arguments have the same number of dimensions
If both tensors are 1-dimensional, the dot product (scalar) is returned.
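In other words, two 1-D tensors of the same length produce their dot product, returned as a scalar (0-dimensional) tensor.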
If both arguments are 2-dimensional, the matrix-matrix product is returned.
In other words, two 2-D tensors of shapes [n, m] and [m, p] produce the ordinary matrix-matrix product of shape [n, p].
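The two same-dimension cases can be checked with a short sketch. The tensor names and values below (v1, v2, m1, m2) are made up for illustration and do not come from the original notes.

```python
import torch

# 1-D x 1-D: the dot product, returned as a 0-dimensional tensor
v1 = torch.tensor([1, 2, 3])
v2 = torch.tensor([4, 5, 6])
dot = torch.matmul(v1, v2)
print(dot, dot.shape)        # tensor(32) torch.Size([])

# 2-D x 2-D: ordinary matrix product, [3, 2] x [2, 4] -> [3, 4]
m1 = torch.tensor([[1, 2],
                   [3, 4],
                   [5, 6]])
m2 = torch.tensor([[1, 0, 1, 1],
                   [1, 1, 1, 0]])
prod = torch.matmul(m1, m2)
print(prod.shape)            # torch.Size([3, 4])
```

Note that the 1-D case returns a tensor with no dimensions, not a Python number; call dot.item() to get the plain value.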
- Different numbers of dimensions, neither greater than 2 (as described in the official documentation)
If the first argument is 1-dimensional and the second argument is 2-dimensional, a 1 is prepended to its dimension for the purpose of the matrix multiply. After the matrix multiply, the prepended dimension is removed.
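If the first argument a is 1-D and the second argument b is 2-D, a 1 is prepended to the shape of a for the matrix multiplication (that is, [Na] becomes [1, Na]). After the multiplication, the prepended dimension is removed.
In effect, this is a row vector multiplied by a matrix.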
In [19]: t1 = torch.tensor([1,1,1])
In [20]: t2 = torch.tensor([[1,0,1,1],
...: [1,1,1,0],
...: [0,0,0,1]])
In [21]: t3 = torch.matmul(t1, t2)
In [22]: t3
Out[22]: tensor([2, 1, 2, 2])
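As a rough check of the "prepend a 1, multiply, remove it" description, the sketch below reuses the t1 and t2 values from the session above and compares the direct call against the manual steps. The names direct and manual are introduced here only for illustration.

```python
import torch

t1 = torch.tensor([1, 1, 1])                 # shape [3]
t2 = torch.tensor([[1, 0, 1, 1],
                   [1, 1, 1, 0],
                   [0, 0, 0, 1]])            # shape [3, 4]

# Direct call: matmul handles the 1-D first argument itself
direct = torch.matmul(t1, t2)                # shape [4]

# Manual version: [3] -> [1, 3], multiply to get [1, 4], then drop the leading 1
manual = torch.matmul(t1.unsqueeze(0), t2).squeeze(0)

print(direct.shape)                          # torch.Size([4])
print(torch.equal(direct, manual))           # True
```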
If the first argument is 2-dimensional and the second argument is 1-dimensional, the matrix-vector product is returned.
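In other words, when the first argument is 2-D and the second is 1-D, the result is the matrix-vector product.
The second argument is treated as a column vector and multiplied by the matrix.
In the example below, this is equivalent to applying t2.reshape(4, -1) first, performing the matrix multiplication, and then reducing the result from shape [3, 1] to [3].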
In [12]: t1 = torch.tensor([[1,0,1,1],
...: [1,1,1,0],
...: [0,0,0,1]])
In [13]: t2 = torch.tensor([0,1,1,0])
In [14]: t3 = torch.matmul(t1,t2)
In [15]: t3
Out[15]: tensor([1, 2, 0])
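The reshape-based reading of the matrix-vector case can be sketched as follows, using the same t1 and t2 values as the session above. The names col, direct and manual are illustrative assumptions, not part of the original notes.

```python
import torch

t1 = torch.tensor([[1, 0, 1, 1],
                   [1, 1, 1, 0],
                   [0, 0, 0, 1]])            # shape [3, 4]
t2 = torch.tensor([0, 1, 1, 0])              # shape [4]

# Direct call: matrix-vector product
direct = torch.matmul(t1, t2)                # shape [3]

# Manual version: make t2 a [4, 1] column, multiply to get [3, 1], drop the last dim
col = t2.reshape(4, -1)
manual = torch.matmul(t1, col).squeeze(-1)

print(col.shape)                             # torch.Size([4, 1])
print(torch.equal(direct, manual))           # True
```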
- At least one argument has more than 2 dimensions
If both arguments are at least 1-dimensional and at least one argument is N-dimensional (where N > 2), then a batched matrix multiply is returned. If the first argument is 1-dimensional, a 1 is prepended to its dimension for the purpose of the batched matrix multiply and removed after. If the second argument is 1-dimensional, a 1 is appended to its dimension for the purpose of the batched matrix multiply and removed after. The non-matrix (i.e. batch) dimensions are broadcasted (and thus must be broadcastable). For example, if input is a (j × 1 × n × n) tensor and other is a (k × n × n) tensor, out will be a (j × k × n × n) tensor.
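In other words, when both arguments are at least 1-D and at least one of them has N > 2 dimensions, the result is a batched matrix multiplication. A 1-D first argument has a 1 prepended to its shape ([Na] -> [1, Na]) for the batched multiplication, and that dimension is removed afterwards. A 1-D second argument has a 1 appended to its shape ([Na] -> [Na, 1]) for the batched multiplication, and that dimension is likewise removed afterwards.
# The result has shape [3, 2]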
In [29]: t1 = torch.randn(3,2,4)
In [30]: t1
Out[30]:
tensor([[[ 0.2782, 0.1087, -0.7995, -0.5771],
[ 0.3996, -1.6472, -0.5291, 0.2667]],
[[-1.0375, -0.1722, -0.3408, -0.9480],
[ 0.2967, 2.3588, -0.0715, 1.5943]],
[[ 0.2808, -0.4695, 0.2120, 0.4003],
[ 1.5741, 0.9394, 0.1248, 0.8616]]])
In [39]: t2 = torch.tensor([1,1,0,1.0])
In [40]: t3 = torch.matmul(t1, t2)
In [41]: t3
Out[41]:
tensor([[-0.1902, -0.9808],
[-2.1576, 4.2498],
[ 0.2116, 3.3751]])
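The session above uses random values, so the sketch below repeats the idea with a deterministic tensor and checks both 1-D rules (append for the second argument, prepend for the first) against their manual equivalents. The names batch, v and row are assumptions made for this example.

```python
import torch

# Deterministic stand-in for the random [3, 2, 4] tensor used above
batch = torch.arange(24, dtype=torch.float32).reshape(3, 2, 4)

# 1-D second argument: [4] -> [4, 1], batched multiply, then drop the appended dim
v = torch.tensor([1.0, 1.0, 0.0, 1.0])
right = torch.matmul(batch, v)                               # shape [3, 2]
right_manual = torch.matmul(batch, v.unsqueeze(-1)).squeeze(-1)

# 1-D first argument: [2] -> [1, 2], batched multiply, then drop the prepended dim
row = torch.tensor([1.0, 2.0])
left = torch.matmul(row, batch)                              # shape [3, 4]
left_manual = torch.matmul(row.unsqueeze(0), batch).squeeze(-2)

print(right.shape, left.shape)   # torch.Size([3, 2]) torch.Size([3, 4])
```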
The non-matrix (i.e. batch) dimensions are broadcast, so they must be broadcastable. For example, if one input is a (j × 1 × n × n) tensor and the other is a (k × n × n) tensor, the output is a (j × k × n × n) tensor.
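The broadcasting rule can be illustrated with a small sketch. The sizes j = 2, k = 3, n = 4 are picked arbitrarily, and the names a, b and out are assumptions for this example. Each output slice out[i, l] should match the plain 2-D product of a[i, 0] and b[l].

```python
import torch

torch.manual_seed(0)              # fixed seed so the run is repeatable
j, k, n = 2, 3, 4                 # arbitrary sizes chosen for illustration

a = torch.randn(j, 1, n, n)       # batch dims (j, 1)
b = torch.randn(k, n, n)          # batch dims (k,), broadcast against (j, 1)

out = torch.matmul(a, b)
print(out.shape)                  # torch.Size([2, 3, 4, 4])

# Every (i, l) slice is the ordinary matrix product of the matching pair
matches = all(
    torch.allclose(out[i, l], a[i, 0] @ b[l], atol=1e-5)
    for i in range(j)
    for l in range(k)
)
print(matches)                    # True
```

Only the leading (batch) dimensions are broadcast; the last two dimensions must still be compatible for matrix multiplication.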