Common PyTorch operations: writing values into and reading values from specific positions of a tensor, plus TopK

广元凉面

Last modified: 2024-05-27 19:32:52


Tags: python, deep learning

First published: 2024-05-24 10:38:24

Article link: https://blog.csdn.net/yang8530635/article/details/139072536


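Why I wrote this:

My code involves MoE (mixture of experts), and MoE relies on TopK, on writing values into specific positions of a tensor, and on reading values back out. I am recording these operations here.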

①: TopK

TopK is a common tensor operation: along a specified dimension, it selects the k largest values together with their indices.

# TopK
import torch

a = torch.randn((3, 4))
print(a)
tensor([[ 0.6614,  0.2669,  0.0617,  0.6213],
        [-0.4519, -0.1661, -1.5228,  0.3817],
        [-1.0276, -0.5631, -0.8923, -0.0583]])

a_top_k_values, a_top_k_index = a.topk(2, dim=1)

print("Topk_values:", a_top_k_values)
Topk_values: tensor([[ 0.6614,  0.6213],
        [ 0.3817, -0.1661],
        [-0.0583, -0.5631]])
print("Topk_index:", a_top_k_index)
Topk_index: tensor([[0, 3],
        [3, 1],
        [3, 1]])
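As a quick sanity check on what topk returns, here is a minimal sketch. It assumes the default arguments (largest=True, sorted=True) and uses a fixed seed that is not part of the original example. It compares topk against a full descending sort and confirms that the returned indices point back at the returned values.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable (an assumption of this sketch)
a = torch.randn(3, 4)

k = 2
topk_values, topk_index = a.topk(k, dim=1)

# Sorting each row in descending order and keeping the first k columns
# gives the same values as topk.
sorted_values, _ = a.sort(dim=1, descending=True)
same_values = torch.equal(topk_values, sorted_values[:, :k])

# The returned indices point back into `a`:
# a[i, topk_index[i, j]] is topk_values[i, j].
recovered = a.gather(1, topk_index)
indices_consistent = torch.equal(recovered, topk_values)

print(same_values, indices_consistent)
```

When only a few entries per row are needed, topk avoids keeping the full sorted tensor.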

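We now know the largest values of each sample (each row) and their indices. Next, we want to take an all-zero matrix with the same shape as a and, for each sample, fill in only the TopK values at their indices. (A small question for the reader: why build a separate matrix instead of operating directly on the original one?)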

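②: Filling

Based on the code I have been reading, this part covers two operations: tensor.scatter() and tensor.index_add().

tensor.scatter()

First, scatter() and scatter_() do the same thing, but scatter() returns a new Tensor and leaves the original untouched, while scatter_() modifies the original Tensor in place. The explanation here uses scatter().
The API signature is: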

Tensor.scatter_(dim, index, src, *, reduce=None) → Tensor

We take a (3, 4) all-zero matrix as an example and write into it the TopK values of the matrix a computed above.

## scatter
b = torch.zeros((3, 4))
print("b:", b)
b: tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])
b = b.scatter(1, a_top_k_index, a_top_k_values)
print("scattered_b:", b)
scattered_b: tensor([[ 0.6614,  0.0000,  0.0000,  0.6213],
        [ 0.0000, -0.1661,  0.0000,  0.3817],
        [ 0.0000, -0.5631,  0.0000, -0.0583]])
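Since the motivation for these notes is MoE, here is a rough sketch of how TopK and scatter might combine into a sparse gating matrix. The names (logits, gates, num_experts) are hypothetical, and applying softmax only over the selected values is one possible choice, not something taken from the code this article is based on.

```python
import torch

torch.manual_seed(0)
num_tokens, num_experts, k = 3, 4, 2

# Hypothetical router logits: one row per token, one column per expert.
logits = torch.randn(num_tokens, num_experts)

# Keep the k largest logits per token.
top_values, top_index = logits.topk(k, dim=1)

# Normalise over the k selected experts only (an assumption of this sketch).
top_weights = torch.softmax(top_values, dim=1)

# Write the k weights into an all-zero matrix; unselected experts stay at 0.
gates = torch.zeros_like(logits).scatter(1, top_index, top_weights)

print(gates)
print(gates.sum(dim=1))
```

Each row of gates has exactly k non-zero entries, and they sum to 1.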

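Note: the example above uses dim=1; the dim=0 case is harder to follow.
With dim=1, assuming index and src have the same shape, their rows and columns correspond one to one: the value src[i][j] is written to row i, column index[i][j] of the target, which is easy to see. With dim=0, the columns of src line up with the columns of the target, but index decides which row receives the value: src[i][j] is written to row index[i][j], column j. Below is an example based on the one in the official PyTorch documentation: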

import torch
src = torch.arange(1, 11).reshape((2, 5))
print(src)
# tensor([[ 1,  2,  3,  4,  5],
#         [ 6,  7,  8,  9, 10]])
index = torch.tensor([[0, 1], [0, 1]])
c = torch.zeros(3, 5, dtype=src.dtype).scatter_(0, index, src)
print("c:", c)
# c: tensor([[6, 0, 0, 0, 0],
#         [0, 7, 0, 0, 0],
#         [0, 0, 0, 0, 0]])
d = torch.zeros(3, 5, dtype=src.dtype).scatter_(0, index, src, reduce='add')
print("d:", d)
# d: tensor([[7, 0, 0, 0, 0],
#         [0, 9, 0, 0, 0],
#         [0, 0, 0, 0, 0]])
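To make the dim=0 rule concrete, here is a small reference loop. The helper scatter_dim0_reference is hypothetical and written only for illustration. It spells out out[index[i][j]][j] += src[i][j] and compares the result with scatter_add_(), the dedicated accumulating variant, which is used here in place of reduce='add'.

```python
import torch

def scatter_dim0_reference(target, index, src):
    """Loop version of accumulating scatter along dim 0 (illustrative only)."""
    out = target.clone()
    rows, cols = index.shape
    for i in range(rows):
        for j in range(cols):
            # index[i, j] chooses the destination row; the column j is kept.
            out[int(index[i, j]), j] += src[i, j]
    return out

src = torch.arange(1, 11).reshape(2, 5)
index = torch.tensor([[0, 1], [0, 1]])

ref_add = scatter_dim0_reference(torch.zeros(3, 5, dtype=src.dtype), index, src)
builtin_add = torch.zeros(3, 5, dtype=src.dtype).scatter_add_(0, index, src)

print(ref_add)
print(torch.equal(ref_add, builtin_add))
```

Only the positions covered by index (here the first two columns of src) are read; the remaining columns of src are ignored.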

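The difference between c and d: without reduce='add', scatter overwrites existing values by default.
This shows that tensor.scatter() is very convenient when, for each row (sample), values are written into columns chosen per row. Filling whole rows is less convenient. In the example above, adding all of src's rows, [1, 2, 3, 4, 5] and [6, 7, 8, 9, 10], into the first row requires: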

e = torch.zeros(3, 5, dtype=src.dtype).scatter_(0, torch.tensor([[0,0,0,0,0],[0,0,0,0,0]]), src,reduce = 'add')
print("e:",e)
e: tensor([[ 7,  9, 11, 13, 15],
        [ 0,  0,  0,  0,  0],
        [ 0,  0,  0,  0,  0]])

In other words, building the index takes a lot of work. Is there a function that simplifies the index? That is tensor.index_add(), introduced next.

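③: tensor.index_add()

Signature:

Tensor.index_add_(dim, index, source, *, alpha=1) → Tensor

tensor.index_add() is mostly used row by row: given an index, each row of source is added directly to the chosen row of the target. Below is the example from the official documentation:
With dim=0: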

import torch
x = torch.ones(5, 3)
t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float)
index = torch.tensor([0, 4, 2])
x.index_add_(0, index, t)
print(x)
tensor([[  2.,   3.,   4.],
        [  1.,   1.,   1.],
        [  8.,   9.,  10.],
        [  1.,   1.,   1.],
        [  5.,   6.,   7.]])
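Coming back to the earlier scatter example that summed both rows of src into the first row, here is a minimal sketch of the same result with index_add_(). It uses scatter_add_() for the comparison instead of reduce='add'. The point is the size of the index: one entry per element for scatter, one entry per row for index_add_.

```python
import torch

src = torch.arange(1, 11).reshape(2, 5)

# scatter-style accumulation needs one index entry per element of src.
full_index = torch.zeros_like(src)          # shape (2, 5), all zeros
via_scatter = torch.zeros(3, 5, dtype=src.dtype).scatter_add_(0, full_index, src)

# index_add_ needs only one index entry per row of src.
row_index = torch.tensor([0, 0])            # both rows go to row 0
via_index_add = torch.zeros(3, 5, dtype=src.dtype).index_add_(0, row_index, src)

print(via_index_add)
print(torch.equal(via_scatter, via_index_add))
```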

With dim=1:

x = torch.ones(3, 4)
t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float)
index = torch.tensor([1, 0, 2])
x.index_add_(1, index, t)
print(x)
tensor([[ 3.,  2.,  4.,  1.],
        [ 6.,  5.,  7.,  1.],
        [ 9.,  8., 10.,  1.]])
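The signature above also has an alpha argument, which scales source before it is added. A small sketch reusing the dim=0 example, with alpha=-1 so that the rows are subtracted instead of added:

```python
import torch

x = torch.ones(5, 3)
t = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=torch.float)
index = torch.tensor([0, 4, 2])

# alpha scales `t` before the addition: x[index[i]] += alpha * t[i]
x.index_add_(0, index, t, alpha=-1)
print(x)
```

Rows 1 and 3 are not listed in index, so they keep their original value of 1.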

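④: tensor.expand()

Here I borrow the definitions from two other bloggers:
Links: "expand and repeat explained"
 "expand explained"
Purpose: expand() broadcasts a tensor to a new shape.
Note: only dimensions of size 1 can be expanded. For dimensions that do not need expanding, the size stays the same; pass either the original size or simply -1 in that position. The expanded Tensor does not allocate new memory: expand returns a new view on the existing data, and the returned tensor is generally not contiguous in memory.
First, what is a singleton dimension? If a tensor has size 1 along some dimension, that dimension is called a singleton dimension. For example, zeros(2,3,4) has no singleton dimension, while zeros(2,1,4) has one in the second dimension (dimension 1). expand only works on these singleton dimensions, plus any new dimensions added at the front.
One more caveat: new dimensions cannot be added on the right. For example, if x has shape (4), x.expand(4,2) is not allowed. An example that works: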

a = torch.tensor([0,1,2])
b = a.expand(2,3)
print("b:",b)
b: tensor([[0, 1, 2],
        [0, 1, 2]])
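The claim that expand returns a view without allocating new memory can be checked directly. A minimal sketch, using stride(), is_contiguous() and data_ptr() to inspect the result:

```python
import torch

a = torch.tensor([0, 1, 2])
b = a.expand(2, 3)

print(b.stride())                      # stride 0 along the expanded dimension
print(b.is_contiguous())               # the expanded view is not contiguous
print(b.data_ptr() == a.data_ptr())    # same underlying memory

# Because b is only a view, changing a shows up in every row of b.
a[0] = 100
print(b)
```

For the same reason, writing in place to an expanded tensor is best avoided; call .clone() or .contiguous() first if a writable copy is needed.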

The following calls, however, raise an error:

c = a.expand(-1,-1,2)
print(c)

d = a.expand(-1,2)
print(d)

e = b.expand(-1,-1,2)
print(e)

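The reason for the errors: expand cannot extend dimensions on the right. Existing dimensions are matched from the right, so in each call the existing size-3 dimension would have to become 2, which is not allowed because it is not a singleton dimension. In addition, -1 is not accepted for a newly added leading dimension.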
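If a copy "to the right" is really what is wanted, one possible workaround is to add a trailing singleton dimension with unsqueeze() first and then expand that dimension. This is a sketch of one option, not the only one:

```python
import torch

a = torch.tensor([0, 1, 2])            # shape (3,)

# The direct attempt fails: the trailing size 3 cannot become 2.
try:
    a.expand(3, 2)
    direct_ok = True
except RuntimeError:
    direct_ok = False

# Add a trailing singleton dimension, then expand it.
col = a.unsqueeze(1)                   # shape (3, 1)
d = col.expand(-1, 2)                  # shape (3, 2); -1 keeps the size 3
print(direct_ok)
print(d)
```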

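⑤: tensor.repeat()

Purpose: similar to expand(); both extend a tensor to a larger shape.
Note: -1 is not allowed as an argument; 1 means the dimension is left unchanged.
Because expand only works on singleton dimensions, extending a non-singleton dimension requires repeat. Unlike expand, repeat really copies the data: it allocates new memory, and the tensor returned by Tensor.repeat is contiguous. In other words, the result owns its own data instead of being a view of the original. Examples (the expand calls at the end are included for comparison):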

a = torch.tensor([0,1,2])
a.repeat(1,2)
tensor([[0, 1, 2, 0, 1, 2]])
a.repeat(2,2)
tensor([[0, 1, 2, 0, 1, 2],
        [0, 1, 2, 0, 1, 2]])
a.repeat(4,3)
tensor([[0, 1, 2, 0, 1, 2, 0, 1, 2],
        [0, 1, 2, 0, 1, 2, 0, 1, 2],
        [0, 1, 2, 0, 1, 2, 0, 1, 2],
        [0, 1, 2, 0, 1, 2, 0, 1, 2]])
b = torch.tensor([0,1,2])
b.expand(2,-1)
tensor([[0, 1, 2],
        [0, 1, 2]])
b.expand(3,-1)
tensor([[0, 1, 2],
        [0, 1, 2],
        [0, 1, 2]])
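The memory difference between the two can be seen in a short sketch. It checks that the repeat result is a contiguous copy that no longer follows the original, while the expand result is a view that does. It also repeats a non-singleton dimension, which expand cannot do.

```python
import torch

a = torch.tensor([0, 1, 2])
expanded = a.expand(2, 3)    # view on a's memory
repeated = a.repeat(2, 1)    # fresh copy, shape (2, 3)

print(repeated.is_contiguous())
print(repeated.data_ptr() != a.data_ptr())

# Modify the original: the view follows, the copy does not.
a[0] = 100
print(expanded[1, 0].item(), repeated[1, 0].item())

# repeat also works on non-singleton dimensions.
m = torch.arange(6).reshape(2, 3)
tiled = m.repeat(2, 1)       # rows are tiled twice -> shape (4, 3)
print(tiled.shape)
```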

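⑥: tensor.split()

One more operation: splitting a tensor into chunks of the given sizes.

Call: torch.split(tensor, split_size_or_sections, dim=0)
Note: split_size_or_sections can be an int or a list. An int gives chunks of that size (the last chunk may be smaller); a list gives the size of each chunk.
Below is the example from the official documentation for torch.split():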

>>> a = torch.arange(10).reshape(5, 2)
>>> a
tensor([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]])
>>> torch.split(a, 2)
(tensor([[0, 1],
         [2, 3]]),
 tensor([[4, 5],
         [6, 7]]),
 tensor([[8, 9]]))
>>> torch.split(a, [1, 4])
(tensor([[0, 1]]),
 tensor([[2, 3],
         [4, 5],
         [6, 7],
         [8, 9]]))

And with dim=1:

a = torch.arange(10).reshape(5, 2)
print(a)
tensor([[0, 1],
        [2, 3],
        [4, 5],
        [6, 7],
        [8, 9]])
b = torch.split(a, 1, dim=1)
print(b)
(tensor([[0],
        [2],
        [4],
        [6],
        [8]]), tensor([[1],
        [3],
        [5],
        [7],
        [9]]))

A key property of tensor.split() is that every chunk keeps the original number of dimensions: a 2-D tensor is not collapsed to 1-D.
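A short sketch of the shape-preserving behaviour: even a chunk of size 1 stays 2-D, unlike plain integer indexing, and torch.cat along the same dimension puts the chunks back together.

```python
import torch

a = torch.arange(10).reshape(5, 2)

# Split the rows into chunks of sizes 1 and 4.
chunks = torch.split(a, [1, 4], dim=0)
print([tuple(c.shape) for c in chunks])

# A size-1 chunk is still 2-D, while integer indexing drops the dimension.
print(chunks[0].dim(), a[0].dim())

# Concatenating along the same dimension restores the original tensor.
restored = torch.cat(chunks, dim=0)
print(torch.equal(restored, a))
```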

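⑦: tensor.gather()

The earlier tensor.scatter() and tensor.index_add() are for filling values in (tip: tensor.scatter() is convenient for filling by column, tensor.index_add() for filling by row). tensor.gather() goes the other way and reads values out by index.

Signature:

torch.gather(input, dim, index, *, sparse_grad=False, out=None) → Tensor

One thing to note: index must also be a Tensor, with the same number of dimensions as input (len(input.shape) == len(index.shape)).

Examples: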

>>> t = torch.tensor([[1, 2], [3, 4]])
>>> torch.gather(t, 1, torch.tensor([[0, 0], [1, 0]]))
tensor([[ 1,  1],
        [ 4,  3]])
>>> t.gather(0, torch.tensor([[0, 0], [0, 0]]))
tensor([[1, 2],
        [1, 2]])
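The dim=1 rule, out[i][j] = input[i][index[i][j]], can be written as a plain loop. The sketch below uses a hypothetical helper, gather_dim1_reference, to check that rule against torch.gather. It then shows gather reading back exactly what scatter wrote in the TopK example (with a fixed seed that is not part of the original).

```python
import torch

def gather_dim1_reference(inp, index):
    """Loop version of gather along dim 1 (illustrative only)."""
    out = torch.zeros(index.shape, dtype=inp.dtype)
    for i in range(index.shape[0]):
        for j in range(index.shape[1]):
            # Stay in row i; index[i, j] chooses the column to read.
            out[i, j] = inp[i, int(index[i, j])]
    return out

t = torch.tensor([[1, 2], [3, 4]])
index = torch.tensor([[0, 0], [1, 0]])
ref = gather_dim1_reference(t, index)
builtin = torch.gather(t, 1, index)
print(torch.equal(ref, builtin))

# Round trip: scatter the TopK values into zeros, then gather them back.
torch.manual_seed(0)
a = torch.randn(3, 4)
values, idx = a.topk(2, dim=1)
filled = torch.zeros_like(a).scatter(1, idx, values)
read_back = filled.gather(1, idx)
print(torch.equal(read_back, values))
```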

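One more point: with dim=1, each row of index corresponds to the same row of input, so gather suits cases where it is already known that every row needs a lookup.

It is therefore a good fit for picking out columns per row. To pick out whole rows, index directly: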

t[[0,0,0]]
tensor([[1, 2],
        [1, 2],
        [1, 2]])
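For completeness, the same row selection can also be written with torch.index_select, which takes the dimension explicitly. A minimal sketch:

```python
import torch

t = torch.tensor([[1, 2], [3, 4]])
rows = torch.tensor([0, 0, 0])

by_indexing = t[rows]                               # same as t[[0, 0, 0]]
by_index_select = torch.index_select(t, 0, rows)    # explicit dim=0

print(by_index_select)
print(torch.equal(by_indexing, by_index_select))
```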
