Thoughts on How transpose Works for High-Dimensional Arrays in Python, and on PyTorch view

normol

Article link: https://blog.csdn.net/normol/article/details/88555804

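High-dimensional arrays often have to be reshaped into a required layout, and the reshaped result is not always what you expected. The following piece of PyTorch code got me thinking about this: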

prediction = prediction.view(batch_size, bbox_attrs*num_anchors, grid_size*grid_size)
prediction = prediction.transpose(1,2).contiguous()
prediction = prediction.view(batch_size, grid_size*grid_size*num_anchors, bbox_attrs)

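Such a roundabout sequence of reshapes suggests that it cannot be done in a single step. But why not?
While looking into it, I found a very good answer on Stack Overflow:
How does NumPy's transpose() method permute the axes of an array?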

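In fact, no matter how many dimensions an array has, its data sits in a one-dimensional block of memory. What the high-dimensional array stores on top of that is just the shape and the strides, i.e. how far to jump in memory to reach the next element along each axis. transpose therefore only changes the shape and strides; the data itself is not moved.
An example: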

a=np.arange(1,17)

a
Out[10]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16])

a=a.reshape(2,2,4)

a
Out[12]: 
array([[[ 1,  2,  3,  4],
        [ 5,  6,  7,  8]],

       [[ 9, 10, 11, 12],
        [13, 14, 15, 16]]])

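At this point a.shape = (2, 2, 4), and the corresponding strides are 2*4, 4, 1, i.e. (8, 4, 1) when counted in elements. (Note: NumPy actually reports strides in bytes. With an 8-byte integer dtype such as int64, a.strides is (64, 32, 8).)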
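The stride arithmetic above can be checked directly. The following is a minimal sketch, assuming a row-major (C-order) layout; `c_order_strides` is a hypothetical helper written for this illustration, not a NumPy function, and the dtype is pinned to int64 so that each element takes 8 bytes regardless of platform.

```python
import numpy as np

def c_order_strides(shape):
    """Return row-major strides, counted in elements, for the given shape."""
    strides = []
    step = 1
    # The last axis moves one element at a time; each earlier axis
    # skips over everything contained in the axes after it.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

# dtype is fixed to int64 (8 bytes per element) for this illustration.
a = np.arange(1, 17, dtype=np.int64).reshape(2, 2, 4)

element_strides = c_order_strides(a.shape)
byte_strides = tuple(s * a.itemsize for s in element_strides)

print(element_strides)  # (8, 4, 1)
print(byte_strides)     # (64, 32, 8)
print(a.strides)        # (64, 32, 8)
```

Multiplying the element strides by `a.itemsize` gives the byte strides that NumPy reports in `a.strides`.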

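"""Simulate printing a multi-dimensional array from flat memory"""
import numpy as np

memory = np.arange(1, 17)  # the flat buffer behind a: values 1..16
shape = (2, 2, 4)          # a.shape from the code above
strides = (8, 4, 1)        # strides counted in elements, not bytes
for i in range(shape[0]):
    i_index = i * strides[0]
    for j in range(shape[1]):
        j_index = i_index + j * strides[1]
        for k in range(shape[2]):
            k_index = j_index + k * strides[2]
            print(memory[k_index], end=' ')
        print()
"""  Output
1 2 3 4 
5 6 7 8 
9 10 11 12 
13 14 15 16 
"""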

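When transpose(1, 0, 2) is applied, only the following parameters change:
a.shape=(2, 2, 4)    # axes 0 and 1 have swapped places (both have length 2, so the shape looks the same)
a.strides=(4, 8, 1)  # counted in elements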

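memory = np.arange(1, 17)  # same flat buffer as before; no data is moved
shape = (2, 2, 4)          # shape after transpose(1, 0, 2)
strides = (4, 8, 1)        # strides after transpose(1, 0, 2), in elements
for i in range(shape[0]):
    i_index = i * strides[0]
    for j in range(shape[1]):
        j_index = i_index + j * strides[1]
        for k in range(shape[2]):
            k_index = j_index + k * strides[2]
            print(memory[k_index], end=' ')
        print()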
"""
1 2 3 4 
9 10 11 12 
5 6 7 8 
13 14 15 16 
"""
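The hand-written loops can be compared against NumPy itself. This is a small sketch, again pinning the dtype to int64 as an assumption; `element_at` is a hypothetical helper that looks up an element from the flat buffer using only an index and the strides.

```python
import numpy as np

a = np.arange(1, 17, dtype=np.int64).reshape(2, 2, 4)
t = a.transpose(1, 0, 2)

print(a.strides)                # (64, 32, 8)
print(t.strides)                # (32, 64, 8)
print(np.shares_memory(a, t))   # True: no data was copied
print(t.flags['C_CONTIGUOUS'])  # False: memory order no longer matches index order

def element_at(flat, strides, index):
    """Look up one element from the flat buffer via index and element strides."""
    offset = sum(i * s for i, s in zip(index, strides))
    return flat[offset]

flat = a.ravel()  # the underlying values 1..16 in memory order
elem_strides = tuple(s // a.itemsize for s in t.strides)

# Every element of the transposed array can be found in the original
# buffer just by using the swapped strides.
matches = all(element_at(flat, elem_strides, idx) == t[idx]
              for idx in np.ndindex(t.shape))
print(matches)  # True
```

The transposed array is only a different way of walking the same memory, which is why a later reshape of it cannot always be done without a copy.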

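Now for PyTorch's view. Because the array is stored as one contiguous block of memory, view cannot reorder any elements; it only reinterprets that block with a new shape.
Suppose a.shape = (2, 3, 4) and you want to turn it into (3, 8). To give the numbers some meaning, say there are 2 samples, 3 classes and 4 pixels. We now want each row to correspond to one class, with 8 values per row: the 4 pixels of the first sample followed by the 4 pixels of the second sample.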

a=np.arange(1,25)

a=a.reshape(2,3,4)

a
Out[39]: 
array([[[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]],

       [[13, 14, 15, 16],
        [17, 18, 19, 20],
        [21, 22, 23, 24]]])

For shape (3, 8), the result we want looks like this:

array([[ 1,  2,  3,  4, 13, 14, 15, 16],
       [ 5,  6,  7,  8, 17, 18, 19, 20],
       [ 9, 10, 11, 12, 21, 22, 23, 24]])

But calling view directly gives this:

a_tensor = torch.from_numpy(a)

a_view=a_tensor.view(3,-1)

a_view
Out[42]: 
tensor([[ 1,  2,  3,  4,  5,  6,  7,  8],
        [ 9, 10, 11, 12, 13, 14, 15, 16],
        [17, 18, 19, 20, 21, 22, 23, 24]])

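This is because view simply cuts the contiguous memory into pieces in its existing order.
To get the result we want, we need to transpose first, as follows: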

a_trans = a_tensor.transpose(1,0)

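# This raises an error: transpose does not copy or move any data, it only returns
# a view with different strides. If you hit this error, call contiguous() as the
# message suggests; only then is the data actually copied into a new contiguous block.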
a_correct = a_trans.view(3,-1)
Traceback (most recent call last):

  File "<ipython-input-44-f2206c32ae9c>", line 1, in <module>
    a_correct = a_trans.view(3,-1)

RuntimeError: invalid argument 2: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Call .contiguous() before .view(). at /Users/soumith/mc3build/conda-bld/pytorch_1549593514549/work/aten/src/TH/generic/THTensor.cpp:213

After the fix:

a_correct = a_trans.contiguous().view(3,-1)

a_correct
Out[47]: 
tensor([[ 1,  2,  3,  4, 13, 14, 15, 16],
        [ 5,  6,  7,  8, 17, 18, 19, 20],
        [ 9, 10, 11, 12, 21, 22, 23, 24]])

As you can see, we now get the result we wanted.
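The same reasoning explains the three-step reshape from the beginning of the article. Below is a small sketch with made-up sizes. The values of `batch_size`, `num_anchors`, `bbox_attrs` and `grid_size`, as well as the 4-D input layout (batch, anchors * attrs, grid, grid), are assumptions chosen only for illustration; the original code does not show them.

```python
import torch

# Hypothetical small sizes, chosen only for illustration.
batch_size, num_anchors, bbox_attrs, grid_size = 1, 2, 3, 2

# Assumed input layout: (batch, anchors * attrs, grid, grid).
n = batch_size * num_anchors * bbox_attrs * grid_size * grid_size
prediction = torch.arange(n).view(
    batch_size, bbox_attrs * num_anchors, grid_size, grid_size)

# The three-step version from the opening snippet.
p = prediction.view(batch_size, bbox_attrs * num_anchors, grid_size * grid_size)
p = p.transpose(1, 2)
print(p.is_contiguous())  # False: only the strides changed
p = p.contiguous()        # copies the data into the new order
three_step = p.view(batch_size, grid_size * grid_size * num_anchors, bbox_attrs)

# A direct view to the same final shape keeps the original memory order.
direct = prediction.view(
    batch_size, grid_size * grid_size * num_anchors, bbox_attrs)

print(torch.equal(three_step, direct))  # False
print(three_step[0, 0])  # tensor([0, 4, 8])
print(direct[0, 0])      # tensor([0, 1, 2])
```

Both results have the same shape, but only the three-step version gathers values that were a full grid apart in memory into one row. The direct view just takes the first three neighbouring values.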
