Some pitfalls of torch.nn.Unfold

Article link: https://blog.csdn.net/qq_31768873/article/details/115330399

Using the view function in PyTorch
Using torch.nn.Unfold
Operations in Involution
Conclusion

I recently read the paper Involution: Inverting the Inherence of Convolution for Visual Recognition. The authors' PyTorch implementation of the operator relies heavily on Unfold and view, so I put together these notes on how Unfold behaves.
Paper: https://arxiv.org/abs/2103.06255
Code and models: https://github.com/d-li14/involution

Using the view function in PyTorch

In [1]: import torch
In [2]: data = torch.randn(3,4)
In [3]: data
Out[3]:
tensor([[ 1.2261, -2.0669, -0.3668,  0.7459],
        [ 0.5222,  0.2053,  2.6281, -0.9076],
        [-0.7366, -2.0966,  0.9073, -0.7197]])
In [4]: data.view(-1)
Out[4]:
tensor([ 1.2261, -2.0669, -0.3668,  0.7459,  0.5222,  0.2053,  2.6281, -0.9076,
        -0.7366, -2.0966,  0.9073, -0.7197])

In [5]: data.view(4,3)
Out[5]:
tensor([[ 1.2261, -2.0669, -0.3668],
        [ 0.7459,  0.5222,  0.2053],
        [ 2.6281, -0.9076, -0.7366],
        [-2.0966,  0.9073, -0.7197]])

In [6]: data.transpose(1,0)
Out[6]:
tensor([[ 1.2261,  0.5222, -0.7366],
        [-2.0669,  0.2053, -2.0966],
        [-0.3668,  2.6281,  0.9073],
        [ 0.7459, -0.9076, -0.7197]])

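Comparing the outputs above shows that calling view on a PyTorch tensor does not move any data: it keeps the elements in their flattened (row-major) order and only regroups them into the new shape. This is different from transpose, permute and similar functions, which actually swap dimensions, so the two must not be confused.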
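The difference can be checked with a small sketch. It uses torch.arange instead of random values (an assumption made only so the element order is easy to read); the variable names are illustrative.

```python
import torch

# Integers 0..11 make the element order easy to follow.
data = torch.arange(12).reshape(3, 4)

viewed = data.view(4, 3)            # same flattened order, regrouped into 4 rows
transposed = data.transpose(1, 0)   # axes swapped, element order changes

print(viewed)
print(transposed)

# Both have shape (4, 3), but the contents differ.
same_content = torch.equal(viewed, transposed)

# view preserves the flattened order of the original tensor.
same_flat_order = torch.equal(viewed.flatten(), data.flatten())

# transpose returns a non-contiguous tensor that shares storage with data.
transposed_contiguous = transposed.is_contiguous()

print(same_content, same_flat_order, transposed_contiguous)
```

Because the transposed tensor is not contiguous, calling view on it directly would fail; reshape, or contiguous() followed by view, is the usual way around that.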

Using torch.nn.Unfold

In [1]: import torch
In [2]: data = torch.randn(1,2,2,2)
In [3]: unfold = torch.nn.Unfold([3,3],padding=1)
In [4]: out = unfold(data)
In [5]: data
Out[5]:
tensor([[[[-1.5352, -2.9045],
          [-0.6881,  0.6854]],

         [[ 0.3453,  1.9499],
          [-1.1957,  1.3823]]]])
In [6]: out.shape
Out[6]: torch.Size([1, 18, 4])
In [7]: out
Out[7]:
tensor([[[ 0.0000,  0.0000,  0.0000, -1.5352],
         [ 0.0000,  0.0000, -1.5352, -2.9045],
         [ 0.0000,  0.0000, -2.9045,  0.0000],
         [ 0.0000, -1.5352,  0.0000, -0.6881],
         [-1.5352, -2.9045, -0.6881,  0.6854],
         [-2.9045,  0.0000,  0.6854,  0.0000],
         [ 0.0000, -0.6881,  0.0000,  0.0000],
         [-0.6881,  0.6854,  0.0000,  0.0000],
         [ 0.6854,  0.0000,  0.0000,  0.0000],
         [ 0.0000,  0.0000,  0.0000,  0.3453],
         [ 0.0000,  0.0000,  0.3453,  1.9499],
         [ 0.0000,  0.0000,  1.9499,  0.0000],
         [ 0.0000,  0.3453,  0.0000, -1.1957],
         [ 0.3453,  1.9499, -1.1957,  1.3823],
         [ 1.9499,  0.0000,  1.3823,  0.0000],
         [ 0.0000, -1.1957,  0.0000,  0.0000],
         [-1.1957,  1.3823,  0.0000,  0.0000],
         [ 1.3823,  0.0000,  0.0000,  0.0000]]])
In [8]: out2 = out.view(1,-1,3,3,2,2)
In [9]: out2[0,0,:,:,0,0]
Out[9]:
tensor([[ 0.0000,  0.0000,  0.0000],
        [ 0.0000, -1.5352, -2.9045],
        [ 0.0000, -0.6881,  0.6854]])

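As the output shows, PyTorch's Unfold flattens the patch taken at each sliding-window position and stacks the results channel by channel: dimension 1 of the output has size C × kernel_size × kernel_size with the channel as the outermost factor, and dimension 2 enumerates the window positions. We therefore need to reshape the tensor in the order

[batch_size, -1, kernel_size,kernel_size,height,width]

(here -1 resolves to the number of channels, and height and width count the sliding-window positions) to recover each patch produced by the $3\times3$ sliding window in the code above.
For reference, the PyTorch documentation for Unfold: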
https://pytorch.org/docs/stable/generated/torch.nn.Unfold.html?highlight=unfold#torch.nn.Unfold
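As a sanity check on the layout described above, the sketch below compares the reshaped Unfold output with patches sliced by hand from a zero-padded input. It assumes stride 1 and padding 1 as in the session above; torch.nn.functional.unfold is the functional form of nn.Unfold, and the variable names are illustrative.

```python
import torch
import torch.nn.functional as F

B, C, H, W, K = 1, 2, 2, 2, 3
# Start at 1 so real values cannot be confused with the zero padding.
data = torch.arange(1, B * C * H * W + 1, dtype=torch.float32).view(B, C, H, W)

out = F.unfold(data, kernel_size=K, padding=1)   # shape (B, C*K*K, H*W)

# Channel is the outermost factor of dimension 1, so this is the right grouping.
patches = out.view(B, C, K, K, H, W)

# Reference: slice each K x K window from the zero-padded input by hand.
padded = F.pad(data, (1, 1, 1, 1))
for i in range(H):
    for j in range(W):
        expected = padded[:, :, i:i + K, j:j + K]            # (B, C, K, K)
        assert torch.equal(patches[:, :, :, :, i, j], expected)

# Grouping dimension 1 as (K, K, C) runs without error but mixes up the values.
wrong = out.view(B, K, K, C, H, W).permute(0, 3, 1, 2, 4, 5)
layouts_agree = torch.equal(wrong, patches)
print(layouts_agree)
```

The second view is the dangerous case: the shapes are valid, so nothing fails, yet the values no longer correspond to the patches.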

Operations in Involution

# Algorithm 1 Pseudo code of involution in a PyTorch-like style.
# B: batch size, H: height, W: width
# C: channel number, G: group number
# K: kernel size, s: stride, r: reduction ratio
################### initialization ###################
o = nn.AvgPool2d(s, s) if s > 1 else nn.Identity()
reduce = nn.Conv2d(C, C//r, 1)
span = nn.Conv2d(C//r, K * K * G, 1)
unfold = nn.Unfold(K, dilation, padding, s)
#################### forward pass ####################
x_unfolded = unfold(x) # B,CxKxK,HxW
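# Note: one might expect KxKxC here, but as shown in the Unfold section above,
# the channel is the outermost factor, so CxKxK is the layout the view below relies on.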
x_unfolded = x_unfolded.view(B, G, C//G, K * K, H, W)
# kernel generation, Eqn.(6)
kernel = span(reduce(o(x))) # B,KxKxG,H,W
kernel = kernel.view(B, G, K * K, H, W).unsqueeze(2)
# Multiply-Add operation, Eqn.(4)
out = mul(kernel, x_unfolded).sum(dim=3) # B,G,C/G,H,W
out = out.view(B, C, H, W)
return out

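In the Involution paper, the authors apply the operation above to the input feature map: unfold extracts the patch under every $k\times k$ sliding window, and each patch is multiplied element-wise with a kernel generated from the input itself by two $1\times1$ convolutions, then summed over the $k\times k$ window positions. Note that this is a multiply-add rather than a matrix multiplication, and because the kernel depends on the input, the result behaves like an attention operation.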
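The pseudocode can be turned into a runnable module roughly as follows. This is a minimal sketch that follows Algorithm 1 only, not the official implementation, which may contain additional layers. The class name Involution2d, the padding of (K - 1) // 2 and the dilation of 1 are assumptions made here, and the input height and width are assumed to be divisible by the stride.

```python
import torch
import torch.nn as nn


class Involution2d(nn.Module):
    """Sketch of Algorithm 1; hypothetical class name, not the official code."""

    def __init__(self, channels, kernel_size=3, stride=1, groups=1, reduction=4):
        super().__init__()
        assert channels % groups == 0, "channels must be divisible by groups"
        self.K, self.G = kernel_size, groups
        self.o = nn.AvgPool2d(stride, stride) if stride > 1 else nn.Identity()
        self.reduce = nn.Conv2d(channels, channels // reduction, 1)
        self.span = nn.Conv2d(channels // reduction, kernel_size * kernel_size * groups, 1)
        # Assumed: dilation 1 and "same"-style padding for an odd kernel size.
        self.unfold = nn.Unfold(kernel_size, dilation=1,
                                padding=(kernel_size - 1) // 2, stride=stride)

    def forward(self, x):
        B, C, _, _ = x.shape
        K, G = self.K, self.G
        # Kernel generation: one K*K kernel per group and per output position.
        kernel = self.span(self.reduce(self.o(x)))        # B, K*K*G, H_out, W_out
        H_out, W_out = kernel.shape[-2:]
        # Unfold output is (B, C*K*K, L) with the channel outermost, so this view is valid.
        x_unfolded = self.unfold(x).view(B, G, C // G, K * K, H_out, W_out)
        kernel = kernel.view(B, G, K * K, H_out, W_out).unsqueeze(2)
        # Multiply-add over the K*K window positions.
        out = (kernel * x_unfolded).sum(dim=3)            # B, G, C//G, H_out, W_out
        return out.view(B, C, H_out, W_out)


torch.manual_seed(0)
layer = Involution2d(channels=8, kernel_size=3, stride=1, groups=2, reduction=4)
x = torch.randn(2, 8, 6, 6)
y = layer(x)
print(y.shape)  # torch.Size([2, 8, 6, 6])
```

The spatial size is read from the generated kernel rather than from the input, so the same code covers stride greater than 1, where the output is smaller than the input.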

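Conclusion

This post briefly covered the details of using unfold, partly to push myself to understand the function better. If you spot any mistakes, please point them out. Finally, the important thing is worth saying three times!!!

Distinguish view from permute! Distinguish view from permute! Distinguish view from permute!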
