PyTorch for Beginners Series: torch.nn API Vision Layers (15)

Latest recommended article published on 2024-04-03 13:39:18

发呆的比目鱼

Category: PyTorch framework. Tags: pytorch, python, deep learning

Article link: https://blog.csdn.net/weixin_42486623/article/details/129773632

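Summary (generated automatically by CSDN): This article introduces several key PyTorch modules for image processing. nn.PixelShuffle rearranges channel data into spatial positions to upsample a tensor, and nn.PixelUnshuffle performs the reverse rearrangement to downsample it. nn.Upsample provides general-purpose upsampling with interpolation modes such as nearest-neighbor and bilinear. These APIs are commonly used in deep learning models to change the spatial size of feature maps.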

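Method	Description
nn.PixelShuffle	Rearranges the elements of a tensor of shape $(*, C \times r^2, H, W)$ into a tensor of shape $(*, C, H \times r, W \times r)$, where $r$ is the upscale factor.
nn.PixelUnshuffle	Reverses the PixelShuffle operation by rearranging the elements of a tensor of shape $(*, C, H \times r, W \times r)$ into a tensor of shape $(*, C \times r^2, H, W)$, where $r$ is the downscale factor.
nn.Upsample	Upsamples the given multi-channel 1D (temporal), 2D (spatial), or 3D (volumetric) data.
nn.UpsamplingNearest2d	Applies 2D nearest-neighbor upsampling to an input signal composed of several input channels.
nn.UpsamplingBilinear2d	Applies 2D bilinear upsampling to an input signal composed of several input channels.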

nn.PixelShuffle

Rearranges the elements of a tensor of shape $(*, C \times r^2, H, W)$ into a tensor of shape $(*, C, H \times r, W \times r)$, where $r$ is the upscale factor.

>>> import torch
>>> import torch.nn as nn
>>> pixel_shuffle = nn.PixelShuffle(3)
>>> input = torch.randn(1, 9, 4, 4)
>>> output = pixel_shuffle(input)
>>> print(output.size())
torch.Size([1, 1, 12, 12])
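
The shape change above hides which element ends up where. The following is a minimal pure-Python sketch of the index mapping for a single sample, assuming the usual channel ordering in which input channel `c * r * r + i * r + j` supplies the output pixel at row offset `i` and column offset `j`. The function name `pixel_shuffle` and the nested-list representation are illustrative assumptions, not PyTorch's implementation.

```python
def pixel_shuffle(x, r):
    """Rearrange a nested list of shape (C*r*r, H, W) into shape (C, H*r, W*r)."""
    channels_in, height, width = len(x), len(x[0]), len(x[0][0])
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r * r")
    channels_out = channels_in // (r * r)
    out = [[[None] * (width * r) for _ in range(height * r)]
           for _ in range(channels_out)]
    for c in range(channels_out):
        for h in range(height):
            for w in range(width):
                for i in range(r):      # row offset inside the r x r output block
                    for j in range(r):  # column offset inside the r x r output block
                        out[c][h * r + i][w * r + j] = x[c * r * r + i * r + j][h][w]
    return out


# Four channels of size 1x2 with r = 2 become one channel of size 2x4.
x = [[[0, 1]], [[10, 11]], [[20, 21]], [[30, 31]]]
y = pixel_shuffle(x, 2)
print(y)  # [[[0, 10, 1, 11], [20, 30, 21, 31]]]
```

Each group of `r * r` input channels is interleaved into an `r x r` block of output pixels, which is why the channel count shrinks by `r * r` while height and width each grow by `r`.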

nn.PixelUnshuffle

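Reverses the PixelShuffle operation by rearranging the elements of a tensor of shape $(*, C, H \times r, W \times r)$ into a tensor of shape $(*, C \times r^2, H, W)$, where $r$ is the downscale factor.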

>>> pixel_unshuffle = nn.PixelUnshuffle(3)
>>> input = torch.randn(1, 1, 12, 12)
>>> output = pixel_unshuffle(input)
>>> print(output.size())
torch.Size([1, 9, 4, 4])
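
To make the reverse rearrangement concrete, here is a minimal pure-Python sketch for a single sample. It assumes that the pixel at row offset `i` and column offset `j` inside each `r x r` block is written to output channel `c * r * r + i * r + j`. The function name `pixel_unshuffle` and the nested-list representation are illustrative assumptions, not PyTorch's implementation.

```python
def pixel_unshuffle(x, r):
    """Rearrange a nested list of shape (C, H*r, W*r) into shape (C*r*r, H, W)."""
    channels, height_in, width_in = len(x), len(x[0]), len(x[0][0])
    if height_in % r != 0 or width_in % r != 0:
        raise ValueError("height and width must be divisible by r")
    height, width = height_in // r, width_in // r
    out = [[[None] * width for _ in range(height)]
           for _ in range(channels * r * r)]
    for c in range(channels):
        for h in range(height):
            for w in range(width):
                for i in range(r):      # row offset inside the r x r input block
                    for j in range(r):  # column offset inside the r x r input block
                        out[c * r * r + i * r + j][h][w] = x[c][h * r + i][w * r + j]
    return out


# One channel of size 2x4 with r = 2 becomes four channels of size 1x2.
image = [[[0, 10, 1, 11], [20, 30, 21, 31]]]
channels = pixel_unshuffle(image, 2)
print(channels)  # [[[0, 1]], [[10, 11]], [[20, 21]], [[30, 31]]]
```

No values are averaged or dropped: the spatial resolution shrinks by `r` in each dimension while the channel count grows by `r * r`, so the operation is lossless.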

nn.Upsample

>>> input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)
>>> input
tensor([[[[1., 2.],
          [3., 4.]]]])

>>> m = nn.Upsample(scale_factor=2, mode='nearest')
>>> m(input)
tensor([[[[1., 1., 2., 2.],
          [1., 1., 2., 2.],
          [3., 3., 4., 4.],
          [3., 3., 4., 4.]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear')  # align_corners=False
>>> m(input)
tensor([[[[1.0000, 1.2500, 1.7500, 2.0000],
          [1.5000, 1.7500, 2.2500, 2.5000],
          [2.5000, 2.7500, 3.2500, 3.5000],
          [3.0000, 3.2500, 3.7500, 4.0000]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
>>> m(input)
tensor([[[[1.0000, 1.3333, 1.6667, 2.0000],
          [1.6667, 2.0000, 2.3333, 2.6667],
          [2.3333, 2.6667, 3.0000, 3.3333],
          [3.0000, 3.3333, 3.6667, 4.0000]]]])
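
The two bilinear results above differ only in how an output index is mapped back to a source coordinate. The sketch below shows this in one dimension in pure Python, assuming the commonly documented conventions: with `align_corners=True` the first and last samples of input and output coincide, and with `align_corners=False` pixel centers are aligned and coordinates below zero are clamped. The function `upsample_linear_1d` is a hypothetical helper for illustration, not part of PyTorch.

```python
import math


def upsample_linear_1d(values, scale, align_corners):
    """Linearly interpolate a list of floats to int(len(values) * scale) samples."""
    n_in = len(values)
    n_out = int(n_in * scale)
    out = []
    for dst in range(n_out):
        if align_corners:
            # End points of input and output coincide.
            src = dst * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
        else:
            # Pixel centers are aligned; negative coordinates are clamped to 0.
            src = max((dst + 0.5) / scale - 0.5, 0.0)
        lo = min(int(math.floor(src)), n_in - 1)
        hi = min(lo + 1, n_in - 1)  # clamp at the right border
        t = src - lo
        out.append((1 - t) * values[lo] + t * values[hi])
    return out


row = [1.0, 2.0]
centers = upsample_linear_1d(row, 2, align_corners=False)
corners = [round(v, 4) for v in upsample_linear_1d(row, 2, align_corners=True)]
print(centers)  # [1.0, 1.25, 1.75, 2.0]
print(corners)  # [1.0, 1.3333, 1.6667, 2.0]
```

Both lists match the first row of the corresponding tensor above. Bilinear interpolation applies the same rule along the height and then the width.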

>>> # Try scaling the same data in a larger tensor
>>> input_3x3 = torch.zeros(3, 3).view(1, 1, 3, 3)
>>> input_3x3[:, :, :2, :2].copy_(input)
tensor([[[[1., 2.],
          [3., 4.]]]])
>>> input_3x3
tensor([[[[1., 2., 0.],
          [3., 4., 0.],
          [0., 0., 0.]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear')  # align_corners=False
>>> # Notice that values in the top-left corner are the same as for the small input (except at the boundary)
>>> m(input_3x3)
tensor([[[[1.0000, 1.2500, 1.7500, 1.5000, 0.5000, 0.0000],
          [1.5000, 1.7500, 2.2500, 1.8750, 0.6250, 0.0000],
          [2.5000, 2.7500, 3.2500, 2.6250, 0.8750, 0.0000],
          [2.2500, 2.4375, 2.8125, 2.2500, 0.7500, 0.0000],
          [0.7500, 0.8125, 0.9375, 0.7500, 0.2500, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]]]])

>>> m = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
>>> # Notice that values in top left corner are now changed
>>> m(input_3x3)
tensor([[[[1.0000, 1.4000, 1.8000, 1.6000, 0.8000, 0.0000],
          [1.8000, 2.2000, 2.6000, 2.2400, 1.1200, 0.0000],
          [2.6000, 3.0000, 3.4000, 2.8800, 1.4400, 0.0000],
          [2.4000, 2.7200, 3.0400, 2.5600, 1.2800, 0.0000],
          [1.2000, 1.3600, 1.5200, 1.2800, 0.6400, 0.0000],
          [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000]]]])
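
Why the top-left values stay the same in one case and change in the other can be seen from the source coordinates alone. The sketch below assumes the same coordinate conventions as the documented behavior of `align_corners`. The helper `source_coords` is hypothetical and only prints the coordinates each output index samples from.

```python
def source_coords(n_in, scale, align_corners):
    """Return the source coordinate that each output index samples from."""
    n_out = int(n_in * scale)
    if align_corners:
        # The step depends on the input size: (n_in - 1) / (n_out - 1).
        return [dst * (n_in - 1) / (n_out - 1) for dst in range(n_out)]
    # The step depends only on the scale factor: 1 / scale.
    return [max((dst + 0.5) / scale - 0.5, 0.0) for dst in range(n_out)]


for n_in in (2, 3):
    for flag in (False, True):
        coords = [round(c, 4) for c in source_coords(n_in, 2, flag)]
        print(f"n_in={n_in} align_corners={flag}: {coords}")
# n_in=2 align_corners=False: [0.0, 0.25, 0.75, 1.25]
# n_in=2 align_corners=True: [0.0, 0.3333, 0.6667, 1.0]
# n_in=3 align_corners=False: [0.0, 0.25, 0.75, 1.25, 1.75, 2.25]
# n_in=3 align_corners=True: [0.0, 0.4, 0.8, 1.2, 1.6, 2.0]
```

With `align_corners=False` the leading coordinates are identical for both input sizes, so the interior values are unchanged and only samples near the old border differ. With `align_corners=True` the step shrinks from 1/3 to 2/5 of a pixel per output index, so every interior value moves.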

nn.UpsamplingNearest2d

>>> input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)
>>> input
tensor([[[[1., 2.],
          [3., 4.]]]])

>>> m = nn.UpsamplingNearest2d(scale_factor=2)
>>> m(input)
tensor([[[[1., 1., 2., 2.],
          [1., 1., 2., 2.],
          [3., 3., 4., 4.],
          [3., 3., 4., 4.]]]])
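
Nearest-neighbor upsampling with an integer scale factor copies each source pixel into a `scale x scale` block. The pure-Python sketch below reproduces the result above for a single channel. It assumes an integer `scale` and the floor-based index rule `src = dst // scale`. The function name `upsample_nearest_2d` is an illustrative assumption. Note that the official documentation marks `nn.UpsamplingNearest2d` as deprecated in favor of `torch.nn.functional.interpolate`.

```python
def upsample_nearest_2d(image, scale):
    """Upsample a 2D nested list by an integer factor using nearest-neighbor copies."""
    h_in, w_in = len(image), len(image[0])
    h_out, w_out = h_in * scale, w_in * scale
    return [
        # Each output pixel reads the source pixel at floor(index / scale).
        [image[min(r // scale, h_in - 1)][min(c // scale, w_in - 1)]
         for c in range(w_out)]
        for r in range(h_out)
    ]


image = [[1.0, 2.0],
         [3.0, 4.0]]
nearest = upsample_nearest_2d(image, 2)
for row in nearest:
    print(row)
# [1.0, 1.0, 2.0, 2.0]
# [1.0, 1.0, 2.0, 2.0]
# [3.0, 3.0, 4.0, 4.0]
# [3.0, 3.0, 4.0, 4.0]
```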

nn.UpsamplingBilinear2d

>>> input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)
>>> input
tensor([[[[1., 2.],
          [3., 4.]]]])

>>> m = nn.UpsamplingBilinear2d(scale_factor=2)
>>> m(input)
tensor([[[[1.0000, 1.3333, 1.6667, 2.0000],
          [1.6667, 2.0000, 2.3333, 2.6667],
          [2.3333, 2.6667, 3.0000, 3.3333],
          [3.0000, 3.3333, 3.6667, 4.0000]]]])
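
The output above is identical to the `nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)` result earlier, which is consistent with the official documentation describing `nn.UpsamplingBilinear2d` as a deprecated shortcut for bilinear interpolation with `align_corners=True`. The pure-Python sketch below reproduces the numbers for a single channel under that assumption. The function name `upsample_bilinear_2d_aligned` and the nested-list representation are illustrative, not PyTorch's implementation.

```python
def upsample_bilinear_2d_aligned(image, scale):
    """Bilinear upsampling of a 2D nested list, with corner pixels aligned."""
    h_in, w_in = len(image), len(image[0])
    h_out, w_out = int(h_in * scale), int(w_in * scale)

    def taps(n_in, n_out):
        # For each output index: (lower index, upper index, weight of upper).
        result = []
        for dst in range(n_out):
            src = dst * (n_in - 1) / (n_out - 1) if n_out > 1 else 0.0
            lo = min(int(src), n_in - 1)
            hi = min(lo + 1, n_in - 1)
            result.append((lo, hi, src - lo))
        return result

    out = []
    for r0, r1, ty in taps(h_in, h_out):
        out_row = []
        for c0, c1, tx in taps(w_in, w_out):
            top = (1 - tx) * image[r0][c0] + tx * image[r0][c1]
            bottom = (1 - tx) * image[r1][c0] + tx * image[r1][c1]
            out_row.append((1 - ty) * top + ty * bottom)
        out.append(out_row)
    return out


image = [[1.0, 2.0],
         [3.0, 4.0]]
bilinear = [[round(v, 4) for v in row]
            for row in upsample_bilinear_2d_aligned(image, 2)]
for row in bilinear:
    print(row)
# [1.0, 1.3333, 1.6667, 2.0]
# [1.6667, 2.0, 2.3333, 2.6667]
# [2.3333, 2.6667, 3.0, 3.3333]
# [3.0, 3.3333, 3.6667, 4.0]
```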