[Deep Learning] Pooling Layer Functions and Their Inverses

Pooling Functions

Max Pooling Functions

  • 1D
    class torch.nn.MaxPool1d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
  • 2D
    class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
  • 3D
    class torch.nn.MaxPool3d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)

Applies 1D, 2D, or 3D max pooling over the input channels of the input signal.

Parameters:

  • kernel_size (int or tuple) - the size of the max pooling window
  • stride (int or tuple, optional) - the stride of the pooling window. Defaults to kernel_size
  • padding (int or tuple, optional) - the number of zero-padding layers added to each side of the input
  • dilation (int or tuple, optional) - a parameter controlling the spacing between elements in the window
  • return_indices - if True, also returns the indices of the maximum values, which is helpful for upsampling operations (see the sketch after this list)
  • ceil_mode - if True, uses ceiling instead of the default floor when computing the output size (also shown in the sketch below)
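
To make the last two flags concrete, here is a minimal sketch (assuming PyTorch 0.4+, where plain tensors can be fed to nn modules; the shapes in the comments are hand-checked against the formulas below):

import torch
import torch.nn as nn

x = torch.randn(1, 1, 10)  # (N, C, L_in)

# floor mode (default): L_out = floor((10 - 3)/2 + 1) = 4
# ceil mode:            L_out = ceil((10 - 3)/2 + 1) = 5
print(nn.MaxPool1d(3, stride=2)(x).shape)                  # torch.Size([1, 1, 4])
print(nn.MaxPool1d(3, stride=2, ceil_mode=True)(x).shape)  # torch.Size([1, 1, 5])

# return_indices=True additionally returns the argmax positions,
# which the MaxUnpool layers described later consume
out, idx = nn.MaxPool1d(3, stride=2, return_indices=True)(x)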

shape:

  • 1D max pooling
    Input: (N, C, L_in)
    Output: (N, C, L_out)
    L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)

  • 2D max pooling
    Input: (N, C, H_in, W_in)
    Output: (N, C, H_out, W_out)
    H_out = floor((H_in + 2*padding[0] - dilation[0]*(kernel_size[0] - 1) - 1) / stride[0] + 1)
    W_out = floor((W_in + 2*padding[1] - dilation[1]*(kernel_size[1] - 1) - 1) / stride[1] + 1)

  • 3D max pooling
    Input: (N, C, D_in, H_in, W_in)
    Output: (N, C, D_out, H_out, W_out)
    D_out = floor((D_in + 2*padding[0] - dilation[0]*(kernel_size[0] - 1) - 1) / stride[0] + 1)
    H_out = floor((H_in + 2*padding[1] - dilation[1]*(kernel_size[1] - 1) - 1) / stride[1] + 1)
    W_out = floor((W_in + 2*padding[2] - dilation[2]*(kernel_size[2] - 1) - 1) / stride[2] + 1)

Example:

1D max pooling

import torch
import torch.nn as nn
from torch.autograd import Variable

# pool of size=3, stride=2
m = nn.MaxPool1d(3, stride=2)
input = Variable(torch.randn(20, 16, 50))
output = m(input)  # shape: (20, 16, 24)

2D max pooling

# pool of square window of size=3, stride=2
m = nn.MaxPool2d(3, stride=2)
# pool of non-square window
m = nn.MaxPool2d((3, 2), stride=(2, 1))
input = Variable(torch.randn(20, 16, 50, 32))
output = m(input)  # shape: (20, 16, 24, 31)

3D max pooling

# pool of square window of size=3, stride=2
m = nn.MaxPool3d(3, stride=2)
# pool of non-square window
m = nn.MaxPool3d((3, 2, 2), stride=(2, 1, 2))
input = Variable(torch.randn(20, 16, 50, 44, 31))
output = m(input)  # shape: (20, 16, 24, 43, 15)
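
As a sanity check, the 2D shape formula can be evaluated by hand for the non-square window above (a sketch; plain arithmetic, no PyTorch required):

import math

# MaxPool2d((3, 2), stride=(2, 1)), padding=0, dilation=1, H_in=50, W_in=32
H_out = math.floor((50 + 2*0 - 1*(3 - 1) - 1) / 2 + 1)  # 24
W_out = math.floor((32 + 2*0 - 1*(2 - 1) - 1) / 1 + 1)  # 31
print(H_out, W_out)  # 24 31 -> output shape (20, 16, 24, 31)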

Average Pooling Functions

  • 1D
    class torch.nn.AvgPool1d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True)
  • 2D
    class torch.nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True)
  • 3D
    class torch.nn.AvgPool3d(kernel_size, stride=None)

Applies 1D, 2D, or 3D average pooling over the input channels of the input signal.

Parameters:
1D and 2D average pooling

  • kernel_size (int or tuple) - the size of the pooling window
  • stride (int or tuple, optional) - the stride of the pooling window. Defaults to kernel_size
  • padding (int or tuple, optional) - the number of zero-padding layers added to each side of the input
  • count_include_pad - if True, the zero padding is included when computing the average (see the sketch after the examples below)
  • ceil_mode - if True, uses ceiling instead of the default floor when computing the output size

3D average pooling

  • kernel_size (int or tuple) - the size of the pooling window
  • stride (int or tuple, optional) - the stride of the pooling window. Defaults to kernel_size

shape:

  • 1D average pooling
    input: (N, C, L_in)
    output: (N, C, L_out)
    L_out = floor((L_in + 2*padding - kernel_size) / stride + 1)

  • 2D average pooling
    input: (N, C, H_in, W_in)
    output: (N, C, H_out, W_out)
    H_out = floor((H_in + 2*padding[0] - kernel_size[0]) / stride[0] + 1)
    W_out = floor((W_in + 2*padding[1] - kernel_size[1]) / stride[1] + 1)

  • 3D average pooling
    input: (N, C, D_in, H_in, W_in)
    output: (N, C, D_out, H_out, W_out)
    D_out = floor((D_in + 2*padding[0] - kernel_size[0]) / stride[0] + 1)
    H_out = floor((H_in + 2*padding[1] - kernel_size[1]) / stride[1] + 1)
    W_out = floor((W_in + 2*padding[2] - kernel_size[2]) / stride[2] + 1)

Example:

1D average pooling

>>> # pool with window of size=3, stride=2
>>> m = nn.AvgPool1d(3, stride=2)
>>> m(Variable(torch.Tensor([[[1,2,3,4,5,6,7]]])))
Variable containing:
    (0 ,.,.) =
    2  4  6
    [torch.FloatTensor of size 1x1x3]

2D average pooling

>>> # pool of square window of size=3, stride=2
>>> m = nn.AvgPool2d(3, stride=2)
>>> # pool of non-square window
>>> m = nn.AvgPool2d((3, 2), stride=(2, 1))
>>> input = Variable(torch.randn(20, 16, 50, 32))
>>> output = m(input)  # shape: (20, 16, 24, 31)

3D average pooling

>>> # pool of square window of size=3, stride=2
>>> m = nn.AvgPool3d(3, stride=2)
>>> # pool of non-square window
>>> m = nn.AvgPool3d((3, 2, 2), stride=(2, 1, 2))
>>> input = Variable(torch.randn(20, 16, 50, 44, 31))
>>> output = m(input)  # shape: (20, 16, 24, 43, 15)
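
The count_include_pad flag only matters when padding > 0. A small, hand-checked sketch of the difference (kernel 3, stride 1, padding 1; the printed values are shown as comments, and the exact print format depends on the PyTorch version):

import torch
import torch.nn as nn
from torch.autograd import Variable

x = Variable(torch.Tensor([[[1., 2., 3.]]]))  # padded to [0, 1, 2, 3, 0]

# True (default): every window divides by kernel_size=3, zeros included
print(nn.AvgPool1d(3, stride=1, padding=1, count_include_pad=True)(x))
# -> [[[1.0000, 2.0000, 1.6667]]]

# False: edge windows divide only by the number of real elements
print(nn.AvgPool1d(3, stride=1, padding=1, count_include_pad=False)(x))
# -> [[[1.5000, 2.0000, 2.5000]]]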

Unpooling Functions

Approximate Inverse of Max Pooling

  • 1D
    class torch.nn.MaxUnpool1d(kernel_size, stride=None, padding=0)
  • 2D
    class torch.nn.MaxUnpool2d(kernel_size, stride=None, padding=0)
  • 3D
    class torch.nn.MaxUnpool3d(kernel_size, stride=None, padding=0)

Computes a partial inverse of MaxPool1d, MaxPool2d, and MaxPool3d. It is not a complete inverse, because the non-maximal values are lost during max pooling. MaxUnpool1d, MaxUnpool2d, and MaxUnpool3d take as input the output of the corresponding MaxPool layer, including the indices of the maximal values, and compute a partial inverse in which all non-maximal positions are set to zero.

Note:
MaxPool1d, MaxPool2d, and MaxPool3d can map several input sizes to the same output size, so the inversion is ambiguous. To accommodate this, the desired output size (output_size) can be passed as an extra argument in the forward call. See the inputs and examples below for the exact usage.

Parameters:

  • kernel_size (int or tuple) - the size of the max pooling window
  • stride (int or tuple, optional) - the stride of the max pooling window. Defaults to kernel_size
  • padding (int or tuple, optional) - the number of zero-padding layers that were added to each side of the input

Inputs:

  • input: the tensor to invert
  • indices: the indices returned by the corresponding MaxPool layer
  • output_size (optional): a torch.Size specifying the desired output size

shape:

  • 1D unpooling
    input: (N, C, L_in)
    output: (N, C, L_out)
    L_out = (L_in - 1)*stride[0] - 2*padding[0] + kernel_size[0]
    The output size can also be specified explicitly via output_size.
  • 2D unpooling
    input: (N, C, H_in, W_in)
    output: (N, C, H_out, W_out)
    H_out = (H_in - 1)*stride[0] - 2*padding[0] + kernel_size[0]
    W_out = (W_in - 1)*stride[1] - 2*padding[1] + kernel_size[1]
    The output size can also be specified explicitly via output_size.
  • 3D unpooling
    input: (N, C, D_in, H_in, W_in)
    output: (N, C, D_out, H_out, W_out)
    D_out = (D_in - 1)*stride[0] - 2*padding[0] + kernel_size[0]
    H_out = (H_in - 1)*stride[1] - 2*padding[1] + kernel_size[1]
    W_out = (W_in - 1)*stride[2] - 2*padding[2] + kernel_size[2]
    The output size can also be specified explicitly via output_size.

Example:

1D unpooling

>>> pool = nn.MaxPool1d(2, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool1d(2, stride=2)
>>> input = Variable(torch.Tensor([[[1, 2, 3, 4, 5, 6, 7, 8]]]))
>>> output, indices = pool(input)
>>> unpool(output, indices)
    Variable containing:
    (0 ,.,.) =
       0   2   0   4   0   6   0   8
    [torch.FloatTensor of size 1x1x8]

>>> # Example showcasing the use of output_size
>>> input = Variable(torch.Tensor([[[1, 2, 3, 4, 5, 6, 7, 8, 9]]]))
>>> output, indices = pool(input)
>>> unpool(output, indices, output_size=input.size())
    Variable containing:
    (0 ,.,.) =
       0   2   0   4   0   6   0   8   0
    [torch.FloatTensor of size 1x1x9]
>>> unpool(output, indices)
    Variable containing:
    (0 ,.,.) =
       0   2   0   4   0   6   0   8
    [torch.FloatTensor of size 1x1x8]

2D unpooling

>>> pool = nn.MaxPool2d(2, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool2d(2, stride=2)
>>> input = Variable(torch.Tensor([[[[ 1,  2,  3,  4],
...                                  [ 5,  6,  7,  8],
...                                  [ 9, 10, 11, 12],
...                                  [13, 14, 15, 16]]]]))
>>> output, indices = pool(input)
>>> unpool(output, indices)
    Variable containing:
    (0 ,0 ,.,.) =
       0   0   0   0
       0   6   0   8
       0   0   0   0
       0  14   0  16
    [torch.FloatTensor of size 1x1x4x4]

>>> # specify a different output size than input size
>>> unpool(output, indices, output_size=torch.Size([1, 1, 5, 5]))
    Variable containing:
    (0 ,0 ,.,.) =
       0   0   0   0   0
       6   0   8   0   0
       0   0   0  14   0
      16   0   0   0   0
       0   0   0   0   0
    [torch.FloatTensor of size 1x1x5x5]

3D unpooling

>>> # pool of square window of size=3, stride=2
>>> pool = nn.MaxPool3d(3, stride=2, return_indices=True)
>>> unpool = nn.MaxUnpool3d(3, stride=2)
>>> output, indices = pool(Variable(torch.randn(20, 16, 51, 33, 15)))
>>> unpooled_output = unpool(output, indices)
>>> unpooled_output.size()
torch.Size([20, 16, 51, 33, 15])
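
Since the indices returned by max pooling are exactly what the unpool layers need, a common pattern is to pair them for downsampling and later upsampling. A minimal sketch of that wiring (an illustration in the style of SegNet-like encoder/decoders, not part of the library API):

import torch
import torch.nn as nn

class PoolUnpoolBlock(nn.Module):
    """Downsample with MaxPool2d, then upsample back with MaxUnpool2d
    using the saved indices."""
    def __init__(self):
        super(PoolUnpoolBlock, self).__init__()
        self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)
        self.unpool = nn.MaxUnpool2d(2, stride=2)

    def forward(self, x):
        size = x.size()            # remember the input size (handles odd sizes)
        y, indices = self.pool(x)
        # ... intermediate encoder/decoder layers would operate on y here ...
        return self.unpool(y, indices, output_size=size)

block = PoolUnpoolBlock()
out = block(torch.randn(1, 3, 7, 7))
print(out.size())  # torch.Size([1, 3, 7, 7]), thanks to output_size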