Adaptive Pooling is a type of pooling layer provided by PyTorch. It comes in six forms:
Adaptive Max Pooling:
torch.nn.AdaptiveMaxPool1d(output_size)
torch.nn.AdaptiveMaxPool2d(output_size)
torch.nn.AdaptiveMaxPool3d(output_size)
Adaptive Average Pooling:
torch.nn.AdaptiveAvgPool1d(output_size)
torch.nn.AdaptiveAvgPool2d(output_size)
torch.nn.AdaptiveAvgPool3d(output_size)
See the official documentation for details.
Example from the official docs:
>>> import torch
>>> import torch.nn as nn
>>> # target output size of 5x7
>>> m = nn.AdaptiveMaxPool2d((5,7))
>>> input = torch.randn(1, 64, 8, 9)
>>> output = m(input)
>>> output.size()
torch.Size([1, 64, 5, 7])
>>> # target output size of 7x7 (square)
>>> m = nn.AdaptiveMaxPool2d(7)
>>> input = torch.randn(1, 64, 10, 9)
>>> output = m(input)
>>> output.size()
torch.Size([1, 64, 7, 7])
>>> # target output size of 10x7
>>> m = nn.AdaptiveMaxPool2d((None, 7))
>>> input = torch.randn(1, 64, 10, 9)
>>> output = m(input)
>>> output.size()
torch.Size([1, 64, 10, 7])
What makes Adaptive Pooling special is that the output tensor always has the specified output_size. For example, given an input tensor of size (1, 64, 8, 9) and an output size of (5, 7), the Adaptive Pooling layer produces a tensor of size (1, 64, 5, 7).
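This is what makes adaptive pooling convenient when inputs vary in spatial size: the same layer always emits a fixed-size feature map. A minimal sketch (the input sizes below are made up purely for illustration):

import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((5, 7))          # always produce 5x7 feature maps
for h, w in [(8, 9), (32, 17), (64, 64)]:    # arbitrary input sizes, chosen for illustration
    x = torch.randn(1, 64, h, w)
    print(tuple(pool(x).shape))              # (1, 64, 5, 7) every time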
If the pooling layer's kernel_size, padding, and stride are known, along with the input tensor size input_size, then the output tensor size output_size is:
output_size = (input_size + 2*padding - kernel_size)/stride + 1
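As a quick check of this formula (sizes made up for illustration): with input_size = 9, kernel_size = 3, padding = 0, and stride = 2, it gives (9 + 0 - 3)/2 + 1 = 4, which matches what nn.MaxPool1d actually produces:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 9)                                  # input_size = 9
pool = nn.MaxPool1d(kernel_size=3, stride=2, padding=0)
print(pool(x).shape)                                      # torch.Size([1, 1, 4])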
Rearranging the formula above gives:
kernel_size = (input_size + 2*padding) - (output_size - 1)*stride
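Continuing with the same numbers (again just an illustrative check, not part of the original example): input_size = 9, output_size = 4, stride = 2, padding = 0 recovers kernel_size = 3:

input_size, output_size = 9, 4
stride, padding = 2, 0

kernel_size = (input_size + 2 * padding) - (output_size - 1) * stride
print(kernel_size)                                              # 3

# Plugging kernel_size back into the forward formula recovers output_size:
print((input_size + 2 * padding - kernel_size) // stride + 1)   # 4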
To understand how Adaptive Pooling works, we need to find the layer's kernel_size, padding, and stride.
The following example shows the relationship between these three parameters:
>>> import math
>>> inputsize = 9
>>> outputsize = 4
>>> input = torch.randn(1, 1, inputsize)
>>> input
tensor([[[ 1.5695, -0.4357, 1.5179, 0.9639, -0.4226, 0.5312, -0.5689, 0.4945, 0.1421]]])
>>> m1 = nn.AdaptiveMaxPool1d(outputsize)
>>> m2 = nn.MaxPool1d(kernel_size=math.ceil(inputsize / outputsize), stride=math.floor(inputsize / outputsize), padding=0)
>>> output1 = m1(input)
>>> output2 = m2(input)
>>> output1
tensor([[[1.5695, 1.5179, 0.5312, 0.4945]]])
>>> output1.shape
torch.Size([1, 1, 4])
>>> output2
tensor([[[1.5695, 1.5179, 0.5312, 0.4945]]])
>>> output2.shape
torch.Size([1, 1, 4])
The experiment shows that:
stride = floor(input_size / output_size)
kernel_size = input_size - (output_size - 1)*stride
padding = 0
where ceil means rounding up and floor means rounding down.
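The comparison above can be packaged into a small helper (a sketch based on the relations just derived; build_equivalent_maxpool1d is our own name, not a PyTorch API). Keep in mind that this fixed-parameter construction matches nn.AdaptiveMaxPool1d for this (9 -> 4) case, while the adaptive layer itself computes per-window boundaries, as the C++ source below shows, so its windows need not all share a single kernel_size.

import math
import torch
import torch.nn as nn

def build_equivalent_maxpool1d(input_size: int, output_size: int) -> nn.MaxPool1d:
    # Fixed parameters derived from the relations found above.
    stride = math.floor(input_size / output_size)
    kernel_size = input_size - (output_size - 1) * stride
    return nn.MaxPool1d(kernel_size=kernel_size, stride=stride, padding=0)

x = torch.randn(1, 1, 9)
adaptive = nn.AdaptiveMaxPool1d(4)
fixed = build_equivalent_maxpool1d(9, 4)
print(torch.equal(adaptive(x), fixed(x)))    # True for this input/output pair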
Below is the relevant part of the C++ source for Adaptive Average Pooling:
template <typename scalar_t>
static void adaptive_avg_pool2d_out_frame(
    scalar_t *input_p,
    scalar_t *output_p,
    int64_t sizeD,
    int64_t isizeH,
    int64_t isizeW,
    int64_t osizeH,
    int64_t osizeW,
    int64_t istrideD,
    int64_t istrideH,
    int64_t istrideW)
{
  int64_t d;
#pragma omp parallel for private(d)
  for (d = 0; d < sizeD; d++)
  {
    /* loop over output */
    int64_t oh, ow;
    for(oh = 0; oh < osizeH; oh++)
    {
      int istartH = start_index(oh, osizeH, isizeH);
      int iendH = end_index(oh, osizeH, isizeH);
      int kH = iendH - istartH;

      for(ow = 0; ow < osizeW; ow++)
      {
        int istartW = start_index(ow, osizeW, isizeW);
        int iendW = end_index(ow, osizeW, isizeW);
        int kW = iendW - istartW;

        /* local pointers */
        scalar_t *ip = input_p + d*istrideD + istartH*istrideH + istartW*istrideW;
        scalar_t *op = output_p + d*osizeH*osizeW + oh*osizeW + ow;

        /* compute local average: */
        scalar_t sum = 0;
        int ih, iw;
        for(ih = 0; ih < kH; ih++)
        {
          for(iw = 0; iw < kW; iw++)
          {
            scalar_t val = *(ip + ih*istrideH + iw*istrideW);
            sum += val;
          }
        }

        /* set output to local average */
        *op = sum / kW / kH;
      }
    }
  }
}
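The helpers start_index and end_index compute each window's boundaries so that the input is split as evenly as possible across output_size positions, which is why the window sizes kH and kW can vary from window to window. Below is a minimal Python sketch of this windowing logic (assuming the usual floor/ceil boundary computation; adaptive_avg_pool2d_py is our illustration, not a PyTorch API), checked against nn.AdaptiveAvgPool2d:

import math
import torch
import torch.nn as nn

def start_index(o, osize, isize):
    # First input index covered by output position o (assumed floor-based boundary).
    return math.floor(o * isize / osize)

def end_index(o, osize, isize):
    # One past the last input index for output position o (assumed ceil-based boundary).
    return math.ceil((o + 1) * isize / osize)

def adaptive_avg_pool2d_py(x, osizeH, osizeW):
    # x: (N, C, H, W). Average each adaptively sized window, like the C++ loop above.
    N, C, isizeH, isizeW = x.shape
    out = torch.empty(N, C, osizeH, osizeW, dtype=x.dtype)
    for oh in range(osizeH):
        h0, h1 = start_index(oh, osizeH, isizeH), end_index(oh, osizeH, isizeH)
        for ow in range(osizeW):
            w0, w1 = start_index(ow, osizeW, isizeW), end_index(ow, osizeW, isizeW)
            out[:, :, oh, ow] = x[:, :, h0:h1, w0:w1].mean(dim=(-2, -1))
    return out

x = torch.randn(1, 64, 8, 9)
ref = nn.AdaptiveAvgPool2d((5, 7))(x)
print(torch.allclose(adaptive_avg_pool2d_py(x, 5, 7), ref, atol=1e-6))   # True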
Finally, note that PyTorch's adaptive pooling is not the same thing as ROI pooling; the two should not be confused.