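The following comes from Stack Overflow; it is summarized first, then quoted:
Input stride is the stride value we set when performing an ordinary convolution.
Output stride is the factor by which the input has shrunk after passing through several convolution and pooling operations. For example:
if the input image is 224 * 224 and the feature map after several convolution and pooling operations is 7 * 7, then the output stride is 224 / 7 = 32.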
Input stride is the stride of the filter: how far you shift the filter at each step.
Output stride is actually a nominal value. We get a feature map in a CNN after several convolution and max-pooling operations. Let's say our input image is 224 * 224 and our final feature map is 7 * 7.
Then we say our output stride is 224 / 7 = 32 (an approximation of what happened to the image after downsampling).
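
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the list of five stride-2 stages is a hypothetical network layout (not taken from the answer), and `output_stride` is a helper name introduced here. It relies on the fact that the overall downsampling factor is the product of the per-layer strides.

```python
from math import prod  # requires Python 3.8+


def output_stride(layer_strides):
    """Overall downsampling factor: the product of the per-layer strides."""
    return prod(layer_strides)


# Hypothetical network with five stride-2 stages (convolution or pooling).
# Stride-1 layers contribute a factor of 1, so they are left out.
strides = [2, 2, 2, 2, 2]

os_factor = output_stride(strides)     # 2 * 2 * 2 * 2 * 2
input_size = 224
feature_size = input_size // os_factor

print(os_factor, feature_size)  # prints: 32 7
```

Read the other way round, dividing the input size by the final feature-map size (224 / 7) recovers the same value of 32.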
This TensorFlow script describes what output stride is and how to use it in an FCN, which is the dense-prediction case.
For dense prediction, one uses inputs with spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In this case the feature maps at the ResNet output will have spatial shape [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1] and corners exactly aligned with the input image corners, which greatly facilitates alignment of the features to the image. Using as input [225, 225] images results in [8, 8] feature maps at the output of the last ResNet block.
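
The shape formula quoted above can be checked with a small sketch. The helper name `aligned_feature_size` is an assumption made for illustration and is not part of the TensorFlow script; it simply evaluates (size - 1) / output_stride + 1 and rejects sizes that are not a multiple of the output stride plus 1.

```python
def aligned_feature_size(size, output_stride=32):
    """Feature-map size for an input whose size is k * output_stride + 1."""
    if (size - 1) % output_stride != 0:
        raise ValueError(
            f"size {size} is not a multiple of {output_stride} plus 1"
        )
    return (size - 1) // output_stride + 1


# The two input sizes mentioned in the quoted text.
print(aligned_feature_size(225))  # prints: 8
print(aligned_feature_size(321))  # prints: 11
```

With a 224 * 224 input the helper raises an error, since 223 is not divisible by 32; that is the case where the simpler 224 / 32 = 7 calculation applies instead.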