# TensorFlow 1.x API (tf.placeholder and tf.layers)
import tensorflow as tf

inputs = tf.placeholder(dtype=tf.float32, shape=[1, 225, 225, 3], name='a')
output1 = tf.layers.conv2d(inputs=inputs, filters=3, kernel_size=3, strides=2, padding='SAME')
output11 = tf.layers.conv2d(inputs=inputs, filters=3, kernel_size=3, strides=2, padding='VALID')
output2 = tf.layers.conv2d(inputs=inputs, filters=3, kernel_size=3, strides=4, padding='SAME')
output22 = tf.layers.conv2d(inputs=inputs, filters=3, kernel_size=3, strides=4, padding='VALID')

# Resulting tensors:
# output1  <tf.Tensor 'conv2d_1/BiasAdd:0' shape=(1, 113, 113, 3) dtype=float32>
# output11 <tf.Tensor 'conv2d_2/BiasAdd:0' shape=(1, 112, 112, 3) dtype=float32>
# output2  <tf.Tensor 'conv2d_6/BiasAdd:0' shape=(1, 57, 57, 3) dtype=float32>
# output22 <tf.Tensor 'conv2d_4/BiasAdd:0' shape=(1, 56, 56, 3) dtype=float32>
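We use convolutional neural networks all the time, and the padding parameter is one of the things we really need to understand. We all know that padding="VALID" means no padding is applied to the convolution input. So what does padding="SAME" mean? With padding="SAME", the output size is ⌈w/strides⌉, where ⌈·⌉ means rounding up. For w=225 this gives 113 at stride 2 and 57 at stride 4, which matches the shapes above.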
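The output-size rule can be checked without TensorFlow. Below is a minimal sketch in plain Python; the helper name `conv_output_size` is made up for illustration, and it assumes a dilation rate of 1 and the usual formulas for the two padding modes (ceil(w / stride) for SAME, ceil((w - kernel + 1) / stride) for VALID).

```python
import math


def conv_output_size(w, kernel, stride, padding):
    """Output length along one spatial dimension (hypothetical helper, rate=1)."""
    if padding == "SAME":
        # SAME ignores the kernel size: only the input size and stride matter.
        return math.ceil(w / stride)
    if padding == "VALID":
        # VALID uses no padding, so the kernel must fit entirely inside the input.
        return math.ceil((w - kernel + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


if __name__ == "__main__":
    # Same settings as the tf.layers.conv2d calls above: w=225, kernel=3.
    for stride in (2, 4):
        for padding in ("SAME", "VALID"):
            print(stride, padding, conv_output_size(225, 3, stride, padding))
```

For the settings used above this should reproduce the spatial sizes 113, 112, 57 and 56.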
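While learning TensorFlow you will notice that when stride > 1, people usually implement their own SAME padding instead of relying on the SAME option built into conv2d.
The reason is that the SAME padding in conv2d depends on the input shape, the kernel size, and the stride. For example:
input=225, kernel=7, stride=2 -> padding = [3, 3]
input=224, kernel=7, stride=2 -> padding = [2, 3]
Different input sizes give different padding. In practical CNN work, however, the padding is usually expected to be fixed and independent of the input size, which is why people tend to define it themselves.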
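The two padding values above can be reproduced with a few lines of arithmetic. This is a sketch of the commonly documented SAME rule, not TensorFlow's actual source; the function name `tf_same_padding` is hypothetical and a dilation rate of 1 is assumed.

```python
import math


def tf_same_padding(input_size, kernel, stride):
    """Return (pad_begin, pad_end) along one dimension under the SAME rule."""
    out = math.ceil(input_size / stride)
    # Total padding needed so that `out` windows of width `kernel` fit.
    pad_total = max((out - 1) * stride + kernel - input_size, 0)
    pad_beg = pad_total // 2
    # When pad_total is odd, the extra element goes at the end.
    return pad_beg, pad_total - pad_beg


if __name__ == "__main__":
    for size in (225, 224):
        print(size, tf_same_padding(size, kernel=7, stride=2))
```

With an odd total (input=224) the padding becomes asymmetric, which is exactly the input-dependent behavior described above.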
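This behavior is also inconsistent with models trained in other frameworks. It is not the first time I have had to fix the padding by hand when loading a model released by someone else. It also causes pain for multi-backend frameworks such as Keras, because "SAME" does not mean the same thing on every backend. A Keras issue has been reported as one example of this.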
Below is the modified conv2d from slim:
slim = tf.contrib.slim  # TF 1.x only; relies on `import tensorflow as tf`


def conv2d_same(inputs, num_outputs, kernel_size, stride, rate=1, scope=None):
    """Strided 2-D convolution with 'SAME' padding.

    When stride > 1, then we do explicit zero-padding, followed by conv2d with
    'VALID' padding.

    Note that
        net = conv2d_same(inputs, num_outputs, 3, stride=stride)
    is equivalent to
        net = slim.conv2d(inputs, num_outputs, 3, stride=1, padding='SAME')
        net = subsample(net, factor=stride)
    whereas
        net = slim.conv2d(inputs, num_outputs, 3, stride=stride, padding='SAME')
    is different when the input's height or width is even, which is why we add the
    current function. For more details, see ResnetUtilsTest.testConv2DSameEven().

    Args:
        inputs: A 4-D tensor of size [batch, height_in, width_in, channels].
        num_outputs: An integer, the number of output filters.
        kernel_size: An int with the kernel_size of the filters.
        stride: An integer, the output stride.
        rate: An integer, rate for atrous convolution.
        scope: Scope.

    Returns:
        output: A 4-D tensor of size [batch, height_out, width_out, channels] with
            the convolution output.
    """
    if stride == 1:
        return slim.conv2d(inputs, num_outputs, kernel_size, stride=1, rate=rate,
                           padding='SAME', scope=scope)
    else:
        # The padding depends only on the kernel size and rate, not on the input.
        kernel_size_effective = kernel_size + (kernel_size - 1) * (rate - 1)
        pad_total = kernel_size_effective - 1
        pad_beg = pad_total // 2
        pad_end = pad_total - pad_beg
        inputs = tf.pad(inputs,
                        [[0, 0], [pad_beg, pad_end], [pad_beg, pad_end], [0, 0]])
        return slim.conv2d(inputs, num_outputs, kernel_size, stride=stride,
                           rate=rate, padding='VALID', scope=scope)
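The equivalence claimed in the docstring can be illustrated in one dimension with plain Python lists, without TensorFlow. This is only a toy sketch: the helpers `conv1d_valid`, `fixed_same` and `tf_style_same` are hypothetical names, the rate is assumed to be 1, and `x[::stride]` stands in for slim's `subsample`.

```python
import math


def conv1d_valid(x, w, stride):
    """1-D cross-correlation with no padding."""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k))
            for i in range(0, len(x) - k + 1, stride)]


def zero_pad(x, beg, end):
    return [0] * beg + list(x) + [0] * end


def fixed_same(x, w, stride):
    """Explicit padding as in conv2d_same: depends only on the kernel size."""
    pad_total = len(w) - 1
    pad_beg = pad_total // 2
    return conv1d_valid(zero_pad(x, pad_beg, pad_total - pad_beg), w, stride)


def tf_style_same(x, w, stride):
    """Input-dependent SAME padding, following the commonly documented rule."""
    out = math.ceil(len(x) / stride)
    pad_total = max((out - 1) * stride + len(w) - len(x), 0)
    pad_beg = pad_total // 2
    return conv1d_valid(zero_pad(x, pad_beg, pad_total - pad_beg), w, stride)


if __name__ == "__main__":
    w = [1, 2, 3]
    even = [1, 2, 3, 4, 5, 6]             # even-length input
    reference = fixed_same(even, w, 1)[::2]  # stride-1 SAME, then subsample
    print(reference)
    print(fixed_same(even, w, 2))         # matches the reference
    print(tf_style_same(even, w, 2))      # differs on even-length input
```

On the even-length input the explicitly padded version agrees with "stride-1 SAME followed by subsampling", while the input-dependent padding shifts the sampling grid by one element. On an odd-length input the two padding schemes coincide.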
Reference for this issue:
https://github.com/tensorflow/tensorflow/issues/18213