This post mainly covers the first two of these functions: tf.nn.conv2d and tf.contrib.layers.conv2d.
tf.contrib.layers.conv2d and tf.contrib.slim.conv2d are used in much the same way; both are high-level APIs, slim simply being a higher-level library. With slim's repeat function you can write a VGG16 network in just a few lines (a sketch follows below). tf.nn and tf.contrib.layers, on the other hand, are the basic ops: the most common and the most important.
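For instance, here is a minimal sketch (not a full network) of the first two VGG16 convolution blocks written with slim.repeat; the remaining blocks follow the same pattern with 256, 512 and 512 filters:

import tensorflow as tf
slim = tf.contrib.slim

def vgg16_conv_blocks(inputs):
    # slim.repeat applies the same op n times and auto-numbers the scopes
    # (conv1/conv1_1, conv1/conv1_2, ...)
    net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
    net = slim.max_pool2d(net, [2, 2], scope='pool1')
    net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
    net = slim.max_pool2d(net, [2, 2], scope='pool2')
    return net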
First, the parameters of tf.nn.conv2d():
tf.nn.conv2d(
    input,
    filter,
    strides,
    padding,
    use_cudnn_on_gpu=True,
    data_format='NHWC',
    dilations=[1, 1, 1, 1],
    name=None
)
The layout of input is determined by the data_format parameter; the default is 'NHWC', i.e. batch size, feature map height, feature map width, channels.
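For example, with the default layout, an input holding a (hypothetical) batch of 32 RGB images of size 28x28 looks like this:

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[32, 28, 28, 3])  # [N, H, W, C]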
filter is defined separately; the weights can be created with either tf.get_variable() or tf.Variable() (for the difference between the two, see my other post).
Here is an example that builds a convolution layer with tf.Variable and tf.nn.conv2d:
import tensorflow as tf

def create_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def create_bias(num_filters):
    # shape must be a list: one bias per filter
    return tf.Variable(tf.constant(0.05, shape=[num_filters]))

def create_conv_layer(input, num_channels, filter_size, num_filters):
    # the filter shape is [height, width, in_channels, out_channels]
    weights = create_weights(shape=[filter_size, filter_size, num_channels, num_filters])
    bias = create_bias(num_filters)
    # note: the argument is strides (plural), and tf.nn.conv2d requires uppercase padding
    layer = tf.nn.conv2d(input=input, filter=weights, strides=[1, 1, 1, 1], padding='SAME')
    layer += bias
    layer = tf.nn.max_pool(layer, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    return layer
That is the most basic way to build a convolution layer.
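A quick usage sketch (the shapes are made up) of the layer above:

x = tf.placeholder(tf.float32, shape=[None, 28, 28, 3])
conv1 = create_conv_layer(x, num_channels=3, filter_size=3, num_filters=64)
# the SAME conv keeps 28x28; the 2x2 max pool halves it, so conv1 is [None, 14, 14, 64]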
Next, let's look at creating a convolution layer with tf.contrib.layers.conv2d.
tf.contrib.layers.conv2d(
    inputs,                    # 4-D tensor in 'NHWC' layout
    num_outputs,               # number of filters
    kernel_size,               # [k, k]
    stride=1,                  # defaults to 1, i.e. [1, 1]
    padding='SAME',            # 'VALID' or 'SAME'
    data_format=None,          # defaults to 'NHWC'
    rate=1,                    # dilation rate
    activation_fn=tf.nn.relu,  # relu by default; e.g. tf.nn.leaky_relu also works
    normalizer_fn=None,
    normalizer_params=None,
    weights_initializer=tf.contrib.layers.xavier_initializer(),
    weights_regularizer=None,
    biases_initializer=tf.zeros_initializer(),
    biases_regularizer=None,
    reuse=None,
    variables_collections=None,
    outputs_collections=None,
    trainable=True,
    scope=None
)
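For comparison, here is a minimal sketch (hypothetical shapes) of the conv + pool block from the tf.nn example above, collapsed into the high-level calls; the filter variable, bias add, and relu are all handled internally:

x = tf.placeholder(tf.float32, [None, 28, 28, 3])  # hypothetical NHWC input
net = tf.contrib.layers.conv2d(x, num_outputs=64, kernel_size=[3, 3],
                               stride=1, padding='SAME')  # weights, bias and relu built in
net = tf.contrib.layers.max_pool2d(net, [2, 2], padding='SAME')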
A note on tf.contrib.layers.xavier_initializer():
it works a bit better than tf.truncated_normal_initializer (though some experiments show the two perform about the same).
The xavier initializer is designed to keep the scale of the gradients roughly the same in all layers.
With a uniform distribution, tf.contrib.layers.xavier_initializer(uniform=True), this ends up being the range [-x, x] with x = sqrt(6. / (in + out)).
With a normal distribution, tf.contrib.layers.xavier_initializer(uniform=False), a standard deviation of sqrt(2. / (in + out)) is used, which matches the variance of the uniform case.
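As a worked example (layer sizes made up): for a 5x5 convolution with 3 input and 32 output channels, the fans are computed from the full filter shape, so:

import math

fan_in = 5 * 5 * 3     # receptive field * in_channels = 75
fan_out = 5 * 5 * 32   # receptive field * out_channels = 800
limit = math.sqrt(6.0 / (fan_in + fan_out))   # uniform range [-limit, limit], ~0.083
stddev = math.sqrt(2.0 / (fan_in + fan_out))  # normal stddev, ~0.048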
The following articles cover the xavier initializer and other initializers in more detail:
1 why xavier and He initializer
2 a chat about weights initializers
3 this one is also good
A simple example with tf.contrib.layers.conv2d():
def mynet(input, reuse=False):
    with tf.variable_scope("conv1") as scope:
        net = tf.contrib.layers.conv2d(input, 32, [5, 5], padding='SAME',
                                       weights_initializer=tf.contrib.layers.xavier_initializer(uniform=False),
                                       scope=scope, reuse=reuse)
        net = tf.contrib.layers.max_pool2d(net, [2, 2], padding='SAME')
    with tf.variable_scope("conv2") as scope:
        ...
    net = tf.contrib.layers.flatten(net)
    return net
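The reuse flag lets a second call share the first call's weights, e.g. in a (hypothetical) Siamese-style setup:

x1 = tf.placeholder(tf.float32, [None, 28, 28, 1])
x2 = tf.placeholder(tf.float32, [None, 28, 28, 1])
out1 = mynet(x1)              # creates the conv1/conv2 variables
out2 = mynet(x2, reuse=True)  # reuses the same variables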