When doing convolution on MNIST, we only need to specify the inputs, num_outputs, kernel_size, and scope parameters, for example:
conv1 = tf.contrib.layers.conv2d(inputs, 4, [5, 5], scope='conv_layer1')
# stride defaults to 1; weights and biases also use their default initializers.
# Note that scope must be passed as a keyword argument: the fourth positional parameter of conv2d is stride, not scope.
Defining the pooling layer
You can use tf.contrib.layers.max_pool2d or tf.contrib.layers.avg_pool2d:
max_pool2d(inputs, kernel_size, stride=2, padding='VALID', data_format=DATA_FORMAT_NHWC, outputs_collections=None, scope=None)
- inputs: the output of the convolution;
- kernel_size: pool_size might be a more fitting name; [kernel_height, kernel_width], or a single integer;
- stride: [stride_height, stride_width], though the docs say the two values must currently be equal;
- padding: the default here is VALID, unlike the conv default of SAME; why the difference? (See the shape sketch after the example below.)
- data_format: note that this is the same as for convolutions;
- outputs_collections: …
- scope: pooling has no trainable parameters, so do we even need a scope?
pool1 = tf.contrib.layers.max_pool2d(conv1, [2, 2], padding='SAME')
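Regarding the padding question above, here is a tiny shape sketch (using a made-up 5x5 single-channel input): with the default stride of 2, 'SAME' rounds the output size up, while 'VALID' keeps only complete windows:
import tensorflow as tf
x = tf.ones([1, 5, 5, 1])  # toy 5x5 single-channel input
print(tf.contrib.layers.max_pool2d(x, [2, 2], padding='SAME').get_shape())   # (1, 3, 3, 1): 5/2 rounded up
print(tf.contrib.layers.max_pool2d(x, [2, 2], padding='VALID').get_shape())  # (1, 2, 2, 1): partial windows dropped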
Defining the fully connected layer
tf.contrib.layers provides a ready-made fully connected method:
fully_connected(inputs, num_outputs, activation_fn=nn.relu, normalizer_fn=None, normalizer_params=None, weights_initializer=initializers.xavier_initializer(), weights_regularizer=None, biases_initializer=init_ops.zeros_initializer, biases_regularizer=None, reuse=None, variables_collections=None, outputs_collections=None, trainable=True, scope=None)
Looking at this function, many of its parameters are the same as conv2d's; we can use it like this:
fc = tf.contrib.layers.fully_connected(inputs, 1024, scope='fc_layer')
The only thing to watch is the inputs parameter: it is generally two-dimensional, [batch_size, depth], whereas the result of the preceding convolutions is generally of the form [batch_size, height, width, channels], so a flatten operation is needed before passing it to fully_connected, as sketched below.
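A minimal sketch of that flatten step, assuming pool2 is the [batch_size, height, width, channels] output of the last pooling layer (the names here are placeholders):
import numpy as np
flat_dim = int(np.prod(pool2.get_shape().as_list()[1:]))  # height * width * channels
pool2_flat = tf.reshape(pool2, [-1, flat_dim])            # -1 lets TensorFlow infer batch_size
# tf.contrib.layers.flatten(pool2) does the same reshape in a single call
fc = tf.contrib.layers.fully_connected(pool2_flat, 1024, scope='fc_layer')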
A dropout usually follows the fc layer; it can be done with:
dropout(inputs, keep_prob=0.5, noise_shape=None, is_training=True, outputs_collections=None, scope=None)
The meaning of the parameters is mostly obvious; is_training deserves attention: pass True during training and False otherwise.
Dropout means that during training, neural network units are temporarily dropped from the network with a certain probability. Note "temporarily": since units are dropped at random, under stochastic gradient descent each mini-batch effectively trains a different network.
Dropout is one of the most powerful weapons in CNNs for preventing overfitting and improving results.
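For intuition, a small sketch with a toy tensor: tf.contrib.layers.dropout implements inverted dropout, so surviving units are scaled by 1/keep_prob during training, and the op is an identity when is_training=False:
import tensorflow as tf
x = tf.ones([1, 4])
train_out = tf.contrib.layers.dropout(x, keep_prob=0.5, is_training=True)   # kept entries become 2.0
infer_out = tf.contrib.layers.dropout(x, keep_prob=0.5, is_training=False)  # identity: all entries stay 1.0
with tf.Session() as sess:
    print(sess.run([train_out, infer_out]))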
Defining the logits
After the fully connected layers, classification is usually done with softmax; then a loss is defined and training can begin. But the official example adds one more step before the softmax, computing something called logits, with this explanation in the code:
We don’t apply softmax here because
tf.nn.softmax_cross_entropy_with_logits accepts the unscaled logits
and performs the softmax internally for efficiency.
The reason is numerical stability and efficiency: fusing the softmax with the cross-entropy avoids explicitly materializing (and potentially overflowing) the intermediate probabilities. Defining the logits themselves is simple anyway: a linear transformation that maps the FC output onto the number of classes:
def inference(x, num_class):
    with tf.variable_scope('softmax'):
        dtype = x.dtype.base_dtype
        # Set up the requested initialization.
        init_mean = 0.0
        init_stddev = 0.0
        weights = tf.get_variable('weights', [x.get_shape()[1], num_class],
                                  initializer=init_ops.random_normal_initializer(init_mean, init_stddev, dtype=dtype),
                                  dtype=dtype)
        biases = tf.get_variable('bias', [num_class],
                                 initializer=init_ops.random_normal_initializer(init_mean, init_stddev, dtype=dtype),
                                 dtype=dtype)
        logits = tf.nn.xw_plus_b(x, weights, biases)
        return logits
Defining the loss
There are some predefined loss functions under tf.contrib.losses; for example, you can directly use
softmax_cross_entropy(logits, onehot_labels, weights=_WEIGHT_SENTINEL, label_smoothing=0, scope=None)
Note that the labels here are in one-hot format; the labels we get from mnist must be converted into that format, as in the sketch below.
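A minimal sketch of that conversion, assuming labels is a batch of integer class ids (0-9) from mnist and logits comes from the inference function above:
onehot_labels = tf.one_hot(labels, 10, 1, 0)  # depth 10, on_value=1, off_value=0
loss = tf.contrib.losses.softmax_cross_entropy(logits, onehot_labels)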
Defining the train_op
You can use tf.contrib.layers.optimize_loss; by passing different arguments you invoke different optimization methods.
optimize_loss(loss,
global_step,
learning_rate,
optimizer,
gradient_noise_scale=None,
gradient_multipliers=None,
clip_gradients=None,
learning_rate_decay_fn=None,
update_ops=None,
variables=None,
name=None,
summaries=None,
colocate_gradients_with_ops=False):
The predefined optimizers are:
OPTIMIZER_CLS_NAMES = {
"Adagrad": train.AdagradOptimizer,
"Adam": train.AdamOptimizer,
"Ftrl": train.FtrlOptimizer,
"Momentum": train.MomentumOptimizer,
"RMSProp": train.RMSPropOptimizer,
"SGD": train.GradientDescentOptimizer,
}
For example, you can write:
train_op = tf.contrib.layers.optimize_loss(
loss, tf.contrib.framework.get_global_step(), optimizer='Adagrad', learning_rate=0.1)
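Besides a string name, optimize_loss also accepts an optimizer class, an instance, or a function of the learning rate (per the contrib docs), so optimizer-specific settings can be passed through; a sketch using Momentum, with placeholder hyperparameters:
train_op = tf.contrib.layers.optimize_loss(
    loss, tf.contrib.framework.get_global_step(),
    learning_rate=0.01,
    optimizer=lambda lr: tf.train.MomentumOptimizer(lr, momentum=0.9))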
The model and the Estimator
Combining all of the above, we can define the model and use an Estimator to handle training, prediction, and so on. The complete program is as follows:
import numpy as np  # needed for np.prod in the model function below
import sklearn.metrics as metrics
import tensorflow as tf
from PIL import Image
from tensorflow.contrib import learn
from tensorflow.contrib.learn import SKCompat
from tensorflow.contrib.learn.python.learn.estimators import model_fn as model_fn_lib
from tensorflow.python.ops import init_ops

IMAGE_SIZE = 28
LOG_DIR = './ops_logs'

mnist = learn.datasets.load_dataset('mnist')
def inference(x, num_class):
    with tf.variable_scope('softmax'):
        dtype = x.dtype.base_dtype
        init_mean = 0.0
        init_stddev = 0.0
        weights = tf.get_variable('weights', [x.get_shape()[1], num_class],
                                  initializer=init_ops.random_normal_initializer(init_mean, init_stddev, dtype=dtype),
                                  dtype=dtype)
        biases = tf.get_variable('bias', [num_class],
                                 initializer=init_ops.random_normal_initializer(init_mean, init_stddev, dtype=dtype),
                                 dtype=dtype)
        logits = tf.nn.xw_plus_b(x, weights, biases)
        return logits
def model(features, labels, mode):
    if mode != model_fn_lib.ModeKeys.INFER:
        labels = tf.one_hot(labels, 10, 1, 0)
    else:
        labels = None
    inputs = tf.reshape(features, (-1, IMAGE_SIZE, IMAGE_SIZE, 1))
    # conv1
    conv1 = tf.contrib.layers.conv2d(inputs, 4, [5, 5], scope='conv_layer1', activation_fn=tf.nn.tanh)
    pool1 = tf.contrib.layers.max_pool2d(conv1, [2, 2], padding='SAME')
    # conv2
    conv2 = tf.contrib.layers.conv2d(pool1, 6, [5, 5], scope='conv_layer2', activation_fn=tf.nn.tanh)
    pool2 = tf.contrib.layers.max_pool2d(conv2, [2, 2], padding='SAME')
    # flatten before the fully connected layer
    pool2_shape = pool2.get_shape()
    pool2_in_flat = tf.reshape(pool2, [pool2_shape[0].value or -1, np.prod(pool2_shape[1:]).value])
    # fc
    fc1 = tf.contrib.layers.fully_connected(pool2_in_flat, 1024, scope='fc_layer1', activation_fn=tf.nn.tanh)
    # dropout (only active during training)
    is_training = False
    if mode == model_fn_lib.ModeKeys.TRAIN:
        is_training = True
    dropout = tf.contrib.layers.dropout(fc1, keep_prob=0.5, is_training=is_training, scope='dropout')
    logits = inference(dropout, 10)
    prediction = tf.nn.softmax(logits)
    if mode != model_fn_lib.ModeKeys.INFER:
        loss = tf.contrib.losses.softmax_cross_entropy(logits, labels)
        train_op = tf.contrib.layers.optimize_loss(
            loss, tf.contrib.framework.get_global_step(), optimizer='Adagrad',
            learning_rate=0.1)
    else:
        train_op = None
        loss = None
    return {'class': tf.argmax(prediction, 1), 'prob': prediction}, loss, train_op
classifier = SKCompat(learn.Estimator(model_fn=model, model_dir=LOG_DIR))
classifier.fit(mnist.train.images, mnist.train.labels, steps=1000, batch_size=300)
predictions = classifier.predict(mnist.test.images)
score = metrics.accuracy_score(mnist.test.labels, predictions['class'])
print('Accuracy: {0:f}'.format(score))