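1. Network Architecture
The network consists of 2 convolutional layers, 2 pooling layers, and 2 fully connected layers:
Input → Conv → ReLU → Max pooling → Conv → ReLU → Max pooling → FC → Output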
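To make the pipeline concrete, the following minimal sketch traces the tensor shape (ignoring the batch dimension) through each stage. It assumes the layer sizes used in this article (28x28x1 input, 32 and 64 filters, "same" padding for convolutions, 2x2 pooling with stride 2, 1024 dense units, 10 outputs); the helper names are hypothetical and not part of TensorFlow.

```python
import math


def conv_same(shape, filters, stride=1):
    """Output shape of a convolution with 'same' zero padding."""
    height, width, _ = shape
    return (math.ceil(height / stride), math.ceil(width / stride), filters)


def max_pool(shape, pool=2, stride=2):
    """Output shape of max pooling without padding."""
    height, width, channels = shape
    return ((height - pool) // stride + 1, (width - pool) // stride + 1, channels)


def trace_shapes(input_shape=(28, 28, 1)):
    """Return (layer name, output shape) pairs for the network in this article."""
    shapes = [("input", input_shape)]
    shape = conv_same(input_shape, filters=32)
    shapes.append(("conv1", shape))
    shape = max_pool(shape)
    shapes.append(("pool1", shape))
    shape = conv_same(shape, filters=64)
    shapes.append(("conv2", shape))
    shape = max_pool(shape)
    shapes.append(("pool2", shape))
    flat = shape[0] * shape[1] * shape[2]
    shapes.append(("flatten", (flat,)))
    shapes.append(("dense", (1024,)))
    shapes.append(("logits", (10,)))
    return shapes


for name, shape in trace_shapes():
    print(name, shape)
```

The printed shapes should match the "Input Tensor Shape" and "Output Tensor Shape" comments attached to each layer in the sections that follow.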
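- Input
A 4-D tensor of shape [batch_size, image_width, image_height, channels], giving the mini-batch size used for gradient descent, the image width, the image height, and the number of channels (3 for color images [Red, Green, Blue], 1 for monochrome images). The -1 in the reshape tells TensorFlow to infer batch_size from the number of input values.
# Assumes TensorFlow 1.x, which provides the tf.layers and tf.contrib APIs used throughout
import tensorflow as tf
from tensorflow.contrib import learn

# Input Layer
# Reshape X to 4-D tensor: [batch_size, width, height, channels]
# MNIST images are 28x28 pixels, and have one color channel
input_layer = tf.reshape(features, [-1, 28, 28, 1])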
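As an illustration of how a -1 dimension in a reshape is resolved, here is a small pure-Python sketch. The function resolve_shape is a hypothetical helper written for this example, not a TensorFlow API; it only mirrors the rule that at most one dimension may be -1 and that its size is whatever makes the element count match.

```python
def resolve_shape(num_elements, shape):
    """Replace a single -1 in `shape` with the size implied by num_elements."""
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if -1 not in shape:
        if known != num_elements:
            raise ValueError("shape does not match the number of elements")
        return list(shape)
    if known == 0 or num_elements % known != 0:
        raise ValueError("cannot infer the -1 dimension")
    return [num_elements // known if dim == -1 else dim for dim in shape]


# A batch of 100 flattened 28x28 images holds 100 * 784 values.
print(resolve_shape(100 * 784, [-1, 28, 28, 1]))  # [100, 28, 28, 1]
```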
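- Convolutional Layer #1
Applies 32 5x5 filters (kernels) to the input image, so each output value depends only on a local 5x5 patch of the input (local receptive field) and the output has 32 channels. The input is zero-padded (padding="same") so that the output keeps the input's width and height, and ReLU is used as the activation function to introduce non-linearity.
# Convolutional Layer #1
# Computes 32 features using a 5x5 filter with ReLU activation.
# Padding is added to preserve width and height.
# Input Tensor Shape: [batch_size, 28, 28, 1]
# Output Tensor Shape: [batch_size, 28, 28, 32]
conv1 = tf.layers.conv2d(
    inputs=input_layer,
    filters=32,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)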
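The effect of zero padding plus ReLU can be seen in a tiny single-channel sketch. This is a simplified illustration under stated assumptions (one input channel, one filter, stride 1, odd kernel size, no bias); conv2d_same_relu is a hypothetical name, and the real layer additionally sums over input channels and learns 32 such filters.

```python
def conv2d_same_relu(image, kernel):
    """Slide `kernel` over `image` with zero padding, then apply ReLU.

    Like tf.layers.conv2d, this computes a cross-correlation
    (the kernel is not flipped). Assumes an odd-sized kernel.
    """
    height, width = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    pad_h, pad_w = k_h // 2, k_w // 2
    output = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = 0.0
            for di in range(k_h):
                for dj in range(k_w):
                    y, x = i + di - pad_h, j + dj - pad_w
                    # Positions outside the image count as zeros (zero padding).
                    if 0 <= y < height and 0 <= x < width:
                        acc += image[y][x] * kernel[di][dj]
            output[i][j] = max(acc, 0.0)  # ReLU
    return output


image = [[1.0] * 4 for _ in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]
result = conv2d_same_relu(image, kernel)
print(len(result), len(result[0]))  # output keeps the 4x4 input size
```

Corner outputs are smaller than interior ones because part of the kernel window falls on the zero padding.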
- Pooling Layer #1
Applies 2x2 max pooling with stride 2 to the output of convolutional layer #1. This halves the width and height (28x28 to 14x14), which reduces the amount of data passed to later layers and helps limit overfitting.
# Pooling Layer #1
# First max pooling layer with a 2x2 filter and stride of 2
# Input Tensor Shape: [batch_size, 28, 28, 32]
# Output Tensor Shape: [batch_size, 14, 14, 32]
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
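A minimal sketch of 2x2 max pooling with stride 2 on a single feature map follows. The function name max_pool_2x2 is hypothetical; the real layer applies the same operation independently to each of the 32 channels.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value in each non-overlapping 2x2 window."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[i][j],
                feature_map[i][j + 1],
                feature_map[i + 1][j],
                feature_map[i + 1][j + 1],
            )
            for j in range(0, width - 1, 2)
        ]
        for i in range(0, height - 1, 2)
    ]


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(max_pool_2x2(feature_map))  # [[6, 8], [14, 16]]
```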
- Convolutional Layer #2
Applies 64 5x5 filters (kernels) to the output of pooling layer #1, again with ReLU as the activation function.
# Convolutional Layer #2
# Computes 64 features using a 5x5 filter.
# Padding is added to preserve width and height.
# Input Tensor Shape: [batch_size, 14, 14, 32]
# Output Tensor Shape: [batch_size, 14, 14, 64]
conv2 = tf.layers.conv2d(
    inputs=pool1,
    filters=64,
    kernel_size=[5, 5],
    padding="same",
    activation=tf.nn.relu)
- Pooling Layer #2
Applies 2x2 max pooling with stride 2 to the output of convolutional layer #2.
# Pooling Layer #2
# Second max pooling layer with a 2x2 filter and stride of 2
# Input Tensor Shape: [batch_size, 14, 14, 64]
# Output Tensor Shape: [batch_size, 7, 7, 64]
pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)
- Fully Connected Layer #1
First flattens the output of pooling layer #2 into a 2-D matrix of shape [batch_size, 7*7*64], then feeds it into a fully connected layer with 1024 neurons. Dropout with rate=0.4 is applied to the result: during training each activation is dropped with probability 0.4 (kept with probability 0.6), which helps avoid overfitting; outside training mode the dropout layer passes its input through unchanged.
# Flatten tensor into a batch of vectors
# Input Tensor Shape: [batch_size, 7, 7, 64]
# Output Tensor Shape: [batch_size, 7 * 7 * 64]
pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
# Dense Layer
# Densely connected layer with 1024 neurons
# Input Tensor Shape: [batch_size, 7 * 7 * 64]
# Output Tensor Shape: [batch_size, 1024]
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
# Add dropout operation; 0.6 probability that element will be kept
dropout = tf.layers.dropout(
    inputs=dense, rate=0.4, training=mode == learn.ModeKeys.TRAIN)
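The behaviour of dropout in training versus inference can be sketched in plain Python. This is an illustrative version of "inverted" dropout, in which kept activations are scaled by 1 / (1 - rate) so their expected value is unchanged; the function and its rng parameter are hypothetical names introduced for this example.

```python
import random


def dropout(values, rate, training, rng=None):
    """Randomly zero activations during training; identity otherwise."""
    if not training or rate == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep_prob = 1.0 - rate
    # Kept values are scaled up so the expected activation stays the same.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


activations = [1.0] * 10
print(dropout(activations, rate=0.4, training=False))  # unchanged
print(dropout(activations, rate=0.4, training=True, rng=random.Random(0)))
```

With rate=0.4, roughly 60% of the activations survive each training step, and a different random subset is dropped every time.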
- Output
10 neurons, one for each digit 0-9. The layer outputs raw logits; no activation function is applied here.
# Logits layer
# Input Tensor Shape: [batch_size, 1024]
# Output Tensor Shape: [batch_size, 10]
logits = tf.layers.dense(inputs=dropout, units=10)
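As a sanity check on the architecture, the number of trainable parameters can be computed by hand. The sketch below assumes every layer has a bias term (the default for tf.layers.conv2d and tf.layers.dense); the helper names are hypothetical.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


layer_params = {
    "conv1": conv_params(5, 5, 1, 32),
    "conv2": conv_params(5, 5, 32, 64),
    "dense": dense_params(7 * 7 * 64, 1024),
    "logits": dense_params(1024, 10),
}
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(name, count)
print("total", total_params)
```

Almost all of the parameters sit in the first fully connected layer, which is one reason dropout is applied there.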
2. Model Training
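- One-hot encode the labels
# The two tf.one_hot arguments used here:
# indices: the positions that are set to 1 after one-hot encoding (all others are 0)
# depth: the number of target classes (for handwritten digit recognition the targets are 0-9, so depth=10)
onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)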
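For readers unfamiliar with the encoding, a pure-Python equivalent might look like the following sketch. The function one_hot here is a stand-in written for illustration, not the TensorFlow implementation.

```python
def one_hot(indices, depth):
    """Turn each class index into a vector with a single 1.0 at that index."""
    return [[1.0 if k == index else 0.0 for k in range(depth)] for index in indices]


print(one_hot([3, 0], depth=10))
```

Each label becomes a row of length 10 whose only non-zero entry marks the digit.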
- Compute the cross-entropy loss:
loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
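The quantity being computed can be written out in a few lines of plain Python. This sketch assumes unweighted examples and averages the per-example losses over the batch; it uses the log-sum-exp trick for numerical stability and is an illustration rather than TensorFlow's own code.

```python
import math


def softmax_cross_entropy(onehot_labels, logits):
    """Mean over the batch of -sum(y * log(softmax(logits)))."""
    losses = []
    for label_row, logit_row in zip(onehot_labels, logits):
        largest = max(logit_row)
        # log(sum(exp(z))) computed stably by factoring out the largest logit.
        log_sum_exp = largest + math.log(sum(math.exp(z - largest) for z in logit_row))
        losses.append(-sum(y * (z - log_sum_exp) for y, z in zip(label_row, logit_row)))
    return sum(losses) / len(losses)


# With identical logits for all 10 classes, the loss equals ln(10).
uniform_loss = softmax_cross_entropy([[1.0] + [0.0] * 9], [[0.0] * 10])
print(uniform_loss)
```

A freshly initialized 10-class classifier typically starts near this ln(10) value, which is a handy baseline when reading training logs.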
- Configure the training op, with a learning rate of 0.001 and stochastic gradient descent (SGD) as the optimizer:
train_op = tf.contrib.layers.optimize_loss(
    loss=loss,
    global_step=tf.contrib.framework.get_global_step(),
    learning_rate=0.001,
    optimizer="SGD")
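The update rule behind the "SGD" optimizer is simple enough to sketch directly. The example below applies it to a toy one-parameter problem, f(w) = (w - 3)^2, chosen purely for illustration; in the real model the gradients come from backpropagation on a mini-batch, which is what makes the method stochastic.

```python
def sgd_step(params, grads, learning_rate=0.001):
    """One gradient descent update: w <- w - learning_rate * grad."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Toy problem: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
weights = [0.0]
for _ in range(5000):
    gradient = [2.0 * (weights[0] - 3.0)]
    weights = sgd_step(weights, gradient)

print(weights[0])  # approaches the minimizer w = 3
```

With a learning rate as small as 0.001, each step moves the parameters only slightly, which is why the tutorial trains for many thousands of steps.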
- Model predictions
# Generate Predictions
# classes: the predicted class, a value in 0-9
# probabilities: the probability of each class, obtained by applying softmax to the logits
predictions = {
    "classes": tf.argmax(input=logits, axis=1),
    "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
}
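To show how logits turn into the two prediction fields, here is a small sketch for a single example. The function names are hypothetical, and the version here works on one list of logits rather than a batch.

```python
import math


def softmax(logits):
    """Convert logits into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the most likely class index and the class probabilities."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return {"classes": best, "probabilities": softmax(logits)}


result = predict([0.1, 2.0, -1.0])
print(result["classes"], result["probabilities"])
```

Because softmax preserves the ordering of the logits, the predicted class is the same whether argmax is taken over the logits or over the probabilities.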
- Create the Estimator, which returns a classifier that can be trained and evaluated
# Create the Estimator
# cnn_model_fn is essentially a wrapper around all of the code above; see: https://www.tensorflow.org/tutorials/layers#building_the_cnn_mnist_classifier
mnist_classifier = learn.Estimator(
    model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")
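- Train:
# Train the model
# train_data, train_labels and logging_hook are assumed to be defined beforehand, as in the original tutorial
mnist_classifier.fit(
    x=train_data,
    y=train_labels,
    batch_size=100,
    steps=20000,
    monitors=[logging_hook])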
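It can help to translate steps and batch_size into passes over the data. The sketch below does that arithmetic; the training-set size of 55,000 images is an assumption (this article does not state it), so substitute the size of your own train_data.

```python
def epochs_seen(steps, batch_size, num_train_examples):
    """Approximate number of passes over the training set."""
    return steps * batch_size / num_train_examples


examples_processed = 20000 * 100
# num_train_examples=55000 is an assumed value, not taken from this article.
approx_epochs = epochs_seen(20000, 100, num_train_examples=55000)

print(examples_processed)
print(round(approx_epochs, 1))
```

Under that assumption, 20000 steps of 100 examples each correspond to roughly 36 passes over the training set.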
3. Model Evaluation
- Configure the evaluation metric and run the evaluation
# Configure the accuracy metric for evaluation
metrics = {
    "accuracy":
        learn.MetricSpec(
            metric_fn=tf.metrics.accuracy, prediction_key="classes"),
}
# Evaluate the model and print results
eval_results = mnist_classifier.evaluate(
    x=eval_data, y=eval_labels, metrics=metrics)
print(eval_results)
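The accuracy metric itself is just the fraction of predictions that match the labels. A minimal sketch, with a hypothetical function name and made-up sample values:

```python
def accuracy(labels, predicted_classes):
    """Fraction of examples whose predicted class equals the true label."""
    if len(labels) != len(predicted_classes):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(1 for y, p in zip(labels, predicted_classes) if y == p)
    return correct / len(labels)


# Illustrative values only: three of the four predictions are correct.
print(accuracy([7, 2, 1, 0], [7, 2, 1, 4]))  # 0.75
```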
4. Source Code
Original article: https://www.tensorflow.org/tutorials/layers