Designing a Neural Network for MNIST Handwritten Digit Recognition (Version 3: ResNet residual network)
This project is TensorFlow + ResNet + MNIST.
Version 3 is designed around a residual network; the ResNet paper can be downloaded here (link). The structure I designed is 1 convolutional layer + 6 shortcut blocks + 2 fully connected layers. The shortcut structure is shown in the figure below:
1. File structure:
2. What is a shortcut?
What does a shortcut in ResNet look like? It is actually quite simple: it consists of two convolutional layers, the two weight layers in the first figure, with X being the input_data. The input_data passes through a convolutional layer followed by a ReLU activation to give output1; output1 passes through a second convolutional layer with ReLU to give output2; output2 is then added to input_data and the sum goes through a final ReLU to give output3. That is the shortcut.
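In pseudocode, using the names from the description above (this mirrors what the res_identity function further below computes):

    output1 = relu(conv1(x))        # first weight layer + ReLU
    output2 = relu(conv2(output1))  # second weight layer + ReLU
    output3 = relu(output2 + x)     # add the input back, final ReLU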
Also, in a CNN, suppose the input X is 28×28×1 and, after a convolutional layer and pooling, F(X) becomes 14×14×64. Because X has depth 1 while F has depth 64, X must first be projected to depth 64 before it can be added to F. For this case the ResNet authors suggest using a 1×1 convolutional layer with stride=2, which turns X into 14×14×64 so that its dimensions match F, after which the two can be added.
Shortcut code (resnet.py):
import numpy as np
import tensorflow as tf

slim = tf.contrib.slim

def res_identity(input_tensor, conv_depth, kernel_shape, layer_name):
    # Identity shortcut: two convolutions, then add the unchanged input back.
    # gamma_init = tf.random_normal_initializer(1., 0.02)
    with tf.variable_scope(layer_name):
        relu = tf.nn.relu(slim.conv2d(input_tensor, conv_depth, kernel_shape))
        outputs = tf.nn.relu(slim.conv2d(relu, conv_depth, kernel_shape) + input_tensor)
    return outputs

def res_change(input_tensor, conv_depth, kernel_shape, layer_name):
    # Projection shortcut: the stride-2 convolution halves the spatial size and
    # changes the depth, so the input is projected with a 1x1, stride-2
    # convolution before the addition.
    input_depth = input_tensor.shape[3]
    with tf.variable_scope(layer_name):
        relu = tf.nn.relu(slim.conv2d(input_tensor, conv_depth, kernel_shape, stride=2))
        input_tensor_reshape = slim.conv2d(input_tensor, conv_depth, [1, 1], stride=2)
        outputs = tf.nn.relu(slim.conv2d(relu, conv_depth, kernel_shape) + input_tensor_reshape)
    return outputs
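As a quick sanity check of the two block types (a minimal sketch I added; the placeholder shape is just an example matching the network below), res_identity preserves the input shape while res_change halves the spatial size and changes the depth:

import tensorflow as tf
import resnet

x = tf.placeholder(tf.float32, [None, 14, 14, 32])
same = resnet.res_identity(x, 32, [3, 3], 'check_identity')  # stays (?, 14, 14, 32)
down = resnet.res_change(x, 64, [3, 3], 'check_change')      # becomes (?, 7, 7, 64)
print(same.shape, down.shape)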
3. Network structure: 1 convolutional layer + 6 shortcut blocks + 2 fully connected layers (res_mnist_inference.py)
import tensorflow as tf
import resnet

slim = tf.contrib.slim

def inference(inputs):
    x = tf.reshape(inputs, [-1, 28, 28, 1])
    conv_1 = tf.nn.relu(slim.conv2d(x, 32, [3, 3]))                # 28*28*32
    # bn_1 = tf.contrib.layers.batch_norm(conv_1)
    pool_1 = slim.max_pool2d(conv_1, [2, 2])                       # 14*14*32
    block_1 = resnet.res_identity(pool_1, 32, [3, 3], 'layer_2')
    block_2 = resnet.res_identity(block_1, 32, [3, 3], 'layer_3')
    block_3 = resnet.res_identity(block_2, 32, [3, 3], 'layer_4')  # 14*14*32
    block_4 = resnet.res_change(block_3, 64, [3, 3], 'layer_5')    # 7*7*64
    block_5 = resnet.res_identity(block_4, 64, [3, 3], 'layer_6')
    block_6 = resnet.res_identity(block_5, 64, [3, 3], 'layer_7')  # 7*7*64
    net_flatten = slim.flatten(block_6, scope='flatten')
    # 0.8 here is the dropout keep probability.
    fc_1 = slim.fully_connected(slim.dropout(net_flatten, 0.8), 200, activation_fn=tf.nn.tanh, scope='fc_1')
    output = slim.fully_connected(slim.dropout(fc_1, 0.8), 10, activation_fn=None, scope='output_layer')
    return output
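A quick check (a small sketch, not part of the original files) that the network maps a batch of flattened 784-pixel images to 10 logits per image:

import tensorflow as tf
import res_mnist_inference

images = tf.placeholder(tf.float32, [None, 784])
logits = res_mnist_inference.inference(images)
print(logits.shape)  # (?, 10)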
4. Main training function train (res_mnist.py)
Download the MNIST data, i.e. the four archive files. Remember to read the labels with one-hot encoding.
import tensorflow as tf
import sys
from tensorflow.examples.tutorials.mnist import input_data
import res_mnist_inference

mnist = input_data.read_data_sets("./MNIST_data/", one_hot=True)

batch_size = 100
learning_rate = 0.003
learning_rate_decay = 0.97
# regularization_rate = 0.0001
model_save_path = './model/'

def train():
    x = tf.placeholder(tf.float32, [None, 784])
    y = tf.placeholder(tf.float32, [None, 10])
    y_outputs = res_mnist_inference.inference(x)
    global_step = tf.Variable(0, trainable=False)
    # Labels are one-hot, so convert them back to class indices for the sparse cross-entropy.
    entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y_outputs, labels=tf.argmax(y, 1))
    loss = tf.reduce_mean(entropy)
    # Decay the learning rate every 200 steps and feed the decayed value to the optimizer.
    rate = tf.train.exponential_decay(learning_rate, global_step, 200, learning_rate_decay)
    train_op = tf.train.AdamOptimizer(rate).minimize(loss, global_step=global_step)
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for i in range(500):
            x_b, y_b = mnist.train.next_batch(batch_size)
            train_op_, loss_, step = sess.run([train_op, loss, global_step], feed_dict={x: x_b, y: y_b})
            if i % 50 == 0:
                print("training step {0}, loss {1}".format(step, loss_))
                saver.save(sess, model_save_path + 'my_model', global_step=global_step)

def main(_):
    train()

if __name__ == '__main__':
    tf.app.run()
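The post reports the accuracy below but does not include an evaluation script; here is a minimal sketch (my addition, assuming the './model/' checkpoint directory used above and the same inference graph) that restores the latest checkpoint and measures accuracy on the MNIST test set:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import res_mnist_inference

mnist = input_data.read_data_sets("./MNIST_data/", one_hot=True)

def evaluate():
    x = tf.placeholder(tf.float32, [None, 784])
    y = tf.placeholder(tf.float32, [None, 10])
    y_outputs = res_mnist_inference.inference(x)
    # Compare the predicted class against the one-hot label.
    correct = tf.equal(tf.argmax(y_outputs, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    saver = tf.train.Saver()
    with tf.Session() as sess:
        ckpt = tf.train.latest_checkpoint('./model/')
        saver.restore(sess, ckpt)
        # Note: inference() applies dropout unconditionally, so the measured
        # accuracy can be slightly pessimistic compared to a dropout-free graph.
        acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        print("test accuracy {0}".format(acc))

if __name__ == '__main__':
    evaluate()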
5. Training results
Accuracy: 98.2%. A note on the design: it is best to include dropout in the network, and also worth applying batch normalization, which in some situations makes training more stable and effective.
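As an illustration of the batch normalization suggestion (only a sketch of one possible way to wire it in with slim, not code from the project above), the identity block could insert slim.batch_norm after each convolution:

import tensorflow as tf
slim = tf.contrib.slim

def res_identity_bn(input_tensor, conv_depth, kernel_shape, layer_name, is_training=True):
    # Identity shortcut with batch normalization applied after each convolution.
    with tf.variable_scope(layer_name):
        bn_params = {'is_training': is_training}
        conv_1 = slim.conv2d(input_tensor, conv_depth, kernel_shape,
                             normalizer_fn=slim.batch_norm, normalizer_params=bn_params)
        conv_2 = slim.conv2d(conv_1, conv_depth, kernel_shape, activation_fn=None,
                             normalizer_fn=slim.batch_norm, normalizer_params=bn_params)
        # Add the input back and apply the final ReLU.
        return tf.nn.relu(conv_2 + input_tensor)

When training with batch normalization, remember to run the update ops collected in tf.GraphKeys.UPDATE_OPS alongside the train op so that the moving mean and variance are updated.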