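Dataset: MNIST
Model: CNN; convolutional layers: 2; fully connected layers: 3; dropout keep probability on the fully connected layers: 0.8 (i.e. a drop rate of 0.2); activation function: ReLU
Loss function: cross-entropy
Batch size: 100
Optimizer: AdamOptimizer
Learning rate: 1e-4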
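For reference, here is a minimal NumPy sketch of the softmax cross-entropy loss named above. It is not the TensorFlow implementation, and the 3-class logits and labels are made-up values for illustration (MNIST has 10 classes). It also shows why a loss function that takes logits should be given the raw scores: feeding probabilities back in applies softmax a second time.

```python
import numpy as np

def softmax(z):
    # Subtract the row max for numerical stability; the result is unchanged
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels_one_hot):
    # Mean over the batch of -sum_k y_k * log(softmax(logits)_k)
    probs = softmax(logits)
    return float(-np.mean(np.sum(labels_one_hot * np.log(probs), axis=1)))

# Hypothetical 3-class example, with the true class scored highest
logits = np.array([[4.0, 1.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0]])

loss_from_logits = cross_entropy(logits, labels)
# Passing probabilities as if they were logits applies softmax twice,
# which flattens the distribution and raises the loss
loss_from_probs = cross_entropy(softmax(logits), labels)

print(loss_from_logits, loss_from_probs)
```

In this example the loss computed from the raw logits is much smaller than the one computed from the already-normalized probabilities, even though both describe the same confident, correct prediction.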
The code (TensorFlow 1.x API) is as follows:
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# Load the data
mnist = input_data.read_data_sets('MNIST_data/',one_hot=True)
# Use a batch size of 100
batch_size = 100
n_batch = mnist.train.num_examples//batch_size
# Define the input placeholders
x = tf.placeholder(tf.float32,shape=[None,784],name='x-input')
y = tf.placeholder(tf.float32,shape=[None,10],name='y-input')
x_image = tf.reshape(x,shape=[-1,28,28,1],name='x_image')
# Initialize weights
def weight_variable(shape,name):
    initial = tf.truncated_normal(shape,stddev=0.1)
    return tf.Variable(initial,name=name)
# Initialize biases
def bias_variable(shape,name):
    initial = tf.constant(0.1,shape=shape)
    return tf.Variable(initial,name=name)
# Define the convolution layer
def conv2d(x,W):
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='SAME')
# Define the max-pooling layer
def max_pool_2x2(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')
# Initialize the first convolutional layer's parameters
W_conv1 = weight_variable([5,5,1,32],name='W_conv1')
B_conv1 = bias_variable([32],name='B_conv1')
# First convolutional layer
conv2d_1 = conv2d(x_image,W_conv1) + B_conv1
h_conv_1 = tf.nn.relu(conv2d_1,name='h_conv2d_1')
h_pool_1 = max_pool_2x2(h_conv_1)
# Initialize the second convolutional layer's parameters
W_conv2 = weight_variable([5,5,32,64],name='W_conv2')
B_conv2 = bias_variable([64],name='B_conv2')
# Second convolutional layer
conv2d_2 = conv2d(h_pool_1,W_conv2) + B_conv2
h_conv_2 = tf.nn.relu(conv2d_2,name='h_conv_2')
h_pool_2 = max_pool_2x2(h_conv_2)
# Flatten the pooling output into one vector per example
h_pool2_vec = tf.reshape(h_pool_2,[-1,7*7*64])
# Initialize the first fully connected layer's parameters
W_fc_1 = weight_variable([7*7*64,1024],name='W_fc_1')
B_fc_1 = bias_variable([1024],name='b_fc_1')
# First fully connected layer
h_fc_1 = tf.nn.relu(tf.matmul(h_pool2_vec,W_fc_1) + B_fc_1)
# Dropout keep probability (fed at run time: 0.8 for training, 1.0 for testing)
keep_prob = tf.placeholder(tf.float32,name='keep_prob')
h_fc1_drop = tf.nn.dropout(h_fc_1,keep_prob=keep_prob)
# Initialize the second fully connected layer's parameters
W_fc_2 = weight_variable([1024,200],name='W_fc_2')
B_fc_2 = bias_variable([200],name='B_fc_2')
# Second fully connected layer
h_fc_2 = tf.nn.relu(tf.matmul(h_fc1_drop,W_fc_2) + B_fc_2)
h_fc2_drop = tf.nn.dropout(h_fc_2,keep_prob=keep_prob)
# Initialize the third fully connected layer's parameters
W_fc_3 = weight_variable([200,10],name='W_fc_3')
B_fc_3 = bias_variable([10],name='B_fc_3')
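# Third fully connected layer: raw logits, plus softmax probabilities for prediction
logits = tf.matmul(h_fc2_drop,W_fc_3) + B_fc_3
prediction = tf.nn.softmax(logits)
# Cross-entropy loss; softmax_cross_entropy_with_logits applies softmax internally, so it takes the raw logits
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=logits))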
# Learning rate and optimizer
lr = tf.Variable(1e-4,dtype=tf.float32)
train_step = tf.train.AdamOptimizer(lr).minimize(cross_entropy)
# Accuracy: fraction of examples whose predicted class matches the label
predict_result = tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(predict_result,tf.float32))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(51):
        # One pass over the training set, with dropout enabled
        for batch in range(n_batch):
            batch_x,batch_y = mnist.train.next_batch(batch_size)
            sess.run(train_step,feed_dict={x:batch_x,y:batch_y,keep_prob:0.8})
        # Evaluate on the full test set with dropout disabled
        test_acc = sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels,keep_prob:1.0})
        print("Iter " + str(epoch) + ", Testing Accuracy= " + str(test_acc))
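As a sanity check on the 7*7*64 used when flattening, here is a small plain-Python sketch that traces the tensor shapes through the two conv/pool stages and counts the trainable parameters. It assumes only the layer sizes in the code above and the rule that `padding='SAME'` gives an output size of ceil(input size / stride); the helper `same_out` is my own name, not a TensorFlow function.

```python
import math

def same_out(size, stride):
    # With padding='SAME', output size is ceil(input size / stride)
    return math.ceil(size / stride)

size, channels = 28, 1              # x_image is 28x28x1
shapes = []
for out_channels in (32, 64):       # output channels of W_conv1 and W_conv2
    size = same_out(size, 1)        # 5x5 conv, stride 1: spatial size unchanged
    size = same_out(size, 2)        # 2x2 max pool, stride 2: spatial size halved
    channels = out_channels
    shapes.append((size, size, channels))

flat = size * size * channels       # length of each row of h_pool2_vec

# Trainable parameters per layer: weights plus biases
params = {
    'conv1': 5 * 5 * 1 * 32 + 32,
    'conv2': 5 * 5 * 32 * 64 + 64,
    'fc1': flat * 1024 + 1024,
    'fc2': 1024 * 200 + 200,
    'fc3': 200 * 10 + 10,
}
total = sum(params.values())

print(shapes, flat)
print(params, total)
```

Almost all of the parameters sit in the first fully connected layer, which is the layer whose output dropout is applied to first.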
After training for 51 epochs, test-set accuracy is as follows (epochs 41 to 50 shown):
Iter 41, Testing Accuracy= 0.993
Iter 42, Testing Accuracy= 0.9925
Iter 43, Testing Accuracy= 0.9929
Iter 44, Testing Accuracy= 0.9913
Iter 45, Testing Accuracy= 0.9931
Iter 46, Testing Accuracy= 0.9923
Iter 47, Testing Accuracy= 0.9931
Iter 48, Testing Accuracy= 0.9934
Iter 49, Testing Accuracy= 0.9928
Iter 50, Testing Accuracy= 0.9931
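One thing that puzzles me:
The code above uses a fixed learning rate of 1e-4. When I instead let the learning rate decay per epoch, as 1e-4*0.95**epoch or 1e-4*0.96**epoch, test-set accuracy only reaches about 0.985. Why does accuracy go down rather than up?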
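To see how small the step size gets under those two schedules, here is a quick arithmetic sketch in plain Python. It only evaluates the formulas quoted above over the same 51 epochs; it does not train anything, and `decayed_lr` is a helper name I made up.

```python
base_lr = 1e-4

def decayed_lr(decay, epoch):
    # Exponential per-epoch decay: base_lr * decay ** epoch
    return base_lr * decay ** epoch

for decay in (0.95, 0.96):
    schedule = [decayed_lr(decay, e) for e in range(51)]
    # Learning rate at epochs 10, 25 and 50, and the final fraction of base_lr
    print(decay, schedule[10], schedule[25], schedule[50], schedule[50] / base_lr)
```

By the last epoch the rate has fallen to roughly 8% (decay 0.95) or 13% (decay 0.96) of a base rate that was already small. One possible explanation, which I have not verified, is that the later updates become too small for the model to keep improving within 51 epochs.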
Also: because the model here is fairly simple, accuracy may actually improve slightly without dropout.
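For readers unsure what keep_prob does, here is a minimal NumPy sketch of inverted dropout, the behavior of `tf.nn.dropout` with a `keep_prob` argument: each unit is kept with probability keep_prob, and kept units are scaled by 1/keep_prob. The array sizes and the random seed are arbitrary choices for illustration.

```python
import numpy as np

def dropout(h, keep_prob, rng):
    # Keep each unit with probability keep_prob, zero the rest,
    # and scale the survivors by 1/keep_prob so the expected value is unchanged
    mask = rng.random(h.shape) < keep_prob
    return h * mask / keep_prob

rng = np.random.default_rng(0)
h = np.ones((1000, 1024))           # stand-in for a batch of h_fc_1 activations

h_train = dropout(h, 0.8, rng)      # training: keep_prob = 0.8
h_test = dropout(h, 1.0, rng)       # testing: keep_prob = 1.0, nothing is dropped

print((h_train == 0).mean())        # fraction of dropped units, close to 0.2
print(h_train.mean(), h_test.mean())
```

Because of the 1/keep_prob scaling, the mean activation stays about the same between training and testing, which is why the test run can simply feed keep_prob 1.0.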