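In the previous post I tried predicting the MNIST dataset with logistic regression. In this post I use a neural network instead, first with 1 hidden layer and then with 2 hidden layers, and compare the results.
1 hidden layer
Import the libraries and the dataset: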
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
mnist = input_data.read_data_sets('data/', one_hot=True)
Set the parameters:
numClasses = 10
inputSize = 784
# 50 hidden units (the 784 pixels are mapped to 50 features)
numHiddenUnits = 50
trainingIterations = 10000
batchSize = 100
X = tf.placeholder(tf.float32, shape = [None, inputSize])
y = tf.placeholder(tf.float32, shape = [None, numClasses])
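Initialize the parameters:
W1 = tf.Variable(tf.truncated_normal([inputSize, numHiddenUnits], stddev=0.1))
# Rule of thumb for B1: its size matches the output size of W1, so B1 holds 50 values.
# The shape has to go inside tf.constant; a second positional argument to
# tf.Variable is read as `trainable`, not as a shape, and would leave B1 a scalar.
B1 = tf.Variable(tf.constant(0.1, shape=[numHiddenUnits]))
W2 = tf.Variable(tf.truncated_normal([numHiddenUnits, numClasses], stddev=0.1))
B2 = tf.Variable(tf.constant(0.1, shape=[numClasses]))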
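To make the parameter shapes concrete, here is a small sketch that counts the trainable values in this network. It assumes each bias holds one value per output unit, as the comment on B1 intends; the helper name `dense_param_count` is made up for this illustration and is not part of TensorFlow.

```python
# Layer sizes taken from the settings above.
input_size = 784
num_hidden_units = 50
num_classes = 10


def dense_param_count(fan_in, fan_out):
    """Weights (fan_in x fan_out) plus one bias value per output unit."""
    return fan_in * fan_out + fan_out


hidden_params = dense_param_count(input_size, num_hidden_units)
output_params = dense_param_count(num_hidden_units, num_classes)
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)  # 39250 510 39760
```

Almost all of the parameters sit in W1, because that is where the 784 pixels are mapped down to 50 features.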
Build the network with 1 hidden layer:
hiddenLayerOutput = tf.matmul(X, W1) + B1
hiddenLayerOutput = tf.nn.relu(hiddenLayerOutput)
finalOutput = tf.matmul(hiddenLayerOutput, W2) + B2
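# Note: this applies ReLU to the logits, so negative values are clipped to zero
# before the softmax; the 2-hidden-layer network later in this post omits this step.
finalOutput = tf.nn.relu(finalOutput)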
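The layer computations above can be checked for shape with a rough NumPy sketch. Everything in it is a stand-in: the input is random data rather than MNIST, the weights come from a plain normal distribution rather than a truncated one, the batch size of 4 is arbitrary, and the sketch stops at the logits without the final ReLU.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, input_size, num_hidden_units, num_classes = 4, 784, 50, 10

# Random stand-in for a batch of flattened 28x28 images.
x = rng.random((batch_size, input_size))

# Illustrative parameters (plain normal here, not truncated normal).
w1 = rng.normal(0.0, 0.1, (input_size, num_hidden_units))
b1 = np.full(num_hidden_units, 0.1)
w2 = rng.normal(0.0, 0.1, (num_hidden_units, num_classes))
b2 = np.full(num_classes, 0.1)

# Hidden layer: affine transform followed by ReLU.
hidden = np.maximum(x @ w1 + b1, 0.0)
# Output layer: one unnormalized score (logit) per class.
logits = hidden @ w2 + b2

print(hidden.shape, logits.shape)  # (4, 50) (4, 10)
```

Each bias vector is broadcast across the batch dimension, which is why its length has to match the number of columns in the corresponding weight matrix.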
Train the network iteratively:
# labels: ground-truth one-hot labels; logits: the network's unnormalized predictions
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels = y, logits = finalOutput))
opt = tf.train.GradientDescentOptimizer(learning_rate = .1).minimize(loss)
correct_prediction = tf.equal(tf.argmax(finalOutput, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
for i in range(trainingIterations):
    batch = mnist.train.next_batch(batchSize)
    batchInput = batch[0]
    batchLabels = batch[1]
    _, trainingLoss = sess.run([opt, loss], feed_dict={X: batchInput, y: batchLabels})
    if i % 1000 == 0:
        train_accuracy = accuracy.eval(session=sess, feed_dict={X: batchInput, y: batchLabels})
        print("step %d, training accuracy %g" % (i, train_accuracy))
Output:
step 0, training accuracy 0.1
step 1000, training accuracy 0.92
step 2000, training accuracy 0.93
step 3000, training accuracy 0.93
step 4000, training accuracy 0.95
step 5000, training accuracy 0.96
step 6000, training accuracy 0.98
step 7000, training accuracy 0.99
step 8000, training accuracy 0.99
step 9000, training accuracy 0.98
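The loss and accuracy used in the training code above can be written out by hand. The following NumPy sketch is only meant to show what the two quantities compute, under the assumption of one-hot labels; it is not TensorFlow's implementation, and the tiny 3-class example data is invented for illustration.

```python
import numpy as np


def softmax_cross_entropy(labels, logits):
    """Mean cross-entropy between one-hot labels and softmax(logits)."""
    # Subtract the row maximum for numerical stability; softmax is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())


def accuracy(labels, logits):
    """Fraction of rows where the largest logit matches the labelled class."""
    return float((logits.argmax(axis=1) == labels.argmax(axis=1)).mean())


# Made-up 3-class example: the second row is misclassified.
labels = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
logits = np.array([[2.0, 0.0, 0.0], [0.0, 1.0, 3.0], [0.0, 0.0, 4.0]])

print(softmax_cross_entropy(labels, logits))
print(accuracy(labels, logits))
```

Accuracy depends only on which logit is largest, while the loss also depends on how confident the prediction is, so the two can move differently during training.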
2 hidden layers
numHiddenUnitsLayer2 = 100
trainingIterations = 10000
X = tf.placeholder(tf.float32, shape = [None, inputSize])
y = tf.placeholder(tf.float32, shape = [None, numClasses])
W1 = tf.Variable(tf.truncated_normal([inputSize, numHiddenUnits], stddev=0.1))
# Pass the shape to tf.constant so that each bias has one value per output unit.
B1 = tf.Variable(tf.constant(0.1, shape=[numHiddenUnits]))
W2 = tf.Variable(tf.truncated_normal([numHiddenUnits, numHiddenUnitsLayer2], stddev=0.1))
B2 = tf.Variable(tf.constant(0.1, shape=[numHiddenUnitsLayer2]))
W3 = tf.Variable(tf.truncated_normal([numHiddenUnitsLayer2, numClasses], stddev=0.1))
B3 = tf.Variable(tf.constant(0.1, shape=[numClasses]))
hiddenLayerOutput = tf.matmul(X, W1) + B1
hiddenLayerOutput = tf.nn.relu(hiddenLayerOutput)
hiddenLayer2Output = tf.matmul(hiddenLayerOutput, W2) + B2
hiddenLayer2Output = tf.nn.relu(hiddenLayer2Output)
finalOutput = tf.matmul(hiddenLayer2Output, W3) + B3
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels = y, logits = finalOutput))
opt = tf.train.GradientDescentOptimizer(learning_rate = .1).minimize(loss)
correct_prediction = tf.equal(tf.argmax(finalOutput, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
for i in range(trainingIterations):
    batch = mnist.train.next_batch(batchSize)
    batchInput = batch[0]
    batchLabels = batch[1]
    _, trainingLoss = sess.run([opt, loss], feed_dict={X: batchInput, y: batchLabels})
    if i % 1000 == 0:
        train_accuracy = accuracy.eval(session=sess, feed_dict={X: batchInput, y: batchLabels})
        print("step %d, training accuracy %g" % (i, train_accuracy))
testInputs = mnist.test.images
testLabels = mnist.test.labels
acc = accuracy.eval(session=sess, feed_dict={X: testInputs, y: testLabels})
print("testing accuracy: {}".format(acc))
Output:
step 0, training accuracy 0.12
step 1000, training accuracy 0.93
step 2000, training accuracy 0.94
step 3000, training accuracy 1
step 4000, training accuracy 1
step 5000, training accuracy 0.98
step 6000, training accuracy 0.97
step 7000, training accuracy 1
step 8000, training accuracy 0.99
step 9000, training accuracy 1
testing accuracy: 0.9729999899864197
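Judging from these runs, the network with 2 hidden layers appears to do better than the one with 1 hidden layer: its training accuracy reaches 1 on several of the sampled batches. The comparison is only rough, though, because each training accuracy is measured on a single batch of 100 images, and a test accuracy (about 0.973) is reported only for the 2-hidden-layer network.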
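As a rough way to judge how much the printed accuracies can be trusted, the sketch below estimates the sampling noise of an accuracy measured on 100 images (one training batch) versus 10,000 images (the MNIST test set). It assumes the samples are independent and uses the binomial standard error; the helper name `accuracy_standard_error` and the example accuracy of 0.97 are chosen only for illustration.

```python
import math


def accuracy_standard_error(acc, n):
    """Binomial standard error of an accuracy measured on n independent samples."""
    return math.sqrt(acc * (1.0 - acc) / n)


batch_error = accuracy_standard_error(0.97, 100)     # one training batch
test_error = accuracy_standard_error(0.97, 10000)    # full MNIST test set

print("batch of 100:    +/- %.4f" % batch_error)
print("test set of 10k: +/- %.4f" % test_error)
```

Under these assumptions a single-batch accuracy can easily move by a point or two between printouts, so differences of that size between the two networks' training accuracies are within the noise.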