Preface
I have never studied TensorFlow systematically, so I am starting to learn it from scratch.
The examples below are adapted from a tutorial, with some modifications made for TensorFlow 2.0.
Topic
How to use the dataset from the previous posts to solve the MNIST handwritten-digit classification problem in TensorFlow.
Network Architecture
We will build a five-layer network: the first four layers use the sigmoid activation, and the fifth uses the softmax activation function. Recall that softmax is defined so that its activations form a set of positive values summing to 1. This means the j-th output value is the probability that the network input belongs to class j. The feed-forward architecture is shown in the figure below:
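This softmax property (all outputs positive, summing to 1) is easy to check with a tiny NumPy sketch, independent of the TensorFlow graph built later; the `softmax` helper here is my own minimal version, not TensorFlow's:

```python
import numpy as np

def softmax(logits):
    # Shift by the max for numerical stability, then normalize the exponentials
    exp = np.exp(logits - np.max(logits))
    return exp / exp.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
# Every entry is positive and the entries sum to 1,
# so probs[j] reads as the probability of class j
```

The largest logit always gets the largest probability, which is why `argmax` over the softmax output (used later for accuracy) picks the predicted class.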
To determine an appropriate network size (that is, the number of hidden layers and the number of neurons per layer), we usually rely on general rules of thumb, personal experience, or suitable experiments. These are some of the hyperparameters that need tuning.
The figure below summarizes the implemented architecture, showing the number of neurons in each layer and the corresponding activation function:
The activation function of the first four layers is the sigmoid. The activation of the last layer is always softmax, because the network's output must represent the probabilities of the input digit classes. In general, the number and size of the intermediate layers greatly affect network performance:
- positively, because these layers are what give the network its ability to generalize and to detect distinctive features of the input
- negatively, because a redundant network only adds unnecessary burden to the learning phase
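As a concrete example of what these sizing choices cost, the 784-200-100-60-30-10 architecture used in the code below has a fixed number of trainable parameters. The layer sizes are the ones from this post; the counting helper is my own sketch:

```python
# Layer sizes from this post: 784 inputs, hidden layers of 200/100/60/30, 10 outputs
sizes = [784, 200, 100, 60, 30, 10]

def count_params(sizes):
    total = 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total

n_params = count_params(sizes)  # total number of weights and biases to learn
```

Most of the parameters sit in the first weight matrix (784 x 200), so widening the first hidden layer is by far the most expensive choice.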
Code
OS : Win10
Python : 3.6.8
TensorFlow : 1.14
import tensorflow.compat.v1 as tf
from tensorflow.examples.tutorials.mnist import input_data
import math
from tensorflow.python.framework import ops
import random
import os
tf.disable_v2_behavior()
logs_path = r'E:\zhouyi\graph' # TensorBoard logging path (raw string so the backslashes are not treated as escapes)
batch_size = 100 # batch size while performing training
learning_rate = 0.003 # Learning rate
training_epochs = 10 # training epoch
display_epoch = 1
dataPath = "temp/"
if not os.path.exists(dataPath):
    os.makedirs(dataPath)
mnist = input_data.read_data_sets(dataPath, one_hot=True) # MNIST to be downloaded
X = tf.placeholder(tf.float32, [None, 784], name='InputData') # image shape 28*28=784
XX = tf.reshape(X, [-1, 784]) # reshape input
Y_ = tf.placeholder(tf.float32, [None, 10], name='LabelData') # 0-9 digits => 10 classes
L=200
M=100
N=60
O=30
W1 = tf.Variable(tf.truncated_normal([784, L], stddev=0.1)) # Initialize random weights for the hidden layer 1
B1 = tf.Variable(tf.zeros([L])) # Bias vector for layer 1
Y1 = tf.nn.sigmoid(tf.matmul(XX, W1) + B1) # Output from layer 1
W2 = tf.Variable(tf.truncated_normal([L, M], stddev=0.1)) # Initialize random weights for the hidden layer 2
B2 = tf.Variable(tf.ones([M])) # Bias vector for layer 2
Y2 = tf.nn.sigmoid(tf.matmul(Y1, W2) + B2) # Output from layer 2
W3 = tf.Variable(tf.truncated_normal([M, N], stddev=0.1)) # Initialize random weights for the hidden layer 3
B3 = tf.Variable(tf.ones([N])) # Bias vector for layer 3
Y3 = tf.nn.sigmoid(tf.matmul(Y2, W3) + B3) # Output from layer 3
W4 = tf.Variable(tf.truncated_normal([N, O], stddev=0.1)) # Initialize random weights for the hidden layer 4
B4 = tf.Variable(tf.ones([O])) # Bias vector for layer 4
Y4 = tf.nn.sigmoid(tf.matmul(Y3, W4) + B4) # Output from layer 4
W5 = tf.Variable(tf.truncated_normal([O, 10], stddev=0.1)) # Initialize random weights for the output layer
B5 = tf.Variable(tf.ones([10])) # Bias vector for the output layer
Ylogits = tf.matmul(Y4, W5) + B5 # computing the logits
Y = tf.nn.softmax(Ylogits)# output from layer 5
cross_entropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=Ylogits, labels=Y_) # cross entropy against the ground-truth labels (Y_, not the network output Y)
cost_op = tf.reduce_mean(cross_entropy)*100
correct_prediction = tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Optimization op (backprop)
train_op = tf.train.AdamOptimizer(learning_rate).minimize(cost_op)
# Create a summary to monitor cost tensor
tf.summary.scalar("cost", cost_op)
# Create a summary to monitor accuracy tensor
tf.summary.scalar("accuracy", accuracy)
# Merge all summaries into a single op
summary_op = tf.summary.merge_all()
init = tf.global_variables_initializer() # initialize_all_variables() is deprecated
with tf.Session() as sess:
    # Run the initializer
    sess.run(init)
    # op to write logs to TensorBoard
    writer = tf.summary.FileWriter(logs_path, graph=tf.get_default_graph())
    for epoch in range(training_epochs):
        batch_count = int(mnist.train.num_examples / batch_size)
        for i in range(batch_count):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            _, summary = sess.run([train_op, summary_op], feed_dict={X: batch_x, Y_: batch_y})
            writer.add_summary(summary, epoch * batch_count + i)
        print("Epoch: ", epoch)
    print("Optimization Finished!")
    print("Accuracy: ", accuracy.eval(feed_dict={X: mnist.test.images, Y_: mnist.test.labels}))
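To make the `labels` argument of the loss concrete: softmax_cross_entropy_with_logits_v2 expects the ground-truth one-hot rows (the Y_ placeholder) and computes, per example, the negative log-probability assigned to the true class. A minimal NumPy re-implementation (a sketch of the math, not TensorFlow's actual kernel) behaves the same way:

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # Row-wise log-softmax with the usual max-shift for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Per-example cross entropy against the one-hot ground truth
    return -(labels * log_probs).sum(axis=1)

logits = np.array([[5.0, 0.0, 0.0],
                   [0.0, 0.0, 5.0]])
labels = np.array([[1.0, 0.0, 0.0],   # true class 0, confidently correct -> small loss
                   [1.0, 0.0, 0.0]])  # true class 0, confidently wrong -> large loss
losses = softmax_cross_entropy(logits, labels)
```

Feeding the network's own output as `labels` instead would ask the optimizer to minimize the entropy of its own predictions rather than fit the data, which is exactly the kind of silent mistake that yields terrible accuracy.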
Embarrassingly, this run produced very poor results. Before blaming the amount of data or the image quality, it is worth double-checking the loss: softmax_cross_entropy_with_logits_v2 must receive the ground-truth placeholder Y_ as its labels, not the network output Y, otherwise the optimizer never actually fits the labels.
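When results look suspicious, it also helps to sanity-check the accuracy computation itself. The argmax comparison used in the graph above is easy to replicate in NumPy on made-up data (the predictions and labels here are hypothetical):

```python
import numpy as np

# Hypothetical batch: 3 rows of predicted probabilities and 3 one-hot labels
Y_pred = np.array([[0.10, 0.80, 0.10],
                   [0.30, 0.30, 0.40],
                   [0.90, 0.05, 0.05]])
Y_true = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

# Same logic as tf.equal(tf.argmax(Y, 1), tf.argmax(Y_, 1)) followed by the mean
correct = np.argmax(Y_pred, axis=1) == np.argmax(Y_true, axis=1)
accuracy = correct.astype(np.float32).mean()  # 2 of 3 rows match
```

If this number is sensible on toy data but the trained network still scores near 10% (random guessing on 10 classes), the problem is almost certainly in the loss or the training loop, not the metric.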
Using TensorBoard
First, run the following from the Win10 command line:
tensorboard --logdir="e:\zhouyi\graph"
Then open a browser (Chrome or Edge both work) and go to localhost:6006 (TensorBoard's default port) to see the figure below.
It still looks quite different from the demo figure in the reference book.