Deep Learning: LeNet-5 (Code Implementation)
Load the Data
Load the MNIST data, which comes preloaded with TensorFlow.
You do not need to modify this section.
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", reshape=False)
X_train, y_train = mnist.train.images, mnist.train.labels
X_validation, y_validation = mnist.validation.images, mnist.validation.labels
X_test, y_test = mnist.test.images, mnist.test.labels
assert(len(X_train) == len(y_train))
assert(len(X_validation) == len(y_validation))
assert(len(X_test) == len(y_test))
print()
print("Image Shape: {}".format(X_train[0].shape))
print()
print("Training Set: {} samples".format(len(X_train)))
print("Validation Set: {} samples".format(len(X_validation)))
print("Test Set: {} samples".format(len(X_test)))
The MNIST data that TensorFlow preloads comes as 28x28x1 images. However, the LeNet architecture only accepts 32x32xC images, where C is the number of color channels. To reformat the MNIST data into a shape that LeNet will accept, we pad the data with two rows of zeros on the top and bottom, and two columns of zeros on the left and right (28 + 2 + 2 = 32).
The following code grows the width and height dimensions accordingly:
import numpy as np
# Pad images with 0s
X_train = np.pad(X_train, ((0,0),(2,2),(2,2),(0,0)), 'constant')
X_validation = np.pad(X_validation, ((0,0),(2,2),(2,2),(0,0)), 'constant')
X_test = np.pad(X_test, ((0,0),(2,2),(2,2),(0,0)), 'constant')
print("Updated Image Shape: {}".format(X_train[0].shape))
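As a quick sanity check of what `np.pad` does here (a standalone NumPy sketch with a toy 3x3 "image" standing in for the 28x28 MNIST data, not part of the tutorial pipeline):

```python
import numpy as np

# A toy 1x3x3x1 batch stands in for the (N, 28, 28, 1) MNIST images.
batch = np.ones((1, 3, 3, 1), dtype=np.float32)

# Same pad spec as above: nothing on the batch/channel axes,
# two zeros on each side of the height and width axes.
padded = np.pad(batch, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')

print(padded.shape)        # (1, 7, 7, 1) -- 3 + 2 + 2 = 7
print(padded[0, 0, :, 0])  # first row is all zeros
```

With the real data the same spec takes each image from 28x28 to 32x32 while leaving the pixel values in the center untouched.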
Visualize the Data
View a sample from the dataset.
import random
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
index = random.randint(0, len(X_train) - 1)  # randint is inclusive on both ends
image = X_train[index].squeeze()
plt.figure(figsize=(1,1))
plt.imshow(image, cmap="gray")
plt.axis('off')
print(y_train[index])
Preprocess the Data
Shuffle the training data.
from sklearn.utils import shuffle
X_train, y_train = shuffle(X_train, y_train)
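For intuition, `sklearn.utils.shuffle` applies one random permutation to both arrays so that each image stays paired with its label. A minimal NumPy sketch of the same idea (toy data, not the MNIST arrays):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.arange(10).reshape(5, 2)   # toy "images": row i is [2*i, 2*i + 1]
y = np.arange(5)                  # matching labels 0..4

# A single permutation indexes both arrays, keeping pairs aligned.
perm = rng.permutation(len(X))
X_shuf, y_shuf = X[perm], y[perm]

# Every shuffled row still matches its label.
assert all(X_shuf[i, 0] == 2 * y_shuf[i] for i in range(len(y)))
```

Shuffling independently (two separate permutations) would silently destroy the image-label correspondence, which is why a joint shuffle is used.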
Setup TensorFlow
The EPOCHS and BATCH_SIZE values affect the training speed and model accuracy.
import tensorflow as tf
EPOCHS = 10
BATCH_SIZE = 128
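To see what these two values imply for the training loop below, here is a quick back-of-the-envelope calculation (assuming the standard 55,000-example MNIST training split that `input_data.read_data_sets` produces):

```python
import math

EPOCHS = 10
BATCH_SIZE = 128
num_train = 55000  # size of the MNIST training split loaded above

# Each epoch walks the shuffled training set once, in BATCH_SIZE chunks.
batches_per_epoch = math.ceil(num_train / BATCH_SIZE)
total_steps = batches_per_epoch * EPOCHS

print(batches_per_epoch)  # 430 optimizer steps per epoch
print(total_steps)        # 4300 steps over the whole run
```

A larger BATCH_SIZE means fewer, noisier-free gradient steps per epoch; more EPOCHS means more passes over the data, at the cost of training time.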
Implementing LeNet-5
Implement the LeNet-5 neural network architecture.
Input
The LeNet architecture accepts a 32x32xC image as input, where C is the number of color channels. Since MNIST images are grayscale, C is 1 in this case.
Architecture
Layer 1: Convolutional. The output shape is 28x28x6.
Activation. Your choice of activation function.
Pooling. A pooling operation with stride 2; the output shape is 14x14x6.
Layer 2: Convolutional. The output shape is 10x10x16.
Activation. Your choice of activation function.
Pooling. A pooling operation with stride 2; the output shape is 5x5x16.
Flatten. Flatten the output of the final pooling layer so that it is 1D instead of 3D. The tf.contrib.layers.flatten
function is provided to do this for you.
Layer 3: Fully Connected. This produces an output of shape (120,).
Activation. Your choice of activation function.
Layer 4: Fully Connected. This produces an output of shape (84,).
Activation. Your choice of activation function.
Layer 5: Fully Connected (Logits). This produces an output of shape (10,).
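The layer sizes listed above all follow from the VALID-padding output formula, output = floor((input - kernel) / stride) + 1. A small standalone sketch that traces the spatial dimensions through the network:

```python
def conv_out(size, kernel, stride=1):
    # VALID padding: output = floor((input - kernel) / stride) + 1
    return (size - kernel) // stride + 1

def pool_out(size, window=2, stride=2):
    # 2x2 max pooling with stride 2 halves the spatial size
    return (size - window) // stride + 1

s = 32               # input height/width after zero-padding
s = conv_out(s, 5)   # Layer 1 conv (5x5 kernel): 28
s = pool_out(s)      # pooling: 14
s = conv_out(s, 5)   # Layer 2 conv (5x5 kernel): 10
s = pool_out(s)      # pooling: 5
flat = s * s * 16    # flatten 5x5x16 feature maps
print(flat)          # 400 -- the input size of the first fully connected layer
```

This is why the first fully connected weight matrix below has shape (400, 120).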
from tensorflow.contrib.layers import flatten
def LeNet(x):
    # Hyperparameters
    mu = 0
    sigma = 0.1
    layer_depth = {
        'layer_1': 6,
        'layer_2': 16,
        'layer_3': 120,
        'layer_f1': 84
    }

    # Layer 1: Convolutional. Input = 32x32x1. Output = 28x28x6.
    conv1_w = tf.Variable(tf.truncated_normal(shape=[5, 5, 1, 6], mean=mu, stddev=sigma))
    conv1_b = tf.Variable(tf.zeros(6))
    conv1 = tf.nn.conv2d(x, conv1_w, strides=[1, 1, 1, 1], padding='VALID') + conv1_b

    # Activation.
    conv1 = tf.nn.relu(conv1)

    # Pooling. Input = 28x28x6. Output = 14x14x6.
    pool_1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

    # Layer 2: Convolutional. Output = 10x10x16.
    conv2_w = tf.Variable(tf.truncated_normal(shape=[5, 5, 6, 16], mean=mu, stddev=sigma))
    conv2_b = tf.Variable(tf.zeros(16))
    conv2 = tf.nn.conv2d(pool_1, conv2_w, strides=[1, 1, 1, 1], padding='VALID') + conv2_b

    # Activation.
    conv2 = tf.nn.relu(conv2)

    # Pooling. Input = 10x10x16. Output = 5x5x16.
    pool_2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')

    # Flatten. Input = 5x5x16. Output = 400.
    fc1 = flatten(pool_2)

    # Layer 3: Fully Connected. Input = 400. Output = 120.
    fc1_w = tf.Variable(tf.truncated_normal(shape=(400, 120), mean=mu, stddev=sigma))
    fc1_b = tf.Variable(tf.zeros(120))
    fc1 = tf.matmul(fc1, fc1_w) + fc1_b

    # Activation.
    fc1 = tf.nn.relu(fc1)

    # Layer 4: Fully Connected. Input = 120. Output = 84.
    fc2_w = tf.Variable(tf.truncated_normal(shape=(120, 84), mean=mu, stddev=sigma))
    fc2_b = tf.Variable(tf.zeros(84))
    fc2 = tf.matmul(fc1, fc2_w) + fc2_b

    # Activation.
    fc2 = tf.nn.relu(fc2)

    # Layer 5: Fully Connected. Input = 84. Output = 10.
    fc3_w = tf.Variable(tf.truncated_normal(shape=(84, 10), mean=mu, stddev=sigma))
    fc3_b = tf.Variable(tf.zeros(10))
    logits = tf.matmul(fc2, fc3_w) + fc3_b

    return logits
Features and Labels
Train LeNet to classify MNIST data. x is a placeholder for a batch of input images. y is a placeholder for a batch of output labels.
x = tf.placeholder(tf.float32, (None, 32, 32, 1))
y = tf.placeholder(tf.int32, (None))
one_hot_y = tf.one_hot(y, 10)
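`tf.one_hot` turns each integer label into a length-10 indicator vector. A NumPy sketch of the same transformation, for intuition (standalone, not part of the graph above):

```python
import numpy as np

def one_hot(labels, depth):
    # NumPy equivalent of tf.one_hot for integer class labels:
    # row i is all zeros except a 1.0 at column labels[i].
    out = np.zeros((len(labels), depth), dtype=np.float32)
    out[np.arange(len(labels)), labels] = 1.0
    return out

print(one_hot([3, 0, 9], 10))
```

The one-hot form is what `softmax_cross_entropy_with_logits` expects as its `labels` argument in the pipeline below.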
Training Pipeline
Create a training pipeline that uses the model to classify MNIST data.
rate = 0.001
logits = LeNet(x)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = one_hot_y)
loss_operation = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate = rate)
training_operation = optimizer.minimize(loss_operation)
Model Evaluation
Evaluate the model's loss and accuracy on a given dataset.
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(one_hot_y, 1))
accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
saver = tf.train.Saver()
def evaluate(X_data, y_data):
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in range(0, num_examples, BATCH_SIZE):
        batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
        accuracy = sess.run(accuracy_operation, feed_dict={x: batch_x, y: batch_y})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples
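Note that evaluate() weights each batch accuracy by the batch length before dividing by the total count. This matters because the last batch is usually smaller than BATCH_SIZE. A toy illustration with hypothetical per-batch accuracies (the numbers are made up to show the effect):

```python
# Hypothetical batches: the final one is smaller than the rest.
batch_sizes = [128, 128, 44]
batch_accs = [0.90, 0.95, 0.50]

# Weighted average, as evaluate() computes it: per-example accuracy.
total = sum(a * n for a, n in zip(batch_accs, batch_sizes))
weighted = total / sum(batch_sizes)

# Naive average of the per-batch accuracies over-weights the small batch.
naive = sum(batch_accs) / len(batch_accs)

print(round(weighted, 4))  # 0.8627
print(round(naive, 4))     # 0.7833
```

The weighted form gives every example equal influence, so it equals the accuracy over the whole dataset regardless of how it is split into batches.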
Train the Model
Run the training data through the training pipeline to train the model.
Before each epoch, shuffle the training set.
After each epoch, measure the loss and accuracy of the validation set.
Save the model after training.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    num_examples = len(X_train)

    print("Training...")
    print()
    for i in range(EPOCHS):
        X_train, y_train = shuffle(X_train, y_train)
        for offset in range(0, num_examples, BATCH_SIZE):
            end = offset + BATCH_SIZE
            batch_x, batch_y = X_train[offset:end], y_train[offset:end]
            sess.run(training_operation, feed_dict={x: batch_x, y: batch_y})

        validation_accuracy = evaluate(X_validation, y_validation)
        print("EPOCH {} ...".format(i+1))
        print("Validation Accuracy = {:.3f}".format(validation_accuracy))
        print()

    saver.save(sess, './lenet')
    print("Model saved")
Evaluate the Model
Once you are completely satisfied with your model, evaluate its performance on the test set.
Be sure to do this only once!
If you were to measure the performance of your trained model on the test set, then improve the model, and then measure its performance on the test set again, that would invalidate your test results. You wouldn't get a true measure of how well the model performs on real data.
with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint('.'))

    test_accuracy = evaluate(X_test, y_test)
    print("Test Accuracy = {:.3f}".format(test_accuracy))