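This walkthrough uses MNIST, the simplest of the standard datasets, to show how to train neural networks with the Keras framework.
- The simplest network, a single softmax layer, reaches 88.5% accuracy on the test set.
- A fully connected network with two hidden layers reaches 97.7% accuracy.
The simplest neural network (784-dimensional input, 10-dimensional output)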
1. Load and preprocess the MNIST dataset
1.1 Load the dataset
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print('shape of x_train:' + str(x_train.shape))
print('shape of x_test:' + str(x_test.shape))
print('shape of y_train:' + str(y_train.shape))
print('shape of y_test:' + str(y_test.shape))
Output: the shapes of the data and labels of the MNIST training and test sets:
shape of x_train:(60000, 28, 28)
shape of x_test:(10000, 28, 28)
shape of y_train:(60000,)
shape of y_test:(10000,)
1.2 Flatten each 28×28 image into a 784-dimensional vector
x_train_vec = x_train.reshape(60000,784)
x_test_vec = x_test.reshape(10000,784)
print('shape of x_train_vec is' + str(x_train_vec.shape))
Output: the shape of the flattened training data:
shape of x_train_vec is(60000, 784)
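The hard-coded sizes 60000 and 10000 can be avoided by letting NumPy infer the flattened dimension. A minimal sketch, using a small synthetic array in place of the MNIST images; the helper name `flatten_images` is made up for illustration and is not part of the original code:

```python
import numpy as np

def flatten_images(images):
    """Flatten an array of shape (n, h, w) into shape (n, h * w)."""
    # -1 asks NumPy to infer the second dimension from the remaining sizes
    return images.reshape(images.shape[0], -1)

# Synthetic stand-in for MNIST: 5 blank images of 28x28 pixels
fake_images = np.zeros((5, 28, 28), dtype=np.uint8)
flat = flatten_images(fake_images)
print(flat.shape)  # (5, 784)
```

Applied to the real data, the same call would be `flatten_images(x_train)` and `flatten_images(x_test)`.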
1.3 One-hot encode the labels: represent each integer 0-9 as a 10-dimensional vector
import numpy as np
def to_one_hot(labels, dimension=10):
    results = np.zeros((len(labels), dimension))
    for i, label in enumerate(labels):
        results[i, label] = 1.
    return results
y_train_vec = to_one_hot(y_train)
y_test_vec = to_one_hot(y_test)
print('shape of y_train_vec is' + str(y_train_vec.shape))
Output: the shape of the one-hot encoded training labels:
shape of y_train_vec is(60000, 10)
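The explicit loop in `to_one_hot` can also be written without a loop by indexing an identity matrix. This is only an alternative sketch; the name `to_one_hot_vectorized` is hypothetical, and it assumes every label is an integer in the range 0 to `dimension - 1`:

```python
import numpy as np

def to_one_hot_vectorized(labels, dimension=10):
    """Return a (len(labels), dimension) array with a 1.0 at each label index."""
    labels = np.asarray(labels, dtype=int)
    # Row k of the identity matrix is exactly the one-hot vector for class k
    return np.eye(dimension)[labels]

encoded = to_one_hot_vectorized([3, 0, 9])
print(encoded.shape)  # (3, 10)
print(encoded[0])     # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
```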
1.4 Split the original training set (60000 samples) into a training set (50000) and a validation set (10000)
rand_indices = np.random.permutation(60000)
train_indices = rand_indices[0:50000]
valid_indices = rand_indices[50000:60000]
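# Select the validation rows first: valid_indices refer to the original
# 60000-row arrays, so x_train_vec and y_train_vec must not be overwritten yet
x_valid_vec = x_train_vec[valid_indices, :]
y_valid_vec = y_train_vec[valid_indices, :]
x_train_vec = x_train_vec[train_indices, :]
y_train_vec = y_train_vec[train_indices, :]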
print('shape of x_train_vec: ' + str(x_train_vec.shape))
print('shape of y_train_vec: ' + str(y_train_vec.shape))
print('shape of x_valid_vec: ' + str(x_valid_vec.shape))
print('shape of y_valid_vec: ' + str(y_valid_vec.shape))
Output: the shapes of the training and validation sets after the split:
shape of x_train_vec: (50000, 784)
shape of y_train_vec: (50000, 10)
shape of x_valid_vec: (10000, 784)
shape of y_valid_vec: (10000, 10)
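The split can also be wrapped in a helper that indexes the original arrays exactly once and uses a seed, so the same split is reproduced on every run. This is a sketch on toy data; the function name `train_valid_split` and the `seed` parameter are assumptions added for illustration:

```python
import numpy as np

def train_valid_split(x, y, n_valid, seed=0):
    """Shuffle row indices once and hold out n_valid rows (n_valid > 0) for validation."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))
    train_idx, valid_idx = indices[:-n_valid], indices[-n_valid:]
    return x[train_idx], y[train_idx], x[valid_idx], y[valid_idx]

# Toy data: 6 samples with 2 features each
x = np.arange(12).reshape(6, 2)
y = np.arange(6)
x_tr, y_tr, x_va, y_va = train_valid_split(x, y, n_valid=2)
print(x_tr.shape, x_va.shape)  # (4, 2) (2, 2)
```

Because both subsets are taken from one permutation, no sample can appear in both the training and the validation set.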
2. Build the softmax classifier
Input: a 784-dimensional vector x
Output: f(x) = SoftMax(Wx + b)
(where W is the 10×784 weight matrix and b is the 10-dimensional bias)
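To make the formula concrete, it can be sketched directly in NumPy. The weights and input below are random placeholders rather than trained values, and the function names are chosen for illustration only:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - np.max(z, axis=-1, keepdims=True)  # shifting leaves the result unchanged
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

def softmax_classifier(x, W, b):
    """Compute f(x) = softmax(W x + b) for a single input vector x."""
    return softmax(W @ x + b)

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(10, 784))  # placeholder weights, not trained
b = np.zeros(10)                             # placeholder bias
x = rng.random(784)                          # placeholder input vector
probs = softmax_classifier(x, W, b)
print(probs.shape)  # (10,)
```

The ten outputs are non-negative and sum to 1, so they can be read as class probabilities. Note that a Keras `Dense` layer stores its kernel with shape (784, 10), the transpose of W as written here.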
from keras import models
from keras import layers
model = models.Sequential()
model.add(layers.Dense(10,activation= 'softmax', input_shape=(784,)))
#print the summary of the model
model.summary()
Output:
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 10)                7850
=================================================================
Total params: 7,850
Trainable params: 7,850
Non-trainable params: 0
_________________________________________________________________
3. Train the softmax classifier
3.1 Specify the optimizer, learning rate, loss function, and metrics
from keras import optimizers
# learning_rate replaces the deprecated lr argument in recent Keras releases
model.compile(optimizer=optimizers.RMSprop(learning_rate=0.0001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
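For intuition about the `categorical_crossentropy` loss, the quantity can be computed by hand on a tiny batch. This is a simplified NumPy sketch of the standard formula, not the Keras implementation; the clipping constant `eps` is an assumption added to avoid `log(0)`:

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Batch mean of -sum(y_true * log(y_pred)) over the classes."""
    y_pred = np.clip(y_pred, eps, 1.0)  # guard against log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two samples, three classes; y_true is one-hot, y_pred holds predicted probabilities
y_true = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.5, 0.25, 0.25]])
loss = categorical_crossentropy(y_true, y_pred)
print(loss)  # about 0.458, i.e. (-ln 0.8 - ln 0.5) / 2
```

With one-hot labels, only the predicted probability of the correct class contributes to each sample's loss.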
3.2 Specify the batch size and the number of epochs
history = model.fit(x_train_vec, y_train_vec,
batch_size=128, epochs =50,
validation_data = (x_valid_vec,y_valid_vec))
The output lists the loss, accuracy, val_loss, and val_accuracy for each epoch.
Epoch 1/50
391/391 [==============================] - 4s 3ms/step - loss: 74.4538 - accuracy: 0.3118 - val_loss: 16.1609 - val_accuracy: 0.7188
Epoch 2/50
391/391 [==============================] - 1s 2ms/step - loss: 13.4796 - accuracy: 0.7522 - val_loss: 9.5718 - val_accuracy: 0.8094
Epoch 3/50
391/391 [==============================] - 1s 2ms/step - loss: 8.9222 - accuracy: 0.8190 - val_loss: 7.6669 - val_accuracy: 0.8389
······
Epoch 50/50
391/391 [==============================] - 1s 2ms/step - loss: 1.2781 - accuracy: 0.9025 - val_loss: 1.8603 - val_accuracy: 0.8840
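The `391/391` at the start of each log line is the number of batches per epoch, which follows from the training set size and `batch_size`. A small sketch of that arithmetic (the helper name is made up for illustration):

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of batches in one pass over the data; the last batch may be smaller."""
    return math.ceil(n_samples / batch_size)

print(steps_per_epoch(50000, 128))  # 391
```

Here 390 full batches cover 49920 samples, and the final batch holds the remaining 80.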
4. Test results
4.1 Plot the accuracy of each epoch
import matplotlib.pyplot as plt
%matplotlib inline
epochs = range(50) #50 is the number of epochs
train_acc = history.history['accuracy']
valid_acc = history.history['val_accuracy']
plt.plot(epochs, train_acc, 'bo',label = 'Training Accuracy')
plt.plot(epochs, valid_acc, 'r', label = 'Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
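Besides plotting, the accuracy curve can be summarized numerically, for example by locating the epoch with the highest validation accuracy. The helper `best_epoch` below is hypothetical; the sample values are the first three `val_accuracy` entries from the training log above:

```python
import numpy as np

def best_epoch(val_acc):
    """Return (1-based epoch number, accuracy) for the highest validation accuracy."""
    idx = int(np.argmax(val_acc))
    return idx + 1, float(val_acc[idx])

# First three val_accuracy values from the training log
val_acc = [0.7188, 0.8094, 0.8389]
print(best_epoch(val_acc))  # (3, 0.8389)
```

After a real training run, the same function could be called as `best_epoch(history.history['val_accuracy'])`.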
4.2 Evaluate the model on the test set
loss_and_acc = model.evaluate(x_test_vec, y_test_vec)
print('loss = ' + str(loss_and_acc[0]))
print('accuracy = ' + str(loss_and_acc[1]))
The model reaches 88.5% accuracy on the test set:
313/313 [==============================] - 1s 2ms/step - loss: 1.9009 - accuracy: 0.8851
loss = 1.9008890390396118
accuracy = 0.8851000070571899
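The accuracy reported by `model.evaluate` is the fraction of samples whose highest-probability class matches the label. A sketch of that computation on toy data follows; the function name and the arrays are illustrative, and in practice the probabilities would come from `model.predict(x_test_vec)`:

```python
import numpy as np

def accuracy_from_probs(probs, one_hot_labels):
    """Fraction of rows where the most probable class equals the labeled class."""
    predicted = np.argmax(probs, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))

# Toy example: 3 samples, 3 classes; the third sample is misclassified
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.6, 0.3, 0.1]])
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]])
acc = accuracy_from_probs(probs, labels)
print(acc)  # 2 of 3 correct
```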
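Fully connected neural network (two hidden layers added between the input and the output)
1. Data loading and preprocessing are the same as above
2. Build the network (two hidden ReLU layers followed by a softmax output layer)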
from keras import models
from keras import layers
d1 = 500 #width of the 1st hidden layer
d2 = 500 #width of the 2nd hidden layer
model = models.Sequential()
model.add(layers.Dense(d1,activation='relu',input_shape=(784,)))
model.add(layers.Dense(d2,activation='relu'))
model.add(layers.Dense(10,activation='softmax'))
#print the summary of the model
model.summary()
Output: the model and its parameter counts:
Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_2 (Dense)              (None, 500)               392500
_________________________________________________________________
dense_3 (Dense)              (None, 500)               250500
_________________________________________________________________
dense_4 (Dense)              (None, 10)                5010
=================================================================
Total params: 648,010
Trainable params: 648,010
Non-trainable params: 0
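The parameter counts in the summary can be checked by hand: each `Dense` layer has one weight per input-output pair plus one bias per output. A short sketch (the helper name `dense_param_counts` is made up for illustration):

```python
def dense_param_counts(layer_widths):
    """Parameters per Dense layer: n_in * n_out weights plus n_out biases."""
    return [n_in * n_out + n_out
            for n_in, n_out in zip(layer_widths[:-1], layer_widths[1:])]

# Input width 784, two hidden layers of width 500, output width 10
counts = dense_param_counts([784, 500, 500, 10])
print(counts)       # [392500, 250500, 5010]
print(sum(counts))  # 648010
```

The same rule gives 784 × 10 + 10 = 7850 parameters for the single-layer softmax classifier.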
3. Train the network
3.1 Specify the optimizer, learning rate, loss function, and metrics
from keras import optimizers
# learning_rate replaces the deprecated lr argument in recent Keras releases
model.compile(optimizer=optimizers.RMSprop(learning_rate=0.0001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
3.2 Specify the batch size and the number of epochs
history = model.fit(x_train_vec, y_train_vec,
batch_size=128, epochs =50,
validation_data = (x_valid_vec,y_valid_vec))
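The output lists the loss, accuracy, val_loss, and val_accuracy for each epoch.
With the two hidden layers added, the model performs much better than before: the last logged validation accuracy rises from 0.8840 to 0.9763, and the validation loss falls from 1.8603 to 0.6443.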
Epoch 1/50
391/391 [==============================] - 3s 4ms/step - loss: 9.4749 - accuracy: 0.7422 - val_loss: 1.6772 - val_accuracy: 0.9148
Epoch 2/50
391/391 [==============================] - 1s 3ms/step - loss: 1.1193 - accuracy: 0.9351 - val_loss: 1.1648 - val_accuracy: 0.9304
Epoch 3/50
391/391 [==============================] - 1s 3ms/step - loss: 0.5274 - accuracy: 0.9589 - val_loss: 1.1609 - val_accuracy: 0.9287
······
391/391 [==============================] - 1s 3ms/step - loss: 0.0012 - accuracy: 0.9998 - val_loss: 0.6443 - val_accuracy: 0.9763
4. Test results
4.1 Plot the accuracy of each epoch
import matplotlib.pyplot as plt
%matplotlib inline
epochs = range(50) #50 is the number of epochs
train_acc = history.history['accuracy']
valid_acc = history.history['val_accuracy']
plt.plot(epochs, train_acc, 'bo',label = 'Training Accuracy')
plt.plot(epochs, valid_acc, 'r', label = 'Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
4.2 Evaluate the model on the test set
loss_and_acc = model.evaluate(x_test_vec, y_test_vec)
print('loss = ' + str(loss_and_acc[0]))
print('accuracy = ' + str(loss_and_acc[1]))
The model's accuracy on the test set rises to 97.7%:
313/313 [==============================] - 1s 2ms/step - loss: 0.5598 - accuracy: 0.9770
loss = 0.5597637891769409
accuracy = 0.9769999980926514