How to Develop a CNN for MNIST Handwritten Digit Classification

Latest recommended article published on 2023-01-04 11:00:25

架构师小秘圈

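Introduction

The MNIST handwritten digit classification problem is a standard dataset used in computer vision and deep learning.

Although the dataset is effectively solved, it can be used as the basis for learning and practicing how to develop, evaluate, and use convolutional deep learning neural networks for image classification from scratch. This includes how to develop a robust test harness for estimating the performance of the model, how to explore improvements to the model, and how to save the model and later load it to make predictions on new data.

In this tutorial, you will discover how to develop a convolutional neural network for handwritten digit classification from scratch.

After completing this tutorial, you will know:

How to develop a test harness that gives a robust estimate of model performance.
How to explore improvements to a baseline model.
How to save a final model and load it later to make predictions on new data.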

Let's get started.

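Tutorial Overview

This tutorial is divided into five parts; they are:

MNIST Handwritten Digit Classification Dataset
Model Evaluation Methodology
How to Develop a Baseline Model
How to Develop an Improved Model
How to Finalize the Model and Make Predictions

MNIST Handwritten Digit Classification Dataset

MNIST is an acronym for the Modified National Institute of Standards and Technology dataset.

Its training set contains 60,000 small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9, and a separate test set holds a further 10,000 images.

The task is to classify a given image of a handwritten digit into one of 10 classes representing the integer values 0 to 9, inclusive.

It is a widely used and deeply understood dataset and, for the most part, is "solved". Top-performing models are deep learning convolutional neural networks that achieve a classification accuracy above 99%, with an error rate between 0.4% and 0.2% on the hold-out test dataset.

The example below loads the MNIST dataset using the Keras API and creates a plot of the first nine images in the training dataset.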

# example of loading the mnist dataset
from keras.datasets import mnist
from matplotlib import pyplot
# load dataset
(trainX, trainy), (testX, testy) = mnist.load_data()
# summarize loaded dataset
print('Train: X=%s, y=%s' % (trainX.shape, trainy.shape))
print('Test: X=%s, y=%s' % (testX.shape, testy.shape))
# plot first few images
for i in range(9):
    # define subplot
    pyplot.subplot(330 + 1 + i)
    # plot raw pixel data
    pyplot.imshow(trainX[i], cmap=pyplot.get_cmap('gray'))
# show the figure
pyplot.show()

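Running the example loads the MNIST train and test datasets and prints their shapes. We can see that there are 60,000 examples in the training dataset and 10,000 in the test dataset, and that the images are indeed square with 28×28 pixels.

Train: X=(60000, 28, 28), y=(60000,)
Test: X=(10000, 28, 28), y=(10000,)

A plot of the first nine images in the dataset is also created, showing the natural handwritten nature of the images to be classified.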

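Model Evaluation Methodology

Although the MNIST dataset is effectively solved, it can be a useful starting point for developing and practicing a methodology for solving image classification tasks using convolutional neural networks.

Instead of reviewing the literature on well-performing models on the dataset, we can develop a new model from scratch.

The dataset already has a well-defined train and test dataset that we can use.

In order to estimate the performance of a model for a given training run, we can further split the training set into a train and validation dataset. Performance on the train and validation datasets over each run can then be plotted to provide learning curves and insight into how well a model is learning the problem.

The Keras API supports this through the "validation_data" argument to the model.fit() function when training the model. The call returns an object that describes model performance for the chosen loss and metrics on each training epoch.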

# record model performance on a validation dataset during training
history = model.fit(..., validation_data=(valX, valY))

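In order to estimate the performance of a model on the problem in general, we can use k-fold cross-validation, perhaps five-fold cross-validation. This gives some account of the model's variance, both with regard to differences in the training and test datasets and with regard to the stochastic nature of the learning algorithm. The performance of a model can be taken as the mean performance across the k folds, reported with the standard deviation, which could be used to estimate a confidence interval if desired.

We can use the KFold class from the scikit-learn API to implement the k-fold cross-validation evaluation of a given neural network model. There are many ways to achieve this, although we can choose a flexible approach where the KFold class is only used to specify the row indexes used for each split.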

# example of k-fold cv for a neural net
from sklearn.model_selection import KFold
data = ...
model = ...
# prepare cross validation
kfold = KFold(5, shuffle=True, random_state=1)
# enumerate splits
for train_ix, test_ix in kfold.split(data):
    ...

We will hold back the actual test dataset and use it as an evaluation of our final model.
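To make the role of the row indexes concrete, here is a minimal pure-Python sketch of how a shuffled k-fold split partitions the rows of the training set. It is an illustration under stated assumptions, not scikit-learn's implementation: the helper name kfold_indices is hypothetical, and its shuffling will not reproduce the exact folds that KFold produces for the same seed.

```python
import random

def kfold_indices(n_rows, n_folds=5, seed=1):
    """Yield (train_ix, test_ix) lists for a shuffled k-fold split (hypothetical helper)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # spread any remainder over the first folds so fold sizes differ by at most one
    base, extra = divmod(n_rows, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test_ix = indices[start:start + size]
        train_ix = indices[:start] + indices[start + size:]
        start += size
        yield train_ix, test_ix

# 60,000 rows, as in the MNIST training set
folds = list(kfold_indices(60000, n_folds=5))
for train_ix, test_ix in folds:
    print(len(train_ix), len(test_ix))
```

Every row lands in exactly one test fold, so each example is used for evaluation once and for training in the other folds.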

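How to Develop a Baseline Model

The first step is to develop a baseline model.

This is critical because it involves developing the infrastructure for the test harness, so that any model we design can be evaluated on the dataset, and because it establishes a baseline in model performance on the problem against which all improvements can be compared.

The design of the test harness is modular, and we can develop a separate function for each piece. This allows a given aspect of the test harness to be modified or swapped out, if we desire, separately from the rest.

We can develop this test harness with five key elements. They are the loading of the dataset, the preparation of the dataset, the definition of the model, the evaluation of the model, and the presentation of results.

Load Dataset

We know some things about the dataset.

For example, we know that the images are all pre-aligned (e.g. each image contains only a single hand-drawn digit), that the images all have the same square size of 28×28 pixels, and that the images are grayscale.

Therefore, we can load the images and reshape the data arrays to have a single color channel.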

# load dataset
(trainX, trainY), (testX, testY) = mnist.load_data()
# reshape dataset to have a single channel
trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
testX = testX.reshape((testX.shape[0], 28, 28, 1))
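The effect of the reshape can be checked without downloading the dataset. The sketch below uses a small all-zero NumPy array as a stand-in for real images (an assumption made purely for illustration) and shows that only a trailing channel axis is added.

```python
import numpy as np

# stand-in for a batch of five 28x28 grayscale images (all zeros, illustration only)
images = np.zeros((5, 28, 28), dtype='uint8')
# add a trailing channel axis; the pixel values themselves are unchanged
images_with_channel = images.reshape((images.shape[0], 28, 28, 1))
print(images.shape, '->', images_with_channel.shape)
```

The number of stored values stays the same; the extra axis simply tells the convolutional layer that each image has one channel.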

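We also know that there are 10 classes and that the classes are represented as unique integers.

We can, therefore, use a one hot encoding for the class element of each sample, transforming the integer into a 10-element binary vector with a 1 at the index of the class value and 0 values for all other classes. We can achieve this with the to_categorical() utility function.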

# one hot encode target values
trainY = to_categorical(trainY)
testY = to_categorical(testY)

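The load_dataset() function implements these behaviors and can be used to load the dataset; its definition appears in the complete example later in this section.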
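As a rough illustration of what one hot encoding produces, the following NumPy sketch builds the 10-element vectors by hand. The helper one_hot is hypothetical and is not the Keras implementation; it only mirrors the idea described above.

```python
import numpy as np

def one_hot(labels, n_classes=10):
    """Return a (len(labels), n_classes) array of one hot rows (hypothetical helper)."""
    labels = np.asarray(labels, dtype='int64')
    encoded = np.zeros((labels.shape[0], n_classes), dtype='float32')
    # set a single 1 per row, at the column given by the class value
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

encoded = one_hot([5, 0, 9])
print(encoded)
# the integer class can be recovered with argmax
print(encoded.argmax(axis=1))
```

Recovering the class with argmax is the same step that is later applied to the softmax output of the model to obtain a predicted digit.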

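Prepare Pixel Data

We know that the pixel values for each image in the dataset are unsigned integers in the range between black and white, or 0 and 255.

We do not know the best way to scale the pixel values for modeling, but we know that some scaling will be required.

A good starting point is to normalize the pixel values of grayscale images, e.g. rescale them to the range [0,1]. This involves first converting the data type from unsigned integers to floats, then dividing the pixel values by the maximum value.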

# convert from integers to floats
train_norm = train.astype('float32')
test_norm = test.astype('float32')
# normalize to range 0-1
train_norm = train_norm / 255.0
test_norm = test_norm / 255.0

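The prep_pixels() function below implements these behaviors and is provided with the pixel values for both the train and test datasets that need to be scaled.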

# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    # return normalized images
    return train_norm, test_norm

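This function must be called to prepare the pixel values prior to any modeling.

Define Model

Next, we need to define a baseline convolutional neural network model for the problem.

The model has two main aspects: the feature extraction front end comprised of convolutional and pooling layers, and the classifier back end that will make a prediction.

For the convolutional front end, we can start with a single convolutional layer with a small filter size (3,3) and a modest number of filters (32), followed by a max pooling layer. The filter maps can then be flattened to provide features to the classifier.

Given that the problem is a multi-class classification task, we know that we will require an output layer with 10 nodes in order to predict the probability distribution of an image belonging to each of the 10 classes. This will also require the use of a softmax activation function. Between the feature extractor and the output layer, we can add a dense layer to interpret the features, in this case with 100 nodes.

All layers other than the softmax output layer will use the ReLU activation function and the He weight initialization scheme, both of which are common good practice.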
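For a sense of scale, the sketch below computes the range that He uniform initialization draws from, using the usual definition limit = sqrt(6 / fan_in), together with a scalar ReLU. The helper names are hypothetical, and the fan-in values assume the baseline layer sizes described in this section (a 3×3 filter over one channel, and a dense layer fed by 13×13×32 flattened features).

```python
import math

def he_uniform_limit(fan_in):
    """Bound of the uniform range [-limit, limit] used by He uniform initialization."""
    return math.sqrt(6.0 / fan_in)

def relu(x):
    """Rectified linear unit for a single value."""
    return max(0.0, x)

conv_fan_in = 3 * 3 * 1        # 3x3 filter over one input channel
dense_fan_in = 13 * 13 * 32    # flattened feature maps feeding the 100-node layer
conv_limit = he_uniform_limit(conv_fan_in)
dense_limit = he_uniform_limit(dense_fan_in)
print('conv limit=%.4f dense limit=%.4f' % (conv_limit, dense_limit))
print(relu(-2.0), relu(3.0))
```

Layers with more inputs start with proportionally smaller weights, which keeps the variance of the activations roughly stable from layer to layer.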

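We will use a conservative configuration for the stochastic gradient descent optimizer with a learning rate of 0.01 and a momentum of 0.9. The categorical cross-entropy loss function will be optimized, which is suitable for multi-class classification, and we will monitor the classification accuracy metric, which is appropriate given that we have roughly the same number of examples in each of the 10 classes.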
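The update rule behind this optimizer configuration can be sketched in a few lines. The example below applies classical momentum to a toy one-dimensional objective f(w) = w², which is an assumption chosen only so that the behavior is easy to follow; the function name sgd_momentum_step is hypothetical.

```python
def sgd_momentum_step(weight, velocity, grad, learning_rate=0.01, momentum=0.9):
    """One update of classical momentum SGD; returns the new (weight, velocity)."""
    velocity = momentum * velocity - learning_rate * grad
    return weight + velocity, velocity

# toy objective f(w) = w**2 with gradient 2*w (illustration only)
weight, velocity = 1.0, 0.0
for step in range(300):
    weight, velocity = sgd_momentum_step(weight, velocity, grad=2.0 * weight)
print(weight)
```

The velocity term accumulates a decaying sum of past gradients, so the effective step is larger than the learning rate alone while the nominal rate stays small.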

The define_model() function below will define and return this model.

# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    # note: older Keras releases call this argument lr
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
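It can help to trace the tensor sizes through the baseline model by hand. The sketch below assumes the Keras defaults that the code above relies on (unpadded "valid" convolution with stride 1, and pooling with stride equal to the pool size); the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of non-overlapping max pooling (partial windows are dropped)."""
    return size // pool

side = conv_out(28)            # 28 -> 26
side = pool_out(side)          # 26 -> 13
flat = side * side * 32        # features after Flatten

conv_params = 3 * 3 * 1 * 32 + 32     # weights plus one bias per filter
dense_params = flat * 100 + 100
output_params = 100 * 10 + 10
total_params = conv_params + dense_params + output_params
print(flat, total_params)
```

Almost all of the parameters sit in the dense layer that reads the flattened feature maps, which is one reason adding pooling before the classifier tends to shrink a model considerably.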

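Evaluate Model

After the model is defined, we need to evaluate it.

The model will be evaluated using five-fold cross-validation. The value of k=5 was chosen to provide a baseline for repeated evaluation without being so large as to require a long running time. Each test set will be 20% of the training dataset, or about 12,000 examples, close to the size of the actual test set for this problem.

The training dataset is shuffled prior to being split, and the shuffling uses the same random seed each time, so that any model we evaluate will have the same train and test datasets in each fold, providing an apples-to-apples comparison between models.

We will train the baseline model for a modest 10 training epochs with a default batch size of 32 examples. The test set for each fold will be used to evaluate the model both during each epoch of the training run, so that we can later create learning curves, and at the end of the run, so that we can estimate the performance of the model. As such, we will keep track of the resulting history from each run, as well as the classification accuracy of the fold.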
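The amount of work implied by this configuration follows from simple arithmetic, sketched below with the numbers stated in the text (60,000 training rows, 5 folds, batch size 32, 10 epochs). The variable names are only illustrative.

```python
import math

n_train_rows, n_folds, batch_size, epochs = 60000, 5, 32, 10

fold_test_rows = n_train_rows // n_folds            # rows held out in each fold
fold_train_rows = n_train_rows - fold_test_rows     # rows used for fitting
steps_per_epoch = math.ceil(fold_train_rows / batch_size)
updates_per_fold = steps_per_epoch * epochs
print(fold_test_rows, fold_train_rows, steps_per_epoch, updates_per_fold)
```

Running all five folds therefore costs about five times the weight updates of a single training run on 80% of the data, which is the running-time trade-off mentioned above.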

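The evaluate_model() function below implements these behaviors, taking the defined model and training dataset as arguments and returning a list of accuracy scores and training histories that can be later summarized.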

# evaluate a model using k-fold cross-validation
def evaluate_model(model, dataX, dataY, n_folds=5):
    scores, histories = list(), list()
    # prepare cross validation
    kfold = KFold(n_folds, shuffle=True, random_state=1)
    # enumerate splits
    for train_ix, test_ix in kfold.split(dataX):
        # select rows for train and test
        trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        # fit model (note: the same model instance is reused, so weights carry over from fold to fold)
        history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
        # evaluate model
        _, acc = model.evaluate(testX, testY, verbose=0)
        print('> %.3f' % (acc * 100.0))
        # stores scores
        scores.append(acc)
        histories.append(history)
    return scores, histories
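One caveat about the function above: because it receives a single model instance and keeps fitting it, each fold continues from the weights of the previous fold, and the test rows of later folds were training rows earlier. A common remedy is to build a new model inside the loop. The sketch below shows that pattern with a toy stand-in for a network, a "model" that simply memorizes the rows it was trained on, so that the leakage can be measured directly. All names here are hypothetical, and the score is the fraction of test rows already seen, not an accuracy.

```python
def evaluate_with_fresh_model(build_model, folds, fit, score):
    """Train and score a newly built model on each (train_rows, test_rows) fold."""
    scores = []
    for train_rows, test_rows in folds:
        model = build_model()      # new, untrained model for every fold
        fit(model, train_rows)
        scores.append(score(model, test_rows))
    return scores

# toy stand-in for a network: a set that memorizes its training rows
def build_memorizer():
    return set()

def fit_memorizer(model, rows):
    model.update(rows)

def fraction_seen(model, rows):
    return sum(row in model for row in rows) / len(rows)

folds = [([2, 3, 4, 5], [0, 1]), ([0, 1, 4, 5], [2, 3]), ([0, 1, 2, 3], [4, 5])]
fresh_scores = evaluate_with_fresh_model(build_memorizer, folds, fit_memorizer, fraction_seen)

# reusing one model across folds leaks earlier training rows into later test sets
shared = build_memorizer()
reused_scores = []
for train_rows, test_rows in folds:
    fit_memorizer(shared, train_rows)
    reused_scores.append(fraction_seen(shared, test_rows))
print(fresh_scores, reused_scores)
```

With a Keras model the same idea would mean calling define_model() at the top of each fold rather than once before the loop; scores obtained that way are typically lower but more trustworthy.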

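Present Results

Once the model has been evaluated, we can present the results.

There are two key aspects to present: the diagnostics of the learning behavior of the model during training and the estimation of the model performance. These can be implemented using separate functions.

First, the diagnostics involve creating a line plot showing model performance on the train and test set during each fold of the k-fold cross-validation. These plots are valuable for getting an idea of whether a model is overfitting, underfitting, or has a good fit for the dataset.

We will create a single figure with two subplots, one for loss and one for accuracy. Blue lines will indicate model performance on the training dataset and orange lines will indicate performance on the hold-out test dataset. The summarize_diagnostics() function below creates and shows this plot given the collected training histories.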

# plot diagnostic learning curves
def summarize_diagnostics(histories):
    for i in range(len(histories)):
        # plot loss
        pyplot.subplot(211)
        pyplot.title('Cross Entropy Loss')
        pyplot.plot(histories[i].history['loss'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')
        # plot accuracy
        pyplot.subplot(212)
        pyplot.title('Classification Accuracy')
        # note: older Keras releases use the keys 'acc' and 'val_acc'
        pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
    pyplot.show()

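Next, the classification accuracy scores collected during each fold can be summarized by calculating the mean and standard deviation. This provides an estimate of the average expected performance of the model trained on this dataset, with an estimate of the average variance in the mean. We will also summarize the distribution of scores by creating and showing a box and whisker plot.

The summarize_performance() function below implements this for a given list of scores collected during model evaluation.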

# summarize model performance
def summarize_performance(scores):
    # print summary
    print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
    # box and whisker plots of results
    pyplot.boxplot(scores)
    pyplot.show()
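The summary statistics can be checked by hand. The sketch below uses only the standard library and takes the five per-fold accuracies reported for the baseline run later in this tutorial. It also shows the sample standard deviation for comparison, since numpy.std divides by n by default rather than n - 1.

```python
import statistics

# per-fold accuracies (in percent) as printed by the baseline run in this tutorial
scores = [98.558, 99.842, 99.992, 100.000, 100.000]

mean_acc = statistics.mean(scores)
std_population = statistics.pstdev(scores)   # divides by n, like numpy.std by default
std_sample = statistics.stdev(scores)        # divides by n - 1
print('mean=%.3f std=%.3f (sample std=%.3f)' % (mean_acc, std_population, std_sample))
```

With only five folds the two definitions differ noticeably, so it is worth stating which one is being reported when comparing models.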

Complete Example

We need a function that will drive the test harness.

This involves calling all of the defined functions.

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # define model
    model = define_model()
    # evaluate model
    scores, histories = evaluate_model(model, trainX, trainY)
    # learning curves
    summarize_diagnostics(histories)
    # summarize estimated performance
    summarize_performance(scores)

We now have everything we need; the complete code example of a baseline convolutional neural network model on the MNIST dataset is listed below.

# baseline cnn model for mnist
from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD

# load train and test dataset
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    # return normalized images
    return train_norm, test_norm

# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    # note: older Keras releases call this argument lr
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# evaluate a model using k-fold cross-validation
def evaluate_model(model, dataX, dataY, n_folds=5):
    scores, histories = list(), list()
    # prepare cross validation
    kfold = KFold(n_folds, shuffle=True, random_state=1)
    # enumerate splits
    for train_ix, test_ix in kfold.split(dataX):
        # select rows for train and test
        trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        # fit model (note: the same model instance is reused, so weights carry over from fold to fold)
        history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
        # evaluate model
        _, acc = model.evaluate(testX, testY, verbose=0)
        print('> %.3f' % (acc * 100.0))
        # stores scores
        scores.append(acc)
        histories.append(history)
    return scores, histories

# plot diagnostic learning curves
def summarize_diagnostics(histories):
    for i in range(len(histories)):
        # plot loss
        pyplot.subplot(211)
        pyplot.title('Cross Entropy Loss')
        pyplot.plot(histories[i].history['loss'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')
        # plot accuracy
        pyplot.subplot(212)
        pyplot.title('Classification Accuracy')
        # note: older Keras releases use the keys 'acc' and 'val_acc'
        pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
    pyplot.show()

# summarize model performance
def summarize_performance(scores):
    # print summary
    print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
    # box and whisker plots of results
    pyplot.boxplot(scores)
    pyplot.show()

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # define model
    model = define_model()
    # evaluate model
    scores, histories = evaluate_model(model, trainX, trainY)
    # learning curves
    summarize_diagnostics(histories)
    # summarize estimated performance
    summarize_performance(scores)

# entry point, run the test harness
run_test_harness()

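Running the example prints the classification accuracy for each fold of the cross-validation process. This is helpful to get an idea that the model evaluation is progressing.

We can see two cases where the model achieves perfect skill and one case where it achieves lower than 99% accuracy. These look like good results, but they should be read with care: evaluate_model() trains the same model instance in every fold, so from the second fold onward the test rows include examples the model was already trained on in earlier folds, and the later scores are optimistic.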

> 98.558
> 99.842
> 99.992
> 100.000
> 100.000

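Next, a diagnostic plot is shown, giving insight into the learning behavior of the model across each fold.

In this case, we can see that the model generally achieves a good fit, with train and test learning curves converging. There is no obvious sign of over- or underfitting.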

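Next, a summary of the model performance is calculated. We can see in this case that the model has an estimated skill of about 99.7%, which is impressive, although it has a relatively high standard deviation of about 0.5%.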

Accuracy: mean=99.678 std=0.563, n=5

Finally, a box and whisker plot is created to summarize the distribution of accuracy scores.

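As we would expect, the distribution is very tight, with most accuracy scores above 99.8% and one outlier.

We now have a robust test harness and a well-performing baseline model.

How to Develop an Improved Model

There are many ways that we might explore improvements to the baseline model.

We will look at areas of model configuration that often result in an improvement, so-called low-hanging fruit. The first is a change to the learning algorithm, and the second is an increase in the depth of the model.

Improvement to Learning

There are many aspects of the learning algorithm that can be explored for improvement.

Perhaps the point of biggest leverage is the learning rate, such as evaluating the impact that smaller or larger values of the learning rate may have, as well as schedules that change the learning rate during training.

Another approach that can rapidly accelerate the learning of a model and can result in large performance improvements is batch normalization. We will evaluate the effect that batch normalization has on our baseline model.

Batch normalization can be used after convolutional and fully connected layers. It has the effect of changing the distribution of the output of the layer, specifically by standardizing the outputs. This has the effect of stabilizing and accelerating the learning process.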
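The standardization step can be sketched for a single feature across one mini-batch. This is a simplified illustration of the training-time computation only: the function name batch_norm is hypothetical, gamma and beta stand for the learned scale and shift, and the epsilon value is an assumption. At prediction time, implementations normally use running averages of the mean and variance instead of batch statistics.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-3):
    """Standardize one feature across a mini-batch, then scale by gamma and shift by beta."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

# activations of one unit for a mini-batch of four examples (made-up numbers)
activations = [0.0, 2.0, 4.0, 10.0]
normalized = batch_norm(activations)
print(normalized)
```

After the transform the batch has a mean of zero and a variance close to one, regardless of the scale of the incoming activations.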

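We can update the model definition to use batch normalization after the activation function for the convolutional and dense layers of our baseline model. The updated version of the define_model() function with batch normalization is listed below.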

# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(BatchNormalization())
    model.add(Dense(10, activation='softmax'))
    # compile model
    # note: older Keras releases call this argument lr
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

The complete code listing with this change is provided below.

# cnn model with batch normalization for mnist
from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD
from keras.layers import BatchNormalization

# load train and test dataset
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    # return normalized images
    return train_norm, test_norm

# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(BatchNormalization())
    model.add(Dense(10, activation='softmax'))
    # compile model
    # note: older Keras releases call this argument lr
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# evaluate a model using k-fold cross-validation
def evaluate_model(model, dataX, dataY, n_folds=5):
    scores, histories = list(), list()
    # prepare cross validation
    kfold = KFold(n_folds, shuffle=True, random_state=1)
    # enumerate splits
    for train_ix, test_ix in kfold.split(dataX):
        # select rows for train and test
        trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        # fit model (note: the same model instance is reused, so weights carry over from fold to fold)
        history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
        # evaluate model
        _, acc = model.evaluate(testX, testY, verbose=0)
        print('> %.3f' % (acc * 100.0))
        # stores scores
        scores.append(acc)
        histories.append(history)
    return scores, histories

# plot diagnostic learning curves
def summarize_diagnostics(histories):
    for i in range(len(histories)):
        # plot loss
        pyplot.subplot(211)
        pyplot.title('Cross Entropy Loss')
        pyplot.plot(histories[i].history['loss'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')
        # plot accuracy
        pyplot.subplot(212)
        pyplot.title('Classification Accuracy')
        # note: older Keras releases use the keys 'acc' and 'val_acc'
        pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
    pyplot.show()

# summarize model performance
def summarize_performance(scores):
    # print summary
    print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
    # box and whisker plots of results
    pyplot.boxplot(scores)
    pyplot.show()

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # define model
    model = define_model()
    # evaluate model
    scores, histories = evaluate_model(model, trainX, trainY)
    # learning curves
    summarize_diagnostics(histories)
    # summarize estimated performance
    summarize_performance(scores)

# entry point, run the test harness
run_test_harness()

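Running the example again reports model performance for each fold of the cross-validation process. We can see perhaps a small drop in model performance as compared to the baseline across the cross-validation folds.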

> 98.592
> 99.792
> 99.933
> 99.992
> 99.983

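A plot of the learning curves is created, in this case showing that the speed of learning (improvement over epochs) does not appear to be different from the baseline model.

The plots suggest that batch normalization, at least as implemented in this case, does not offer any benefit.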

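Next, the estimated performance of the model is presented, showing a small decrease in mean accuracy: 99.658 as compared to 99.678 with the baseline model, but perhaps with a small decrease in the standard deviation.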

Accuracy: mean=99.658 std=0.538, n=5

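Increase in Model Depth

There are many ways to change the model configuration in order to explore improvements over the baseline model.

Two common approaches involve changing the capacity of the feature extraction part of the model or changing the capacity or function of the classifier part of the model. Perhaps the point of biggest influence is a change to the feature extractor.

We can increase the depth of the feature extractor part of the model, following a VGG-like pattern of adding more convolutional and pooling layers with the same sized filter, while increasing the number of filters. In this case, we will add a double convolutional layer with 64 filters each, followed by another max pooling layer.

The updated version of the define_model() function with this change is listed below.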

# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    # note: older Keras releases call this argument lr
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
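Tracing the sizes through the deeper model shows how the extra layers change the shape of the network. As before, this sketch assumes unpadded convolutions with stride 1 and pooling with stride equal to the pool size, and the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of non-overlapping max pooling (partial windows are dropped)."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square convolution kernel."""
    return kernel * kernel * in_channels * filters + filters

side = pool_out(conv_out(28))        # 28 -> 26 -> 13
side = conv_out(conv_out(side))      # 13 -> 11 -> 9
side = pool_out(side)                # 9 -> 4
flat = side * side * 64              # features after Flatten

total_params = (conv_params(3, 1, 32) + conv_params(3, 32, 64) + conv_params(3, 64, 64)
                + flat * 100 + 100       # dense layer with 100 nodes
                + 100 * 10 + 10)         # output layer with 10 nodes
print(flat, total_params)
```

Although the model is deeper, the second pooling stage shrinks the flattened feature vector so much that the total parameter count ends up well below that of the baseline.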

For completeness, the entire code listing, including this change, is provided below.

# deeper cnn model for mnist
from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD

# load train and test dataset
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    # return normalized images
    return train_norm, test_norm

# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    # note: older Keras releases call this argument lr
    opt = SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# evaluate a model using k-fold cross-validation
def evaluate_model(model, dataX, dataY, n_folds=5):
    scores, histories = list(), list()
    # prepare cross validation
    kfold = KFold(n_folds, shuffle=True, random_state=1)
    # enumerate splits
    for train_ix, test_ix in kfold.split(dataX):
        # select rows for train and test
        trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        # fit model (note: the same model instance is reused, so weights carry over from fold to fold)
        history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
        # evaluate model
        _, acc = model.evaluate(testX, testY, verbose=0)
        print('> %.3f' % (acc * 100.0))
        # stores scores
        scores.append(acc)
        histories.append(history)
    return scores, histories

# plot diagnostic learning curves
def summarize_diagnostics(histories):
    for i in range(len(histories)):
        # plot loss
        pyplot.subplot(211)
        pyplot.title('Cross Entropy Loss')
        pyplot.plot(histories[i].history['loss'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')
        # plot accuracy
        pyplot.subplot(212)
        pyplot.title('Classification Accuracy')
        # note: older Keras releases use the keys 'acc' and 'val_acc'
        pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
    pyplot.show()

# summarize model performance
def summarize_performance(scores):
    # print summary
    print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
    # box and whisker plots of results
    pyplot.boxplot(scores)
    pyplot.show()

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # define model
    model = define_model()
    # evaluate model
    scores, histories = evaluate_model(model, trainX, trainY)
    # learning curves
    summarize_diagnostics(histories)
    # summarize estimated performance
    summarize_performance(scores)

# entry point, run the test harness
run_test_harness()
from numpy import mean
from numpy import std
from matplotlib import pyplot
from sklearn.model_selection import KFold
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD

# load train and test dataset
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    # return normalized images
    return train_norm, test_norm

# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)  # this argument is named lr in Keras releases before 2.3
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# evaluate a model using k-fold cross-validation
def evaluate_model(model, dataX, dataY, n_folds=5):
    scores, histories = list(), list()
    # prepare cross validation
    kfold = KFold(n_folds, shuffle=True, random_state=1)
    # enumerate splits
    for train_ix, test_ix in kfold.split(dataX):
        # select rows for train and test
        trainX, trainY, testX, testY = dataX[train_ix], dataY[train_ix], dataX[test_ix], dataY[test_ix]
        # fit model
        history = model.fit(trainX, trainY, epochs=10, batch_size=32, validation_data=(testX, testY), verbose=0)
        # evaluate model
        _, acc = model.evaluate(testX, testY, verbose=0)
        print('> %.3f' % (acc * 100.0))
        # stores scores
        scores.append(acc)
        histories.append(history)
    return scores, histories

# plot diagnostic learning curves
def summarize_diagnostics(histories):
    for i in range(len(histories)):
        # plot loss
        pyplot.subplot(211)
        pyplot.title('Cross Entropy Loss')
        pyplot.plot(histories[i].history['loss'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_loss'], color='orange', label='test')
        # plot accuracy
        pyplot.subplot(212)
        pyplot.title('Classification Accuracy')
        # history keys are 'accuracy'/'val_accuracy' since Keras 2.3 ('acc'/'val_acc' before)
        pyplot.plot(histories[i].history['accuracy'], color='blue', label='train')
        pyplot.plot(histories[i].history['val_accuracy'], color='orange', label='test')
    pyplot.show()

# summarize model performance
def summarize_performance(scores):
    # print summary
    print('Accuracy: mean=%.3f std=%.3f, n=%d' % (mean(scores)*100, std(scores)*100, len(scores)))
    # box and whisker plots of results
    pyplot.boxplot(scores)
    pyplot.show()

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # define model
    model = define_model()
    # evaluate model
    scores, histories = evaluate_model(model, trainX, trainY)
    # learning curves
    summarize_diagnostics(histories)
    # summarize estimated performance
    summarize_performance(scores)

# entry point, run the test harness
run_test_harness()

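Running the example reports model performance for each fold of the cross-validation process.

The per-fold scores may suggest some improvement over the baseline. Note, however, that evaluate_model() receives a single model and keeps training it on every fold, so from the second fold onward the test rows have already been seen during training; scores such as 100.000 are therefore optimistic.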

> 98.925
> 99.867
> 99.983
> 99.992
> 100.000
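
The harness above passes one model instance into evaluate_model(), so each fold continues training the same weights. The following is a minimal sketch of the difference between that and building a fresh model per fold. MemorizingModel, kfold_indices() and evaluate() are hypothetical names introduced only for this illustration; the stand-in model simply remembers the samples it was fit on, which makes the leak easy to see without training a network.

```python
import random


class MemorizingModel:
    """Hypothetical stand-in for a network: it only remembers samples it was fit on."""

    def __init__(self):
        self.seen = {}

    def fit(self, xs, ys):
        # Like model.fit() in Keras, this adds to whatever the instance already learned.
        self.seen.update(zip(xs, ys))

    def accuracy(self, xs, ys):
        correct = sum(1 for x, y in zip(xs, ys) if self.seen.get(x) == y)
        return correct / len(xs)


def kfold_indices(n_samples, n_folds=5, seed=1):
    """Yield (train_ix, test_ix) pairs over shuffled, disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test_ix in enumerate(folds):
        train_ix = [ix for j, fold in enumerate(folds) if j != i for ix in fold]
        yield train_ix, test_ix


def evaluate(build_model, xs, ys, n_folds=5, reuse_model=False):
    """Score each fold, either with one shared model or a fresh model per fold."""
    shared = build_model() if reuse_model else None
    scores = []
    for train_ix, test_ix in kfold_indices(len(xs), n_folds):
        model = shared if reuse_model else build_model()
        model.fit([xs[i] for i in train_ix], [ys[i] for i in train_ix])
        scores.append(model.accuracy([xs[i] for i in test_ix], [ys[i] for i in test_ix]))
    return scores


xs = list(range(20))
ys = [x % 2 for x in xs]
fresh_scores = evaluate(MemorizingModel, xs, ys)
shared_scores = evaluate(MemorizingModel, xs, ys, reuse_model=True)
print('fresh model per fold:', fresh_scores)
print('one shared model:    ', shared_scores)
```

With the shared instance, every fold after the first is scored on samples the model has already stored, which mirrors the rising per-fold accuracies printed above. In the tutorial's harness, the equivalent change would presumably be to call define_model() inside the fold loop rather than once before it.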

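A plot of the learning curves is created, in this case showing that the model still fits the problem well, with no clear signs of overfitting. The plots may even suggest that further training epochs could be helpful.

Next, the estimated performance of the model is presented. It shows a small improvement over the baseline, from 99.678 to 99.753, along with a small drop in the standard deviation.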

Accuracy: mean=99.753 std=0.417, n=5
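
As a quick check, the summary line can be reproduced from the five per-fold scores printed earlier. This sketch uses only the standard library; the fold values are copied from the output above and are already in percent, so the multiplication by 100 that summarize_performance() applies is not needed here.

```python
from statistics import fmean, pstdev

# Per-fold accuracies (in percent) as printed by the run above.
fold_scores = [98.925, 99.867, 99.983, 99.992, 100.000]

# pstdev is the population standard deviation, matching numpy.std's default (ddof=0).
summary = 'Accuracy: mean=%.3f std=%.3f, n=%d' % (
    fmean(fold_scores), pstdev(fold_scores), len(fold_scores))
print(summary)
```

Most of the spread comes from the first fold, which sits almost a full percentage point below the other four.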

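How to Finalize the Model and Make Predictions

The process of model improvement may continue for as long as we have ideas and the time and resources to test them.

At some point, a final model configuration must be chosen and adopted. In this case, we will choose the deeper model as our final model.

First, we will finalize our model by fitting it on the entire training dataset and saving it to file for later use. We will then load the model and evaluate its performance on the hold-out test dataset to get an idea of how well the chosen model performs in practice. Finally, we will use the saved model to make a prediction on a single image.

Save Final Model

A final model is typically fit on all available data, such as the combination of the train and test datasets.

In this tutorial, we are intentionally holding back the test dataset so that we can estimate the performance of the final model, which can be a good idea in practice. As such, we will fit our model on the training dataset only.

# fit model
model.fit(trainX, trainY, epochs=10, batch_size=32, verbose=0)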

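Once fit, we can save the final model to an H5 file by calling the save() function on the model and passing in the chosen filename.

# save model
model.save('final_model.h5')

Note that saving and loading a Keras model in this format requires the h5py library to be installed on your workstation.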
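
One way to confirm the dependency before starting a long training run is to check whether the module can be found at all. This is a small sketch using only the standard library; has_module() is a hypothetical helper name, not part of Keras or h5py.

```python
from importlib.util import find_spec


def has_module(name):
    """Return True if the named top-level module can be imported in this environment."""
    return find_spec(name) is not None


if not has_module('h5py'):
    print('h5py is not installed; try: pip install h5py')
```

Running the check first avoids discovering the missing library only when model.save() is called at the end of training.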

The complete example of fitting the final deep model on the training dataset and saving it to file is listed below.

# save the final model to file
from keras.datasets import mnist
from keras.utils import to_categorical
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Dense
from keras.layers import Flatten
from keras.optimizers import SGD

# load train and test dataset
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    # return normalized images
    return train_norm, test_norm

# define cnn model
def define_model():
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', input_shape=(28, 28, 1)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(100, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(10, activation='softmax'))
    # compile model
    opt = SGD(learning_rate=0.01, momentum=0.9)  # this argument is named lr in Keras releases before 2.3
    model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # define model
    model = define_model()
    # fit model
    model.fit(trainX, trainY, epochs=10, batch_size=32, verbose=0)
    # save model
    model.save('final_model.h5')

# entry point, run the test harness
run_test_harness()

After running this example, you will have a 1.2 MB file named "final_model.h5" in your current working directory.
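
The file size can be sanity-checked by counting the parameters of the model defined in define_model(). The sketch below is plain arithmetic and assumes the Keras defaults used by that code: 'valid' padding, stride 1, and float32 weights. The helper names conv_params() and dense_params() are introduced here for illustration only.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a 2D convolution with a square kernel."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return inputs * units + units


size = 28
params = conv_params(3, 1, 32)      # Conv2D(32, (3, 3)) on a 1-channel input
size = size - 2                     # 3x3 'valid' convolution: 28 -> 26
size = size // 2                    # MaxPooling2D((2, 2)): 26 -> 13
params += conv_params(3, 32, 64)    # first Conv2D(64, (3, 3))
size = size - 2                     # 13 -> 11
params += conv_params(3, 64, 64)    # second Conv2D(64, (3, 3))
size = size - 2                     # 11 -> 9
size = size // 2                    # MaxPooling2D((2, 2)): 9 -> 4
flat = size * size * 64             # Flatten: 4 * 4 * 64 = 1024
params += dense_params(flat, 100)   # Dense(100)
params += dense_params(100, 10)     # Dense(10)

weight_bytes = params * 4           # float32 weights
print(params, weight_bytes, 2 * weight_bytes)
```

The weights alone come to roughly 0.6 MB. Because model.save() also stores the optimizer state by default, and SGD with momentum keeps one accumulator per weight, about twice that amount is a plausible explanation for the 1.2 MB file; the exact size also depends on the stored architecture and HDF5 overhead.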

Evaluate Final Model

We can now load the final model and evaluate it on the hold-out test dataset.

This is something we might do if we were interested in presenting the performance of the chosen model to project stakeholders.

The model can be loaded via the load_model() function.

The complete example of loading the saved model and evaluating it on the test dataset is listed below.

# evaluate the deep model on the test dataset
from keras.datasets import mnist
from keras.models import load_model
from keras.utils import to_categorical

# load train and test dataset
def load_dataset():
    # load dataset
    (trainX, trainY), (testX, testY) = mnist.load_data()
    # reshape dataset to have a single channel
    trainX = trainX.reshape((trainX.shape[0], 28, 28, 1))
    testX = testX.reshape((testX.shape[0], 28, 28, 1))
    # one hot encode target values
    trainY = to_categorical(trainY)
    testY = to_categorical(testY)
    return trainX, trainY, testX, testY

# scale pixels
def prep_pixels(train, test):
    # convert from integers to floats
    train_norm = train.astype('float32')
    test_norm = test.astype('float32')
    # normalize to range 0-1
    train_norm = train_norm / 255.0
    test_norm = test_norm / 255.0
    # return normalized images
    return train_norm, test_norm

# run the test harness for evaluating a model
def run_test_harness():
    # load dataset
    trainX, trainY, testX, testY = load_dataset()
    # prepare pixel data
    trainX, testX = prep_pixels(trainX, testX)
    # load model
    model = load_model('final_model.h5')
    # evaluate model on test dataset
    _, acc = model.evaluate(testX, testY, verbose=0)
    print('> %.3f' % (acc * 100.0))

# entry point, run the test harness
run_test_harness()

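Running the example loads the saved model and evaluates it on the hold-out test dataset.

The classification accuracy of the model on the test dataset is calculated and printed. In this case, the model achieved an accuracy of 99.090%, that is, an error rate just under 1%. This is not bad at all and is reasonably close to the cross-validation estimate of 99.753% with its standard deviation of about half a percentage point.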

> 99.090
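
To make the figure more concrete, the accuracy can be converted into a count of misclassified test images and compared with the cross-validation estimate. The sketch below is simple arithmetic on numbers reported in this tutorial; the variable names are chosen for illustration.

```python
test_images = 10000              # size of the MNIST test set
test_accuracy = 99.090           # percent, from the run above
cv_mean, cv_std = 99.753, 0.417  # percent, from the cross-validation summary

correct = round(test_images * test_accuracy / 100)
errors = test_images - correct
gap_in_stds = (cv_mean - test_accuracy) / cv_std

print('misclassified: %d of %d' % (errors, test_images))
print('gap to CV estimate: %.2f standard deviations' % gap_in_stds)
```

The test score falls a little more than one and a half standard deviations below the cross-validation mean.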

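Make Prediction

We can use our saved model to make a prediction on new images.

The model assumes that new images are grayscale, that they have been aligned so that each image contains one centered handwritten digit, and that each image is square with a size of 28×28 pixels.

Below is an image extracted from the MNIST test dataset. You can save it in your current working directory with the filename "sample_image.png".

[Figure: Sample Handwritten Digit]

We will pretend this is an entirely new and unseen image, prepared in the required way, and see how we might use our saved model to predict the integer that the image represents (e.g. we expect "7").

First, we can load the image, force it to be in grayscale format, and force its size to be 28×28 pixels. The loaded image can then be reshaped to have a single channel and to represent a single sample in a dataset. The load_image() function implements this and returns the loaded image ready for classification.

Importantly, the pixel values are prepared in the same way as they were for the training dataset when fitting the final model, in this case, normalized.

# load and prepare the image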
def load_image(filename):
    # load the image
    img = load_img(filename, color_mode='grayscale', target_size=(28, 28))
    # convert to array
    img = img_to_array(img)
    # reshape into a single sample with 1 channel
    img = img.reshape(1, 28, 28, 1)
    # prepare pixel data
    img = img.astype('float32')
    img = img / 255.0
    return img
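
The reshape and scaling steps can be checked without an image file or a trained model. The sketch below applies the same preparation to a synthetic array with NumPy. prepare_pixels() and the fake image are hypothetical stand-ins introduced for this illustration, and the explicit shape check is an addition that load_image() itself does not perform.

```python
import numpy as np


def prepare_pixels(pixels):
    """Turn one 28x28 grayscale image into a normalized batch of shape (1, 28, 28, 1)."""
    pixels = np.asarray(pixels)
    if pixels.shape != (28, 28):
        raise ValueError('expected a 28x28 grayscale image, got shape %s' % (pixels.shape,))
    batch = pixels.reshape(1, 28, 28, 1).astype('float32')
    return batch / 255.0


# Synthetic stand-in for a loaded image: a black canvas with one white stroke.
fake_image = np.zeros((28, 28), dtype='uint8')
fake_image[4:24, 14] = 255
batch = prepare_pixels(fake_image)
print(batch.shape, batch.dtype, batch.min(), batch.max())
```

The leading dimension of 1 is what lets a single image be passed to the model as a batch containing one sample.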

Next, we can load the model as in the previous section and call the predict_classes() function to predict the digit that the image represents.

# predict the class
digit = model.predict_classes(img)
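
In more recent TensorFlow/Keras releases, predict_classes() is no longer available on Sequential models. The usual replacement is to take the argmax over the class probabilities returned by predict(). The sketch below shows that step in isolation with NumPy; the probability vector is a made-up example rather than real model output.

```python
import numpy as np

# Hypothetical softmax output for one image: one probability per digit class 0-9.
probabilities = np.array([[0.01, 0.01, 0.02, 0.01, 0.01, 0.01, 0.01, 0.90, 0.01, 0.01]])

# The index of the largest probability along the class axis is the predicted digit.
digit = np.argmax(probabilities, axis=-1)
print(digit[0])
```

With a loaded model, the probabilities array would come from model.predict(img) instead of being typed in by hand.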

The complete example is listed below.

# make a prediction for a new image.