Training on the MNIST handwritten digit dataset with Keras

强风吹拂~~~

Last modified 2023-09-24 22:13:20

Category: Artificial Intelligence. Tags: deep learning, python

First published 2020-10-20 21:29:09

Article link: https://blog.csdn.net/IT_frshman_number31/article/details/109189485

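This post is a set of study notes, not a tutorial; if you spot a mistake, please leave a comment and let me know.
The previous post covered how to install Keras. Now we start training on the MNIST dataset, and compare two training runs to check that dropout reduces overfitting.

Enough talk, here is the code: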

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.utils import to_categorical
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout

# Load the MNIST data
(x_train_image, y_train_label), (x_test_image, y_test_label) = mnist.load_data()

# Preprocess the images
# Flatten each 28x28 image into a 784-element vector
x_Test = x_test_image.reshape(10000, 784).astype('float32')
x_Train = x_train_image.reshape(60000, 784).astype('float32')

# Scale the grayscale values into the range [0, 1]
x_Train_normalize = x_Train / 255
x_Test_normalize = x_Test / 255

# Convert the labels to one-hot form
# (to_categorical is imported directly because keras.utils.np_utils is not available in newer Keras releases)
y_Train_OneHot = to_categorical(y_train_label)
y_Test_OneHot = to_categorical(y_test_label)



# Display images together with their labels and predictions
def plot_image_prediction(images, labels, prediction, idx, num=25):
    fig = plt.gcf()
    fig.set_size_inches(12, 14)
    # Show at most 25 images
    if num > 25:
        num = 25
    for i in range(0, num):
        # Split the canvas into a 5 x 5 grid of subplots
        ax = plt.subplot(5, 5, 1 + i)
        ax.imshow(images[idx], cmap='binary')
        title = "label=" + str(labels[idx])
        if len(prediction) > 0:
            # Append the prediction instead of overwriting the label
            title += ",prediction=" + str(prediction[idx])
        ax.set_title(title, fontsize=10)
        ax.set_xticks([])
        ax.set_yticks([])
        idx += 1
    plt.show()

# Plot the training history (one curve for training, one for validation)
def show_history(train_history, train, validation):
    plt.plot(train_history.history[train])
    plt.plot(train_history.history[validation])
    # print(train_history.history[train])
    # print(train_history.history[validation])
    plt.title('Train History')
    plt.ylabel(train)
    plt.xlabel('Epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()

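# Second training setup: two hidden layers with dropout
def train1():
    # Use a Sequential model
    model = Sequential()
    # First hidden layer, fed by the 784-dimensional input
    model.add(Dense(units=1000, input_dim=784, kernel_initializer='normal', activation='relu'))
    # Add dropout to reduce overfitting
    model.add(Dropout(0.5))
    model.add(Dense(units=1000, kernel_initializer='normal', activation='relu'))
    model.add(Dropout(0.5))
    # Output layer: 10 outputs, one per digit (0-9)
    model.add(Dense(units=10, kernel_initializer='normal', activation='softmax'))
    # Cross-entropy loss, Adam optimizer, accuracy as the reported metric
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # validation_split is the fraction of the training data held out for validation (used to tune hyperparameters)
    train_history = model.fit(x=x_Train_normalize, y=y_Train_OneHot, validation_split=0.2, epochs=10, batch_size=200, verbose=2)
    # Plot the training history
    # (older Keras releases name these history keys 'acc' and 'val_acc')
    show_history(train_history, 'accuracy', 'val_accuracy')
    show_history(train_history, 'loss', 'val_loss')
    # Evaluate on the test set
    scores = model.evaluate(x_Test_normalize, y_Test_OneHot)
    print()
    # Print the test accuracy
    print(scores[1])

    # Predict on the normalized test images, the same scaling the model was trained on.
    # argmax over the softmax outputs replaces predict_classes, which newer Keras releases removed.
    prediction = np.argmax(model.predict(x_Test_normalize), axis=-1)
    # Show 25 images starting at index 340, with their predictions
    plot_image_prediction(x_test_image, y_test_label, prediction, idx=340)

    # Print the confusion matrix
    crossTable = pd.crosstab(y_test_label, prediction, rownames=['label'], colnames=['predict'])
    print(crossTable)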

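# First training setup: one hidden layer, no dropout
def train():
    # Use a Sequential model
    model = Sequential()
    # Hidden layer, fed by the 784-dimensional input
    model.add(Dense(units=256, input_dim=784, kernel_initializer='normal', activation='relu'))
    # Output layer: 10 outputs, one per digit (0-9)
    model.add(Dense(units=10, kernel_initializer='normal', activation='softmax'))
    # Cross-entropy loss, Adam optimizer, accuracy as the reported metric
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    # validation_split is the fraction of the training data held out for validation (used to tune hyperparameters)
    train_history = model.fit(x=x_Train_normalize, y=y_Train_OneHot, validation_split=0.2, epochs=10, batch_size=200,
                              verbose=2)
    # Plot the training history
    # (older Keras releases name these history keys 'acc' and 'val_acc')
    show_history(train_history, 'accuracy', 'val_accuracy')
    show_history(train_history, 'loss', 'val_loss')
    # Evaluate on the test set
    scores = model.evaluate(x_Test_normalize, y_Test_OneHot)
    print()
    # Print the test accuracy
    print(scores[1])

    # Predict on the normalized test images, the same scaling the model was trained on.
    # argmax over the softmax outputs replaces predict_classes, which newer Keras releases removed.
    prediction = np.argmax(model.predict(x_Test_normalize), axis=-1)
    # Show 25 images starting at index 340, with their predictions
    plot_image_prediction(x_test_image, y_test_label, prediction, idx=340)

    # Print the confusion matrix
    crossTable = pd.crosstab(y_test_label, prediction, rownames=['label'], colnames=['predict'])
    print(crossTable)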

train()
train1()

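The comments in the code above explain the details, so I won't go through it again here. We trained two models and printed the training results and confusion matrix for each; the results are covered in the sections that follow.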
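To make the preprocessing step more concrete, here is a minimal NumPy-only sketch of what flattening, scaling, and one-hot encoding do. The `one_hot` helper and the random `fake_images` are assumptions made up for illustration; the post itself relies on Keras for the one-hot step and on the real MNIST arrays.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return a (len(labels), num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# Hypothetical stand-in for three 28x28 grayscale images with pixel values 0-255
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)
fake_labels = [5, 0, 9]

flat = fake_images.reshape(-1, 784).astype("float32")  # one 784-element vector per image
scaled = flat / 255                                     # pixel values now lie in [0, 1]
targets = one_hot(fake_labels)                          # one row per label, 10 columns

print(flat.shape, targets.shape)
print(targets[0])
```

Each row of `targets` has the same layout as the 10-way softmax output, which is why `categorical_crossentropy` can compare the two directly.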

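1. Results of the first run

[Figure: training and validation accuracy curves for the first run]
Accuracy on both the training data and the validation data keeps rising as training proceeds. Note, however, that by the end of training the training accuracy is slightly higher than the validation accuracy, which hints at overfitting.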
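One way to put a number on that observation is to compute the per-epoch difference between training and validation accuracy. The sketch below is only an illustration: `accuracy_gap` is a hypothetical helper, the key names assume a recent Keras release, and the numbers in `example_history` are made up, not the results of this post.

```python
def accuracy_gap(history, train_key="accuracy", val_key="val_accuracy"):
    """Return the per-epoch difference (training accuracy - validation accuracy)."""
    return [t - v for t, v in zip(history[train_key], history[val_key])]

# Hypothetical numbers for illustration only, not measurements from this post
example_history = {
    "accuracy":     [0.92, 0.96, 0.98, 0.99],
    "val_accuracy": [0.94, 0.96, 0.97, 0.97],
}

gaps = accuracy_gap(example_history)
for epoch, gap in enumerate(gaps, start=1):
    print(f"epoch {epoch}: gap = {gap:+.3f}")
```

With a real run, the dictionary to pass in would be `train_history.history`. A gap that keeps growing in the positive direction is the pattern described above.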

The first run also prints its confusion matrix; the accuracy on the test set is 0.9783.
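The test accuracy and the confusion matrix are two views of the same predictions: correct predictions sit on the diagonal of the matrix. The following sketch shows the relationship on toy data. The `confusion_matrix` helper and the 3-class labels are assumptions for illustration; the post builds its matrix with `pd.crosstab` on the 10 MNIST classes.

```python
import numpy as np

def confusion_matrix(labels, predictions, num_classes):
    """Count how often each true label (row) receives each prediction (column)."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for true, pred in zip(labels, predictions):
        matrix[true, pred] += 1
    return matrix

# Hypothetical toy labels and predictions for a 3-class problem
toy_labels      = [0, 0, 1, 1, 2, 2, 2, 1]
toy_predictions = [0, 0, 1, 2, 2, 2, 1, 1]

cm = confusion_matrix(toy_labels, toy_predictions, num_classes=3)
overall_accuracy = np.trace(cm) / cm.sum()        # diagonal entries are the correct predictions
per_class_recall = np.diag(cm) / cm.sum(axis=1)   # share of each true class that was recognized

print(cm)
print(overall_accuracy)
print(per_class_recall)
```

Off-diagonal entries show which digits get confused with each other, something the single accuracy number cannot tell you.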

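2. Results of the second run

For the second run we added dropout, along with one more hidden layer to improve accuracy. The results are as follows.
[Figure: training and validation accuracy curves for the second run]
With dropout added, the validation and training accuracy end up close to each other. Dropout is only active during training, so it lowers the accuracy measured on the training data itself and narrows the gap to the validation accuracy, which reduces overfitting.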
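To illustrate why dropout affects the training metrics but not evaluation, here is a minimal NumPy sketch of inverted dropout. It is not the Keras implementation: the `dropout` function and the all-ones `hidden` array are assumptions for illustration, although the rescaling by 1 / (1 - rate) follows what the Keras `Dropout` layer documents.

```python
import numpy as np

def dropout(activations, rate, training, rng):
    """Inverted dropout: zero a fraction `rate` of units while training, rescale the rest."""
    if not training or rate == 0.0:
        return activations  # at inference time every unit passes through unchanged
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
hidden = np.ones((4, 1000), dtype="float32")  # hypothetical hidden-layer activations

train_out = dropout(hidden, rate=0.5, training=True, rng=rng)
test_out = dropout(hidden, rate=0.5, training=False, rng=rng)

print("fraction zeroed during training:", float((train_out == 0).mean()))
print("mean activation (train, test):", float(train_out.mean()), float(test_out.mean()))
```

Roughly half of the units are zeroed during training while the mean activation stays close to the inference value, so the network cannot rely on any single unit being present.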

With the extra hidden layer, our prediction accuracy also improved somewhat.
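Note that the second model is both deeper and wider than the first (1000 units per hidden layer instead of 256), so the gain may not come from the extra layer alone. As a rough sketch, the helper below (a hypothetical `dense_param_count`, not part of Keras) counts the weights and biases of the two fully connected stacks defined in the code. Dropout layers add no trainable parameters, so they are left out.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

first_model = dense_param_count([784, 256, 10])          # train(): one hidden layer
second_model = dense_param_count([784, 1000, 1000, 10])  # train1(): two hidden layers

print(first_model, second_model)  # 203530 1796010
```

The second network has close to nine times as many parameters, which is one reason dropout is a sensible addition there.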

[Figure: sample recognition results]
Part of the recognition output is shown in the figure.
