Deep learning algorithms in Python: building a multilayer perceptron with Keras

Using Keras to build a multilayer perceptron that classifies the handwritten digits in the MNIST dataset.

import os
import sys
import struct
import numpy as np
import matplotlib.pyplot as plt
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import tensorflow.keras as keras

def load_mnist(path, kind='train'):
    """Load MNIST data from `path`"""
    # the images are stored as raw bytes in the IDX format;
    # each file sits in a subdirectory of `path` named after it
    labels_path = os.path.join(path,
                               '%s-labels-idx1-ubyte' % kind,
                               '%s-labels.idx1-ubyte' % kind)
    images_path = os.path.join(path,
                               '%s-images-idx3-ubyte' % kind,
                               '%s-images.idx3-ubyte' % kind)

    with open(labels_path, 'rb') as lbpath:
        # struct.unpack() returns a tuple of the unpacked values, even when only one
        # value is read (see the small sketch right after this function)
        # struct.pack() is the inverse operation of struct.unpack()
        # > : big-endian byte order, which defines how the bytes of each value are ordered;
        #     the IDX headers are stored big-endian, even though most desktop CPUs are
        #     little-endian internally
        # I : an unsigned 32-bit integer
        magic, n = struct.unpack('>II',
                                 lbpath.read(8))
        # np.fromfile() reads the remaining bytes of the file into an array; the element
        # dtype must be given explicitly, and the array is reshaped afterwards as needed
        labels = np.fromfile(lbpath,
                             dtype=np.uint8)

    with open(images_path, 'rb') as imgpath:
        magic, num, rows, cols = struct.unpack(">IIII",
                                               imgpath.read(16))
        images = np.fromfile(imgpath,
                             dtype=np.uint8).reshape(len(labels), 784)
        # rescale the pixel values from [0, 255] to the range [-1, 1]
        images = ((images / 255.) - .5) * 2

    return images, labels
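# A small standalone sketch (the header bytes below are fabricated for illustration,
# not read from a real file) of what the '>II' format string above does: pack a fake
# IDX label-file header, then unpack it back into two big-endian unsigned 32-bit
# integers (magic number and item count).
fake_header = struct.pack('>II', 2049, 60000)
print(struct.unpack('>II', fake_header))  # -> (2049, 60000)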

X_train, y_train = load_mnist('xxx',
                              kind='train')
print('Rows: %d, columns: %d' % (X_train.shape[0], X_train.shape[1]))
X_test, y_test = load_mnist('xxx',
                            kind='t10k')
print('Rows: %d, columns: %d' % (X_test.shape[0], X_test.shape[1]))
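# Optional sketch (matplotlib is already imported above but otherwise unused):
# visualize the first 10 training digits to sanity-check the loading code.
fig, ax = plt.subplots(nrows=2, ncols=5, sharex=True, sharey=True)
ax = ax.flatten()
for i in range(10):
    ax[i].imshow(X_train[i].reshape(28, 28), cmap='Greys')
ax[0].set_xticks([])
ax[0].set_yticks([])
plt.tight_layout()
plt.show()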

## mean centering and normalization:
mean_vals = np.mean(X_train, axis=0)
std_val = np.std(X_train)
X_train_centered = (X_train - mean_vals)/std_val
X_test_centered = (X_test - mean_vals)/std_val
del X_train, X_test
print(X_train_centered.shape, y_train.shape)
print(X_test_centered.shape, y_test.shape)

# set random seeds for NumPy and TensorFlow to get reproducible results
np.random.seed(123)
tf.set_random_seed(123)
# convert the class labels (integers 0-9) into one-hot format
y_train_onehot = keras.utils.to_categorical(y_train)
print('First 3 labels: ', y_train[:3])
print('\nFirst 3 labels (one-hot):\n', y_train_onehot[:3])
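# Equivalent sketch of what to_categorical does here: row y of a 10x10 identity
# matrix is the one-hot vector for class label y.
print(np.eye(10)[y_train[:3]])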

# Initialize a new model with the Sequential class to implement a feedforward network.
# Setting kernel_initializer='glorot_uniform' selects Glorot initialization
# (also known as Xavier initialization) for the weight matrices.
model = keras.models.Sequential()
model.add(
    keras.layers.Dense(
        units=50,
        input_dim=X_train_centered.shape[1],
        kernel_initializer='glorot_uniform',
        bias_initializer='zeros',  # initializing the biases to zero is common; in fact, it is the Keras default
        activation='tanh'))

model.add(
    keras.layers.Dense(
        units=50,
        input_dim=50,
        kernel_initializer='glorot_uniform',
        bias_initializer='zeros',
        activation='tanh'))

model.add(
    keras.layers.Dense(
        units=y_train_onehot.shape[1],
        input_dim=50,
        kernel_initializer='glorot_uniform',
        bias_initializer='zeros',
        activation='softmax'))
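# Optional check (a sketch using the standard Keras model.summary() call): print the
# layer output shapes and parameter counts, e.g. the first Dense layer holds
# 784*50 + 50 = 39,250 parameters.
model.summary()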

# (BGD is batch gradient descent: the gradient update is made only after all samples
# have been processed. SGD is stochastic gradient descent: an update is made after each
# randomly drawn sample, which is much faster but can get stuck in local optima.
# The compromise is mini-batch gradient descent: the data is split into batches and one
# update is made per batch, which reduces the randomness while keeping each update cheap.)
# Use stochastic gradient descent as the optimizer; the decay constant (decay) and the
# momentum term (momentum) adjust the learning rate over the course of training.
# lr: float >= 0, the learning rate
# momentum: float >= 0, the momentum parameter
# decay: float >= 0, learning-rate decay applied after each batch update (not per epoch):
#   lr_t = lr / (1 + decay * iterations)
#   (illustrated in the small snippet after model.compile() below)
sgd_optimizer = keras.optimizers.SGD(  # stochastic gradient descent optimizer
        lr=0.001, decay=1e-7, momentum=.9)
# set the cost (loss) function to categorical_crossentropy
model.compile(optimizer=sgd_optimizer,
              loss='categorical_crossentropy')  # multi-class log loss
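# Illustration only (not part of the training script): with the per-update rule
# lr_t = lr / (1 + decay * iterations), a decay of 1e-7 changes the learning rate
# only very slowly over the batch updates.
for n_updates in (0, 10000, 100000):
    print(n_updates, 0.001 / (1. + 1e-7 * n_updates))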

# The validation_split parameter is particularly handy: it reserves 10% of the training
# data (6,000 samples in this example) for validation after each epoch, so we can
# monitor whether the model starts to overfit during training.
history = model.fit(X_train_centered, y_train_onehot,
                    batch_size=64, epochs=50,
                    verbose=1,
                    validation_split=0.1)
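# Optional sketch: plot the training and validation losses recorded in history.history
# to check whether the validation loss stops improving (an early sign of overfitting).
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()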
# predict class labels for the training set
y_train_pred = model.predict_classes(X_train_centered, verbose=0)
# print the model's accuracy on the training and test sets
# training set
correct_preds = np.sum(y_train == y_train_pred, axis=0)
train_acc = correct_preds / y_train.shape[0]
print('First 3 predictions: ', y_train_pred[:3])
print('Training accuracy: %.2f%%' % (train_acc * 100))
# test set
y_test_pred = model.predict_classes(X_test_centered,
                                    verbose=0)
correct_preds = np.sum(y_test == y_test_pred, axis=0)
test_acc = correct_preds / y_test.shape[0]
print('Test accuracy: %.2f%%' % (test_acc * 100))

Output:
Rows: 60000, columns: 784
Rows: 10000, columns: 784
(60000, 784) (60000,)
(10000, 784) (10000,)
First 3 labels: [5 0 4]

First 3 labels (one-hot):
[[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]]

Train on 54000 samples, validate on 6000 samples
Epoch 1/50
52992/54000 [============================>.] - ETA: 0s - loss: 0.7425
54000/54000 [==============================] - 2s 28us/sample - loss: 0.7370 - val_loss: 0.3618
Epoch 2/50
54000/54000 [==============================] - 2s 29us/sample - loss: 0.3656 - val_loss: 0.2734
Epoch 3/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.2994 - val_loss: 0.2343
Epoch 4/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.2629 - val_loss: 0.2111
Epoch 5/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.2377 - val_loss: 0.1941
Epoch 6/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.2186 - val_loss: 0.1792
Epoch 7/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.2028 - val_loss: 0.1691
Epoch 8/50
54000/54000 [==============================] - 1s 27us/sample - loss: 0.1897 - val_loss: 0.1610
Epoch 9/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1782 - val_loss: 0.1546
Epoch 10/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1684 - val_loss: 0.1485
Epoch 11/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1593 - val_loss: 0.1432
Epoch 12/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1514 - val_loss: 0.1390
Epoch 13/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1442 - val_loss: 0.1354
Epoch 14/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1377 - val_loss: 0.1323
Epoch 15/50
54000/54000 [==============================] - 1s 28us/sample - loss: 0.1317 - val_loss: 0.1292
Epoch 16/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1262 - val_loss: 0.1275
Epoch 17/50
54000/54000 [==============================] - 1s 27us/sample - loss: 0.1212 - val_loss: 0.1244
Epoch 18/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1166 - val_loss: 0.1220
Epoch 19/50
54000/54000 [==============================] - 1s 28us/sample - loss: 0.1122 - val_loss: 0.1215
Epoch 20/50
54000/54000 [==============================] - 1s 28us/sample - loss: 0.1083 - val_loss: 0.1195
Epoch 21/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1044 - val_loss: 0.1182
Epoch 22/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.1010 - val_loss: 0.1165
Epoch 23/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0976 - val_loss: 0.1154
Epoch 24/50
54000/54000 [==============================] - 1s 28us/sample - loss: 0.0943 - val_loss: 0.1141
Epoch 25/50
54000/54000 [==============================] - 1s 28us/sample - loss: 0.0915 - val_loss: 0.1138
Epoch 26/50
54000/54000 [==============================] - 1s 28us/sample - loss: 0.0885 - val_loss: 0.1121
Epoch 27/50
54000/54000 [==============================] - 1s 28us/sample - loss: 0.0858 - val_loss: 0.1125
Epoch 28/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0832 - val_loss: 0.1111
Epoch 29/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0807 - val_loss: 0.1109
Epoch 30/50
54000/54000 [==============================] - 2s 29us/sample - loss: 0.0784 - val_loss: 0.1100
Epoch 31/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0761 - val_loss: 0.1098
Epoch 32/50
54000/54000 [==============================] - 2s 29us/sample - loss: 0.0740 - val_loss: 0.1098
Epoch 33/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0718 - val_loss: 0.1089
Epoch 34/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0698 - val_loss: 0.1087
Epoch 35/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0679 - val_loss: 0.1091
Epoch 36/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0660 - val_loss: 0.1083
Epoch 37/50
54000/54000 [==============================] - 2s 29us/sample - loss: 0.0643 - val_loss: 0.1087
Epoch 38/50
54000/54000 [==============================] - 2s 29us/sample - loss: 0.0625 - val_loss: 0.1085
Epoch 39/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0609 - val_loss: 0.1068
Epoch 40/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0592 - val_loss: 0.1077
Epoch 41/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0576 - val_loss: 0.1078
Epoch 42/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0560 - val_loss: 0.1082
Epoch 43/50
54000/54000 [==============================] - 1s 28us/sample - loss: 0.0545 - val_loss: 0.1086
Epoch 44/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0532 - val_loss: 0.1089
Epoch 45/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0519 - val_loss: 0.1078
Epoch 46/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0504 - val_loss: 0.1097
Epoch 47/50
54000/54000 [==============================] - 1s 28us/sample - loss: 0.0492 - val_loss: 0.1089
Epoch 48/50
54000/54000 [==============================] - 2s 29us/sample - loss: 0.0479 - val_loss: 0.1087
Epoch 49/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0467 - val_loss: 0.1084
Epoch 50/50
54000/54000 [==============================] - 2s 28us/sample - loss: 0.0456 - val_loss: 0.1090

First 3 predictions: [5 0 4]
Training accuracy: 98.96%
Test accuracy: 96.39%

Note: this is a very simple neural network with no hyperparameter tuning. If you want to experiment further with Keras, try adjusting the learning rate, momentum, weight decay, and the number of hidden units, for example as in the sketch below.
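Here is a minimal sketch of one such experiment (the settings are illustrative, not tuned values): wider hidden layers, a larger learning rate, and ReLU activations, with everything else kept as above.

# hypothetical variation for experimentation -- reuses the data prepared earlier
model = keras.models.Sequential([
    keras.layers.Dense(units=100, input_dim=X_train_centered.shape[1],
                       kernel_initializer='glorot_uniform',
                       bias_initializer='zeros', activation='relu'),
    keras.layers.Dense(units=100, kernel_initializer='glorot_uniform',
                       bias_initializer='zeros', activation='relu'),
    keras.layers.Dense(units=y_train_onehot.shape[1],
                       kernel_initializer='glorot_uniform',
                       bias_initializer='zeros', activation='softmax')])
model.compile(optimizer=keras.optimizers.SGD(lr=0.01, decay=1e-7, momentum=.9),
              loss='categorical_crossentropy')
history = model.fit(X_train_centered, y_train_onehot,
                    batch_size=64, epochs=50, verbose=1, validation_split=0.1)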
