Fashion-MNIST classification: adding and using activation functions, batch normalization, and Dropout in a deep neural network.
import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os
import tensorflow as tf
from tensorflow import keras
print(tf.__version__)
2.0.0
fashion_mnist = keras.datasets.fashion_mnist
(x_train_all,y_train_all),(x_test,y_test) = fashion_mnist.load_data()
x_valid,x_train = x_train_all[:5000],x_train_all[5000:]
y_valid,y_train = y_train_all[:5000],y_train_all[5000:]
print(x_valid.shape,y_valid.shape)
print(x_train.shape,y_train.shape)
print(x_test.shape,y_test.shape)
(5000, 28, 28) (5000,)
(55000, 28, 28) (55000,)
(10000, 28, 28) (10000,)
# Standardization: x = (x - mu) / std, giving zero mean and unit variance
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(
    x_train.astype(np.float32).reshape(-1, 1)).reshape(-1, 28, 28)
x_valid_scaled = scaler.transform(
    x_valid.astype(np.float32).reshape(-1, 1)).reshape(-1, 28, 28)
x_test_scaled = scaler.transform(
    x_test.astype(np.float32).reshape(-1, 1)).reshape(-1, 28, 28)
print(np.max(x_train_scaled),np.min(x_train_scaled))
2.0231433 -0.8105136
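Since the scaler is fit on the flattened `reshape(-1, 1)` array, it computes a single global mean and standard deviation over all pixels. A minimal NumPy sketch (using a small made-up array in place of the real images) showing that this standardization is just `(x - mu) / std`:

```python
import numpy as np

# Hypothetical stand-in for the pixel data: a small float32 array
x = np.array([[0., 50., 100.], [150., 200., 250.]], dtype=np.float32)

mu = x.mean()   # global mean, as when fitting on the flattened array
std = x.std()   # global standard deviation
x_scaled = (x - mu) / std

print(x_scaled.mean(), x_scaled.std())  # ~0 and ~1
```

Fitting only on the training set and reusing the same `mu`/`std` for validation and test (via `transform`) is what keeps the three splits on the same scale.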
Problems when a deep neural network gets too deep:
- too many parameters, so training is insufficient
- vanishing gradients during gradient descent (the chain rule multiplies many small derivatives of the composed functions)
- batch normalization mitigates the vanishing-gradient problem
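The core of batch normalization is a simple transform: normalize each feature over the batch, then scale and shift. A minimal NumPy sketch of the training-time forward pass (`gamma`, `beta`, and `eps` are the usual learnable scale, learnable shift, and numerical-stability constant; this is an illustration, not keras's actual implementation):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # ~zero mean, unit variance per feature
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
# Hypothetical layer activations with an inconvenient mean and scale
activations = rng.normal(loc=5.0, scale=3.0, size=(256, 100))
normed = batch_norm(activations)
print(normed.mean(), normed.std())  # close to 0 and 1
```

Keeping activations in this well-scaled range is what keeps the downstream derivatives from shrinking layer after layer.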
model = keras.models.Sequential()
# Flatten the 28x28 input image into a 784-dimensional vector
model.add(keras.layers.Flatten(input_shape=[28, 28]))
# A deep network: 20 hidden layers
for _ in range(20):
    # The selu activation is self-normalizing
    model.add(keras.layers.Dense(100, activation="selu"))
    # Batch normalization after every layer
    model.add(keras.layers.BatchNormalization())
    '''
    # Batch normalization can also be placed before the activation:
    model.add(keras.layers.Dense(100))
    model.add(keras.layers.BatchNormalization())
    model.add(keras.layers.Activation("relu"))
    '''
# Dropout: randomly disables some units during training, to prevent overfitting.
# model.add(keras.layers.Dropout(rate=0.5))
# AlphaDropout: 1. keeps the mean and variance unchanged  2. preserves the self-normalizing property
model.add(keras.layers.AlphaDropout(rate=0.5))
model.add(keras.layers.Dense(10, activation="softmax"))
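The self-normalizing behavior of `selu` comes from two fixed constants chosen so that activations drift toward zero mean and unit variance. A NumPy sketch of the activation itself (the constants below are the standard SELU values, matching what keras uses internally):

```python
import numpy as np

# Fixed SELU constants (alpha and scale are chosen so that the
# zero-mean / unit-variance distribution is a fixed point of the layer)
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def selu(x):
    return SELU_SCALE * np.where(x > 0, x, SELU_ALPHA * (np.exp(x) - 1))

rng = np.random.default_rng(42)
z = rng.standard_normal(100_000)   # well-scaled pre-activations
out = selu(z)
print(out.mean(), out.std())       # stays near mean 0, std 1
```

This is why the comment above calls `selu` "self-normalizing": a well-scaled input stays well-scaled after the activation, without an explicit normalization layer.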
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])
# callbacks: TensorBoard, EarlyStopping, ModelCheckpoint
# Create a directory for callback output
logdir = './dnn_callbacks'
if not os.path.exists(logdir):
    os.mkdir(logdir)
output_model_file = os.path.join(logdir, "fashion_mnist_model.h5")
callbacks = [
    # keras.callbacks.TensorBoard(logdir),
    keras.callbacks.ModelCheckpoint(output_model_file, save_best_only=True),
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5, min_delta=1e-3)
]
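The EarlyStopping rule above can be sketched in a few lines: stop once `val_loss` has failed to improve on the best value by at least `min_delta` for `patience` consecutive epochs. A simplified re-implementation for illustration (not keras's actual code):

```python
def early_stop_epoch(val_losses, patience=5, min_delta=1e-3):
    """Return the epoch index at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:  # improved enough: reset the counter
            best = loss
            wait = 0
        else:                        # no real improvement this epoch
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical loss curve: improves at first, then plateaus
losses = [0.65, 0.55, 0.54, 0.52, 0.53, 0.52, 0.525, 0.52, 0.53]
print(early_stop_epoch(losses))  # stops at epoch index 8
```

ModelCheckpoint with `save_best_only=True` complements this: even after the plateau, the saved `.h5` file still holds the weights from the best validation epoch.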
history = model.fit(x_train_scaled, y_train, epochs=10,
                    validation_data=(x_valid_scaled, y_valid),
                    callbacks=callbacks)
Train on 55000 samples, validate on 5000 samples
Epoch 1/10
55000/55000 [==============================] - 23s 424us/sample - loss: 0.4710 - accuracy: 0.8436 - val_loss: 0.6472 - val_accuracy: 0.8574
Epoch 2/10
55000/55000 [==============================] - 23s 421us/sample - loss: 0.4525 - accuracy: 0.8482 - val_loss: 0.5506 - val_accuracy: 0.8576
Epoch 3/10
55000/55000 [==============================] - 24s 434us/sample - loss: 0.4371 - accuracy: 0.8544 - val_loss: 0.5437 - val_accuracy: 0.8822
Epoch 4/10
55000/55000 [==============================] - 22s 403us/sample - loss: 0.4145 - accuracy: 0.8591 - val_loss: 0.6947 - val_accuracy: 0.8454
Epoch 5/10
55000/55000 [==============================] - 23s 421us/sample - loss: 0.4122 - accuracy: 0.8627 - val_loss: 0.5254 - val_accuracy: 0.8842
Epoch 6/10
55000/55000 [==============================] - 24s 445us/sample - loss: 0.3988 - accuracy: 0.8656 - val_loss: 0.5548 - val_accuracy: 0.8626
Epoch 7/10
55000/55000 [==============================] - 21s 386us/sample - loss: 0.3883 - accuracy: 0.8696 - val_loss: 0.5432 - val_accuracy: 0.8808
Epoch 8/10
55000/55000 [==============================] - 22s 406us/sample - loss: 0.3779 - accuracy: 0.8728 - val_loss: 0.5272 - val_accuracy: 0.8884
Epoch 9/10
55000/55000 [==============================] - 21s 388us/sample - loss: 0.3686 - accuracy: 0.8743 - val_loss: 0.5292 - val_accuracy: 0.8832
Epoch 10/10
55000/55000 [==============================] - 23s 421us/sample - loss: 0.3599 - accuracy: 0.8781 - val_loss: 0.5088 - val_accuracy: 0.8792
def plot_learning_curve(history):
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True)
    plt.gca().set_ylim(0, 1)
    plt.show()
# Plot the learning curves directly from the DataFrame built from history.history
plot_learning_curve(history)
y = model.evaluate(x_test_scaled,y_test)
# - 1s 110us/sample - loss: 0.3048 - accuracy: 0.8724
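To turn the softmax output into class predictions, take the argmax over the 10 probabilities per image. A sketch with a hypothetical batch of probability rows (with the real model you would obtain these from `model.predict(x_test_scaled)`):

```python
import numpy as np

# Hypothetical softmax outputs for a batch of 3 images, 10 classes each
probs = np.array([
    [0.05, 0.01, 0.02, 0.02, 0.05, 0.60, 0.05, 0.05, 0.05, 0.10],
    [0.80, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.03, 0.03],
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.91],
])

# Index of the largest probability in each row = predicted class label
predicted_labels = probs.argmax(axis=1)
print(predicted_labels)  # [5 0 9]
```

These integer labels are directly comparable to `y_test`, which is why the model was compiled with `sparse_categorical_crossentropy` (integer labels) rather than the one-hot `categorical_crossentropy`.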