1. Data preprocessing is necessary. Here we take input preprocessing for the simplest case, the MNIST dataset, as an example.
A. Set the random seed
np.random.seed(1337) # for reproducibility
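Seeding fixes NumPy's random stream, so a script re-run produces the same weight initializations and shuffles. A minimal sketch of what reproducibility means here:

```python
import numpy as np

# Two runs seeded with the same value produce identical random streams.
np.random.seed(1337)
first = np.random.rand(5)

np.random.seed(1337)  # re-seed to replay the exact same stream
second = np.random.rand(5)

assert np.array_equal(first, second)
```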
B. Reshape the input data; here each sample becomes a 1-D array of size 784.
X_train = X_train.reshape(60000, 784)
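`mnist.load_data()` returns images of shape `(60000, 28, 28)`, and the MLP wants one flat row per sample. A sketch with a small NumPy stand-in array (a real run would use the Keras-loaded `X_train` instead):

```python
import numpy as np

# Stand-in for the (60000, 28, 28) uint8 array mnist.load_data() returns;
# a small batch is enough to show the reshape.
X_train = np.zeros((100, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a 784-element row vector.
X_train = X_train.reshape(X_train.shape[0], 784)

assert X_train.shape == (100, 784)
```

Using `X_train.shape[0]` instead of a hard-coded 60000 keeps the reshape correct for any batch size.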
Convert the class labels to one-hot encoding; this step is required for multi-class classification.
one_hot_labels = keras.utils.np_utils.to_categorical(labels, num_classes=10)
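What `to_categorical` computes can be written out in plain NumPy, which is useful when Keras is not at hand. A sketch with a small hypothetical label array:

```python
import numpy as np

# One-hot encoding by indexing the identity matrix:
# row i is all zeros except a 1 at column labels[i].
labels = np.array([0, 2, 9, 3])
num_classes = 10

one_hot_labels = np.eye(num_classes, dtype='float32')[labels]

assert one_hot_labels.shape == (4, 10)
assert one_hot_labels[1, 2] == 1.0      # sample 1 has class 2
assert one_hot_labels[1].sum() == 1.0   # exactly one hot entry per row
```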
The training and test sets may also need to be shuffled.
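If shuffling is needed, samples and labels must be permuted together so each image stays paired with its label. A minimal sketch using a shared permutation (toy arrays stand in for the real data):

```python
import numpy as np

# Toy data: row k of X is [2k, 2k+1], and y[k] = k.
X = np.arange(10).reshape(5, 2)
y = np.array([0, 1, 2, 3, 4])

# One permutation applied to both arrays keeps the pairing intact.
perm = np.random.permutation(len(X))
X_shuf, y_shuf = X[perm], y[perm]

# Pairing preserved: row i of X_shuf still matches y_shuf[i].
assert all(X_shuf[i, 0] == 2 * y_shuf[i] for i in range(5))
```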
C. Convert the input data type and normalize the values
X_train = X_train.astype('float32')
X_train /= 255
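The two lines above map raw uint8 pixel intensities in [0, 255] to float32 values in [0.0, 1.0]; the cast must come first, since in-place division on a uint8 array would fail. A small sketch with a toy pixel array:

```python
import numpy as np

# Toy stand-in for raw MNIST pixels: uint8 values in [0, 255].
X_train = np.array([[0, 127, 255]], dtype=np.uint8)

X_train = X_train.astype('float32')  # cast first, so division is float math
X_train /= 255                       # scale into [0.0, 1.0]

assert X_train.dtype == np.float32
assert X_train.min() == 0.0 and X_train.max() == 1.0
```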
The complete MLP code for the MNIST dataset is as follows:
'''Trains a simple deep NN on the MNIST dataset.
Gets to 98.40% test accuracy after 20 epochs
(there is *a lot* of margin for parameter tuning).
2 seconds per epoch on a K520 GPU.
'''
from __future__ import print_function
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD, Adam, RMSprop
from keras.utils import np_utils