return imageList
print(“开始加载数据”)
imageArr = loadImageData()
print(type(imageArr))
labelList = np.array(labelList)
print(“加载数据完成”)
print(labelList)
labelList = np_utils.to_categorical(labelList, classnum)
print(labelList)
做好数据之后,我们需要切分训练集和测试集,一般按照4:1或者7:3的比例来切分。切分数据集使用train_test_split()方法,需要导入from sklearn.model_selection import train_test_split 包。例:
trainX, valX, trainY, valY = train_test_split(imageArr, labelList, test_size=0.2, random_state=42)
ImageDataGenerator()是keras.preprocessing.image模块中的图片生成器,同时也可以在batch中对数据进行增强,扩充数据集大小,增强模型的泛化能力。比如进行旋转,变形,归一化等等。
keras.preprocessing.image.ImageDataGenerator(featurewise_center=False,samplewise_center
=False, featurewise_std_normalization=False, samplewise_std_normalization=False,zca_whitening=False,
zca_epsilon=1e-06, rotation_range=0.0, width_shift_range=0.0, height_shift_range=0.0,brightness_range=None, shear_range=0.0, zoom_range=0.0,channel_shift_range=0.0, fill_mode=‘nearest’, cval=0.0, horizontal_flip=False, vertical_flip=False, rescale=None, preprocessing_function=None,data_format=None,validation_split=0.0)
参数:
- featurewise_center: Boolean. 对输入的图片每个通道减去每个通道对应均值。
- samplewise_center: Boolan. 每张图片减去样本均值, 使得每个样本均值为0。
- featurewise_std_normalization(): Boolean()
- samplewise_std_normalization(): Boolean()
- zca_epsilon(): Default 12-6
- zca_whitening: Boolean. 去除样本之间的相关性
- rotation_range(): 旋转范围
- width_shift_range(): 水平平移范围
- height_shift_range(): 垂直平移范围
- shear_range(): float, 透视变换的范围
- zoom_range(): 缩放范围
- fill_mode: 填充模式, constant, nearest, reflect
- cval: fill_mode == 'constant’的时候填充值
- horizontal_flip(): 水平反转
- vertical_flip(): 垂直翻转
- preprocessing_function(): user提供的处理函数
- data_format(): channels_first或者channels_last
- validation_split(): 多