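Keras's ImageDataGenerator wraps a lot of functionality. One of its methods, flow_from_directory, treats each subfolder of a directory as a class label, so you can train a classifier straight from a folder tree. When the labels live in a list file rather than in the folder structure, flow_from_dataframe does the same job from a pandas DataFrame, which is what the code below uses.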
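To make the "subfolder as label" idea concrete, here is a minimal sketch of how subfolder names can be turned into class indices. It is not Keras's own code: the function name infer_class_indices and the folder names are made up for illustration, and the only behaviour it relies on is the documented one that class order is alphanumeric by subfolder name.

```python
import os
import tempfile

def infer_class_indices(directory):
    """Map each immediate subfolder name to an integer index.

    Illustrative stand-in, not Keras code: it mirrors the documented
    behaviour that class order is alphanumeric by subfolder name.
    """
    classes = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(classes)}

# Build a throwaway folder tree with made-up class names to try it out.
with tempfile.TemporaryDirectory() as root:
    for name in ["forest", "beach", "city"]:
        os.makedirs(os.path.join(root, name))
    demo_indices = infer_class_indices(root)

print(demo_indices)  # {'beach': 0, 'city': 1, 'forest': 2}
```

The index of a class therefore depends on how its name sorts, not on the order in which the folders were created.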
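import os
import pandas as pd
from keras.preprocessing.image import ImageDataGenerator
rootDir='/home/eric/data/scene'
train_path='group0_train.txt'
valid_path='group0_valid.txt'
trainfilepath=os.path.join(rootDir,train_path)
# Training list: header row with 'image' and 'label' columns (add sep='\t' if the file is tab-separated).
trainDf=pd.read_csv(trainfilepath)
trainDf.rename(columns={'image':"filename",'label':'class'},inplace=True)
validfilepath=os.path.join(rootDir,valid_path)
# Validation list: no header row, so the columns are numbered 0 and 1.
validDf=pd.read_csv(validfilepath,header=None)
validDf.rename(columns={0:"filename",1:'class'},inplace=True)
# class_mode='categorical' expects string labels, so cast in case they were read as integers.
trainDf['class']=trainDf['class'].astype(str)
validDf['class']=validDf['class'].astype(str)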
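# Training generator: rescale pixels to [0, 1] and apply random augmentation.
datagen1 = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    rotation_range=90,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)
# Validation generator: rescale only, no augmentation.
datagen2 = ImageDataGenerator(rescale=1. / 255)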
# train_generator = datagen1.flow_from_directory(
# ptrain,
# target_size=(size, size),
# batch_size=batch,
# class_mode='categorical')
# size (image side length) and batch (batch size) are assumed to be defined earlier.
train_generator = datagen1.flow_from_dataframe(
    dataframe=trainDf,
    directory=rootDir,
    x_col="filename",
    y_col="class",
    # classes=labels,
    target_size=(size, size),
    batch_size=batch,
    class_mode='categorical')
validation_generator = datagen2.flow_from_dataframe(
    dataframe=validDf,
    directory=rootDir,
    x_col="filename",
    y_col="class",
    # classes=labels,
    target_size=(size, size),
    batch_size=batch,
    class_mode='categorical')
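The code above has one problem: you don't know how Keras maps each label to an index, so at prediction time you can't tell which label a predicted index corresponds to. The mapping can in fact be printed: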
print(train_generator.class_indices)
This prints the mapping between class labels and indices as a dictionary.
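Once you have that dictionary, going from a prediction back to a label only requires inverting it. The following is a minimal sketch in plain Python. The helper names (invert_class_indices, decode_predictions), the example mapping and the example probabilities are all hypothetical; in practice the mapping would come from train_generator.class_indices and the probabilities from model.predict.

```python
import json

def invert_class_indices(class_indices):
    """Turn {label: index} into {index: label}."""
    return {index: label for label, index in class_indices.items()}

def decode_predictions(probabilities, class_indices):
    """Return the highest-scoring label for each row of probabilities."""
    index_to_label = invert_class_indices(class_indices)
    labels = []
    for row in probabilities:
        # Position of the largest score in this row (a plain argmax).
        best = max(range(len(row)), key=lambda i: row[i])
        labels.append(index_to_label[best])
    return labels

# Hypothetical mapping, shaped like train_generator.class_indices.
# String labels sort lexicographically, so '10' comes before '2'.
example_indices = {"0": 0, "1": 1, "10": 2, "2": 3}
# Hypothetical model output for two images (one row per image).
example_probs = [[0.1, 0.1, 0.7, 0.1], [0.05, 0.05, 0.1, 0.8]]
decoded = decode_predictions(example_probs, example_indices)
print(decoded)  # ['10', '2']

# The mapping survives a JSON round trip, so it can be stored next to
# the model and reloaded at prediction time.
restored = json.loads(json.dumps(example_indices))
```

The example also shows why the mapping is worth checking: with numeric labels stored as strings, index 2 corresponds to label '10', not label '2'.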
References
[1]. Keras flow_from_directory class index. https://stackoverflow.com/questions/43813393/keras-flow-from-directory-class-index