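Hi everyone, I'm a young enthusiast who enjoys studying algorithms, machine learning, and computational biology. My CSDN blog is 一骑代码走天涯.
If you like my notes, please follow, like, and bookmark. If anything here is wrong or could be improved, let me know in the comments. 😄
A while ago I was working on a project and got stuck on one problem for a long time:
I had a file directory (say training_path/) containing JPEG photos used as the training set. Each photo belongs to one of several labels, and I wanted to feed them, in a one-hot-encoded form, into a Keras model with a softmax output for training. How should that be done?
My approach: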
# Load the label tables with pandas
import pandas as pd
df_train = pd.read_csv("../train.csv")
df_dev = pd.read_csv("../dev.csv")
df_train.head()
img_id | Label |
---|---|
a.jpg | 0 |
b.jpg | 2 |
c.jpg | 1 |
d.jpg | 3 |
e.jpg | 4 |
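To use keras.preprocessing.image.ImageDataGenerator for image preprocessing and classification with class_mode = "categorical", the values in the "Label" column must be strings (str).
# Convert the label column to strings in both tables
df_train["Label_str"] = df_train["Label"].astype("str")
df_dev["Label_str"] = df_dev["Label"].astype("str")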
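To make the "one-hot" part concrete, here is a minimal pure-Python sketch of the label-to-vector mapping. It assumes, as far as I understand the behaviour of flow_from_dataframe, that the distinct label strings are sorted and numbered in that order; the helper names build_class_indices and one_hot are my own and not part of Keras. Checking train_generator.class_indices on the real generator is the reliable way to confirm the mapping.

```python
# Label strings as they would look after astype("str"), taken from the table above.
labels = ["0", "2", "1", "3", "4"]


def build_class_indices(labels):
    """Number the distinct label strings in sorted (lexicographic) order."""
    classes = sorted(set(labels))
    return {name: index for index, name in enumerate(classes)}


def one_hot(label, class_indices):
    """Return a vector with 1.0 at the label's index and 0.0 elsewhere."""
    vector = [0.0] * len(class_indices)
    vector[class_indices[label]] = 1.0
    return vector


class_indices = build_class_indices(labels)
print(class_indices)                # {'0': 0, '1': 1, '2': 2, '3': 3, '4': 4}
print(one_hot("2", class_indices))  # [0.0, 0.0, 1.0, 0.0, 0.0]

# Caveat: strings sort lexicographically, so "10" comes before "2".
print(list(build_class_indices(["2", "10", "1"])))  # ['1', '10', '2']
```

With single-digit labels the string order matches the numeric order, so the five classes here land at indices 0 to 4. With ten or more classes the order can differ, which matters when mapping predictions back to labels.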
Then set up the ImageDataGenerator directly:
from keras.preprocessing.image import ImageDataGenerator

batch_size = 32
# Scale pixel values from [0, 255] to [0, 1]
train_datagen = ImageDataGenerator(rescale=1./255)
dev_datagen = ImageDataGenerator(rescale=1./255)
# training image data generator (training_path is the directory holding the training images)
train_generator = train_datagen.flow_from_dataframe(dataframe = df_train,
                                                    directory = training_path,
                                                    x_col = "img_id",
                                                    y_col = "Label_str",
                                                    target_size = (256, 256),
                                                    batch_size = batch_size,
                                                    class_mode = "categorical")
# similar generator for dev data (dev_path is the directory holding the dev images)
validation_generator = dev_datagen.flow_from_dataframe(dataframe = df_dev,
                                                       directory = dev_path,
                                                       x_col = "img_id",
                                                       y_col = "Label_str",
                                                       target_size = (256, 256),
                                                       batch_size = batch_size,
                                                       class_mode = "categorical")
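Build the model:
from keras.models import Sequential
from keras.layers import Input, Conv2D, Activation, MaxPooling2D, Flatten, Dense

# The generators yield RGB images by default, so the input needs a channel axis: (256, 256, 3)
Model = Sequential([Input((256, 256, 3)),
                    Conv2D(32, (3, 3), strides = (1, 1), name = 'conv'),
                    Activation('relu'),
                    MaxPooling2D((3, 3), name='max_pool'),
                    Flatten(),
                    # 5 output units, one per label (0 to 4)
                    Dense(5, activation='softmax', name='fc')])
Model.compile(loss='categorical_crossentropy',
              optimizer="adam",
              metrics = ["categorical_accuracy"])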
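As a sanity check on the architecture, the layer output sizes and parameter counts can be worked out by hand. The sketch below assumes the generators yield RGB images, so the input is (256, 256, 3), and relies on the Keras defaults of 'valid' padding and a pooling stride equal to the pool size. The helper name output_side is hypothetical.

```python
def output_side(size, kernel, stride):
    """Side length after a 'valid' (no padding) convolution or pooling."""
    return (size - kernel) // stride + 1


side, channels = 256, 3   # assumption: 256 x 256 RGB input
filters, num_classes = 32, 5

conv_side = output_side(side, kernel=3, stride=1)       # Conv2D 3x3, stride 1
pool_side = output_side(conv_side, kernel=3, stride=3)  # MaxPooling2D 3x3, stride 3
flat_units = pool_side * pool_side * filters            # length after Flatten

conv_params = 3 * 3 * channels * filters + filters      # weights + biases
dense_params = flat_units * num_classes + num_classes   # weights + biases

print(conv_side, pool_side, flat_units)  # 254 84 225792
print(conv_params, dense_params)         # 896 1128965
```

Nearly all of the trainable parameters sit in the final Dense layer, because Flatten is applied to a fairly large 84 x 84 x 32 feature map. Comparing these numbers against Model.summary() is a quick way to confirm the input shape is what you intended.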
Train the model:
EPOCHS = 10
# Number of full batches per epoch
STEPS_EPOCH_TRAIN = len(df_train) // batch_size
STEPS_EPOCH_DEV = len(df_dev) // batch_size
history = Model.fit(train_generator,
                    steps_per_epoch = STEPS_EPOCH_TRAIN,
                    epochs = EPOCHS,
                    validation_data = validation_generator,
                    validation_steps = STEPS_EPOCH_DEV)
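One detail about the step counts: floor division (//) counts only full batches, so when the number of samples is not a multiple of batch_size, the final partial batch is left out of each pass. The sketch below compares floor and ceiling division. The sample count of 1000 is a made-up figure for illustration, not the size of the dataset in this post.

```python
import math

batch_size = 32
n_samples = 1000  # hypothetical dataset size, for illustration only

full_batches = n_samples // batch_size           # full batches only
all_batches = math.ceil(n_samples / batch_size)  # includes the last partial batch
left_out = n_samples - full_batches * batch_size # samples not covered by full batches

print(full_batches, all_batches, left_out)  # 31 32 8
```

Whether this matters depends on the use case. For training it is usually harmless, while for validation using the ceiling makes sure every dev sample is evaluated.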