Code:
```python
import numpy
import os

from keras import applications
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
from keras.models import Sequential, Model
from keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D
from keras import backend as k
from keras.callbacks import ModelCheckpoint, LearningRateScheduler, TensorBoard, EarlyStopping

# Count the training and validation images by walking each class subfolder.
files_train = 0
files_validation = 0

cwd = os.getcwd()
folder = 'train_data/train'
for sub_folder in os.listdir(folder):
    path, dirs, files = next(os.walk(os.path.join(folder, sub_folder)))
    files_train += len(files)

folder = 'train_data/test'
for sub_folder in os.listdir(folder):
    path, dirs, files = next(os.walk(os.path.join(folder, sub_folder)))
    files_validation += len(files)

print(files_train, files_validation)

img_width, img_height = 48, 48
train_data_dir = "train_data/train"
validation_data_dir = "train_data/test"
nb_train_samples = files_train
nb_validation_samples = files_validation
batch_size = 32
epochs = 15
num_classes = 2

# Load the VGG16 convolutional base with ImageNet weights, without the
# fully connected classifier head.
model = applications.VGG16(weights='imagenet', include_top=False,
                           input_shape=(img_width, img_height, 3))

# Freeze the first 10 layers so only the later layers are fine-tuned.
for layer in model.layers[:10]:
    layer.trainable = False

# Attach a new softmax classification head.
x = model.output
x = Flatten()(x)
predictions = Dense(num_classes, activation="softmax")(x)

model_final = Model(input=model.input, output=predictions)

model_final.compile(loss="categorical_crossentropy",
                    optimizer=optimizers.SGD(lr=0.0001, momentum=0.9),
                    metrics=["accuracy"])

train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    horizontal_flip=True,
    fill_mode="nearest",
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rotation_range=5)

test_datagen = ImageDataGenerator(
    rescale=1. / 255,
    horizontal_flip=True,
    fill_mode="nearest",
    zoom_range=0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    rotation_range=5)

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical")

validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    class_mode="categorical")

# Save the best model by validation accuracy; stop early if it stalls.
checkpoint = ModelCheckpoint("car1.h5", monitor='val_acc', verbose=1,
                             save_best_only=True, save_weights_only=False,
                             mode='auto', period=1)
early = EarlyStopping(monitor='val_acc', min_delta=0, patience=10,
                      verbose=1, mode='auto')

# Note: fit_generator, samples_per_epoch and nb_val_samples are legacy
# (Keras 1) names; current Keras uses fit / steps_per_epoch /
# validation_steps instead.
history_object = model_final.fit_generator(
    train_generator,
    samples_per_epoch=nb_train_samples,
    epochs=epochs,
    validation_data=validation_generator,
    nb_val_samples=nb_validation_samples,
    callbacks=[checkpoint, early])
```
Code walkthrough:
This code trains an image classifier by fine-tuning a pretrained VGG16 model. Step by step:
1. Import the required libraries
- `import numpy`: imports NumPy for numerical computation.
- `import os`: imports os for working with file paths.
- `from keras import applications`: imports the applications module, used to load the pretrained VGG16 model.
- `from keras.preprocessing.image import ImageDataGenerator`: imports ImageDataGenerator, used to generate augmented image data.
- `from keras import optimizers`: imports the optimizers module, used to configure the model's optimizer.
- `from keras.models import Sequential, Model`: imports the Sequential and Model classes, used to build models.
- `from keras.layers import Dropout, Flatten, Dense, GlobalAveragePooling2D`: imports the Dropout, Flatten, Dense, and GlobalAveragePooling2D layer classes, used to build the model's layers.
- `from keras import backend as k`: imports the backend module under the alias k, which exposes operations of the Keras backend engine.
- `from keras.callbacks import ModelCheckpoint, LearningRateScheduler, TensorBoard, EarlyStopping`: imports the ModelCheckpoint, LearningRateScheduler, TensorBoard, and EarlyStopping callbacks, used to monitor and control training.
2. Initialize variables
- `files_train = 0`: initializes the training-sample count to 0.
- `files_validation = 0`: initializes the validation-sample count to 0.
- `cwd = os.getcwd()`: gets the current working directory.
- `folder = 'train_data/train'`: path of the training-data folder. The loop that follows visits each class subfolder with `os.walk` and adds its file count to `files_train`; the same is then done for `'train_data/test'` and `files_validation`.
- `img_width, img_height = 48, 48`: sets the image width and height to 48 pixels.
- `train_data_dir = "train_data/train"`: path of the training-data folder.
- `validation_data_dir = "train_data/test"`: path of the validation-data folder.
- `nb_train_samples = files_train`: sets the number of training samples to files_train.
- `nb_validation_samples = files_validation`: sets the number of validation samples to files_validation.
- `batch_size = 32`: sets the batch size to 32.
- `epochs = 15`: sets the number of training epochs to 15.
- `num_classes = 2`: sets the number of classes to 2.
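The file-counting loop can be exercised on a throwaway directory tree; the class names `occupied` and `empty` below are made up for illustration and are not taken from the original script:

```python
import os
import tempfile

def count_files_per_split(folder):
    """Count files in every class subfolder of `folder`, as the script does."""
    total = 0
    for sub_folder in os.listdir(folder):
        path, dirs, files = next(os.walk(os.path.join(folder, sub_folder)))
        total += len(files)
    return total

# Build a temporary train_data-like tree: 3 + 2 files in two class folders.
root = tempfile.mkdtemp()
for cls, n in [("occupied", 3), ("empty", 2)]:
    cls_dir = os.path.join(root, cls)
    os.makedirs(cls_dir)
    for i in range(n):
        open(os.path.join(cls_dir, "img_%d.jpg" % i), "w").close()

print(count_files_per_split(root))  # 5
```

Note that `next(os.walk(...))` only looks at the top level of each class folder, so images nested deeper would not be counted.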
3. Load the pretrained VGG16 model
- `model = applications.VGG16(weights='imagenet', include_top=False, input_shape = (img_width, img_height, 3))`: loads VGG16 with pretrained ImageNet weights; include_top=False drops the fully connected classifier on top, and input_shape sets the model's input dimensions.
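It is worth noting what the base outputs at this input size. VGG16 contains five 2x2 max-pooling stages, each halving the spatial size (with floor division), so the 48x48 input shrinks as follows:

```python
# Spatial size after each of VGG16's five max-pooling stages.
size = 48
for _ in range(5):
    size = size // 2   # 48 -> 24 -> 12 -> 6 -> 3 -> 1
print(size)  # 1: the frozen base emits 1x1x512 feature maps
```

So after `Flatten()` the classifier head receives a 512-dimensional vector.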
4. Freeze the weights of some layers
- `for layer in model.layers[:10]:`: iterates over the first 10 layers of the model.
- `layer.trainable = False`: marks each of these layers as non-trainable so their weights stay fixed during fine-tuning.
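The effect of the slice can be illustrated without Keras; the `Layer` class below is a stand-in, not the Keras one, and the count of 19 layers (input layer plus 13 convolutions plus 5 pooling layers) matches VGG16's convolutional base:

```python
# Toy stand-in for model.layers to show what the freezing loop does.
class Layer:
    def __init__(self, name):
        self.name = name
        self.trainable = True

layers = [Layer("layer_%d" % i) for i in range(19)]

for layer in layers[:10]:      # same slice as in the script
    layer.trainable = False

frozen = sum(1 for l in layers if not l.trainable)
print(frozen)  # 10
```

Everything from layer index 10 onward (the deeper convolutional blocks) remains trainable and is fine-tuned on the new dataset.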
5. Add a new fully connected layer
- `x = model.output`: takes the output of the convolutional base.
- `x = Flatten()(x)`: flattens that output into a vector.
- `predictions = Dense(num_classes, activation="softmax")(x)`: adds a fully connected layer on top of the flattened features, with num_classes outputs and a softmax activation.
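The softmax activation turns the two raw class scores into probabilities; written out with the standard library:

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0])   # example raw scores for the 2 classes
print(probs)                  # larger score -> larger probability
print(sum(probs))             # sums to 1 (up to float rounding)
```

This is why `categorical_crossentropy` is the matching loss below: it compares these probabilities against one-hot labels.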
6. Build the final model
- `model_final = Model(input = model.input, output = predictions)`: uses the Keras Model class to connect the input and the new output into the final model (`input`/`output` are the legacy keyword names; newer Keras versions use `inputs`/`outputs`).
7. Compile the model
- `model_final.compile(loss = "categorical_crossentropy", optimizer = optimizers.SGD(lr=0.0001, momentum=0.9), metrics=["accuracy"])`: compiles the model with categorical cross-entropy loss, stochastic gradient descent (SGD) with a learning rate of 0.0001 and momentum of 0.9, and accuracy as the evaluation metric.
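The SGD-with-momentum update can be sketched in plain Python on a one-dimensional quadratic loss f(w) = (w - 3)^2; the loss function and iteration count are chosen here for illustration, only the hyperparameters come from the script:

```python
lr, momentum = 0.0001, 0.9     # same hyperparameters as in compile()
w, velocity = 0.0, 0.0

def grad(w):
    return 2.0 * (w - 3.0)     # derivative of (w - 3)^2

for _ in range(5000):
    # Momentum accumulates an exponentially decaying average of gradients.
    velocity = momentum * velocity - lr * grad(w)
    w = w + velocity

print(w)  # converges toward the minimum at 3.0
```

With momentum 0.9 the effective step size is roughly lr / (1 - momentum), which is why such a small learning rate still makes steady progress.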
8. Create the image data augmenters
- `train_datagen`: the augmenter for the training set; it rescales pixel values by 1/255 and applies random horizontal flips, zooms, width/height shifts, and small rotations, filling newly exposed pixels with nearest-neighbor values.
- `test_datagen`: the augmenter for the validation set, configured identically to the training one. (Applying random augmentation to validation data is unusual; normally only the 1/255 rescaling is kept there.)
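Two of these transforms, the 1/255 rescaling and the horizontal flip, are simple enough to apply by hand to a tiny made-up 2x3 "image" of pixel rows:

```python
# A toy grayscale image: 2 rows of 3 pixel values in [0, 255].
image = [
    [0, 128, 255],
    [64, 32, 16],
]

# rescale = 1./255 maps pixel values into [0, 1].
rescaled = [[px / 255.0 for px in row] for row in image]

# horizontal_flip mirrors each row left-to-right.
flipped = [row[::-1] for row in image]
print(flipped[0])  # [255, 128, 0]
```

The generator applies such transforms randomly per batch, so the model sees a slightly different variant of each image every epoch.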
9. Build the training and validation data generators
- `train_generator = train_datagen.flow_from_directory(train_data_dir, target_size = (img_height, img_width), batch_size = batch_size, class_mode = "categorical")`: creates the training generator, which loads images from the training folder, resizes them to the target size, batches them, and emits one-hot ("categorical") labels.
- `validation_generator = test_datagen.flow_from_directory(validation_data_dir, target_size = (img_height, img_width), class_mode = "categorical")`: creates the validation generator from the validation folder with the same target size and class mode (batch_size is not passed, so the default of 32 is used).
10. Set up the training callbacks
- `checkpoint`: a ModelCheckpoint that saves the model to car1.h5 whenever validation accuracy improves, keeping only the best-performing weights.
- `early`: an EarlyStopping callback that halts training once validation accuracy has stopped improving for 10 consecutive epochs.
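The stopping rule can be paraphrased in a few lines of plain Python (this is a simplified sketch of the patience logic, not the actual Keras implementation):

```python
def epochs_run(val_acc_history, patience=10, min_delta=0.0):
    """Return the epoch at which patience-based early stopping triggers."""
    best, wait = float("-inf"), 0
    for epoch, acc in enumerate(val_acc_history, start=1):
        if acc > best + min_delta:
            best, wait = acc, 0        # improvement: reset the counter
        else:
            wait += 1                  # no improvement this epoch
            if wait >= patience:
                return epoch           # training stops here
    return len(val_acc_history)        # ran to completion

# Accuracy stalls at epochs 4-5; with patience=2 training stops at epoch 5.
history = [0.60, 0.70, 0.80, 0.80, 0.79, 0.81, 0.82]
print(epochs_run(history, patience=2))  # 5
```

With the script's patience of 10 and only 15 epochs, early stopping can only trigger near the very end of training.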
11. Train the model
- `history_object = model_final.fit_generator( train_generator, samples_per_epoch = nb_train_samples, epochs = epochs, validation_data = validation_generator, nb_val_samples = nb_validation_samples, callbacks = [checkpoint, early])`: trains the model with the training and validation generators and the two callbacks, storing the training history in history_object. (`fit_generator`, `samples_per_epoch`, and `nb_val_samples` are legacy Keras 1 names; current Keras uses `fit` with `steps_per_epoch` and `validation_steps`.)
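When porting this script to current Keras, the per-epoch arguments become batch counts rather than sample counts; the usual conversion from the sample totals used above is a ceiling division (the totals below are example values, not taken from the dataset):

```python
import math

batch_size = 32
nb_train_samples, nb_validation_samples = 1000, 200   # example totals

steps_per_epoch = math.ceil(nb_train_samples / batch_size)
validation_steps = math.ceil(nb_validation_samples / batch_size)
print(steps_per_epoch, validation_steps)  # 32 7
```

These values would then be passed to `model_final.fit(...)` in place of `samples_per_epoch` and `nb_val_samples`.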