Cats vs. Dogs [VGG16 Implementation]

健达qi趣蛋

Last modified: 2024-01-15 17:43:04

Tags: python

First published: 2024-01-15 17:31:53

Article link: https://blog.csdn.net/weixin_62094186/article/details/135599555

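1. Task background

Cats vs. Dogs is a classic machine learning problem: classify images of cats and dogs with image recognition. Given a training set of cat and dog images, we want to train a model that decides whether an input image shows a cat or a dog, and then use oneAPI to accelerate it.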

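2. Task analysis and preparation

2.1 Dataset

The Cats vs. Dogs dataset usually contains a large number of cat and dog images, each with a label indicating whether it shows a cat or a dog.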

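2.2 Model selection

Cats vs. Dogs can be solved with many image classification algorithms, the most common being convolutional neural networks (CNNs). For this task we use a classic CNN model: VGG16.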

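2.3 Accelerating with oneAPI

oneAPI provides a common set of APIs and libraries that let developers write and optimize code in a unified way for multiple kinds of hardware.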
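As one hedged illustration of what oneAPI-based acceleration can look like for TensorFlow on Intel CPUs: TensorFlow builds from 2.5 onward read the `TF_ENABLE_ONEDNN_OPTS` environment variable to toggle kernels optimized with oneDNN, a library in the oneAPI toolkit. The sketch below only sets that variable. Whether it changes anything depends on the installed TensorFlow build, and it is not necessarily the mechanism this post relies on. The helper `onednn_requested` is a hypothetical name used for illustration.

```python
import os

# Must be set before TensorFlow is imported for the setting to take effect.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"


def onednn_requested() -> bool:
    """Return True if oneDNN optimizations were requested via the environment."""
    return os.environ.get("TF_ENABLE_ONEDNN_OPTS") == "1"


print("oneDNN optimizations requested:", onednn_requested())
```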

3. Getting started

3.1 Installing and importing packages

import os
import time

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as img
# Note: `import tensorflow.compat.v1 as tf` is not used here, because it would
# shadow the TF2 `tf` name that the rest of the code relies on.
import tensorflow as tf
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras.layers import Flatten, Dense, Dropout
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

3.2 Loading the dataset

# Relative paths of the training and test folders
train_path = './data/train'
test_path = './data/test'

# List the contents of the train and test datasets
train_lists = os.listdir(train_path)
test_lists = os.listdir(test_path)

# Print the number of files in each set
print('Number of training images: {}'.format(len(train_lists)))
print('Number of test images: {}'.format(len(test_lists)))

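3.3 Converting the train dataset into a DataFrame for later steps

A DataFrame is a two-dimensional data structure, similar to a table or spreadsheet. It is one of the most important data structures in Pandas and is used to process and analyze structured data. Common operations include reading and writing data, filtering and selecting, aggregation and statistics, and merging and joining.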

# Lists for image file paths and labels
images = []
labels = []
for train_item in train_lists:
    # The part of the file name before the first '.' is used as the label
    label = train_item.split('.')[0]
    labels.append(label)

    # Full path of the image file
    image = os.path.join(train_path, train_item)
    images.append(image)

# Build a DataFrame that holds the image paths and labels
train_frame = pd.DataFrame()
train_frame['train_image_path'] = images
train_frame['label'] = labels
train_frame.head()
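The label extraction above can be checked on a handful of names before running it on the full folder. This is a minimal sketch using only the standard library. The file names in `example_files` are hypothetical and simply follow the `<class>.<number>.jpg` pattern that the `split('.')[0]` call assumes.

```python
from collections import Counter


def label_from_filename(filename: str) -> str:
    """Return the text before the first '.', e.g. 'cat.0.jpg' -> 'cat'."""
    return filename.split('.')[0]


# Hypothetical file names, used only for illustration.
example_files = ['cat.0.jpg', 'cat.1.jpg', 'dog.0.jpg', 'dog.1.jpg', 'dog.2.jpg']

# Count how many files fall into each class.
label_counts = Counter(label_from_filename(name) for name in example_files)
print(label_counts)
```

A count like this is a quick way to spot unexpected prefixes (for example hidden files) before they become labels.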

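3.4 Likewise, converting the test dataset into a DataFrame

# Lists for image file paths and image ids
images = []
idxs = []
for test_item in test_lists:
    # The part of the file name before the first '.' is the image id
    idx = test_item.split('.')[0]
    idxs.append(idx)
    image = os.path.join(test_path, test_item)
    images.append(image)

# Build a DataFrame that holds the image paths (the test set has no label column)
test_frame = pd.DataFrame()
test_frame['test_image_path'] = images

test_frame.head()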

[Note] Printing the DataFrame object shows its internal structure.

[Figure: screenshot of the printed DataFrame]

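3.5 Splitting the train dataset into a training set and a validation set

from sklearn.model_selection import train_test_split

# Stratified split of the labeled data into training and validation sets
# (with no test_size given, scikit-learn holds out 25% by default)
train_list, verification_list = train_test_split(train_frame, random_state=42, stratify=train_frame['label'])

print("Training set size: {}".format(len(train_list)))
print("Validation set size: {}".format(len(verification_list)))

# Plot label histograms
train_list['label'].hist(color='yellow')  # the classes are balanced
verification_list['label'].hist(color='pink')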
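To make the effect of `stratify` concrete, here is a small standard-library sketch of a stratified split. It is an illustration of the idea only, not scikit-learn's implementation, and the function name `stratified_split` and the toy labels are assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=42):
    """Split indices so that each class gives about test_fraction of its items to the held-out part."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, held_out_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class
        n_held_out = round(len(indices) * test_fraction)
        held_out_idx.extend(indices[:n_held_out])
        train_idx.extend(indices[n_held_out:])
    return sorted(train_idx), sorted(held_out_idx)


# Toy labels: 8 cats followed by 8 dogs.
labels = ['cat'] * 8 + ['dog'] * 8
train_idx, val_idx = stratified_split(labels)
val_cats = sum(1 for i in val_idx if labels[i] == 'cat')
print(len(train_idx), len(val_idx), val_cats)
```

Because each class is split separately, the validation part keeps the same cat-to-dog ratio as the full set, which is what the two histograms above are meant to confirm.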

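3.6 Data augmentation [with ImageDataGenerator]

ImageDataGenerator is a Keras utility for image data augmentation. You define a set of augmentation operations, such as rotation, zoom, shift, shear, and flip, and it applies them to the original images to produce augmented ones.

Data augmentation generates new images during training by applying random transformations to the originals. It aims to enlarge the training set, increase sample diversity, and improve the model's ability to generalize.

A generator is created for each of the train, test, and verification datasets. Only the training generator applies random augmentation; the test and verification generators just rescale pixel values. The code is as follows:

(1) train: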

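from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Create an ImageDataGenerator for data augmentation
train_gen = ImageDataGenerator(
    zoom_range=0.1,  # range for random zoom
    rotation_range=10,  # range (in degrees) for random rotation
    rescale=1./255,  # scale pixel values into [0, 1]
    shear_range=0.1,  # intensity of random shear
    horizontal_flip=True,  # randomly flip images horizontally
    width_shift_range=0.1,  # random horizontal shift, as a fraction of the width
    height_shift_range=0.1  # random vertical shift, as a fraction of the height
)

# Read the training data from the DataFrame and augment it with train_gen
train_generator = train_gen.flow_from_dataframe(
    dataframe=train_list,  # training DataFrame
    x_col='train_image_path',  # column with the image file paths
    y_col='label',  # column with the labels
    target_size=(200, 200),  # image size
    class_mode='binary',  # binary classification
    batch_size=128,  # number of images per batch
    shuffle=False  # keep the DataFrame order (train_test_split already shuffled the rows)
)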
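To show what two of these options do to pixel data, the following sketch applies a rescale and a horizontal flip to a tiny grayscale "image" stored as nested lists. It is a plain-Python illustration of the operations, not how Keras implements them, and the function names are chosen for this example.

```python
def rescale(image, factor=1.0 / 255):
    """Multiply every pixel by factor, mapping the 0..255 range to 0..1."""
    return [[pixel * factor for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in image]


# A 2x3 single-channel image with made-up pixel values.
tiny_image = [[0, 128, 255],
              [255, 128, 0]]

flipped = horizontal_flip(tiny_image)
scaled = rescale(flipped)
print(flipped)
print(scaled[0])
```

In the generator above the flip is applied at random per image, while the rescale is applied to every image.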

(2) test:

# Create an ImageDataGenerator that only preprocesses the test set
test_datagen = ImageDataGenerator(
    rescale=1./255  # scale pixel values into [0, 1]
)

# Read the test data from the DataFrame and preprocess it with test_datagen
test_generator = test_datagen.flow_from_dataframe(
    dataframe=test_frame,  # test DataFrame
    x_col='test_image_path',  # column with the image file paths
    y_col=None,  # no label column
    target_size=(200, 200),  # image size
    class_mode=None,  # no labels, so the generator yields images only
    batch_size=128,  # number of images per batch
    shuffle=False  # keep the data order
)

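(3) verification:

# Create an ImageDataGenerator that only rescales the validation images
Verification_gen = ImageDataGenerator(
    rescale=1./255
)

# Use flow_from_dataframe to create the validation image generator from the DataFrame
Verification_generator = Verification_gen.flow_from_dataframe(
    dataframe=verification_list,  # validation DataFrame
    x_col='train_image_path',  # column with the image file paths
    y_col='label',  # column with the labels
    target_size=(200, 200),  # target image size
    class_mode='binary',  # class mode (binary classification)
    batch_size=128,  # batch size
    shuffle=False  # do not shuffle the data
)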
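With a fixed batch size, the number of batches each generator yields per epoch is the image count divided by the batch size, rounded up. The sketch below works this out under the assumption that the labeled folder holds 25,000 images, as in the commonly used Kaggle Dogs vs. Cats training set. The post itself does not state the count, so substitute the number printed in 3.2.

```python
import math


def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(num_images / batch_size)


total_images = 25_000  # assumption: size of the labeled set, not stated in the post
num_val = round(total_images * 0.25)  # 25% held out, the default used in 3.5
num_train = total_images - num_val

train_steps = steps_per_epoch(num_train, 128)
val_steps = steps_per_epoch(num_val, 128)
print(train_steps, val_steps)
```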

3.7 Choosing the model [optimized with RMSprop], training, and saving

vgg16_model = VGG16(weights='imagenet', include_top=False, input_shape=(200, 200, 3))

# Freeze the weights of the convolutional layers
for layer in vgg16_model.layers:
    layer.trainable = False

# Build new fully connected layers on top
model = Sequential()
model.add(vgg16_model)
model.add(Flatten())
model.add(Dense(256, activation='relu'))  # fully connected layer with 256 units
model.add(Dropout(0.5))  # dropout rate of 0.5
model.add(Dense(1, activation='sigmoid'))

# Compile the model with the RMSprop optimizer and a learning rate of 0.0001
# (`learning_rate` replaces the deprecated `lr` argument)
model.compile(loss='binary_crossentropy', optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.0001), metrics=['accuracy'])

# Print the model summary
model.summary()
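The sizes reported by `model.summary()` for the new layers can be derived by hand. The VGG16 convolutional base uses same-padded convolutions and five 2x2 max-pooling layers with stride 2, ending in 512 channels, so only the pooling changes the spatial size. The sketch below is plain arithmetic under those facts. The helper names are made up for this example.

```python
def vgg16_feature_shape(height: int, width: int):
    """Output shape of the VGG16 conv base: five stride-2 poolings, 512 channels."""
    for _ in range(5):
        height //= 2  # each max-pool halves the size, rounding down
        width //= 2
    return height, width, 512


def dense_params(inputs: int, units: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


h, w, c = vgg16_feature_shape(200, 200)
flat = h * w * c  # length of the vector produced by Flatten
head_params = dense_params(flat, 256) + dense_params(256, 1)
print((h, w, c), flat, head_params)
```

Since the convolutional base is frozen, these head parameters are the only ones updated during training.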

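import tensorflow as tf
from tensorflow.keras.callbacks import ModelCheckpoint

# Path where the checkpoint is saved
checkpoint_save_path = "./checkpoint/mnist.ckpt"

# If a checkpoint from an earlier run exists, resume from its weights
if os.path.exists(checkpoint_save_path + '.index'):
    print('Loading weights from the existing checkpoint')
    model.load_weights(checkpoint_save_path)

# Callback that saves the model during training
cp_callback = ModelCheckpoint(
    filepath=checkpoint_save_path,
    save_weights_only=True,  # save only the weights
    save_best_only=True  # keep only the model that performs best on the validation set
)

# Train the model
history = model.fit(
    train_generator,  # training data
    epochs=5,  # train for 5 epochs
    # batch_size is not passed here: the generators already yield batches of 128,
    # and Keras rejects batch_size for generator input
    validation_data=Verification_generator,  # validation data (generator defined in 3.6)
    validation_freq=1,  # validate after every epoch
    callbacks=[cp_callback],  # list of callbacks
    verbose=1  # print detailed training progress
)

# Save the model
model.save('model')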
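The `save_best_only=True` rule can be pictured as follows: by default `ModelCheckpoint` monitors `val_loss`, and a checkpoint is written only when the monitored value improves on the best seen so far. This is a simplified model of that rule rather than Keras's own code, and the loss values are made up.

```python
def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a 'save best only' rule would write a checkpoint."""
    best = float('inf')
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strictly better than anything seen before
            best = loss
            saved.append(epoch)
    return saved


example_val_losses = [0.42, 0.35, 0.37, 0.31, 0.33]  # made-up values for illustration
saved_epochs = epochs_that_save(example_val_losses)
print(saved_epochs)
```

Under this rule the checkpoint file can hold weights from an earlier epoch than the last one, while `model.save('model')` stores the model as it is after the final epoch.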

The training process is shown below:
[Figure: screenshot of the training log]

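3.8 Evaluating model performance: prediction

import numpy as np

# Load the saved model and run prediction.
# The test set has no labels, so the F1 score is computed on the validation set.
loaded_model = tf.keras.models.load_model('model')

predictions = loaded_model.predict(Verification_generator, steps=len(Verification_generator))

# Convert the predicted probabilities into class labels (threshold 0.5)
predicted_classes = (predictions.ravel() > 0.5).astype(int)

# Compute and print the F1 score
true_labels = Verification_generator.classes
f1 = f1_score(true_labels, predicted_classes)
print("F1 score:", f1)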
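For reference, the F1 score is the harmonic mean of precision and recall for the positive class. With `class_mode='binary'`, Keras assigns class indices in sorted label order, so for labels named `cat` and `dog` the positive class (index 1) would be `dog`. The sketch below computes F1 by hand on made-up values so the thresholding step and the formula are visible. It is an illustration, not a replacement for `f1_score`.

```python
def f1_from_probabilities(true_labels, probabilities, threshold=0.5):
    """Threshold probabilities into 0/1 labels and compute F1 for class 1."""
    predicted = [1 if p > threshold else 0 for p in probabilities]
    pairs = list(zip(true_labels, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels and sigmoid outputs, for illustration only.
example_true = [0, 0, 1, 1, 1]
example_prob = [0.1, 0.7, 0.8, 0.4, 0.9]
example_f1 = f1_from_probabilities(example_true, example_prob)
print(round(example_f1, 4))
```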

[Run result]

[Figure: screenshot of the printed F1 score]

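3.9 Accelerating with oneAPI

from openvino.inference_engine import IECore  # legacy OpenVINO Inference Engine API

# Load the Keras model (kept for comparison with the accelerated network)
loaded_model = load_model('model')

# Directory of the SavedModel written by model.save('model') in 3.7
model_path = 'model'

# Convert the TensorFlow model to an OpenVINO IR model.
# The IR input shape is fixed to one generator batch of 200x200 RGB images,
# matching the input size the model was trained with.
batch_size = 128
ir_model_path = 'my_model.xml'
mo_tf_path = '/opt/intel/openvino/deployment_tools/model_optimizer/mo_tf.py'
!python {mo_tf_path} --saved_model_dir {model_path} --output_dir . --model_name my_model --input_shape [{batch_size},200,200,3] --data_type FP32

# Load the OpenVINO inference engine
ie = IECore()
net = ie.read_network(model=ir_model_path, weights='my_model.bin')

# Get the names of the input and output nodes
input_blob = next(iter(net.input_info))
output_blob = next(iter(net.outputs))

# Create the executable network
exec_net = ie.load_network(network=net, device_name="CPU")

# Take one labeled batch for inference. The validation generator is used because
# the test generator yields images only, with no labels to score against.
x_test, y_test = Verification_generator[0]
x_test = x_test.astype('float32')

# Model Optimizer may convert the layout from NHWC to NCHW; match the IR input shape
if net.input_info[input_blob].input_data.shape[1] == 3:
    x_test = x_test.transpose(0, 3, 1, 2)

# Measure the inference time
start_time = time.time()

# Run inference with the loaded network
res = exec_net.infer(inputs={input_blob: x_test})

end_time = time.time()
inference_time = end_time - start_time

# Get the predictions
predictions = res[output_blob]

# Convert the predictions to binary labels (threshold 0.5)
y_pred_binary = (predictions.ravel() > 0.5).astype(int)

# Compute the F1 score
f1 = f1_score(y_test.astype(int), y_pred_binary)

# Print the F1 score
print(f'F1 Score: {f1}')

# Print the inference time
print(f'Inference Time: {inference_time:.2f}s')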
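A single timed call can be dominated by one-off costs such as lazy initialization, so a comparison between backends is usually more reliable with a warm-up call and several repeats. The sketch below is one way to do that with the standard library. `time_inference` and `dummy_predict` are hypothetical names, and `dummy_predict` is only a stand-in for a real call such as `exec_net.infer(...)` or `loaded_model.predict(...)`.

```python
import time


def time_inference(predict_fn, batch, warmup=1, repeats=5):
    """Return (best, mean) wall-clock seconds over `repeats` calls after `warmup` untimed calls."""
    for _ in range(warmup):
        predict_fn(batch)  # untimed warm-up run
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn(batch)
        timings.append(time.perf_counter() - start)
    return min(timings), sum(timings) / len(timings)


def dummy_predict(batch):
    """Stand-in for a model call: returns the mean of each row."""
    return [sum(row) / len(row) for row in batch]


best, mean = time_inference(dummy_predict, [[0.1, 0.2, 0.3]] * 1000)
print(f'best {best:.6f}s, mean {mean:.6f}s')
```

Timing both the Keras model and the converted network with the same function and the same batch keeps the comparison like for like.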

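[Run result]
[Figure: screenshot of the printed F1 score and inference time]

In summary, compared with the prediction time and F1 score before acceleration, both the prediction time and the accuracy improved noticeably after oneAPI acceleration.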

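4. Results analysis

Faster inference: oneAPI acceleration can significantly speed up VGG16 inference. Because oneAPI can make use of hardware accelerators (such as GPUs and FPGAs), the model's computation runs faster and images are classified sooner.
Speedup depends on hardware: the effect of oneAPI acceleration depends on the hardware you use. Different devices have different compute capabilities and optimization features, so the same model may see different speedups on different hardware.
Accuracy stays the same: oneAPI acceleration does not affect the model's classification accuracy. The speedup comes from optimizing the computation and using hardware accelerators, not from changing the model's weights and parameters, so the classification results should match those of the un-accelerated model.
Higher energy use: although oneAPI speeds up inference, it may lead to higher energy consumption. Hardware accelerators usually need more power and cooling, so energy consumption and heat dissipation should be considered when using oneAPI acceleration.

Overall, running the Cats vs. Dogs experiment with a VGG16 model and oneAPI acceleration can give faster inference without losing classification accuracy. The speedup still has to be evaluated for the specific hardware and use case, weighing speed, accuracy, and energy consumption.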
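The claim that accuracy is preserved can be checked directly by comparing the labels produced by the two backends on the same images. The sketch below computes the fraction of samples on which two lists of probabilities lead to the same thresholded label. The function name and all values are made up for illustration; in practice the two lists would come from the Keras model and from the converted network.

```python
def label_agreement(probs_a, probs_b, threshold=0.5):
    """Fraction of samples where both probability lists give the same thresholded label."""
    if len(probs_a) != len(probs_b) or not probs_a:
        raise ValueError('inputs must be non-empty and of equal length')
    same = sum((a > threshold) == (b > threshold) for a, b in zip(probs_a, probs_b))
    return same / len(probs_a)


# Made-up outputs for the same four images from two backends.
keras_probs = [0.02, 0.97, 0.51, 0.40]
openvino_probs = [0.03, 0.96, 0.49, 0.41]
agreement = label_agreement(keras_probs, openvino_probs)
print(agreement)
```

Small numerical differences between backends only change a label when a probability sits close to the threshold, as the third sample here shows.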
