21 TensorFlow Projects (3): Building Your Own Image Recognition Model

想看一次满天星

Article link: https://blog.csdn.net/wstc2689784536/article/details/130198785

Environment
Data preparation
Some doubts about the dataset

Environment

Python version: Python 3.8.16
TensorFlow version: 2.6.0

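Data preparation

The book uses a satellite image dataset, and the same dataset is used here.
Download link: https://wwn.lanzout.com/i6KkX0t8ulfe
After extraction, the contents look like this:
[Figure: contents of the extracted dataset]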

Class name	Meaning	Example image
glacier	Glacier	[image]
rock	Rock	[image]
urban	Urban area	[image]
water	Water	[image]
wetland	Farmland	[image]
wood	Forest	[image]

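Creating the TFRecord files

In TensorFlow 2.x, creating a TFRecord file involves the following main steps:

1. Define the features: describe the name and type of each feature of a sample (and, where needed, shape information such as height and width) by wrapping each value in a tf.train.Feature.
2. Convert the features to an Example: bundle the features of each sample into a tf.train.Example.
3. Serialize the Example to a string: call the SerializeToString() method on each tf.train.Example. (tf.io.serialize_tensor() serializes a tensor, not an Example.)
4. Write the serialized string to a TFRecord file: use tf.io.TFRecordWriter to write each serialized string to the file.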
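The four steps above can be sketched end to end on a single record. This is a minimal illustration, not the script used in this post: the payload bytes, the label value, and the file name demo.tfrecord are placeholders chosen for the example, and the read-back at the end is only there to check the round trip.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical sample: in the real pipeline these would be JPEG bytes and a class id.
payload = b"not-a-real-image"
label = 3

# Step 1: wrap each value in a tf.train.Feature.
feature = {
    "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    "image_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[payload])),
}

# Step 2: bundle the features into a tf.train.Example.
example = tf.train.Example(features=tf.train.Features(feature=feature))

# Step 3: serialize the Example protocol buffer to a byte string.
serialized = example.SerializeToString()

# Step 4: write the serialized string to a TFRecord file.
record_path = os.path.join(tempfile.mkdtemp(), "demo.tfrecord")
with tf.io.TFRecordWriter(record_path) as writer:
    writer.write(serialized)

# Read the record back to check that the round trip preserves the values.
description = {
    "label": tf.io.FixedLenFeature([], tf.int64),
    "image_raw": tf.io.FixedLenFeature([], tf.string),
}
records = [
    tf.io.parse_single_example(raw, description)
    for raw in tf.data.TFRecordDataset(record_path)
]
restored_label = int(records[0]["label"].numpy())
restored_payload = records[0]["image_raw"].numpy()
```

The feature description used for reading has to list the same keys and types that were written, which is why the reading and training code in this post repeat the same dictionary.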

Code for creating the TFRecord files:

import tensorflow as tf
import os

image_labels = {"glacier": 0, "rock": 1, "urban": 2, "water": 3 , "wetland": 4, "wood": 5 }


def _bytes_feature(value):
    # Wrap a bytes value in a tf.train.Feature.
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _int64_feature(value):
    # Wrap an integer value in a tf.train.Feature.
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def image_example(image_string, label):
    # Decode only to record the image dimensions; the raw JPEG bytes are stored as-is.
    image_shape = tf.image.decode_jpeg(image_string).shape

    feature = {
        'height': _int64_feature(image_shape[0]),
        'width': _int64_feature(image_shape[1]),
        'depth': _int64_feature(image_shape[2]),
        'label': _int64_feature(label),
        'image_raw': _bytes_feature(image_string),
    }

    return tf.train.Example(features=tf.train.Features(feature=feature))


def creat_tfrecord(path, path_data):
    # Write every image under path/<class name>/ into a single TFRecord file.
    with tf.io.TFRecordWriter(path_data) as writer:
        for label in os.listdir(path):
            if label not in image_labels:
                continue  # skip anything that is not a known class folder
            label_path = os.path.join(path, label)
            for image_name in os.listdir(label_path):
                image_path = os.path.join(label_path, image_name)
                # Read the raw bytes and close the file promptly.
                with open(image_path, 'rb') as f:
                    image_string = f.read()
                tf_example = image_example(image_string, image_labels[label])
                writer.write(tf_example.SerializeToString())

train_folder = "data_prepare/pic/train"
validation_folder = "data_prepare/pic/validation"
train_output_file = "./train.tfrecord"
validation_output_file = "./validation.tfrecord"

creat_tfrecord(train_folder,train_output_file)
creat_tfrecord(validation_folder,validation_output_file)
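The script above passes every file in a class folder to decode_jpeg and writes the records one class after another. A possible variation, sketched here with the standard library only, is to collect the (path, label) pairs first, skip anything that is not a JPEG, and shuffle the list once before writing. The helper name list_labeled_images, the extension filter, and the fixed seed are assumptions made for this example and are not part of the original script.

```python
import os
import random
import tempfile

image_labels = {"glacier": 0, "rock": 1, "urban": 2, "water": 3, "wetland": 4, "wood": 5}


def list_labeled_images(root, labels, extensions=(".jpg", ".jpeg"), seed=0):
    """Return shuffled (image_path, label_id) pairs found under root/<class name>/."""
    pairs = []
    for class_name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, class_name)
        if class_name not in labels or not os.path.isdir(class_dir):
            continue  # ignore stray files and unknown folders
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.lower().endswith(extensions):
                pairs.append((os.path.join(class_dir, file_name), labels[class_name]))
    # Shuffle once with a fixed seed so the write order is mixed but reproducible.
    random.Random(seed).shuffle(pairs)
    return pairs


# Tiny throwaway directory tree standing in for data_prepare/pic/train.
demo_root = tempfile.mkdtemp()
demo_files = {"rock": ["a.jpg", "b.JPG"], "wood": ["c.jpg", "notes.txt"]}
for class_name, file_names in demo_files.items():
    os.makedirs(os.path.join(demo_root, class_name))
    for file_name in file_names:
        open(os.path.join(demo_root, class_name, file_name), "wb").close()

pairs = list_labeled_images(demo_root, image_labels)
print(len(pairs), "images found")
```

Each pair could then be read and passed to image_example inside the TFRecordWriter loop, in place of the nested os.listdir calls.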

Code for reading the TFRecord file:

import tensorflow as tf

raw_image_dataset = tf.data.TFRecordDataset('train.tfrecord')

image_feature_description = {
    'height': tf.io.FixedLenFeature([], tf.int64),
    'width': tf.io.FixedLenFeature([], tf.int64),
    'depth': tf.io.FixedLenFeature([], tf.int64),
    'label': tf.io.FixedLenFeature([], tf.int64),
    'image_raw': tf.io.FixedLenFeature([], tf.string),
}

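def _parse_image_function(example_proto):
    # Parse one serialized tf.train.Example using the feature description above.
    return tf.io.parse_single_example(example_proto, image_feature_description)

parsed_image_dataset = raw_image_dataset.map(_parse_image_function)


# Print the first parsed record to check its contents.
for raw_record in parsed_image_dataset.take(1):
    print(repr(raw_record))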

Training the model

Code

import tensorflow as tf
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications.inception_v3 import InceptionV3

NUM_CLASSES = 6
IMG_SIZE = (299, 299)
BATCH_SIZE = 32
LEARNING_RATE = 0.001
EPOCHS = 20
TRAIN_TFRECORD = "./train.tfrecord"
VALID_TFRECORD = "./validation.tfrecord"

def parse_tfrecord(serialized_example):
    # Must match the feature keys and types used when the TFRecord was written.
    feature_description = {
        'height': tf.io.FixedLenFeature([], tf.int64),
        'width': tf.io.FixedLenFeature([], tf.int64),
        'depth': tf.io.FixedLenFeature([], tf.int64),
        'label': tf.io.FixedLenFeature([], tf.int64),
        'image_raw': tf.io.FixedLenFeature([], tf.string),
    }
    example = tf.io.parse_single_example(serialized_example, feature_description)
    # Decode the JPEG bytes to RGB and resize to the model's input size.
    image = tf.image.decode_jpeg(example['image_raw'], channels=3)
    image = tf.image.resize(image, IMG_SIZE)
    # Scale pixel values to [0, 1].
    image = tf.cast(image, tf.float32) / 255.0
    label = tf.one_hot(example['label'], NUM_CLASSES)
    return image, label

def create_dataset(file_path):
    dataset = tf.data.TFRecordDataset(file_path)
    dataset = dataset.map(parse_tfrecord, num_parallel_calls=tf.data.AUTOTUNE)
    # Shuffle with a buffer of 1000 records, then batch and prefetch.
    dataset = dataset.shuffle(1000)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(tf.data.AUTOTUNE)
    return dataset
  
train_dataset = create_dataset(TRAIN_TFRECORD)
valid_dataset = create_dataset(VALID_TFRECORD)

# InceptionV3 pretrained on ImageNet, without its original classification layers.
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(*IMG_SIZE, 3))

# New classification layers on top of the convolutional features.
x = layers.GlobalAveragePooling2D()(base_model.output)
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(NUM_CLASSES, activation='softmax')(x)

model = models.Model(inputs=base_model.input, outputs=x)

# Freeze the pretrained layers so that only the new layers are trained.
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer=optimizers.Adam(LEARNING_RATE),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(train_dataset, epochs=EPOCHS, validation_data=valid_dataset)

loss, accuracy = model.evaluate(valid_dataset)
print("Validation accuracy:", accuracy)

# Save the trained model in HDF5 format.
model.save("13.h5")

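Code walkthrough

1. Define constants and hyperparameters: NUM_CLASSES is the number of image classes, IMG_SIZE is the input image size, BATCH_SIZE is the size of each batch, LEARNING_RATE is the optimizer's learning rate, EPOCHS is the number of training epochs, and TRAIN_TFRECORD and VALID_TFRECORD are the paths of the training and validation TFRecord files.
2. Define parse_tfrecord, which reads and preprocesses the TFRecord data: it parses each record with tf.io.parse_single_example, decodes, resizes, and casts the image, one-hot encodes the label, and returns the image and label.
3. Define create_dataset, which reads a TFRecord file with tf.data.TFRecordDataset, applies parse_tfrecord to every sample with map, then shuffles, batches, and prefetches the data and returns the resulting dataset.
4. Create train_dataset and valid_dataset for the training and validation data.
5. Load InceptionV3 with its ImageNet-pretrained weights. include_top=False loads only the convolutional part, without the original classification layers; its output becomes the input of the new layers.
6. On top of the InceptionV3 output, add a global average pooling layer, a fully connected layer, a Dropout layer, and a final softmax layer that outputs a probability distribution over the NUM_CLASSES classes.
7. Build the model from the input and output, and freeze the InceptionV3 layers so that only the new layers are trained.
8. Compile the model with the Adam optimizer, the categorical cross-entropy loss, and the accuracy metric.
9. Train on the training dataset for EPOCHS epochs, evaluating on the validation dataset.
10. Finally, save the trained model as 13.h5.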
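One detail of the preprocessing step may be worth checking. parse_tfrecord scales pixels to [0, 1], whereas Keras pairs its InceptionV3 ImageNet weights with tf.keras.applications.inception_v3.preprocess_input, which maps pixel values from [0, 255] to [-1, 1]. Whether switching would improve the accuracy reported in this post has not been tested here. The NumPy sketch below only compares the two scalings; the helper names are illustrative, and scale_to_minus_one_one is a re-implementation of the formula for demonstration, not the Keras function itself.

```python
import numpy as np


def scale_to_unit(pixels):
    """Scaling used in parse_tfrecord: [0, 255] -> [0, 1]."""
    return pixels.astype(np.float32) / 255.0


def scale_to_minus_one_one(pixels):
    """Illustrative version of the [-1, 1] scaling: [0, 255] -> [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0


# Three sample pixel values: darkest, mid-gray, brightest.
pixels = np.array([0, 127.5, 255], dtype=np.float32)
unit = scale_to_unit(pixels)
centered = scale_to_minus_one_one(pixels)
print("[0, 1] scaling: ", unit)
print("[-1, 1] scaling:", centered)
```

Whichever scaling is chosen, the same one has to be applied both when training and when predicting.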

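Results

[Figure: training output]
I trained for 20 epochs, each consisting of 150 batches. During training, the loss fell from 1.0670 to 0.3733 and the accuracy rose from 0.7135 to 0.8617. On the validation set, the loss fell from 1.0338 to 0.6949 and the accuracy rose from 0.6058 to 0.7692. The training process looks reasonable, but I do not think the result is very good.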
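With 150 batches of 32 per epoch, the training set holds at most 150 × 32 = 4800 images. Because creat_tfrecord writes one class folder after another and create_dataset shuffles with a buffer of 1000, the first batches of every epoch can only draw from the beginning of the file. The sketch below simulates a shuffle buffer in plain Python to show the effect. It is a simplified model of the behavior of dataset.shuffle, not TensorFlow's implementation, and the even split of 800 images per class is an assumption.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in the order a fixed-size shuffle buffer would emit them."""
    rng = random.Random(seed)
    iterator = iter(items)
    buffer = []
    # Fill the buffer with the first buffer_size items.
    for item in iterator:
        buffer.append(item)
        if len(buffer) == buffer_size:
            break
    # Emit a random buffered item, then refill its slot with the next input item.
    for item in iterator:
        index = rng.randrange(len(buffer))
        yield buffer[index]
        buffer[index] = item
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


# Assumed layout: 6 classes x 800 images, written in class order (labels 0..5).
labels_in_file_order = [label for label in range(6) for _ in range(800)]
shuffled = list(buffered_shuffle(labels_in_file_order, buffer_size=1000))

first_batch = shuffled[:32]
print("Total images:", len(shuffled))
print("Classes in first batch:", sorted(set(first_batch)))
```

Under these assumptions the first batch can contain only classes 0 and 1, since the buffer has seen just the first 1000 or so records. A buffer at least as large as the dataset, for example dataset.shuffle(4800), would give a uniform shuffle; whether that changes the final accuracy here would need to be measured.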

Prediction

Code

import tensorflow as tf
import numpy as np
from PIL import Image

NUM_CLASSES = 6
IMG_SIZE = (299, 299)

model = tf.keras.models.load_model('13.h5')

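def preprocess_image(image_path):
    # Force 3 channels so grayscale or RGBA files match the model's input shape.
    img = Image.open(image_path).convert('RGB')
    img = img.resize(IMG_SIZE)
    # Same [0, 1] scaling as in training.
    img = np.array(img, dtype=np.float32) / 255.0
    # Add a batch dimension: (299, 299, 3) -> (1, 299, 299, 3).
    img = np.expand_dims(img, axis=0)
    return img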

image_path = 'data_prepare/pic/validation/wood/73987_98782_18.jpg'
preprocessed_image = preprocess_image(image_path)
prediction = model.predict(preprocessed_image)

predicted_class = np.argmax(prediction)
print("Predicted class:", predicted_class)
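The script prints a numeric class index. A small sketch of mapping that index back to a class name follows, assuming the same image_labels mapping that was used when writing the TFRecord files. The probability list is made up for illustration and stands in for model.predict(...)[0]; it is not real model output.

```python
image_labels = {"glacier": 0, "rock": 1, "urban": 2, "water": 3, "wetland": 4, "wood": 5}

# Invert the name -> index mapping used when the TFRecord files were written.
index_to_name = {index: name for name, index in image_labels.items()}

# Made-up probabilities standing in for model.predict(...)[0].
probabilities = [0.02, 0.03, 0.05, 0.10, 0.10, 0.70]

# Index of the largest probability, equivalent to np.argmax on a 1-D array.
predicted_index = max(range(len(probabilities)), key=probabilities.__getitem__)
predicted_name = index_to_name[predicted_index]
print(f"Predicted class: {predicted_name} ({probabilities[predicted_index]:.2f})")
```

Printing the probability next to the name also shows how confident the model is, which the bare index does not.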

Results

[Figure: prediction output]
The model predicts the correct class for this image.

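Some doubts about the dataset

I downloaded the dataset from the internet, and it matches the one used in the book. I am not sure whether my download source is at fault or whether the dataset is simply like this, but some images appear to be mislabeled. For example:
[Figure: example image from the dataset]
In the dataset this image is labeled wetland, but it is clearly urban, not wetland. Some other images have the same problem, and I do not know why. I suspect this may be one reason the model's results are not ideal. Of course, this is only my own view.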