TensorFlow Object Detection API Tutorial, Part 2

种子选手

License: CC 4.0 BY-SA

Article link: https://blog.csdn.net/qq_36148847/article/details/79307598

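This article explains how to convert a custom dataset into the TFRecord format that TensorFlow requires, including the concrete steps of writing a conversion script and creating the TFRecord file.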

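Continuing from the previous post: at this point you have chosen a pre-trained model to adapt to a new object detection task. This post shows how to convert a dataset into TFRecord files so that the model can be fine-tuned. This is one of the trickiest parts of the whole process and requires writing some code by hand, unless the dataset you chose already comes in a specific format.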

TensorFlow Object Detection API Tutorial - Part 2: Converting an Existing Dataset to TFRecord

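In this tutorial we build a traffic light classifier that recognizes the state of a traffic light. The pre-trained model can detect traffic lights in an image, but not their state (green, yellow, red, and so on). The Bosch Small Traffic Light Dataset looks like an ideal choice for this task.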

1. Dataset Labels

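The TensorFlow Object Detection API requires all labeled training data to be in the TFRecord file format. If your dataset stores its labels in individual .xml files (one per image, as in the PASCAL VOC dataset), a script named create_pascal_tf_record.py exists that can convert the dataset to a TFRecord file, possibly after minor modifications.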

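If you are not that lucky and this script cannot convert your dataset, you have to write your own script to turn the dataset into a TFRecord file. The Bosch dataset labels are stored in a single .yaml file. An excerpt is shown below: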


- boxes:
  - {label: Green, occluded: false, x_max: 582.3417892052, x_min: 573.3726437481,
    y_max: 276.6271175345, y_min: 256.3114627642}
  - {label: Green, occluded: false, x_max: 517.6267821724, x_min: 510.0276868266,
    y_max: 273.164089267, y_min: 256.4279864221}
  path: ./rgb/train/2015-10-05-16-02-30_bag/720654.png
- boxes: []
  path: ./rgb/train/2015-10-05-16-02-30_bag/720932.png

Note: image 720654.png contains two green lights, while 720932.png contains no labeled objects.
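
To make the structure of the label file concrete, here is a minimal sketch of reading it in Python. It assumes the third-party PyYAML package is installed and that image paths are relative to the folder containing the .yaml file. The helper names `load_annotations`, `resolve_image_path`, and `count_labels` are hypothetical and are not part of the dataset tooling.

```python
import os


def load_annotations(yaml_path):
    """Parse the label file into a list of dicts, one per image."""
    # PyYAML is a third-party package (assumption: it is installed).
    # It is imported lazily so the helpers below work without it.
    import yaml
    with open(yaml_path, "r") as f:
        return yaml.safe_load(f)


def resolve_image_path(yaml_path, entry):
    """Turn the relative 'path' of an entry into a normalized absolute path."""
    base_dir = os.path.dirname(os.path.abspath(yaml_path))
    return os.path.normpath(os.path.join(base_dir, entry["path"]))


def count_labels(entries):
    """Count how many boxes carry each label across all images."""
    counts = {}
    for entry in entries:
        for box in entry["boxes"]:
            counts[box["label"]] = counts.get(box["label"], 0) + 1
    return counts
```

Each parsed entry is a dict with a `boxes` list (possibly empty) and a `path` string, which is the shape the conversion function in the next section has to consume.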

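A TFRecord combines all labels (bounding boxes) and images of the entire dataset into a single file. Creating a TFRecord file is somewhat painful, but once it exists it is very convenient to work with.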

2. Creating a Single TFRecord Entry

The using_your_own_dataset.md file in the TensorFlow Object Detection API provides a sample script, which is walked through below.


def create_tf_example(label_and_data_info):
  # TODO START: Populate the following variables from your example.
  height = None # Image height
  width = None # Image width
  filename = None # Filename of the image. Empty if image is not from file
  encoded_image_data = None # Encoded image bytes
  image_format = None # b'jpeg' or b'png'

  xmins = [] # List of normalized left x coordinates in bounding box (1 per box)
  xmaxs = [] # List of normalized right x coordinates in bounding box
             # (1 per box)
  ymins = [] # List of normalized top y coordinates in bounding box (1 per box)
  ymaxs = [] # List of normalized bottom y coordinates in bounding box
             # (1 per box)
  classes_text = [] # List of string class name of bounding box (1 per box)
  classes = [] # List of integer class id of bounding box (1 per box)
  # TODO END
  tf_label_and_data = tf.train.Example(features=tf.train.Features(feature={
      'image/height': dataset_util.int64_feature(height),
      'image/width': dataset_util.int64_feature(width),
      'image/filename': dataset_util.bytes_feature(filename),
      'image/source_id': dataset_util.bytes_feature(filename),
      'image/encoded': dataset_util.bytes_feature(encoded_image_data),
      'image/format': dataset_util.bytes_feature(image_format),
      'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
      'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
      'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
      'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
      'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
      'image/object/class/label': dataset_util.int64_list_feature(classes),
  }))
  return tf_label_and_data

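The function above receives the label and data information for a single image, as extracted from the .yaml file. Using this information, you need to write the code that populates all of the listed variables. Note that, besides the bounding box and class information, you must also supply the encoded image data, which can be read with tensorflow.gfile.GFile() (tf.io.gfile.GFile in TensorFlow 2.x). Once all of these variables are populated, you are ready for the second part of the script.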
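
As an illustration of filling in the TODO block for the Bosch label format, the sketch below converts one parsed .yaml entry into the normalized coordinate lists and class lists. It makes several assumptions: `LABEL_MAP`, `box_features`, and `read_encoded_image` are hypothetical names, the label map covers only a few example labels, and the image width and height must be supplied by the caller. Class ids start at 1 because the Object Detection API reserves id 0 for the background.

```python
# Hypothetical label map; extend it to cover every label in your dataset.
LABEL_MAP = {"Green": 1, "Red": 2, "Yellow": 3, "off": 4}


def read_encoded_image(path):
    """Return the raw, still-encoded bytes of an image file (PNG or JPEG)."""
    # Plain binary read; tf.gfile.GFile(path, 'rb') could be used the same way.
    with open(path, "rb") as f:
        return f.read()


def box_features(example, width, height, label_map=LABEL_MAP):
    """Build the per-box lists for one entry of the label file.

    Pixel coordinates are divided by the image size so that every
    value is normalized, as create_tf_example expects.
    """
    features = {"xmins": [], "xmaxs": [], "ymins": [], "ymaxs": [],
                "classes_text": [], "classes": []}
    for box in example["boxes"]:
        features["xmins"].append(box["x_min"] / width)
        features["xmaxs"].append(box["x_max"] / width)
        features["ymins"].append(box["y_min"] / height)
        features["ymaxs"].append(box["y_max"] / height)
        # Class names are stored as bytes, class ids as integers.
        features["classes_text"].append(box["label"].encode("utf8"))
        features["classes"].append(int(label_map[box["label"]]))  # KeyError on unknown labels
    return features
```

An image without boxes simply yields empty lists. For the file-level variables, `filename` would be the entry's path encoded as bytes, `image_format` would be b'png' for this dataset's .png files, and `encoded_image_data` would be the result of `read_encoded_image`.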

3. Creating the Complete TFRecord File

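With the create_tf_example function completed, all that remains is a loop that calls it for every labeled image in the dataset. TensorFlow's sample script provides the following code for this.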


import tensorflow as tf
from object_detection.utils import dataset_util

flags = tf.app.flags
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS

def create_tf_example(data_and_label_info):
  ...
  ...
  return tf_data_and_label

def main(_):
  writer = tf.python_io.TFRecordWriter(FLAGS.output_path)

  # TODO START: Write code to read in your dataset to examples variable
  file_loc = None
  all_data_and_label_info = LOAD(file_loc)
  # TODO END

  for data_and_label_info in all_data_and_label_info:
    tf_example = create_tf_example(data_and_label_info)
    writer.write(tf_example.SerializeToString())

  writer.close()

if __name__ == '__main__':
  tf.app.run()

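Once this is done, you can run the script. A TFRecord conversion script for the Bosch dataset is available, and if you want to see a complete example, Anthony Sarkis provides a cleaner implementation.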

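If you have not modified your .bashrc file, make sure to run the export PYTHONPATH statement in the terminal window before running this script (see Part 1 of this tutorial). Then open a terminal, change into the folder that holds the TFRecord script, make sure the data (image) folder sits where the paths in the .yaml file (or whatever other file lists the image paths) point to, and run the following command: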

python tf_record.py --output_path training.record

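To check that everything worked, compare the size of the resulting training record file with the size of the folder containing all training images. If the two are nearly identical, the conversion most likely completed correctly.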
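
The size comparison can also be scripted. The following is a minimal sketch using only the Python standard library. The function names and the 10% tolerance are assumptions chosen for illustration, not values prescribed by the tutorial.

```python
import os


def directory_size(path):
    """Sum the sizes in bytes of all files below path, recursively."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def sizes_roughly_match(record_path, image_dir, tolerance=0.10):
    """Check whether the record file is within tolerance of the image folder size."""
    record_size = os.path.getsize(record_path)
    images_size = directory_size(image_dir)
    if images_size == 0:
        return record_size == 0
    return abs(record_size - images_size) / images_size <= tolerance
```

For example, `sizes_roughly_match('training.record', './rgb/train')` would compare the record from the command above against the training image folder. The check is only a rough plausibility test: it assumes the folder contains nothing but the images that were written to the record.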

The next post explains how to create your own dataset.