TensorFlow Object Detection

QSHigh

已于 2024-07-09 17:26:00 修改

阅读量658

点赞数 15

文章标签： tensorflow 目标检测 neo4j

于 2024-07-09 16:46:22 首次发布

本文链接：https://blog.csdn.net/qq_24090609/article/details/140300340

版权

TensorFlow Object Detection 基础讲解

重要步骤及代码详细解释

1. 环境准备

安装 Python 和 TensorFlow

sudo apt-get update
sudo apt-get install python3-pip
pip3 install tensorflow

sudo apt-get update: 更新包列表，确保获取最新的软件包信息。
sudo apt-get install python3-pip: 安装 Python 3 和 pip（Python 包管理器）。
pip3 install tensorflow: 使用 pip 安装 TensorFlow。

克隆 TensorFlow Models 仓库

git clone https://github.com/tensorflow/models.git
cd models/research

git clone https://github.com/tensorflow/models.git: 克隆 TensorFlow 的官方模型库，包含对象检测的相关模型和示例代码。
cd models/research: 进入克隆下来的仓库目录，进行后续操作。

安装依赖项

pip3 install -r requirements.txt

pip3 install -r requirements.txt: 安装 models/research 目录下 requirements.txt 文件中列出的所有依赖项。该文件包含 TensorFlow Object Detection 所需的库。

2. 配置和训练模型

准备数据集

使用 LabelImg 工具标注数据集，并将标注信息保存为 XML 文件。

数据集解释

1. 含义及对应关系

Training Dataset (训练数据集): 用于训练模型的数据集。它包含了用于模型学习的输入数据和对应的标注数据。训练集是整个数据集中最大的一部分，模型通过不断调整参数来拟合这些数据。
Validation Dataset (验证数据集): 在训练过程中用于评估模型性能的数据集。它不用于更新模型参数，而是用于选择最佳模型。验证集帮助确定模型是否过拟合或欠拟合。
Test Dataset (测试数据集): 用于在训练完成后评估模型性能的数据集。它完全独立于训练集和验证集，用于提供模型在实际数据上的性能评估。

2. 数据量及比例

对于数据量和比例，没有固定的标准，因为这取决于具体问题和数据集规模。不过，一般建议如下：

Training Dataset: 通常占总数据集的 60-80%。较大的训练集有助于模型学到更多的特征，提高模型的泛化能力。
Validation Dataset: 通常占总数据集的 10-20%。验证集的主要作用是用于模型选择和调参，确保模型在未见过的数据上的表现。
Test Dataset: 通常占总数据集的 10-20%。测试集用于最终评估模型的性能，保证其在真实环境中的效果。

数据量

小型数据集: 几百到几千张图片。例如，对于简单任务（如单一物体检测），几百张图片可能足够。
中型数据集: 几千到几万张图片。例如，对于中等复杂度的任务（如多物体检测），可能需要几千到几万张图片。
大型数据集: 几万到几十万张图片。例如，对于复杂任务（如多类别、多场景检测），可能需要几十万张图片。

数据比例的实践

假设有一个数据集包含 10,000 张图片：

训练集: 70% -> 7,000 张图片
验证集: 15% -> 1,500 张图片
测试集: 15% -> 1,500 张图片

实践建议

数据集的多样性: 确保数据集包含各种场景和条件，以提高模型的泛化能力。
数据扩增: 使用数据扩增技术（如旋转、缩放、翻转等）来增加训练集的多样性，从而提高模型的鲁棒性。
评估指标: 使用多种评估指标（如 Precision, Recall, mAP 等）来全面评估模型性能。

转换数据集为 TFRecord 格式

python3 object_detection/dataset_tools/create_pet_tf_record.py --data_dir=path/to/dataset --output_dir=path/to/output

python3 object_detection/dataset_tools/create_pet_tf_record.py: 运行脚本将数据集转换为 TFRecord 格式。
- --data_dir=path/to/dataset: 数据集路径。
- --output_dir=path/to/output: 转换后的 TFRecord 文件保存路径。

配置和训练模型

python3 object_detection/model_main.py --pipeline_config_path=path/to/config --model_dir=path/to/model --alsologtostderr

python3 object_detection/model_main.py: 运行模型训练脚本。
- --pipeline_config_path=path/to/config: 模型配置文件路径，定义了模型架构、训练参数等。
- --model_dir=path/to/model: 模型输出路径，保存训练过程中的检查点和日志。
- --alsologtostderr: 允许日志信息输出到标准错误输出流。

3. 验证和导出模型

验证模型性能

使用验证集评估模型性能。

导出模型用于推理

python3 object_detection/export_inference_graph.py --input_type=image_tensor --pipeline_config_path=path/to/config --trained_checkpoint_prefix=path/to/checkpoint --output_directory=path/to/output

python3 object_detection/export_inference_graph.py: 导出推理图。
- --input_type=image_tensor: 输入类型为图像张量。
- --pipeline_config_path=path/to/config: 模型配置文件路径。
- --trained_checkpoint_prefix=path/to/checkpoint: 训练检查点路径。
- --output_directory=path/to/output: 导出的推理图保存路径。

特殊名词讲解

TFRecord：一种用于存储训练数据的 TensorFlow 数据格式。
LabelImg：用于标注对象在图像中位置的工具，生成 XML 格式的标注文件。
Pipeline Config：配置文件，定义模型架构、训练参数等内容。

3. Java调用API进行推理

设置环境

安装 Java 和 Maven。

添加 TensorFlow Java 依赖

在项目的 pom.xml 文件中添加以下依赖：

<dependency>
  <groupId>org.tensorflow</groupId>
  <artifactId>tensorflow</artifactId>
  <version>1.15.0</version>
</dependency>

加载和调用模型

下面是一个使用 TensorFlow Java API 加载和调用训练好的模型的示例代码：

import org.tensorflow.Graph;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.Tensors;

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class ObjectDetection {
    public static void main(String[] args) throws Exception {
        // 加载冻结的推理图（frozen inference graph）
        byte[] graphDef = Files.readAllBytes(Paths.get("path/to/frozen_inference_graph.pb"));
        try (Graph graph = new Graph()) {
            graph.importGraphDef(graphDef);
            try (Session session = new Session(graph)) {
                // 加载图像并转换为 Tensor
                byte[] imageBytes = Files.readAllBytes(Paths.get("path/to/image.jpg"));
                Tensor<String> inputTensor = Tensors.create(imageBytes);

                // 执行推理
                List<Tensor<?>> results = session.runner()
                        .feed("image_tensor", inputTensor)
                        .fetch("detection_boxes")
                        .fetch("detection_scores")
                        .fetch("detection_classes")
                        .run();

                // 处理结果
                Tensor<Float> boxes = results.get(0).expect(Float.class);
                Tensor<Float> scores = results.get(1).expect(Float.class);
                Tensor<Float> classes = results.get(2).expect(Float.class);

                // 打印结果
                System.out.println("检测到的对象: " + classes);
                System.out.println("检测分数: " + scores);
                System.out.println("检测框: " + boxes);
            }
        }
    }
}