TensorFlow Object Detection API v2: A Detailed Tutorial for Training on Your Own Dataset

This post walks through pulling NVIDIA's TensorFlow image with Docker, starting a container, and installing the dependencies inside it, in particular the TensorFlow Object Detection API v2. It then shows how to organize a dataset in VOC format, filter out non-JPEG images, and convert the dataset to TFRecord, and provides the steps for creating the label_map.pbtxt file. Finally, it demonstrates how to modify the training configuration and run the training and evaluation commands.

1. Installation

  • Pull the image

    docker pull nvcr.io/nvidia/tensorflow:21.02-tf2-py3
    
  • Start the container

    container_name=tf21.02
    docker run -itd --name ${container_name} -v /workspace:/workspace nvcr.io/nvidia/tensorflow:21.02-tf2-py3 /bin/bash
    
  • Enter the container

    docker exec -it ${container_name} /bin/bash
    
  • Install dependencies


  apt-get update && apt-get install -y \
      git \
      gpg-agent \
      python3-cairocffi \
      protobuf-compiler \
      python3-pil \
      python3-lxml \
      python3-tk \
      wget

  • Install the TF Object Detection API v2

    cd /opt
    git clone https://github.com/tensorflow/models.git
    cp -r models /root/models
    cd /root/models/research
    protoc object_detection/protos/*.proto --python_out=.  # compile the proto files
    cp object_detection/packages/tf2/setup.py .
    pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple  # switch to the Tsinghua PyPI mirror
    python -m pip install .
    python object_detection/builders/model_builder_tf2_test.py   # verify the installation
    

2. Dataset Preparation

  • First, organize the dataset in VOC format (a sketch for generating train.txt / test.txt follows the layout below)

    VOCdevkit
    -- VOC2007
       -- Annotations
          -- *.xml
       -- JPEGImages
          -- *.jpg
       -- ImageSets
          -- Main
             -- train.txt
             -- test.txt  
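
    The tutorial assumes ImageSets/Main/train.txt and test.txt already exist. If they don't, they can be generated from the image filenames. Below is a minimal sketch, assuming an 80/20 random split; the paths and ratio are illustrative, not part of the original setup:

    import os
    import random

    voc_root = '/workspace/VOCdevkit/VOC2007'  # example path
    jpeg_dir = os.path.join(voc_root, 'JPEGImages')
    # VOC split files list image IDs (filenames without extension), one per line.
    ids = [os.path.splitext(f)[0] for f in os.listdir(jpeg_dir)
           if f.lower().endswith('.jpg')]
    random.shuffle(ids)
    split = int(len(ids) * 0.8)  # 80% train / 20% test

    main_dir = os.path.join(voc_root, 'ImageSets', 'Main')
    os.makedirs(main_dir, exist_ok=True)
    with open(os.path.join(main_dir, 'train.txt'), 'w') as f:
        f.write('\n'.join(ids[:split]) + '\n')
    with open(os.path.join(main_dir, 'test.txt'), 'w') as f:
        f.write('\n'.join(ids[split:]) + '\n')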
    
  • Filter out images that are not JPEG-encoded

    import PIL.Image
    import os

    imgs_folder = r'/workspace/VOCdevkit/VOC2007/JPEGImages'    # directory holding the .jpg files
    annos_folder = r'/workspace/VOCdevkit/VOC2007/Annotations'  # directory holding the .xml files
    imgs_filename = os.listdir(imgs_folder)

    for filename in imgs_filename:
        img_fullpath = os.path.join(imgs_folder, filename)
        image = PIL.Image.open(img_fullpath)

        if image.format != 'JPEG':
            # print(img_fullpath)
            image.close()  # release the file handle before deleting
            os.remove(img_fullpath)
            anno_filename = filename[:-4] + '.xml'  # assumes a 3-character extension
            anno_fullpath = os.path.join(annos_folder, anno_filename)
            os.remove(anno_fullpath)
    
    
  • Convert to TFRecord

    • Create a create_pascal_tf_record.py script under /root/models/research/object_detection

      vim create_pascal_tf_record.py
      
    • Copy the following code into create_pascal_tf_record.py

      # Copyright 2017 The TensorFlow Authors. All Rights Reserved.
      #
      # Licensed under the Apache License, Version 2.0 (the "License");
      # you may not use this file except in compliance with the License.
      # You may obtain a copy of the License at
      #
      #     http://www.apache.org/licenses/LICENSE-2.0
      #
      # Unless required by applicable law or agreed to in writing, software
      # distributed under the License is distributed on an "AS IS" BASIS,
      # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      # See the License for the specific language governing permissions and
      # limitations under the License.
      # ==============================================================================
      
      r"""Convert raw PASCAL dataset to TFRecord for object_detection.
      
      Example usage:
          python object_detection/dataset_tools/create_pascal_tf_record.py \
              --data_dir=/home/user/VOCdevkit \
              --year=VOC2012 \
              --output_path=/home/user/pascal.record
      """
      from __future__ import absolute_import
      from __future__ import division
      from __future__ import print_function
      
      import hashlib
      import io
      import logging
      import os
      
      from lxml import etree
      import PIL.Image
      import tensorflow.compat.v1 as tf
      
      from object_detection.utils import dataset_util
      from object_detection.utils import label_map_util
      
      
      flags = tf.app.flags
      flags.DEFINE_string('data_dir', '', 'Root directory to raw PASCAL VOC dataset.')
      flags.DEFINE_string('set', 'train', 'Convert training set, validation set or '
                          'merged set.')
      flags.DEFINE_string('annotations_dir', 'Annotations',
                          '(Relative) path to annotations directory.')
      flags.DEFINE_string('year', 'VOC2007', 'Desired challenge year.')
      flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
      flags.DEFINE_string('label_map_path', 'data/pascal_label_map.pbtxt',
                          'Path to label map proto')
      flags.DEFINE_boolean('ignore_difficult_instances', False, 'Whether to ignore '
                           'difficult instances')
      FLAGS = flags.FLAGS
      
      SETS = ['train', 'val', 'trainval', 'test']
      YEARS = ['VOC2007', 'VOC2012', 'merged']
      
      
      def dict_to_tf_example(data,
                             dataset_directory,
                             label_map_dict,
                             ignore_difficult_instances=False,
                             image_subdirectory='JPEGImages'):
        """Convert XML derived dict to tf.Example proto.
      
        Notice that this function normalizes the bounding box coordinates provided
        by the raw data.
      
        Args:
          data: dict holding PASCAL XML fields for a single image (obtained by
            running dataset_util.recursive_parse_xml_to_dict)
          dataset_directory: Path to root directory holding PASCAL dataset
          label_map_dict: A map from string label names to integers ids.
          ignore_difficult_instances: Whether to skip difficult instances in the
            dataset  (default: False).
          image_subdirectory: String specifying subdirectory within the
            PASCAL dataset directory holding the actual image data.
      
        Returns:
          example: The converted tf.Example.
      
        Raises:
          ValueError: if the image pointed to by data['filename'] is not a valid JPEG
        """
        img_path = os.path.join('VOC2007', 'JPEGImages', data['filename'])  # note: the year subdirectory is hardcoded here
        full_path = os.path.join(dataset_directory, img_path)
        with tf.gfile.GFile(full_path, 'rb') as fid:
          encoded_jpg = fid.read()
        encoded_jpg_io = io.BytesIO(encoded_jpg)
        image = PIL.Image.open(encoded_jpg_io)
      
        if image.format != 'JPEG':
          print(full_path)
          raise ValueError('Image format not JPEG')
        key = hashlib.sha256(encoded_jpg).hexdigest()
      
        width = int(data['size']['width'])
        height = int(data['size']['height'])
      
        xmin = []
        ymin = []
        xmax = []
        ymax = []
        classes = []
        classes_text = []
        truncated = []
        poses = []
        difficult_obj = []
        if 'object' in data:
          for obj in data['object']:
            difficult = bool(int(obj['difficult']))
            if ignore_difficult_instances and difficult:
              continue
      
            difficult_obj.append(int(difficult))
      
            xmin.append(float(obj['bndbox']['xmin']) / width)
            ymin.append(float(obj['bndbox']['ymin']) / height)
            xmax.append(float(obj['bndbox']['xmax']) / width)
            ymax.append(float(obj['bndbox']['ymax']) / height)
            classes_text.append(obj['name'].encode('utf8'))
            classes.append(label_map_dict[obj['name']])
            truncated.append(int(obj['truncated']))
            poses.append(obj['pose'].encode('utf8'))
      
        example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(
                data['filename'].encode('utf8')),
            'image/source_id': dataset_util.bytes_feature(
                data['filename'].encode('utf8')),
            'image/key/sha256': dataset_util.bytes_feature(key.encode('utf8')),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature('jpeg'.encode('utf8')),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmin),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmax),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymin),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymax),
            'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
            'image/object/class/label': dataset_util.int64_list_feature(classes),
            'image/object/difficult': dataset_util.int64_list_feature(difficult_obj),
            'image/object/truncated': dataset_util.int64_list_feature(truncated),
            'image/object/view': dataset_util.bytes_list_feature(poses),
        }))
        return example
      
      
      def main(_):
        if FLAGS.set not in SETS:
          raise ValueError('set must be in : {}'.format(SETS))
        if FLAGS.year not in YEARS:
          raise ValueError('year must be in : {}'.format(YEARS))
      
        data_dir = FLAGS.data_dir
        years = ['VOC2007', 'VOC2012']
        if FLAGS.year != 'merged':
          years = [FLAGS.year]
      
        writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
      
        label_map_dict = label_map_util.get_label_map_dict(FLAGS.label_map_path)
      
        for year in years:
          logging.info('Reading from PASCAL %s dataset.', year)
          examples_path = os.path.join(data_dir, year, 'ImageSets', 'Main',
                                       FLAGS.set + '.txt')
          annotations_dir = os.path.join(data_dir, year, FLAGS.annotations_dir)
          examples_list = dataset_util.read_examples_list(examples_path)
          for idx, example in enumerate(examples_list):
            if idx % 100 == 0:
              logging.info('On image %d of %d', idx, len(examples_list))
            path = os.path.join(annotations_dir, example + '.xml')
            with tf.gfile.GFile(path, 'r') as fid:
              xml_str = fid.read()
            xml = etree.fromstring(xml_str.encode('utf-8'))
            data = dataset_util.recursive_parse_xml_to_dict(xml)['annotation']
      
            tf_example = dict_to_tf_example(data, FLAGS.data_dir, label_map_dict,
                                            FLAGS.ignore_difficult_instances)
            writer.write(tf_example.SerializeToString())
      
        writer.close()
      
      
      if __name__ == '__main__':
        tf.app.run()
      
    • Create label_map.pbtxt (a sketch for multi-class label maps follows the example)

      vim /workspace/label_map.pbtxt
      
      # Paste the following. Class IDs start at 1; the background class is not included.
      item {
        id: 1
        name: 'car'
      }
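
      For a dataset with several classes, add one item block per class with consecutive IDs. As a convenience, here is a minimal sketch that writes the file from a class list (the class names below are placeholders):

      # Generate label_map.pbtxt from a list of class names.
      # IDs start at 1; the background class is not included.
      classes = ['car', 'person']  # replace with your own classes

      with open('/workspace/label_map.pbtxt', 'w') as f:
          for idx, name in enumerate(classes, start=1):
              f.write("item {\n  id: %d\n  name: '%s'\n}\n" % (idx, name))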
      
    • Generate train.tfrecord and test.tfrecord

      
      
      dataset_dir={your_path}/VOCdevkit   # e.g. /Muyun/VOCdevkit

      # generate train.tfrecord
      python create_pascal_tf_record.py \
             --data_dir=${dataset_dir} \
             --set=train \
             --year=VOC2007 \
             --output_path={your_path}/train.tfrecord \
             --label_map_path={your_path}/label_map.pbtxt

      # generate test.tfrecord
      python create_pascal_tf_record.py \
             --data_dir=${dataset_dir} \
             --set=test \
             --year=VOC2007 \
             --output_path={your_path}/test.tfrecord \
             --label_map_path={your_path}/label_map.pbtxt
      
A concrete example from the author's own "anfang" dataset:

# training set
python create_anfang_tfrecord.py --data_dir=/home/Mycode/VOCdevkit_anfang --set=train --year=VOC2007 --output_path=/home/Mycode/train.tfrecord --label_map_path=anfang_label_map.pbtxt

# test set
python create_anfang_tfrecord.py --data_dir=/home/Mycode/VOCdevkit_anfang --set=test --year=VOC2007 --output_path=/home/Mycode/test.tfrecord --label_map_path=anfang_label_map.pbtxt
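
To sanity-check the generated files, you can iterate over a TFRecord and count its serialized examples. A minimal sketch (the record path is an example):

    # Count the examples in a generated TFRecord.
    import tensorflow.compat.v1 as tf

    record_path = '/workspace/train.tfrecord'  # example path
    count = 0
    for serialized in tf.python_io.tf_record_iterator(record_path):
        tf.train.Example.FromString(serialized)  # parse; raises if a record is corrupt
        count += 1
    print('%s contains %d examples' % (record_path, count))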

3. Training

Modify the model config (a sketch for editing the config programmatically follows the list of edits):

cp object_detection/configs/tf2/faster_rcnn_resnet101_v1_800x1333_coco17_gpu.config \
   /Muyun/faster_rcnn_resnet101_v1_800x1333_coco17_gpu.config

(1) Set num_classes
    num_classes: 4  # the number of classes in your own dataset
(2) Comment out fine_tune_checkpoint
(3) Set fine_tune_checkpoint_type
    fine_tune_checkpoint_type: "detection"  # "detection" for detection, "classification" for classification
(4) Modify train_input_reader
    train_input_reader: {
      # change this to {your_path}/label_map.pbtxt
      label_map_path: "PATH_TO_BE_CONFIGURED/label_map.pbtxt"
      tf_record_input_reader {
        # change this to {your_path}/train.tfrecord
        input_path: "PATH_TO_BE_CONFIGURED/train.tfrecord"
      }
    }
(5) Modify eval_input_reader
    eval_input_reader: {
      # change this to {your_path}/label_map.pbtxt
      label_map_path: "PATH_TO_BE_CONFIGURED/label_map.pbtxt"
      shuffle: false
      num_epochs: 1
      tf_record_input_reader {
        # change this to {your_path}/test.tfrecord
        input_path: "PATH_TO_BE_CONFIGURED/val2017-?????-of-00032.tfrecord"
      }
    }
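
The same edits can be made programmatically with the API's config utilities, which is less error-prone than hand-editing. A minimal sketch, assuming the config and data paths used above (all paths are examples; the num_classes field lives under faster_rcnn for this config):

    # Edit the pipeline config via object_detection's config_util.
    from object_detection.utils import config_util

    config_path = '/Muyun/faster_rcnn_resnet101_v1_800x1333_coco17_gpu.config'
    configs = config_util.get_configs_from_pipeline_file(config_path)

    configs['model'].faster_rcnn.num_classes = 4
    configs['train_config'].fine_tune_checkpoint_type = 'detection'
    configs['train_input_config'].label_map_path = '/workspace/label_map.pbtxt'
    configs['train_input_config'].tf_record_input_reader.input_path[:] = [
        '/workspace/train.tfrecord']
    configs['eval_input_config'].label_map_path = '/workspace/label_map.pbtxt'
    configs['eval_input_config'].tf_record_input_reader.input_path[:] = [
        '/workspace/test.tfrecord']

    # Write the modified config back out as pipeline.config.
    pipeline_proto = config_util.create_pipeline_proto_from_configs(configs)
    config_util.save_pipeline_config(pipeline_proto, '/workspace/configs')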
  • Run the training command

    # From the tensorflow/models/research/ directory
    PIPELINE_CONFIG_PATH={path to pipeline config file}
    MODEL_DIR={path to model directory}   # e.g. /home/mycode/logs
    NUM_TRAIN_STEPS=50000
    SAMPLE_1_OF_N_EVAL_EXAMPLES=1
    python object_detection/model_main_tf2.py \
        --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
        --model_dir=${MODEL_DIR} \
        --num_train_steps=${NUM_TRAIN_STEPS} \
        --sample_1_of_n_eval_examples=${SAMPLE_1_OF_N_EVAL_EXAMPLES} \
        --alsologtostderr

    # For evaluation, additionally pass --checkpoint_dir=${MODEL_DIR}, pointing at the
    # directory holding the checkpoints, e.g. /home/Mycode/logs for model.ckpt-3755
    # (a checkpoint consists of three files: model.ckpt-3755.data-00000-of-00001,
    # model.ckpt-3755.index, model.ckpt-3755.meta).
    # Training progress can be monitored with: tensorboard --logdir=${MODEL_DIR}
    

4. Dockerfile

  • Package the container as an image
  • Push the image to Docker Hub

Commit a container as an image:

docker commit [OPTIONS] CONTAINER [REPOSITORY[:TAG]]

-a : author of the committed image
-c : apply Dockerfile instructions when creating the image
-m : commit message
-p : pause the container while committing

docker commit -a "Muyun" -m "tf object detection api tf2.5" -p od-tf2.4 songguangfu/tf-od-api:od-tf2.5

To push: first log in to Docker Hub and create a repository (username/repository); this yields a UUID-style access token. Then, on the server, log in with docker login -u <username> and enter the token when prompted.

Retag the image:
docker tag <source-repo>:<source-tag> <target-repo>:<target-tag>

docker push <username>/<repository>:<tag>

Alternatively, the whole setup can be reproduced with a Dockerfile (build it with docker build -t <image-name> . from a directory containing the models/ checkout):

FROM nvcr.io/nvidia/tensorflow:21.02-tf2-py3
  
RUN apt-get update && apt-get install -y \
    git \
    gpg-agent \
    python3-cairocffi \
    protobuf-compiler \
    python3-pil \
    python3-lxml \
    python3-tk \
    wget
COPY models/ /root/models
WORKDIR /root/models/research
RUN  protoc object_detection/protos/*.proto --python_out=.
RUN  cp object_detection/packages/tf2/setup.py .
RUN  pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
# RUN  python setup.py install  # using this instead may work better
RUN  python -m pip install .
ENV  CUDA_VISIBLE_DEVICES=0  # ENV persists; a RUN export would not survive the layer
RUN  python object_detection/builders/model_builder_tf2_test.py

To export the trained model and run inference, see:
https://github.com/tensorflow/models/blob/master/research/object_detection/export_inference_graph.py
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/oid_inference_and_evaluation.md
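
For the TF2 API, a trained checkpoint is exported to a SavedModel with object_detection/exporter_main_v2.py; inference then takes only a few lines. A minimal sketch, assuming the export landed in /workspace/exported/saved_model and a test image at /workspace/test.jpg (both paths are examples):

    # Run inference with an exported detection SavedModel.
    import numpy as np
    import tensorflow as tf
    from PIL import Image

    detect_fn = tf.saved_model.load('/workspace/exported/saved_model')

    image = np.array(Image.open('/workspace/test.jpg'))          # HxWx3, uint8
    input_tensor = tf.convert_to_tensor(image)[tf.newaxis, ...]  # add batch dim
    detections = detect_fn(input_tensor)

    # Boxes are normalized [ymin, xmin, ymax, xmax]; classes are 1-based label-map IDs.
    print(detections['detection_boxes'][0][:5].numpy())
    print(detections['detection_scores'][0][:5].numpy())
    print(detections['detection_classes'][0][:5].numpy())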