Training ssd_mobilenet_v3_small with the TensorFlow Object Detection API


橘也


Tags: tensorflow, deep learning, computer vision

Article link: https://blog.csdn.net/qq_45057749/article/details/109075797


The environment used for this experiment:
Ubuntu 18.04
tensorflow-gpu 1.15
CUDA 10.0
cuDNN 7.6.5
Model: ssd_mobilenet_v3_small

The procedure is described in detail below.

I. Installing the TensorFlow Object Detection API

1. Download the API to a folder of your choice

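I first downloaded models from the official GitHub repository with tensorflow-gpu==1.12.0 installed, but the compile-and-test step that follows failed with ModuleNotFoundError: No module named 'tensorflow.compat.v1'. An issue thread suggested a workaround: download the r1.13.0 branch instead. A later step then failed with ImportError: cannot import name 'device_spec'. Another issue thread pointed to the fix, so I reinstalled tensorflow-gpu 1.15.0 and went back to the models I had originally downloaded from GitHub.

Put the downloaded models under the tensorflow folder.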


2. Create a virtual environment in Anaconda and install tensorflow-gpu

conda create -n tf1_36 python=3.6  # create the virtual environment
conda activate tf1_36  # activate the virtual environment
pip install tensorflow-gpu==1.15.0  # install tensorflow

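If installing tensorflow fails with socket.timeout: The read operation timed out, you can install from a mirror instead, for example the Aliyun mirror (use the https URL, since pip rejects plain-http indexes unless they are marked as trusted hosts):

pip install -i https://mirrors.aliyun.com/pypi/simple/ tensorflow-gpu==1.15.0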

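3. Install the other packages in the virtual environment

Install the packages highlighted in red in the screenshot below; if a later test reports a missing package, install whichever one is missing.
[Screenshot: package list with the required packages highlighted in red]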

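4. Compile Protobuf and add the environment variables

(1) Compile
The TensorFlow Object Detection API uses Protobufs to configure models and training parameters. Before using the framework, the .proto definitions must be compiled. Change to this directory: tensorflow/models/research/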

# Compile protos.
protoc object_detection/protos/*.proto --python_out=.

If no error is reported, the compilation succeeded.
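Because protoc prints nothing on success, it can be reassuring to confirm that each .proto file now has a generated _pb2.py next to it. The following is a minimal sketch using only the standard library; the helper name missing_pb2_files is hypothetical (it is not part of the API), and the usage part assumes it is run from tensorflow/models/research.

```python
import os


def missing_pb2_files(protos_dir):
    """Return the .proto files in protos_dir that have no generated *_pb2.py."""
    missing = []
    for name in sorted(os.listdir(protos_dir)):
        if not name.endswith('.proto'):
            continue
        # protoc --python_out writes foo_pb2.py for foo.proto
        generated = name[:-len('.proto')] + '_pb2.py'
        if not os.path.isfile(os.path.join(protos_dir, generated)):
            missing.append(name)
    return missing


if __name__ == '__main__':
    # Assumption: the current directory is tensorflow/models/research.
    protos = os.path.join('object_detection', 'protos')
    if os.path.isdir(protos):
        print('Not compiled:', missing_pb2_files(protos) or 'none')
```

An empty list means every definition was compiled; any name it returns is a .proto file that protoc did not process.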
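(2) Add the environment variables
tensorflow/models/research/ and its slim directory must be added to the PYTHONPATH environment variable. In a terminal, change to the tensorflow/models/research/ directory: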

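# Add the environment variable
gedit ~/.bashrc
# Append the next line to the end of .bashrc, replacing /path/to with the real absolute path.
# (A literal `pwd` in .bashrc is re-evaluated in whatever directory each new shell starts in.)
export PYTHONPATH=$PYTHONPATH:/path/to/tensorflow/models/research:/path/to/tensorflow/models/research/slim
source ~/.bashrc  # after saving and closing the file, run this so the change takes effect immediately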
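To check that the variable really contains both directories in a fresh shell, a small standard-library sketch like the one below can help. The function name missing_from_pythonpath and the example location ~/tensorflow/models/research are assumptions for illustration only.

```python
import os


def missing_from_pythonpath(research_dir, pythonpath):
    """Return the required directories that are absent from a PYTHONPATH string."""
    required = [research_dir, os.path.join(research_dir, 'slim')]
    # Normalize entries so that trailing slashes do not cause false alarms.
    present = {os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p}
    return [d for d in required if os.path.normpath(d) not in present]


if __name__ == '__main__':
    # Hypothetical location; replace it with your own models/research directory.
    research = os.path.expanduser('~/tensorflow/models/research')
    print(missing_from_pythonpath(research, os.environ.get('PYTHONPATH', '')))
```

An empty list means both research and research/slim are on PYTHONPATH.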

5. Test whether the API is installed correctly

# Test the installation.
python object_detection/builders/model_builder_tf1_test.py

If OK is printed, the installation succeeded.


II. Converting the dataset annotation format

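Split your dataset into a training set (train) and a test set (test), and put the training and test images in separate folders under object_detection/images.
My annotation files are in xml format, while TensorFlow needs them converted to TFRecords, so the next step is converting the annotation format.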
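The post does not say how the split was made. One possible way, sketched here under my own assumptions (a 20% test fraction and a fixed random seed, neither of which comes from the original), is a reproducible shuffle of the image file names:

```python
import random


def split_dataset(filenames, test_fraction=0.2, seed=0):
    """Shuffle filenames reproducibly and split them into (train, test) lists."""
    names = sorted(filenames)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)
    n_test = int(round(len(names) * test_fraction))
    return names[n_test:], names[:n_test]


if __name__ == '__main__':
    # Hypothetical file names, for illustration only.
    images = ['img_{:03d}.jpg'.format(i) for i in range(10)]
    train, test = split_dataset(images)
    print(len(train), 'train /', len(test), 'test')
```

The two returned lists can then be used to copy each image, together with its xml file, into the train or test folder.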

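There are two steps: (1) collect the information from the xml files in a folder into a single .csv table (xml_to_csv.py); (2) create the TFRecords from the .csv table (generate_tfrecord.py). The code is based on a project on GitHub.

xml -> .csv

The code below worked as-is and extracted the information successfully, as the screenshot after it shows.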

# -*- coding: utf-8 -*-
import glob
import pandas as pd
import xml.etree.ElementTree as ET

path = '/home/jiao/PycharmProjects/Practise/train_xml'  # directory containing the xml files

def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    xml_path = path
    xml_df = xml_to_csv(xml_path)
    xml_df.to_csv('train_labels.csv', index=False)
    print('Successfully converted xml to csv.')

if __name__ == '__main__':
    main()
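The script above reads fields by position (member[0], member[4][0] and so on), which only works when every object element lists its children in the usual Pascal VOC order. As a hedged alternative, here is a sketch that looks the fields up by tag name instead. The function rows_from_annotation and the sample annotation are mine, written for illustration; the sample deliberately omits the optional pose, truncated and difficult tags, which would break positional indexing.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation used only to demonstrate the lookup.
SAMPLE_XML = """
<annotation>
  <filename>0001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>head</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>
"""


def rows_from_annotation(root):
    """Yield (filename, width, height, class, xmin, ymin, xmax, ymax) per object."""
    filename = root.findtext('filename')
    width = int(root.findtext('size/width'))
    height = int(root.findtext('size/height'))
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        yield (filename, width, height, obj.findtext('name'),
               int(box.findtext('xmin')), int(box.findtext('ymin')),
               int(box.findtext('xmax')), int(box.findtext('ymax')))


if __name__ == '__main__':
    for row in rows_from_annotation(ET.fromstring(SAMPLE_XML.strip())):
        print(row)
```

The tuples have the same column order as the CSV header used above, so they could be appended to xml_list unchanged.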

[Screenshot: the extracted data in the generated CSV file]

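.csv -> TFRecords

Method 1:
Using the code from the GitHub project mentioned above produced errors.
The first was No module named 'object_detection.utils'; 'object_detection' is not a package. Searching suggested that Object Detection had not been added to the environment variables (see "Error when generating the record file"), but I was sure I had added it correctly, so I looked further. I had been running generate_tfrecord.py directly inside the downloaded GitHub project, so I copied the two .py files used to generate TFRecords, generate_tfrecord.py and object_detection.py, into tensorflow/models/research, and the error disappeared. The likely cause is that the object_detection.py file sitting next to the script shadowed the object_detection package; inside models/research the package directory takes precedence.

Then a second error appeared: AttributeError: module 'tensorflow' has no attribute 'app'. This turned out to be a TensorFlow version problem (see "AttributeError: module tensorflow no attribute app: solution"). I was running TensorFlow 2 at that point, while the source code targets TensorFlow 1.x, hence the error.
Fix: change import tensorflow as tf to import tensorflow.compat.v1 as tf

With that, the dataset was finally generated in TFRecords format.

Method 2:
Create a generate_tfrecord.py file directly in the research/object_detection directory with the code below, then run it from a terminal.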

from __future__ import division
from __future__ import print_function
from __future__ import absolute_import

import os
import io
import pandas as pd
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('image_dir', '', 'Path to the image directory')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS

# M1: modify this function to match your own classes
def class_text_to_int(row_label):  # class names and their ids
    if row_label == 'head':
        return 1
    elif row_label == 'pole':
        return 2
    else:
        return None

def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]

def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), FLAGS.image_dir)
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())

    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))

if __name__ == '__main__':
    tf.app.run()

Run the following in a terminal opened in /research/object_detection:

python generate_tfrecord.py --csv_input=data/train_labels.csv --image_dir=images/train --output_path=train.record
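generate_tfrecord.py divides the pixel coordinates by the image width and height without checking them. If an annotation extends past the image border or is inverted, the stored values fall outside [0, 1] or describe an empty box, which may cause trouble later in training. A small validation step, sketched below with a hypothetical helper name normalize_box, could be applied to each row before appending it:

```python
def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Convert pixel coordinates to the [0, 1] range, clamping and validating."""
    if width <= 0 or height <= 0:
        raise ValueError('image size must be positive')

    def clamp(value):
        return min(max(value, 0.0), 1.0)

    box = (clamp(xmin / float(width)), clamp(ymin / float(height)),
           clamp(xmax / float(width)), clamp(ymax / float(height)))
    # After clamping, a valid box still needs positive width and height.
    if box[0] >= box[2] or box[1] >= box[3]:
        raise ValueError('degenerate box: {}'.format(box))
    return box


if __name__ == '__main__':
    print(normalize_box(10, 20, 110, 220, 640, 480))
```

Whether to clamp, skip, or fix such annotations by hand is a judgment call; raising an error at least makes the bad rows visible.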


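III. Training the model

Modify the configuration file

Create a training folder under tensorflow/models/research/object_detection and put the configuration file there. Pick the configuration file for the model you want to train from tensorflow/models/research/object_detection/samples/configs.

Then edit the contents of the configuration file:

Change num_classes to match your data;
batch_size can be adjusted;
Change num_steps, the maximum number of training iterations;
Set the pretrained model: the last two lines, fine_tune_checkpoint and from_detection_checkpoint, make training start from a checkpoint of a pretrained model. If pointing them at a local checkpoint causes problems, you can delete both lines, which amounts to training from scratch. Download the pretrained model from GitHub, unpack it, and put it under tensorflow/models/research/object_detection/.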

train_config: {
  batch_size: 8
  sync_replicas: true
  startup_delay_steps: 0
  replicas_to_aggregate: 32
  num_steps: 80000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          learning_rate_base: 0.4
          total_steps: 80000
          warmup_learning_rate: 0.13333
          warmup_steps: 2000
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  max_number_of_boxes: 100
  unpad_groundtruth_tensors: false
  fine_tune_checkpoint: "ssd_mobilenet_v3_small_coco_2020_01_14/model.ckpt"
  from_detection_checkpoint: true
}
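To get a feeling for what the cosine_decay_learning_rate block does with these numbers, the schedule can be reproduced in a few lines. This is a sketch of how a cosine decay with linear warmup is commonly defined, written from my understanding of the schedule; the API's own implementation may differ in details, and the function name cosine_lr_with_warmup is hypothetical. The default arguments are the values from the config above.

```python
import math


def cosine_lr_with_warmup(step, base=0.4, total_steps=80000,
                          warmup_lr=0.13333, warmup_steps=2000):
    """Learning rate at a given step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        # Linear ramp from warmup_lr up to base over the warmup period.
        return warmup_lr + (base - warmup_lr) * step / float(warmup_steps)
    if step > total_steps:
        return 0.0
    progress = (step - warmup_steps) / float(total_steps - warmup_steps)
    return 0.5 * base * (1.0 + math.cos(math.pi * progress))


if __name__ == '__main__':
    for s in (0, 1000, 2000, 41000, 80000):
        print(s, round(cosine_lr_with_warmup(s), 5))
```

Under these assumptions the rate starts at warmup_learning_rate, peaks at learning_rate_base when warmup ends, and falls to zero at total_steps, which is why total_steps is usually kept equal to num_steps.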

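Set label_map_path and input_path: put the converted train.record and test.record files in tensorflow/models/research/object_detection/data/. In the same object_detection/data directory, create a label_map.pbtxt file (you can copy an existing file under another name and edit it) and list the labels of the classes to detect. The ids must match the ones used when converting .csv to TFRecords and must start from 1. The contents of label_map.pbtxt are shown below.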

item {
  id: 1
  name: 'head'
}

item {
  id: 2
  name: 'pole'
}
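Since the ids in label_map.pbtxt and in class_text_to_int have to agree, one way to avoid a mismatch is to derive both from a single list of class names. The sketch below is only an illustration of that idea; build_label_map and this list-based variant of class_text_to_int are my own hypothetical helpers, not part of the API.

```python
CLASS_NAMES = ['head', 'pole']  # the order defines the ids: head -> 1, pole -> 2


def build_label_map(class_names):
    """Return label_map.pbtxt text with ids starting at 1, in list order."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {{\n  id: {}\n  name: '{}'\n}}\n".format(index, name))
    return '\n'.join(items)


def class_text_to_int(row_label, class_names=CLASS_NAMES):
    """Look up the id of a class name from the same list."""
    return class_names.index(row_label) + 1


if __name__ == '__main__':
    print(build_label_map(CLASS_NAMES))
```

Writing the returned text to data/label_map.pbtxt reproduces the file shown above for the two classes head and pole.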

label_map_path and input_path are set as follows.

train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "data/label_map.pbtxt"
}

eval_config: {
  num_examples: 3073  
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "data/label_map.pbtxt"
  shuffle: false
  num_readers: 1
}

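Train the model

Run the following command in the tensorflow/models/research/object_detection directory to start training. pipeline_config_path is the path to the configuration file, model_dir is where the generated files are stored, num_train_steps sets the number of training steps (80000), and num_eval_steps sets the number of evaluation steps (800).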

python model_main.py --pipeline_config_path=training/ssdlite_mobilenet_v3_small_320x320_coco.config --model_dir=training --num_train_steps=80000 --num_eval_steps=800 --alsologtostderr

Errors you may run into:
(1) ModuleNotFoundError: No module named 'pycocotools'. Fix (run in order):

# Install cython
pip install -U cython
# Install pycocotools
## Download https://github.com/cocodataset/cocoapi from GitHub, then run the following in its PythonAPI directory
make
python setup.py build_ext install

(2) ModuleNotFoundError: No module named 'object_detection'. Fix (run in order):
cd to the models/research directory, make sure the paths have been added to the environment variable (export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim), then run

python setup.py build
python setup.py install

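If nothing else goes wrong, training finally starts! Interrupting it is not a problem: run the model_main.py command above again and training resumes from the last checkpoint.

The run above was on the CPU.

Training on the GPU

First check that your tensorflow-gpu and CUDA versions match; I used tensorflow-gpu 1.15 with CUDA 10.0. Then start training: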

CUDA_VISIBLE_DEVICES=0 python model_main.py --pipeline_config_path=training/ssdlite_mobilenet_v3_small_320x320_coco.config --model_dir=training --num_train_steps=80000 --num_eval_steps=800 --alsologtostderr

You may see the error Failed to get convolution algorithm. This is probably because cuDNN failed to initialize. A possible cause is insufficient GPU memory. To work around it, add the following code at the top of object_detection/model_main.py, which makes TensorFlow allocate GPU memory on demand.

from tensorflow import ConfigProto
from tensorflow import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

[Addendum]: If training directly on the GPU as above still fails, I found that there is another way to train, using object_detection/legacy/train.py.

export CUDA_VISIBLE_DEVICES=0
python legacy/train.py --pipeline_config_path=training/ssdlite_mobilenet_v3_small_320x320_coco.config --train_dir=training --alsologtostderr

Errors you may run into with this script:
(1) ValueError: not enough values to unpack (expected 7, got 0). Fix:
Set sync_replicas to false in the SSD configuration file.

(2) Unsuccessful TensorSliceReader constructor: Failed to find any matching files for my_checkpoint ckpt. Fix:
Edit fine_tune_checkpoint in the SSD configuration file and append the step suffix -400000 after model.ckpt, so that the prefix matches the checkpoint files on disk:

fine_tune_checkpoint: "ssdlite_mobiledet_cpu_320x320_coco_2020_05_19/model.ckpt-400000"
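The "failed to find any matching files" error generally means that the prefix written in the config does not match the file names on disk. A quick way to see which prefixes actually exist is to list the .index files in the unpacked model folder. The sketch below uses only the standard library; list_checkpoint_prefixes is a hypothetical helper, and it assumes the usual layout in which each checkpoint has a <prefix>.index file.

```python
import os
import re


def list_checkpoint_prefixes(directory):
    """Return the checkpoint prefixes found in directory, highest step last."""
    prefixes = []
    for name in os.listdir(directory):
        if name.endswith('.index'):
            prefixes.append(name[:-len('.index')])

    def step_of(prefix):
        # model.ckpt-400000 -> 400000; a prefix without a step sorts first.
        match = re.search(r'-(\d+)$', prefix)
        return int(match.group(1)) if match else -1

    return [os.path.join(directory, p) for p in sorted(prefixes, key=step_of)]


if __name__ == '__main__':
    folder = 'ssdlite_mobiledet_cpu_320x320_coco_2020_05_19'
    if os.path.isdir(folder):
        for prefix in list_checkpoint_prefixes(folder):
            print(prefix)
```

Any path it prints can be used as the value of fine_tune_checkpoint. The same listing, run on the training folder, shows which model.ckpt-N prefix is available for export later.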

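Training is running!
[Note]: Training other models with the Object Detection API follows the same steps. For example, when training the ssd_resnet_50_fpn_coco model with object_detection/legacy/train.py, I ran into this error: ValueError: No variables to save
The cause was that the model's config file contained only the following line: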

fine_tune_checkpoint: "ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03/model.ckpt"

Adding one more line below it makes the error go away.

fine_tune_checkpoint: "ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03/model.ckpt"
from_detection_checkpoint: true

Visualizing training

TensorFlow also provides the powerful TensorBoard for visualizing the training process.
In the tensorflow/models/research/object_detection folder, run

tensorboard --logdir=training

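Then open the URL it prints.

Exporting the inference graph

Now that the model is trained, the next step is to export the frozen_inference_graph.pb file. It contains the trained detector together with the network architecture and parameters, and it is the file we are after.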

python export_inference_graph.py --input_type image_tensor --pipeline_config_path training/ssdlite_mobilenet_v3_small_320x320_coco.config --trained_checkpoint_prefix training/model.ckpt-80000 --output_directory inference_graph

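Run this in the tensorflow/models/research/object_detection directory. trained_checkpoint_prefix is the prefix of the checkpoint to export, and output_directory is where the .pb file is written; in this experiment it goes into the inference_graph folder.

If exporting fails with ModuleNotFoundError: No module named 'object_detection', run the following in the tensorflow/models/research directory: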

python setup.py install

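When it finishes, cd back to the object_detection directory and export the model.

IV. Testing the model

Put the images to run detection on in tensorflow/models/research/object_detection/test_images, then run the following script: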

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import matplotlib

# Matplotlib chooses Xwindows backend by default.
matplotlib.use('Agg')

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

MODEL_NAME = 'inference_graph'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = '/home/jiaoda/PycharmProjects/tensorflow/models/research/object_detection/data/zuanjing_label_map.pbtxt'

NUM_CLASSES = 2

# Load a (frozen) Tensorflow model into memory.
print('Loading model...')
detection_graph = tf.Graph()

with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Loading label map
print('Loading label map...')
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES,
                                                            use_display_name=True)
category_index = label_map_util.create_category_index(categories)


# Helper code
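def load_image_into_numpy_array(image):
    # Convert to RGB first so that grayscale or RGBA images also reshape to (H, W, 3).
    image = image.convert('RGB')
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)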


# Detection
# Directory containing the test images
path = '/home/jiaoda/PycharmProjects/tensorflow/models/research/object_detection/test_images/'
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

print('Detecting...')
with detection_graph.as_default():
    # Open one session and reuse it for every image instead of creating one per image.
    with tf.Session(graph=detection_graph) as sess:
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        boxes_tensor = detection_graph.get_tensor_by_name('detection_boxes:0')
        scores_tensor = detection_graph.get_tensor_by_name('detection_scores:0')
        classes_tensor = detection_graph.get_tensor_by_name('detection_classes:0')
        num_tensor = detection_graph.get_tensor_by_name('num_detections:0')

        # Process at most the first four files; slicing avoids an IndexError
        # when the folder holds fewer than four images.
        for file_name in os.listdir(path)[:4]:
            TEST_IMAGE_PATH = os.path.join(path, file_name)
            print(TEST_IMAGE_PATH)
            image = Image.open(TEST_IMAGE_PATH)
            image_np = load_image_into_numpy_array(image)
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Actual detection.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes_tensor, scores_tensor, classes_tensor, num_tensor],
                feed_dict={image_tensor: image_np_expanded})
            # Print the results of a detection.
            print(scores)
            print(classes)
            print(category_index)
            # Visualization of the results of a detection.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            # Save the annotated image next to the input. Note that for a .jpg
            # input this overwrites the original file.
            output_path = os.path.splitext(TEST_IMAGE_PATH)[0] + '.jpg'
            print(output_path)
            plt.figure(figsize=IMAGE_SIZE, dpi=300)
            plt.imshow(image_np)
            plt.savefig(output_path)
            plt.close()  # release the figure so memory does not grow with each image
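The script prints the raw scores and classes arrays, which are hard to read. If you want the detections as plain values, for example to log them or to crop the boxes, a small post-processing step could look like the sketch below. The helper filter_detections and the 0.5 score threshold are my own assumptions; the one fact it relies on is that detection_boxes holds [ymin, xmin, ymax, xmax] in normalized coordinates.

```python
def filter_detections(boxes, scores, classes, image_size, min_score=0.5):
    """Keep detections scoring at least min_score and convert boxes to pixels.

    boxes holds [ymin, xmin, ymax, xmax] in normalized coordinates;
    image_size is (width, height) in pixels.
    """
    width, height = image_size
    results = []
    for box, score, cls in zip(boxes, scores, classes):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        results.append({
            'class_id': int(cls),
            'score': float(score),
            # (left, top, right, bottom) in pixels
            'box_px': (int(round(xmin * width)), int(round(ymin * height)),
                       int(round(xmax * width)), int(round(ymax * height))),
        })
    return results


if __name__ == '__main__':
    # Hypothetical values shaped like the squeezed outputs of sess.run.
    demo = filter_detections([[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]],
                             [0.9, 0.3], [1.0, 2.0], image_size=(100, 200))
    print(demo)
```

In the script above it could be called as filter_detections(np.squeeze(boxes), np.squeeze(scores), np.squeeze(classes), image.size), and category_index[class_id]['name'] then gives the label.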

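Once the model is trained, if you need to deploy it on a mobile device, see my next post:
Converting a TensorFlow-trained model to a tflite model for deployment on Android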
