TensorFlow Object Detection API: problems encountered, their fixes, and testing on your own dataset

Latest recommended article published 2024-06-17 13:39:20

Pinned · Hesitations

Tags: deep learning, object detection

Article link: https://blog.csdn.net/qq_40146495/article/details/82984734

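This post draws on the article at https://zhuanlan.zhihu.com/p/35854575.

First download the required data from the network drive: https://pan.baidu.com/s/1YRevOO-OSz1NKcBtYmRs9A

Password: bp9k

After downloading, unpack the data. Inside model-master there is a research folder; every step below is carried out from that research folder.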

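1. Install protoc

Version 3.4 is the safest choice. Unpacking the protoc archive gives you a bin folder; copy the protoc binary from that bin folder into /usr/bin/ and the installation is done.

If the output looks like the figure above, the installation succeeded.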
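
One way to confirm the installed version from Python is to parse the output of protoc --version. This is a minimal sketch, assuming protoc is on PATH and prints a line such as libprotoc 3.4.0; the helper names are made up for illustration.

```python
import re
import shutil
import subprocess


def parse_protoc_version(version_output):
    """Extract (major, minor, patch) from text such as 'libprotoc 3.4.0'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        return None
    # A missing patch component is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def installed_protoc_version():
    """Return the version of the protoc found on PATH, or None if absent."""
    if shutil.which("protoc") is None:
        return None
    result = subprocess.run(["protoc", "--version"],
                            stdout=subprocess.PIPE, universal_newlines=True)
    return parse_protoc_version(result.stdout)
```

Calling installed_protoc_version() returns None when no protoc is found, which usually means the binary was not copied into a directory on PATH.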

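2. Compile the .proto files

Make sure to run this from the research folder:

protoc object_detection/protos/*.proto --python_out=.

When it finishes, check the object_detection/protos/ folder: if .py files have been generated there, compilation succeeded.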
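
The check described above can also be scripted. The sketch below relies on protoc writing a foo_pb2.py next to each foo.proto; the function name is hypothetical and the directory is whatever path your protos folder has.

```python
import os


def find_uncompiled_protos(protos_dir):
    """Return the .proto files in protos_dir that lack a matching *_pb2.py."""
    names = set(os.listdir(protos_dir))
    missing = []
    for name in sorted(names):
        if name.endswith(".proto"):
            compiled = name[:-len(".proto")] + "_pb2.py"
            if compiled not in names:
                missing.append(name)
    return missing
```

An empty list from find_uncompiled_protos("object_detection/protos") suggests every definition was compiled.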

3. Add Slim to PYTHONPATH

Run the following command:

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

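Here pwd stands for the path of your own research folder, so run the command from there. Afterwards, if import slim runs in Python without an error, the path is set up correctly.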
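
The export only affects the shell it is typed in. If the scripts are launched from an IDE instead, the same two directories can be added from inside Python. A minimal sketch, with a hypothetical helper name and the research path passed in by the caller:

```python
import os
import sys


def add_research_to_path(research_dir):
    """Put research_dir and research_dir/slim on sys.path for this process."""
    added = []
    for path in (research_dir, os.path.join(research_dir, "slim")):
        path = os.path.abspath(path)
        if path not in sys.path:
            sys.path.append(path)
            added.append(path)
    return added
```

For the layout used in this post the call would be add_research_to_path('/home/lf/models/research'), made before any object_detection import.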

4. Test the installation

From the research folder, run:

python object_detection/builders/model_builder_test.py

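This command checks whether the API is installed correctly. If the following output appears, the installation succeeded.

This post uses the GPU build of TensorFlow 1.9.0 and protoc 3.4.

2. Train a new model

Label the dataset with the annotation tool labelImg, which produces XML files. Create an images folder under research/object_detection and put the labeled data into its train and test subfolders.

Next, collect all the XML files into CSV files with the Python code below. Paste the code into a Python file named xml_to_csv.py and run python xml_to_csv.py in a terminal, once per folder, to produce train.csv and test.csv. Remember to change the paths (and the output file name) in the script to match your own folders.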

import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

os.chdir('/home/lf/models/research/object_detection/images/test')
path = '/home/lf/models/research/object_detection/images/test'

def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    image_path = path
    xml_df = xml_to_csv(image_path)
    xml_df.to_csv('test.csv', index=None)  # use 'train.csv' when converting the train folder
    print('Successfully converted xml to csv.')


main()
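
The script above reads fields by position (member[0], member[4][0], and so on), which only works while every XML file keeps the same element order. As a hedged alternative, the sketch below looks the same fields up by tag name. It assumes the usual Pascal VOC layout (filename, size/width, size/height, object/name, object/bndbox); the sample annotation is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the Pascal VOC layout, for illustration only.
SAMPLE_XML = """
<annotation>
  <filename>img1.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>zjl</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) row per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append((filename, width, height, obj.findtext("name"),
                     int(box.findtext("xmin")), int(box.findtext("ymin")),
                     int(box.findtext("xmax")), int(box.findtext("ymax"))))
    return rows
```

The tuples have the same column order as the CSV header used above, so they could be appended to xml_list unchanged.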

Because the TensorFlow Object Detection API takes its input in TFRecord format, the CSV files have to be converted to record files. First copy the train.csv and test.csv generated above into /home/lf/models/research/object_detection/data. The conversion is done with the Python code below; paste it into a file named generate_TFR.py under /home/lf/models/research/object_detection.




import os
import io
import pandas as pd
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

os.chdir('/home/lf/models/research/object_detection')

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input file')
flags.DEFINE_string('output_path', '', 'Path to the output TFRecord file')
FLAGS = flags.FLAGS


# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'zjl':
        return 1
    elif row_label == 'cyx':
        return 2
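    else:
        # Fail early on a label that is not in the mapping above.
        raise ValueError('Unknown label: {}'.format(row_label))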


def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    # Folder holding the images listed in the CSV (modified 2018-04-18).
    # Point this at images/train when converting train.csv.
    path = '/home/lf/models/research/object_detection/images/test'
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())

    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))


if __name__ == '__main__':
    tf.app.run()
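
Two steps in the script are easy to miss: rows are grouped so that each image yields one record, and pixel coordinates are divided by the image size so that boxes are stored in the [0, 1] range. The sketch below reproduces both ideas with the standard library only. It assumes Python 3 division, and the function names are invented for illustration.

```python
from collections import OrderedDict


def group_rows_by_filename(rows):
    """Collect CSV rows (dicts) per image, keeping first-seen file order."""
    grouped = OrderedDict()
    for row in rows:
        grouped.setdefault(row["filename"], []).append(row)
    return grouped


def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Scale pixel box corners to fractions of the image width and height."""
    return (xmin / width, ymin / height, xmax / width, ymax / height)
```

For a 640x480 image, a box from (160, 120) to (480, 360) becomes (0.25, 0.25, 0.75, 0.75).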

In a terminal at /home/lf/models/research/object_detection, enter the following commands.

To convert train.csv:
python generate_TFR.py --csv_input=data/train.csv --output_path=data/train.record
To convert test.csv:
python generate_TFR.py --csv_input=data/test.csv --output_path=data/test.record

Once the conversion succeeds, data preparation is complete.
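
Bad annotations tend to surface only as training errors, so it can help to sanity-check the CSV files as well. The sketch below is not part of the original workflow. It assumes the column names written by the XML-to-CSV script above and flags boxes that are inverted or fall outside the image.

```python
import csv


def find_bad_boxes(csv_path):
    """Return (line_number, reason) pairs for rows whose box is not valid."""
    problems = []
    with open(csv_path, newline="") as handle:
        # Line 1 is the header, so data rows start at line 2.
        for line_number, row in enumerate(csv.DictReader(handle), start=2):
            width, height = int(row["width"]), int(row["height"])
            xmin, ymin = int(row["xmin"]), int(row["ymin"])
            xmax, ymax = int(row["xmax"]), int(row["ymax"])
            if xmin >= xmax or ymin >= ymax:
                problems.append((line_number, "empty or inverted box"))
            elif xmin < 0 or ymin < 0 or xmax > width or ymax > height:
                problems.append((line_number, "box outside image"))
    return problems
```

Running find_bad_boxes('data/train.csv') and getting an empty list means no row tripped either check.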

Choose a model

In the /home/lf/models/research/object_detection/samples/configs folder, pick ssd_mobilenet_v1_coco.config, open it, and copy its contents into a new file that is also named ssd_mobilenet_v1_coco.config. Then create a folder named trainings under /home/lf/models/research/object_detection and put the new ssd_mobilenet_v1_coco.config in it, as shown in the figure below.

Five places in ssd_mobilenet_v1_coco.config need to be changed:

1. train_input_reader: {
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record"
  }
  label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
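Here input_path is the path to the training data; change it to your own path, in this case input_path: "data/train.record"
Here label_map_path is the path to the label map, in this case label_map_path: "data/ZJL_CYX.pbtxt"
2. eval_input_reader: {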
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record"
  }
  label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
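Here input_path is the path to the evaluation data; change it to your own path, in this case input_path: "data/test.record"
Here label_map_path is the path to the label map, in this case label_map_path: "data/ZJL_CYX.pbtxt"
3. ssd {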
    num_classes: 90
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
num_classes is the number of label classes. There are only two here, Jay Chou and Eason Chan, so num_classes: 2
4. train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
}
batch_size is the number of samples per iteration; I set it to 2 here.
5. Comment out fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"
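
The five edits can also be applied with a short script instead of by hand. This is only a sketch: it treats the config as plain text, assumes the placeholder strings look exactly like the excerpts above, and rewrites every num_classes and batch_size it finds. The function name and its parameters are hypothetical.

```python
import re


def customize_pipeline_config(text, train_record, eval_record, label_map,
                              num_classes, batch_size):
    """Apply the five edits listed above to the text of a pipeline config."""
    # Items 1 and 2: input paths, matched on the placeholder file names.
    text = text.replace('"PATH_TO_BE_CONFIGURED/mscoco_train.record"',
                        '"%s"' % train_record)
    text = text.replace('"PATH_TO_BE_CONFIGURED/mscoco_val.record"',
                        '"%s"' % eval_record)
    # Items 1 and 2: the label map path appears in both readers.
    text = re.sub(r'label_map_path:\s*"[^"]*"',
                  lambda m: 'label_map_path: "%s"' % label_map, text)
    # Item 3: number of classes.
    text = re.sub(r"num_classes:\s*\d+", "num_classes: %d" % num_classes, text)
    # Item 4: batch size.
    text = re.sub(r"batch_size:\s*\d+", "batch_size: %d" % batch_size, text)
    # Item 5: comment out the fine-tune checkpoint line.
    text = re.sub(r"(?m)^([ \t]*)(fine_tune_checkpoint:)", r"\1# \2", text)
    return text
```

A typical use would read trainings/ssd_mobilenet_v1_coco.config, pass the text through the function with data/train.record, data/test.record, data/ZJL_CYX.pbtxt, 2 and 2, and write the result back.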

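The data/ZJL_CYX.pbtxt file used in items 1 and 2 above has to be created by hand; the easiest way is to copy an existing .pbtxt file and rename it.

Open the file and change its contents to: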

item {
name: "ZJL"
id: 1
display_name: "ZJL"
}
item {
name: "CYX"
id: 2
display_name: "CYX"
}
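
The TO-DO comment in generate_TFR.py suggests replacing the hard-coded class_text_to_int with the label map. One rough way to do that, without depending on the API's own label map utilities, is to pull the name/id pairs out of the file with regular expressions. The sketch below assumes the simple item layout shown above; read_label_map is a made-up name.

```python
import re


def read_label_map(pbtxt_text):
    """Return {name: id} for every item block in a label map."""
    mapping = {}
    for block in re.findall(r"item\s*\{(.*?)\}", pbtxt_text, flags=re.DOTALL):
        # \b keeps 'name:' from matching inside 'display_name:'.
        name = re.search(r'\bname:\s*"([^"]*)"', block)
        item_id = re.search(r"\bid:\s*(\d+)", block)
        if name and item_id:
            mapping[name.group(1)] = int(item_id.group(1))
    return mapping
```

Note that the names here are upper case while class_text_to_int compares against 'zjl' and 'cyx', so a lookup built from this file would need matching case, for example by lower-casing both sides.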

Configuration is now complete; training can begin.

Start training the model

Open a terminal and, in the research/object_detection folder, run python train.py --train_dir=trainings/ --pipeline_config_path=trainings/ssd_mobilenet_v1_coco.config. Training starts; stop it after 200,000 steps.

To export the model, run the following in the research/object_detection folder: python export_inference_graph.py --input_type image_tensor --pipeline_config_path trainings/ssd_mobilenet_v1_coco.config --trained_checkpoint_prefix trainings/model.ckpt-200000 --output_directory zjl__cyx__inference__graph
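
The export command needs the exact checkpoint prefix, and the step number will differ if training is stopped at another point. The sketch below picks the highest-numbered checkpoint in the training folder, assuming the files follow the model.ckpt-N.index naming used by TensorFlow 1.x checkpoints. The helper name is hypothetical; TensorFlow's own tf.train.latest_checkpoint serves a similar purpose.

```python
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the 'model.ckpt-N' prefix with the largest N in train_dir, or None."""
    best_step = None
    for name in os.listdir(train_dir):
        match = re.match(r"model\.ckpt-(\d+)\.index$", name)
        if match:
            step = int(match.group(1))
            if best_step is None or step > best_step:
                best_step = step
    if best_step is None:
        return None
    return os.path.join(train_dir, "model.ckpt-%d" % best_step)
```

The returned string can be passed as --trained_checkpoint_prefix in place of trainings/model.ckpt-200000.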

The model is now built; next, test it on your own images.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Oct 24 18:40:03 2018

@author: lf
"""

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
import matplotlib.pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops

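# Compare parsed versions; a plain string comparison would rank '1.10.0' below '1.4.0'.
from distutils.version import StrictVersion
if StrictVersion(tf.__version__) < StrictVersion('1.4.0'):
  raise ImportError('Please upgrade your tensorflow installation to v1.4.* or later!')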



#get_ipython().magic('matplotlib inline')




from utils import label_map_util

from utils import visualization_utils as vis_util




# Directory holding the inference graph exported by export_inference_graph.py.
MODEL_NAME = 'zjl__cyx__inference__graph'

PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'ZJL_CYX.pbtxt')  # same file name as in the config (case matters on Linux)

NUM_CLASSES = 2





detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')





label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)





def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)



PATH_TO_TEST_IMAGES_DIR = 'test_images'
#TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]
TEST_IMAGE_PATHS = os.listdir('/home/lf/models/research/object_detection/test_images')
os.chdir('/home/lf/models/research/object_detection/test_images')
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)





def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict


for image_path in TEST_IMAGE_PATHS:
  image = Image.open(image_path)
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=8)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(image_np)

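Put the code above in the /home/lf/models/research/object_detection directory, open Spyder, and run the program to get the results; remember to change the paths in the program. The result is shown in the figure.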
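
Besides the drawn image, the detections can be read out as plain values. The sketch below works on the output_dict and category_index built by the script above and keeps only detections at or above a score threshold. It assumes boxes are stored as normalized [ymin, xmin, ymax, xmax]; the function name is invented, and the 0.5 default mirrors the visualizer's default threshold.

```python
def confident_detections(output_dict, category_index, image_width, image_height,
                         min_score=0.5):
    """Return (class_name, score, (xmin, ymin, xmax, ymax) in pixels) tuples."""
    results = []
    for i in range(int(output_dict['num_detections'])):
        score = float(output_dict['detection_scores'][i])
        if score < min_score:
            continue
        # Boxes are normalized and ordered ymin, xmin, ymax, xmax.
        ymin, xmin, ymax, xmax = [float(v) for v in output_dict['detection_boxes'][i]]
        class_id = int(output_dict['detection_classes'][i])
        name = category_index.get(class_id, {}).get('name', str(class_id))
        results.append((name, score,
                        (int(xmin * image_width), int(ymin * image_height),
                         int(xmax * image_width), int(ymax * image_height))))
    return results
```

Inside the loop over test images, the width and height can be taken from image.size before calling it.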