Training a Mask R-CNN model in Python && calling the trained model from C++, based on OpenCV 4.0 (lots of practical content)

Latest recommended article published on 2024-09-21 10:19:16

Tom Hardy

Categories: Deep Learning, Computer Vision

Article link: https://blog.csdn.net/qq_29462849/article/details/85342153

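This post shows how to train a Mask R-CNN model with the Object Detection API and then call the trained model from C++ through OpenCV 4.0. First, the training data is created with labelme and converted to tfrecords format. Next, training starts from the mask_rcnn_inception_v2_coco model, and the training result is converted into .pb and .pbtxt files. Finally, C++ code is shown that calls the model for inference.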

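Introduction

My first post on training Mask R-CNN with your own data was based on Python code. It runs, but it can't really be used in engineering work, which is mostly done in C++ and C. Compiling the TensorFlow C++ API and calling the model through it is also an option, but it is a hassle; I tried it myself and it was not that friendly.

OpenCV 4.0, finally you're here~~~ OpenCV 4.0 already supports running Mask R-CNN: a .pb file and a .pbtxt file are all it needs for inference. Note that OpenCV still does not support training deep learning models, though I'm really looking forward to that day! Meanwhile, Google's Object Detection API now supports Mask R-CNN training too, and it supports generating the .pb and .pbtxt files, which can be passed straight to OpenCV. You'd almost think the two were working together.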
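To make the ".pb plus .pbtxt is all you need" point concrete, here is a rough Python sketch of what the inference call looks like through OpenCV's dnn module (the same calls exist in the C++ API). The function names `load_mask_rcnn`, `run_inference` and `filter_detections` are made up for this illustration, and the output layer names `detection_out_final` and `detection_masks` are the ones used in OpenCV's own Mask R-CNN sample, so treat them as an assumption for your exported model.

```python
def load_mask_rcnn(pb_path, pbtxt_path):
    """Load a frozen TensorFlow graph (.pb) and its text graph (.pbtxt)."""
    import cv2  # imported lazily so filter_detections works without OpenCV
    return cv2.dnn.readNetFromTensorflow(pb_path, pbtxt_path)


def run_inference(net, image):
    """Run one forward pass and return the raw (boxes, masks) outputs."""
    import cv2
    blob = cv2.dnn.blobFromImage(image, swapRB=True, crop=False)
    net.setInput(blob)
    # Assumed layer names, taken from OpenCV's Mask R-CNN sample.
    return net.forward(['detection_out_final', 'detection_masks'])


def filter_detections(boxes, image_width, image_height, score_threshold=0.5):
    """Keep detections above the threshold and scale boxes to pixels.

    Each detection row is assumed to be
    [batch_id, class_id, score, left, top, right, bottom],
    with the four coordinates normalized to the range [0, 1].
    """
    results = []
    for row in boxes[0][0]:
        class_id = int(row[1])
        score = float(row[2])
        if score < score_threshold:
            continue
        left = int(row[3] * image_width)
        top = int(row[4] * image_height)
        right = int(row[5] * image_width)
        bottom = int(row[6] * image_height)
        results.append((class_id, score, (left, top, right, bottom)))
    return results
```

With a real model you would call `load_mask_rcnn` once, then `run_inference` per image, and pass the first output to `filter_detections` together with the image width and height.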

This post trains Mask R-CNN with the Object Detection API and then calls the model from OpenCV, completing the move from Python to C++.

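Training data

The training data is made with labelme. I won't go into the tool here; there is plenty about it online, and my previous Mask R-CNN post covers it too. After annotating the images, you end up with json files.

Unlike the previous post, here you only need the original images and their corresponding json files. Because the Object Detection API requires training data in tfrecords format, the json files have to be converted into tfrecords files. Of course, the conversion code is provided here.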
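Before looking at the full conversion script, it may help to see what the conversion has to pull out of each labelme json file. The sketch below is only an illustration, not the author's code: the helper names `polygon_to_bbox` and `parse_labelme_annotation` are hypothetical, and it assumes the usual labelme layout where each entry in `shapes` has a `label` and a list of `[x, y]` polygon `points`.

```python
import json


def polygon_to_bbox(points):
    """Return (xmin, ymin, xmax, ymax) of a polygon given as [x, y] pairs."""
    xs = [point[0] for point in points]
    ys = [point[1] for point in points]
    return min(xs), min(ys), max(xs), max(ys)


def parse_labelme_annotation(json_text):
    """Extract class names and bounding boxes from labelme json text.

    Assumes the labelme layout: a top-level 'shapes' list whose items
    carry a 'label' string and a 'points' polygon.
    """
    data = json.loads(json_text)
    class_names = []
    boxes = []
    for shape in data.get('shapes', []):
        class_names.append(shape['label'])
        boxes.append(polygon_to_bbox(shape['points']))
    return class_names, boxes
```

The class names feed the label lookup, and the boxes become the `xmins`, `xmaxs`, `ymins` and `ymaxs` values; the polygons themselves are what the instance masks are rasterized from.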

The data.pbtxt file has the following format; write one item for each class you have.

item {
  id: 1
  name: 'tank'
}
item {
  id: 2
  name: 'white'
}
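The conversion script turns this file into a dictionary that maps class names to ids (it uses a `read_pbtxt_file` helper module whose source is not shown here). As a stand-in, this is a minimal regex-based sketch of that mapping step. The name `parse_label_map` is hypothetical, and it only handles the simple `id` / `name` layout shown above, not the full protobuf text format.

```python
import re

# Matches the body of each `item { ... }` block.
_ITEM_PATTERN = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)


def parse_label_map(text):
    """Map class names to integer ids for a simple label map file."""
    label_map = {}
    for body in _ITEM_PATTERN.findall(text):
        id_match = re.search(r"\bid\s*:\s*(\d+)", body)
        name_match = re.search(r"\bname\s*:\s*['\"]([^'\"]+)['\"]", body)
        if id_match and name_match:
            label_map[name_match.group(1)] = int(id_match.group(1))
    return label_map
```

For the two-class file above this yields a dictionary with `tank` mapped to 1 and `white` mapped to 2, which is the shape of the `label_map_dict` argument that `create_tf_example` expects.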

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sun Aug 26 10:57:09 2018

@author: shirhe-lyh
"""

"""Convert raw dataset to TFRecord for object_detection.

Please note that this tool only applies to labelme's annotations(json file).

Example usage:
    python3 create_tf_record.py \
        --images_dir=/absolute/path/to/images \
        --annotations_json_dir=/path/to/annotation/json/files \
        --label_map_path=/path/to/label_map.pbtxt \
        --output_path=/path/to/output.record
"""

import cv2
import glob
import hashlib
import io
import json
import numpy as np
import os
import PIL.Image
import tensorflow as tf

import read_pbtxt_file

flags = tf.app.flags

flags.DEFINE_string('images_dir', 'train_data/mask_data/train/images',
                    'Directory containing the training images.')
flags.DEFINE_string('annotations_json_dir', 'train_data/mask_data/train/json',
                    'Directory containing the labelme json annotation files.')
flags.DEFINE_string('label_map_path', 'train_data/mask_data/data.pbtxt',
                    'Path to the label map (.pbtxt) file.')
flags.DEFINE_string('output_path',
                    'train_data/mask_data/train/tf_record/train.record',
                    'Path of the TFRecord file to write.')

FLAGS = flags.FLAGS


def int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def int64_list_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))


def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def bytes_list_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=value))


def float_list_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))


def create_tf_example(annotation_dict, label_map_dict=None):
    """Converts image and annotations to a tf.Example proto.

    Args:
        annotation_dict: A dictionary containing the following keys:
            ['height', 'width', 'filename', 'sha256_key', 'encoded_jpg',
             'format', 'xmins', 'xmaxs', 'ymins', 'ymaxs', 'masks',
             'class_names'].
        label_map_dict: A dictionary mapping class_names to indices.

    Returns:
        example: The converted tf.Example.

    Raises:
        ValueError: If label_map_dict is None or does not contain a class_name.
    """
    if annotation_dict is None:
        return None
    if label_map_dict is None:
        raise ValueError('`label_map_dict` is None')

    height = annotation_dict.get('height', None)
    width = annotation_dict.get('width', None)
    filename = annotation_dict.get('filename', None)
    sha256_key = annotation_dict.get('sha256_key', None)
    encoded_jpg = annotation_dict.get('encoded_jpg', None)
    image_format = annotation_dict.get('format', None)
    xmins = annotation_dict.get('xmins', None)
    xmaxs = annotation_dict.get('xmaxs', None)
    ymins = annotation_dict.get('ymins', None)
    ymaxs = annotation_dict.get('ymaxs', None)
    masks = annotation_dict.get('masks', None)
    class_names = annotation_dict.get('class_names', None)

    labels = []
    for class_name in class_names:
        label = label_map_dict.get(class_name, None)
        if label is None:
            raise ValueError('`label_map_dict` does not contain {}.'.format(
                class_name))
        labels.append(label)

    encoded_masks = []
    for