Playing with Pretrained Object Detection Models (1): Using TensorFlow's Models Library

nerd呱呱

Article link: https://blog.csdn.net/qq_36285879/article/details/88829298

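Pretrained models

Reference blog:

Download link: https://github.com/tensorflow/models

We work with the models/research/object_detection part.

Downloading the code

Download the whole Models project, either directly from GitHub or with git: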

git clone https://github.com/tensorflow/models

The file we mainly use is models/research/object_detection/object_detection_tutorial.ipynb.

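Compiling protobuf

The Object Detection API uses Google Protocol Buffers (Protobuf) to define its model and training configuration, and the repository ships only the .proto definitions, so they have to be compiled before the library can be used. TensorFlow and Protobuf are both Google projects, so it is no surprise that TensorFlow relies on Protobuf. A convenient property of Protobuf is that one .proto definition can be compiled into code for several languages, including C++, Java, and Python, and its serialized format is compact.

Activate your conda environment as usual.

Then install the protobuf package: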

conda install protobuf

Once it is installed:

(Important things get said three times!!!)

Run the following in a terminal from the models/research directory!!!

protoc object_detection/protos/*.proto --python_out=.
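
A quick way to confirm the compile step worked is to check that every .proto file now has a generated module next to it. The sketch below assumes protoc's usual Python naming (name.proto becomes name_pb2.py); the helper name missing_pb2_modules is made up for illustration and is not part of the Object Detection API.

```python
from pathlib import Path


def missing_pb2_modules(protos_dir):
    """Return the .proto files in protos_dir that have no *_pb2.py beside them."""
    protos_dir = Path(protos_dir)
    missing = []
    for proto in sorted(protos_dir.glob("*.proto")):
        # protoc writes foo_pb2.py for foo.proto when --python_out is used.
        generated = proto.with_name(proto.stem + "_pb2.py")
        if not generated.exists():
            missing.append(proto.name)
    return missing
```

Called as missing_pb2_modules("object_detection/protos") from models/research, an empty list suggests every definition was compiled.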

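Running the code

I first tried running the .ipynb file, but the notebook reported that the kernel was restarting; it seemed to have crashed. So I moved the same code into PyCharm and got it working after a few changes.

matplotlib is set up differently in PyCharm than in Jupyter Notebook: add the following before from matplotlib import pyplot as plt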

import matplotlib
matplotlib.use("TkAgg")

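I also tweaked the image display part a little, which fixed the blank white window.

I created a directory called test_out_images to store the results, instead of using the show method.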
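
A small sketch of that saving step: create the output directory if it is missing and build a numbered file name for each result. The helper name output_path is hypothetical; test_out_images is the directory name used in this post.

```python
import os


def output_path(index, out_dir="test_out_images"):
    """Create out_dir if needed and return the path for the index-th result image."""
    os.makedirs(out_dir, exist_ok=True)  # no error if the directory already exists
    return os.path.join(out_dir, "image{}.jpg".format(index))
```

The returned path can be passed straight to plt.savefig.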

I also added console output to make the results easier to analyse.

The warnings that were printed go away after adding two lines:

# suppress the warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"

In addition, to see how fast mobilenet is, I added timing calls from the time library.
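
As a rough sketch of the timing step, the standard library's time.perf_counter can wrap any call. The helper timed below is hypothetical and not part of the tutorial. Note that in the full code in this post the per-image figure also covers loading the image, creating a tf.Session, and saving the plot, so it is not a pure inference time.

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; in the detection script this would be the inference call.
result, seconds = timed(sum, range(1000000))
print("result:", result, "took %.4f s" % seconds)
```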

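Full code

The code from object_detection_tutorial.ipynb, converted to run in PyCharm, is below. I also tried the methods from the reference link; in my tests method 1 worked and method 2 did not.

MODEL_NAME is the value to change: you can pick the name of any model listed at https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md. Of course, you can also skip the download step in the code and download the model into the same directory beforehand.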
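
One way to combine both options is to download only when the model is not already on disk. This is a minimal sketch, not the tutorial's own code: the function names are made up for illustration, and it assumes the archive layout used by the script in this post, where the archive contains MODEL_NAME/frozen_inference_graph.pb.

```python
import os
import tarfile
import urllib.request

DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
GRAPH_FILE = 'frozen_inference_graph.pb'


def extract_frozen_graph(tar_path, dest_dir):
    """Extract only the frozen graph from a model archive; return True if found."""
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if os.path.basename(member.name) == GRAPH_FILE:
                tar.extract(member, dest_dir)
                return True
    return False


def ensure_frozen_graph(model_name, work_dir='.'):
    """Return the frozen graph path, downloading and unpacking only when needed."""
    graph_path = os.path.join(work_dir, model_name, GRAPH_FILE)
    if os.path.exists(graph_path):
        return graph_path  # already unpacked, nothing to do
    tar_path = os.path.join(work_dir, model_name + '.tar.gz')
    if not os.path.exists(tar_path):
        # Network access happens only here, when no local archive exists.
        urllib.request.urlretrieve(DOWNLOAD_BASE + model_name + '.tar.gz', tar_path)
    if not extract_frozen_graph(tar_path, work_dir):
        raise FileNotFoundError(GRAPH_FILE + ' not found in ' + tar_path)
    return graph_path
```

With this in place, the value returned by ensure_frozen_graph(MODEL_NAME) could be used wherever the script expects the path to the frozen graph.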

import time

import matplotlib
matplotlib.use("TkAgg")

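import os

# Suppress TensorFlow's C++ log noise and the duplicate OpenMP library error.
# Set these before importing numpy and tensorflow so they reliably take effect.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"

import numpy as np
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile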


from distutils.version import StrictVersion
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops

if StrictVersion(tf.__version__) < StrictVersion('1.12.0'):
  raise ImportError('Please upgrade your TensorFlow installation to v1.12.*.')

# This is needed to display the images.
# %matplotlib inline

from utils import label_map_util
from utils import visualization_utils as vis_util

# What model to download.
MODEL_NAME = 'ssdlite_mobilenet_v2_coco_2018_05_09'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_FROZEN_GRAPH = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'mscoco_label_map.pbtxt')


print("downloading...")

opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())

print("end...")

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)

def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)

# For the sake of simplicity we will use only 2 images:
# image1.jpg
# image2.jpg
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test_images'
TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3) ]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
    return output_dict

print("test begin")
number = 1


for image_path in TEST_IMAGE_PATHS:
    time_start = time.time()

    print("image_path:",image_path)
    image = Image.open(image_path)
    fig = plt.figure()
    # ax = fig.add_subplot(121)
    # ax.imshow(image)


    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=8)

    ax = fig.add_subplot(111)
    ax.imshow(image_np)  # draw the annotated RGB image
    # plt.axis("off")  # hide the axis ticks
    os.makedirs("test_out_images", exist_ok=True)  # make sure the output directory exists
    plt.savefig("test_out_images/image"+str(number)+".jpg")
    plt.close(fig)  # release the figure so figures do not pile up across images
    number+=1
    # plt.show(image)  # show everything drawn so far

    # print the results
    print("results")
    print("----------.----------.----------.----------")
    # print(len(output_dict['detection_scores'])) #100
    for i in range(len(output_dict['detection_scores'])):
        if (output_dict['detection_scores'][i] == 0.0):
            continue
        print("box [%.2f,%.2f,%.2f,%.2f], class: %s, score %.2f"%(output_dict['detection_boxes'][i][0], output_dict['detection_boxes'][i][1], output_dict['detection_boxes'][i][2], output_dict['detection_boxes'][i][3]
                                                         , category_index[output_dict['detection_classes'][i]]
                                                         , output_dict['detection_scores'][i]))
    time_end = time.time()
    print('totally cost', time_end - time_start)

print("test end")
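
To make the console output easier to analyse, the detections can be filtered by score and the boxes converted to pixels. The sketch below uses plain Python lists in place of the NumPy arrays in output_dict, and assumes the Object Detection API's normalized [ymin, xmin, ymax, xmax] box order. The function name and the 0.5 threshold are illustrative choices, not taken from the code above.

```python
def confident_detections(boxes, classes, scores, width, height, min_score=0.5):
    """Keep detections scoring at least min_score, with boxes in pixel units.

    boxes holds normalized [ymin, xmin, ymax, xmax] values in the range 0..1.
    Returns a list of (class_id, score, (left, top, right, bottom)) tuples.
    """
    kept = []
    for box, class_id, score in zip(boxes, classes, scores):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        pixel_box = (int(round(xmin * width)), int(round(ymin * height)),
                     int(round(xmax * width)), int(round(ymax * height)))
        kept.append((int(class_id), float(score), pixel_box))
    return kept


# Made-up values for illustration only.
boxes = [[0.1, 0.2, 0.5, 0.6], [0.0, 0.0, 1.0, 1.0]]
classes = [1, 18]
scores = [0.9, 0.3]
for class_id, score, pixel_box in confident_detections(boxes, classes, scores, 640, 480):
    print(class_id, "%.2f" % score, pixel_box)
```

In the script above, the three lists would correspond to output_dict['detection_boxes'], output_dict['detection_classes'] and output_dict['detection_scores'], with width and height taken from image.size.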