TensorFlow Study Notes: Learning to Use the Object Detection API

darkeyers

Article link: https://blog.csdn.net/darkeyers/article/details/80245119

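Before you start: please confirm a few things before working through the Object Detection API.

1. The machine used for detection can run the demo for TensorFlow 1.4 or later without problems (the Jupyter notebook demo that you execute with Run All).

A small gripe: the demo downloads the model before every detection run, and downloading it again when it is already present raises an error (Python 2.7).

2. The machine has the tensorflow-gpu build installed. Running on the CPU also works, but it is very slow.

3. On Linux, make sure the machine stays stable (no power loss) during training. TensorFlow is a greedy framework that squeezes every bit of compute out of the GPU, and a power cut during training is quite likely to break the GPU driver, which then has to be reinstalled.

If you need to reinstall the GPU driver, see this guide.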
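
The first two checks above can be scripted. The following is only a sketch: `version_at_least` and `check_environment` are hypothetical helper names (not part of TensorFlow), and `check_environment` assumes an install where `tf.test.is_gpu_available()` is present, as it is in TensorFlow 1.x.

```python
import re


def version_at_least(version, minimum=(1, 4)):
    """Return True if a 'major.minor...' version string is >= minimum."""
    match = re.match(r'(\d+)\.(\d+)', version)
    if match is None:
        raise ValueError('unrecognized version string: {!r}'.format(version))
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


def check_environment():
    """Print the installed TensorFlow version and whether a GPU is visible."""
    # Imported lazily so version_at_least() can be used without TensorFlow.
    import tensorflow as tf
    print('TensorFlow {} (>= 1.4: {})'.format(
        tf.__version__, version_at_least(tf.__version__)))
    print('GPU available: {}'.format(tf.test.is_gpu_available()))
```

Calling `check_environment()` once before training is a quick way to catch a CPU-only install early.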

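Part 1: Data preparation

Because I was too lazy to write my own TFRecord conversion script, I used create_pascal_tf_record.py from the

model\models-master\research\object_detection\dataset_tools

folder: make the modifications you need, then run it.

The images and XML annotations follow the PASCAL VOC format.

Note: make sure the image sizes stay within a certain range, otherwise the script reports an error.

This produces the datasets pascal_train.record and pascal_val.record.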
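
Since badly sized images only surface as errors during conversion, it can help to scan the annotations first. The sketch below reads one PASCAL VOC style XML annotation and reports problems. The helper name `check_annotation` and the limits `min_side=32` and `max_side=4096` are assumptions for illustration, not values taken from the conversion script.

```python
import xml.etree.ElementTree as ET

# A small PASCAL VOC style annotation used only for illustration.
SAMPLE = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""


def check_annotation(xml_text, min_side=32, max_side=4096):
    """Return a list of human-readable problems found in one annotation."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext('size/width'))
    height = int(root.findtext('size/height'))
    problems = []
    # Hypothetical size limits: adjust them to whatever your pipeline accepts.
    if min(width, height) < min_side or max(width, height) > max_side:
        problems.append('image size {}x{} outside [{}, {}]'.format(
            width, height, min_side, max_side))
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        xmin, ymin, xmax, ymax = (
            int(float(box.findtext(tag)))
            for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        # Every box should be non-empty and lie inside the image.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append('box for {!r} is outside the image'.format(
                obj.findtext('name')))
    return problems
```

Running `check_annotation` over every XML file before the conversion gives a list of files to fix or drop.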

Part 2: Model selection

research/object_detection/g3doc/detection_model_zoo.md

Pick the model you need from this list and download it.

Part 3: Training

Take the .config file that matches the chosen model.

Edit the following fields in it: num_classes, fine_tune_checkpoint, input_path, and label_map_path.
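
These edits can be made by hand, but a small text-based helper shows what changes. The sketch below is an illustration only: `set_field` is a hypothetical helper, `SAMPLE_CONFIG` is a trimmed-down fragment rather than a complete pipeline config, and the values passed in (20 classes, the checkpoint path) are example assumptions.

```python
import re

# Trimmed-down illustrative fragment, not a complete pipeline config.
SAMPLE_CONFIG = """
model {
  ssd {
    num_classes: 90
  }
}
train_config {
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
}
train_input_reader {
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/train.record"
  }
  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.pbtxt"
}
eval_input_reader {
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/val.record"
  }
  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.pbtxt"
}
"""


def set_field(config_text, key, *values):
    """Set successive `key: ...` lines to successive values.

    If there are more occurrences than values, the last value is reused.
    """
    if not values:
        raise ValueError('at least one value is required')
    pattern = re.compile(
        r'^([ \t]*' + re.escape(key) + r'[ \t]*:[ \t]*).*$', re.MULTILINE)
    queue = list(values)

    def replace(match):
        value = queue.pop(0) if len(queue) > 1 else queue[0]
        # Strings are quoted, numbers are written as-is.
        rendered = '"{}"'.format(value) if isinstance(value, str) else str(value)
        return match.group(1) + rendered

    return pattern.sub(replace, config_text)


updated = set_field(SAMPLE_CONFIG, 'num_classes', 20)
updated = set_field(updated, 'fine_tune_checkpoint', 'pretrained/model.ckpt')
# First occurrence is the training reader, second is the evaluation reader.
updated = set_field(updated, 'input_path', 'pascal_train.record', 'pascal_val.record')
updated = set_field(updated, 'label_map_path', 'data/pascal_label_map.pbtxt')
```

Working on plain text keeps the sketch free of any dependency on the API's protobuf classes, at the cost of not validating the result.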

You can put the training command in a shell script, train.sh:

mkdir -p logs/
now=$(date +"%Y%m%d_%H%M%S")
python ../../train.py \
    --logtostderr \
    --pipeline_config_path=ssd_mobilenet_v1.config \
    --train_dir=train_logs 2>&1 | tee logs/train_$now.txt &

Then just run it.

You can view the training logs in TensorBoard at the same time:

tensorboard --logdir train_logs/

Part 4: Testing

When training finishes, the checkpoint files are stored in train_logs:

graph.pbtxt
model.ckpt-200000.data-00000-of-00001
model.ckpt-200000.index
model.ckpt-200000.meta

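The .meta file stores the graph and its metadata, and the model.ckpt data files store the network weights. Export an inference graph from the checkpoint: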

mkdir -p output
CUDA_VISIBLE_DEVICES="1" python ../../export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path ssd_mobilenet_v1.config \
    --trained_checkpoint_prefix train_logs/model.ckpt-200000 \
    --output_directory output/

Here 200000 is the number of training steps at which the checkpoint was saved.
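
If training was stopped early, the step number in the prefix will differ. The sketch below picks the newest prefix from a list of file names by looking at the `.meta` files. `latest_checkpoint_prefix` is a hypothetical helper written for this note. TensorFlow 1.x also provides `tf.train.latest_checkpoint`, which should serve the same purpose when TensorFlow is importable.

```python
import re


def latest_checkpoint_prefix(filenames, directory='train_logs'):
    """Return the checkpoint prefix with the highest step, or None."""
    steps = []
    for name in filenames:
        match = re.match(r'model\.ckpt-(\d+)\.meta$', name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return '{}/model.ckpt-{}'.format(directory, max(steps))


# Example listing, in the style of os.listdir('train_logs').
files = [
    'graph.pbtxt',
    'model.ckpt-150000.meta',
    'model.ckpt-200000.data-00000-of-00001',
    'model.ckpt-200000.meta',
]
prefix = latest_checkpoint_prefix(files)
```

The returned string is the value to pass to `--trained_checkpoint_prefix`.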

Testing methods:

1. Run object_detection_tutorial.ipynb after changing the model path and the path to the training output.

2. Write a script:

import sys
sys.path.append('..')
import os
import time
import tensorflow as tf
import numpy as np
from PIL import Image
from matplotlib import pyplot as plt
from utils import label_map_util
from utils import visualization_utils as vis_util
if len(sys.argv) < 3:
    print('Usage: python {} test_image_path checkpoint_path'.format(sys.argv[0]))
    sys.exit(1)
PATH_TEST_IMAGE = sys.argv[1]
PATH_TO_CKPT = sys.argv[2]
PATH_TO_LABELS = 'data/pascal_label_map.pbtxt'
NUM_CLASSES = 21
IMAGE_SIZE = (18, 12)
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
with detection_graph.as_default():
    with tf.Session(graph=detection_graph, config=config) as sess:
        start_time = time.time()
        print(time.ctime())
        # Force 3 channels: image_tensor expects an array of shape [1, H, W, 3].
        image = Image.open(PATH_TEST_IMAGE).convert('RGB')
        image_np = np.array(image).astype(np.uint8)
        image_np_expanded = np.expand_dims(image_np, axis=0)
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        scores = detection_graph.get_tensor_by_name('detection_scores:0')
        classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        (boxes, scores, classes, num_detections) = sess.run(
            [boxes, scores, classes, num_detections],
            feed_dict={image_tensor: image_np_expanded})
        print('{} elapsed time: {:.3f}s'.format(time.ctime(), time.time() - start_time))
        vis_util.visualize_boxes_and_labels_on_image_array(
            image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores),
            category_index, use_normalized_coordinates=True, line_thickness=8)
        plt.figure(figsize=IMAGE_SIZE)
        plt.imshow(image_np)
        # Without show() nothing is displayed when this runs as a script.
        plt.show()

Then just run it:

python infer.py \
    ../test_images/image1.jpg \
    ssd_mobilenet/output/frozen_inference_graph.pb

The code for batch testing is as follows:

import sys
sys.path.append('..')
import os
import json
import time
import tensorflow as tf
import numpy as np
from PIL import Image
from utils import label_map_util
if len(sys.argv) < 5:
    print('Usage: python {} output_json_path checkpoint_path test_ids_path image_dir'.format(sys.argv[0]))
    sys.exit(1)
PATH_OUTPUT = sys.argv[1]
PATH_TO_CKPT = sys.argv[2]
PATH_TEST_IDS = sys.argv[3]
DIR_IMAGE = sys.argv[4]
PATH_TO_LABELS = 'data/pascal_label_map.pbtxt'
NUM_CLASSES = 21
def get_results(boxes, classes, scores, category_index, im_width, im_height,
    min_score_thresh=.5):
    bboxes = list()
    for i, box in enumerate(boxes):
        if scores[i] > min_score_thresh:
            ymin, xmin, ymax, xmax = box
            bbox = {
                'bbox': {
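                    # Cast to plain float so json.dump accepts the values
                    # regardless of the NumPy version.
                    'xmax': float(xmax * im_width),
                    'xmin': float(xmin * im_width),
                    'ymax': float(ymax * im_height),
                    'ymin': float(ymin * im_height)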
                },
                'category': category_index[classes[i]]['name'],
                'score': float(scores[i])
            }
            bboxes.append(bbox)
    return bboxes
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
test_ids = [line.split()[0] for line in open(PATH_TEST_IDS)]
total_time = 0
test_annos = dict()
flag = False
with detection_graph.as_default():
    with tf.Session(graph=detection_graph, config=config) as sess:
        for image_id in test_ids:
            image_path = os.path.join(DIR_IMAGE, image_id + '.jpg')
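            # Force 3 channels: image_tensor expects an array of shape [1, H, W, 3].
            image = Image.open(image_path).convert('RGB')
            image_np = np.array(image).astype(np.uint8)
            # NumPy image arrays are ordered (height, width, channels).
            im_height, im_width, _ = image_np.shape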
            image_np_expanded = np.expand_dims(image_np, axis=0)
            image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
            boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
            scores = detection_graph.get_tensor_by_name('detection_scores:0')
            classes = detection_graph.get_tensor_by_name('detection_classes:0')
            num_detections = detection_graph.get_tensor_by_name('num_detections:0')
            start_time = time.time()
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes, scores, classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})
            end_time = time.time()
            print('{} {} {:.3f}s'.format(time.ctime(), image_id, end_time - start_time))
            if flag:
                total_time += end_time - start_time
            else:
                flag = True
            test_annos[image_id] = {'objects': get_results(
                np.squeeze(boxes), np.squeeze(classes).astype(np.int32), np.squeeze(scores), category_index,
                im_width, im_height)}
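# The first image is a warm-up run and is excluded from total_time.
timed_images = max(len(test_ids) - 1, 1)
print('total time: {}, total images: {}, average time: {}'.format(
    total_time, len(test_ids), total_time / timed_images))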
test_annos = {'imgs': test_annos}
with open(PATH_OUTPUT, 'w') as fd:
    json.dump(test_annos, fd)

Run it:

python test.py \
    ssd_mobilenet/output/result_annos.json \
    ssd_mobilenet/output/frozen_inference_graph.pb \
    data/VOCdevkit/VOC2012/ImageSets/Main/train_val.txt \
    data/VOCdevkit/VOC2012/JPEGImages/
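
To get a quick overview of the output, the result file can be summarized per category. The sketch below assumes the JSON layout written by the batch script (`imgs` → image id → `objects`, each with `category` and `score`). `summarize` is a hypothetical helper and `SAMPLE` is made-up data for illustration, not real detection output.

```python
import json
from collections import Counter

# Made-up data in the same layout as result_annos.json.
SAMPLE = json.loads("""
{"imgs": {
  "2008_000001": {"objects": [
    {"bbox": {"xmin": 10.0, "ymin": 20.0, "xmax": 110.0, "ymax": 220.0},
     "category": "dog", "score": 0.91},
    {"bbox": {"xmin": 5.0, "ymin": 8.0, "xmax": 60.0, "ymax": 90.0},
     "category": "person", "score": 0.42}
  ]},
  "2008_000002": {"objects": [
    {"bbox": {"xmin": 30.0, "ymin": 40.0, "xmax": 130.0, "ymax": 140.0},
     "category": "dog", "score": 0.77}
  ]}
}}
""")


def summarize(annos, min_score=0.5):
    """Count detections per category, keeping only scores >= min_score."""
    counts = Counter()
    for entry in annos['imgs'].values():
        for obj in entry['objects']:
            if obj['score'] >= min_score:
                counts[obj['category']] += 1
    return counts


counts = summarize(SAMPLE)
```

For a real run, load the file with `json.load(open(path))` in place of `SAMPLE`.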
