Using the TensorFlow Object Detection API to Recognize Your Own Images

1. Installing the TensorFlow Object Detection API

The TensorFlow Object Detection API lives in the https://github.com/tensorflow/models repository and can be downloaded with git:

git clone https://github.com/tensorflow/models.git

Since GitHub's servers are overseas, downloads can be slow; you can edit your hosts file to speed things up (see my troubleshooting notes for details).

The installation steps are as follows (all paths are relative to the research folder):

1.1 Install protobuf

sudo pip install protobuf

By default, pip installs a protobuf build that matches your Python version.
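
If you want to double-check what was installed, printing the package version from Python is a quick sanity check:

python -c "import google.protobuf; print(google.protobuf.__version__)"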

1.2 Compile the proto files

Run the following command inside the research folder. If every proto file in the object_detection/protos/ folder now has a corresponding .py code file, the compilation succeeded.

# From models/research
protoc object_detection/protos/*.proto --python_out=.
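
As a rough check, you can count the generated files (a minimal sketch, run from research/; protoc's Python plugin names its outputs *_pb2.py):

import glob
# Compare the number of source .proto files with the number of compiled modules.
print('{} proto files, {} compiled _pb2.py files'.format(
    len(glob.glob('object_detection/protos/*.proto')),
    len(glob.glob('object_detection/protos/*_pb2.py'))))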

1.3 Add Slim to PYTHONPATH

Because slim is needed, it must be added to Python's search path for anything to run properly. Still in the research folder, execute:

export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
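
Note that export only lasts for the current shell session; append the line to ~/.bashrc if you want it to persist. A quick import check (assuming the standard repository layout, where nets is a package inside slim):

python -c "import object_detection, nets; print('OK')"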

1.4 Test the installation

From research, run:

python object_detection/builders/model_builder_test.py

On success, the terminal will show output like this:

.......
----------------------------------------------------------------------
Ran 7 tests in 0.163s

OK


2. Training a New Model

Deep learning usually needs a sufficient number of training samples: the more data there is, and the more evenly the classes are distributed, the better the trained model performs. A sufficiently large training set is rarely available, however, so the data must be augmented (Data Augmentation). Common image augmentation methods include:

Translation: shift the image within a certain range

Rotation: rotate the image within a certain angle range

Flipping: flip the image horizontally or vertically

Cropping: cut a patch out of the original image

Scaling: enlarge or shrink the image within a certain range

Color jitter: apply changes in the image's RGB color space

Noise perturbation: add artificially generated noise to the image

If the original images are too large and the feature region is not prominent, cropping to highlight the feature region is worthwhile and also speeds up training. I cropped the original 1024×768-pixel images down to 520×120 pixels, shifted them left/right and up/down, added Gaussian noise, and applied histogram equalization, expanding the original dataset 12-fold.
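
For illustration, here is a minimal sketch of the shift, Gaussian-noise, and histogram-equalization operations using OpenCV and NumPy; the file name, shift amounts, and noise level are placeholder choices, not the exact values used above:

# -*- coding: utf-8 -*-
import cv2
import numpy as np

def shift(image, dx, dy):
    # Translate the image by (dx, dy) pixels, padding the border with black.
    rows, cols = image.shape[:2]
    m = np.float32([[1, 0, dx], [0, 1, dy]])
    return cv2.warpAffine(image, m, (cols, rows))

def add_gaussian_noise(image, sigma=10.0):
    # Add zero-mean Gaussian noise, then clip back to the valid pixel range.
    noise = np.random.normal(0.0, sigma, image.shape)
    return np.clip(image.astype(np.float64) + noise, 0, 255).astype(np.uint8)

def equalize(gray_image):
    # Histogram equalization (expects a single-channel uint8 image).
    return cv2.equalizeHist(gray_image)

img = cv2.imread('example.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder file name
cv2.imwrite('example_shift.jpg', shift(img, 20, 0))
cv2.imwrite('example_noise.jpg', add_gaussian_noise(img))
cv2.imwrite('example_eq.jpg', equalize(img))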

2.1 Prepare the dataset

This step uses the labelImg tool; install it by following a labelImg installation guide.

Then, in the labelImg-master folder, run:

 python labelImg.py

and annotate your images.

When you finish annotating, save each image's annotations as an XML file with the same name as the image.

Next, adapt two small Python scripts: the first collects the information from all the XML files in a folder into one .csv table (xml_to_csv.py); the second creates TFRecord files from that .csv table (generate_tfrecord.py). Remember to change the save paths.

xml_to_csv.py

# -*- coding: utf-8 -*-

import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

# Change the current working directory to the given path.
#os.chdir('/home/xuyifang/Desktop/API/research/data_prepare/train/')
os.chdir('/home/xuyifang/Desktop/image/')

# Image path
#path = '/home/xuyifang/Desktop/API/research/data_prepare/train'
path = '/home/xuyifang/Desktop/image/test'

def xml_to_csv(path):
    counter = 0
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        counter += 1
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),   # width
                     int(root.find('size')[1].text),   # height
                     member[0].text,                   # class name
                     int(member[4][0].text),           # xmin
                     int(member[4][1].text),           # ymin
                     int(member[4][2].text),           # xmax
                     int(member[4][3].text)            # ymax
                     )
            xml_list.append(value)

    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    print(xml_df)
    print('Processed {} XML files.'.format(counter))
    return xml_df


def main():
    image_path = path
    xml_df = xml_to_csv(image_path)
    xml_df.to_csv('arthritis_test.csv', index=None)
    print('Successfully converted xml to csv.')


main()

generate_tfrecord.py

# -*- coding: utf-8 -*-

"""
Usage:
  # From tensorflow/models/
  # Create train data:
python generate_tfrecord.py \
    --csv_input=/home/xuyifang/Desktop/API/research/data_prepare/train/arthritis_train.csv \
    --output_path=train.record

  # Create test data:
python generate_tfrecord.py \
    --csv_input=/home/xuyifang/Desktop/API/research/data_prepare/validation/arthritis_validation.csv \
    --output_path=validation.record
"""


import os
import io
import pandas as pd
import tensorflow as tf

from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict

# Image path
os.chdir('/home/xuyifang/Desktop/image/test/')

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


# TO-DO replace this with label map
# NOTE: change these labels to your own classes!
def class_text_to_int(row_label):
    if row_label == 'c0':
        return 1
    elif row_label == 'c1':
        return 2
    elif row_label == 'c2':
        return 3
    elif row_label == 'c3':
        return 4
    else:
        return None


def split(df, group):
    # Group the CSV rows by filename so each image becomes one tf.Example.
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():
        # Box coordinates are stored normalized to [0, 1].
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    counter = 0
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    # path = os.path.join(os.getcwd(), 'images')
    path = os.path.join(os.getcwd())
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        counter += 1
        print(group)
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())

    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))
    print('Processed {} image groups in total.'.format(counter))


if __name__ == '__main__':
    tf.app.run()

This finally produces the train.record and test.record files.
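
Before moving on, it is worth confirming that a .record file actually contains examples. A minimal sketch using the TF 1.x record iterator (adjust the file name to yours):

import tensorflow as tf

# Count the serialized examples written into the TFRecord file.
count = 0
for _ in tf.python_io.tf_record_iterator('train.record'):
    count += 1
print('train.record contains {} examples'.format(count))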

2.2 Configure the model and parameters

We obtained the training and test datasets in the previous step. Now add train.pbtxt and test.pbtxt under data_prepare, and ssd_inception_v2_coco.config under research (where the training command expects it), giving the final file structure below:

research/

    object_detection/

        arthritis_export/

    data_prepare/

        train.record

        train.csv

        test.record

        test.csv 

        train.pbtxt

        test.pbtxt

    train_dir/

    eval_dir/

    ssd_inception_v2_coco.config

 

Next, set up the configuration file. Go to the corresponding page of the Object Detection GitHub repository to find sample configuration files.

Taking ssd_inception_v2_coco.config as an example, it is configured as follows:

1. num_classes: change according to your task; here it is set to 4.

2. batch_size: can be increased a bit if your GPU is good enough.

3. num_steps: the total number of training steps; set to 250,000 here.

4. input_path: the path to the .record file; label_map_path: the path to the .pbtxt file. Be careful not to mix up the train and test entries.

5. num_examples: the number of images in the test set.

6. For better results, you can raise the following values; increase them only as far as GPU memory allows:

image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }

7. fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"
  from_detection_checkpoint: false     # originally true; you can also delete these two lines entirely

 

ssd_inception_v2_coco.config

# SSD with Inception v2 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 4
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
        reduce_boxes_in_lowest_layer: true
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 3
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_inception_v2'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
          anchorwise_output: true
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
#  keep_checkpoint_every_n_hours: 0.5  # add this line; adjust the interval as needed
  batch_size: 10
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.000556622
          decay_steps: 2000
          decay_factor: 0.98
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 250000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "/home/xuyifang/Desktop/API/research/data_prepare/arthritis_train.record"
  }
  label_map_path: "/home/xuyifang/Desktop/API/research/data_prepare/train.pbtxt"
}

eval_config: {
  num_examples: 572
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/home/xuyifang/Desktop/API/research/data_prepare_/arthritis_train.record"
  }
  label_map_path: "/home/xuyifang/Desktop/API/research/data_prepare_/train.pbtxt"
  shuffle: false
  num_readers: 1
  num_epochs: 1
}

Now, in the corresponding directory (/data_prepare), create the text files train.pbtxt and test.pbtxt (you can copy an existing file under another name and edit it with a text editor) and write in our labels. My example has four; note that the id numbers must stay consistent with those used when creating the CSV files, starting from 1.

item {
  id: 1
  name: 'c0'
}

item {
  id: 2
  name: 'c1'
}

item {
  id: 3
  name: 'c2'
}

item {
  id: 4
  name: 'c3'
}
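
A quick way to confirm the label map parses as expected is to load it with the API's own label_map_util (a minimal sketch; adjust the path to your train.pbtxt):

from object_detection.utils import label_map_util

label_map = label_map_util.load_labelmap('/home/xuyifang/Desktop/API/research/data_prepare/train.pbtxt')
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=4, use_display_name=True)
print(categories)  # should list ids 1-4 with names c0-c3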


2.3 Train the model

From the object_detection folder, run:

python train.py \
	--logtostderr \
	--train_dir=/home/xuyifang/Desktop/API/research/train_dir/ \
	--pipeline_config_path=/home/xuyifang/Desktop/API/research/ssd_inception_v2_coco.config

In version 2.0 of the TensorFlow Object Detection API, the training script has been replaced by model_main.py, so the command changes accordingly; you can also consult the training instructions provided with the API:

python object_detection/model_main.py \
    --pipeline_config_path=/home/xuyifang/Desktop/API2/research/object_detection/all/faster_rcnn_inception_v2_coco.config \
    --model_dir=/home/xuyifang/Desktop/API2/research/object_detection/all/train_dir \
    --num_train_steps=10000 \
    --alsologtostderr

Training has started once the terminal begins printing per-step loss output.


2.4 View training progress

The training logs and model checkpoints are saved in train_dir; you can monitor training with TensorBoard:

tensorboard --logdir=train_dir

 

Then open the address it prints in a browser such as Firefox to view the training curves.


2.5 View evaluation results

Running eval.py in the object_detection folder generates evaluation results in eval_dir, with mAP as the metric, which can be viewed with TensorBoard:

python eval.py \
        --logtostderr \
        --checkpoint_dir=/home/xuyifang/Desktop/API/research/train_dir/ \
        --eval_dir=/home/xuyifang/Desktop/API/research/eval_dir/ \
        --pipeline_config_path=/home/xuyifang/Desktop/API/research/ssd_inception_v2_coco.config

View them with TensorBoard:

tensorboard --logdir=eval_dir

The evaluation curves then show up in TensorBoard.


2.6 Export the model and predict on single images

From the object_detection folder, run:

python export_inference_graph.py \
	--input_type image_tensor \
	--pipeline_config_path /home/xuyifang/Desktop/API/research/ssd_inception_v2_coco.config \
	--trained_checkpoint_prefix /home/xuyifang/Desktop/API/research/train_dir/model.ckpt-250000 \
	--output_directory /home/xuyifang/Desktop/API/research/object_detection/arthritis_export/

In --trained_checkpoint_prefix /home/xuyifang/Desktop/API/research/train_dir/model.ckpt-250000, the 250000 is the number of training steps; it must be updated each time you export, and the previous export's output directory must be deleted first.

--output_directory /home/xuyifang/Desktop/API/research/object_detection/arthritis_export: change to your own chosen output folder name.
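
To avoid hard-coding the step number, you can look up the newest checkpoint programmatically; a small sketch using tf.train.latest_checkpoint, which reads the checkpoint index file in train_dir:

import tensorflow as tf

# Returns e.g. .../train_dir/model.ckpt-250000, usable as --trained_checkpoint_prefix.
ckpt = tf.train.latest_checkpoint('/home/xuyifang/Desktop/API/research/train_dir/')
print(ckpt)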

When it finishes, you will find a saved_model directory, a checkpoint file, .ckpt files, and a .pb file in the arthritis_export folder.



The last step is detection itself:

#Imports
import time
start = time.time()
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import cv2
 
from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image
 
  
os.chdir('/home/xuyifang/Desktop/API/research/object_detection')
  
#Env setup 
# This is needed to display the images.
#%matplotlib inline
 
# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
 
#Object detection imports
from utils import label_map_util
 
from utils import visualization_utils as vis_util
 
 
#Model preparation
# What model to download.
 
# This is the model we just trained
MODEL_NAME = '/home/xuyifang/Desktop/API/research/object_detection/arthritis_export'
 
# Location of the corresponding frozen model
# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'
 
# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('/home/xuyifang/Desktop/API/research/data_prepare', 'train.pbtxt')
 
# Change to the number of classes in your own example, 4 here
NUM_CLASSES = 4
 
#Load a (frozen) Tensorflow model into memory.    
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')    
    
#Loading label map
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)
 
 
#Helper code
def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)
 
 
#Detection
 
# If you want to test the code with your images, just add path to the images to the TEST_IMAGE_PATHS.
# Location of the test images
PATH_TO_TEST_IMAGES_DIR = '/home/xuyifang/Desktop/API/research/object_detection/test_images'
os.chdir(PATH_TO_TEST_IMAGES_DIR)
TEST_IMAGE_PATHS = os.listdir(PATH_TO_TEST_IMAGES_DIR)
 
# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
 
output_path = ('/home/xuyifang/Desktop/API/research/object_detection/test_out/')
 
 
with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    # Definite input and output Tensors for detection_graph
    image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
    # Each box represents a part of the image where a particular object was detected.
    detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
    # Each score represent how level of confidence for each of the objects.
    # Score is shown on the result image, together with the class label.
    detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
    detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
    num_detections = detection_graph.get_tensor_by_name('num_detections:0')
    for image_path in TEST_IMAGE_PATHS:

      # Read as grayscale, then expand back to 3 channels before feeding the network.
      image = cv2.imread(image_path, 0)
      image_RGB = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
      image_np = image_RGB

      # Expand dimensions to the [1, None, None, 3] shape the model expects.
      image_np_expanded = np.expand_dims(image_np, axis=0)

      (boxes, scores, classes, num) = sess.run(
        [detection_boxes, detection_scores, detection_classes, num_detections],
        feed_dict={image_tensor: image_np_expanded})

      vis_util.visualize_boxes_and_labels_on_image_array(
            image_np,
            np.squeeze(boxes),
            np.squeeze(classes).astype(np.int32),
            np.squeeze(scores),
            category_index,
            use_normalized_coordinates=True,
            line_thickness=8)
	
      cv2.imwrite(output_path + os.path.basename(image_path), image_np)
      
end =  time.time()
print("Execution Time: ", end - start)

Things you need to change:

os.chdir('/home/xuyifang/Desktop/API/research/object_detection')  # change to the object_detection directory

MODEL_NAME = '/home/xuyifang/Desktop/API/research/object_detection/mammary_export'  # the model we just trained

NUM_CLASSES = 4  # change to the number of classes in your own example, 4 here

PATH_TO_LABELS = os.path.join('/home/xuyifang/Desktop/API/research/data_prepare', 'train.pbtxt')   # location of the .pbtxt file
PATH_TO_TEST_IMAGES_DIR = '/home/xuyifang/Desktop/API/research/object_detection/test_images'  # location of the test images

output_path = ('/home/xuyifang/Desktop/API/research/object_detection/test_out/')  # location for the output images


The final output images have the detected boxes and class labels drawn on them.

 

Conclusion

Using the TensorFlow Object Detection API, we implemented object detection and classification with good results!

 


For problems encountered along the way, see my troubleshooting notes: https://blog.csdn.net/weixin_41997327/article/details/85232056

Please credit the source when reposting!

 


References: https://blog.csdn.net/dy_guox/article/details/79111949#commentsedit

《21个项目玩转深度学习》 (21 Projects to Master Deep Learning), by He Zhiyuan

 
