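0. Installing tensorflow-gpu on Ubuntu
For installing tensorflow-gpu with Anaconda on Ubuntu, see my earlier article "Install tensorflow-gpu under Anaconda on Ubuntu with a single command".
After installation, the default TensorFlow path is ~/anaconda3/lib/python3.5/site-packages/tensorflow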
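1. Object Detection API download location and directory layout
https://github.com/tensorflow/models
Downloading the repository gives you models-master.zip.
Unzip it, move it into the installed tensorflow folder, and rename models-master to models. The object_detection code then sits under models/research/object_detection.
The directory structure we need in the end is: ~/anaconda3/lib/python3.5/site-packages/tensorflow/models/research/object_detection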
2. Compiling the files under object_detection/protos with protoc
Check the protoc version on Ubuntu:
protoc --version
The TensorFlow Object Detection API requires protoc 2.6.0 or later, otherwise compilation fails. If your version is too old, upgrade it before compiling; mine is 3.6.0. The compile commands are:
cd ~/anaconda3/lib/python3.5/site-packages/tensorflow/models/research/
protoc object_detection/protos/*.proto --python_out=.
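The version check above can also be scripted. The following is a minimal sketch, assuming `protoc --version` prints text such as `libprotoc 3.6.0`; the helper names are my own, and the minimum version is simply the one stated in this article.

```python
import re
import shutil
import subprocess


def parse_protoc_version(output):
    """Extract (major, minor, patch) from text such as 'libprotoc 3.6.0'."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError("no version number in: {!r}".format(output))
    # A missing patch component (e.g. '25.1') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def protoc_is_new_enough(output, minimum=(2, 6, 0)):
    """Compare the parsed version against the minimum stated in this article."""
    return parse_protoc_version(output) >= minimum


if __name__ == "__main__":
    if shutil.which("protoc") is None:
        print("protoc not found on PATH")
    else:
        text = subprocess.check_output(["protoc", "--version"]).decode()
        verdict = "ok" if protoc_is_new_enough(text) else "too old"
        print(text.strip(), "->", verdict)
```

Tuples compare element by element, so (3, 6, 0) >= (2, 6, 0) behaves like a version comparison without any extra dependency.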
3. Adding environment variables
The tensorflow/models/research/ and tensorflow/models/research/slim directories must be added to the PYTHONPATH environment variable.
gedit ~/.bashrc  # open .bashrc
Add the following lines:
export PYTHONPATH="/home/leo/anaconda3/lib/python3.5/site-packages/tensorflow/models/research:$PYTHONPATH"
export PYTHONPATH="/home/leo/anaconda3/lib/python3.5/site-packages/tensorflow/models/research/slim:$PYTHONPATH"
source ~/.bashrc  # apply the change immediately
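To confirm that the two directories really ended up on PYTHONPATH, a small check like the one below may help. It is only a sketch: the helper name is hypothetical, and the install path is the one used in the export lines above, so adjust it to your machine.

```python
import os


def missing_entries(required_dirs, pythonpath):
    """Return the required directories that do not appear in a PYTHONPATH string."""
    present = {os.path.normpath(p) for p in pythonpath.split(os.pathsep) if p}
    return [d for d in required_dirs if os.path.normpath(d) not in present]


if __name__ == "__main__":
    # Assumed install location, copied from the export lines above.
    research = "/home/leo/anaconda3/lib/python3.5/site-packages/tensorflow/models/research"
    required = [research, os.path.join(research, "slim")]
    # An empty list means both directories are on PYTHONPATH.
    print(missing_entries(required, os.environ.get("PYTHONPATH", "")))
```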
4. Testing the Object Detection API
cd ~/anaconda3/lib/python3.5/site-packages/tensorflow/models/research
python object_detection/builders/model_builder_test.py
If the test reports OK, the setup is correct.
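5. Testing a pre-trained model
COCO is a dataset released by Microsoft for training image recognition models. The objects in its images are precisely segmented and localized, and the label map shipped with the API uses class IDs from 1 to 90. The Object Detection API provides five pre-trained models by default, all trained on COCO:
SSD + MobileNet
Inception V2 + SSD
ResNet101 + R-FCN
ResNet101 + Faster R-CNN
Inception-ResNet V2 + Faster R-CNN
The pre-trained models can be downloaded from:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
Create a folder named my_object_detection and set it up as follows:
- Create my_object_detection/models/ and unzip the downloaded ssd_mobilenet_v1_coco_2018_01_28 into it, giving my_object_detection/models/ssd_mobilenet_v1_coco_2018_01_28/.
- Create my_object_detection/images/ and, inside it, test_images/, giving my_object_detection/images/test_images/. Put two test images there named image1.jpg and image2.jpg (the names the script below expects), or copy the two images from the original object_detection/test_images/.
- Create my_object_detection/data/ and copy mscoco_label_map.pbtxt from the original object_detection/data/ into it.
- Create my_object_detection/leo_object_detection_APP_ssd_mobilenet_v1_coco_2018_01_28.py with the following content: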
#encoding:utf-8
import tensorflow as tf
import numpy as np
import os
from matplotlib import pyplot as plt
from PIL import Image
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_utils
# Directory of the downloaded model
MODEL_DIR = 'models/ssd_mobilenet_v1_coco_2018_01_28'
# Frozen graph file of the downloaded model
MODEL_CHECK_FILE = os.path.join(MODEL_DIR, 'frozen_inference_graph.pb')
# Label map that belongs to the dataset
MODEL_LABEL_MAP = os.path.join('data', 'mscoco_label_map.pbtxt')
# Number of classes in the dataset; open mscoco_label_map.pbtxt to check
MODEL_NUM_CLASSES = 90
# Collect the test image file names into a list
PATH_TO_TEST_IMAGES_DIR = 'images/test_images'
TEST_IMAGES_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(1, 3)]
# Size of the output figure, in inches
IMAGE_SIZE = (12, 8)
tf.reset_default_graph()
# Load the model into the default graph
with tf.gfile.GFile(MODEL_CHECK_FILE, 'rb') as fd:
    _graph = tf.GraphDef()
    _graph.ParseFromString(fd.read())
    tf.import_graph_def(_graph, name='')
# Load the COCO labels, converting the content of mscoco_label_map.pbtxt into the form
# {1: {'id': 1, 'name': u'person'}...90: {'id': 90, 'name': u'toothbrush'}}
label_map = label_map_util.load_labelmap(MODEL_LABEL_MAP)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=MODEL_NUM_CLASSES)
category_index = label_map_util.create_category_index(categories)
# Convert an image into a numpy array (assumes a 3-channel RGB image)
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
# Run the computation in the graph
detection_graph = tf.get_default_graph()
with tf.Session(graph=detection_graph) as sess:
    for image_path in TEST_IMAGES_PATHS:
        print(image_path)
        # Read the image
        image = Image.open(image_path)
        # Convert the image data into an array
        image_np = load_image_into_numpy_array(image)
        # Add a batch dimension
        image_np_expanded = np.expand_dims(image_np, axis=0)
        # The lines below fetch tensors from the model; use them as they are
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # All detection boxes
        boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Confidence score of each detection
        scores = detection_graph.get_tensor_by_name('detection_scores:0')
        # Class of each box
        classes = detection_graph.get_tensor_by_name('detection_classes:0')
        # Number of detections
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        # Run the computation
        (boxes, scores, classes, num_detections) = sess.run([boxes, scores, classes, num_detections],
                                                            feed_dict={image_tensor: image_np_expanded})
        # Print the detection results
        print(num_detections)
        print(boxes)
        print(classes)
        print(scores)
        # Draw the results onto the image
        vis_utils.visualize_boxes_and_labels_on_image_array(
            image_np,
            np.squeeze(boxes),
            np.squeeze(classes).astype(np.int32),
            np.squeeze(scores),
            category_index,
            use_normalized_coordinates=True,
            line_thickness=8
        )
        # Display
        plt.figure(figsize=IMAGE_SIZE)
        plt.imshow(image_np)
        plt.show()
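Result: the script prints the raw detections and displays each test image with the detected boxes drawn on it.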
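The printed arrays are easier to read once low-confidence detections are dropped. The sketch below assumes the output shapes returned by sess.run in the script above, namely boxes of shape (1, N, 4) in normalized [ymin, xmin, ymax, xmax] order and scores and classes of shape (1, N). The helper names and the 0.5 threshold are my own choices for illustration.

```python
import numpy as np


def filter_detections(boxes, scores, classes, min_score=0.5):
    """Keep only the detections whose score is at least min_score."""
    boxes = np.squeeze(boxes, axis=0)
    scores = np.squeeze(scores, axis=0)
    classes = np.squeeze(classes, axis=0).astype(np.int32)
    keep = scores >= min_score  # boolean mask over the N detections
    return boxes[keep], scores[keep], classes[keep]


def to_pixels(box, im_width, im_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to (left, top, right, bottom) pixels."""
    ymin, xmin, ymax, xmax = box
    return (int(xmin * im_width), int(ymin * im_height),
            int(xmax * im_width), int(ymax * im_height))
```

Inside the loop of the script this could be called as `filter_detections(boxes, scores, classes)` right after sess.run. The drawing function has its own `min_score_thresh` argument, so this filter only affects what you print or post-process yourself.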
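6. A first training run of your own
Download the dataset:
Download the PASCAL VOC 2012 dataset. It is a ready-made dataset of 17125 images; every image is annotated, and the annotations cover 20 classes such as people, animals, vehicles and furniture.
Unzip the dataset into my_object_detection/images/, giving my_object_detection/images/VOCdevkit/VOC2012/. The VOC2012/ directory contains five folders: JPEGImages holds all the images, and Annotations holds the object annotations for each image.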
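Before converting anything, it can be worth counting how many annotated objects each class has. The sketch below assumes the standard VOC annotation layout, where each object element has a name child; the function name is hypothetical and the path comes from the directory layout just described.

```python
import collections
import glob
import os
import xml.etree.ElementTree as ET


def count_objects_per_class(annotations_dir):
    """Count <object> entries per class name across all XML files in a directory."""
    counts = collections.Counter()
    for xml_file in glob.glob(os.path.join(annotations_dir, "*.xml")):
        root = ET.parse(xml_file).getroot()
        for obj in root.findall("object"):
            counts[obj.findtext("name")] += 1
    return counts


if __name__ == "__main__":
    # Run from my_object_detection/; prints an empty Counter if the folder is missing.
    print(count_objects_per_class("images/VOCdevkit/VOC2012/Annotations"))
```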
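Convert the data format:
TensorFlow training expects data in TFRecord format, so the PASCAL VOC 2012 dataset has to be converted to TFRecord before training. The conversion script lives in the API's object_detection/dataset_tools/ folder. It needs the label map, so copy pascal_label_map.pbtxt into my_object_detection/data/ first (see the label map step below).
Create my_object_detection/create_pascal_tf_record_train.sh with the following content:
python ~/anaconda3/lib/python3.5/site-packages/tensorflow/models/research/object_detection/dataset_tools/create_pascal_tf_record.py --data_dir=images/VOCdevkit/ --year=VOC2012 --label_map_path=data/pascal_label_map.pbtxt --output_path=images/VOCdevkit/pascal_train.record --set=train
Create my_object_detection/create_pascal_tf_record_val.sh with the following content:
python ~/anaconda3/lib/python3.5/site-packages/tensorflow/models/research/object_detection/dataset_tools/create_pascal_tf_record.py --data_dir=images/VOCdevkit/ --year=VOC2012 --label_map_path=data/pascal_label_map.pbtxt --output_path=images/VOCdevkit/pascal_val.record --set=val
From my_object_detection/, run bash create_pascal_tf_record_train.sh
Then run bash create_pascal_tf_record_val.sh
If pascal_train.record and pascal_val.record appear in my_object_detection/images/VOCdevkit/, the conversion succeeded.
Download the pre-trained model:
Download the faster_rcnn_inception_resnet_v2_atrous_coco model and unzip it into my_object_detection/models/.
Bring in the label map (pbtxt file) for PASCAL VOC 2012:
Copy pascal_label_map.pbtxt from the original object_detection/data/ into my_object_detection/data/.
Configure the training parameters:
Create my_object_detection/training_config/, copy the original object_detection/samples/config/faster_rcnn_inception_resnet_v2_atrous_coco.config into it, rename the copy to faster_rcnn_inception_resnet_v2_atrous_coco_voc2012.config, and edit it:
Change num_classes: 90 to num_classes: 20
Change num_examples: 8000 to num_examples: 5823
Replace the 5 occurrences of PATH_TO_BE_CONFIGURED with the corresponding paths
My final config is: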
# Faster R-CNN with Inception Resnet v2, Atrous version;
# Configured for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.
model {
  faster_rcnn {
    num_classes: 20
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_resnet_v2'
      first_stage_features_stride: 8
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 8
        width_stride: 8
      }
    }
    first_stage_atrous_rate: 2
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 17
    maxpool_kernel_size: 1
    maxpool_stride: 1
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}
train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "./models/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "./images/VOCdevkit/pascal_train.record"
  }
  label_map_path: "./data/pascal_label_map.pbtxt"
}
eval_config: {
  num_examples: 5823
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "./images/VOCdevkit/pascal_val.record"
  }
  label_map_path: "./data/pascal_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
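The num_classes and num_examples edits described above can also be applied with a few lines of Python instead of by hand. This is a rough sketch that treats the config as plain text. The function names are hypothetical, and the path placeholders still have to be replaced manually, because the right value differs for every occurrence.

```python
import re


def set_field(config_text, field, value):
    """Replace every 'field: <integer>' occurrence with the given value."""
    pattern = r"(\b{}:\s*)\d+".format(re.escape(field))
    return re.sub(pattern, lambda m: m.group(1) + str(value), config_text)


def adapt_config(config_text, num_classes, num_examples):
    """Apply the num_classes and num_examples edits to pipeline config text."""
    text = set_field(config_text, "num_classes", num_classes)
    return set_field(text, "num_examples", num_examples)


def unresolved_placeholders(config_text):
    """Return the lines that still contain PATH_TO_BE_CONFIGURED."""
    return [line.strip() for line in config_text.splitlines()
            if "PATH_TO_BE_CONFIGURED" in line]
```

Calling `unresolved_placeholders` on the edited text is a quick way to confirm that no path was forgotten before starting a long training run.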
Start training:
Create my_object_detection/train_dir/ to hold the training output.
Create my_object_detection/leo_object_detection_COCO_VOC2012_training.sh with the following content:
python ~/anaconda3/lib/python3.5/site-packages/tensorflow/models/research/object_detection/legacy/train.py \
--train_dir=./train_dir/ \
--pipeline_config_path=./training_config/faster_rcnn_inception_resnet_v2_atrous_coco_voc2012.config
Run bash leo_object_detection_COCO_VOC2012_training.sh to start training :)
7. Exporting the model you trained
Create my_object_detection/export_dir/ to hold the exported model.
Create my_object_detection/leo_export_inference_graph.sh with the following content:
python ~/anaconda3/lib/python3.5/site-packages/tensorflow/models/research/object_detection/export_inference_graph.py \
--input_type=image_tensor \
--pipeline_config_path=./training_config/faster_rcnn_inception_resnet_v2_atrous_coco_voc2012.config \
--trained_checkpoint_prefix=./train_dir/model.ckpt-200000 \
--output_directory=./export_dir/
Run bash leo_export_inference_graph.sh to export the trained model.
After exporting, the output files appear in my_object_detection/export_dir/.
Among them, frozen_inference_graph.pb is the final result.
Create the folder my_object_detection/models/faster_rcnn_inception_resnet_v2_atrous_COCO_VOC2012
and move the contents of my_object_detection/export_dir/ into it so that the model is easy to load later.
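The value passed to --trained_checkpoint_prefix has to match a checkpoint that actually exists in the training directory, and the step number depends on when training stopped. The sketch below finds the highest-numbered checkpoint, assuming the usual model.ckpt-N.index file naming; the function name is my own. TensorFlow's own tf.train.latest_checkpoint serves a similar purpose when TensorFlow is importable.

```python
import glob
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the 'model.ckpt-N' prefix with the largest N, or None if none exist."""
    best_step, best_prefix = -1, None
    for index_file in glob.glob(os.path.join(train_dir, "model.ckpt-*.index")):
        match = re.search(r"model\.ckpt-(\d+)\.index$", index_file)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            # Strip the '.index' suffix to get the prefix the export script expects.
            best_prefix = index_file[:-len(".index")]
    return best_prefix


if __name__ == "__main__":
    # Run from my_object_detection/; prints None if no checkpoint is found.
    print(latest_checkpoint_prefix("./train_dir"))
```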
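8. Testing the model you trained
Copy the script from step 5, my_object_detection/leo_object_detection_APP_ssd_mobilenet_v1_coco_2018_01_28.py, to a new file named my_object_detection/leo_object_detection_APP_faster_rcnn_inception_resnet_v2_atrous_COCO_VOC2012.py, update the paths in it accordingly, and use it to test the training result. I tried it on an image picked at random from the web. It sometimes identifies a horse as a dog, but the whole pipeline works well.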
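9. Training a model that is really your own
In real applications the objects to detect vary widely, so you need to train on your specific targets.
9.1 Annotating the targets
I use LabelImg as the annotation tool; its GitHub page gives detailed installation and usage instructions.
After annotation, each image has a matching xml file that stores the annotation data. I crawled a lot of GYY images and annotated them.
- Create my_object_detection/images/myimages/ to hold your own images and annotations.
- Create my_object_detection/images/myimages/my_train_images/GYY/ and copy the training images and their annotated xml files into it.
- Create my_object_detection/images/myimages/my_test_images/GYY/ and copy the test images and their annotated xml files into it.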
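I sorted the images into the two folders by hand. If you would rather split them reproducibly, something like the following sketch could pick which base names go where. The function name, the 20% test fraction and the seed are all assumptions for illustration, not values used in this article.

```python
import random


def split_names(names, test_fraction=0.2, seed=0):
    """Shuffle base file names reproducibly and split them into (train, test) lists."""
    names = sorted(names)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)
    n_test = max(1, int(round(len(names) * test_fraction))) if names else 0
    return names[n_test:], names[:n_test]
```

Each returned name stands for an image together with its xml file, so both files should be copied to the chosen folder as a pair.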
9.2 Data preprocessing: converting xml to csv
- Create my_object_detection/images/myimages/xml2csv_my_train_images_GYY.py with the following content:
# xml2csv.py
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET
# Work inside the folder that holds the training images and their xml files
os.chdir('/home/leo/Desktop/object_detection_API_work/my_object_detection/images/myimages/my_train_images/GYY/')
path = '/home/leo/Desktop/object_detection_API_work/my_object_detection/images/myimages/my_train_images/GYY/'
def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # Positional lookups assume the element order written by LabelImg:
            # size = (width, height, ...), object = (name, ..., bndbox as 5th child)
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df
def main():
    image_path = path
    xml_df = xml_to_csv(image_path)
    # Written to the current directory, i.e. the folder set by os.chdir above
    xml_df.to_csv('GYY_train.csv', index=None)
    print('Successfully converted xml to csv.')
main()
- Create my_object_detection/images/myimages/xml2csv_my_test_images_GYY.py with the following content:
# xml2csv.py
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET
# Work inside the folder that holds the test images and their xml files
os.chdir('/home/leo/Desktop/object_detection_API_work/my_object_detection/images/myimages/my_test_images/GYY/')
path = '/home/leo/Desktop/object_detection_API_work/my_object_detection/images/myimages/my_test_images/GYY/'
def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # Positional lookups assume the element order written by LabelImg:
            # size = (width, height, ...), object = (name, ..., bndbox as 5th child)
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df
def main():
    image_path = path
    xml_df = xml_to_csv(image_path)
    # Written to the current directory, i.e. the folder set by os.chdir above
    xml_df.to_csv('GYY_test.csv', index=None)
    print('Successfully converted xml to csv.')
main()
The two Python files above convert the xml annotations to csv. cd to my_object_detection/images/myimages/ and run:
python xml2csv_my_train_images_GYY.py
python xml2csv_my_test_images_GYY.py
Then check the my_object_detection/images/myimages/my_train_images/GYY/ and my_object_detection/images/myimages/my_test_images/GYY/ folders. If all went well, they contain GYY_train.csv and GYY_test.csv respectively. Open both files and make sure their contents are correct.
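Checking the csv contents by eye gets tedious with many images. A small sanity check along the following lines can flag obviously broken rows. It is a sketch that assumes the column names written by the scripts above; the function name and the two rules it applies are my own.

```python
import csv


def invalid_rows(csv_path):
    """Return (line_number, reason) for rows whose box is empty or outside the image."""
    problems = []
    with open(csv_path, newline="") as f:
        # Line 1 is the header, so the first data row is line 2.
        for line_number, row in enumerate(csv.DictReader(f), start=2):
            width, height = int(row["width"]), int(row["height"])
            xmin, ymin = int(row["xmin"]), int(row["ymin"])
            xmax, ymax = int(row["xmax"]), int(row["ymax"])
            if xmin >= xmax or ymin >= ymax:
                problems.append((line_number, "empty box"))
            elif xmin < 0 or ymin < 0 or xmax > width or ymax > height:
                problems.append((line_number, "box outside image"))
    return problems
```

An empty list from `invalid_rows('my_train_images/GYY/GYY_train.csv')` means every box passed both checks.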
9.3 Data preprocessing: converting csv to TFRecord
- Create my_object_detection/images/myimages/leo_generate_tfrecord_train_GYY.py with the following content:
# leo_generate_tfrecord_train_GYY.py
# -*- coding: utf-8 -*-
"""
Usage (run from my_object_detection/images/myimages/):
  python leo_generate_tfrecord_train_GYY.py --csv_input=./my_train_images/GYY/GYY_train.csv --output_path=./my_train_images/GYY_train.record
"""
import os
import io
import pandas as pd
import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple
flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS
# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'GaoYY':  # change this to your own class name
        return 1
    else:
        # Fail early on an unexpected class name instead of silently returning None
        raise ValueError('Unknown class name: {}'.format(row_label))
# Group the csv rows by image file, one entry per image
def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]
# Build one tf.train.Example from an image and all of its boxes
def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size
    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []
    for index, row in group.object.iterrows():
        # Box coordinates are stored normalized to the range [0, 1]
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))
    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example
def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), 'my_train_images/GYY')  # change this to your image folder
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))
if __name__ == '__main__':
    tf.app.run()
- Create my_object_detection/images/myimages/leo_generate_tfrecord_test_GYY.py with the following content:
# leo_generate_tfrecord_test_GYY.py
# -*- coding: utf-8 -*-
"""
Usage (run from my_object_detection/images/myimages/):
  python leo_generate_tfrecord_test_GYY.py --csv_input=./my_test_images/GYY/GYY_test.csv --output_path=./my_test_images/GYY_test.record
"""
import os
import io
import pandas as pd
import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple
flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS
# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'GaoYY':  # change this to your own class name
        return 1
    else:
        # Fail early on an unexpected class name instead of silently returning None
        raise ValueError('Unknown class name: {}'.format(row_label))
# Group the csv rows by image file, one entry per image
def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]
# Build one tf.train.Example from an image and all of its boxes
def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size
    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []
    for index, row in group.object.iterrows():
        # Box coordinates are stored normalized to the range [0, 1]
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))
    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example
def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), 'my_test_images/GYY')  # change this to your image folder
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))
if __name__ == '__main__':
    tf.app.run()
- Create my_object_detection/images/myimages/leo_generate_tfrecord_train_GYY.sh with the following content:
python leo_generate_tfrecord_train_GYY.py \
--csv_input=./my_train_images/GYY/GYY_train.csv \
--output_path=./my_train_images/GYY_train.record
- Create my_object_detection/images/myimages/leo_generate_tfrecord_test_GYY.sh with the following content:
python leo_generate_tfrecord_test_GYY.py \
--csv_input=./my_test_images/GYY/GYY_test.csv \
--output_path=./my_test_images/GYY_test.record
- cd to my_object_detection/images/myimages/ and run:
bash leo_generate_tfrecord_train_GYY.sh
bash leo_generate_tfrecord_test_GYY.sh
- If generation succeeds, GYY_train.record appears in my_object_detection/images/myimages/my_train_images and GYY_test.record appears in my_object_detection/images/myimages/my_test_images.
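The TO-DO comment in the two scripts above suggests replacing the hard-coded class check with a lookup. One possible shape for that, together with the coordinate normalization the scripts perform, is sketched below. The dictionary, the error handling and the `normalize_box` helper are my own illustration and not part of the API.

```python
# Class name -> integer id; ids start at 1 because 0 is reserved for background.
LABELS = {"GaoYY": 1}


def class_text_to_int(row_label, labels=LABELS):
    """Look up the integer id for a class name, failing loudly on unknown names."""
    try:
        return labels[row_label]
    except KeyError:
        raise ValueError("unknown class name: {!r}".format(row_label))


def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Scale pixel coordinates to the [0, 1] range stored in the TFRecord."""
    return xmin / width, ymin / height, xmax / width, ymax / height
```

With more than one class, only the LABELS dictionary needs to grow, and its entries must agree with the ids in the label map file.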
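9.4 Setting up the pbtxt file
Create my_object_detection/data/GYY_label_map.pbtxt. It needs a single item that maps id 1 to the name GaoYY, matching class_text_to_int in the scripts above.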
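The exact file content did not survive in this write-up, so the sketch below reconstructs what such a label map would look like from the class name and id used in the scripts above. Treat the helper name and the precise layout as assumptions; the item, id and name fields follow the format of the label maps shipped with the API, such as pascal_label_map.pbtxt.

```python
def label_map_text(names):
    """Render class names as label map items with ids 1, 2, ... in list order."""
    items = []
    for class_id, name in enumerate(names, start=1):
        items.append("item {{\n  id: {}\n  name: '{}'\n}}\n".format(class_id, name))
    return "\n".join(items)


if __name__ == "__main__":
    # Save the printed text as data/GYY_label_map.pbtxt.
    print(label_map_text(["GaoYY"]))
```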
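9.5 Configuring the training parameters
Copy my_object_detection/training_config/faster_rcnn_inception_resnet_v2_atrous_coco_voc2012.config within the same folder, rename the copy to faster_rcnn_inception_resnet_v2_atrous_coco_GYY.config, and edit it:
Change num_classes: 20 to num_classes: 1  # I have only one class, GaoYY
Change num_examples: 5823 to num_examples: 21  # my validation set has only 21 images
Update the 5 paths (the former PATH_TO_BE_CONFIGURED entries) to the corresponding locations
My final config is: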
# Faster R-CNN with Inception Resnet v2, Atrous version;
# Configured for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.
model {
  faster_rcnn {
    num_classes: 1
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_resnet_v2'
      first_stage_features_stride: 8
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 8
        width_stride: 8
      }
    }
    first_stage_atrous_rate: 2
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 17
    maxpool_kernel_size: 1
    maxpool_stride: 1
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}
train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0003
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "./models/faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "./images/myimages/my_train_images/GYY_train.record"
  }
  label_map_path: "./data/GYY_label_map.pbtxt"
}
eval_config: {
  num_examples: 21
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "./images/myimages/my_test_images/GYY_test.record"
  }
  label_map_path: "./data/GYY_label_map.pbtxt"
  shuffle: false
  num_readers: 1
}
9.6 Starting training
Create my_object_detection/train_dir/GYY/ to hold the GYY training output.
Create my_object_detection/leo_object_detection_COCO_GYY_training.sh with the following content:
python ~/anaconda3/lib/python3.5/site-packages/tensorflow/models/research/object_detection/legacy/train.py \
--train_dir=./train_dir/GYY/ \
--pipeline_config_path=./training_config/faster_rcnn_inception_resnet_v2_atrous_coco_GYY.config
Run bash leo_object_detection_COCO_GYY_training.sh to start training :)
9.7 Monitoring training
In my_object_detection/, run:
tensorboard --logdir=./train_dir/GYY
Then open 127.0.0.1:6006 in a browser to watch the training progress in real time.
9.8 Exporting the trained model
Create my_object_detection/export_dir/GYY/ to hold the exported model.
Create my_object_detection/leo_export_inference_graph_GYY.sh with the following content:
python ~/anaconda3/lib/python3.5/site-packages/tensorflow/models/research/object_detection/export_inference_graph.py \
--input_type=image_tensor \
--pipeline_config_path=./training_config/faster_rcnn_inception_resnet_v2_atrous_coco_GYY.config \
--trained_checkpoint_prefix=./train_dir/GYY/model.ckpt-5739 \
--output_directory=./export_dir/GYY/
Run bash leo_export_inference_graph_GYY.sh to export the trained model.
9.9 Testing your own model
Find a few images on the web to use as tests and put them in my_object_detection/images/test_images/, named GaoYY_test0.jpeg through GaoYY_test6.jpeg.
Create my_object_detection/leo_object_detection_APP_faster_rcnn_inception_resnet_v2_atrous_COCO_GYY.py
with the following content:
#encoding:utf-8
import tensorflow as tf
import numpy as np
import os
from matplotlib import pyplot as plt
from PIL import Image
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_utils
# Directory of the exported model
MODEL_DIR = 'export_dir/GYY'
# Frozen graph file of the exported model
MODEL_CHECK_FILE = os.path.join(MODEL_DIR, 'frozen_inference_graph.pb')
# Label map that belongs to the dataset
MODEL_LABEL_MAP = os.path.join('data', 'GYY_label_map.pbtxt')
# Number of classes in the dataset; open GYY_label_map.pbtxt to check
MODEL_NUM_CLASSES = 1
# Collect the test image file names into a list
PATH_TO_TEST_IMAGES_DIR = 'images/test_images'
TEST_IMAGES_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, 'GaoYY_test{}.jpeg'.format(i)) for i in range(0, 7)]
# Size of the output figure, in inches
IMAGE_SIZE = (12, 8)
tf.reset_default_graph()
# Load the model into the default graph
with tf.gfile.GFile(MODEL_CHECK_FILE, 'rb') as fd:
    _graph = tf.GraphDef()
    _graph.ParseFromString(fd.read())
    tf.import_graph_def(_graph, name='')
# Load the labels, converting the content of GYY_label_map.pbtxt into the form
# {1: {'id': 1, 'name': u'GaoYY'}}
label_map = label_map_util.load_labelmap(MODEL_LABEL_MAP)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=MODEL_NUM_CLASSES)
category_index = label_map_util.create_category_index(categories)
# Convert an image into a numpy array (assumes a 3-channel RGB image)
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
# Run the computation in the graph
detection_graph = tf.get_default_graph()
with tf.Session(graph=detection_graph) as sess:
    for image_path in TEST_IMAGES_PATHS:
        print(image_path)
        # Read the image
        image = Image.open(image_path)
        # Convert the image data into an array
        image_np = load_image_into_numpy_array(image)
        # Add a batch dimension
        image_np_expanded = np.expand_dims(image_np, axis=0)
        # The lines below fetch tensors from the model; use them as they are
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # All detection boxes
        boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Confidence score of each detection
        scores = detection_graph.get_tensor_by_name('detection_scores:0')
        # Class of each box
        classes = detection_graph.get_tensor_by_name('detection_classes:0')
        # Number of detections
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        # Run the computation
        (boxes, scores, classes, num_detections) = sess.run([boxes, scores, classes, num_detections],
                                                            feed_dict={image_tensor: image_np_expanded})
        # Print the detection results
        print(num_detections)
        print(boxes)
        print(classes)
        print(scores)
        # Draw the results onto the image
        vis_utils.visualize_boxes_and_labels_on_image_array(
            image_np,
            np.squeeze(boxes),
            np.squeeze(classes).astype(np.int32),
            np.squeeze(scores),
            category_index,
            use_normalized_coordinates=True,
            line_thickness=8
        )
        # Display
        plt.figure(figsize=IMAGE_SIZE)
        plt.imshow(image_np)
        plt.show()
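In my_object_detection/, run python leo_object_detection_APP_faster_rcnn_inception_resnet_v2_atrous_COCO_GYY.py. The results:
It still recognizes her under makeup like that? That is rather impressive.
Wait, can it even pick up on her relationship with her husband? Quite something.
What is going on, every woman gets recognized as GaoYY? The quality is slipping.
Even ChaiJ gets recognized as GaoYY? Mediocre.
Even men get recognized as GaoYY? Limited indeed.
What??? Even Teacher Gao has fallen!? This really is quite poor and limited.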
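10. Summary
The Object Detection API is Google's TensorFlow-based API for object detection. The workflow is: annotate the data to fix the classes before training, convert the xml annotations to csv, convert the csv to the TFRecord format TensorFlow prefers, set a few training parameters, and start training. When training finishes, export the result as a model and load that model to run detection.
I annotated face images of one specific person, and the training result was about what I expected. Detection works well for large targets such as faces, so well that even Teacher Gao was detected. In all seriousness, getting the object detection API to work was only playing around. The real work is still to read more papers and more of the literature.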