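Preface:
My graduation project involves object detection for image recognition, so I needed to train an object detection model that recognizes furniture. The training code is based on the object detection code in Chapter 5 of 《21个项目玩转深度学习》. The book's code targets Python 2.x, while I use Python 3.6, so the code needs some changes before training runs correctly. The book already describes the overall training process; here I briefly add my own steps and the code changes I made.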
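I. Building the VOC dataset
The book's code trains the model on the VOC dataset, but my graduation project has to recognize indoor furniture, so I had to build my own dataset in VOC format. There are many tutorials online on how to build a VOC dataset, and I followed their steps.
Link to the book and its code:
Link: https://pan.baidu.com/s/1RGlJkxfqE9ba71kUGmwfWA
Extraction code: 3mfx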
1. Annotation tool: labelImg. I downloaded it from GitHub; here is a link to my copy:
Link: https://pan.baidu.com/s/17oAJU2_tfp5X5yjrFI4v1g
Extraction code: kqdw
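2. Dataset folders: since my project only involves object detection, only the following directories are needed:
- First put the training images into JPEGImages, then rename them in the VOC2007 style, e.g. "000001.jpg".
- For batch-renaming code, see: http://blog.csdn.net/u011574296/article/details/72956446
- The Annotations folder holds the annotation file for each image. Note: label names in the annotations should preferably use lowercase English letters; uppercase names may cause errors.
- Finally, the ImageSets folder holds 4 txt files:
test.txt is the test set
train.txt is the training set
val.txt is the validation set
trainval.txt is the training and validation set
As shown in the figure: [figure]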
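The renaming step can be sketched as follows. This is a minimal sketch, not the code from the linked post: the helper name rename_to_voc_style and its parameters are hypothetical, it assumes all images are .jpg files in a single folder, and it assumes renaming happens before annotation, since existing XML files would otherwise no longer match the image names.

```python
import os


def rename_to_voc_style(image_dir, start=1, width=6):
    """Rename every .jpg in image_dir to 000001.jpg, 000002.jpg, ... (hypothetical helper)."""
    names = sorted(f for f in os.listdir(image_dir) if f.lower().endswith('.jpg'))

    # Pass 1: move files to temporary names so that a new name can never
    # collide with a file that has not been renamed yet.
    temp_names = []
    for i, name in enumerate(names):
        temp = '__tmp_{}_{}'.format(i, name)
        os.rename(os.path.join(image_dir, name), os.path.join(image_dir, temp))
        temp_names.append(temp)

    # Pass 2: assign the final zero-padded names in the sorted order.
    new_names = []
    for offset, temp in enumerate(temp_names):
        new_name = '{:0{w}d}.jpg'.format(start + offset, w=width)
        os.rename(os.path.join(image_dir, temp), os.path.join(image_dir, new_name))
        new_names.append(new_name)
    return new_names
```

The two passes are a design choice: renaming directly could overwrite a file such as an existing 000001.jpg that has not been processed yet.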
The four txt files in ImageSets can be generated with the following code:
import os
import random

trainval_percent = 0.8  # fraction of all samples used for train + val
train_percent = 0.8     # fraction of train + val used for train

xmlfilepath = 'D:/PycharmProjects/chapter_5/research/object_detection/voc/VOCdevkit/VOC2012/Annotations'
txtsavepath = 'D:/PycharmProjects/chapter_5/research/object_detection/voc/VOCdevkit/VOC2012/ImageSets/Main'

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)  # renamed from "list" to avoid shadowing the built-in
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the ".xml" extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
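With both ratios set to 0.8, the script puts roughly 64% of the samples in train.txt, 16% in val.txt and 20% in test.txt. The same split logic can be written as a function that is easier to check. This is only a sketch: split_names and its seed parameter are hypothetical additions, not part of the script above.

```python
import random


def split_names(names, trainval_percent=0.8, train_percent=0.8, seed=None):
    """Split sample names into train/val/test lists (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    indices = range(len(names))
    tv = int(len(names) * trainval_percent)  # size of train + val
    tr = int(tv * train_percent)             # size of train
    trainval = rng.sample(indices, tv)
    train = set(rng.sample(trainval, tr))
    trainval = set(trainval)

    splits = {'train': [], 'val': [], 'test': []}
    for i, name in enumerate(names):
        if i not in trainval:
            splits['test'].append(name)
        elif i in train:
            splits['train'].append(name)
        else:
            splits['val'].append(name)
    return splits


splits = split_names(['{:06d}'.format(i) for i in range(1, 101)], seed=0)
print({k: len(v) for k, v in splits.items()})  # {'train': 64, 'val': 16, 'test': 20}
```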
At this point the dataset is complete.
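Since uppercase label names may cause errors, it can help to scan the annotations before converting them. The following is a minimal sketch, assuming the Pascal VOC XML layout written by labelImg, where each object element has a name child; find_non_lowercase_labels is a hypothetical helper name.

```python
import os
import xml.etree.ElementTree as ET


def find_non_lowercase_labels(annotations_dir):
    """Return {xml_file: [labels]} for labels that are not all lowercase (hypothetical helper)."""
    problems = {}
    for filename in sorted(os.listdir(annotations_dir)):
        if not filename.endswith('.xml'):
            continue
        root = ET.parse(os.path.join(annotations_dir, filename)).getroot()
        # In Pascal VOC XML, each annotated box is an <object> with a <name> child.
        labels = [obj.findtext('name', default='') for obj in root.iter('object')]
        bad = [label for label in labels if label != label.lower()]
        if bad:
            problems[filename] = bad
    return problems
```

An empty result means every label in the Annotations folder is already lowercase.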
II. Implementation
1. Format of the pascal_label_map.pbtxt file:
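In the Object Detection API, a label map is a text file with one item block per class, and ids start at 1 because id 0 is reserved for the background class. The sketch below generates such a file. The class names are hypothetical placeholders, not the 12 classes used in this project, and build_label_map is a hypothetical helper.

```python
def build_label_map(class_names):
    """Return pbtxt text with one item per class, ids starting at 1 (hypothetical helper)."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append("item {{\n  id: {}\n  name: '{}'\n}}\n".format(class_id, name))
    return "\n".join(items)


# Hypothetical class names, for illustration only.
print(build_label_map(['chair', 'sofa']))
```

The returned text can be written to pascal_label_map.pbtxt; the names must match the label names used in the XML annotations.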
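2. Convert the dataset to tfrecord format. The book provides create_pascal_tf_record.py, and parts of it need to be modified, as follows:
Lines 84 and 85 need to be changed. The book's code:
Changed to:
Lines 162 and 163 also need to be changed. The book's code:
Changed to:
Then the following file paths also need to be added in create_pascal_tf_record.py:
When generating pascal_train.record:
When generating pascal_val.record:
The two corresponding tfrecord files are then generated: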
3. The voc.config file: add the parameters as described in the book. Here is mine:
4. The directory layout after configuration:
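5. Start training: train.py
Add two parameters:
Note: if you hit an out-of-memory error during training, you can adjust the following in voc.config:
Reduce each of them by half.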
6. Export the model: export_inference_graph.py
Add some parameters:
Note: the model step number on line 83 must be the number of steps you actually trained.
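To find the step number that training actually reached, the training directory can be scanned for checkpoint files. This is a sketch, assuming checkpoints are saved as model.ckpt-<step>.index (TensorFlow's usual naming for V2-format checkpoints); latest_checkpoint_step is a hypothetical helper, and the training directory passed to it depends on your own setup.

```python
import os
import re


def latest_checkpoint_step(train_dir):
    """Return the highest step N among model.ckpt-N.index files, or None (hypothetical helper)."""
    pattern = re.compile(r'^model\.ckpt-(\d+)\.index$')
    steps = []
    for filename in os.listdir(train_dir):
        match = pattern.match(filename)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None
```

The checkpoint prefix to export is then the training directory joined with 'model.ckpt-{}'.format(step).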
Reference code:
import os
import sys

import numpy as np
import tensorflow as tf
from matplotlib import pyplot as plt
from PIL import Image

# Needed because the script is stored in the object_detection folder.
sys.path.append("..")
from research.object_detection.utils import label_map_util
from research.object_detection.utils import visualization_utils as vis_util

MODEL_NAME = 'frozen_inference_graph.pb'
# Path to the frozen detection graph. This is the actual model used for object detection.
PATH_TO_CKPT = r'voc\export\frozen_inference_graph.pb'
# Label map used to attach the correct label to each box.
PATH_TO_LABELS = r'voc\pascal_label_map.pbtxt'
NUM_CLASSES = 12

# Model download step from the original notebook. It is not needed for a
# locally exported model, so it stays commented out.
# opener = urllib.request.URLopener()
# opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
# tar_file = tarfile.open(MODEL_FILE)
# for file in tar_file.getmembers():
#     file_name = os.path.basename(file.name)
#     if 'frozen_inference_graph.pb' in file_name:
#         tar_file.extract(file, os.getcwd())

# Load the (frozen) TensorFlow model into memory.
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

# Load the label map.
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)


# Helper code
def load_image_into_numpy_array(image):
    # Force 3 channels so that grayscale or RGBA images do not break the reshape.
    image = image.convert('RGB')
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)

# Only one test image is used here. To test the code with your own images,
# add their paths to TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = r'test_images'  # folder that holds the test images
imagename = 'test.jpg'
TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, imagename)]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # Each box represents a part of the image where a particular object was detected.
        boxes_tensor = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Each score represents the level of confidence for the detected object.
        # The score is shown on the result image, together with the class label.
        scores_tensor = detection_graph.get_tensor_by_name('detection_scores:0')
        classes_tensor = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections_tensor = detection_graph.get_tensor_by_name('num_detections:0')

        os.makedirs('save_image', exist_ok=True)  # plt.imsave needs an existing folder
        for image_path in TEST_IMAGE_PATHS:
            image = Image.open(image_path)
            # The array representation of the image is used later to draw
            # the boxes and labels on the result image.
            image_np = load_image_into_numpy_array(image)
            # Expand dimensions since the model expects images of shape [1, None, None, 3].
            image_np_expanded = np.expand_dims(image_np, axis=0)
            # Run detection. The results get their own names so that the tensors
            # above are not overwritten when more than one image is processed.
            (boxes, scores, classes, num_detections) = sess.run(
                [boxes_tensor, scores_tensor, classes_tensor, num_detections_tensor],
                feed_dict={image_tensor: image_np_expanded})
            # Visualize the results.
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np,
                np.squeeze(boxes),
                np.squeeze(classes).astype(np.int32),
                np.squeeze(scores),
                category_index,
                use_normalized_coordinates=True,
                line_thickness=8)
            plt.figure(figsize=IMAGE_SIZE)
            plt.imshow(image_np)
            plt.show()
            print(num_detections)
            plt.imsave(os.path.join('save_image', os.path.basename(image_path)), image_np)
My detection result: [figure]
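Besides the visualization, the raw outputs of sess.run can be converted to pixel coordinates. This sketch relies on the Object Detection API convention that each entry of detection_boxes is [ymin, xmin, ymax, xmax], normalized to the range 0 to 1. The helper to_pixel_boxes and the 0.5 score threshold are assumptions made for illustration, and the input values are made up.

```python
def to_pixel_boxes(boxes, scores, classes, width, height, min_score=0.5):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to pixel boxes (hypothetical helper).

    Returns a list of (class_id, score, (left, top, right, bottom)) tuples
    for detections whose score is at least min_score.
    """
    results = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box
        pixel_box = (int(xmin * width), int(ymin * height),
                     int(xmax * width), int(ymax * height))
        results.append((int(class_id), float(score), pixel_box))
    return results


# Made-up values standing in for np.squeeze(boxes), np.squeeze(scores), np.squeeze(classes).
detections = to_pixel_boxes(
    boxes=[[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.5, 0.5]],
    scores=[0.9, 0.3],
    classes=[2, 1],
    width=640, height=480)
print(detections)  # [(2, 0.9, (320, 120, 640, 360))]
```

The class ids in the result can be looked up in category_index to get the label names.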