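Preface: after about half a month of on-and-off work, I finally got my own model to run, so I'm taking the time to write the first CSDN blog post of my life. If you hit a pitfall while running the model, leave a comment; I'll help as soon as I see it.
Environment: Python 2.7, TensorFlow 1.14.0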
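1. Collect images, preferably in JPG format. I am working on traffic sign classification, so I gathered a bit over 170 traffic sign images as the training set (later cut to 99, because I first wanted to get the model running end to end and adjust afterwards), plus 6 fairly blurry traffic sign images as the test set. They go in these two folders:
/Users/cathy/Desktop/models/research/object_detection/test_images/train
/Users/cathy/Desktop/models/research/object_detection/test_images/test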
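2. Image file names: do not use Chinese characters, or the programs will fail at run time. I wrote a small .py file that renames the training-set and test-set images in sequence, e.g. 000001 through 00000n. The code of 批量重命名图片.py (batch image rename) is as follows: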
# -*- coding: utf-8 -*-
from __future__ import print_function
import os

pic_path = "/Users/cathy/Desktop/models/research/object_detection/test_images/test"

def rename():
    pic_dir = os.path.abspath(pic_path)  # os.path.abspath returns the absolute path
    # Only .jpg files are renamed; sorting makes the numbering reproducible
    piclist = sorted(p for p in os.listdir(pic_dir) if p.endswith(".jpg"))
    i = 1
    for pic in piclist:
        old_path = os.path.join(pic_dir, pic)
        new_path = os.path.join(pic_dir, '000' + format(str(i), '0>3') + '.jpg')
        os.rename(old_path, new_path)
        print("Renamed " + old_path + " to " + new_path)
        i = i + 1
    print("Renamed " + str(i - 1) + " images to 000001.jpg ~ " + '000' + format(str(i - 1), '0>3') + ".jpg")

rename()
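One caveat with renaming in place: if the folder already holds a file whose name matches a target name (for example after an interrupted earlier run), `os.rename` can silently overwrite it on macOS and Linux. Below is a minimal sketch of a two-pass variant. The helper name `rename_sequentially` and its parameters are my own illustration, not part of the script above; it keeps the same six-digit naming.

```python
import os
import uuid

def rename_sequentially(pic_dir, ext=".jpg", width=6):
    """Rename every file ending in ext to 000001.jpg, 000002.jpg, ...

    Hypothetical helper. It renames in two passes (first to unique temporary
    names, then to the final names) so that no file can overwrite another
    file that is still waiting for its turn.
    """
    pics = sorted(p for p in os.listdir(pic_dir) if p.endswith(ext))
    token = uuid.uuid4().hex  # makes the temporary names unique
    temp_names = []
    for index, pic in enumerate(pics, start=1):
        temp_name = "tmp_" + token + "_" + str(index) + ext
        os.rename(os.path.join(pic_dir, pic), os.path.join(pic_dir, temp_name))
        temp_names.append(temp_name)
    final_names = []
    for index, temp_name in enumerate(temp_names, start=1):
        final_name = str(index).zfill(width) + ext
        os.rename(os.path.join(pic_dir, temp_name), os.path.join(pic_dir, final_name))
        final_names.append(final_name)
    return final_names
```

Calling `rename_sequentially(pic_path)` once per folder (train, then test) would replace the `rename()` call; files with other extensions are left alone.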
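3. During training the images must have RGB channels, otherwise errors occur as well. So I wrote two more .py files to handle this.
The first .py file checks whether the images in a folder are RGB, 检查图片是否RGB.py (check whether images are RGB):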
from PIL import Image
import os

path = '/Users/cathy/Desktop/models/research/object_detection/test_images/train/'  # image directory
for file in os.listdir(path):
    extension = file.split('.')[-1]
    if extension == 'jpg':
        fileLoc = os.path.join(path, file)
        img = Image.open(fileLoc)
        if img.mode != 'RGB':
            print(file + ', ' + img.mode)
The second file converts any non-RGB image to RGB, 图片变为RGB.py (convert images to RGB):
from PIL import Image
import os

path = '/Users/cathy/Desktop/models/research/object_detection/test_images/train/'  # image directory
for file in os.listdir(path):
    extension = file.split('.')[-1]
    if extension == 'jpg':
        fileLoc = os.path.join(path, file)
        img = Image.open(fileLoc)
        if img.mode != 'RGB':
            print(file + ', ' + img.mode)
            img_rgb = img.convert("RGB")  # convert to RGB mode
            img_rgb.save(fileLoc)  # overwrite the original file with the RGB version
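4. That wraps up the image preprocessing. Next, annotate the images. I used labelImg, available at https://github.com/tzutalin/labelImg#macos
I'm on a Mac, so I installed it with the commands from the macOS section.
Environment: Python 3.7, Qt5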
brew install qt # Install qt-5.x.x by Homebrew
brew install libxml2
pyrcc5 -o /Users/cathy/Desktop/labelImg-master/resources.py /Users/cathy/Desktop/labelImg-master/resources.qrc  # adjust both paths to your own location
python3 /Users/cathy/Desktop/labelImg-master/labelImg.py
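Once that is done, a small application window opens. The features that matter most:
After choosing the image directory and the save directory, press w, draw a box around the object, enter its label name, then save. Press d to move to the next image and repeat.
I stored the XML files in the following folders: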
/Users/cathy/Desktop/models/research/object_detection/test_images/train_xml
/Users/cathy/Desktop/models/research/object_detection/test_images/test_xml
5. With the XML files generated, they still need to be converted to CSV. The xml_to_csv.py file is below; just run it:
# -*- coding: utf-8 -*-
"""
Created on Tue Jan 16 00:52:02 2018
@author: Xiang Guo
"""
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

# Change both paths to your own directory (train_xml here, test_xml for the test set)
os.chdir('/Users/cathy/Desktop/models/research/object_detection/test_images/train_xml')
path = '/Users/cathy/Desktop/models/research/object_detection/test_images/train_xml'

def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # member[0] is <name>; member[4] is <bndbox> holding xmin, ymin, xmax, ymax
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)  # one row per annotated object
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df

def main():
    image_path = path
    xml_df = xml_to_csv(image_path)
    # Change the output name (street_sign_test_labels.csv for the test set)
    xml_df.to_csv('street_sign_train_labels.csv', index=None)
    print('Successfully converted xml to csv.')

main()
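The script above picks fields by position (`member[4][0]` and so on), which relies on labelImg always writing `<bndbox>` as the fifth child of `<object>`. A slightly more defensive sketch looks the fields up by tag name instead. The function name `voc_rows` and the sample XML are illustrative assumptions that follow the Pascal VOC layout labelImg writes by default; this is not the original author's code.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the Pascal VOC layout, for illustration only
SAMPLE_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>street_sign</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>220</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>
"""

def voc_rows(root):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) tuple per object."""
    filename = root.find("filename").text
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append((filename, width, height, obj.find("name").text,
                     int(box.find("xmin").text), int(box.find("ymin").text),
                     int(box.find("xmax").text), int(box.find("ymax").text)))
    return rows

rows = voc_rows(ET.fromstring(SAMPLE_XML.strip()))
print(rows)
```

The tuples have the same column order as the CSV, so they could be passed to `pd.DataFrame(rows, columns=column_name)` unchanged.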
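6. At this point there is one CSV file each for the training set and the test set. You can drag them out into test_images if you like (the commands below read them from the folders where they were generated), and then generate the record-format files.
The commands are: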
python2 generate_tfrecord.py --csv_input=/Users/cathy/Desktop/models/research/object_detection/test_images/train_xml/street_sign_train_labels.csv --output_path=/Users/cathy/Desktop/models/research/object_detection/test_images/street_sign_train.record  # generate the training record
python2 generate_tfrecord.py --csv_input=/Users/cathy/Desktop/models/research/object_detection/test_images/test_xml/street_sign_test_labels.csv --output_path=/Users/cathy/Desktop/models/research/object_detection/test_images/street_sign_test.record  # generate the test record
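Generating the record files is where you are likely to run into problems.
One odd problem I hit: an error saying a file did not exist. I eventually created an images folder under object_detection, put both the training and test images into it, and the command then ran.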
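A likely explanation for that error: the generate_tfrecord.py script listed below changes into the object_detection directory and then reads every image from `os.path.join(os.getcwd(), 'images')`, so the images have to be in object_detection/images regardless of where they were labelled. A small pre-flight check can list what is missing before the script runs. This is a sketch under that assumption; the helper name `missing_images` is hypothetical.

```python
import csv
import os

def missing_images(csv_path, image_dir):
    """Return filenames from the CSV's 'filename' column that are absent from image_dir."""
    with open(csv_path) as f:
        names = sorted({row["filename"] for row in csv.DictReader(f)})
    return [n for n in names if not os.path.isfile(os.path.join(image_dir, n))]
```

For example, `missing_images('street_sign_train_labels.csv', 'images')` run from the object_detection directory should return an empty list before the record is generated.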
The generate_tfrecord.py file is as follows:
# -*- coding: utf-8 -*-
"""
Created on Tue Jan 16 01:04:55 2018
@author: Xiang Guo

Usage:
  # From tensorflow/models/
  # Create train data:
  python generate_tfrecord.py --csv_input=data/tv_vehicle_labels.csv --output_path=train.record
  # Create test data:
  python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=test.record
"""
# True division, so that on Python 2 the normalized box coordinates are not truncated to 0
from __future__ import division
from __future__ import print_function

import os
import io
import sys
from collections import namedtuple

# Must run before object_detection is imported
sys.path.append('/Users/cathy/Desktop/models/research/')

import pandas as pd
import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util

os.chdir('/Users/cathy/Desktop/models/research/object_detection/')  # change to your directory

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS


# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'street_sign':  # change to your own class name
        return 1
    else:
        return None


def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    # Box coordinates are stored normalized to [0, 1]
    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    # Images are read from <object_detection>/images
    path = os.path.join(os.getcwd(), 'images')
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())

    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))


if __name__ == '__main__':
    tf.app.run()
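`class_text_to_int` only knows one label. With more than one class, a dictionary lookup that fails loudly on an unknown label is one option, since a label the function does not recognize would otherwise only surface as an error further down when the example is built. The sketch below also isolates the box normalization used in `create_tf_example`. `LABEL_MAP`, `normalize_box`, and the two extra class names are made-up placeholders, and the ids have to match the ones in the .pbtxt label map.

```python
LABEL_MAP = {
    "street_sign": 1,  # the class used in this post
    "speed_limit": 2,  # hypothetical extra class
    "stop": 3,         # hypothetical extra class
}

def class_text_to_int(row_label):
    """Map a class name to its integer id, raising on unknown labels."""
    try:
        return LABEL_MAP[row_label]
    except KeyError:
        raise ValueError("Unknown label {!r}; add it to LABEL_MAP".format(row_label))

def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Scale pixel coordinates to the [0, 1] range stored in the TFRecord."""
    return (xmin / float(width), ymin / float(height),
            xmax / float(width), ymax / float(height))

print(class_text_to_int("street_sign"))
print(normalize_box(160, 120, 320, 240, 640, 480))
```

Using `float(...)` in `normalize_box` keeps the division correct on Python 2 even without the `__future__` import.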
7. Download a model: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
For convenience I used the first model in the list, then extracted it into object_detection.
Copy /Users/cathy/Desktop/models/research/object_detection/samples/configs/ssd_mobilenet_v1_coco.config into the training directory and edit it. The edited file is as follows:
# SSD with Mobilenet v1 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.
model {
ssd {
num_classes: 1  # change this: number of classes
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
}
}
similarity_calculator {
iou_similarity {
}
}
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.2
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
}
}
image_resizer {
fixed_shape_resizer {
height: 300
width: 300
}
}
box_predictor {
convolutional_box_predictor {
min_depth: 0
max_depth: 0
num_layers_before_predictor: 0
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 1
box_code_size: 4
apply_sigmoid_to_scores: false
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9997,
epsilon: 0.001,
}
}
}
}
feature_extractor {
type: 'ssd_mobilenet_v1'
min_depth: 16
depth_multiplier: 1.0
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.00004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
train: true,
scale: true,
center: true,
decay: 0.9997,
epsilon: 0.001,
}
}
}
loss {
classification_loss {
weighted_sigmoid {
}
}
localization_loss {
weighted_smooth_l1 {
}
}
hard_example_miner {
num_hard_examples: 3000
iou_threshold: 0.99
loss_type: CLASSIFICATION
max_negatives_per_positive: 3
min_negatives_per_image: 0
}
classification_weight: 1.0
localization_weight: 1.0
}
normalize_loss_by_num_matches: true
post_processing {
batch_non_max_suppression {
score_threshold: 1e-8
iou_threshold: 0.6
max_detections_per_class: 100
max_total_detections: 100
}
score_converter: SIGMOID
}
}
}
train_config: {
batch_size: 1  # adjustable: batch size; with the CPU build of TensorFlow, keep it small
optimizer {
rms_prop_optimizer: {
learning_rate: {
exponential_decay_learning_rate {
initial_learning_rate: 0.004
decay_steps: 800720
decay_factor: 0.95
}
}
momentum_optimizer_value: 0.9
decay: 0.9
epsilon: 1.0
}
}
fine_tune_checkpoint: "/Users/cathy/Desktop/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/model.ckpt"  # change this: path to the downloaded checkpoint
from_detection_checkpoint: true  # adjustable: whether to fine-tune from a detection checkpoint
# Note: The below line limits the training process to 200K steps, which we
# empirically found to be sufficient enough to train the pets dataset. This
# effectively bypasses the learning rate schedule (the learning rate will
# never decay). Remove the below line to train indefinitely.
num_steps: 20000  # adjustable: number of training steps (20000 here, not the 200K the note above mentions); lower it if you are short on time, at the cost of accuracy
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
}
train_input_reader: {
tf_record_input_reader {
input_path: "/Users/cathy/Desktop/models/research/object_detection/data/street_sign_train.record"
# change this: path to the training record
}
label_map_path: "/Users/cathy/Desktop/models/research/object_detection/data/street_sign.pbtxt"
# change this: path to the label map file; how to create it is covered later
}
eval_config: {
num_examples: 6  # change this: number of samples in the test set
# Note: The below line limits the evaluation process to 10 evaluations.
# Remove the below line to evaluate indefinitely.
max_evals: 10
}
eval_input_reader: {
tf_record_input_reader {
input_path: "/Users/cathy/Desktop/models/research/object_detection/data/street_sign_test.record"
# change this: path to the test record
}
label_map_path: "/Users/cathy/Desktop/models/research/object_detection/data/street_sign.pbtxt"
# change this: path to the label map file; how to create it is covered later
shuffle: false
num_readers: 1
}
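The note inside the config says the learning rate effectively never decays. A quick calculation with the values from the config (initial rate 0.004, decay_steps 800720, decay_factor 0.95, num_steps 20000) shows why. The sketch assumes the usual exponential-decay formula, lr = initial * factor^(step / decay_steps), with the exponent floored when staircase decay is on; my understanding is that staircase is the default here, but treat that as an assumption rather than a statement about the library.

```python
def exponential_decay(step, initial=0.004, decay_steps=800720, factor=0.95, staircase=True):
    """Learning rate at a given step under exponential decay."""
    if staircase:
        exponent = step // decay_steps  # whole number of completed decay periods
    else:
        exponent = step / float(decay_steps)
    return initial * factor ** exponent

for step in (0, 20000, 800720):
    print(step, exponential_decay(step), exponential_decay(step, staircase=False))
```

With training stopping at step 20000, the first decay boundary at step 800720 is never reached, and even the smooth variant moves the rate by well under one percent.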
8. Take any existing .pbtxt file and edit it to list your own classes. Mine is:
item {
id: 1
name: 'street_sign'
}
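With more than a handful of classes, writing the label map by hand gets error-prone. The sketch below builds the same text from a list of class names. The function name `label_map_pbtxt` is hypothetical, and the ids start at 1 to match the example above (as far as I know the API expects label map ids to start from 1).

```python
def label_map_pbtxt(class_names):
    """Build label-map text with one item per class, ids starting at 1."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append("item {{\n  id: {}\n  name: '{}'\n}}\n".format(class_id, name))
    return "\n".join(items)

print(label_map_pbtxt(["street_sign"]))
```

Writing the returned string to data/street_sign.pbtxt gives the file that `label_map_path` points to; the order of the list decides the ids, so it has to agree with `class_text_to_int` in generate_tfrecord.py.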
9. Put the generated CSV, .pbtxt, and record files into the data folder and start training.
The command is:
python2 /Users/cathy/Desktop/models/research/object_detection/model_main.py --logtostderr --model_dir=/Users/cathy/Desktop/models/research/object_detection/training/ --pipeline_config_path=/Users/cathy/Desktop/models/research/object_detection/training/ssd_mobilenet_v1_coco.config
[Screenshot of the training process]
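If training aborts right at startup with a missing-file error, the usual suspects are the paths in the config: a leftover PATH_TO_BE_CONFIGURED placeholder, or a record or label map that was never moved into the data folder. Below is a rough sketch of a check that pulls the quoted paths out of the config text and reports the ones that do not resolve. The helper names are hypothetical, and the regular expression only covers the three fields edited in this post.

```python
import os
import re

def referenced_paths(config_text):
    """Extract the quoted values of the path fields edited in this post."""
    pattern = r'(?:fine_tune_checkpoint|input_path|label_map_path)\s*:\s*"([^"]+)"'
    return re.findall(pattern, config_text)

def unresolved_paths(config_text):
    """Return paths that are still placeholders or do not exist on disk."""
    problems = []
    for path in referenced_paths(config_text):
        # A checkpoint prefix such as model.ckpt is not a file itself,
        # so also accept the matching model.ckpt.index file
        exists = os.path.exists(path) or os.path.exists(path + ".index")
        if "PATH_TO_BE_CONFIGURED" in path or not exists:
            problems.append(path)
    return problems
```

Reading training/ssd_mobilenet_v1_coco.config into a string and passing it to `unresolved_paths` should return an empty list once everything is in place. Note that it expects straight double quotes around the paths.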
10. You can open TensorBoard to look at the loss, the learning rate, and so on. Here is the command:
tensorboard --logdir='/Users/cathy/Desktop/models/research/object_detection/training'
11. There are more steps after this, but I'm off for a run; I'll continue when I get back.