Training YOLOv3 from a Pretrained Model
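Preface
The test image for this training run was kindly provided by my dear apprentice Lizhi. Lizhi's outfit apparently looks like a sandwich, so classmates, come and help my apprentice pick a better one!
Classic paper: YOLOv3: An Incremental Improvement
Paper download: https://pjreddie.com/media/files/papers/YOLOv3.pdf
Chinese translation: https://www.taominze.com/index.php/2019/09/11/512/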
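YOLOv3 model characteristics
YOLOv3 uses DarkNet53 as its backbone network and adds residual blocks to it. It also borrows SSD's use of multi-layer features: feature maps from different layers detect objects of different sizes, which improves detection accuracy on small objects.
The main features and improvements of YOLOv3 are:
- Multi-scale prediction (an FPN-style design);
- A better backbone classification network (darknet-53, which introduces residual connections in the manner of ResNet);
- The classifier no longer uses Softmax; the classification loss is binary cross-entropy.
YOLOv3 does not use Softmax to classify each box, for two main reasons:
- Softmax assigns each box a single class (the one with the highest score), but in a dataset such as Open Images an object may carry overlapping class labels, so Softmax is unsuitable for multi-label classification;
- Softmax can be replaced by several independent logistic classifiers without a drop in accuracy.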
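To make the multi-label argument concrete, here is a minimal sketch in plain Python. The class names and logit values are made up for illustration, and the helper functions are not taken from the YOLOv3 code; they only contrast a Softmax over all classes with one independent logistic (sigmoid) classifier per class scored by binary cross-entropy.

```python
import math

def softmax(logits):
    """Normalize logits into a single distribution that sums to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Independent logistic score for one class."""
    return 1.0 / (1.0 + math.exp(-x))

def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy, treating every class as its own yes/no problem."""
    total = 0.0
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(targets)

# Hypothetical logits for the classes ["person", "woman", "car"]
logits = [4.0, 3.5, -5.0]
targets = [1, 1, 0]  # the box is both a "person" and a "woman"

softmax_probs = softmax(logits)
sigmoid_probs = [sigmoid(x) for x in logits]
bce = binary_cross_entropy(targets, sigmoid_probs)

print("softmax:", softmax_probs)
print("sigmoid:", sigmoid_probs)
print("binary cross-entropy:", bce)
```

With Softmax the two correct labels compete for the same probability mass, so at most one of them can exceed 0.5. The sigmoid scores can both be close to 1.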
Model architecture
Here are some handwritten notes from an expert. Handwritten material just has more flavor to it!
Preparing the data
# Define file paths
from train import get_classes, get_anchors
# Path to the data files
data_path = "./coco/coco_data"
# Location of the COCO class definition file
classes_path = './model_data/coco_classes.txt'
# Location of the COCO anchor file
anchors_path = './model_data/yolo_anchors.txt'
# Location of the COCO annotation file
annotation_path = './coco/coco_train.txt'
# Location of the pretrained weights
weights_path = "./model_data/yolo.h5"
# Directory where trained models are saved
save_path = "./result/models/"
classes = get_classes(classes_path)
anchors = get_anchors(anchors_path)
# Number of classes and number of anchors
num_classes = len(classes)
num_anchors = len(anchors)
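The `get_classes` and `get_anchors` helpers come from the project's `train` module, which is not shown in this post. The sketch below is a guess at what they do, assuming one class name per line and all anchors on a single comma-separated line (the same layout the test section of this post parses by hand). `read_classes` and `read_anchors` are hypothetical names, and the anchor values are the nine COCO anchors listed in the YOLOv3 paper.

```python
import io

def read_classes(fileobj):
    """Return one class name per non-empty line."""
    return [line.strip() for line in fileobj if line.strip()]

def read_anchors(fileobj):
    """Parse 'w1,h1, w2,h2, ...' from the first line into (w, h) pairs."""
    values = [float(x) for x in fileobj.readline().split(',')]
    return [(values[k], values[k + 1]) for k in range(0, len(values), 2)]

# In-memory stand-ins for coco_classes.txt and yolo_anchors.txt
classes_file = io.StringIO("person\nbicycle\ncar\n")
anchors_file = io.StringIO(
    "10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326\n")

demo_classes = read_classes(classes_file)
demo_anchors = read_anchors(anchors_file)

print(demo_classes)
print(demo_anchors)
print("anchors per detection scale:", len(demo_anchors) // 3)
```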
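Selecting the annotation data
import numpy as np
# Fraction of the data held out for validation
val_split = 0.1
with open(annotation_path) as f:
    lines = f.readlines()
# Shuffle with a fixed seed so the split is reproducible, then restore random seeding
np.random.seed(10101)
np.random.shuffle(lines)
np.random.seed(None)
num_val = int(len(lines)*val_split)
num_train = len(lines) - num_val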
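The split arithmetic can be checked in isolation. The following is a minimal sketch that uses Python's standard `random` module instead of NumPy, so the actual permutation differs from the one above. The annotation lines are placeholders, and 198 is simply the total implied by the training log later in this post (179 training plus 19 validation samples).

```python
import random

def split_annotations(lines, val_split=0.1, seed=10101):
    """Shuffle a copy with a fixed seed, then cut off the tail as the validation set."""
    shuffled = list(lines)
    random.Random(seed).shuffle(shuffled)
    num_val = int(len(shuffled) * val_split)
    num_train = len(shuffled) - num_val
    return shuffled[:num_train], shuffled[num_train:]

# Placeholder annotation lines
demo_lines = ["img_{:03d}.jpg".format(k) for k in range(198)]
train_lines, val_lines = split_annotations(demo_lines)

print(len(train_lines), len(val_lines))
```

Because the seed is fixed, calling `split_annotations` again on the same input returns the same split, which keeps validation results comparable between runs.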
Building the data generator with data augmentation
def data_generator(annotation_lines, batch_size, input_shape, data_path, anchors, num_classes):
    n = len(annotation_lines)
    i = 0
    while True:
        image_data = []
        box_data = []
        for b in range(batch_size):
            if i == 0:
                # Reshuffle at the start of every pass over the data
                np.random.shuffle(annotation_lines)
            # Load one sample and apply random data augmentation
            image, box = get_random_data(annotation_lines[i], input_shape, data_path, random=True)
            image_data.append(image)
            box_data.append(box)
            i = (i+1) % n
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        # Preprocess the annotated boxes into training targets and filter out invalid boxes
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
        yield [image_data, *y_true], np.zeros(batch_size)
def data_generator_wrapper(annotation_lines, batch_size, input_shape, data_path, anchors, num_classes):
    n = len(annotation_lines)
    if n == 0 or batch_size <= 0:
        return None
    return data_generator(annotation_lines, batch_size, input_shape, data_path, anchors, num_classes)
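The control flow of the generator above (an endless loop, an index that wraps around, and a reshuffle whenever the index returns to 0) is easier to follow on a toy example. This is only a sketch with integers standing in for images and boxes; `batch_generator` is a hypothetical name and no augmentation is involved.

```python
import random
from itertools import islice

def batch_generator(samples, batch_size, seed=0):
    """Yield batches forever, reshuffling whenever a new pass over the data begins."""
    samples = list(samples)
    rng = random.Random(seed)
    n = len(samples)
    i = 0
    while True:
        batch = []
        for _ in range(batch_size):
            if i == 0:
                rng.shuffle(samples)
            batch.append(samples[i])
            i = (i + 1) % n
        yield batch

# Five samples with batch size 2: the third batch straddles two passes over the data
batches = list(islice(batch_generator(range(5), batch_size=2), 5))
for batch in batches:
    print(batch)
```

Since the index wraps instead of stopping, a batch can mix the end of one pass with the start of the next. That is why the training call needs an explicit `steps_per_epoch`.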
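Model training
This example builds the YOLOv3 network with the Keras deep learning framework.
You can browse the corresponding folders to read the source implementation.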
Building the neural network
import keras.backend as K
from yolo3.model import preprocess_true_boxes, yolo_body, yolo_loss
from keras.layers import Input, Lambda
from keras.models import Model
# Reset the session
K.clear_session()
# Input image size
input_shape = (416, 416)
image_input = Input(shape=(None, None, 3))
h, w = input_shape
# Ground-truth inputs for the three detection scales (downsampling strides 32, 16 and 8)
y_true = [Input(shape=(h//{0:32, 1:16, 2:8}[l], w//{0:32, 1:16, 2:8}[l], num_anchors//3, num_classes+5))
          for l in range(3)]
# Build the YOLO model body
model_body = yolo_body(image_input, num_anchors//3, num_classes)
# Load the pretrained YOLO weights; delete this line to train from scratch instead
model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)
# Define the YOLO loss
model_loss = Lambda(yolo_loss, output_shape=(1,), name='yolo_loss',
                    arguments={'anchors': anchors, 'num_classes': num_classes, 'ignore_thresh': 0.5})([*model_body.output, *y_true])
# Build the Model used for training
model = Model([model_body.input, *y_true], model_loss)
# Print the structure of every layer
model.summary()
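The shapes of the three `y_true` inputs follow directly from the strides. As a quick check, the sketch below evaluates the same integer divisions for a 416x416 input, assuming the usual 9 anchors and 80 COCO classes (the real values come from the files loaded earlier). `y_true_shapes` is a hypothetical helper.

```python
def y_true_shapes(input_shape, num_anchors, num_classes, strides=(32, 16, 8)):
    """Per-scale target shape: (grid_h, grid_w, anchors_per_scale, 5 + classes)."""
    h, w = input_shape
    per_scale = num_anchors // len(strides)
    return [(h // s, w // s, per_scale, num_classes + 5) for s in strides]

# Assumed values: 9 anchors and 80 classes
shapes = y_true_shapes((416, 416), num_anchors=9, num_classes=80)
total_boxes = sum(gh * gw * a for gh, gw, a, _ in shapes)

for shape in shapes:
    print(shape)
print("candidate boxes per image:", total_boxes)
```

The extra 5 channels per anchor hold the four box coordinates and the objectness score. The coarse 13x13 grid handles large objects and the 52x52 grid handles small ones.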
Defining the training callbacks
from keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint
import os
# Define the callbacks
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1) # learning-rate decay
early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=1) # early stopping
# Checkpointing, which helps with saving and restoring the model
checkpoint = ModelCheckpoint(os.path.join(save_path, 'trained_weights_final.h5'),
                             monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=True, mode='auto', period=1)
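To see how the two patience values interact, the sketch below replays the `val_loss` values from the training log later in this post through a simplified model of both callbacks. It is not Keras's implementation: it ignores `min_delta` and cooldown, and it tracks one shared best value because both callbacks monitor the same metric. `simulate_patience` is a hypothetical name.

```python
def simulate_patience(val_losses, lr_patience=3, stop_patience=5, factor=0.1, lr=1e-5):
    """Simplified 'reduce LR on plateau' plus 'early stopping' driven by val_loss."""
    best = float("inf")
    lr_wait = 0
    stop_wait = 0
    events = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # Improvement: remember it and reset both counters
            best = loss
            lr_wait = 0
            stop_wait = 0
            continue
        lr_wait += 1
        stop_wait += 1
        if lr_wait >= lr_patience:
            lr *= factor
            lr_wait = 0
            events.append((epoch, "reduce_lr"))
        if stop_wait >= stop_patience:
            events.append((epoch, "early_stop"))
            break
    return events, lr

# val_loss values copied from the training log in this post
log_val_losses = [52.7574, 36.2607, 64.8398, 41.3809, 55.5916, 56.6968, 38.1627]
events, final_lr = simulate_patience(log_val_losses)

print(events)
print(final_lr)
```

Under these simplifications the learning rate drops at epoch 5 and training stops at epoch 7, which matches what the log reports.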
Starting the training
from keras.optimizers import Adam
from yolo3.utils import get_random_data
from keras.models import load_model
from yolo3.model import yolo_body
from keras.layers import Input
# Make every layer trainable
for i in range(len(model.layers)):
    model.layers[i].trainable = True
# Use the Adam optimizer and set the learning rate
learning_rate = 1e-5
# The model's output is already the loss value, so the Keras loss simply passes y_pred through
model.compile(optimizer=Adam(lr=learning_rate), loss={'yolo_loss': lambda y_true, y_pred: y_pred})
# Set the batch size and the number of epochs
batch_size = 16
max_epochs = 10
print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))
if not os.path.exists(save_path):
    os.makedirs(save_path)
# Resume from previously saved weights if they exist
if os.path.exists(os.path.join(save_path, 'trained_weights_final.h5')):
    model.load_weights(os.path.join(save_path, 'trained_weights_final.h5'))
# Start training
model.fit_generator(data_generator_wrapper(lines[:num_train], batch_size, input_shape, data_path, anchors, num_classes),
                    steps_per_epoch=max(1, num_train//batch_size),
                    validation_data=data_generator_wrapper(lines[num_train:], batch_size, input_shape, data_path, anchors, num_classes),
                    validation_steps=max(1, num_val//batch_size),
                    epochs=max_epochs,
                    initial_epoch=0,
                    callbacks=[reduce_lr, early_stopping, checkpoint])
Train on 179 samples, val on 19 samples, with batch size 16.
Epoch 1/10
11/11 [==============================] - 28s 3s/step - loss: 36.3566 - val_loss: 52.7574
Epoch 00001: val_loss did not improve from 44.60529
Epoch 2/10
11/11 [==============================] - 7s 631ms/step - loss: 36.2843 - val_loss: 36.2607
Epoch 00002: val_loss improved from 44.60529 to 36.26068, saving model to ./result/models/trained_weights_final.h5
Epoch 3/10
11/11 [==============================] - 8s 713ms/step - loss: 36.0089 - val_loss: 64.8398
Epoch 00003: val_loss did not improve from 36.26068
Epoch 4/10
11/11 [==============================] - 12s 1s/step - loss: 33.5629 - val_loss: 41.3809
Epoch 00004: val_loss did not improve from 36.26068
Epoch 5/10
11/11 [==============================] - 12s 1s/step - loss: 35.3097 - val_loss: 55.5916
Epoch 00005: ReduceLROnPlateau reducing learning rate to 9.999999747378752e-07.
Epoch 00005: val_loss did not improve from 36.26068
Epoch 6/10
11/11 [==============================] - 10s 866ms/step - loss: 34.0317 - val_loss: 56.6968
Epoch 00006: val_loss did not improve from 36.26068
Epoch 7/10
11/11 [==============================] - 12s 1s/step - loss: 34.5008 - val_loss: 38.1627
Epoch 00007: val_loss did not improve from 36.26068
Epoch 00007: early stopping
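The `11/11` step counter in the log follows from the `steps_per_epoch` expression passed to `fit_generator`. A small sketch of that arithmetic, using the sample counts printed in the first log line (`steps_for` is a hypothetical helper name):

```python
def steps_for(num_samples, batch_size):
    """Number of full batches per epoch, with a floor of one step."""
    return max(1, num_samples // batch_size)

batch_size = 16
train_steps = steps_for(179, batch_size)
val_steps = steps_for(19, batch_size)
samples_per_epoch = train_steps * batch_size

print(train_steps, val_steps, samples_per_epoch)
```

Integer division drops the remainder, so each epoch draws 176 of the 179 training samples. Because the generator keeps its position between epochs, the leftover samples are picked up at the start of the next one.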
Model testing
Opening a test image
from PIL import Image
import numpy as np
# Path to the test image
test_file_path = './lizhi.jpg'
# Open the test image
image = Image.open(test_file_path)
image_ori = np.array(image)
image_ori.shape
(978, 978, 3)
Image preprocessing
from yolo3.utils import letterbox_image
new_image_size = (image.width - (image.width % 32), image.height - (image.height % 32))
boxed_image = letterbox_image(image, new_image_size)
image_data = np.array(boxed_image, dtype='float32')
image_data /= 255.
image_data = np.expand_dims(image_data, 0)
image_data.shape
(1, 960, 960, 3)
import keras.backend as K
sess = K.get_session()
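The preprocessing above first rounds each side of the image down to a multiple of 32, because the network downsamples by a factor of up to 32, and then letterboxes the image into that size. The sketch below reproduces only the geometry, assuming `letterbox_image` does an aspect-preserving resize with centered padding. It uses integer arithmetic of my own choosing, not the library's code, and both function names are hypothetical.

```python
def round_down_to_multiple(value, base=32):
    """Largest multiple of `base` that does not exceed `value`."""
    return value - (value % base)

def letterbox_geometry(image_size, target_size):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit."""
    iw, ih = image_size
    w, h = target_size
    if w * ih <= h * iw:
        # Width is the limiting side
        nw, nh = w, ih * w // iw
    else:
        # Height is the limiting side
        nw, nh = iw * h // ih, h
    return nw, nh, (w - nw) // 2, (h - nh) // 2

# The 978x978 test image from this post
side = round_down_to_multiple(978)
square_fit = letterbox_geometry((978, 978), (side, side))

# A hypothetical 1280x720 frame fitted into a 416x416 input
wide_fit = letterbox_geometry((1280, 720), (416, 416))

print(side, square_fit, wide_fit)
```

For the square test image no padding is needed. The wide frame is scaled to 416x234 and padded by 91 pixels at the top and bottom.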
Building the model
from yolo3.model import yolo_body
from keras.layers import Input
# Location of the COCO anchor file
anchor_path = "./model_data/yolo_anchors.txt"
with open(anchor_path) as f:
    anchors = f.readline()
anchors = [float(x) for x in anchors.split(',')]
anchors = np.array(anchors).reshape(-1, 2)
yolo_model = yolo_body(Input(shape=(None,None,3)), len(anchors)//3, num_classes)
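The `len(anchors)//3` argument reflects that the nine anchors are divided evenly across the three detection scales. The sketch below shows one way to express that grouping, assuming the common convention that the largest anchors go to the coarsest grid (stride 32). The post does not show how `yolo3.model` assigns them, so treat the mapping as an assumption. The anchor values are the nine COCO anchors from the YOLOv3 paper, and `group_anchors_by_scale` is a hypothetical helper.

```python
def group_anchors_by_scale(anchors, strides=(32, 16, 8)):
    """Sort anchors by area and give the largest group to the coarsest stride."""
    per_scale = len(anchors) // len(strides)
    ordered = sorted(anchors, key=lambda wh: wh[0] * wh[1], reverse=True)
    return {s: ordered[k * per_scale:(k + 1) * per_scale]
            for k, s in enumerate(strides)}

# The nine COCO anchors (width, height) listed in the YOLOv3 paper
paper_anchors = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
                 (59, 119), (116, 90), (156, 198), (373, 326)]

groups = group_anchors_by_scale(paper_anchors)
for stride, group in groups.items():
    print("stride", stride, "->", group)
```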
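Load the model weights, or replace the path with that of the model produced by the training step above
# Path to the model weights
weights_path = os.path.join(save_path, 'trained_weights_final.h5')
yolo_model.load_weights(weights_path)
Define the IOU and score thresholds:
IOU: a bounding box whose intersection-over-union with a higher-scoring box exceeds this threshold is removed as redundant
score: only bounding boxes whose predicted score exceeds this threshold are kept
iou = 0.45
score = 0.8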
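How the two thresholds act together can be sketched in plain Python. This is a single-class, greedy non-maximum suppression written for illustration only; it is not the `yolo_eval` implementation, which runs on tensors and typically applies suppression per class. The boxes and scores are made up, and boxes use the (top, left, bottom, right) order seen in the prediction output.

```python
def box_iou(a, b):
    """Intersection-over-union of two (top, left, bottom, right) boxes."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def filter_and_nms(boxes, scores, score_threshold=0.8, iou_threshold=0.45):
    """Drop low-scoring boxes, then greedily drop boxes that overlap a better one."""
    order = sorted((k for k in range(len(boxes)) if scores[k] >= score_threshold),
                   key=lambda k: scores[k], reverse=True)
    keep = []
    for k in order:
        if all(box_iou(boxes[k], boxes[j]) <= iou_threshold for j in keep):
            keep.append(k)
    return keep

# Made-up boxes: 0 and 1 overlap heavily, 2 is separate, 3 scores too low
demo_boxes = [(0, 0, 100, 100), (10, 10, 110, 110), (200, 200, 300, 300), (0, 0, 50, 50)]
demo_scores = [0.95, 0.90, 0.85, 0.60]

overlap = box_iou(demo_boxes[0], demo_boxes[1])
kept = filter_and_nms(demo_boxes, demo_scores)

print(round(overlap, 4), kept)
```

Box 1 overlaps box 0 with an IoU of about 0.68, above the 0.45 threshold, so only the higher-scoring box 0 survives. Box 3 never reaches suppression because its score is below 0.8.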
Building the outputs [boxes, scores, classes]
from yolo3.model import yolo_eval
input_image_shape = K.placeholder(shape=(2, ))
boxes, scores, classes = yolo_eval(
    yolo_model.output,
    anchors,
    num_classes,
    input_image_shape,
    score_threshold=score,
    iou_threshold=iou)
Running the prediction
out_boxes, out_scores, out_classes = sess.run(
    [boxes, scores, classes],
    feed_dict={
        yolo_model.input: image_data,
        input_image_shape: [image.size[1], image.size[0]],
        K.learning_phase(): 0
    })
# Map the predicted class indices to class names
class_coco = get_classes(classes_path)
out_coco = []
for i in out_classes:
    out_coco.append(class_coco[i])
print(out_boxes)
print(out_scores)
print(out_coco)
[[238.1432 283.5006 775.9345 796.47186 ]
[ 42.37368 306.53665 520.77747 836.2313 ]
[172.68872 434.2144 708.88727 953.6306 ]
[362.78308 414.0043 905.73486 918.5013 ]
[316.41858 168.52351 890.3133 645.3454 ]
[114.531235 204.28731 636.6221 676.35187 ]
[671.0529 41.569946 968.4 320.18176 ]]
[0.9968477 0.9943246 0.96714234 0.9387743 0.93621033 0.9123165
0.8179135 ]
['person', 'person', 'person', 'person', 'person', 'person', 'sandwich']
Drawing the predictions on the image
from PIL import Image, ImageFont, ImageDraw
font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
                          size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))
thickness = (image.size[0] + image.size[1]) // 300
for i, c in reversed(list(enumerate(out_coco))):
    predicted_class = c
    box = out_boxes[i]
    score = out_scores[i]
    label = '{} {:.2f}'.format(predicted_class, score)
    draw = ImageDraw.Draw(image)
    label_size = draw.textsize(label, font)
    # Boxes are (top, left, bottom, right); round them and clamp to the image bounds
    top, left, bottom, right = box
    top = max(0, np.floor(top + 0.5).astype('int32'))
    left = max(0, np.floor(left + 0.5).astype('int32'))
    bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
    right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
    print(label, (left, top), (right, bottom))
    # Put the label above the box when there is room, otherwise just inside it
    if top - label_size[1] >= 0:
        text_origin = np.array([left, top - label_size[1]])
    else:
        text_origin = np.array([left, top + 1])
    # Draw nested rectangles to get a thicker outline
    for t in range(thickness):
        draw.rectangle(
            [left + t, top + t, right - t, bottom - t],
            outline=225)
    draw.rectangle(
        [tuple(text_origin), tuple(text_origin + label_size)],
        fill=225)
    draw.text(text_origin, label, fill=(0, 0, 0), font=font)
    del draw
sandwich 0.82 (42, 671) (320, 968)
person 0.91 (204, 115) (676, 637)
person 0.94 (169, 316) (645, 890)
person 0.94 (414, 363) (919, 906)
person 0.97 (434, 173) (954, 709)
person 0.99 (307, 42) (836, 521)
person 1.00 (284, 238) (796, 776)
image
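The corner coordinates printed above come from rounding each box edge to the nearest pixel and clamping it to the image. Here is a minimal sketch of that step using `math.floor` instead of NumPy, applied to the 'sandwich' box from the prediction output on the 978x978 test image. `to_pixel_box` is a hypothetical helper name.

```python
import math

def to_pixel_box(box, image_size):
    """Round a (top, left, bottom, right) float box and clamp it to the image."""
    width, height = image_size
    top, left, bottom, right = box
    top = max(0, math.floor(top + 0.5))
    left = max(0, math.floor(left + 0.5))
    bottom = min(height, math.floor(bottom + 0.5))
    right = min(width, math.floor(right + 0.5))
    return (left, top), (right, bottom)

# The 'sandwich' and highest-scoring 'person' boxes from the output in this post
sandwich_corners = to_pixel_box((671.0529, 41.569946, 968.4, 320.18176), (978, 978))
person_corners = to_pixel_box((238.1432, 283.5006, 775.9345, 796.47186), (978, 978))

print(sandwich_corners)
print(person_corners)
```

Note the swap in the return value: the model reports boxes as (top, left, bottom, right), while the drawing code works with (x, y) corner points.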
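Final result
I took this photo of my classmates during class, and I rather miss those days. It was our last internship and graduation was close. A spring like that always brings an inexplicable sadness; is this what they call grieving over spring and lamenting autumn? No wonder classical prose is so sentimental.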
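Reflections
I had always assumed that the FPN approach in YOLOv3 improves accuracy because of joint training. The recent YOLOF paper takes a different angle and shows that FPN's effectiveness comes from its divide-and-conquer strategy rather than from feature fusion. To go further with neural networks, I think one should run validation studies on why an architecture works and think about the more fundamental questions. It all feels a bit like black magic!
The dataset used above is really too small and the model loss is fairly high; training further on more data should help. The test results are also a bit off, but let's leave it at that!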