Training YOLOv3-Keras with a Pretrained Model

小凡爱学习

Article link: https://blog.csdn.net/zhouxuechao/article/details/115271325

Training YOLOv3 with a Pretrained Model

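Preface

The test image for this training run was kindly provided by my good apprentice Lizhi. Lizhi's outfit apparently looks like a sandwich, so classmates, come and help my apprentice pick a new outfit!

Classic paper: YOLOv3: An Incremental Improvement

Paper download: https://pjreddie.com/media/files/papers/YOLOv3.pdf

Chinese translation: https://www.taominze.com/index.php/2019/09/11/512/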

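YOLOv3 model characteristics

YOLOv3 uses DarkNet53 as its backbone network and adds residual blocks to it. It also borrows SSD's use of multi-layer features, detecting objects of different sizes on feature maps taken from different layers, which improves detection accuracy for small objects.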
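
To make the multi-scale idea concrete, here is a minimal sketch of the three prediction grids. The strides 32, 16 and 8 are the ones used in the network-building code later in this post; the 9 anchors and 80 classes are my assumption for the standard COCO setup, and `yolo_output_shapes` is a hypothetical helper, not part of any library.

```python
def yolo_output_shapes(input_hw, num_anchors, num_classes, strides=(32, 16, 8)):
    """Return (grid_h, grid_w, anchors_per_scale, num_classes + 5) for each scale."""
    h, w = input_hw
    anchors_per_scale = num_anchors // len(strides)
    # Each cell predicts, per anchor: 4 box values + 1 objectness + class scores.
    return [(h // s, w // s, anchors_per_scale, num_classes + 5) for s in strides]


# Assumed setup: 416x416 input, 9 anchors, 80 COCO classes.
shapes = yolo_output_shapes((416, 416), num_anchors=9, num_classes=80)
for stride, shape in zip((32, 16, 8), shapes):
    print("stride", stride, "->", shape)
```

The coarse 13x13 grid (stride 32) is suited to large objects, while the fine 52x52 grid (stride 8) is the one that helps with small objects.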

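The main features and improvements of YOLOv3 are:

Multi-scale prediction (introducing FPN);
A better base classification network (darknet-53, which adopts residual connections similar to ResNet);
The classifier no longer uses Softmax; the classification loss is binary cross-entropy.

YOLOv3 does not use Softmax to classify each box, for two main reasons:

Softmax assigns exactly one class to each box (the one with the highest score), but in a dataset such as Open Images an object can carry overlapping class labels, so Softmax is not suitable for multi-label classification;
Softmax can be replaced by multiple independent logistic classifiers without a drop in accuracy.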
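
The difference between the two classifiers can be sketched in a few lines of plain Python. The labels and logits below are made-up values for illustration only; the point is that Softmax scores compete with each other, while independent logistic (sigmoid) outputs can all be high at once.

```python
import math


def softmax(logits):
    """Softmax over a list of logits; the outputs always sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Independent logistic score for a single class."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_cross_entropy(targets, probs, eps=1e-7):
    """Mean binary cross-entropy over all classes of one box."""
    losses = []
    for t, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        losses.append(-(t * math.log(p) + (1 - t) * math.log(1.0 - p)))
    return sum(losses) / len(losses)


# Made-up example: one box that is both a "person" and a "woman".
labels = ["person", "woman", "car"]
logits = [3.0, 2.5, -4.0]
targets = [1, 1, 0]

soft = softmax(logits)
sig = [sigmoid(x) for x in logits]
bce = binary_cross_entropy(targets, sig)

for name, s, g in zip(labels, soft, sig):
    print("{:7s} softmax={:.3f} sigmoid={:.3f}".format(name, s, g))
print("binary cross-entropy:", round(bce, 4))
```

With Softmax, "woman" is pushed below 0.5 simply because "person" scores higher, whereas both sigmoid scores stay above 0.9.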

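Model architecture

Here are some handwritten notes from an expert; handwritten material just has a certain flavor to it!

[Image: handwritten notes on the YOLOv3 architecture]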

Preparing the data

# Define file paths
from train import get_classes, get_anchors
# Path to the data files
data_path = "./coco/coco_data"
# Location of the COCO class definition file
classes_path = './model_data/coco_classes.txt'
# Location of the COCO anchor values file
anchors_path = './model_data/yolo_anchors.txt'
# Location of the COCO annotation file
annotation_path = './coco/coco_train.txt'
# Location of the pretrained weights file
weights_path = "./model_data/yolo.h5"
# Directory where trained models are saved
save_path = "./result/models/"

classes = get_classes(classes_path)
anchors = get_anchors(anchors_path)
# Number of classes and number of anchors
num_classes = len(classes)
num_anchors = len(anchors)
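
For readers without the repository at hand, here is a rough sketch of what the two loaders plausibly do. I am assuming one class name per line in the classes file, and a single comma-separated line of width,height pairs in the anchors file (the same layout that the model-building code in the testing section parses). `read_classes` and `read_anchors` are hypothetical stand-ins, not the real `get_classes` and `get_anchors`, and the file contents are illustrative.

```python
import os
import tempfile


def read_classes(path):
    """Assumed format: one class name per line."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def read_anchors(path):
    """Assumed format: a single line 'w1,h1, w2,h2, ...'."""
    with open(path, encoding="utf-8") as f:
        values = [float(x) for x in f.readline().split(",")]
    # Group the flat list into (width, height) pairs.
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


# Illustrative files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    classes_file = os.path.join(tmp, "classes.txt")
    anchors_file = os.path.join(tmp, "anchors.txt")
    with open(classes_file, "w", encoding="utf-8") as f:
        f.write("person\nbicycle\ncar\n")
    with open(anchors_file, "w", encoding="utf-8") as f:
        f.write("10,13, 16,30, 33,23")
    classes_demo = read_classes(classes_file)
    anchors_demo = read_anchors(anchors_file)

print(classes_demo)
print(anchors_demo)
```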

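Selecting the annotation data

import numpy as np

# Fraction of the data held out for validation
val_split = 0.1
with open(annotation_path) as f:
    lines = f.readlines()
# Shuffle with a fixed seed so the train/validation split is reproducible,
# then reset the seed so later random operations are not fixed
np.random.seed(10101)
np.random.shuffle(lines)
np.random.seed(None)
num_val = int(len(lines)*val_split)
num_train = len(lines) - num_val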
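
The split logic can be tried without any data files. The sketch below swaps NumPy's global seed for a local `random.Random` instance (my substitution, so the shuffled order will differ from the code above), and uses 198 fake annotation lines, which is the total implied by the training log later in this post (179 + 19). `split_lines` is a hypothetical helper.

```python
import random


def split_lines(lines, val_split=0.1, seed=10101):
    """Shuffle a copy of `lines` reproducibly and split it into train/val lists."""
    rng = random.Random(seed)
    shuffled = list(lines)
    rng.shuffle(shuffled)
    num_val = int(len(shuffled) * val_split)
    num_train = len(shuffled) - num_val
    return shuffled[:num_train], shuffled[num_train:]


# Fake annotation lines standing in for the real file contents.
demo_lines = ["img_{:03d}.jpg".format(i) for i in range(198)]
train_lines, val_lines = split_lines(demo_lines)
train_again, val_again = split_lines(demo_lines)

print(len(train_lines), len(val_lines))
```

Because the seed is fixed, calling the function twice gives the same split, which keeps validation numbers comparable between runs.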

Building the data generator with data augmentation

# Both helpers come from the yolo3 package used throughout this post
from yolo3.utils import get_random_data
from yolo3.model import preprocess_true_boxes

def data_generator(annotation_lines, batch_size, input_shape, data_path,anchors, num_classes):
    n = len(annotation_lines)
    i = 0
    while True:
        image_data = []
        box_data = []
        for b in range(batch_size):
            # Reshuffle every time the index wraps around to the start
            if i==0:
                np.random.shuffle(annotation_lines)
            image, box = get_random_data(annotation_lines[i], input_shape, data_path,random=True) # load one sample with random augmentation
            image_data.append(image)
            box_data.append(box)
            i = (i+1) % n
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes) # preprocess the annotated boxes into training targets
        # The loss is computed inside the model, so the target here is a dummy array
        yield [image_data, *y_true], np.zeros(batch_size)

def data_generator_wrapper(annotation_lines, batch_size, input_shape, data_path,anchors, num_classes):
    n = len(annotation_lines)
    if n==0 or batch_size<=0: return None
    return data_generator(annotation_lines, batch_size, input_shape, data_path,anchors, num_classes)
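
The index bookkeeping in the generator is easier to see with the image loading stripped out. Below is a toy stand-in, assuming nothing beyond the standard library: `cyclic_batches` is a hypothetical name, and integers take the place of annotation lines.

```python
import random


def cyclic_batches(samples, batch_size, seed=0):
    """Yield batches forever, reshuffling each time the index wraps to 0."""
    samples = list(samples)
    if not samples or batch_size <= 0:
        raise ValueError("need at least one sample and a positive batch size")
    rng = random.Random(seed)
    n = len(samples)
    i = 0
    while True:
        batch = []
        for _ in range(batch_size):
            if i == 0:
                rng.shuffle(samples)  # a new pass over the data starts here
            batch.append(samples[i])
            i = (i + 1) % n
        yield batch


gen = cyclic_batches(range(5), batch_size=2)
batches = [next(gen) for _ in range(3)]  # 6 draws from 5 samples
flat = [x for batch in batches for x in batch]
print(batches)

# Steps per epoch as computed in the training call, for 179 samples and batch size 16.
steps = max(1, 179 // 16)
print(steps)
```

The first five draws cover every sample exactly once; the sixth already belongs to the next, freshly shuffled pass, so a batch can straddle two passes over the data.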

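Model training

This example builds the YOLOv3 neural network with the Keras deep learning framework.

The source code can be found in the corresponding folders.

Building the network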

import keras.backend as K
from yolo3.model import preprocess_true_boxes, yolo_body, yolo_loss
from keras.layers import Input, Lambda
from keras.models import Model
# Reset the session
K.clear_session()

# Input image size
input_shape = (416, 416)
image_input = Input(shape=(None, None, 3))
h, w = input_shape
# Ground-truth inputs for the three detection scales (downsampling factors 32, 16 and 8)
y_true = [Input(shape=(h//{0:32, 1:16, 2:8}[l], w//{0:32, 1:16, 2:8}[l], num_anchors//3, num_classes+5))
          for l in range(3)]

# Build the YOLO model structure
model_body = yolo_body(image_input, num_anchors//3, num_classes)

# Load the YOLO weights; to train from scratch without pretrained weights, remove this line
model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)

# Define the YOLO loss as a Lambda layer
model_loss = Lambda(yolo_loss, output_shape=(1,), name='yolo_loss',
    arguments={'anchors': anchors, 'num_classes': num_classes, 'ignore_thresh': 0.5})([*model_body.output, *y_true])

# Build the Model used for training
model = Model([model_body.input, *y_true], model_loss)

# Print the layer structure of the model
model.summary()

Defining the training callbacks

from keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint
import os

# Define the callbacks
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1) # learning-rate decay
early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=1) # early stopping
# Checkpointing: saves the best weights so training can be resumed
checkpoint = ModelCheckpoint(os.path.join(save_path, 'trained_weights_final.h5'),
                             monitor='val_loss', verbose=1, save_best_only=True, save_weights_only=True, mode='auto', period=1)

Starting training

from keras.optimizers import Adam
from yolo3.utils import get_random_data

# Make all layers trainable
for i in range(len(model.layers)):
    model.layers[i].trainable = True

# Use the Adam optimizer and set the learning rate.
# The 'yolo_loss' layer already outputs the loss value, so the Keras loss just passes y_pred through.
learning_rate = 1e-5
model.compile(optimizer=Adam(lr=learning_rate), loss={'yolo_loss': lambda y_true, y_pred: y_pred})

# Set the batch size and the number of epochs
batch_size = 16
max_epochs = 10
print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))

# Create the output directory and resume from earlier weights if they exist
if not os.path.exists(save_path):
    os.makedirs(save_path)
if os.path.exists(os.path.join(save_path, 'trained_weights_final.h5')):
    model.load_weights(os.path.join(save_path, 'trained_weights_final.h5'))

# Start training
model.fit_generator(data_generator_wrapper(lines[:num_train], batch_size, input_shape, data_path,anchors, num_classes),
    steps_per_epoch=max(1, num_train//batch_size),
    validation_data=data_generator_wrapper(lines[num_train:], batch_size, input_shape, data_path,anchors, num_classes),
    validation_steps=max(1, num_val//batch_size),
    epochs=max_epochs,
    initial_epoch=0,
    callbacks=[reduce_lr, early_stopping, checkpoint])

Train on 179 samples, val on 19 samples, with batch size 16.
Epoch 1/10
11/11 [==============================] - 28s 3s/step - loss: 36.3566 - val_loss: 52.7574

Epoch 00001: val_loss did not improve from 44.60529
Epoch 2/10
11/11 [==============================] - 7s 631ms/step - loss: 36.2843 - val_loss: 36.2607

Epoch 00002: val_loss improved from 44.60529 to 36.26068, saving model to ./result/models/trained_weights_final.h5
Epoch 3/10
11/11 [==============================] - 8s 713ms/step - loss: 36.0089 - val_loss: 64.8398

Epoch 00003: val_loss did not improve from 36.26068
Epoch 4/10
11/11 [==============================] - 12s 1s/step - loss: 33.5629 - val_loss: 41.3809

Epoch 00004: val_loss did not improve from 36.26068
Epoch 5/10
11/11 [==============================] - 12s 1s/step - loss: 35.3097 - val_loss: 55.5916

Epoch 00005: ReduceLROnPlateau reducing learning rate to 9.999999747378752e-07.

Epoch 00005: val_loss did not improve from 36.26068
Epoch 6/10
11/11 [==============================] - 10s 866ms/step - loss: 34.0317 - val_loss: 56.6968

Epoch 00006: val_loss did not improve from 36.26068
Epoch 7/10
11/11 [==============================] - 12s 1s/step - loss: 34.5008 - val_loss: 38.1627

Epoch 00007: val_loss did not improve from 36.26068
Epoch 00007: early stopping
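
The log above can be replayed by hand to see why the learning rate dropped at epoch 5 and training stopped at epoch 7. The sketch below is a simplified re-implementation of the patience counters, not the Keras code itself: it ignores details such as `min_delta` and cooldown, and `simulate_callbacks` is a hypothetical helper. The validation losses are copied from the log.

```python
def simulate_callbacks(val_losses, lr, factor=0.1, lr_patience=3, stop_patience=5):
    """Replay simplified plateau and early-stopping logic; return (events, final_lr)."""
    events = []
    lr_best = stop_best = float("inf")
    lr_wait = stop_wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        # Learning-rate reduction: count epochs without a new best val_loss.
        if loss < lr_best:
            lr_best, lr_wait = loss, 0
        else:
            lr_wait += 1
            if lr_wait >= lr_patience:
                lr *= factor
                lr_wait = 0
                events.append((epoch, "reduce_lr"))
        # Early stopping: same idea with its own counter and patience.
        if loss < stop_best:
            stop_best, stop_wait = loss, 0
        else:
            stop_wait += 1
            if stop_wait >= stop_patience:
                events.append((epoch, "early_stop"))
                break
    return events, lr


# val_loss values from epochs 1 to 7 of the log above.
val_losses = [52.7574, 36.2607, 64.8398, 41.3809, 55.5916, 56.6968, 38.1627]
events, final_lr = simulate_callbacks(val_losses, lr=1e-5)
print(events)
print(final_lr)
```

The best value is reached at epoch 2; three epochs without improvement trigger the reduction from 1e-5 to roughly 1e-6, and five trigger the stop.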

Model testing

Opening a test image

from PIL import Image
import numpy as np
# Path to the test file
test_file_path = './lizhi.jpg'
# Open the test file
image = Image.open(test_file_path)
image_ori = np.array(image)
image_ori.shape

(978, 978, 3)

Image preprocessing

from yolo3.utils import letterbox_image

new_image_size = (image.width - (image.width % 32), image.height - (image.height % 32))
boxed_image = letterbox_image(image, new_image_size)
image_data = np.array(boxed_image, dtype='float32')
image_data /= 255.
image_data = np.expand_dims(image_data, 0)
image_data.shape

(1, 960, 960, 3)
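
The size arithmetic in this step can be checked in isolation. The sketch below rounds each side down to a multiple of 32, as in the code above, and then computes where an aspect-preserving resize would place the image on the padded canvas. The second part is my assumption about what a letterbox step does (resize without distortion, pad the rest), written with integer arithmetic; both function names are hypothetical, and the 1280x720 case is an illustrative value.

```python
def round_down_to_multiple(size, multiple=32):
    """Round (width, height) down to the nearest multiple, as in the step above."""
    w, h = size
    return (w - w % multiple, h - h % multiple)


def letterbox_geometry(src_size, dst_size):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit."""
    iw, ih = src_size
    w, h = dst_size
    if iw * h >= ih * w:
        # Width is the limiting side: fill the width and scale the height.
        nw, nh = w, ih * w // iw
    else:
        # Height is the limiting side: fill the height and scale the width.
        nw, nh = iw * h // ih, h
    return nw, nh, (w - nw) // 2, (h - nh) // 2


target = round_down_to_multiple((978, 978))
square = letterbox_geometry((978, 978), target)
wide = letterbox_geometry((1280, 720), (416, 416))
print(target, square, wide)
```

For the 978x978 test image no padding is needed, while a 1280x720 frame fitted into 416x416 would get bars above and below.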

import keras.backend as K
sess = K.get_session()

Building the model

from yolo3.model import yolo_body
from keras.layers import Input
# Location of the COCO anchor values file
anchor_path = "./model_data/yolo_anchors.txt"
with open(anchor_path) as f:
    anchors = f.readline()
anchors = [float(x) for x in anchors.split(',')]
anchors = np.array(anchors).reshape(-1, 2)
yolo_model = yolo_body(Input(shape=(None,None,3)), len(anchors)//3, num_classes)

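Load the model weights. The path can be replaced with the path of the model produced by the training step above.

# Path to the stored model weights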
weights_path = os.path.join(save_path, 'trained_weights_final.h5')
yolo_model.load_weights(weights_path)

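Defining the IOU and score thresholds:

IOU: a box whose intersection over union with a higher-scoring box exceeds this value is removed as redundant
score: only boxes whose prediction score exceeds this value are kept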

iou = 0.45
score = 0.8
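
A minimal sketch of how these two thresholds interact, written in plain Python so it runs without TensorFlow. It is class-agnostic for brevity (a real detector may apply suppression per class), boxes use the (top, left, bottom, right) order that appears in the outputs below, and `box_iou` and `filter_and_nms` are hypothetical helpers with made-up example boxes.

```python
def box_iou(a, b):
    """Intersection over union of two (top, left, bottom, right) boxes."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, bottom - top) * max(0.0, right - left)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(boxes, scores, score_threshold=0.8, iou_threshold=0.45):
    """Drop low-score boxes, then greedily suppress overlapping ones."""
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


demo_boxes = [(0, 0, 100, 100), (5, 5, 105, 105), (200, 200, 300, 300), (0, 0, 50, 50)]
demo_scores = [0.95, 0.90, 0.85, 0.30]
kept = filter_and_nms(demo_boxes, demo_scores)
print(kept)
```

Box 1 overlaps box 0 with an IoU of about 0.82 and is suppressed, box 3 falls below the score threshold, and the distant box 2 survives.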

Building the outputs [boxes, scores, classes]

from yolo3.model import yolo_eval
input_image_shape = K.placeholder(shape=(2, ))
boxes, scores, classes = yolo_eval(
    yolo_model.output, 
    anchors,
    num_classes,
    input_image_shape,
    score_threshold=score, 
    iou_threshold=iou)

Running the prediction

out_boxes, out_scores, out_classes = sess.run(
    [boxes, scores, classes],
    feed_dict={
        yolo_model.input: image_data,
        input_image_shape: [image.size[1], image.size[0]],
        K.learning_phase(): 0
    })

class_coco = get_classes(classes_path)
out_coco = []
for i in out_classes:
    out_coco.append(class_coco[i])

print(out_boxes)
print(out_scores)
print(out_coco)

[[238.1432   283.5006   775.9345   796.47186 ]
 [ 42.37368  306.53665  520.77747  836.2313  ]
 [172.68872  434.2144   708.88727  953.6306  ]
 [362.78308  414.0043   905.73486  918.5013  ]
 [316.41858  168.52351  890.3133   645.3454  ]
 [114.531235 204.28731  636.6221   676.35187 ]
 [671.0529    41.569946 968.4      320.18176 ]]
[0.9968477  0.9943246  0.96714234 0.9387743  0.93621033 0.9123165
 0.8179135 ]
['person', 'person', 'person', 'person', 'person', 'person', 'sandwich']

Drawing the predictions on the image

from PIL import Image, ImageFont, ImageDraw

font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
                    size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))

thickness = (image.size[0] + image.size[1]) // 300

for i, c in reversed(list(enumerate(out_coco))):
    predicted_class = c
    box = out_boxes[i]
    # Use a separate name so the `score` threshold defined earlier is not overwritten
    box_score = out_scores[i]

    label = '{} {:.2f}'.format(predicted_class, box_score)
    draw = ImageDraw.Draw(image)
    label_size = draw.textsize(label, font)

    # Round the box to integer pixels and clamp it to the image bounds
    top, left, bottom, right = box
    top = max(0, np.floor(top + 0.5).astype('int32'))
    left = max(0, np.floor(left + 0.5).astype('int32'))
    bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
    right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
    print(label, (left, top), (right, bottom))

    # Put the label above the box if there is room, otherwise just inside it
    if top - label_size[1] >= 0:
        text_origin = np.array([left, top - label_size[1]])
    else:
        text_origin = np.array([left, top + 1])

    # Draw nested rectangles to get a thicker outline
    for t in range(thickness):
        draw.rectangle(
            [left + t, top + t, right - t, bottom - t],
            outline=225)
    draw.rectangle(
        [tuple(text_origin), tuple(text_origin + label_size)],
        fill=225)
    draw.text(text_origin, label, fill=(0, 0, 0), font=font)
    del draw

sandwich 0.82 (42, 671) (320, 968)
person 0.91 (204, 115) (676, 637)
person 0.94 (169, 316) (645, 890)
person 0.94 (414, 363) (919, 906)
person 0.97 (434, 173) (954, 709)
person 0.99 (307, 42) (836, 521)
person 1.00 (284, 238) (796, 776)
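
The printed corners can be reproduced from the raw box values without NumPy or PIL. This sketch repeats the rounding and clamping from the drawing loop using `math.floor`; the box is the "sandwich" row from the prediction output, the 978x978 size comes from the image shape printed earlier, and `to_pixel_box` is a hypothetical helper.

```python
import math


def to_pixel_box(box, image_size):
    """Convert a float (top, left, bottom, right) box to clamped integer corners."""
    width, height = image_size
    top, left, bottom, right = box
    top = max(0, math.floor(top + 0.5))
    left = max(0, math.floor(left + 0.5))
    bottom = min(height, math.floor(bottom + 0.5))
    right = min(width, math.floor(right + 0.5))
    # Same order as the print in the drawing loop: (left, top), (right, bottom).
    return (left, top), (right, bottom)


corners = to_pixel_box((671.0529, 41.569946, 968.4, 320.18176), (978, 978))
print(corners)  # ((42, 671), (320, 968))
```

Note that the model outputs boxes as (top, left, bottom, right), while the drawing code works with (x, y) corners, which is why the values appear swapped in the printed lines.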

image

[Image: the test photo with the predicted boxes and labels drawn on it]

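Final result

I took this photo of my classmates during class, and I already miss that time. It is our last internship and graduation is near. A spring like this always brings an inexplicable sadness; is this what they mean by grieving over spring and autumn? There is a reason classical prose is so sentimental.

Reflections

I had always assumed that FPN in YOLOv3 improves accuracy because of joint training, but the recent YOLOF paper takes a different angle and shows that FPN's effectiveness comes from its divide-and-conquer strategy rather than from feature fusion. To go further in learning neural networks, I think one should run some validation experiments on why model architectures work and think about the more fundamental questions. It all feels a bit like alchemy!
The dataset used above is really too small, so the model loss is fairly high; training further on a larger dataset should help. The test results are also a bit off, but let's leave it at that!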