最近开始使用Keras来做深度学习,发现模型搭建相较于MXnet, Caffe等确实比较方便,适合于新手练手,于是找来了目标检测经典的模型Faster-RCNN的keras代码来练练手,代码的主题部分转自知乎专栏Learning Machine,作者张潇捷,链接如下:
keras版faster-rcnn算法详解(1.RPN计算)
keras版faster-rcnn算法详解 (2.roi计算及其他)
我再对代码中loss的计算,config的设置等细节进行学习
Keras版Faster-RCNN代码学习(IOU,RPN)1
Keras版Faster-RCNN代码学习(Batch Normalization)2
Keras版Faster-RCNN代码学习(loss,xml解析)3
Keras版Faster-RCNN代码学习(roipooling resnet/vgg)4
Keras版Faster-RCNN代码学习(measure_map,train/test)5
config.py
from keras import backend as K
import math
class Config:
def __init__(self):
self.verbose = True
self.network = 'resnet50'
# setting for data augmentation
self.use_horizontal_flips = False
self.use_vertical_flips = False
self.rot_90 = False
# anchor box scales
self.anchor_box_scales = [128, 256, 512]
# anchor box ratios
self.anchor_box_ratios = [[1, 1], [1./math.sqrt(2), 2./math.sqrt(2)], [2./math.sqrt(2), 1./math.sqrt(2)]]
# size to resize the smallest side of the image
self.im_size = 600
# image channel-wise mean to subtract
self.img_channel_mean = [103.939, 116.779, 123.68]
self.img_scaling_factor = 1.0
# number of ROIs at once
self.num_rois = 4
# stride at the RPN (this depends on the network configuration)
self.rpn_stride = 16
self.balanced_classes = False
# scaling the stdev
self.std_scaling = 4.0
self.classifier_regr_std = [8.0, 8.0, 4.0, 4.0]
# overlaps for RPN
self.rpn_min_overlap = 0.3
self.rpn_max_overlap = 0.7
# overlaps for classifier ROIs
self.classifier_min_overlap = 0.1
self.classifier_max_overlap = 0.5
# placeholder for the class mapping, automatically generated by the parser
self.class_mapping = None
#location of pretrained weights for the base network
# weight files can be found at:
# https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_th_dim_ordering_th_kernels_notop.h5
# https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
self.model_path = 'model_frcnn.vgg.hdf5'
对代码所需要的参数进行配置
data_generators.py
import cv2
import numpy as np
import copy
#传递图像参数,增广配置参数,是否进行图像增广
def augment(img_data, config, augment=True):
assert 'filepath' in img_data
assert 'bboxes' in img_data
assert 'width' in img_data
assert 'height' in img_data
img_data_aug = copy.deepcopy(img_data)
img = cv2.imread(img_data_aug['filepath'])
if augment:
rows, cols = img.shape[:2]
#图像水平翻转,对应的bbox的对角坐标也进行水平翻转,翻转概率为50%
if config.use_horizontal_flips and np.random.randint(0, 2) == 0:
img = cv2.flip(img, 1)
for bbox in img_data_aug['bboxes']:
x1 = bbox['x1']
x2 = bbox['x2']
bbox['x2'] = cols - x1
bbox['x1'] = cols - x2
#图像垂直翻转,对应的bbox的对角坐标也进行垂直翻转,翻转概率为50%
if config.use_vertical_flips and np.random.randint(0, 2) == 0:
img =