https://www.jianshu.com/p/596e4171f7ad
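Anchor boxes are translation-invariant, as in the RPN (region proposal network). From P3 to P7 the anchor areas increase level by level from 32*32 to 512*512 (why? how is this computed?). Each level uses the aspect ratios {1:2, 1:1, 2:1} and additionally three size multipliers ({2^0, 2^(1/3), 2^(2/3)} in the RetinaNet paper), so each level has 9 anchors per location, ....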
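One way to see where the 32*32 to 512*512 areas and the 9 anchors come from is the sketch below. It assumes the RetinaNet paper's defaults: base side lengths 32, 64, 128, 256, 512 for P3 to P7 (the side doubles as the stride doubles from 8 to 128), three aspect ratios, and three size multipliers. The names `base_anchors` and `count_anchors` are hypothetical helpers written for illustration, not the library's API, and the ceiling-division rule for feature-map sizes is also an assumption.

```python
import numpy as np

# Assumed defaults from the RetinaNet paper (illustrative, not read from anchors.py).
LEVELS = [3, 4, 5, 6, 7]                     # P3 .. P7
SIZES = [32, 64, 128, 256, 512]              # base anchor side length per level
STRIDES = [2 ** level for level in LEVELS]   # 8, 16, 32, 64, 128
RATIOS = np.array([0.5, 1.0, 2.0])           # assumed to mean height / width
SCALES = np.array([2 ** 0, 2 ** (1 / 3), 2 ** (2 / 3)])


def base_anchors(size, ratios=RATIOS, scales=SCALES):
    """Return (len(ratios) * len(scales), 4) anchors (x1, y1, x2, y2) centred on the origin."""
    anchors = []
    for ratio in ratios:
        for scale in scales:
            area = (size * scale) ** 2       # the area is fixed by size and scale
            width = np.sqrt(area / ratio)    # the ratio only redistributes it
            height = width * ratio
            anchors.append([-width / 2, -height / 2, width / 2, height / 2])
    return np.array(anchors)


def count_anchors(image_height, image_width):
    """Count anchors over all levels, assuming each feature map is ceil(image / stride)."""
    total = 0
    for stride in STRIDES:
        rows = -(-image_height // stride)    # ceiling division
        cols = -(-image_width // stride)
        total += rows * cols * len(RATIOS) * len(SCALES)
    return total


p3 = base_anchors(SIZES[0])
print(p3.shape)                  # (9, 4)
print(count_anchors(416, 560))   # 43803
```

Under these assumptions a 416*560 image yields 43803 anchors, which matches the count quoted later in these notes.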
anchors.py
anchor_targets_bbox(): generates anchor targets for bbox detection.
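The thresholding idea behind this function can be illustrated with a small NumPy sketch. It is not the library's implementation: `compute_iou` and `anchor_states` are hypothetical helpers, and how IoU values exactly on a threshold are treated is an assumption. The only values taken from the function are the defaults negative_overlap=0.4 and positive_overlap=0.5 and the state encoding (-1 ignore, 0 background, 1 foreground).

```python
import numpy as np


def compute_iou(anchors, boxes):
    """Pairwise IoU between anchors (N, 4) and boxes (K, 4), both as (x1, y1, x2, y2)."""
    # Intersection rectangle for every (anchor, box) pair via broadcasting.
    x1 = np.maximum(anchors[:, None, 0], boxes[None, :, 0])
    y1 = np.maximum(anchors[:, None, 1], boxes[None, :, 1])
    x2 = np.minimum(anchors[:, None, 2], boxes[None, :, 2])
    y2 = np.minimum(anchors[:, None, 3], boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)

    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    union = area_a[:, None] + area_b[None, :] - inter
    return inter / np.maximum(union, 1e-8)


def anchor_states(anchors, boxes, negative_overlap=0.4, positive_overlap=0.5):
    """Return (states, best_box): states is 1 (fg), 0 (bg) or -1 (ignore) per anchor."""
    iou = compute_iou(anchors, boxes)
    best_box = iou.argmax(axis=1)    # index of the gt box with the greatest overlap
    best_iou = iou.max(axis=1)

    states = np.zeros(len(anchors), dtype=np.int64)   # background by default
    states[best_iou >= positive_overlap] = 1          # boundary handling is an assumption
    states[(best_iou > negative_overlap) & (best_iou < positive_overlap)] = -1
    return states, best_box


anchors = np.array([[0.0, 0.0, 10.0, 10.0],          # identical to the gt box
                    [0.0, 0.0, 10.0, 22.0],          # IoU = 100 / 220, between the thresholds
                    [100.0, 100.0, 110.0, 110.0]])   # no overlap
boxes = np.array([[0.0, 0.0, 10.0, 10.0]])
states, best_box = anchor_states(anchors, boxes)
print(states)   # [ 1 -1  0]
```

Anchors whose best IoU falls between the two thresholds are marked as ignored, so they contribute to neither the positive nor the negative samples.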
def anchor_targets_bbox(
    anchors,
    image_group,
    annotations_group,  # ground-truth boxes (x1, y1, x2, y2) and labels; note annotations_group here
    num_classes,
    negative_overlap=0.4,
    positive_overlap=0.5
):
    """ Generate anchor targets for bbox detection.
    Args
        anchors: np.array of anchor boxes of shape (N, 4) for (x1, y1, x2, y2).
        image_group: List of BGR images.
        annotations_group: List of annotation dicts, each with 'bboxes' (np.array of shape (K, 4) for (x1, y1, x2, y2)) and 'labels' (np.array of shape (K,)).
        num_classes: Number of classes to predict.
        negative_overlap: IoU overlap for negative anchors (all anchors with overlap < negative_overlap are negative).
        positive_overlap: IoU overlap for positive anchors (all anchors with overlap > positive_overlap are positive).
    Returns
        labels_batch: batch that contains labels & anchor states (np.array of shape (batch_size, N, num_classes + 1)),
                      where N is the number of anchors for an image and the last column defines the anchor state (-1 for ignore, 0 for bg, 1 for fg).
        regression_batch: batch that contains bounding-box regression targets for an image & anchor states (np.array of shape (batch_size, N, 4 + 1)),
                      where N is the number of anchors for an image, the first 4 columns define regression targets for (x1, y1, x2, y2) and the
                      last column defines anchor states (-1 for ignore, 0 for bg, 1 for fg).
    """
    assert(len(image_group) == len(annotations_group)), "The length of the images and annotations need to be equal."
    assert(len(annotations_group) > 0), "No data received to compute anchor targets for."
    for annotations in annotations_group:
        assert('bboxes' in annotations), "Annotations should contain bboxes."
        assert('labels' in annotations), "Annotations should contain labels."
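    batch_size = len(image_group)  # number of images in the batch
    # Regression targets: a 3-D array of shape (batch_size, anchors.shape[0], 4 + 1).
    # anchors.shape[0] is large, e.g. 43803 or 39492, and differs from batch to batch. The 43803 comes from
    # the earlier call to anchors_for_shape(): that many anchors are built for a 416*560 image.
    # Assuming strides 8 to 128 for P3 to P7 and 9 anchors per location:
    # (52*70 + 26*35 + 13*18 + 7*9 + 4*5) * 9 = 4867 * 9 = 43803.
    regression_batch = np.zeros((batch_size, anchors.shape[0], 4 + 1), dtype=keras.backend.floatx())
    # Classification targets: a 3-D array of shape (batch_size, anchors.shape[0], num_classes + 1) that
    # describes the class of every anchor. labels_batch holds batch_size elements; with batch_size=8 and
    # 3 target classes (person, car, airplane), the first element has shape [43803, 4], where the
    # 4 columns are [person, car, airplane, anchor state].
    labels_batch = np.zeros((batch_size, anchors.shape[0], num_classes + 1), dtype=keras.backend.floatx())
    # compute labels and regression targets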
    # Iterate over every image in the batch. An image may contain several objects,
    # so annotations['bboxes'].shape[0] can be greater than 1.
    for index, (image, annotations) in enumerate(zip(image_group, annotations_group)):
        # Example annotations:
        # {'labels': array([ 0., 0.]), 'bboxes': array([[ 67.97791573, 103.88162763, 448.83239265, 367.84012947],
        # [ 439.76378026, 195.41451562, 569.55807188, 263.01028949]])}
        if annotations['bboxes'].shape[0]:
            # obtain indices of gt annotations with the greatest overlap
            # (the IoU between the 43803 anchors and this image's ground-truth boxes is computed here)
            positive_indices, ig