resnet one-stage: anchors

Latest recommended article published on 2024-09-11 16:03:04

wanghua609

Article link: https://blog.csdn.net/weixin_38145317/article/details/98640051

https://www.jianshu.com/p/596e4171f7ad

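The network uses translation-invariant anchor boxes, similar to those of the RPN (region proposal network). From pyramid level P3 to P7, the anchor areas increase successively from 32*32 to 512*512 (why? how is this computed?). At every level the anchors use the aspect ratios {1:2, 1:1, 2:1}, and each level additionally applies the size scales $2^0, 2^{1/3}, 2^{2/3}$, so every level has 3 x 3 = 9 anchors per location, ....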
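The numbers above can be checked with a small sketch. It is not the code from anchors.py: the helper names `base_anchors` and `count_anchors` are hypothetical, and it assumes that level P_l has stride 2^l and base size 2^(l+2) (which gives 32 for P3 and 512 for P7), and that the feature-map size is the image size divided by the stride, rounded up. Under those assumptions the base size simply doubles together with the stride from one level to the next, and a 416*560 image yields the 43803 anchors quoted in the comments of the listing further down.

```python
import numpy as np

# Assumed defaults (only the sizes, ratios and scales are stated in the text).
PYRAMID_LEVELS = [3, 4, 5, 6, 7]
STRIDES = [2 ** l for l in PYRAMID_LEVELS]       # 8, 16, 32, 64, 128
SIZES = [2 ** (l + 2) for l in PYRAMID_LEVELS]   # 32, 64, 128, 256, 512
RATIOS = np.array([0.5, 1.0, 2.0])               # height / width
SCALES = np.array([2 ** 0, 2 ** (1 / 3), 2 ** (2 / 3)])


def base_anchors(base_size, ratios=RATIOS, scales=SCALES):
    """Return (len(ratios) * len(scales), 4) anchors as (x1, y1, x2, y2), centred on the origin."""
    # One side length per scale, repeated for every ratio.
    side = base_size * np.tile(scales, len(ratios))
    area = side ** 2
    ratio = np.repeat(ratios, len(scales))
    # Keep the area fixed while changing the shape: w * h = area, h / w = ratio.
    w = np.sqrt(area / ratio)
    h = w * ratio
    return np.stack([-w / 2, -h / 2, w / 2, h / 2], axis=1)


def count_anchors(image_shape):
    """Count the anchors over all levels for an image of shape (height, width)."""
    total = 0
    for stride in STRIDES:
        # Feature-map size: image size divided by the stride, rounded up.
        fh = (image_shape[0] + stride - 1) // stride
        fw = (image_shape[1] + stride - 1) // stride
        total += fh * fw * len(RATIOS) * len(SCALES)
    return total


p3 = base_anchors(SIZES[0])
print(p3.shape)                    # (9, 4)
print(count_anchors((416, 560)))   # 43803
```

Each of the 9 base anchors is then shifted to every feature-map cell of its level, which is where the large per-image anchor count comes from.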

anchors.py

anchor_targets_bbox() generates the anchor targets for box detection:

def anchor_targets_bbox(
    anchors,
    image_group,
    annotations_group,  # ground-truth annotations: x1, y1, x2, y2, label (note this annotations_group argument)
    num_classes,
    negative_overlap=0.4,
    positive_overlap=0.5
):
    """ Generate anchor targets for bbox detection.

    Args
        anchors: np.array of anchors of shape (N, 4) for (x1, y1, x2, y2).
        image_group: List of BGR images.
        annotations_group: List of annotation dicts, one per image, each with 'bboxes' (np.array of shape (N, 4) for (x1, y1, x2, y2)) and 'labels' (np.array of shape (N,)).
        num_classes: Number of classes to predict.
        negative_overlap: IoU overlap for negative anchors (all anchors with overlap < negative_overlap are negative).
        positive_overlap: IoU overlap for positive anchors (all anchors with overlap > positive_overlap are positive).

    Returns
        labels_batch: batch that contains labels & anchor states (np.array of shape (batch_size, N, num_classes + 1),
                      where N is the number of anchors for an image and the last column defines the anchor state (-1 for ignore, 0 for bg, 1 for fg).
        regression_batch: batch that contains bounding-box regression targets for an image & anchor states (np.array of shape (batch_size, N, 4 + 1),
                      where N is the number of anchors for an image, the first 4 columns define regression targets for (x1, y1, x2, y2) and the
                      last column defines anchor states (-1 for ignore, 0 for bg, 1 for fg).
    """

    assert(len(image_group) == len(annotations_group)), "The length of the images and annotations need to be equal."
    assert(len(annotations_group) > 0), "No data received to compute anchor targets for."
    for annotations in annotations_group:
        assert('bboxes' in annotations), "Annotations should contain bboxes."
        assert('labels' in annotations), "Annotations should contain labels."

    batch_size = len(image_group)  # number of images in the batch

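    # Regression targets: a 3-D array of shape batch_size x anchors.shape[0] x 5.
    # anchors.shape[0] is large (for example 43803 or 39492) and differs from batch to batch.
    # The value 43803 was computed earlier by anchors_for_shape(): for a 416*560 image,
    # 43803 anchors are constructed.
    regression_batch  = np.zeros((batch_size, anchors.shape[0], 4 + 1), dtype=keras.backend.floatx())
    # Classification targets: a 3-D array of shape batch_size x anchors.shape[0] x (num_classes + 1)
    # that describes the class of each anchor. labels_batch holds batch_size elements. Assuming
    # batch_size=8 and 3 target classes (person, car, airplane), the first element has shape
    # [43803, 4], where the 4 columns encode [person, car, airplane, positive/negative anchor state].
    labels_batch      = np.zeros((batch_size, anchors.shape[0], num_classes + 1), dtype=keras.backend.floatx())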

    # compute labels and regression targets
    # Iterate over every image in the batch. An image may contain several
    # objects, so annotations['bboxes'].shape[0] can be greater than 1.
    for index, (image, annotations) in enumerate(zip(image_group, annotations_group)):
        # Example annotations:
        # {'labels': array([ 0.,  0.]), 'bboxes': array([[  67.97791573,  103.88162763,  448.83239265,  367.84012947],
        #                                                [ 439.76378026,  195.41451562,  569.55807188,  263.01028949]])}
        if annotations['bboxes'].shape[0]:
            # obtain indices of gt annotations with the greatest overlap
            # (the IoU between all 43803 anchors and this image's ground-truth boxes is computed here)
            positive_indices, ig