[detectron] How anchors are generated for an input sample

Mr_health

Last modified 2022-11-05 19:41:00

First published 2018-12-10 14:28:19

Article link: https://blog.csdn.net/Mr_health/article/details/84938982

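In detectron, the function that generates anchors for an image is get_field_of_anchors, located in data_utils.py. Generating anchors requires the parameters below, so they are the function's main inputs; they need little further explanation.

stride
anchor_sizes
anchor_aspect_ratios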

The function generates anchors in the following steps:

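1. First, generate the anchors for a single cell. At this point the anchors carry no position information, only widths and heights, which follow the anchor_sizes and anchor_aspect_ratios we configured. The generate_anchors code is given at the end of the post.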

    # Anchors at a single feature cell
    cell_anchors = generate_anchors(
        stride=stride, sizes=anchor_sizes, aspect_ratios=anchor_aspect_ratios
    )
    num_cell_anchors = cell_anchors.shape[0]  # number of anchors per cell

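2. Next, from the input image size and the stride, compute the positions at which anchors are placed when stepping across the image with that stride. Only now do the anchors get placement points.

Detectron generates the placement points on a square whose side length is fpn_max_size. Two examples make this concrete; assume the input image is 1024×1024 and the level being processed has stride = 8.

(1) Set the maximum side length for training samples to 1024, i.e. cfg.TRAIN.MAX_SIZE = 1024. The image is already 1024, so no rescaling is needed. RetinaNet defaults to cfg.FPN.COARSEST_STRIDE = 128, so fpn_max_size = 128 × ceil(1024/128) = 1024. Placement points are then generated with step stride over an fpn_max_size × fpn_max_size square, so the number of points along one side is field_size = ceil(1024/8) = 128, giving 128×128 anchor positions.

(2) Set the maximum side length for training samples to 800, i.e. cfg.TRAIN.MAX_SIZE = 800. The longest side of the sample is now rescaled to 800. RetinaNet defaults to cfg.FPN.COARSEST_STRIDE = 128, so fpn_max_size = 128 × ceil(800/128) = 896. Placement points are then generated with step stride over an fpn_max_size × fpn_max_size square, so the number of points along one side is field_size = ceil(896/8) = 112, giving 112×112 anchor positions.

The final placement coordinates are stored in shifts. Its first and third columns are identical, as are its second and fourth, because each row is (shift_x, shift_y, shift_x, shift_y) and is added to an anchor's (x1, y1, x2, y2).

PS: I still do not fully understand the cfg.FPN.COARSEST_STRIDE parameter and would welcome discussion.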

    fpn_max_size = cfg.FPN.COARSEST_STRIDE * np.ceil(
        cfg.TRAIN.MAX_SIZE / float(cfg.FPN.COARSEST_STRIDE)
    )
    # Step across the square with the given stride, one placement point per
    # stride; field_size is the number of points along one side.
    field_size = int(np.ceil(fpn_max_size / float(stride)))
    shifts = np.arange(0, field_size) * stride
    shift_x, shift_y = np.meshgrid(shifts, shifts)
    shift_x = shift_x.ravel()
    shift_y = shift_y.ravel()
    shifts = np.vstack((shift_x, shift_y, shift_x, shift_y)).transpose()
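The two worked examples above can be checked numerically. The following is a minimal sketch, not Detectron's own code: the helper name anchor_grid is hypothetical, and the cfg values are passed in as plain arguments (coarsest_stride stands in for cfg.FPN.COARSEST_STRIDE, max_size for cfg.TRAIN.MAX_SIZE).

```python
import numpy as np


def anchor_grid(max_size, stride, coarsest_stride=128):
    """Hypothetical helper: return (fpn_max_size, field_size, shifts)."""
    # Round max_size up to the nearest multiple of the coarsest stride.
    fpn_max_size = int(
        coarsest_stride * np.ceil(max_size / float(coarsest_stride))
    )
    # Number of placement points along one side of the square.
    field_size = int(np.ceil(fpn_max_size / float(stride)))
    steps = np.arange(0, field_size) * stride
    shift_x, shift_y = np.meshgrid(steps, steps)
    shift_x, shift_y = shift_x.ravel(), shift_y.ravel()
    # Each row is (dx, dy, dx, dy), ready to be added to (x1, y1, x2, y2).
    shifts = np.vstack((shift_x, shift_y, shift_x, shift_y)).transpose()
    return fpn_max_size, field_size, shifts


for max_size in (1024, 800):
    fpn_max_size, field_size, shifts = anchor_grid(max_size, stride=8)
    print(max_size, fpn_max_size, field_size, shifts.shape)
# 1024 1024 128 (16384, 4)
# 800 896 112 (12544, 4)
```

Because the same offset is applied to both corners of a box, columns 0 and 2 of shifts are equal, and so are columns 1 and 3.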

3. With the placement points in hand, shift the cell anchors from step 1 to each point, which amounts to generating an anchor at that position.

    # Broadcast anchors over shifts to enumerate all anchors at all positions
    # in the (H, W) grid:
    #   - add A cell anchors of shape (1, A, 4) to
    #   - K shifts of shape (K, 1, 4) to get
    #   - all shifted anchors of shape (K, A, 4)
    #   - reshape to (K*A, 4) shifted anchors
    A = num_cell_anchors
    K = shifts.shape[0]
    field_of_anchors = (
        cell_anchors.reshape((1, A, 4)) +
        shifts.reshape((1, K, 4)).transpose((1, 0, 2))
    )
    # Each row of field_of_anchors is one anchor, so the row count is the total number of anchors
    field_of_anchors = field_of_anchors.reshape((K * A, 4))
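To make the broadcasting in step 3 easier to follow, here is a small sketch with made-up numbers (two toy cell anchors and three toy shifts, none of them from Detectron). It reshapes shifts directly to (K, 1, 4), which yields the same array as the reshape to (1, K, 4) followed by transpose((1, 0, 2)) used above.

```python
import numpy as np

# Two toy cell anchors (A = 2) in (x1, y1, x2, y2) format.
cell_anchors = np.array([[-4.0, -4.0, 11.0, 11.0],
                         [-8.0, 0.0, 15.0, 7.0]])
# Three toy shifts (K = 3); each row is (dx, dy, dx, dy).
shifts = np.array([[0.0, 0.0, 0.0, 0.0],
                   [8.0, 0.0, 8.0, 0.0],
                   [0.0, 8.0, 0.0, 8.0]])

A = cell_anchors.shape[0]
K = shifts.shape[0]
# (1, A, 4) + (K, 1, 4) broadcasts to (K, A, 4): every anchor at every shift.
field = cell_anchors.reshape((1, A, 4)) + shifts.reshape((K, 1, 4))
field = field.reshape((K * A, 4))

print(field.shape)  # (6, 4)
print(field[2])     # first cell anchor moved 8 px right: [ 4. -4. 19. 11.]
```

Rows are ordered position by position: row k * A + a holds cell anchor a placed at shift k.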

Appendix: the generate_anchors code:

import numpy as np


def generate_anchors(
    stride=16, sizes=(32, 64, 128, 256, 512), aspect_ratios=(0.5, 1, 2)
):
    """Generates a matrix of anchor boxes in (x1, y1, x2, y2) format. Anchors
    are centered on stride / 2, have (approximate) sqrt areas of the specified
    sizes, and aspect ratios as given.

    Here stride can be read as the side length (base_size) of the base anchor.
    Anchors are produced in two steps:
    (1) from the base anchor, build one base_anchor per aspect ratio;
    (2) enlarge each base_anchor to the sizes configured for this level,
        using the factors np.array(sizes, dtype=float) / stride.
    """
    # np.float was removed from recent NumPy releases; the builtin float is equivalent.
    return _generate_anchors(
        stride,
        np.array(sizes, dtype=float) / stride,
        np.array(aspect_ratios, dtype=float)
    )


def _generate_anchors(base_size, scales, aspect_ratios):
    """Generate anchor (reference) windows by enumerating aspect ratios X
    scales wrt a reference (0, 0, base_size - 1, base_size - 1) window.
    """
    # The base anchor: (0, 0, base_size - 1, base_size - 1)
    anchor = np.array([1, 1, base_size, base_size], dtype=float) - 1
    # One base_anchor per aspect ratio, derived from the base anchor
    anchors = _ratio_enum(anchor, aspect_ratios)
    # Multiply each base_anchor's sides by scales (= size / stride) to reach
    # the anchor sizes configured for this level
    anchors = np.vstack(
        [_scale_enum(anchors[i, :], scales) for i in range(anchors.shape[0])]
    )
    return anchors


def _whctrs(anchor):
    """Return width, height, x center, and y center for an anchor (window)."""
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    x_ctr = anchor[0] + 0.5 * (w - 1)
    y_ctr = anchor[1] + 0.5 * (h - 1)
    return w, h, x_ctr, y_ctr


def _mkanchors(ws, hs, x_ctr, y_ctr):
    """Given a vector of widths (ws) and heights (hs) around a center
    (x_ctr, y_ctr), output a set of anchors (windows).
    """
    ws = ws[:, np.newaxis]  # e.g. shape (3,) becomes (3, 1)
    hs = hs[:, np.newaxis]
    anchors = np.hstack(  # hstack joins columns and keeps rows, so anchors is (3, 4)
        (
            x_ctr - 0.5 * (ws - 1),  # x of the top-left corner, shape (3, 1)
            y_ctr - 0.5 * (hs - 1),  # y of the top-left corner, shape (3, 1)
            x_ctr + 0.5 * (ws - 1),  # x of the bottom-right corner, shape (3, 1)
            y_ctr + 0.5 * (hs - 1)   # y of the bottom-right corner, shape (3, 1)
        )
    )
    return anchors


def _ratio_enum(anchor, ratios):
    """Enumerate a set of anchors for each aspect ratio wrt an anchor."""
    w, h, x_ctr, y_ctr = _whctrs(anchor)  # width, height and center
    size = w * h  # area
    # The next three lines compute widths ws and heights hs that keep the
    # area (approximately) fixed while matching each aspect ratio (h / w)
    size_ratios = size / ratios
    ws = np.round(np.sqrt(size_ratios))
    hs = np.round(ws * ratios)
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)  # build anchors from the sizes and the center
    return anchors


def _scale_enum(anchor, scales):
    """Enumerate a set of anchors for each scale wrt an anchor."""
    w, h, x_ctr, y_ctr = _whctrs(anchor) 
    ws = w * scales
    hs = h * scales
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
    return anchors
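As a quick check of what the appendix code produces, the sketch below condenses its two steps (ratio enumeration, then scaling) into one function and evaluates it for stride = 8, a single size of 32 and the three default aspect ratios. The function name cell_anchors_sketch is hypothetical and this is a restatement for illustration, not Detectron's implementation.

```python
import numpy as np


def cell_anchors_sketch(stride, sizes, aspect_ratios):
    """Condensed restatement of generate_anchors (illustration only)."""
    ratios = np.asarray(aspect_ratios, dtype=float)
    scales = np.asarray(sizes, dtype=float) / stride
    # Center of the base window (0, 0, stride - 1, stride - 1).
    ctr = 0.5 * (stride - 1)
    # Step 1: one (w, h) pair per aspect ratio, area close to stride ** 2.
    ws = np.round(np.sqrt(stride * stride / ratios))
    hs = np.round(ws * ratios)
    # Step 2: scale every ratio anchor by size / stride (ratio-major order).
    ws = (ws[:, None] * scales[None, :]).ravel()
    hs = (hs[:, None] * scales[None, :]).ravel()
    return np.stack(
        [ctr - 0.5 * (ws - 1), ctr - 0.5 * (hs - 1),
         ctr + 0.5 * (ws - 1), ctr + 0.5 * (hs - 1)],
        axis=1,
    )


print(cell_anchors_sketch(8, (32,), (0.5, 1, 2)))
# [[-18.  -8.  25.  15.]
#  [-12. -12.  19.  19.]
#  [ -8. -20.  15.  27.]]
```

All three boxes share the center (3.5, 3.5), and their areas (44×24, 32×32, 24×48) are only approximately 32×32 because of the rounding in step 1.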

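One thing I keep forgetting whenever I revisit anchor generation: is the (absolute) size of an anchor defined relative to the corresponding FPN feature map, or relative to the original image?
Since IoU is computed between anchors and their ground-truth (GT) boxes, anchor sizes must be defined in original-image coordinates, at the same scale as the GT.
The confusion arises because different FPN levels produce anchors of different sizes, which makes it seem as though anchors are generated on the corresponding FPN feature map.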

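The most important parameter in anchor generation is actually stride, and it differs from level to level. The stride is simply the downsampling factor of the feature map relative to the original image.
Suppose the input is 512×512 and, after successive convolutions, we obtain the C3 level (which becomes P3 through the lateral connection) of size 64×64, i.e. downsampled 8×. By FPN's design this level produces small anchors: at every point of the 64×64 feature map, generate anchors with 3 aspect ratios and an area of 32×32 (32 is the anchor size; it is configurable and is expressed at the same scale as the GT). So far the anchor positions are indexed on the feature map, so they still have to be converted to original-image coordinates. This is where stride comes in: it maps every point of the feature map back to the original image, which means anchors are generated every 8 pixels in the original image.
Likewise, for the C4 level (P4 through the lateral connection), the feature map is 32×32, downsampled 16×, so stride = 16. This level produces anchors of size 64×64, placed every 16 pixels in the original image.
And so on for the remaining levels.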
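The mapping from a feature-map cell back to the original image can be written out directly. The sketch below is an illustration under assumptions: the helper anchor_in_image is hypothetical, it only handles the square (aspect ratio 1) anchor, and it reuses the centering convention of generate_anchors, where the cell anchor is centered at (stride - 1) / 2.

```python
import numpy as np


def anchor_in_image(i, j, stride, size):
    """Hypothetical helper: square anchor for feature cell (row i, col j),
    returned as (x1, y1, x2, y2) in original-image pixels."""
    # Cell (i, j) maps back to pixel offset (j * stride, i * stride).
    x_ctr = j * stride + 0.5 * (stride - 1)
    y_ctr = i * stride + 0.5 * (stride - 1)
    half = 0.5 * (size - 1)
    return np.array([x_ctr - half, y_ctr - half, x_ctr + half, y_ctr + half])


image_size = 512
for stride, size in ((8, 32), (16, 64)):
    feat = image_size // stride  # feature-map side length at this level
    print(stride, feat, anchor_in_image(0, 1, stride, size))
# 8 64 [ -4. -12.  27.  19.]
# 16 32 [ -8. -24.  55.  39.]
```

Moving one cell to the right on the feature map moves the anchor by exactly stride pixels in the image, while the anchor's size stays in image units regardless of the level.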

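Based on the explanation above, two simple FPN adjustments can help with small-object detection:
1. Use a smaller anchor_size.
2. Use a higher-resolution feature map. Above, the 8× downsampled feature map was the highest-resolution level in the FPN, so anchors are generated every 8 pixels in the original image. That may still be too sparse and can miss small objects. We can instead use the 4× downsampled feature map as the highest-resolution level: it is 128×128 for a 512×512 input and places anchors every 4 pixels in the original image. The number of small anchors rises markedly, but so does the computational cost.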
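The cost of the second option is easy to quantify. This is a rough sketch assuming a 512×512 input and 3 anchors per cell (one size, three aspect ratios), with a hypothetical helper num_anchors:

```python
def num_anchors(image_size, stride, anchors_per_cell=3):
    """Hypothetical helper: anchors generated at one level for a square input."""
    side = -(-image_size // stride)  # ceiling division: cells along one side
    return side * side * anchors_per_cell


print(num_anchors(512, 8))  # 12288
print(num_anchors(512, 4))  # 49152
```

Halving the stride quadruples the number of anchors at the finest level, which is where the extra computation comes from.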