After-Dinner Time (4): SSD prior box sizes and the source code that computes them (including ssd_anchor.py)

计算机视觉-Archer

Article link: https://blog.csdn.net/zjc910997316/article/details/95069725

Bilibili: 连翘春风冻傻抱蚁人 https://www.bilibili.com/video/av45660456

net->
ssd_vgg_300.py
This is the focus of this article.

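Preparing the prior box sizes: SSD uses prior boxes of different sizes on each feature map layer.
1 A feature map is only computed once the network reaches the corresponding layer.
2 The prior box sizes, by contrast, are computed in advance and stored, then looked up when needed.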
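
To make point 2 concrete, here is a minimal sketch of precomputing the per-layer size table for a 300*300 input. It is a compact restatement of the size computation walked through later in this post, not a replacement for it; the name sketch_size_bounds_to_values is hypothetical, and the bounds [0.15, 0.9] and layer count 6 are the values used in this post.

```python
import math


def sketch_size_bounds_to_values(size_bounds, n_feat_layers, img_size=300):
    """Return one (size, next size) pair in pixels per feature layer."""
    # Work in integer percentages, as the walkthrough later in this post does.
    min_ratio = int(size_bounds[0] * 100)
    max_ratio = int(size_bounds[1] * 100)
    step = int(math.floor((max_ratio - min_ratio) / (n_feat_layers - 2)))
    # The first layer is a special case: it uses half of the lower bound.
    sizes = [(img_size * size_bounds[0] / 2, img_size * size_bounds[0])]
    for ratio in range(min_ratio, max_ratio + 1, step):
        sizes.append((img_size * ratio / 100.0,
                      img_size * (ratio + step) / 100.0))
    return sizes


# Computed once, stored, and looked up per layer afterwards.
sizes = sketch_size_bounds_to_values([0.15, 0.9], 6)
for k, (size, next_size) in enumerate(sizes, start=1):
    print(f"layer {k}: size {size:6.1f} px, next size {next_size:6.1f} px")
```

The table has one entry per feature layer, so looking up a layer's sizes at run time is a plain list index.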

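1 Prior box size computation explained

1 Squares: how the prior box sizes relate

s_k' is the side length of the large square on the current layer
s_k is the side length of the small square on the current layer
s_(k+1) is the side length of the small square on the next layer
s_k' = sqrt(s_k * s_(k+1))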
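
As a quick numerical check of the relation above, the following sketch computes the large square's side from two neighboring small-square sizes. The helper name large_square_side is hypothetical, and the sample values 45 and 99 (pixels, for a 300*300 input) are two consecutive entries of the size table computed in this post.

```python
import math


def large_square_side(s_k, s_k_next):
    """Side of the large square: the geometric mean of s_k and s_(k+1)."""
    return math.sqrt(s_k * s_k_next)


# 45 px is the small square on one layer, 99 px the small square on the next.
side = large_square_side(45.0, 99.0)
print(f"large square side: {side:.2f} px")
```

Because it is a geometric mean, the large square always lies between the small square of its own layer and the small square of the next layer.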

Red marks the grid cells; blue marks the large and small square prior boxes.

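2 Rectangles: how the prior box sizes relate

s_k is the prior box size of the current layer. For an aspect ratio a_r, the rectangle has width w = s_k * sqrt(a_r) and height h = s_k / sqrt(a_r).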
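
The rectangle case can be sketched the same way. The code later in this post computes w = s_k * sqrt(a_r) and h = s_k / sqrt(a_r); the function name rectangle_prior below is hypothetical, and 45 px with aspect ratios 2 and 0.5 are sample inputs.

```python
import math


def rectangle_prior(s_k, aspect_ratio):
    """Return (w, h) of a rectangular prior box with the same area as the s_k square."""
    w = s_k * math.sqrt(aspect_ratio)
    h = s_k / math.sqrt(aspect_ratio)
    return w, h


w_wide, h_wide = rectangle_prior(45.0, 2.0)   # wide box
w_tall, h_tall = rectangle_prior(45.0, 0.5)   # tall box
print(f"a_r=2:   w={w_wide:.2f}, h={h_wide:.2f}")
print(f"a_r=0.5: w={w_tall:.2f}, h={h_tall:.2f}")
```

Scaling one side by sqrt(a_r) and the other by 1/sqrt(a_r) keeps the area equal to s_k * s_k, so ratios 2 and 0.5 give the same box rotated by 90 degrees.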

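3 The step argument of ssd_anchor_one_layer

The step argument of ssd_anchor_one_layer (shown below) relates the feature map (38*38) to the original image (300*300):
300/38 ≈ 7.89, rounded up to 8
One cell (1*1) of the feature map corresponds to an 8*8 block of the original image; the side of that block is the step.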
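
The rounding rule above can be applied to every feature map as a rough check. This is only a sketch of the ceil(300 / feature size) rule described here; the names IMG_SIZE, FEAT_SIZES and ceil_steps are illustrative. In the code below the step is passed in explicitly for each layer, so a given configuration may use values that differ from this rule.

```python
import math

IMG_SIZE = 300
FEAT_SIZES = [38, 19, 10, 5, 3, 1]  # side of each square feature map

# Step suggested by the rounding rule: image pixels covered by one feature map cell.
ceil_steps = [math.ceil(IMG_SIZE / f) for f in FEAT_SIZES]

for f, step in zip(FEAT_SIZES, ceil_steps):
    print(f"{f:2d}*{f:<2d} feature map -> step {step}")
```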

The offset argument

offset moves a point from the top-left corner of a grid cell to the center of that cell.
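
A small sketch shows how offset and step combine into normalized center coordinates along one axis. The helper cell_centers is hypothetical; it applies the expression (index + offset) * step / image size used by the code in this post, with step = 8 for the 38*38 feature map.

```python
import numpy as np


def cell_centers(feat_size, step, img_size=300, offset=0.5):
    """Normalized center coordinate of each grid cell along one axis."""
    idx = np.arange(feat_size, dtype=np.float32)
    # idx * step is the cell's top-left corner in pixels;
    # adding offset = 0.5 shifts it by half a cell to the center.
    return (idx + offset) * step / img_size


centers = cell_centers(38, 8)
print("first center in pixels:", centers[0] * 300)
print("first three normalized centers:", centers[:3])
```

With offset = 0 the values would instead be the top-left corners of the cells.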

import math

import numpy as np


def ssd_size_bounds_to_values(size_bounds,  # relative scale bounds [smin, smax] = [0.15, 0.9]
                              n_feat_layers,  # number of feature layers, m = 6
                              img_shape=(300, 300)):
    """Compute the reference sizes of the anchor boxes from relative bounds.
    The absolute values are measured in pixels, based on the network default size (300
    pixels).
    This function follows the computation performed in the original implementation of SSD
    in Caffe.

    Return:
      list of list containing the absolute sizes at each scale.
      For each scale, the ratios only apply to the first value.
    """

    # The computation assumes a square input image; abort otherwise.
    assert img_shape[0] == img_shape[1]
    img_size = img_shape[0]  # 300
    # s_k = smin + (smax - smin) * (k - 1) / (m - 1)
    # Work in integer percentages (*100 here, /100 later) so the ratios step in whole numbers.
    min_ratio = int(size_bounds[0] * 100)  # smin * 100 = 0.15 * 100 = 15
    max_ratio = int(size_bounds[1] * 100)  # smax * 100 = 0.9 * 100 = 90
    step = int(math.floor((max_ratio - min_ratio) / (n_feat_layers - 2)))
    # (90 - 15) / (6 - 2) = 18.75, floored to 18.
    # This plays the role of (smax - smin) / (m - 1) in the formula above. The divisor is
    # n_feat_layers - 2 because the first layer is excluded and handled separately below.

    # Start with the following smallest sizes.
    sizes = [[img_size * size_bounds[0] / 2, img_size * size_bounds[0]]]  # [22.5, 45.0]
    # First value: prior box size of the first layer; second value: size of the next layer.
    # The first layer is a special case: its feature map is large, so it uses half
    # of smin as its scale, giving 300 * 0.15 / 2 = 22.5 pixels.

    for ratio in range(min_ratio, max_ratio + 1, step):  # ratio = 15, 33, 51, 69, 87
        # Walk from min_ratio to max_ratio in increments of step (18) and record one
        # (size, next size) pair per remaining layer. Because the first layer is
        # handled separately, the range (0.15, 0.9) is split into 4 parts.
        sizes.append((img_size * ratio / 100.,
                      img_size * (ratio + step) / 100.))
        # ratio = 15: (300 * 15, 300 * 33) / 100 = (45., 99.)
        # ratio = 33: (300 * 33, 300 * 51) / 100 = (99., 153.)
        # ratio = 51: (300 * 51, 300 * 69) / 100 = (153., 207.)
        # ratio = 69: (300 * 69, 300 * 87) / 100 = (207., 261.)
        # ratio = 87: (300 * 87, 300 * 105) / 100 = (261., 315.)
    # sizes = [[22.5, 45.],
    #          (45., 99.),     45 = 300 * 0.15
    #          (99., 153.),
    #          (153., 207.),
    #          (207., 261.),
    #          (261., 315.)]   261 = 300 * 0.87
    # Each pair [x, y] holds this layer's prior box size and the next layer's size:
    # x is the side of the small square, and sqrt(x * y) is the side of the large square.
    return sizes
"""
     feat_shapes=[(38, 38), (19, 19), (10, 10), (5, 5), (3, 3), (1, 1)],
     anchor_size_bounds=[0.15, 0.9],
     anchor_sizes=[(21., 45.),
                   (45., 99.),
                   (99., 153.),
                   (153., 207.),
                   (207., 261.),
                   (261., 315.)]
"""


def ssd_feat_shapes_from_net(predictions, default_shapes=None):
    """Try to obtain the feature shapes from the prediction layers. The latter
    can be either a Tensor or Numpy ndarray.
    Return:
      list of feature shapes. Default values if predictions shape not fully
      determined.
    """
    feat_shapes = []
    for l in predictions:
        # Get the shape, from either a np array or a tensor.
        if isinstance(l, np.ndarray):
            shape = l.shape
        else:
            shape = l.get_shape().as_list()
        shape = shape[1:4]
        # Problem: undetermined shape...
        if None in shape:
            return default_shapes
        else:
            feat_shapes.append(shape)
    return feat_shapes


def ssd_anchor_one_layer(img_shape,  # shape of the original image, e.g. (300, 300)
                         feat_shape,  # shape of the feature map, e.g. (38, 38)
                         sizes,  # (this layer's size, next layer's size) in pixels; sizes[0] is the side of the small square
                         ratios,  # aspect ratios of the rectangular prior boxes
                         step,  # stride in pixels, e.g. ceil(300 / 38) = ceil(7.89) = 8 for the 38*38 feature map;
                                # unrelated to the step of 18 in ssd_size_bounds_to_values
                         offset=0.5,  # shifts a point from the top-left corner of a grid cell to the cell center
                         dtype=np.float32):
    """Compute SSD default anchor boxes for one feature layer.

    Determine the relative position grid of the centers, and the relative width and height.

    Arguments:
      feat_shape: Feature shape, used for computing relative position grids;
      sizes: Absolute reference sizes;
      ratios: Ratios to use on these features;
      img_shape: Image shape, used for computing height, width relatively to the former;
      offset: Grid offset.
    Return:
      y, x, h, w: Relative x and y grids, and height and width.
    """
    # Compute the position grid: simple way.
    # y, x = np.mgrid[0:feat_shape[0], 0:feat_shape[1]]
    # y = (y.astype(dtype) + offset) / feat_shape[0]
    # x = (x.astype(dtype) + offset) / feat_shape[1]
    # Weird SSD-Caffe computation using steps values...
    y, x = np.mgrid[0:feat_shape[0], 0:feat_shape[1]]
    '''
    With feat_shape[0] == feat_shape[1] == 38, print(x) gives
    [[ 0  1  2 ... 35 36 37]
     [ 0  1  2 ... 35 36 37]
     [ 0  1  2 ... 35 36 37]
     ...
     [ 0  1  2 ... 35 36 37]
     [ 0  1  2 ... 35 36 37]
     [ 0  1  2 ... 35 36 37]]
    and print(y) gives the transpose: every entry of row i equals i.
    '''
    # For the first feature map (block4, 38*38), with step = 8:
    y = (y.astype(dtype) + offset) * step / img_shape[0]  # (y + 0.5) * 8 / 300, close to (y + 0.5) / 38; normalized to 0-1
    x = (x.astype(dtype) + offset) * step / img_shape[1]  # dtype = np.float32
    # (x + offset) * step maps a feature map cell to the pixel position of its center in the original image.
    # Dividing by the image size (300) then normalizes that position.
    # The result is the center (x, y) of every anchor, as a fraction of the original image size.

    # Expand dims to support easy broadcasting.
    y = np.expand_dims(y, axis=-1)  # append a trailing axis for the anchors; the shape is now 38*38*1
    x = np.expand_dims(x, axis=-1)  # axis=-1 means the last axis

    # Compute relative height and width.
    # Tries to follow the original implementation of SSD for the order.
    # The 38*38, 3*3 and 1*1 layers have 4 prior boxes per cell; the other layers have 6.
    '''Why the count is computed rather than hard-coded:
    sizes holds [this layer's size, next layer's size], from which the small
    square and the large square are derived, so len(sizes) (= 2) is the number of squares.
    len(ratios) is the number of rectangles obtained by reshaping the small square (2 here).
    '''
    num_anchors = len(sizes) + len(ratios)  # prior boxes per cell on this layer, e.g. 2 + 2 = 4
    h = np.zeros((num_anchors,), dtype=dtype)  # placeholders for the heights and widths
    w = np.zeros((num_anchors,), dtype=dtype)
    '''
    >>> num_anchors = 4
    >>> h = np.zeros((num_anchors,), dtype=np.float32)
    >>> print(h)
    [0. 0. 0. 0.]
    Index 0 holds the small square. Dividing this layer's size by the image size
    scales it into the range 0-1, so every height and width is stored as a fraction.
    '''
    # Add first anchor boxes with ratio=1.
    # => Prior box 1: the small square.
    h[0] = sizes[0] / img_shape[0]  # scaled to 0-1; multiply by the image size to recover pixels
    w[0] = sizes[0] / img_shape[1]
    di = 1  # index offset: the number of square boxes recorded so far
    # Note to self: inspect the contents of sizes.

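    # => Prior box 2: the large square.
    if len(sizes) > 1:  # true whenever a second (next-layer) size is given, as it is for the first layer
        h[1] = math.sqrt(sizes[0] * sizes[1]) / img_shape[0]  # large square side as a fraction of the image
        w[1] = math.sqrt(sizes[0] * sizes[1]) / img_shape[1]  # dividing by 300 turns pixels into a fraction
        # Large square side = sqrt(s_k * s_(k+1)): sqrt(this layer's small side * the next layer's small side).
        di += 1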

    # => Prior boxes 3 and up: the rectangles.
    for i, r in enumerate(ratios):
        # Rectangle sides for aspect ratio a_r:
        # h = s_k / sqrt(a_r)
        # w = s_k * sqrt(a_r)
        h[i + di] = sizes[0] / img_shape[0] / math.sqrt(r)  # s_k / 300 / sqrt(a_r): normalize, then divide by sqrt(a_r)
        w[i + di] = sizes[0] / img_shape[1] * math.sqrt(r)  # s_k / 300 * sqrt(a_r)
    return y, x, h, w
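
To see the numbers this produces for the first layer, here is a minimal sketch that repeats only the height and width part of the computation. The function name sketch_anchor_shapes is hypothetical; the inputs (21., 45.) and [2, .5] are the first-layer sizes and ratios quoted in this post.

```python
import math

import numpy as np


def sketch_anchor_shapes(sizes, ratios, img_size=300.0):
    """Return an array of (h, w) rows, normalized by the image size."""
    small = sizes[0] / img_size
    shapes = [(small, small)]                      # small square
    big = math.sqrt(sizes[0] * sizes[1]) / img_size
    shapes.append((big, big))                      # large square
    for r in ratios:                               # rectangles
        shapes.append((small / math.sqrt(r), small * math.sqrt(r)))
    return np.array(shapes, dtype=np.float32)


hw = sketch_anchor_shapes((21.0, 45.0), [2.0, 0.5])
for h_rel, w_rel in hw:
    print(f"h = {h_rel * 300:6.2f} px, w = {w_rel * 300:6.2f} px")
```

Multiplying by 300 converts the normalized values back to pixels, which makes it easy to compare the four boxes of one cell.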


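def ssd_anchors_all_layers(img_shape,
                           layers_shape,  # shape of each feature layer
                           anchor_sizes,  # (size, next size) pair for each layer
                           anchor_ratios,  # list of aspect ratios for each layer
                           anchor_steps,  # stride of each layer relative to the original image
                           offset=0.5,
                           dtype=np.float32):
    """Compute anchor boxes for all feature layers.
    Takes the original image size and returns, for every feature map,
    the four values (y, x, h, w) that describe its anchor boxes.
    """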
    layers_anchors = []  # collects the anchor positions and sizes of every feature map
    for i, s in enumerate(layers_shape):
        # Compute the anchor positions and sizes of each feature map in turn
        # by calling ssd_anchor_one_layer, defined above. Its return value
        # (y, x, h, w) is stored in anchor_bboxes.
        anchor_bboxes = ssd_anchor_one_layer(img_shape,
                                             s,
                                             anchor_sizes[i],
                                             anchor_ratios[i],
                                             anchor_steps[i],
                                             offset=offset, dtype=dtype)
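        # anchor_sizes[i]: sizes for feature map i, e.g. (21., 45.) for map 0. Note that this
        #   configured value is 21., whereas ssd_size_bounds_to_values yields 300 * 0.15 / 2 = 22.5.
        # anchor_ratios[i]: aspect ratios for feature map i, e.g. [2, .5] for map 0.
        # anchor_steps[i]: stride for feature map i, e.g. 8 = ceil(300 / 38) for map 0.
        layers_anchors.append(anchor_bboxes)  # store the result for this layer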
    return layers_anchors
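
As a sanity check on the whole pipeline, the total number of prior boxes can be counted without building any grids. This sketch assumes the feature map shapes listed in this post and the per-cell counts stated above (4 boxes on the 38*38, 3*3 and 1*1 layers, 6 on the others); the names FEAT_SHAPES and BOXES_PER_CELL are illustrative.

```python
FEAT_SHAPES = [(38, 38), (19, 19), (10, 10), (5, 5), (3, 3), (1, 1)]
BOXES_PER_CELL = [4, 6, 6, 6, 4, 4]  # len(sizes) + len(ratios) for each layer

# Every grid cell of a layer carries the same set of prior boxes.
per_layer = [rows * cols * n
             for (rows, cols), n in zip(FEAT_SHAPES, BOXES_PER_CELL)]
total = sum(per_layer)

for shape, count in zip(FEAT_SHAPES, per_layer):
    print(f"{shape}: {count} prior boxes")
print("total:", total)
```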