Deep Learning (8): Faster R-CNN, generating anchors and extracting RoIs by foreground probability

风痕依旧

Article link: https://blog.csdn.net/qq_29507011/article/details/101027114

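Every anchor is produced by adding a shift (a, b, a, b) to a base anchor shape from anchor_base, where (a, b) is the (y, x) position of a feature-map cell mapped back to the original image. Each resulting anchor is stored as (top-left y, top-left x, bottom-right y, bottom-right x).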

The code is the _enumerate_shifted_anchor() function in /model/region_proposal_network:

import numpy as np


def _enumerate_shifted_anchor(anchor_base, feat_stride, height, width):
    # anchor_base: (A, 4) base anchors; feat_stride: downsampling factor from
    # the image to the feature map; height, width: feature-map size.
    shift_y = np.arange(0, height * feat_stride, feat_stride)   # cell y positions mapped back to the image
    shift_x = np.arange(0, width * feat_stride, feat_stride)   # cell x positions mapped back to the image
    shift_x, shift_y = np.meshgrid(shift_x, shift_y)   # two (height, width) grids
    # ravel() flattens each grid (like flatten(), but without forcing a copy);
    # stacking on axis=1 gives one (y, x, y, x) row per cell, so shift is (K, 4).
    shift = np.stack((shift_y.ravel(), shift_x.ravel(),
                      shift_y.ravel(), shift_x.ravel()), axis=1)

    A = anchor_base.shape[0]   # number of base anchors (9)
    K = shift.shape[0]   # number of feature-map cells
    # Broadcast (1, A, 4) + (K, 1, 4): the shift of every cell is added to each
    # of the 9 base anchors, giving (K, A, 4), i.e. K * A anchors in total.
    anchor = anchor_base.reshape((1, A, 4)) + \
             shift.reshape((1, K, 4)).transpose((1, 0, 2))
    anchor = anchor.reshape((K * A, 4)).astype(np.float32)   # (K * A, 4): one row of 4 corner coordinates per anchor
    return anchor
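
As a quick check of the broadcasting step, the sketch below repeats the same arithmetic on a toy input. The 2 x 3 feature map, the stride of 16 and the two base anchors are illustrative assumptions, not the 9 base anchors the real model uses, and enumerate_shifted_anchor is a standalone copy written for this example.

```python
import numpy as np


def enumerate_shifted_anchor(anchor_base, feat_stride, height, width):
    """Toy re-implementation of the anchor enumeration described above."""
    shift_y = np.arange(0, height * feat_stride, feat_stride)
    shift_x = np.arange(0, width * feat_stride, feat_stride)
    shift_x, shift_y = np.meshgrid(shift_x, shift_y)
    shift = np.stack((shift_y.ravel(), shift_x.ravel(),
                      shift_y.ravel(), shift_x.ravel()), axis=1)  # (K, 4)
    A, K = anchor_base.shape[0], shift.shape[0]
    # (1, A, 4) + (K, 1, 4) broadcasts to (K, A, 4).
    anchor = anchor_base.reshape((1, A, 4)) + shift.reshape((K, 1, 4))
    return anchor.reshape((K * A, 4)).astype(np.float32)


# Two hypothetical base anchors as (y_min, x_min, y_max, x_max).
anchor_base = np.array([[0, 0, 16, 16],
                        [-8, 0, 24, 16]], dtype=np.float32)
anchors = enumerate_shifted_anchor(anchor_base, feat_stride=16, height=2, width=3)

print(anchors.shape)  # (12, 4): K = 2 * 3 cells, A = 2 base anchors
print(anchors[:2])    # both base anchors at cell (0, 0), unshifted
print(anchors[2:4])   # both base anchors at cell (0, 1), shifted 16 px in x
```

The rows are grouped by cell: all base anchors of the first cell come first, then all base anchors of the next cell, which is why the reshape to (K * A, 4) keeps the anchors of one location adjacent.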

The anchors are then filtered by their foreground probability. The code is in model/utils/creator_tool.py:

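import numpy as np
import cupy as cp
# loc2bbox and non_maximum_suppression are helper functions defined elsewhere in the same repository.


class ProposalCreator: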

    def __init__(self,
                 parent_model,
                 nms_thresh=0.7,
                 n_train_pre_nms=12000,
                 n_train_post_nms=2000,
                 n_test_pre_nms=6000,
                 n_test_post_nms=300,
                 min_size=16
                 ):
        self.parent_model = parent_model
        self.nms_thresh = nms_thresh
        self.n_train_pre_nms = n_train_pre_nms
        self.n_train_post_nms = n_train_post_nms
        self.n_test_pre_nms = n_test_pre_nms
        self.n_test_post_nms = n_test_post_nms
        self.min_size = min_size

    def __call__(self, loc, score,
                 anchor, img_size, scale=1.):
       
        if self.parent_model.training:
            n_pre_nms = self.n_train_pre_nms
            n_post_nms = self.n_train_post_nms
        else:
            n_pre_nms = self.n_test_pre_nms
            n_post_nms = self.n_test_post_nms

        # Convert anchors into proposals via bbox transformations.
        roi = loc2bbox(anchor, loc)   # predicted boxes as (y_min, x_min, y_max, x_max)

        # Clip predicted boxes to the image: columns 0 and 2 are y coordinates
        # (limited to [0, H] with H = img_size[0]), columns 1 and 3 are x
        # coordinates (limited to [0, W] with W = img_size[1]).
        roi[:, slice(0, 4, 2)] = np.clip(
            roi[:, slice(0, 4, 2)], 0, img_size[0])
        roi[:, slice(1, 4, 2)] = np.clip(
            roi[:, slice(1, 4, 2)], 0, img_size[1])

        # Remove predicted boxes with either height or width < threshold.
        min_size = self.min_size * scale
        hs = roi[:, 2] - roi[:, 0]     # 1-D array of box heights
        ws = roi[:, 3] - roi[:, 1]    # 1-D array of box widths
        keep = np.where((hs >= min_size) & (ws >= min_size))[0]    # 1-D array of indices of boxes that are large enough
        roi = roi[keep, :]   # keep only those boxes
        score = score[keep]    # foreground probabilities of the kept boxes

        # Sort all (proposal, score) pairs by score from highest to lowest.
        # Take top pre_nms_topN (e.g. 6000).
        order = score.ravel().argsort()[::-1]   # argsort is ascending; [::-1] reverses it, so indices run from highest to lowest score
        if n_pre_nms > 0:
            order = order[:n_pre_nms]
        roi = roi[order, :]    # boxes reordered by foreground probability, best first

        # Apply nms (e.g. threshold = 0.7).
        # Take after_nms_topN (e.g. 300).

        # unNOTE: something is wrong here!
        # TODO: remove cuda.to_gpu
        keep = non_maximum_suppression(
            cp.ascontiguousarray(cp.asarray(roi)),
            thresh=self.nms_thresh)   # non-maximum suppression; returns the indices of the boxes that survive
        if n_post_nms > 0:
            keep = keep[:n_post_nms]   # keep at most n_post_nms boxes
        roi = roi[keep]   # the final RoIs
        return roi
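
To see the filtering steps in isolation, here is a minimal CPU-only sketch of the same pipeline: clip, drop small boxes, sort by score, then NMS. It is not the repository code. nms_numpy is a plain greedy NMS written for this example to stand in for the CuPy-based non_maximum_suppression, filter_proposals is a hypothetical helper name, and the four boxes and their scores are made up.

```python
import numpy as np


def nms_numpy(boxes, thresh):
    """Greedy NMS on (y_min, x_min, y_max, x_max) boxes sorted best first."""
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    remaining = np.arange(len(boxes))
    keep = []
    while remaining.size > 0:
        i, rest = remaining[0], remaining[1:]
        keep.append(i)
        # Intersection of box i with every box still in the queue.
        tl = np.maximum(boxes[i, :2], boxes[rest, :2])
        br = np.minimum(boxes[i, 2:], boxes[rest, 2:])
        inter = np.prod(np.clip(br - tl, 0, None), axis=1)
        iou = inter / (areas[i] + areas[rest] - inter)
        remaining = rest[iou <= thresh]  # drop boxes that overlap box i too much
    return np.array(keep, dtype=np.int64)


def filter_proposals(roi, score, img_size, min_size=16,
                     n_pre_nms=6000, n_post_nms=300, nms_thresh=0.7):
    roi = roi.copy()
    roi[:, 0::2] = np.clip(roi[:, 0::2], 0, img_size[0])  # y into [0, H]
    roi[:, 1::2] = np.clip(roi[:, 1::2], 0, img_size[1])  # x into [0, W]
    hs = roi[:, 2] - roi[:, 0]
    ws = roi[:, 3] - roi[:, 1]
    keep = np.where((hs >= min_size) & (ws >= min_size))[0]
    roi, score = roi[keep], score[keep]
    order = score.ravel().argsort()[::-1]  # highest score first
    if n_pre_nms > 0:
        order = order[:n_pre_nms]
    roi = roi[order]
    keep = nms_numpy(roi, nms_thresh)
    if n_post_nms > 0:
        keep = keep[:n_post_nms]
    return roi[keep]


rois = np.array([[10, 10, 60, 60],      # score 0.90
                 [12, 12, 62, 62],      # score 0.80, overlaps the first box heavily
                 [100, 100, 105, 105],  # score 0.95, but smaller than min_size
                 [-20, 150, 40, 260]],  # score 0.60, partly outside the image
                dtype=np.float32)
scores = np.array([0.90, 0.80, 0.95, 0.60], dtype=np.float32)
out = filter_proposals(rois, scores, img_size=(200, 250))
print(out)
```

In this toy input the 5 x 5 box is dropped by the size check despite having the highest score, the second box is suppressed because its IoU with the first is about 0.85, and the last box survives after being clipped to the image.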