Deep Learning (8): Faster R-CNN, Generating Anchors and Extracting RoIs by Foreground Probability

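Each anchor is obtained by adding a coordinate shift (a, b, a, b) to an anchor shape from anchor_base. The output for each anchor is its top-left y coordinate, top-left x coordinate, bottom-right y coordinate, and bottom-right x coordinate.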

The code is in /model/region_proposal_network, in the function _enumerate_shifted_anchor()

The code is as follows:

def _enumerate_shifted_anchor(anchor_base, feat_stride, height, width):
    # anchor_base: the A base anchors, shape (A, 4); feat_stride: downsampling factor from image to feature map
    import numpy as xp
    shift_y = xp.arange(0, height * feat_stride, feat_stride)   # y offsets mapped back to the original image
    shift_x = xp.arange(0, width * feat_stride, feat_stride)   # x offsets mapped back to the original image
    shift_x, shift_y = xp.meshgrid(shift_x, shift_y)   # two grids, each of shape (height, width)
    # ravel, like flatten, gives a 1-D array; stack with axis=1 puts the four arrays
    # side by side as columns, so each row is (y, x, y, x) and shape[0] is the number of positions
    shift = xp.stack((shift_y.ravel(), shift_x.ravel(),
                      shift_y.ravel(), shift_x.ravel()), axis=1)

    A = anchor_base.shape[0]   # number of base anchors (9)
    K = shift.shape[0]   # number of feature-map positions
    # Broadcast the base anchors (1, A, 4) against the shifts (K, 1, 4): every position
    # is added to each of the 9 base anchors, giving (K, A, 4), i.e. K*A anchors
    anchor = anchor_base.reshape((1, A, 4)) + \
             shift.reshape((1, K, 4)).transpose((1, 0, 2))
    # Flatten to (K*A, 4): one row per anchor, columns are (y_min, x_min, y_max, x_max)
    anchor = anchor.reshape((K * A, 4)).astype(xp.float32)
    return anchor
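
To make the broadcasting step concrete, here is a small sketch that repeats the same arithmetic on a 2x3 feature map. The two base anchors and the sizes are made-up values for illustration only (the real anchor_base has 9 rows), and `[None, :, :]` indexing is used as an equivalent of the reshape/transpose in the function above.

```python
import numpy as np

# Two hypothetical base anchors in (y_min, x_min, y_max, x_max) form.
anchor_base = np.array([[-8., -8., 8., 8.],
                        [-16., -16., 16., 16.]], dtype=np.float32)
feat_stride, height, width = 16, 2, 3  # illustrative values

shift_y = np.arange(0, height * feat_stride, feat_stride)  # [0, 16]
shift_x = np.arange(0, width * feat_stride, feat_stride)   # [0, 16, 32]
shift_x, shift_y = np.meshgrid(shift_x, shift_y)           # both have shape (2, 3)
shift = np.stack((shift_y.ravel(), shift_x.ravel(),
                  shift_y.ravel(), shift_x.ravel()), axis=1)  # (6, 4)

A, K = anchor_base.shape[0], shift.shape[0]
# (1, A, 4) + (K, 1, 4) broadcasts to (K, A, 4), then flattens to (K*A, 4).
anchor = (anchor_base[None, :, :] + shift[:, None, :]).reshape(K * A, 4)
anchor = anchor.astype(np.float32)

print(anchor.shape)  # (12, 4)
print(anchor[2:4])   # the two anchors at feature cell (row 0, col 1)
```

Rows are grouped by position: the first A rows belong to the first feature-map cell, the next A rows to the cell to its right, and so on, row by row.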

The anchors are then filtered by their foreground probability. The code is in model/utils/creator_tool.py

The code is as follows:

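import numpy as np
import cupy as cp
# loc2bbox and non_maximum_suppression are helper functions defined elsewhere in the repository

class ProposalCreator: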

    def __init__(self,
                 parent_model,
                 nms_thresh=0.7,
                 n_train_pre_nms=12000,
                 n_train_post_nms=2000,
                 n_test_pre_nms=6000,
                 n_test_post_nms=300,
                 min_size=16
                 ):
        self.parent_model = parent_model
        self.nms_thresh = nms_thresh
        self.n_train_pre_nms = n_train_pre_nms
        self.n_train_post_nms = n_train_post_nms
        self.n_test_pre_nms = n_test_pre_nms
        self.n_test_post_nms = n_test_post_nms
        self.min_size = min_size

    def __call__(self, loc, score,
                 anchor, img_size, scale=1.):
       
        if self.parent_model.training:
            n_pre_nms = self.n_train_pre_nms
            n_post_nms = self.n_train_post_nms
        else:
            n_pre_nms = self.n_test_pre_nms
            n_post_nms = self.n_test_post_nms

        # Convert anchors into proposal via bbox transformations.
        roi = loc2bbox(anchor, loc)   # decoded predicted boxes

        # Clip predicted boxes to image.
        roi[:, slice(0, 4, 2)] = np.clip(
            roi[:, slice(0, 4, 2)], 0, img_size[0])    # columns 0 and 2 are y coordinates: clip to [0, H]
        roi[:, slice(1, 4, 2)] = np.clip(
            roi[:, slice(1, 4, 2)], 0, img_size[1])    # columns 1 and 3 are x coordinates: clip to [0, W]

        # Remove predicted boxes with either height or width < threshold.
        min_size = self.min_size * scale
        hs = roi[:, 2] - roi[:, 0]     # 1-D array of roi heights
        ws = roi[:, 3] - roi[:, 1]    # 1-D array of roi widths
        keep = np.where((hs >= min_size) & (ws >= min_size))[0]    # 1-D array of indices of the rois that are large enough
        roi = roi[keep, :]   # keep only those rois
        score = score[keep]    # foreground probabilities of the kept rois

        # Sort all (proposal, score) pairs by score from highest to lowest.
        # Take top pre_nms_topN (e.g. 6000).
        order = score.ravel().argsort()[::-1]   # flatten the scores; argsort is ascending, so reverse it to get indices from highest to lowest score
        if n_pre_nms > 0:
            order = order[:n_pre_nms]
        roi = roi[order, :]    # rois reordered by descending foreground probability

        # Apply nms (e.g. threshold = 0.7).
        # Take after_nms_topN (e.g. 300).

        # unNOTE: something is wrong here!
        # TODO: remove cuda.to_gpu
        keep = non_maximum_suppression(
            cp.ascontiguousarray(cp.asarray(roi)),
            thresh=self.nms_thresh)   # non-maximum suppression; returns the indices of the rois that survive
        if n_post_nms > 0:
            keep = keep[:n_post_nms]   # keep at most n_post_nms indices
        roi = roi[keep]   # the final rois, at most n_post_nms of them
        return roi
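
The filtering steps in `__call__` (clip, drop small boxes, sort by score, NMS, truncate) can be tried on the CPU with the sketch below. It is not the repository's implementation: `nms_numpy` and `filter_proposals` are hypothetical names, the greedy NumPy NMS stands in for the CuPy-based `non_maximum_suppression`, and the input boxes are assumed to be already decoded by `loc2bbox`.

```python
import numpy as np

def nms_numpy(boxes, thresh):
    """Greedy NMS on boxes already sorted by descending score.
    boxes: (N, 4) as (y_min, x_min, y_max, x_max). Returns kept indices."""
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    remaining = np.arange(len(boxes))
    keep = []
    while remaining.size > 0:
        i, rest = remaining[0], remaining[1:]
        keep.append(i)
        # Intersection of box i with every remaining box.
        tl = np.maximum(boxes[i, :2], boxes[rest, :2])
        br = np.minimum(boxes[i, 2:], boxes[rest, 2:])
        inter = np.prod(np.clip(br - tl, 0, None), axis=1)
        iou = inter / (areas[i] + areas[rest] - inter)
        remaining = rest[iou <= thresh]  # drop boxes overlapping box i too much
    return np.array(keep, dtype=np.int64)

def filter_proposals(roi, score, img_size, min_size=16,
                     n_pre_nms=6000, n_post_nms=300, nms_thresh=0.7):
    roi = roi.copy()
    roi[:, 0::2] = np.clip(roi[:, 0::2], 0, img_size[0])  # y coordinates to [0, H]
    roi[:, 1::2] = np.clip(roi[:, 1::2], 0, img_size[1])  # x coordinates to [0, W]
    hs = roi[:, 2] - roi[:, 0]
    ws = roi[:, 3] - roi[:, 1]
    keep = np.where((hs >= min_size) & (ws >= min_size))[0]
    roi, score = roi[keep], score[keep]
    order = score.ravel().argsort()[::-1][:n_pre_nms]     # highest score first
    roi = roi[order]
    keep = nms_numpy(roi, nms_thresh)[:n_post_nms]
    return roi[keep]

# Four made-up boxes: a good one, a near-duplicate, a tiny one, one partly outside the image.
roi = np.array([[10., 10., 60., 60.],
                [12., 12., 62., 62.],
                [100., 100., 105., 105.],
                [-20., 150., 40., 260.]], dtype=np.float32)
score = np.array([0.9, 0.8, 0.95, 0.6], dtype=np.float32)
kept = filter_proposals(roi, score, img_size=(200, 200))
print(kept)
```

In this toy input the tiny box is removed by the size check even though it has the highest score, the near-duplicate is suppressed by NMS (IoU of about 0.85 with the first box), and the out-of-bounds box survives after clipping.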
