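Reference: https://blog.csdn.net/weixin_41693877/article/details/107159304
Overview
(1) Generate all anchors.
(2) Apply the offsets predicted by the regression branch to the generated anchors, then clip any part of a box that falls outside the original image to the image boundary. The result is the set of proposals.
(3) Sort the proposals by the scores predicted by the network and keep the top-ranked ones. Then run NMS on those proposals and keep the first 2000 as the result.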
1 Generate all anchors
https://blog.csdn.net/weixin_43436587/article/details/108082934
See the target data generation section for details.
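As a rough illustration of what "generate all anchors" means, here is a minimal plain-Python sketch. It is not the code from the referenced posts: the function name generate_anchors, the stride of 16, the three scales, the three aspect ratios, and the cell-centered placement are all assumptions chosen for illustration. With 9 anchors per location and an assumed 37x50 feature map, the total matches the 16650 that appears in the shape comments of the later sections.

```python
import math

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Return a list of (x1, y1, x2, y2) anchors covering the feature map.

    All default values are illustrative assumptions.
    """
    # Base anchors centered on the origin, one per (ratio, scale) pair.
    base = []
    for ratio in ratios:              # ratio = height / width
        for scale in scales:
            side = stride * scale     # side length of the ratio-1 anchor
            w = side / math.sqrt(ratio)
            h = side * math.sqrt(ratio)
            base.append((-w / 2, -h / 2, w / 2, h / 2))

    # Shift the base anchors to the center of every feature-map cell.
    anchors = []
    for gy in range(feat_h):
        for gx in range(feat_w):
            cx = gx * stride + stride / 2
            cy = gy * stride + stride / 2
            for x1, y1, x2, y2 in base:
                anchors.append((cx + x1, cy + y1, cx + x2, cy + y2))
    return anchors

anchors = generate_anchors(37, 50)
print(len(anchors))
```

Every ratio keeps the area of its scale unchanged and only changes the shape of the box, so each location gets three sizes in three shapes.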
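2 Refine the proposals
# Once the anchors are generated, first shift them using the regression output. Shape: (batch, 16650, 4)
# The regression branch predicts offsets, which are applied to the original anchors
# anchors were generated earlier; bbox_deltas is the output of the regression branch
proposals = bbox_transform_inv(anchors, bbox_deltas, batch_size)
# 2. clip predicted boxes to image
# Constrain all four corners of every proposal to lie inside the image:
# boxes that extend past the image are clipped to its boundary. Shape: (batch, 16650, 4)
proposals = clip_boxes(proposals, im_info, batch_size)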
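The two calls above can be illustrated for a single box. The sketch below is an assumption-based, plain-Python version of the usual Faster R-CNN decoding, not the actual bodies of bbox_transform_inv or clip_boxes: the names decode_box and clip_box are hypothetical, the deltas are taken to be (dx, dy, dw, dh), and width is computed as x2 - x1 (some implementations add 1 to width and height).

```python
import math

def decode_box(anchor, delta):
    """Apply (dx, dy, dw, dh) offsets to one (x1, y1, x2, y2) anchor."""
    x1, y1, x2, y2 = anchor
    dx, dy, dw, dh = delta
    w = x2 - x1
    h = y2 - y1
    cx = x1 + 0.5 * w
    cy = y1 + 0.5 * h
    # The center moves by a fraction of the anchor size;
    # width and height are scaled exponentially.
    pred_cx = dx * w + cx
    pred_cy = dy * h + cy
    pred_w = math.exp(dw) * w
    pred_h = math.exp(dh) * h
    return (pred_cx - 0.5 * pred_w, pred_cy - 0.5 * pred_h,
            pred_cx + 0.5 * pred_w, pred_cy + 0.5 * pred_h)

def clip_box(box, im_h, im_w):
    """Clamp the corners of a box to [0, im_w - 1] x [0, im_h - 1]."""
    def clamp(value, upper):
        return min(max(value, 0.0), upper)
    x1, y1, x2, y2 = box
    return (clamp(x1, im_w - 1.0), clamp(y1, im_h - 1.0),
            clamp(x2, im_w - 1.0), clamp(y2, im_h - 1.0))

# Shift a 10x10 anchor half its width to the right, then clip to a 12x12 image.
shifted = decode_box((0, 0, 10, 10), (0.5, 0.0, 0.0, 0.0))
print(clip_box(shifted, im_h=12, im_w=12))
```

With all-zero deltas, decode_box returns the anchor unchanged, which is why the regression branch only has to learn corrections relative to the anchor.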
3 Run NMS on the high-scoring proposals
scores_keep = scores        # (batch, 16650)
proposals_keep = proposals  # (batch, 16650, 4)
# Sort the scores of each sample in descending order
_, order = torch.sort(scores_keep, 1, True)
output = scores.new(batch_size, post_nms_topN, 5).zero_()
for i in range(batch_size):
    proposals_single = proposals_keep[i]  # proposals of a single sample
    scores_single = scores_keep[i]        # foreground probabilities of a single sample
    order_single = order[i]               # indices that sort this sample's foreground probabilities
    # Keep the top 12000 (training stage)
    if pre_nms_topN > 0 and pre_nms_topN < scores_keep.numel():
        order_single = order_single[:pre_nms_topN]
    # Gather the 12000 highest-scoring proposals (training stage)
    proposals_single = proposals_single[order_single, :]
    scores_single = scores_single[order_single].view(-1,1)
    # Run NMS
    keep_idx_i = nms(torch.cat((proposals_single, scores_single), 1), nms_thresh, force_cpu=not cfg.USE_GPU_NMS)
    keep_idx_i = keep_idx_i.long().view(-1)
    # Keep the first 2000 as the final proposal output
    if post_nms_topN > 0:
        keep_idx_i = keep_idx_i[:post_nms_topN]
    proposals_single = proposals_single[keep_idx_i, :]
    scores_single = scores_single[keep_idx_i, :]
    # padding 0 at the end.
    num_proposal = proposals_single.size(0)
    output[i,:,0] = i
    output[i,:num_proposal,1:] = proposals_single
For each sample, take its proposals_single, scores_single, and order_single. Use the order to select the top 12000 scores and proposals, run NMS on them, and keep the first 2000 as the output.
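The selection logic described above (sort, keep the top pre-NMS candidates, run NMS, keep the top post-NMS survivors) can be sketched for one sample in plain Python. This is a simplified greedy NMS written for readability, not the nms routine called in the code above; the names iou and select_proposals are hypothetical, and boxes are assumed to be (x1, y1, x2, y2) tuples.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_proposals(boxes, scores, pre_nms_top_n, post_nms_top_n, nms_thresh):
    """Return the indices of the proposals kept for one sample."""
    # Sort indices by score, highest first, and keep the top candidates.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if pre_nms_top_n > 0:
        order = order[:pre_nms_top_n]
    # Greedy NMS: keep a box only if it does not overlap too much
    # with any higher-scoring box that was already kept.
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    # Keep at most post_nms_top_n survivors.
    if post_nms_top_n > 0:
        keep = keep[:post_nms_top_n]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(select_proposals(boxes, scores, 12000, 2000, 0.5))
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed at a threshold of 0.5 but would survive at a threshold of 0.7.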