A walkthrough of the functions in proposal_top_layer.py and snippets.py

南石北岸生

Categories: Object Detection, Faster R-CNN

Article link: https://blog.csdn.net/gusui7202/article/details/84634485

proposal_top_layer.py

This function is called from the main network in network.py. It picks the top RoI proposals out of the RPN output and does not apply NMS.

The annotated code follows:

# --------------------------------------------------------
# Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from model.config import cfg
from model.bbox_transform import bbox_transform_inv, clip_boxes, bbox_transform_inv_tf, clip_boxes_tf

import tensorflow as tf
import numpy as np
import numpy.random as npr

def proposal_top_layer(rpn_cls_prob, rpn_bbox_pred, im_info, _feat_stride, anchors, num_anchors):
  """A layer that just selects the top region proposals
     without using non-maximal suppression,
     For details please see the technical report
     This layer only selects the top RoI proposals; it does not perform NMS.
  """
  rpn_top_n = cfg.TEST.RPN_TOP_N  # 5000 by default

  # Only useful when TEST.MODE is 'top', specifies the number of top proposals to select
  # cfg.TEST.RPN_TOP_N defaults to 5000 (see model/config.py)
  # In other words, it is used only when TEST.MODE is 'top' and keeps the top 5000 boxes.

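  # Extract the probability scores (the last num_anchors channels, i.e. the object scores)
  scores = rpn_cls_prob[:, :, :, num_anchors:]
  # Flatten the predicted box deltas to shape (N, 4)
  rpn_bbox_pred = rpn_bbox_pred.reshape((-1, 4))
  # Flatten the scores to shape (N, 1) so they line up with the deltas
  scores = scores.reshape((-1, 1))
  # Count the boxes
  length = scores.shape[0]
  if length < rpn_top_n:
    # Fewer than rpn_top_n boxes: sample with replacement until there are exactly rpn_top_n
    # Random selection, maybe unnecessary and loses good proposals
    # But such case rarely happens
    # size is the number of samples; replace=True samples with replacement, replace=False without
    top_inds = npr.choice(length, size=rpn_top_n, replace=True)
  else:
    # Sort along axis 0 and reverse, giving indices in descending score order
    top_inds = scores.argsort(0)[::-1]
    # Keep the first rpn_top_n indices
    top_inds = top_inds[:rpn_top_n]
    # Flatten from (rpn_top_n, 1) to a 1-D array of length rpn_top_n
    top_inds = top_inds.reshape(rpn_top_n, )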

  # Do the selection here
  anchors = anchors[top_inds, :]  # select the anchors by index
  rpn_bbox_pred = rpn_bbox_pred[top_inds, :]  # select the RPN box regression deltas by index
  scores = scores[top_inds]  # select the scores by index

  # Convert anchors into proposals via bbox transformations
  # bbox_transform_inv converts anchors into proposals: given the anchors and the
  # predicted deltas (dx, dy, dw, dh), it computes the refined proposal boxes.
  # bbox_transform_inv_tf is a variant with the same behavior,
  # implemented with TensorFlow ops instead.
  proposals = bbox_transform_inv(anchors, rpn_bbox_pred)

  # Clip predicted boxes to image
  # i.e. clip the RoIs so they stay inside the image
  proposals = clip_boxes(proposals, im_info[:2])

  # Output rois blob
  # Our RPN implementation only supports a single input image, so all
  # batch inds are 0
  # This RPN implementation handles RoIs one image at a time, so every batch index is 0
  batch_inds = np.zeros((proposals.shape[0], 1), dtype=np.float32)
  blob = np.hstack((batch_inds, proposals.astype(np.float32, copy=False)))
  return blob, scores

# The function above is probably the original version; the one below is the author's TensorFlow version
def proposal_top_layer_tf(rpn_cls_prob, rpn_bbox_pred, im_info, _feat_stride, anchors, num_anchors):
  """A layer that just selects the top region proposals
     without using non-maximal suppression,
     For details please see the technical report
     Same functionality as the function above, but the implementation differs somewhat.
  """
  rpn_top_n = cfg.TEST.RPN_TOP_N

  scores = rpn_cls_prob[:, :, :, num_anchors:]
  rpn_bbox_pred = tf.reshape(rpn_bbox_pred, shape=(-1, 4))
  scores = tf.reshape(scores, shape=(-1,))

  # Do the selection here
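  # The selection is done entirely with TensorFlow ops (tf.nn.top_k, tf.reshape, tf.gather).
  # bbox_transform_inv_tf also comes in two versions; the function above uses the non-TF one.
  # Unlike the NumPy version, there is no random-sampling fallback here when fewer than rpn_top_n boxes exist.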
  top_scores, top_inds = tf.nn.top_k(scores, k=rpn_top_n)
  top_scores = tf.reshape(top_scores, shape=(-1, 1))
  top_anchors = tf.gather(anchors, top_inds)
  top_rpn_bbox = tf.gather(rpn_bbox_pred, top_inds)
  proposals = bbox_transform_inv_tf(top_anchors, top_rpn_bbox)

  # Clip predicted boxes to image
  # clip_boxes_tf is likewise the TensorFlow counterpart of clip_boxes
  proposals = clip_boxes_tf(proposals, im_info[:2])

  # Output rois blob
  # Our RPN implementation only supports a single input image, so all
  # batch inds are 0
  proposals = tf.to_float(proposals)
  batch_inds = tf.zeros((rpn_top_n, 1))
  blob = tf.concat([batch_inds, proposals], 1)
  return blob, top_scores
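
To make the selection step concrete, here is a minimal NumPy sketch of the two branches described above: take the highest-scoring indices when there are enough boxes, otherwise sample with replacement to reach a fixed count. The helper name select_top_n and the toy scores are made up for illustration and are not part of the repository.

```python
import numpy as np

def select_top_n(scores, top_n, rng=None):
    """Return top_n indices into a flat score array (hypothetical helper)."""
    rng = np.random.default_rng(0) if rng is None else rng
    scores = np.asarray(scores).reshape(-1)
    if scores.shape[0] < top_n:
        # Too few boxes: sample with replacement so the output size is fixed.
        return rng.choice(scores.shape[0], size=top_n, replace=True)
    # Otherwise take the indices of the top_n highest scores, best first.
    return np.argsort(scores)[::-1][:top_n]

scores = np.array([0.1, 0.9, 0.4, 0.7])  # toy scores
top = select_top_n(scores, 2)      # enough boxes: sorted selection
padded = select_top_n(scores, 6)   # too few boxes: random padding
print(top)
print(padded.shape)
```

The padded branch repeats some indices, which is why the original comment notes that it may be unnecessary and can lose good proposals.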

snippets.py

"snippets" means small pieces.

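The function in this file is called from the main network in network.py and is essentially a wrapper around generate_anchors(). That function was covered in an earlier post; it generates the anchor boxes.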

anchors = generate_anchors(ratios=np.array(anchor_ratios), scales=np.array(anchor_scales))

The wrapper's main job is to spread these generated boxes over the original image, which is why it computes the shift offsets.
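
As a rough illustration of the shift computation, the sketch below builds the offsets for a toy 2 x 3 feature map with a stride of 16. Both values are assumptions chosen to keep the output small; the calls mirror the NumPy ones used in the file.

```python
import numpy as np

feat_stride = 16      # assumed stride between feature map cells, in image pixels
height, width = 2, 3  # toy feature map size

# Feature map coordinates scaled back to original image coordinates.
shift_x = np.arange(0, width) * feat_stride
shift_y = np.arange(0, height) * feat_stride

# Coordinate matrices covering every (x, y) position on the grid.
shift_x, shift_y = np.meshgrid(shift_x, shift_y)

# One (x, y, x, y) row per position, so a shift can be added to (x1, y1, x2, y2).
shifts = np.vstack((shift_x.ravel(), shift_y.ravel(),
                    shift_x.ravel(), shift_y.ravel())).transpose()
print(shifts.shape)
print(shifts)
```

Each row moves a box by the same amount in both corners, so adding it to an anchor translates the anchor without changing its size.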

The annotated code follows:

# --------------------------------------------------------
# Tensorflow Faster R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Xinlei Chen
# --------------------------------------------------------
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
from layer_utils.generate_anchors import generate_anchors
# The function below is called from network.py. It has a twin, generate_anchors_pre_tf, used for the end-to-end implementation; the one below is used in the non-end-to-end case.
def generate_anchors_pre(height, width, feat_stride, anchor_scales=(8,16,32), anchor_ratios=(0.5,1,2)):
  """ A wrapper function to generate anchors given different scales
    Also return the number of anchors in variable 'length'
    Given the scales and ratios, it generates anchors of various sizes
  """
  # First call generate_anchors (covered earlier), which does the main work. This function just wraps it and adds some extra processing.
  anchors = generate_anchors(ratios=np.array(anchor_ratios), scales=np.array(anchor_scales))
  A = anchors.shape[0]
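  # width and height are the feature map size, giving width*height positions; multiplying by feat_stride maps each position to its anchor point in the original image.
  # For example, feature map x coordinates [1, 2, 3] map to [1, 2, 3] * 16 = [16, 32, 48] in the original image.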
  shift_x = np.arange(0, width) * feat_stride
  shift_y = np.arange(0, height) * feat_stride
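  # Scaling back by the stride keeps the positions in exact correspondence and puts them in the original image coordinate frame
  # meshgrid builds coordinate matrices from the two vectors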
  shift_x, shift_y = np.meshgrid(shift_x, shift_y)
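  # vstack stacks the four flattened arrays as rows (shape 4 x K); after the transpose, shifts is K x 4, one (x, y, x, y) offset per anchor point in the original image
  # ravel() flattens like flatten(), but ravel returns a view of the original data when possible, whereas flatten always returns a copy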
  shifts = np.vstack((shift_x.ravel(), shift_y.ravel(), shift_x.ravel(), shift_y.ravel())).transpose()
  K = shifts.shape[0]
  # width changes faster, so here it is H, W, C
  # Add every shift to every base anchor to get the full set of anchors
  anchors = anchors.reshape((1, A, 4)) + shifts.reshape((1, K, 4)).transpose((1, 0, 2))
  anchors = anchors.reshape((K * A, 4)).astype(np.float32, copy=False)
  length = np.int32(anchors.shape[0])
  return anchors, length
# The function below is the TensorFlow implementation of the one above
def generate_anchors_pre_tf(height, width, feat_stride=16, anchor_scales=(8, 16, 32), anchor_ratios=(0.5, 1, 2)):
  shift_x = tf.range(width) * feat_stride # width
  shift_y = tf.range(height) * feat_stride # height
  shift_x, shift_y = tf.meshgrid(shift_x, shift_y)
  sx = tf.reshape(shift_x, shape=(-1,))
  sy = tf.reshape(shift_y, shape=(-1,))
  shifts = tf.transpose(tf.stack([sx, sy, sx, sy]))
  K = tf.multiply(width, height)
  shifts = tf.transpose(tf.reshape(shifts, shape=[1, K, 4]), perm=(1, 0, 2))

  anchors = generate_anchors(ratios=np.array(anchor_ratios), scales=np.array(anchor_scales))
  A = anchors.shape[0]
  anchor_constant = tf.constant(anchors.reshape((1, A, 4)), dtype=tf.int32)
  # A is the number of base anchors (9 by default); K is width * height, i.e. the feature map size, which is also the number of anchor points
  length = K * A  # total number of anchors
  anchors_tf = tf.reshape(tf.add(anchor_constant, shifts), shape=(length, 4))
  
  return tf.cast(anchors_tf, dtype=tf.float32), length
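
The reshape-and-add step in both functions relies on broadcasting a (1, A, 4) array against a (K, 1, 4) array. The sketch below checks that against explicit loops. The two base anchors and three shifts are made-up values; in the real code they come from generate_anchors() and the shift grid.

```python
import numpy as np

# Two made-up base anchors as (x1, y1, x2, y2).
base = np.array([[-8, -8, 8, 8],
                 [-16, -4, 16, 4]], dtype=np.float32)
# Three made-up shifts, one per feature map position.
shifts = np.array([[0, 0, 0, 0],
                   [16, 0, 16, 0],
                   [0, 16, 0, 16]], dtype=np.float32)

A, K = base.shape[0], shifts.shape[0]

# (1, A, 4) + (K, 1, 4) broadcasts to (K, A, 4): every shift is added to every base anchor.
tiled = base.reshape((1, A, 4)) + shifts.reshape((1, K, 4)).transpose((1, 0, 2))
anchors = tiled.reshape((K * A, 4))

# The same result with explicit loops: positions in the outer loop, base anchors in the inner loop.
looped = np.array([b + s for s in shifts for b in base])

print(anchors.shape)
print(np.array_equal(anchors, looped))
```

The output is ordered position by position, with all A anchors of one position stored together, which matches the "H, W, C" remark in the original comment.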
