Reference article
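In object detection algorithms, the ROIs produced by the region proposal stage vary in size, while the classification network requires a fixed-size input. ROI Pooling bridges the two, which makes it possible to train the network end to end.
The figure below shows a feature map. The black box is a proposed ROI, which ROI Pooling must reduce to a 2x2 output.
The ROI Pooling operation is simple and works as follows: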
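The box has width W = 7 and height H = 5. Splitting it into 2x2 bins puts the bottom-right corner of the top-left bin at (x, y) = (W/2, 3 + H/2) = (3.5, 5.5). Because feature-map coordinates are integers, ROI Pooling simply drops the fractional part and uses (3, 5). In this sense ROI Pooling works on the nearest-neighbor principle: it snaps to a grid point rather than interpolating between points.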
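The boundary arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name `quantized_edges` is made up for illustration, and the ROI origin (x = 0, y = 3) is an assumption inferred from the expression (W/2, 3 + H/2).

```python
def quantized_edges(start, length, n_bins):
    """Split [start, start + length) into n_bins and truncate each edge to an integer."""
    step = length / n_bins
    # int() drops the fractional part, mirroring the truncation described above.
    return [int(start + i * step) for i in range(n_bins + 1)]

# ROI from the example: assumed to start at x = 0, y = 3, with W = 7 and H = 5.
x_edges = quantized_edges(0, 7, 2)  # [0, 3, 7]
y_edges = quantized_edges(3, 5, 2)  # [3, 5, 8]
print(x_edges, y_edges)
```

Because of the truncation, the bins are unequal: the columns split into widths 3 and 4, and the rows into heights 2 and 3.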
Max pooling is then applied to each of the resulting sub-regions to produce the 2x2 output.
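Putting the two steps together, a minimal NumPy sketch of ROI max pooling on a single-channel feature map could look like the following. It is not the Keras layer discussed in this article: the function name `roi_max_pool`, the `(x, y, w, h)` argument order, and the 8x8 feature map are all assumptions made for illustration.

```python
import numpy as np

def roi_max_pool(feature_map, x, y, w, h, pool_size):
    """Max-pool the ROI (x, y, w, h) of a 2D feature map into pool_size x pool_size."""
    out = np.empty((pool_size, pool_size), dtype=feature_map.dtype)
    for j in range(pool_size):        # output row
        for i in range(pool_size):    # output column
            # Truncate the fractional bin boundaries to integer grid positions.
            x1 = int(x + i * w / pool_size)
            x2 = int(x + (i + 1) * w / pool_size)
            y1 = int(y + j * h / pool_size)
            y2 = int(y + (j + 1) * h / pool_size)
            # Guard against empty bins when the ROI is smaller than the output grid.
            x2 = max(x2, x1 + 1)
            y2 = max(y2, y1 + 1)
            out[j, i] = feature_map[y1:y2, x1:x2].max()
    return out

# Hypothetical 8x8 feature map with values 0..63 (not the one from the figure).
feature_map = np.arange(64).reshape(8, 8)
pooled = roi_max_pool(feature_map, x=0, y=3, w=7, h=5, pool_size=2)
print(pooled)
```

With this made-up feature map the result is `[[34, 38], [58, 62]]`: each entry is the largest value inside its bin, and the output is 2x2 regardless of the ROI's original size.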
The RoiPoolingConv layer in the Keras version of Faster R-CNN is implemented as follows:
class RoiPoolingConv(Layer):
    '''ROI pooling layer for 2D inputs.
    See Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition,
    K. He, X. Zhang, S. Ren, J. Sun
    # Arguments
        pool_size: int
            Size of pooling region to use. pool_size = 7 will result in a 7x7 region.
        num_rois: number of regions of interest to be used
    # Input shape
        list of two 4D tensors [X_img,X_roi] with shape:
        X_img:
        `(1, channels, rows, cols)` if dim_ordering='th'
        or 4D tensor with shape:
        `(1, rows, cols, channels)` if dim_ordering='tf'.
        X_roi:
        `(1,num_rois,4)` list of rois, with orderin