Number of anchors
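This section describes how anchors are generated in the Mask R-CNN network from the FPN feature maps [P2, P3, P4, P5, P6]. Following the rules in the source code, the five feature levels P2 to P6 are traversed and anchors are generated at every pixel of every feature map. The anchors at one pixel all have the same area but different aspect ratios, RATIO = [0.5, 1, 2], so each pixel produces 3 anchors. The table below assumes a 1024 x 1024 input image.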
feature | shape (h, w, c) | number of anchors | feature_stride | RPN_ANCHOR_SCALES |
---|---|---|---|---|
P2 | 256 * 256 * 256 | 256 * 256 * 3 = 196608 | 4 | 32 |
P3 | 128 * 128 * 256 | 128 * 128 * 3 = 49152 | 8 | 64 |
P4 | 64 * 64 * 256 | 64 * 64 * 3 = 12288 | 16 | 128 |
P5 | 32 * 32 * 256 | 32 * 32 * 3 = 3072 | 32 | 256 |
P6 | 16 * 16 * 256 | 16 * 16 * 3 = 768 | 64 | 512 |
Total | | 261888 | | |
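The per-level counts and the total in the table can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes a square 1024 x 1024 input and 3 anchors per pixel; the helper name `anchors_per_level` is made up for illustration and is not part of the Mask R-CNN source.

```python
import math

def anchors_per_level(image_size, strides, anchors_per_pixel=3):
    """Return the anchor count of each pyramid level (hypothetical helper)."""
    counts = []
    for stride in strides:
        # Feature-map side length; height equals width for a square image.
        side = math.ceil(image_size / stride)
        counts.append(side * side * anchors_per_pixel)
    return counts

strides = [4, 8, 16, 32, 64]  # P2..P6
counts = anchors_per_level(1024, strides)
print(counts)       # [196608, 49152, 12288, 3072, 768]
print(sum(counts))  # 261888
```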
Generating anchors
The source code has a dedicated function for this step: generate_anchors()
anchors = self.get_anchors(config.IMAGE_SHAPE)
# config.IMAGE_SHAPE = (1024, 1024, 3), the shape of the input image
# Duplicate across the batch dimension because Keras requires it
# TODO: can this be optimized to avoid duplicating the anchors?
# Broadcast: (coordinates) -> (batch_size, coordinates)
anchors = np.broadcast_to(anchors, (config.BATCH_SIZE,) + anchors.shape)
a = utils.generate_pyramid_anchors(
    self.config.RPN_ANCHOR_SCALES,   # (32, 64, 128, 256, 512)
    self.config.RPN_ANCHOR_RATIOS,   # [0.5, 1, 2]
    backbone_shapes,
    self.config.BACKBONE_STRIDES,    # [4, 8, 16, 32, 64]
    self.config.RPN_ANCHOR_STRIDE)   # 1
# Generate anchors for each feature-map level
anchors = []
for i in range(len(scales)):
    anchors.append(generate_anchors(scales[i], ratios, feature_shapes[i],
                                    feature_strides[i], anchor_stride))
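To make the shapes concrete, here is a small sketch of what happens to the per-level list afterwards. It assumes the per-level arrays are joined along the first axis and then broadcast across the batch as in the `np.broadcast_to` call shown earlier. The arrays are zero-filled stand-ins with the row counts from the table, and `batch_size = 2` is an illustrative value standing in for `config.BATCH_SIZE`.

```python
import numpy as np

# Stand-in per-level arrays: one row of (y1, x1, y2, x2) per anchor.
level_counts = [196608, 49152, 12288, 3072, 768]  # P2..P6
per_level = [np.zeros((n, 4), dtype=np.float32) for n in level_counts]

# Join all levels into a single (N, 4) array.
all_anchors = np.concatenate(per_level, axis=0)

# Add a leading batch dimension without copying the data.
batch_size = 2  # illustrative, stands in for config.BATCH_SIZE
batched = np.broadcast_to(all_anchors, (batch_size,) + all_anchors.shape)

print(all_anchors.shape)        # (261888, 4)
print(batched.shape)            # (2, 261888, 4)
print(batched.flags.writeable)  # False: broadcast_to returns a read-only view
```

Because `np.broadcast_to` returns a view, the batch copies share memory with the original array, so the duplication mentioned in the TODO comment costs no extra memory on the NumPy side.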
Taking the P2 feature map as an example, the computation proceeds step by step as follows.
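import numpy as np

scales = 32                    # RPN_ANCHOR_SCALES entry for P2
ratios = [0.5, 1, 2]           # RPN_ANCHOR_RATIOS
shape = np.array([256, 256])   # (height, width) of the P2 feature map
feature_stride = 4             # BACKBONE_STRIDES entry for P2
anchor_stride = 1              # RPN_ANCHOR_STRIDE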
print('scales:',scales)
print('ratios:', ratios)
print('shape:', shape)
print('feature_stride:', feature_stride)
print('anchor_stride:', anchor_stride)
scales, ratios = np.meshgrid(np.array(scales), np.array(ratios))
scales = scales.flatten()
ratios = ratios.flatten()
print('scales:', scales)
print('ratios:', ratios)
scales: 32
ratios: [0.5, 1, 2]
shape: [256 256]
feature_stride: 4
anchor_stride: 1
scales: [32 32 32]
ratios: [0.5 1. 2. ]
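Treating scales as the side length of a square, the heights and widths of the rectangles for the three ratios are computed as follows. The area stays equal to scales squared, and ratios equals width / height.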
heights = scales / np.sqrt(ratios)
widths = scales * np.sqrt(ratios)
print('heights:', heights)
print('widths:', widths)
heights: [45.254834 32. 22.627417]
widths: [22.627417 32. 45.254834]
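The two formulas give h = s / sqrt(r) and w = s * sqrt(r), so h * w = s^2 and w / h = r for every ratio r. The sketch below checks both identities numerically for the P2 scale; the variable names `areas` and `aspect` are introduced here only for illustration.

```python
import numpy as np

scale = 32
ratios = np.array([0.5, 1, 2])

heights = scale / np.sqrt(ratios)
widths = scale * np.sqrt(ratios)

areas = heights * widths   # every entry should be close to scale ** 2
aspect = widths / heights  # should recover the ratios

print(np.allclose(areas, scale ** 2))  # True
print(np.allclose(aspect, ratios))     # True
```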
# Enumerate shifts in feature space
shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)
print(shifts_x)
print(shifts_y)
[[ 0 4 8 … 1012 1016 1020]
[ 0 4 8 … 1012 1016 1020]
[ 0 4 8 … 1012 1016 1020]
…
[ 0 4 8 … 1012 1016 1020]
[ 0 4 8 … 1012 1016 1020]
[ 0 4 8 … 1012 1016 1020]]
[[ 0 0 0 … 0 0 0]
[ 4 4 4 … 4 4 4]
[ 8 8 8 … 8 8 8]
…
[1012 1012 1012 … 1012 1012 1012]
[1016 1016 1016 … 1016 1016 1016]
[1020 1020 1020 … 1020 1020 1020]]
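The same step is easier to read on a toy grid. The sketch below uses a made-up 2 x 3 feature map (not a real Mask R-CNN level) to show that `np.meshgrid` repeats the x shifts along rows and the y shifts along columns, which gives one (x, y) center per feature-map pixel in image coordinates.

```python
import numpy as np

shape = (2, 3)       # toy feature map: 2 rows, 3 columns
feature_stride = 4   # each feature-map pixel covers 4 image pixels
anchor_stride = 1

shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride  # [0 4]
shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride  # [0 4 8]
grid_x, grid_y = np.meshgrid(shifts_x, shifts_y)

print(grid_x)  # both rows are [0 4 8]
print(grid_y)  # row 0 is all 0, row 1 is all 4
```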
box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
box_heights, box_centers_y = np.meshgrid(heights, shifts_y)
# Reshape to get a list of (y, x) and a list of (h, w)
box_centers = np.stack(
    [box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
print(box_centers)
box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])
print(box_sizes)
[[ 0 0]
[ 0 0]
[ 0 0]
…
[1020 1020]
[1020 1020]
[1020 1020]]
[[45.254834 22.627417]
[32. 32. ]
[22.627417 45.254834]
…
[45.254834 22.627417]
[32. 32. ]
[22.627417 45.254834]]
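The pairing of sizes with centers can be traced on a small case. This sketch uses toy widths and heights and a 2 x 2 grid of centers (all values are illustrative) to show that each center is repeated once per ratio and that the list of sizes cycles through the ratios in step with it.

```python
import numpy as np

widths = np.array([1.0, 2.0, 3.0])    # toy widths, one per ratio
heights = np.array([3.0, 2.0, 1.0])   # toy heights, one per ratio
shifts_x, shifts_y = np.meshgrid(np.array([0, 4]), np.array([0, 4]))

# meshgrid flattens the 2-D shifts, giving arrays of shape (pixels, ratios).
box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
box_heights, box_centers_y = np.meshgrid(heights, shifts_y)

box_centers = np.stack([box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])

print(box_centers.shape, box_sizes.shape)  # (12, 2) (12, 2)
print(box_centers[:6])  # (0, 0) three times, then (0, 4) three times
print(box_sizes[:3])    # (3, 1), (2, 2), (1, 3)
```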
Top-left and bottom-right corner coordinates of the boxes, as (y1, x1, y2, x2):
boxes = np.concatenate([box_centers - 0.5 * box_sizes,
                        box_centers + 0.5 * box_sizes], axis=1)
print(boxes)
print(boxes.shape)
[[ -22.627417 -11.3137085 22.627417 11.3137085]
[ -16. -16. 16. 16. ]
[ -11.3137085 -22.627417 11.3137085 22.627417 ]
…
[ 997.372583 1008.6862915 1042.627417 1031.3137085]
[1004. 1004. 1036. 1036. ]
[1008.6862915 997.372583 1031.3137085 1042.627417 ]]
(196608, 4)
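The steps of the walkthrough can be collected into a single function. This is a sketch assembled from the snippets above, not a verbatim copy of the library code, and the name `anchors_for_level` is hypothetical. Running it with the P2 settings reproduces the shape and the corner values printed above.

```python
import numpy as np

def anchors_for_level(scale, ratios, shape, feature_stride, anchor_stride=1):
    """Return anchors of one level as rows of (y1, x1, y2, x2)."""
    # One (scale, ratio) pair per anchor shape.
    scales, ratios = np.meshgrid(np.array(scale), np.array(ratios))
    scales, ratios = scales.flatten(), ratios.flatten()

    # Equal-area rectangles: height * width == scale ** 2.
    heights = scales / np.sqrt(ratios)
    widths = scales * np.sqrt(ratios)

    # Anchor centers in image coordinates.
    shifts_y = np.arange(0, shape[0], anchor_stride) * feature_stride
    shifts_x = np.arange(0, shape[1], anchor_stride) * feature_stride
    shifts_x, shifts_y = np.meshgrid(shifts_x, shifts_y)

    # Pair every center with every (height, width).
    box_widths, box_centers_x = np.meshgrid(widths, shifts_x)
    box_heights, box_centers_y = np.meshgrid(heights, shifts_y)
    box_centers = np.stack([box_centers_y, box_centers_x], axis=2).reshape([-1, 2])
    box_sizes = np.stack([box_heights, box_widths], axis=2).reshape([-1, 2])

    # Convert (center, size) to corner coordinates.
    return np.concatenate([box_centers - 0.5 * box_sizes,
                           box_centers + 0.5 * box_sizes], axis=1)

boxes = anchors_for_level(32, [0.5, 1, 2], (256, 256), 4)
print(boxes.shape)  # (196608, 4)
print(boxes[1])     # square anchor at the first pixel: -16, -16, 16, 16
```

Anchors centered near the image border extend past it, which is why the first rows contain negative coordinates and the last rows exceed 1024.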