SSD in Caffe: computing the number of boxes from the PriorBox parameters

Latest recommended article published on 2020-11-03 16:39:15

纯粹扯淡

Article link: https://blog.csdn.net/kaien1226/article/details/88894513


References:

https://arleyzhang.github.io/articles/786f1ca3/

https://blog.csdn.net/wfei101/article/details/79477762

https://blog.csdn.net/Dlyldxwl/article/details/80070693

Fig.2 default boxes

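The authors' experiments show that the more default box shapes are used, the better the results.

The default boxes used here are very similar to the anchors in Faster RCNN. In Faster RCNN the anchors are applied only to the last convolutional layer, whereas in SSD the default boxes are applied to the feature maps of several different layers.

So how are the scale (size) and aspect ratio (width-to-height ratio) of the default boxes chosen? Suppose m feature maps are used for prediction. The default box scale for the k-th feature map is then computed with the following formula: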

$$s_k = s_{min} + \frac{s_{max} - s_{min}}{m - 1}(k - 1), \quad k \in [1, m]$$

Here $s_{min}$ is 0.2, meaning the scale of the lowest layer is 0.2, and $s_{max}$ is 0.9, meaning the scale of the highest layer is 0.9.
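The scale formula can be sketched as a small function. This is only an illustration: `DefaultBoxScales` is a hypothetical helper name, not part of Caffe, and the defaults 0.2 and 0.9 are the values quoted above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a Caffe function): returns s_k for k = 1..m,
// linearly spaced between s_min and s_max as in the formula above.
std::vector<double> DefaultBoxScales(int m, double s_min = 0.2,
                                     double s_max = 0.9) {
  assert(m > 1);  // the formula divides by (m - 1)
  std::vector<double> scales;
  scales.reserve(m);
  for (int k = 1; k <= m; ++k) {
    scales.push_back(s_min + (s_max - s_min) / (m - 1) * (k - 1));
  }
  return scales;
}
```

For example, `DefaultBoxScales(6)` gives scales of roughly 0.2, 0.34, 0.48, 0.62, 0.76 and 0.9, since each step adds (0.9 − 0.2) / 5 = 0.14.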

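As for the aspect ratio, denoted $a_r$, it takes the values below. Note that there are 5 aspect ratios in total:

$$a_r \in \{1, 2, 3, 1/2, 1/3\}$$

The width of each default box is therefore:

$$w_k^a = s_k \sqrt{a_r}$$

and the height is (so the product of width and height is always the square of the scale, $s_k^2$):

$$h_k^a = s_k / \sqrt{a_r}$$

In addition, for aspect ratio 1 the authors add one more default box with the scale:

$$s'_k = \sqrt{s_k s_{k+1}}$$

So each feature map cell has 6 kinds of default box in total.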
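As a rough illustration of these formulas, the sketch below lists the six (width, height) pairs for one cell. `DefaultBoxShapes` is a hypothetical helper name, and the order of the boxes is arbitrary here; the Caffe layer emits them in a different order.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Hypothetical helper (not a Caffe function): (width, height) of the six
// default boxes of one cell, given s_k and the next layer's scale s_k1.
std::vector<std::pair<double, double>> DefaultBoxShapes(double s_k,
                                                        double s_k1) {
  assert(s_k > 0.0 && s_k1 > 0.0);
  const double ratios[] = {1.0, 2.0, 3.0, 1.0 / 2.0, 1.0 / 3.0};
  std::vector<std::pair<double, double>> shapes;
  for (double ar : ratios) {
    // w = s_k * sqrt(a_r), h = s_k / sqrt(a_r), so w * h == s_k^2.
    shapes.emplace_back(s_k * std::sqrt(ar), s_k / std::sqrt(ar));
  }
  // Extra square box for aspect ratio 1 with scale sqrt(s_k * s_{k+1}).
  const double s_prime = std::sqrt(s_k * s_k1);
  shapes.emplace_back(s_prime, s_prime);
  return shapes;
}
```

Calling `DefaultBoxShapes(0.2, 0.34)` returns six pairs: the first five all have area 0.04, and the last is the additional square box.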

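As can be seen, the default boxes have different scales on different feature layers and different aspect ratios within the same feature layer, so they can cover essentially all shapes and sizes of objects in the input image.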

Now look at the code:

for (int h = 0; h < layer_height; ++h) {
  for (int w = 0; w < layer_width; ++w) {
    float center_x = (w + offset_) * step_w;
    float center_y = (h + offset_) * step_h;
    float box_width, box_height;
    for (int s = 0; s < min_sizes_.size(); ++s) {
      int min_size_ = min_sizes_[s];
      // first prior: aspect_ratio = 1, size = min_size
      box_width = box_height = min_size_;
      // xmin
      top_data[idx++] = (center_x - box_width / 2.) / img_width;
      // ymin
      top_data[idx++] = (center_y - box_height / 2.) / img_height;
      // xmax
      top_data[idx++] = (center_x + box_width / 2.) / img_width;
      // ymax
      top_data[idx++] = (center_y + box_height / 2.) / img_height;

      if (max_sizes_.size() > 0) {
        CHECK_EQ(min_sizes_.size(), max_sizes_.size());
        int max_size_ = max_sizes_[s];
        // second prior: aspect_ratio = 1, size = sqrt(min_size * max_size)
        box_width = box_height = sqrt(min_size_ * max_size_);
        // xmin
        top_data[idx++] = (center_x - box_width / 2.) / img_width;
        // ymin
        top_data[idx++] = (center_y - box_height / 2.) / img_height;
        // xmax
        top_data[idx++] = (center_x + box_width / 2.) / img_width;
        // ymax
        top_data[idx++] = (center_y + box_height / 2.) / img_height;
      }

      // rest of priors
      for (int r = 0; r < aspect_ratios_.size(); ++r) {
        float ar = aspect_ratios_[r];
        if (fabs(ar - 1.) < 1e-6) {
          continue;
        }
        box_width = min_size_ * sqrt(ar);
        box_height = min_size_ / sqrt(ar);
        // xmin
        top_data[idx++] = (center_x - box_width / 2.) / img_width;
        // ymin
        top_data[idx++] = (center_y - box_height / 2.) / img_height;
        // xmax
        top_data[idx++] = (center_x + box_width / 2.) / img_width;
        // ymax
        top_data[idx++] = (center_y + box_height / 2.) / img_height;
      }
    }
  }
}
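To make the arithmetic in the loop concrete, here is a minimal sketch that computes the four normalized corners of a single box. `NormalizedBox` and `PriorCorners` are hypothetical names introduced only for this example, and the 300×300 input size used in the usage note is an assumption.

```cpp
#include <cassert>

// Hypothetical types and names, for illustration only.
struct NormalizedBox {
  float xmin, ymin, xmax, ymax;
};

// Repeats the per-box arithmetic of the loop above for cell (w, h):
// the center is (index + offset) * step, and the corners are divided
// by the image size to normalize them.
NormalizedBox PriorCorners(int w, int h, float step, float offset,
                           float box_width, float box_height,
                           int img_width, int img_height) {
  assert(img_width > 0 && img_height > 0);
  const float center_x = (w + offset) * step;
  const float center_y = (h + offset) * step;
  return {(center_x - box_width / 2.f) / img_width,
          (center_y - box_height / 2.f) / img_height,
          (center_x + box_width / 2.f) / img_width,
          (center_y + box_height / 2.f) / img_height};
}
```

With step 16, offset 0.5 and a 60×60 box on an assumed 300×300 image, `PriorCorners(0, 0, 16.f, 0.5f, 60.f, 60.f, 300, 300)` has its center at (8, 8), so xmin = (8 − 30) / 300 ≈ −0.073 and xmax = (8 + 30) / 300 ≈ 0.127. The negative value is not clamped by this arithmetic, which is why a box near the border can extend past the image.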

Here are the parameters as set in Caffe:

layer {
  name: "fc7_mbox_priorbox"
  type: "PriorBox"
  bottom: "fc7"
  bottom: "data"
  top: "fc7_mbox_priorbox"
  prior_box_param {
    min_size: 60.0
    max_size: 111.0
    aspect_ratio: 2
    aspect_ratio: 3
    flip: true
    clip: false
    variance: 0.1
    variance: 0.1
    variance: 0.2
    variance: 0.2
    step: 16
    offset: 0.5
  }
}
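The number of prior boxes per cell follows from these parameters together with the loop shown earlier: one square box per `min_size`, one more square box per `max_size`, and one box per non-unit aspect ratio (doubled when `flip` adds the reciprocal). A minimal sketch, where `PriorsPerCell` is a hypothetical helper and the listed aspect ratios are assumed to contain no duplicates:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not a Caffe function): prior boxes per cell.
// Assumes aspect_ratios holds no duplicate values.
int PriorsPerCell(const std::vector<float>& aspect_ratios, bool flip,
                  int num_min_sizes, int num_max_sizes) {
  assert(num_max_sizes == 0 || num_max_sizes == num_min_sizes);
  int non_unit = 0;
  for (float ar : aspect_ratios) {
    if (std::fabs(ar - 1.f) < 1e-6f) continue;  // ratio 1 is counted below
    non_unit += flip ? 2 : 1;                   // flip adds 1 / ar as well
  }
  // Per min_size: the square box plus the non-unit ratios;
  // per max_size: the extra square box.
  return (1 + non_unit) * num_min_sizes + num_max_sizes;
}
```

For the layer above, `PriorsPerCell({2.f, 3.f}, true, 1, 1)` evaluates to 6. A layer that lists only `aspect_ratio: 2` gives `PriorsPerCell({2.f}, true, 1, 1)`, which is 4.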

When prior boxes are generated on a specific feature map, a subset of these 6 kinds is selected. As the table below shows, the final result is (38×38×4 + 19×19×6 + 10×10×6 + 5×5×6 + 3×3×4 + 1×1×4) = 8732 prior boxes.

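| feature map | feature map size | min_size ($s_k$) | max_size ($s_{k+1}$) | aspect_ratio | step | offset | variance |
|---|---|---|---|---|---|---|---|
| conv4_3 | 38×38 | 30 | 60 | 1, 2 | 8 | 0.50 | 0.1, 0.1, 0.2, 0.2 |
| fc7 | 19×19 | 60 | 111 | 1, 2, 3 | 16 | 0.50 | 0.1, 0.1, 0.2, 0.2 |
| conv6_2 | 10×10 | 111 | 162 | 1, 2, 3 | 32 | 0.50 | 0.1, 0.1, 0.2, 0.2 |
| conv7_2 | 5×5 | 162 | 213 | 1, 2, 3 | 64 | 0.50 | 0.1, 0.1, 0.2, 0.2 |
| conv8_2 | 3×3 | 213 | 264 | 1, 2 | 100 | 0.50 | 0.1, 0.1, 0.2, 0.2 |
| conv9_2 | 1×1 | 264 | 315 | 1, 2 | 300 | 0.50 | 0.1, 0.1, 0.2, 0.2 |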
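The total of 8732 can be checked by summing height × width × boxes per cell over the six feature maps in the table. In this sketch `FeatureMap` and `TotalPriors` are hypothetical names, and the per-cell counts (4 or 6) are the ones used in the sum above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical struct for this example: one row of the table.
struct FeatureMap {
  int height;
  int width;
  int priors_per_cell;
};

// Sums height * width * priors_per_cell over all prediction layers.
int TotalPriors(const std::vector<FeatureMap>& maps) {
  int total = 0;
  for (const FeatureMap& fm : maps) {
    assert(fm.height > 0 && fm.width > 0 && fm.priors_per_cell > 0);
    total += fm.height * fm.width * fm.priors_per_cell;
  }
  return total;
}
```

Passing `{{38, 38, 4}, {19, 19, 6}, {10, 10, 6}, {5, 5, 6}, {3, 3, 4}, {1, 1, 4}}` gives 5776 + 2166 + 600 + 150 + 36 + 4 = 8732.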