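First, you need to understand how the output channels of the conf and loc layers relate to the number of priors generated by the priorbox layer in the original SSD. For background, see these two earlier posts of mine: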
https://blog.csdn.net/xunan003/article/details/79249056
https://blog.csdn.net/xunan003/article/details/79186162
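Unlike SSD, RefineDet modifies multibox_loss_layer.cpp, i.e. the MultiBox loss, and splits the loss into an ARM loss and an ODM loss. This raises the question of how the channels correspond. Note that RefineDet takes its anchors from four convolutional layers, conv4_3, conv5_3, fc7 and conv6_2, and each of these four layers has its own mbox_conf, mbox_loc and mbox_priorbox layers. For the ARM loss, the num_output of the mbox_loc layer is computed the same way as in SSD: every default box produced by the mbox_priorbox layer has 4 coordinate values, which map to 4 output channels of mbox_loc, so num_output = number of priors per location × 4. The num_output of the mbox_conf layer is computed differently from SSD. In RefineDet this layer scores whether each prior contains an object at all, so every default box gets two scores (object and non-object). The layer gives a coarse estimate of the box's objectness and corresponds to the ARM loss in the paper, so its num_output = number of priors per location × 2. In the structure below, mbox_priorbox outputs 3 default boxes per feature-map location (one at aspect ratio 1 for min_size, plus aspect ratios 2 and 1/2 because flip is true), so mbox_conf and mbox_loc have a num_output of 6 and 12 respectively.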
layer {
name: "conv6_2_mbox_loc"
type: "Convolution"
bottom: "conv6_2"
top: "conv6_2_mbox_loc"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
convolution_param {
num_output: 12
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "conv6_2_mbox_loc_perm"
type: "Permute"
bottom: "conv6_2_mbox_loc"
top: "conv6_2_mbox_loc_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "conv6_2_mbox_loc_flat"
type: "Flatten"
bottom: "conv6_2_mbox_loc_perm"
top: "conv6_2_mbox_loc_flat"
flatten_param {
axis: 1
}
}
layer {
name: "conv6_2_mbox_conf"
type: "Convolution"
bottom: "conv6_2"
top: "conv6_2_mbox_conf"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
convolution_param {
num_output: 6
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "conv6_2_mbox_conf_perm"
type: "Permute"
bottom: "conv6_2_mbox_conf"
top: "conv6_2_mbox_conf_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "conv6_2_mbox_conf_flat"
type: "Flatten"
bottom: "conv6_2_mbox_conf_perm"
top: "conv6_2_mbox_conf_flat"
flatten_param {
axis: 1
}
}
layer {
name: "conv6_2_mbox_priorbox"
type: "PriorBox"
bottom: "conv6_2"
bottom: "data"
top: "conv6_2_mbox_priorbox"
prior_box_param {
min_size: 256.0
aspect_ratio: 2.0
flip: true
clip: false
variance: 0.10000000149
variance: 0.10000000149
variance: 0.20000000298
variance: 0.20000000298
step: 64.0
offset: 0.5
}
}
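The counting rule above can be checked with a few lines of Python. This is only a sketch: the function names `priors_per_location` and `arm_num_outputs` are made up for illustration, and the counting rule is my reading of how the SSD PriorBox layer enumerates aspect ratios (ratio 1 is always present, and each listed ratio adds its reciprocal when flip is true).

```python
def priors_per_location(min_sizes, max_sizes=(), aspect_ratios=(), flip=True):
    """Count default boxes per feature-map cell (assumed SSD PriorBox rule)."""
    ratios = [1.0]  # aspect ratio 1 is always included
    for ar in aspect_ratios:
        # Skip a ratio that is already in the list
        if any(abs(ar - r) < 1e-6 for r in ratios):
            continue
        ratios.append(ar)
        if flip:
            ratios.append(1.0 / ar)
    # One box per ratio for each min_size, plus one extra box per max_size
    return len(ratios) * len(min_sizes) + len(max_sizes)


def arm_num_outputs(num_priors):
    """ARM heads: 4 coordinates and 2 scores (object / non-object) per prior."""
    return {"mbox_loc": num_priors * 4, "mbox_conf": num_priors * 2}


# Values taken from the conv6_2_mbox_priorbox layer above
num_priors = priors_per_location(min_sizes=[256.0], aspect_ratios=[2.0], flip=True)
arm = arm_num_outputs(num_priors)
print(num_priors, arm)  # 3 {'mbox_loc': 12, 'mbox_conf': 6}
```

If you change aspect_ratio or add a max_size in the PriorBox layer, rerun the calculation and update both num_output values to match, otherwise the loss layer's shape checks are likely to fail.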
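The ODM part mirrors the ARM part one to one: P3, P4, P5 and P6 correspond to conv4_3, conv5_3, fc7 and conv6_2 respectively. The ODM part does not generate any mbox_priorbox of its own, but as the definition of odm_loss below shows, it takes the priors produced by the ARM part (arm_priorbox) as one of its inputs.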
layer {
name: "odm_loss"
type: "MultiBoxLoss"
bottom: "odm_loc"
bottom: "odm_conf"
bottom: "arm_priorbox"
bottom: "label"
bottom: "arm_conf_flatten"
bottom: "arm_loc"
top: "odm_loss"
include {
phase: TRAIN
}
propagate_down: true
propagate_down: true
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
loss_param {
normalization: VALID
}
multibox_loss_param {
loc_loss_type: SMOOTH_L1
conf_loss_type: SOFTMAX
loc_weight: 1.0
num_classes: 5
share_location: true
match_type: PER_PREDICTION
overlap_threshold: 0.5
use_prior_for_matching: true
background_label_id: 0
use_difficult_gt: false
neg_pos_ratio: 3.0
neg_overlap: 0.5
code_type: CENTER_SIZE
ignore_cross_boundary_bbox: false
mining_type: MAX_NEGATIVE
objectness_score: 0.00999999977648
}
}
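To make the role of the extra bottoms more concrete, here is a rough Python sketch of what the ODM loss plausibly does with them. It rests on assumptions: that arm_loc is decoded against arm_priorbox with the usual SSD CENTER_SIZE rule (variances not pre-encoded in the target), and that arm_conf_flatten holds (background, object) probabilities per prior, with priors whose object probability falls below objectness_score being ignored. The function names are hypothetical and this is not the code in multibox_loss_layer.cpp.

```python
import math

VARIANCE = (0.1, 0.1, 0.2, 0.2)  # variance values from the PriorBox layer
OBJECTNESS_SCORE = 0.01          # objectness_score from multibox_loss_param


def refine_prior(prior, arm_loc, variance=VARIANCE):
    """Apply ARM offsets to one prior given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = prior
    pw, ph = xmax - xmin, ymax - ymin
    pcx, pcy = (xmin + xmax) / 2.0, (ymin + ymax) / 2.0
    # CENTER_SIZE decoding: shift the center, scale width and height
    cx = variance[0] * arm_loc[0] * pw + pcx
    cy = variance[1] * arm_loc[1] * ph + pcy
    w = math.exp(variance[2] * arm_loc[2]) * pw
    h = math.exp(variance[3] * arm_loc[3]) * ph
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)


def keep_for_odm(arm_conf, threshold=OBJECTNESS_SCORE):
    """arm_conf is (background_prob, object_prob); drop near-certain background."""
    return arm_conf[1] >= threshold


refined = refine_prior((0.0, 0.0, 0.4, 0.4), (1.0, 0.0, 0.0, 0.0))
print(refined)                        # box shifted right by 0.1 * 0.4 = 0.04
print(keep_for_odm((0.995, 0.005)))   # False: treated as an easy negative
```

Under this reading, the ODM needs no PriorBox layers of its own because the boxes it regresses against are the ARM priors after refinement.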
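So the conf and loc outputs of the ODM part refine and complement those of the ARM part. The mbox_conf and mbox_loc layers attached to P3, P4, P5 and P6 use the prior counts of conv4_3, conv5_3, fc7 and conv6_2, and their num_output follows the original SSD rule: for mbox_conf, num_output = number of priors per location × class_num (the number of classes, i.e. num_classes in the loss layer, which in SSD counts the background class). The output channels of mbox_loc are the same as in the ARM part, i.e. num_output = number of priors per location × 4.
The structure below shows the mbox_loc and mbox_conf layers attached to P6, which correspond to the conv6_2 structure above. mbox_priorbox outputs 3 default boxes, so mbox_conf and mbox_loc have a num_output of 15 (with 5 classes, 5 × 3 = 15) and 12 respectively.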
layer {
name: "P6_mbox_loc"
type: "Convolution"
bottom: "P6"
top: "P6_mbox_loc"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
convolution_param {
num_output: 12
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "P6_mbox_loc_perm"
type: "Permute"
bottom: "P6_mbox_loc"
top: "P6_mbox_loc_perm"
permute_param {
order: 0
order: 2
order: 3
order: 1
}
}
layer {
name: "P6_mbox_loc_flat"
type: "Flatten"
bottom: "P6_mbox_loc_perm"
top: "P6_mbox_loc_flat"
flatten_param {
axis: 1
}
}
layer {
name: "P6_mbox_conf"
type: "Convolution"
bottom: "P6"
top: "P6_mbox_conf"
param {
lr_mult: 1.0
decay_mult: 1.0
}
param {
lr_mult: 2.0
decay_mult: 0.0
}
convolution_param {
num_output: 15
pad: 1
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
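Putting the ARM and ODM rules side by side, a small Python helper can serve as a sanity check before training. It is a sketch only: `head_num_outputs` and `flattened_length` are illustrative names, and the 5 × 5 feature-map size used for P6 is an assumption (it would hold, for example, for a 320 × 320 input with step 64), not something stated in the prototxt.

```python
def head_num_outputs(num_priors, num_classes):
    """num_output of the four prediction heads at one feature level."""
    return {
        "arm_loc": num_priors * 4,             # 4 coordinates per prior
        "arm_conf": num_priors * 2,            # object / non-object
        "odm_loc": num_priors * 4,             # same as the ARM part
        "odm_conf": num_priors * num_classes,  # SSD rule, background included
    }


def flattened_length(height, width, num_output):
    """Per-image length after Permute (to N, H, W, C) and Flatten (axis 1)."""
    return height * width * num_output


heads = head_num_outputs(num_priors=3, num_classes=5)
print(heads)
# {'arm_loc': 12, 'arm_conf': 6, 'odm_loc': 12, 'odm_conf': 15}

# Assumed 5 x 5 feature map at the conv6_2 / P6 level
print(flattened_length(5, 5, heads["odm_conf"]))  # 375
print(flattened_length(5, 5, heads["odm_loc"]))   # 300
```

When retraining on a dataset with a different number of classes, only odm_conf changes. The three other heads depend on the prior count alone.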