Object Detection: RefineDet Training Script Walkthrough


RefineDet Series: Related Links

  1. Paper link
  2. Project (code) link
  3. RefineDet paper translation
  4. RefineDet training code walkthrough
  5. RefineDet network structure walkthrough
  6. Object detection: detailed RefineDet implementation steps
  7. RefineDet paper summary

RefineDet Series: Training Script Walkthrough

The principles behind RefineDet were covered in earlier posts in this series (see the links above). This post walks through the training script itself. The network uses VGG16 as the backbone, and the script's parameters are annotated below.

'''
File under discussion: RefineDet-master\examples\refinedet\VGG16_VOC2007_512.py
'''

# Add extra layers on top of a "base" network (e.g. VGGNet or ResNet).
# AddExtraLayers is one of the key functions in constructing the network: it
# implements the transfer connection blocks (TCBs) of the paper, i.e. an
# FPN-like top-down feature fusion. A sketch of the fusion step follows the
# function.
def AddExtraLayers(net, use_batchnorm=True, arm_source_layers=[], normalizations=[], lr_mult=1):
    use_relu = True

    # Add additional convolutional layers.
    # Add the extra conv layers described in the paper (conv6_1, conv6_2)
    # 512/32: 16 x 16
    from_layer = net.keys()[-1]

    # 512/64: 8 x 8
    out_layer = "conv6_1"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 1, 0, 1, lr_mult=lr_mult)

    from_layer = out_layer
    out_layer = "conv6_2"
    ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 512, 3, 1, 2, lr_mult=lr_mult)
	
    # Build the FPN-style top-down pathway over the specified ARM feature maps
    arm_source_layers.reverse()  # source layers feeding the ARM, traversed top-down
    normalizations.reverse()
    num_p = 6
    for index, layer in enumerate(arm_source_layers):
        out_layer = layer
        # L2-normalize conv4_3 and conv5_3 with initial scales 10 and 8; the
        # paper notes these layers have a different feature magnitude from
        # later layers, so each is normalized with a learnable scale
        if normalizations:
            if normalizations[index] != -1:
                norm_name = "{}_norm".format(layer)
                net[norm_name] = L.Normalize(net[layer], scale_filler=dict(type="constant", value=normalizations[index]),
                    across_spatial=False, channel_shared=False)
                out_layer = norm_name
                arm_source_layers[index] = norm_name
        from_layer = out_layer
        out_layer = "TL{}_{}".format(num_p, 1)
        ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 1, 1, lr_mult=lr_mult)

        if num_p == 6:  # the topmost TCB has no top-down input from a higher level
            from_layer = out_layer
            out_layer = "TL{}_{}".format(num_p, 2)
            ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 1, 1, lr_mult=lr_mult)

            from_layer = out_layer
            out_layer = "P{}".format(num_p)
            ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 1, 1, lr_mult=lr_mult)
        else:  # remaining TCBs: fuse the lateral branch with the upsampled higher level
            from_layer = out_layer
            out_layer = "TL{}_{}".format(num_p, 2)
            ConvBNLayer(net, from_layer, out_layer, use_batchnorm, False, 256, 3, 1, 1, lr_mult=lr_mult)

            from_layer = "P{}".format(num_p+1)
            out_layer = "P{}-up".format(num_p+1)
            DeconvBNLayer(net, from_layer, out_layer, use_batchnorm, False, 256, 2, 0, 2, lr_mult=lr_mult)

            from_layer = ["TL{}_{}".format(num_p, 2), "P{}-up".format(num_p+1)]
            out_layer = "Elt{}".format(num_p)
            EltwiseLayer(net, from_layer, out_layer)
            relu_name = '{}_relu'.format(out_layer)
            net[relu_name] = L.ReLU(net[out_layer], in_place=True)
            out_layer = relu_name

            from_layer = out_layer
            out_layer = "P{}".format(num_p)
            ConvBNLayer(net, from_layer, out_layer, use_batchnorm, use_relu, 256, 3, 1, 1, lr_mult=lr_mult)

        num_p = num_p - 1

    return net
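
# --- Illustration (not part of the original script) -------------------------
# A minimal NumPy sketch of the element-wise fusion inside one non-top TCB.
# The helper name and shapes are hypothetical; the real layers are learned
# Caffe Convolution/Deconvolution layers created by ConvBNLayer/DeconvBNLayer.
import numpy as np

def _sketch_tcb_fuse(lateral, top_down):
    # lateral:  output of TL{n}_2, a conv on the ARM source layer, (256, H, W)
    # top_down: output of the "P{n+1}-up" 2x deconv of the higher level, (256, H, W)
    fused = lateral + top_down     # EltwiseLayer (SUM)
    return np.maximum(fused, 0.0)  # Elt{n}_relu; a final 3x3 conv then yields P{n}

# e.g. fusing the 16x16 fc7 branch with the upsampled 8x8 conv6_2 branch:
# _sketch_tcb_fuse(np.zeros((256, 16, 16)), np.zeros((256, 16, 16))).shape == (256, 16, 16)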


### Modify the following parameters accordingly ###
# The directory which contains the caffe code.
# We assume you are running the script at the CAFFE_ROOT.
caffe_root = os.getcwd()

# Set true if you want to start training right after generating all files.
run_soon = True
# Set true if you want to load from most recently saved snapshot.
# Otherwise, we will load from the pretrain_model defined below.
resume_training = True  # resume training from the last saved snapshot if available
# If true, Remove old model files.
remove_old_models = False  # whether to delete old model files

# The database file for training data. Created by data/VOC0712/create_data.sh
train_data = "examples/VOC0712/VOC0712_trainval_lmdb"
# The database file for testing data. Created by data/VOC0712/create_data.sh
test_data = "examples/VOC0712/VOC0712_test_lmdb"
# Specify the batch sampler.
resize_width = 512  # width images are resized to during training
resize_height = 512 # height images are resized to during training
resize = "{}x{}".format(resize_width, resize_height)
# the batch_sampler list drives patch sampling during data loading
# (a rough sketch of the sampling logic follows the list)
batch_sampler = [
        {
                'sampler': {
                        },
                'max_trials': 1,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.1,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.3,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.5,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.7,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'min_jaccard_overlap': 0.9,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        {
                'sampler': {
                        'min_scale': 0.3,
                        'max_scale': 1.0,
                        'min_aspect_ratio': 0.5,
                        'max_aspect_ratio': 2.0,
                        },
                'sample_constraint': {
                        'max_jaccard_overlap': 1.0,
                        },
                'max_trials': 50,
                'max_sample': 1,
        },
        ]
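
# --- Illustration (not part of the original script) -------------------------
# A rough sketch of how one batch_sampler entry is interpreted; the real logic
# lives in Caffe-SSD's sampler code, and the helper names here are
# hypothetical. Up to max_trials random patches are drawn from the
# scale/aspect-ratio ranges, and a patch is accepted once its Jaccard overlap
# with some ground-truth box satisfies the constraint.
import random

def _sketch_jaccard(a, b):
    # IoU of two boxes in normalized (xmin, ymin, xmax, ymax) coordinates.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def _sketch_sample_patch(gt_boxes, min_jaccard, max_trials=50):
    for _ in range(max_trials):
        scale = random.uniform(0.3, 1.0)
        ar = random.uniform(0.5, 2.0)
        w, h = scale * ar ** 0.5, scale / ar ** 0.5
        if w > 1.0 or h > 1.0:
            continue
        x, y = random.uniform(0, 1 - w), random.uniform(0, 1 - h)
        patch = (x, y, x + w, y + h)
        if any(_sketch_jaccard(patch, gt) >= min_jaccard for gt in gt_boxes):
            return patch
    return None  # no valid patch found; the whole image is used instead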
# train_transform_param defines the preprocessing / data augmentation for
# training (a sketch of expand_param follows this block)
train_transform_param = {
        'mirror': True,  # random horizontal flip
        'mean_value': [104, 117, 123],  # per-channel (BGR) image mean
        'resize_param': {  # resize settings
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [
                        P.Resize.LINEAR,
                        P.Resize.AREA,
                        P.Resize.NEAREST,
                        P.Resize.CUBIC,
                        P.Resize.LANCZOS4,
                        ],
                },
        'distort_param': {  # photometric distortions: brightness, contrast, hue, saturation
                'brightness_prob': 0.5,
                'brightness_delta': 32,
                'contrast_prob': 0.5,
                'contrast_lower': 0.5,
                'contrast_upper': 1.5,
                'hue_prob': 0.5,
                'hue_delta': 18,
                'saturation_prob': 0.5,
                'saturation_lower': 0.5,
                'saturation_upper': 1.5,
                'random_order_prob': 0.0,
                },
        'expand_param': {
                'prob': 0.5,
                'max_expand_ratio': 4.0,
                },
        'emit_constraint': {
            'emit_type': caffe_pb2.EmitConstraint.CENTER,
            }
        }
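
# --- Illustration (not part of the original script) -------------------------
# A rough sketch of expand_param (helper name hypothetical): with probability
# 0.5 the image is pasted at a random offset inside a mean-value-filled canvas
# of up to 4x its size, a zoom-out augmentation that synthesizes small objects.
import random

def _sketch_expand_offsets(h, w, max_expand_ratio=4.0, prob=0.5):
    if random.random() >= prob:
        return 0, 0, h, w  # no expansion
    ratio = random.uniform(1.0, max_expand_ratio)
    H, W = int(h * ratio), int(w * ratio)
    top, left = random.randint(0, H - h), random.randint(0, W - w)
    return top, left, H, W  # paste the original image at (top, left)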
# preprocessing parameters for the test net
test_transform_param = {
        'mean_value': [104, 117, 123],
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [P.Resize.LINEAR],
                },
        }

# If true, use batch norm for all newly added layers.
# Currently only the non batch norm version has been tested.
# whether the newly added layers use batch norm; defaults to False
use_batchnorm = False
lr_mult = 1
# Use different initial learning rate.
# initial learning rate
if use_batchnorm:
    base_lr = 0.0004
else:
    # A learning rate for batch_size = 1, num_gpus = 1.
    base_lr = 0.00004

# Modify the job name if you want.
job_name = "refinedet_vgg16_{}".format(resize)
# The name of the model. Modify it if you want.
model_name = "VOC0712_{}".format(job_name)

# Directory which stores the model .prototxt file.
# where the network .prototxt files are stored
save_dir = "models/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the snapshot of models.
# where training snapshots are stored
snapshot_dir = "models/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the job script and log file.
# job directory: stores the job script, logs, and copies of the network files
job_dir = "jobs/VGGNet/VOC0712/{}".format(job_name)
# Directory which stores the detection results.
output_result_dir = "{}/data/RefineDet/pascal/VOCdevkit/results/VOC2007/{}/Main".format(os.environ['HOME'], job_name)

# model definition files.
train_net_file = "{}/train.prototxt".format(save_dir)
test_net_file = "{}/test.prototxt".format(save_dir)
deploy_net_file = "{}/deploy.prototxt".format(save_dir)
solver_file = "{}/solver.prototxt".format(save_dir)
# snapshot prefix.
# snapshot filename prefix
snapshot_prefix = "{}/{}".format(snapshot_dir, model_name)
# job script path.
job_file = "{}/{}.sh".format(job_dir, model_name)

# Stores the test image names and sizes. Created by data/VOC0712/create_list.sh
# list of test image names and sizes
name_size_file = "data/VOC0712/test_name_size.txt"
# The pretrained model. We use the Fully convolutional reduced (atrous) VGGNet.
# path to the pretrained VGG weights
pretrain_model = "models/VGGNet/VGG_ILSVRC_16_layers_fc_reduced.caffemodel"
# Stores LabelMapItem.
# class label map file
label_map_file = "data/VOC0712/labelmap_voc.prototxt"

# MultiBoxLoss parameters.
num_classes = 21  # number of classes: the 20 VOC classes + background
share_location = True
background_label_id = 0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
ignore_cross_boundary_bbox = False
mining_type = P.MultiBoxLoss.MAX_NEGATIVE  # hard negative mining method: MAX_NEGATIVE (ratio-based) or HARD_EXAMPLE (OHEM)
neg_pos_ratio = 3.  # number of mined negatives = neg_pos_ratio x the number of positives
loc_weight = (neg_pos_ratio + 1.) / 4.  # weight of the localization loss
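# With neg_pos_ratio = 3 this evaluates to loc_weight = (3. + 1.) / 4. = 1.0.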
# 'loc_loss_type' is the box-regression loss; 'conf_loss_type' is the classification loss
multibox_loss_param = {
    'loc_loss_type': P.MultiBoxLoss.SMOOTH_L1,
    'conf_loss_type': P.MultiBoxLoss.SOFTMAX,
    'loc_weight': loc_weight,
    'num_classes': num_classes,
    'share_location': share_location,
    'match_type': P.MultiBoxLoss.PER_PREDICTION,
    'overlap_threshold': 0.5,
    'use_prior_for_matching': True,
    'background_label_id': background_label_id,
    'use_difficult_gt': train_on_diff_gt,
    'mining_type': mining_type,
    'neg_pos_ratio': neg_pos_ratio,
    'neg_overlap': 0.5,
    'code_type': code_type,
    'ignore_cross_boundary_bbox': ignore_cross_boundary_bbox,
    'objectness_score': 0.01,
    }
loss_param = {
    'normalization': normalization_mode,
    }
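
# --- Illustration (not part of the original script) -------------------------
# A rough sketch of MAX_NEGATIVE mining with neg_pos_ratio = 3 (helper name
# hypothetical; the real logic is in multibox_loss_layer.cpp): unmatched
# anchors are ranked by confidence loss, and only the hardest
# neg_pos_ratio-times-the-positives are kept as negatives.
def _sketch_select_hard_negatives(neg_losses, num_pos, neg_pos_ratio=3):
    # neg_losses: per-anchor confidence losses of the unmatched (negative) anchors
    num_neg = min(len(neg_losses), int(neg_pos_ratio * num_pos))
    order = sorted(range(len(neg_losses)), key=lambda i: -neg_losses[i])
    return order[:num_neg]  # indices of negatives that contribute to the loss

# e.g. _sketch_select_hard_negatives([0.1, 2.3, 0.7, 1.5, 0.2], num_pos=1) -> [1, 3, 2]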

# parameters for generating priors.
# minimum dimension of input image
# min_dim = 512
# conv4_3 ==> 64 x 64
# conv5_3 ==> 32 x 32
# fc7 ==> 16 x 16
# conv6_2 ==> 8 x 8
# feature layers that source the ARM (and feed the TCBs)
arm_source_layers = ['conv4_3', 'conv5_3', 'fc7', 'conv6_2']
# fused pyramid layers that source the ODM
odm_source_layers = ['P3', 'P4', 'P5', 'P6']
# minimum anchor size per layer: 4x each layer's stride
min_sizes = [32, 64, 128, 256]
max_sizes = [[], [], [], []]
# stride of each source layer relative to the input image
steps = [8, 16, 32, 64]
# anchor aspect ratios; with flip=True the PriorBox layer generates the three ratios [0.5, 1.0, 2.0]
aspect_ratios = [[2], [2], [2], [2]]
# L2 normalize conv4_3 and conv5_3.
# initial normalization scales (-1 means no normalization)
normalizations = [10, 8, -1, -1]
# variance used to encode/decode prior bboxes.
if code_type == P.PriorBox.CENTER_SIZE:
  prior_variance = [0.1, 0.1, 0.2, 0.2]
else:
  prior_variance = [0.1]
flip = True  # also generate the reciprocal of each aspect ratio
clip = False # whether to clip prior boxes to the image boundary
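
# --- Illustration (not part of the original script) -------------------------
# A sketch of the anchor shapes one feature-map cell produces under the SSD
# PriorBox convention that RefineDet reuses (helper name hypothetical). With
# max_sizes empty and aspect_ratios [2] plus flip, each cell gets 3 anchors.
import math

def _sketch_prior_sizes(min_size, aspect_ratios, do_flip=True):
    sizes = [(min_size, min_size)]                 # the aspect-ratio-1 box
    ars = list(aspect_ratios)
    if do_flip:
        ars += [1.0 / ar for ar in aspect_ratios]  # add the reciprocal ratios
    for ar in ars:
        sizes.append((min_size * math.sqrt(ar), min_size / math.sqrt(ar)))
    return sizes  # (width, height) pairs in input-image pixels

# e.g. _sketch_prior_sizes(32, [2]) -> [(32, 32), (45.25, 22.63), (22.63, 45.25)]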

# Solver parameters.
# Defining which GPUs to use.
# GPU devices to use
gpus = "0,1,2,3"
gpulist = gpus.split(",")
num_gpus = len(gpulist)

# Divide the mini-batch to different GPUs.
# the per-GPU batch size is derived from batch_size below
batch_size = 32
accum_batch_size = 32
iter_size = accum_batch_size / batch_size
solver_mode = P.Solver.CPU
device_id = 0
batch_size_per_device = batch_size
if num_gpus > 0:
  batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
  iter_size = int(math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus)))
  solver_mode = P.Solver.GPU
  device_id = int(gpulist[0])
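
# Worked example with the defaults above: batch_size = 32 on 4 GPUs gives
# batch_size_per_device = ceil(32 / 4) = 8 images per GPU and
# iter_size = ceil(32 / (8 * 4)) = 1, i.e. no gradient accumulation is needed.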

# Scale the base learning rate by the loss normalization mode; the default
# P.Loss.VALID branch yields base_lr = 0.001 (see the worked value below)
if normalization_mode == P.Loss.NONE:
  base_lr /= batch_size_per_device
elif normalization_mode == P.Loss.VALID:
  base_lr *= 25. / loc_weight
elif normalization_mode == P.Loss.FULL:
  # Roughly there are 2000 prior bboxes per image.
  # TODO(weiliu89): Estimate the exact # of priors.
  base_lr *= 2000.
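
# With this script's defaults (P.Loss.VALID, loc_weight = 1.0) the VALID branch
# gives base_lr = 0.00004 * 25. / 1.0 = 0.001, the initial rate noted above.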

# Evaluate on whole test set.
# test-set parameters; the solver's test phase is commented out below, so the
# training job builds no test net
num_test_image = 4952
test_batch_size = 1
test_iter = num_test_image / test_batch_size

# solver configuration
solver_param = {
    # Train parameters
    'base_lr': base_lr,
    'weight_decay': 0.0005,
    'lr_policy': "multistep",
    'stepvalue': [80000, 100000, 120000],
    'gamma': 0.1,
    'momentum': 0.9,
    'iter_size': iter_size,
    'max_iter': 120000,
    'snapshot': 5000,
    'display': 10,
    'average_loss': 10,
    'type': "SGD",
    'solver_mode': solver_mode,
    'device_id': device_id,
    'debug_info': False,
    'snapshot_after_train': True,
    # Test parameters
    # 'test_iter': [test_iter],
    # 'test_interval': 5000,
    # 'eval_type': "detection",
    # 'ap_version': "11point",
    # 'test_initialization': False,
    }

# parameters for generating detection output.
det_out_param = {
    'num_classes': num_classes,  # number of classes to detect
    'share_location': share_location,
    'background_label_id': background_label_id,  # background class id
    'nms_param': {'nms_threshold': 0.45, 'top_k': 1000},  # NMS IoU threshold and max boxes kept by NMS
    'keep_top_k': 500,  # maximum detections kept per image in the final output
    'confidence_threshold': 0.01, # minimum class confidence considered for output
    'code_type': code_type,
    'objectness_score': 0.01,  # ARM objectness threshold; anchors scored below it are discarded
    }
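
# --- Illustration (not part of the original script) -------------------------
# A minimal sketch of CENTER_SIZE box decoding with prior_variance
# [0.1, 0.1, 0.2, 0.2] (helper name hypothetical). At inference RefineDet
# applies it twice: arm_loc first refines the prior box, then odm_loc refines
# the ARM output.
import math

def _sketch_decode(prior, offsets, variance=(0.1, 0.1, 0.2, 0.2)):
    # prior and the return value are (cx, cy, w, h); offsets are (tx, ty, tw, th)
    cx = prior[0] + offsets[0] * variance[0] * prior[2]
    cy = prior[1] + offsets[1] * variance[1] * prior[3]
    w = prior[2] * math.exp(offsets[2] * variance[2])
    h = prior[3] * math.exp(offsets[3] * variance[3])
    return cx, cy, w, h

# refined = _sketch_decode(prior_box, arm_loc_offsets)  # ARM refinement step
# final   = _sketch_decode(refined, odm_loc_offsets)    # ODM refinement step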

# parameters for evaluating detection results.
det_eval_param = {
    'num_classes': num_classes,
    'background_label_id': background_label_id,
    'overlap_threshold': 0.5,
    'evaluate_difficult_gt': False,
    'name_size_file': name_size_file,
    }

### Hopefully you don't need to change the following ###
# Check files and create the output directories before launching.
check_if_exist(train_data)
check_if_exist(test_data)
check_if_exist(label_map_file)
check_if_exist(pretrain_model)
make_if_not_exist(save_dir)
make_if_not_exist(job_dir)
make_if_not_exist(snapshot_dir)

# Create train net.
# initialize an empty network via caffe.NetSpec()
net = caffe.NetSpec()
# CreateAnnotatedDataLayer builds the data-reading layer; it lives in
# ~RefineDet/python/caffe/model_libs.py and is identical to its SSD counterpart.
net.data, net.label = CreateAnnotatedDataLayer(train_data, batch_size=batch_size_per_device,
        train=True, output_label=True, label_map_file=label_map_file,
        transform_param=train_transform_param, batch_sampler=batch_sampler)

# build the VGG backbone
VGGNetBody(net, from_layer='data', fully_conv=True, reduced=True, dilated=False, dropout=False)

# AddExtraLayers appends two extra conv layers (conv6_1, conv6_2) to the VGG
# body and then applies the transfer connection block of the paper's Figure 1
# to four layers. It takes the outputs of conv4_3, conv5_3, fc7 and conv6_2;
# for the 512x512 input of this script their feature maps are 64x64, 32x32,
# 16x16 and 8x8 (for a 320x320 input they would be 40x40, 20x20, 10x10, 5x5).
# These four layers are the four gray blocks of the Anchor Refinement Module
# in Figure 1; the Transfer Connection Blocks hanging off them produce P6, P5,
# P4, P3, the four blue blocks of the Object Detection Module.
AddExtraLayers(net, use_batchnorm, arm_source_layers, normalizations, lr_mult=lr_mult)
# AddExtraLayers reversed arm_source_layers in place, so restore the original order here.
arm_source_layers.reverse()
normalizations.reverse()

# CreateRefineDetHead generates the classification and regression layers and
# is another key function; the mbox_layers it returns are the complete network
# outputs. It lives in ~RefineDet/python/caffe/model_libs.py and is a modified
# version of SSD's CreateMultiBoxHead.
# It has two important inputs: from_layers=arm_source_layers and
# from_layers2=odm_source_layers. The former is the set of four gray blocks in
# Figure 1 (ARM = Anchor Refinement Module); the latter is the set of four
# blue blocks (ODM = Object Detection Module), initialized to
# ['P3', 'P4', 'P5', 'P6']. This two-headed design is a major difference from
# SSD. The returned mbox_layers contain:
# mbox_layers[0]: "arm_loc", the ARM box-regression output;
# mbox_layers[1]: "arm_conf", the ARM classification output (binary object/background);
# mbox_layers[2]: "arm_priorbox", the prior box (anchor) information;
# mbox_layers[3]: "odm_loc", the ODM box-regression output;
# mbox_layers[4]: "odm_conf", the ODM classification output (object classes + background).
mbox_layers = CreateRefineDetHead(net, data_layer='data', from_layers=arm_source_layers,
        use_batchnorm=use_batchnorm, min_sizes=min_sizes, max_sizes=max_sizes,
        aspect_ratios=aspect_ratios, steps=steps, normalizations=[],
        num_classes=num_classes, share_location=share_location, flip=flip, clip=clip,
        prior_variance=prior_variance, kernel_size=3, pad=1, lr_mult=lr_mult, from_layers2=odm_source_layers)

# With the network defined, the losses come next.
# First "arm_loss", computed through the L.MultiBoxLoss interface. The
# mbox_layers_arm list holds the box-regression output ("arm_loc"), the
# classification output ("arm_conf"), the prior box (anchor) information
# ("arm_priorbox") and the ground truth (net.label). multibox_loss_param_arm
# reuses the shared configuration except that num_classes is set to 2.
# Gradients are propagated only through the first two inputs. This loss is
# very similar to an RPN's: the classification term comes from mbox_layers[1]
# ("arm_conf") and net.label, the regression term from mbox_layers[0]
# ("arm_loc") and mbox_layers[2] ("arm_priorbox").
# MultiBoxLoss is a custom layer originating in SSD and slightly modified here;
# see https://github.com/sfzhang15/RefineDet/blob/master/src/caffe/layers/multibox_loss_layer.cpp.
# For the ARM the call looks exactly like the SSD version (four inputs); only
# the number of classes differs.
name = "arm_loss"
mbox_layers_arm = []
mbox_layers_arm.append(mbox_layers[0]) # "arm_loc": the ARM box-regression output
mbox_layers_arm.append(mbox_layers[1]) # "arm_conf": the ARM binary object/background classification
mbox_layers_arm.append(mbox_layers[2]) # "arm_priorbox": the prior box (anchor) information
mbox_layers_arm.append(net.label)
multibox_loss_param_arm = multibox_loss_param.copy()
multibox_loss_param_arm['num_classes'] = 2
net[name] = L.MultiBoxLoss(*mbox_layers_arm, multibox_loss_param=multibox_loss_param_arm,
        loss_param=loss_param, include=dict(phase=caffe_pb2.Phase.Value('TRAIN')),
        propagate_down=[True, True, False, False])

# Feed net["arm_conf"] through a softmax to obtain the per-anchor probability
# output net[flatten_name], i.e. net["arm_conf_flatten"]. Since "arm_conf" is
# the ARM's binary output, this step is almost identical to how Faster R-CNN
# scores its proposals. A NumPy sketch of the shape handling follows the block.
# Reshape -> Softmax -> Flatten over the two ARM logits per anchor.
conf_name = "arm_conf"
reshape_name = "{}_reshape".format(conf_name)
net[reshape_name] = L.Reshape(net[conf_name], shape=dict(dim=[0, -1, 2]))
softmax_name = "{}_softmax".format(conf_name)
net[softmax_name] = L.Softmax(net[reshape_name], axis=2)
flatten_name = "{}_flatten".format(conf_name)
net[flatten_name] = L.Flatten(net[softmax_name], axis=1)
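
# --- Illustration (not part of the original script) -------------------------
# A NumPy sketch (hypothetical shapes and helper name) of the Reshape ->
# Softmax -> Flatten chain above: arm_conf holds two logits per anchor, which
# become (background, object) probabilities and are flattened back per image.
import numpy as np

def _sketch_arm_conf_to_probs(arm_conf, num_priors):
    # arm_conf: (N, num_priors * 2) logits -> (N, num_priors * 2) probabilities
    x = arm_conf.reshape(arm_conf.shape[0], num_priors, 2)  # Reshape, dim=[0, -1, 2]
    x = np.exp(x - x.max(axis=2, keepdims=True))
    x = x / x.sum(axis=2, keepdims=True)                    # Softmax over axis 2
    return x.reshape(arm_conf.shape[0], -1)                 # Flatten from axis 1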

# Define "odm_loss", again through the L.MultiBoxLoss interface. The
# mbox_layers_odm list holds the box-regression output ("odm_loc"), the
# classification output ("odm_conf"), the prior box (anchor) information
# ("arm_priorbox"), the ground truth (net.label), the ARM classification
# probabilities (net["arm_conf_flatten"]) and the ARM box-regression output
# (net["arm_loc"]). The last four inputs only serve to filter anchors and
# balance positive/negative samples, so gradients are propagated only through
# the first two. The loss computation resembles SSD's: the classification term
# comes from mbox_layers[4] ("odm_conf") and net.label, the regression term
# from mbox_layers[3] ("odm_loc") and mbox_layers[2] ("arm_priorbox"). Note
# that the input list has grown to six variables; the two extra inputs are
# RefineDet's modification to MultiBoxLossLayer and one of the paper's key
# ideas. net["arm_conf_flatten"] takes part in hard negative mining, ranking
# and sampling the negatives: per the paper, an anchor whose ARM background
# confidence exceeds 0.99 is excluded from ODM training (a sketch of this
# filtering follows the loss definition below). net["arm_loc"] supplies
# refined initial box coordinates, which helps the detector produce more
# accurate results.
name = "odm_loss"
mbox_layers_odm = []
mbox_layers_odm.append(mbox_layers[3]) # "odm_loc": the ODM box-regression output
mbox_layers_odm.append(mbox_layers[4]) # "odm_conf": the ODM classification output (object classes + background)
mbox_layers_odm.append(mbox_layers[2]) # "arm_priorbox": the prior box (anchor) information
mbox_layers_odm.append(net.label)
mbox_layers_odm.append(net[flatten_name])
mbox_layers_odm.append(mbox_layers[0]) # "arm_loc": the ARM box-regression output
net[name] = L.MultiBoxLoss(*mbox_layers_odm, multibox_loss_param=multibox_loss_param,
        loss_param=loss_param, include=dict(phase=caffe_pb2.Phase.Value('TRAIN')),
        propagate_down=[True, True, False, False, False, False])
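
# --- Illustration (not part of the original script) -------------------------
# A rough sketch (helper name hypothetical) of the anchor filtering that
# net["arm_conf_flatten"] enables inside the modified MultiBoxLoss: an anchor
# whose ARM background probability exceeds 0.99 is treated as an easy negative
# and excluded from ODM training.
def _sketch_odm_training_anchors(arm_bg_probs, threshold=0.99):
    # arm_bg_probs: per-anchor background probability from the ARM softmax
    return [i for i, p in enumerate(arm_bg_probs) if p <= threshold]

# e.g. _sketch_odm_training_anchors([0.999, 0.42, 0.995, 0.10]) -> [1, 3]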

# The rest of the script (building the test and deploy nets, writing out the
# prototxt and solver files, and generating the job script) follows the same
# pattern and is not covered here.
......

