Pedestrian Detection 0-08: LFFD, a No-Blind-Spot Source Code Walkthrough (3): Network Architecture Explained

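The link below collects all of my notes on LFFD (pedestrian detection). If you find any mistakes, please point them out and I will correct them right away. If you are interested, add me on WeChat (17575010159) to discuss the technical details. If this helped you, please give it a like; that is the biggest encouragement for me.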

Pedestrian Detection 0-00: LFFD, the latest no-blind-spot detailed walkthrough: https://blog.csdn.net/weixin_43013761/article/details/102592374

Approach

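The original paper contains the following figure:
[Figure: LFFD network architecture from the original paper, with the four branches used for pedestrian detection marked by a red box]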
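Note that LFFD was first published as a face detection paper, and the paper clearly states that the network has 8 branches. The pedestrian detection code, however, implements only the four branches inside the red box. Why? Possibly because computing all the branches did not improve accuracy much, so in the trade-off between speed and accuracy the author chose speed. With that in mind, let's look at the code: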

Code walkthrough

As covered earlier, training starts from pedestrian_detection/config_farm/configuration_30_320_20L_4scales_v1.py:

def run():
    # Import the network definition
    from symbol_farm import symbol_30_320_20L_4scales_v1 as net

    # Build the network symbol
    net_symbol, data_names, label_names = net.get_net_symbol()

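From this we know that the network is built in pedestrian_detection/symbol_farm/symbol_30_320_20L_4scales_v1.py. The symbol_farm folder also contains symbol_structures.xlsx, which is a detailed diagram of the network structure; I won't reproduce it here. Let's first look at the get_net_symbol(deploy_flag=False) function: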

# The author's source uses the masked loss by default, i.e. deploy_flag=False. Note that this refers to training.
def get_net_symbol(deploy_flag=False):
    data_names = ['data']

    label_names = ['mask_1', 'label_1',
                   'mask_2', 'label_2',
                   'mask_3', 'label_3',
                   'mask_4', 'label_4',]

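    # Batch data. To run, the network needs at least the inputs below: data holds the image pixels, and the rest are the labels and masks each branch needs to compute its loss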
    data = mxnet.symbol.Variable(name='data', shape=(cfg.param_train_batch_size, cfg.param_num_image_channel, cfg.param_net_input_height, cfg.param_net_input_width))
    label_1 = mxnet.symbol.Variable(name='label_1', shape=(cfg.param_train_batch_size, cfg.param_num_output_channels, cfg.param_feature_map_size_list[0], cfg.param_feature_map_size_list[0]))
    mask_1 = mxnet.symbol.Variable(name='mask_1', shape=(cfg.param_train_batch_size, cfg.param_num_output_channels, cfg.param_feature_map_size_list[0], cfg.param_feature_map_size_list[0]))
    label_2 = mxnet.symbol.Variable(name='label_2', shape=(cfg.param_train_batch_size, cfg.param_num_output_channels, cfg.param_feature_map_size_list[1], cfg.param_feature_map_size_list[1]))
    mask_2 = mxnet.symbol.Variable(name='mask_2', shape=(cfg.param_train_batch_size, cfg.param_num_output_channels, cfg.param_feature_map_size_list[1], cfg.param_feature_map_size_list[1]))
    label_3 = mxnet.symbol.Variable(name='label_3', shape=(cfg.param_train_batch_size, cfg.param_num_output_channels, cfg.param_feature_map_size_list[2], cfg.param_feature_map_size_list[2]))
    mask_3 = mxnet.symbol.Variable(name='mask_3', shape=(cfg.param_train_batch_size, cfg.param_num_output_channels, cfg.param_feature_map_size_list[2], cfg.param_feature_map_size_list[2]))
    label_4 = mxnet.symbol.Variable(name='label_4', shape=(cfg.param_train_batch_size, cfg.param_num_output_channels, cfg.param_feature_map_size_list[3], cfg.param_feature_map_size_list[3]))
    mask_4 = mxnet.symbol.Variable(name='mask_4', shape=(cfg.param_train_batch_size, cfg.param_num_output_channels, cfg.param_feature_map_size_list[3], cfg.param_feature_map_size_list[3]))

    # Normalize pixel values to the range [-1, 1]
    data = (data - 127.5) / 127.5

    # conv block 1 ---------------------------------------------------------------------------------------
    conv1 = mxnet.symbol.Convolution(data=data,kernel=(3, 3),stride=(2, 2),pad=(0, 0),num_filter=num_filters_list[1],name='conv1')
    relu1 = mxnet.symbol.Activation(data=conv1, act_type='relu', name='relu_conv1')

    # conv block 2 ----------------------------------------------------------------------------------------
    conv2 = mxnet.symbol.Convolution(data=relu1,kernel=(3, 3),stride=(2, 2),pad=(0, 0),num_filter=num_filters_list[1],name='conv2')
    relu2 = mxnet.symbol.Activation(data=conv2, act_type='relu', name='relu_conv2')

    # conv block 3 ----------------------------------------------------------------------------------------
    conv3 = mxnet.symbol.Convolution(data=relu2,kernel=(3, 3),stride=(2, 2),pad=(0, 0),num_filter=num_filters_list[1],name='conv3')
    relu3 = mxnet.symbol.Activation(data=conv3, act_type='relu', name='relu_conv3')

    # conv block 4 ----------------------------------------------------------------------------------------
    conv4 = mxnet.symbol.Convolution(data=relu3, kernel=(3, 3),stride=(1, 1),pad=(1, 1), num_filter=num_filters_list[1], name='conv4')
    relu4 = mxnet.symbol.Activation(data=conv4, act_type='relu', name='relu_conv4')

    # conv block 5 ----------------------------------------------------------------------------------------
    conv5 = mxnet.symbol.Convolution(data=relu4,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[1],name='conv5')
    conv5 = conv3 + conv5
    relu5 = mxnet.symbol.Activation(data=conv5, act_type='relu', name='relu_conv5')

    # conv block 6 ----------------------------------------------------------------------------------------
    conv6 = mxnet.symbol.Convolution(data=relu5,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[1],name='conv6')
    relu6 = mxnet.symbol.Activation(data=conv6, act_type='relu', name='relu_conv6')

    # conv block 7 ----------------------------------------------------------------------------------------
    conv7 = mxnet.symbol.Convolution(data=relu6,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[1],name='conv7')
    conv7 = conv5 + conv7
    relu7 = mxnet.symbol.Activation(data=conv7, act_type='relu', name='relu_conv7')

    # conv block 8 ----------------------------------------------------------------------------------------
    conv8 = mxnet.symbol.Convolution(data=relu7,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[1],name='conv8')
    relu8 = mxnet.symbol.Activation(data=conv8, act_type='relu', name='relu_conv8')

    # conv block 9 ----------------------------------------------------------------------------------------
    conv9 = mxnet.symbol.Convolution(data=relu8,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[1],name='conv9')
    conv9 = conv9 + conv7
    relu9 = mxnet.symbol.Activation(data=conv9, act_type='relu', name='relu_conv9')

    # conv block 10 ----------------------------------------------------------------------------------------
    conv10 = mxnet.symbol.Convolution(data=relu9,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[1],name='conv10')
    relu10 = mxnet.symbol.Activation(data=conv10, act_type='relu', name='relu_conv10')

    # conv block 11 ----------------------------------------------------------------------------------------
    conv11 = mxnet.symbol.Convolution(data=relu10,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[1],name='conv11')
    conv11 = conv11 + conv9
    relu11 = mxnet.symbol.Activation(data=conv11, act_type='relu', name='relu_conv11')

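    # loss 1 RF:143. RF is the receptive field in input pixels; the feature-map side length is cfg.param_feature_map_size_list[0]----------------------------------------------------------------------------------------------------
    # for scale [30,60]: this branch handles targets whose size falls in 30~60 pixels----------------------------------------------------------------------------------------
    # deploy_flag=False by default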
    if deploy_flag:
        predict_score_1, predict_bbox_1 = loss_branch(relu11, 'conv11', deploy_flag=deploy_flag)
    # This branch runs during training
    else:
        loss_score_1, loss_bbox_1 = loss_branch(relu11, 'conv11', mask=mask_1, label=label_1)


    # conv block 12 ----------------------------------------------------------------------------------------
    conv12 = mxnet.symbol.Convolution(data=relu11,kernel=(3, 3),stride=(2, 2),pad=(0, 0),num_filter=num_filters_list[2],name='conv12')
    relu12 = mxnet.symbol.Activation(data=conv12, act_type='relu', name='relu_conv12')

    # conv block 13 ----------------------------------------------------------------------------------------
    conv13 = mxnet.symbol.Convolution(data=relu12,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[2],name='conv13')
    relu13 = mxnet.symbol.Activation(data=conv13, act_type='relu', name='relu_conv13')

    # conv block 14 ----------------------------------------------------------------------------------------
    conv14 = mxnet.symbol.Convolution(data=relu13,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[2],name='conv14')
    conv14 = conv14 + conv12
    relu14 = mxnet.symbol.Activation(data=conv14, act_type='relu', name='relu_conv14')


    # loss 2 RF:223 ----------------------------------------------------------------------------------------------------
    # for scale [60,100]----------------------------------------------------------------------------------------
    if deploy_flag:
        predict_score_2, predict_bbox_2 = loss_branch(relu14, 'conv14', deploy_flag=deploy_flag)
    else:
        loss_score_2, loss_bbox_2 = loss_branch(relu14, 'conv14', mask=mask_2, label=label_2)

    # conv block 15 ----------------------------------------------------------------------------------------
    conv15 = mxnet.symbol.Convolution(data=relu14,kernel=(3, 3),stride=(2, 2),pad=(0, 0),num_filter=num_filters_list[2],name='conv15')
    relu15 = mxnet.symbol.Activation(data=conv15, act_type='relu', name='relu_conv15')

    # conv block 16 ----------------------------------------------------------------------------------------
    conv16 = mxnet.symbol.Convolution(data=relu15,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[2],name='conv16')
    relu16 = mxnet.symbol.Activation(data=conv16, act_type='relu', name='relu_conv16')

    # conv block 17 ----------------------------------------------------------------------------------------
    conv17 = mxnet.symbol.Convolution(data=relu16,kernel=(3, 3),stride=(1, 1),pad=(1, 1),num_filter=num_filters_list[2], name='conv17')
    conv17 = conv17 + conv15
    relu17 = mxnet.symbol.Activation(data=conv17, act_type='relu', name='relu_conv17')

    # loss 3 RF:383 ----------------------------------------------------------------------------------------------------
    # for scale [100,180]----------------------------------------------------------------------------------------
    if deploy_flag:
        predict_score_3, predict_bbox_3 = loss_branch(relu17, 'conv17', deploy_flag=deploy_flag)
    else:
        loss_score_3, loss_bbox_3 = loss_branch(relu17, 'conv17', mask=mask_3, label=label_3)

    # conv block 18 ----------------------------------------------------------------------------------------
    conv18 = mxnet.symbol.Convolution(data=relu17, kernel=(3, 3), stride=(2, 2),pad=(0, 0), num_filter=num_filters_list[2],name='conv18')
    relu18 = mxnet.symbol.Activation(data=conv18, act_type='relu', name='relu_conv18')

    # conv block 19 ----------------------------------------------------------------------------------------
    conv19 = mxnet.symbol.Convolution(data=relu18, kernel=(3, 3), stride=(1, 1),pad=(1, 1), num_filter=num_filters_list[2],name='conv19')
    relu19 = mxnet.symbol.Activation(data=conv19, act_type='relu', name='relu_conv19')

    # conv block 20 ----------------------------------------------------------------------------------------
    conv20 = mxnet.symbol.Convolution(data=relu19, kernel=(3, 3), stride=(1, 1),pad=(1, 1), num_filter=num_filters_list[2],
                                      name='conv20')
    conv20 = conv20 + conv18
    relu20 = mxnet.symbol.Activation(data=conv20, act_type='relu', name='relu_conv20')

    # loss 4 RF:703 ----------------------------------------------------------------------------------------------------
    # for scale [180, 320]----------------------------------------------------------------------------------------
    if deploy_flag:
        predict_score_4, predict_bbox_4 = loss_branch(relu20, 'conv20', deploy_flag=deploy_flag)
    else:
        loss_score_4, loss_bbox_4 = loss_branch(relu20, 'conv20', mask=mask_4, label=label_4)

    if deploy_flag:
        net = mxnet.symbol.Group([predict_score_1, predict_bbox_1,
                                  predict_score_2, predict_bbox_2,
                                  predict_score_3, predict_bbox_3,
                                  predict_score_4, predict_bbox_4,])

        return net
    else:
        net = mxnet.symbol.Group([loss_score_1, loss_bbox_1,
                                  loss_score_2, loss_bbox_2,
                                  loss_score_3, loss_bbox_3,
                                  loss_score_4, loss_bbox_4,])

        return net, data_names, label_names
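
The RF values quoted in the comments above (143, 223, 383, 703) can be checked with a little arithmetic. The following is a minimal sketch, not part of the author's code: it assumes only the kernel, stride and pad values visible in get_net_symbol(), and the names trace_backbone, STRIDE2_LAYERS and BRANCH_LAYERS, as well as the input size of 480, are hypothetical choices made for illustration.

```python
# Stride-2 convolutions in get_net_symbol(); they use pad 0, while all other
# layers use stride 1 with pad 1. Every backbone convolution has a 3x3 kernel.
STRIDE2_LAYERS = {1, 2, 3, 12, 15, 18}
BRANCH_LAYERS = (11, 14, 17, 20)  # layers whose output feeds loss_branch
KERNEL = 3


def trace_backbone(input_size):
    """Return {layer: (receptive_field, total_stride, feature_map_size)}."""
    rf, jump, size = 1, 1, input_size
    stats = {}
    for layer in range(1, 21):
        stride, pad = (2, 0) if layer in STRIDE2_LAYERS else (1, 1)
        # The receptive field grows by (kernel - 1) times the stride accumulated so far.
        rf += (KERNEL - 1) * jump
        jump *= stride
        # Standard convolution output-size formula with floor division.
        size = (size + 2 * pad - KERNEL) // stride + 1
        stats[layer] = (rf, jump, size)
    return stats


# 480 is a hypothetical input size, used only to show how the sizes shrink.
stats = trace_backbone(480)
for layer in BRANCH_LAYERS:
    rf, jump, size = stats[layer]
    print(f"conv{layer}: RF={rf}, stride={jump}, feature map={size}x{size}")
```

Under these assumptions the receptive fields at conv11, conv14, conv17 and conv20 come out as 143, 223, 383 and 703, matching the comments. Each branch's receptive field is therefore larger than the upper end of the target scale it is responsible for.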

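As you can see, there are few comments, and honestly there is not much to annotate: it is just a series of convolutions that produce feature maps. The function that computes the loss on those feature maps, however, deserves an explanation: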

# On the first call, input_data is relu11 (the branch with RF 143)
def loss_branch(input_data, prefix_name, mask=None, label=None, deploy_flag=False):
    branch_conv1 = mxnet.symbol.Convolution(data=input_data,kernel=(1, 1),stride=(1, 1),pad=(0, 0),num_filter=num_filters_list[2],name=prefix_name + '_1')
    branch_relu1 = mxnet.symbol.Activation(data=branch_conv1, act_type='relu', name='relu_' + prefix_name + '_1')

    # classification (called face classification in the original face-detection code): produces the feature map of predicted scores
    branch_conv2_score = mxnet.symbol.Convolution(data=branch_relu1,kernel=(1, 1),stride=(1, 1),pad=(0, 0),num_filter=num_filters_list[2],name=prefix_name + '_2_score')
    branch_relu2_score = mxnet.symbol.Activation(data=branch_conv2_score, act_type='relu', name='relu_' + prefix_name + '_2_score')
    branch_conv3_score = mxnet.symbol.Convolution(data=branch_relu2_score,kernel=(1, 1),stride=(1, 1),pad=(0, 0),num_filter=2,name=prefix_name + '_3_score')

    # bbox regression: produces the feature map of predicted bbox regression values
    branch_conv2_bbox = mxnet.symbol.Convolution(data=branch_relu1,kernel=(1, 1),stride=(1, 1),pad=(0, 0),num_filter=num_filters_list[2],name=prefix_name + '_2_bbox')
    branch_relu2_bbox = mxnet.symbol.Activation(data=branch_conv2_bbox, act_type='relu', name='relu_' + prefix_name + '_2_bbox')
    branch_conv3_bbox = mxnet.symbol.Convolution(data=branch_relu2_bbox,kernel=(1, 1),stride=(1, 1),pad=(0, 0),num_filter=4,name=prefix_name + '_3_bbox')

    # This block is not executed during training
    if deploy_flag:
        # Apply softmax over the channel axis to turn the scores into per-location probabilities
        predict_score = mxnet.symbol.softmax(data=branch_conv3_score, axis=1)
        predict_score = mxnet.symbol.slice_axis(predict_score, axis=1, begin=0, end=1)

        # The predicted box coordinates
        predict_bbox = branch_conv3_bbox

        return predict_score, predict_bbox
    else:
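        # Slice out channels 0 and 1 of the mask, which cover the two classification channels (positive/negative)
        mask_score = mxnet.symbol.slice_axis(mask, axis=1, begin=0, end=2)
        # Slice out channels 0 and 1 of the label: the classification targets (positive/negative)
        label_score = mxnet.symbol.slice_axis(label, axis=1, begin=0, end=2)
        # Combine mask_score and label_score with the predicted branch_conv3_score to compute the classification loss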
        loss_score = mxnet.symbol.Custom(pred=branch_conv3_score, label=label_score, mask=mask_score, hnm_ratio=cfg.param_hnm_ratio,
                                         op_type='cross_entropy_with_hnm_for_one_class_detection', name=prefix_name + '_loss_score')

        # Slice out channels 2 through 5; for the mask, these are feature maps whose entries are 0 or 1
        mask_bbox = mxnet.symbol.slice_axis(mask, axis=1, begin=2, end=6)

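        # Multiplying by the mask serves one purpose: predict_bbox may be non-zero at every feature-map location, but some locations
        # correspond to negative samples (no box, so the value should be 0). After the multiplication, negative locations become 0 and positive locations keep their box values (multiplied by 1)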
        predict_bbox = branch_conv3_bbox * mask_bbox
        label_bbox = mxnet.symbol.slice_axis(label, axis=1, begin=2, end=6) * mask_bbox
        # Compute the box regression loss
        loss_bbox = mxnet.symbol.LinearRegressionOutput(data=predict_bbox, label=label_bbox, name=prefix_name + '_loss_bbox')

        return loss_score, loss_bbox
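
To make the deploy path of loss_branch more concrete, here is a minimal pure-Python sketch of what the softmax over axis 1 followed by slice_axis(begin=0, end=1) does at each feature-map location: a two-class softmax whose channel 0 is kept as the score. The function name score_from_logits and the sample values are hypothetical and are not taken from the author's code.

```python
import math


def score_from_logits(logits_ch0, logits_ch1):
    """Per-location two-class softmax; returns the channel-0 probability map.

    logits_ch0 and logits_ch1 are 2-D lists (rows x cols) standing in for the
    two channels of branch_conv3_score at one image in the batch.
    """
    scores = []
    for row0, row1 in zip(logits_ch0, logits_ch1):
        out_row = []
        for a, b in zip(row0, row1):
            m = max(a, b)                      # subtract the max for numerical stability
            e0, e1 = math.exp(a - m), math.exp(b - m)
            out_row.append(e0 / (e0 + e1))     # keep channel 0, as slice_axis(begin=0, end=1) does
        scores.append(out_row)
    return scores


# Hypothetical 2x2 score maps for illustration.
ch0 = [[2.0, 0.0], [-1.0, 5.0]]
ch1 = [[2.0, 3.0], [1.0, -5.0]]
score_map = score_from_logits(ch0, ch1)
```

Locations where channel 0 dominates get a score close to 1, and locations where the two channels tie get 0.5. A detector would then threshold this map, although that step lives outside loss_branch.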

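From the training branch of loss_branch it should be clear why the mask is needed: it zeroes every feature-map element that maps to a location in the original image without a box, and leaves the predicted box values unchanged where a box exists. In other words, only feature-map points that have a box take part in the loss computation.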
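
The effect of the mask can be illustrated with a small pure-Python sketch. It assumes a squared-error regression loss, which is what LinearRegressionOutput optimizes; the function name masked_residual and the numbers are hypothetical, and a flat list stands in for one bbox channel of the feature map.

```python
def masked_residual(pred, label, mask):
    """Residual (pred * mask - label * mask) at each feature-map location.

    With a squared-error loss, this residual is what drives the update, so a
    location whose mask is 0 contributes nothing.
    """
    return [p * m - l * m for p, l, m in zip(pred, label, mask)]


# Hypothetical values: locations 0 and 3 are positive (mask 1), 1 and 2 are negative.
pred = [0.8, -0.3, 1.2, 0.5]
label = [1.0, 0.0, 0.0, 0.4]
mask = [1, 0, 0, 1]

residual = masked_residual(pred, label, mask)
loss = sum(r * r for r in residual)  # sum of squared errors over positive locations only
```

Even though the network predicts non-zero values at the negative locations, their residual is zero after masking, so only the two positive locations contribute to the loss.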

The idea behind LFFD is actually quite simple. That's all for today!
