BEVDet
BEVDet inherits from CenterPoint, which in turn inherits from MVXTwoStageDetector.
The model is implemented on the OpenMMLab MMDetection3D framework.
The algorithm builds on CenterPoint-style point-cloud detection: it estimates depth from multi-view images, lifts the image features into a frustum-shaped pseudo point cloud, pools that into BEV pillars, and then runs detection on the resulting BEV features.
- The structure below is drawn from the code
Model
bevdet-r50
Module | type | Channels | Sub-module | type
---|---|---|---|---
img_backbone | ResNet | | |
img_neck | CustomFPN | [1024, 2048] → 512 | |
img_view_transformer | LSSViewTransformer | 512 → 80 | |
img_bev_encoder_backbone | CustomResNet | 80 → [80×2, 80×4, 80×8] | |
img_bev_encoder_neck | FPN_LSS | 80×8 + 80×2 → 256 | |
pts_bbox_head | CenterHead | 256 → | bbox_coder | CenterPointBBoxCoder
 | | | separate_head | SeparateHead
 | | | loss_cls | GaussianFocalLoss
 | | | loss_bbox | L1Loss
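The lift step described above (per-pixel depth hypotheses producing a frustum point cloud) can be sketched as a toy back-projection. Everything here is hypothetical: the pinhole intrinsics (fx, fy, cx, cy), the tiny feature-map size, and the three depth bins. The real LSSViewTransformer vectorizes this across all cameras, applies the full camera-to-ego transform, and weights each point by a predicted depth distribution.

```python
def make_frustum_points(feat_w, feat_h, depth_bins, fx, fy, cx, cy):
    """Back-project every feature-map cell at every candidate depth
    into camera coordinates, yielding a frustum-shaped point cloud."""
    points = []
    for d in depth_bins:                  # discrete depth hypotheses
        for v in range(feat_h):
            for u in range(feat_w):
                x = (u - cx) * d / fx     # pinhole back-projection
                y = (v - cy) * d / fy
                points.append((x, y, d))
    return points

# Toy numbers: a 4x2 feature map with 3 depth bins -> 24 frustum points.
pts = make_frustum_points(4, 2, [1.0, 2.0, 3.0], fx=2.0, fy=2.0, cx=2.0, cy=1.0)
print(len(pts))  # 24
```

Each feature-map cell contributes one point per depth bin, so an H×W map with D bins yields H·W·D frustum points, which are later pooled into BEV pillars.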
```python
model = dict(
    type='BEVDet',
    img_backbone=dict(
        pretrained='torchvision://resnet50',
        type='ResNet',
        depth=50,
        num_stages=4,  # the backbone has 4 residual stages
        out_indices=(2, 3),  # output the stage-2 and stage-3 feature maps (1024/2048 channels)
        frozen_stages=-1,  # -1: no stage is frozen, all backbone weights are trainable
        norm_cfg=dict(type='BN', requires_grad=True),
        # norm_eval=False: BN layers stay in training mode and normalize with the
        # current batch statistics; norm_eval=True would put them in evaluation
        # mode, using the stored running mean/variance instead.
        norm_eval=False,
        with_cp=True,  # gradient checkpointing: recompute activations in backward to save GPU memory
        style='pytorch'),
    img_neck=dict(
        type='CustomFPN',
        in_channels=[1024, 2048],
        out_channels=512,
        num_outs=1,
        start_level=0,  # start feature fusion from input level 0
        out_ids=[0]),  # return the 0-th FPN feature map
    img_view_transformer=dict(
        type='LSSViewTransformer',
        grid_config=grid_config,
        input_size=data_config['input_size'],
        in_channels=512,
        out_channels=numC_Trans,
        downsample=16),
    img_bev_encoder_backbone=dict(
        type='CustomResNet',
        numC_input=numC_Trans,
        num_channels=[numC_Trans * 2, numC_Trans * 4, numC_Trans * 8]),
    img_bev_encoder_neck=dict(
        type='FPN_LSS',
        in_channels=numC_Trans * 8 + numC_Trans * 2,
        out_channels=256),
    pts_bbox_head=dict(
        type='CenterHead',  # reused from CenterPoint, which BEVDet inherits
        in_channels=256,
        tasks=[
            dict(num_class=1, class_names=['car']),
            dict(num_class=2, class_names=['truck', 'construction_vehicle']),
            dict(num_class=2, class_names=['bus', 'trailer']),
            dict(num_class=1, class_names=['barrier']),
            dict(num_class=2, class_names=['motorcycle', 'bicycle']),
            dict(num_class=2, class_names=['pedestrian', 'traffic_cone']),
        ],
        common_heads=dict(
            reg=(2, 2), height=(1, 2), dim=(3, 2), rot=(2, 2), vel=(2, 2)),
        share_conv_channel=64,
        bbox_coder=dict(
            type='CenterPointBBoxCoder',
            pc_range=point_cloud_range[:2],
            post_center_range=[-61.2, -61.2, -10.0, 61.2, 61.2, 10.0],
            max_num=500,
            score_threshold=0.1,
            out_size_factor=8,
            voxel_size=voxel_size[:2],
            code_size=9),
        separate_head=dict(
            type='SeparateHead', init_bias=-2.19, final_kernel=3),
        loss_cls=dict(type='GaussianFocalLoss', reduction='mean'),
        loss_bbox=dict(type='L1Loss', reduction='mean', loss_weight=0.25),
        norm_bbox=True),
    # model training and testing settings
    train_cfg=dict(
        pts=dict(
            point_cloud_range=point_cloud_range,
            grid_size=[1024, 1024, 40],
            voxel_size=voxel_size,
            out_size_factor=8,
            dense_reg=1,
            gaussian_overlap=0.1,
            max_objs=500,
            min_radius=2,
            code_weights=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2])),
    test_cfg=dict(
        pts=dict(
            pc_range=point_cloud_range[:2],
            post_center_limit_range=[-61.2, -61.2, -10.0, 61.2, 61.2, 10.0],
            max_per_img=500,
            max_pool_nms=False,
            min_radius=[4, 12, 10, 1, 0.85, 0.175],
            score_threshold=0.1,
            out_size_factor=8,
            voxel_size=voxel_size[:2],
            pre_max_size=1000,
            post_max_size=83,
            # Scale-NMS
            nms_type=[
                'rotate', 'rotate', 'rotate', 'circle', 'rotate', 'rotate'
            ],
            nms_thr=[0.2, 0.2, 0.2, 0.2, 0.2, 0.5],
            nms_rescale_factor=[
                1.0, [0.7, 0.7], [0.4, 0.55], 1.1, [1.0, 1.0], [4.5, 9.0]
            ])))
```
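As a sanity check on the feature-map shapes recorded later in the inference trace, the BEV grid size follows from point_cloud_range and the BEV cell size. The 0.8 m cell size is an assumption for illustration (the actual grid_config values are not shown in this snippet):

```python
# Hypothetical values: point_cloud_range comes from this config; the 0.8 m
# BEV cell size is assumed, since grid_config is not shown above.
point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]
bev_resolution = 0.8

grid_w = int((point_cloud_range[3] - point_cloud_range[0]) / bev_resolution)
grid_h = int((point_cloud_range[4] - point_cloud_range[1]) / bev_resolution)
print(grid_w, grid_h)  # 128 128
```

Under this assumption the BEV plane is 128×128, consistent with the [1, 256, 128, 128] BEV feature shape.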
Training configuration

```python
point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]
```
train_pipeline | test_pipeline
---|---
PrepareImageInputs | PrepareImageInputs
LoadAnnotationsBEVDepth | LoadAnnotationsBEVDepth
ObjectRangeFilter | LoadPointsFromFile
ObjectNameFilter | MultiScaleFlipAug3D
DefaultFormatBundle3D | DefaultFormatBundle3D (nested in MultiScaleFlipAug3D)
Collect3D | Collect3D (nested in MultiScaleFlipAug3D)
Scale-NMS

Scale-NMS (from the BEVDet paper) rescales each class's predicted boxes by a class-specific factor before suppression, then restores them afterwards; each task also gets its own NMS type and threshold:

```python
# Scale-NMS: per-task NMS type, threshold, and per-class box rescale factor
nms_type=[
    'rotate', 'rotate', 'rotate', 'circle', 'rotate', 'rotate'
],
nms_thr=[0.2, 0.2, 0.2, 0.2, 0.2, 0.5],
nms_rescale_factor=[
    1.0, [0.7, 0.7], [0.4, 0.55], 1.1, [1.0, 1.0], [4.5, 9.0]
]
```
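Small classes such as pedestrian and traffic_cone get large rescale factors (4.5 and 9.0 above) so that near-duplicate detections actually overlap enough to be suppressed. A 1-D toy illustration of why enlarging boxes raises the IoU between nearby small detections (the interval sizes here are made up):

```python
def iou_1d(a, b):
    """IoU of two intervals given as (center, length)."""
    a_lo, a_hi = a[0] - a[1] / 2, a[0] + a[1] / 2
    b_lo, b_hi = b[0] - b[1] / 2, b[0] + b[1] / 2
    inter = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    union = a[1] + b[1] - inter
    return inter / union

# Two nearby small boxes (think duplicate traffic-cone detections)
# barely overlap at their original size...
a, b = (0.0, 0.4), (0.3, 0.4)
before = iou_1d(a, b)
# ...but after scaling their lengths by a class-specific factor
# (4.5, one of the nms_rescale_factor entries) the overlap is large
# enough for NMS to remove the duplicate.
k = 4.5
after = iou_1d((a[0], a[1] * k), (b[0], b[1] * k))
print(round(before, 3), round(after, 3))
```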
Optimizer configuration
optimizer | lr | lr_config |
---|---|---|
AdamW | 2e-4 | policy=step |
Inference trace

Module | Submodule | Submodule | Tensor size | Meaning
---|---|---|---|---
extract_img_feat | image_encoder | img_backbone `ResNet` | [1, 1024, 16, 44], [1, 2048, 8, 22] | stage-2/3 feature maps
 | | img_neck `CustomFPN` | [1, 512, 16, 44] | fused feature
 | img_view_transformer | | [1, 59, 16, 44] | depth distribution
 | bev_encoder | `CustomResNet` + `FPN_LSS` | [1, 256, 128, 128] | BEV feature
pts_bbox_head | CenterHead | `SeparateHead` | losses | multi-task detection heads
Registration

The registry mechanism uses the `type` key in a cfg dict to look up the corresponding registered class and instantiate it.
```python
# core of mmcv's build_from_cfg: resolve `type` to a class, then instantiate it
obj_type = args.pop('type')
if isinstance(obj_type, str):
    obj_cls = registry.get(obj_type)
    if obj_cls is None:
        raise KeyError(
            f'{obj_type} is not in the {registry.name} registry')
elif inspect.isclass(obj_type) or inspect.isfunction(obj_type):
    obj_cls = obj_type
else:
    raise TypeError(
        f'type must be a str or valid type, but got {type(obj_type)}')
try:
    return obj_cls(**args)
except Exception as e:
    # re-raise with the class name prepended for easier debugging
    raise type(e)(f'{obj_cls.__name__}: {e}')
```
Note: the cfg is deep-copied before keys are popped, so parameters are passed in while the caller's config dict stays isolated from mutation.
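A minimal stand-in for this mechanism: a toy Registry that resolves `type`, deep-copies the cfg for isolation, and instantiates the class. The Registry class and the registered BEVDet stub are illustrative only, not mmcv's actual implementation:

```python
import copy

class Registry:
    """Minimal stand-in for mmcv's Registry."""
    def __init__(self, name):
        self.name = name
        self._module_dict = {}

    def register_module(self, cls):
        self._module_dict[cls.__name__] = cls
        return cls

    def get(self, key):
        return self._module_dict.get(key)

    def build(self, cfg):
        args = copy.deepcopy(cfg)      # isolate the caller's cfg dict
        obj_type = args.pop('type')
        obj_cls = self.get(obj_type)
        if obj_cls is None:
            raise KeyError(f'{obj_type} is not in the {self.name} registry')
        return obj_cls(**args)         # remaining keys become __init__ kwargs

MODELS = Registry('models')

@MODELS.register_module
class BEVDet:                          # hypothetical registered class
    def __init__(self, depth):
        self.depth = depth

cfg = dict(type='BEVDet', depth=50)
model = MODELS.build(cfg)
print(type(model).__name__, model.depth)  # BEVDet 50
print(cfg)  # unchanged, thanks to deepcopy
```

Because `build` pops `type` from a deep copy, the original cfg dict can be reused to build further instances.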
Random seeds

Given the same seed, a generator produces the same numbers, so these function-generated values are pseudo-random: like a single-variable function, the same input always yields the same output. Each draw also advances the generator's internal state (in effect producing a new "seed"), which is why repeated calls to the random function return different values: the second call starts from a different state.
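This behavior is easy to check with Python's random module:

```python
import random

random.seed(42)
first = [random.random() for _ in range(3)]

random.seed(42)          # same seed -> identical sequence
second = [random.random() for _ in range(3)]
assert first == second

# Without reseeding, the internal state has advanced, so the
# next draws differ from the first ones.
third = [random.random() for _ in range(3)]
assert third != first
print("same seed reproduces the sequence; repeated draws differ")
```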
Summary

The MMLab framework already wraps the basic building blocks and decouples the functional modules. In everyday use there is no need to dig into every implementation detail. ==Do not reinvent the wheel!==