BEVDet Environment Setup and Code Study

Environment setup

Setting up the environment took many attempts (most of them ended in failure). Here is a record of the problems (pitfalls) I ran into and how I solved them.

Mainly followed this column as a reference.

Lab server
conda create -n bevdet python=3.8
conda activate bevdet

# 3 Install torch in the bevdet virtual environment
pip install torch==1.10.0+cu113 torchvision==0.11.0+cu113 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

# 4 Install the OpenMMLab-related libraries in the bevdet environment; building mmcv-full takes quite a while -- as long as the terminal is not frozen, it is still working
pip install mmcv-full==1.5.3 onnxruntime-gpu==1.8.1 mmdet==2.25.1 mmsegmentation==0.25.0

# 5 Enter the BEVDet project directory and install mmdet3d
pip install -v -e .

# 6 Install the remaining dependencies such as numpy==1.23.4 and setuptools==58.2.0
pip install pycuda lyft_dataset_sdk networkx==2.2 numba==0.53.0 numpy==1.23.4 nuscenes-devkit plyfile scikit-image tensorboard trimesh==2.35.39 setuptools==58.2.0

Error:

RuntimeError: The detected CUDA version (10.2) mismatches the version that was used to compile PyTorch (11.3). Please make sure to use the same CUDA versions.

Check the CUDA version:

cat /usr/local/cuda/version.txt
---
CUDA Version 10.2.89

Damn, the lab server only goes up to CUDA 10.2, so I gave up on it (am I allowed to say that?).

Setting up the environment on my own machine:

On a Mac you can't even install CUDA, so forget about that.

Trying the AutoDL cloud server

Install torch

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu113

Installing mmcv-full failed with an error.

Tried installing it again:

conda install -c conda-forge mmcv-full==1.5.3

Still not working?? What is going on??

It turned out that the torch install above (via the cu113 index) pulls in the latest PyTorch 1.12.0 instead of the required 1.10.0, so everything had to be redone.

All of it wasted!

Install torch and the CUDA toolkit:

conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install mmcv-full==1.5.3 onnxruntime-gpu==1.8.1 mmdet==2.25.1 mmsegmentation==0.25.0
# first cd into the project directory
pip install -v -e .
pip install pycuda lyft_dataset_sdk networkx==2.2 numba==0.53.0 numpy==1.23.4 nuscenes-devkit plyfile scikit-image tensorboard trimesh==2.35.39 setuptools==58.2.0

This time the installation went through without errors, so the environment is more or less set up.
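
Before moving on, a quick sanity check of the environment doesn't hurt. This is my own minimal sketch, not part of the original setup steps:

# Minimal sanity check of the freshly built environment.
import torch
import mmcv
import mmdet
import mmseg

print("torch:", torch.__version__)                    # expected 1.10.0
print("cuda available:", torch.cuda.is_available())
print("torch built with cuda:", torch.version.cuda)   # expected 11.3
print("mmcv:", mmcv.__version__)                      # expected 1.5.3
print("mmdet:", mmdet.__version__)                    # expected 2.25.1
print("mmseg:", mmseg.__version__)                    # expected 0.25.0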

Time to train!

  1. First prepare the dataset, following section 2 of that blog post and the README.

# 1 Create the ckpts folder and cd into it
mkdir ckpts && cd ckpts

# 2 Download resnet50-0676ba61.pth
wget https://download.pytorch.org/models/resnet50-0676ba61.pth

The first download is often very slow; retrying usually goes faster.
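
Since an interrupted download leaves a corrupt file behind (which bit me later, see the PytorchStreamReader error below), a quick load test on the checkpoint is worth doing. A minimal sketch of my own:

# Check that the downloaded checkpoint is intact; a broken zip archive raises RuntimeError here.
import torch

state_dict = torch.load("ckpts/resnet50-0676ba61.pth", map_location="cpu")
print(len(state_dict), "tensors loaded")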

  2. Modify the parameters in configs/bevdet/bevdet-r50.py

pretrained='./ckpts/resnet50-0676ba61.pth'

samples_per_gpu=1,  # number of samples per GPU, i.e. the per-GPU batch size
workers_per_gpu=0,  # number of data-loading worker processes per GPU;
                    # this determines how fast and efficiently data is loaded

max_epochs = 2
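
For orientation, these fields live in the data dict of the config, and max_epochs is controlled by the runner. A sketch of the relevant structure only (standard mmdet-style layout; the surrounding contents of bevdet-r50.py are omitted and may differ slightly):

# Sketch of where the modified fields sit in the config (not the full file).
data = dict(
    samples_per_gpu=1,   # per-GPU batch size
    workers_per_gpu=0,   # data-loading worker processes per GPU
    # train=..., val=..., test=... dataset definitions stay as in the original config
)

runner = dict(type='EpochBasedRunner', max_epochs=2)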
  3. Launch!

python tools/train.py ./configs/bevdet/bevdet-r50.py

Error:

File "/root/miniconda3/envs/bevdet/lib/python3.8/site-packages/numba/np/ufunc/decorators.py", line 3, in <module>
    from numba.np.ufunc import _internal
SystemError: initialization of _internal failed without raising an exception

Reinstalling numpy==1.23.4 fixed it.

File "/root/miniconda3/envs/bevdet/lib/python3.8/site-packages/mmcv/utils/config.py", line 508, in pretty_text
    text, _ = FormatCode(text, style_config=yapf_style, verify=True)
TypeError: FormatCode() got an unexpected keyword argument 'verify'

Following this issue, pip install yapf==0.40.1 fixed it.

File "/root/miniconda3/envs/bevdet/lib/python3.8/site-packages/torch/serialization.py", line 242, in __init__
    super(_open_zipfile_reader, self).__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

The cause was a corrupt model file: the interrupted first download of the resnet50 weights had left a partial file behind. Deleting it and renaming the later, fully downloaded checkpoint solved the problem.

AttributeError: module 'distutils' has no attribute 'version'

Reinstalling setuptools==58.0.4 fixed it (I'm about to lose my mind).

It finally runs! Persistence pays off.

Reading the code

The project is an extension of the mmdetection3d codebase.

mmdetection3d
├── mmdet3d
├── tools
├── configs
├── data
│   ├── nuscenes
│   │   ├── maps
│   │   ├── samples
│   │   ├── sweeps
│   │   ├── v1.0-test
│   │   ├── v1.0-trainval

A lot of it is not actually used.

Files to read:

tools/create_data_bevdet.py
configs/bevdet/bevdet-r50.py
tools/train.py
mmdet3d/apis/train.py
mmdet3d/datasets/nuscenes_dataset.py
mmdet3d/models/detectors/bevdet.py
mmdet3d/models/necks/view_transformer.py

tools/train.py

Questions

  1. In the config, model.type='BEVDet', which is not the type checked in the snippet below:

if cfg.model.type in ['EncoderDecoder3D']:
    logger_name = 'mmseg'
else:
    logger_name = 'mmdet'
logger = get_root_logger(
    log_file=log_file, log_level=cfg.log_level, name=logger_name)

mmdet3d/apis/train.py contains the same check. Since 'BEVDet' is not in ['EncoderDecoder3D'], the else branch is taken and logger_name ends up as 'mmdet', which is the expected behaviour for a detection model:

# TODO: ugly workaround to judge whether we are training det or seg model
if cfg.model.type in ['EncoderDecoder3D']:
    logger_name = 'mmseg'
else:
    logger_name = 'mmdet'

mmdet3d/apis/train.py

FP16 hook

Fp16OptimizerHook()

Reference: 入门mmdetection(捌)---聊一聊FP16 (an article on FP16 in mmdetection)

# fp16 setting
# Read the fp16 field from the config; None if it is not set.
fp16_cfg = cfg.get('fp16', None)
if fp16_cfg is not None:
    # If fp16 is set, build a Fp16OptimizerHook instance.
    optimizer_config = Fp16OptimizerHook(
        **cfg.optimizer_config, **fp16_cfg, distributed=distributed)
elif distributed and 'type' not in cfg.optimizer_config:
    # Otherwise read optimizer_config from the config as usual.
    # bevdet-r50, for example, sets grad_clip: optimizer_config = dict(grad_clip=dict(max_norm=5, norm_type=2))
    optimizer_config = OptimizerHook(**cfg.optimizer_config)
else:
    optimizer_config = cfg.optimizer_config

# Then register the training hooks; optimizer_config is passed in as an argument.
# register hooks
runner.register_training_hooks(
    cfg.lr_config,
    optimizer_config,
    cfg.checkpoint_config,
    cfg.log_config,
    cfg.get('momentum_config', None),
    custom_hooks_config=cfg.get('custom_hooks', None))
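
To actually take the Fp16OptimizerHook branch, the config only needs an fp16 field. A minimal example of the standard mmcv-style setting (my own illustration; whether bevdet-r50.py enables it depends on the config version):

# Standard mmcv fp16 switch: with this field present, the branch above builds a Fp16OptimizerHook.
fp16 = dict(loss_scale=512.)          # fixed loss scale
# fp16 = dict(loss_scale='dynamic')   # or dynamic loss scaling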

Data preprocessing pipeline

First download the data.

After downloading, since we are testing with the mini dataset while the code expects the full dataset, copy v1.0-mini and rename the copy to v1.0-trainval. The data directory then looks like this:

data
  └──nuscenes
    ├── gts
    ├── maps
    ├── samples
    ├── sweeps
    ├── v1.0-mini
    └── v1.0-trainval  # the copied folder

Run tools/create_data_bevdet.py to generate the dataset info files.

When it finishes, two files are generated under ./data/nuscenes: bevdetv2-nuscenes_infos_train.pkl and bevdetv2-nuscenes_infos_val.pkl.
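
To get a feel for what the converter produced, the info files can be opened directly. A minimal sketch of my own (the key layout follows the usual mmdet3d nuScenes infos format, so treat the printed keys as whatever your version actually generates):

import pickle

with open("data/nuscenes/bevdetv2-nuscenes_infos_train.pkl", "rb") as f:
    data = pickle.load(f)

# Usually a dict with 'infos' (one entry per sample) and 'metadata'.
print(type(data))
if isinstance(data, dict) and "infos" in data:
    print("num samples:", len(data["infos"]))
    print("keys of one sample:", list(data["infos"][0].keys()))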

Loading the dataset

In train.py, the training set is built from the data.train field of the config via build_dataset(cfg.data.train).

mmdet3d converts the raw nuScenes data into a unified format shared across datasets; this conversion is done offline.

The mmdet3d version used by BEVDet keeps the coordinate-system definition of the processed data consistent with the original nuScenes. If the data is processed with a newer mmdet3d, mAOE can become very poor, because the newer mmdet3d changes the coordinate-system definition relative to the original nuScenes.
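
A minimal sketch of building the training dataset the same way train.py does and pulling one sample (my own sketch; which keys the sample contains depends on the configured pipeline):

from mmcv import Config
from mmdet3d.datasets import build_dataset

cfg = Config.fromfile("configs/bevdet/bevdet-r50.py")
dataset = build_dataset(cfg.data.train)

print(len(dataset), "training samples")
sample = dataset[0]            # runs the full training pipeline on one sample
print(list(sample.keys()))     # e.g. img_inputs, gt_bboxes_3d, gt_labels_3d, ...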

Model

model = dict(
    type='BEVDet',
    img_backbone=dict(
        pretrained='torchvision://resnet50',
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(2, 3),
        frozen_stages=-1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=False,
        with_cp=True,
        style='pytorch'),
    img_neck=dict(
        type='CustomFPN',
        in_channels=[1024, 2048],
        out_channels=256,
        num_outs=1,
        start_level=0,
        out_ids=[0]),
    img_view_transformer=dict(
        type='LSSViewTransformer',
        grid_config=grid_config,
        input_size=data_config['input_size'],
        in_channels=256,
        out_channels=numC_Trans,
        downsample=16),
    img_bev_encoder_backbone=dict(
        type='CustomResNet',
        numC_input=numC_Trans,
        num_channels=[numC_Trans * 2, numC_Trans * 4, numC_Trans * 8]),
    img_bev_encoder_neck=dict(
        type='FPN_LSS',
        in_channels=numC_Trans * 8 + numC_Trans * 2,
        out_channels=256),
    pts_bbox_head=dict(
        type='CenterHead',
        in_channels=256,
        tasks=[
            dict(num_class=10, class_names=['car', 'truck',
                                            'construction_vehicle',
                                            'bus', 'trailer',
                                            'barrier',
                                            'motorcycle', 'bicycle',
                                            'pedestrian', 'traffic_cone']),
        ],
        common_heads=dict(
            reg=(2, 2), height=(1, 2), dim=(3, 2), rot=(2, 2), vel=(2, 2)),
        share_conv_channel=64,
        bbox_coder=dict(
            type='CenterPointBBoxCoder',
            pc_range=point_cloud_range[:2],
            post_center_range=[-61.2, -61.2, -10.0, 61.2, 61.2, 10.0],
            max_num=500,
            score_threshold=0.1,
            out_size_factor=8,
            voxel_size=voxel_size[:2],
            code_size=9),
        separate_head=dict(
            type='SeparateHead', init_bias=-2.19, final_kernel=3),
        loss_cls=dict(type='GaussianFocalLoss', reduction='mean'),
        loss_bbox=dict(type='L1Loss', reduction='mean', loss_weight=0.25),
        norm_bbox=True),
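
Since I mainly used the training log to see the overall model structure, the assembled detector can also be built and printed directly. A minimal sketch of my own, mirroring what tools/train.py does:

# Build the BEVDet detector from the config and print its module tree.
from mmcv import Config
from mmdet3d.models import build_model

cfg = Config.fromfile("configs/bevdet/bevdet-r50.py")
model = build_model(
    cfg.model,
    train_cfg=cfg.get("train_cfg"),
    test_cfg=cfg.get("test_cfg"))
print(model)   # img_backbone, img_neck, img_view_transformer, ... show up as submodules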

img_backbone: ResNet

# located in mmdet/models/backbones/resnet.py
img_backbone=dict(
        pretrained='torchvision://resnet50',
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(2, 3),
        frozen_stages=-1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=False,
        with_cp=True,
        style='pytorch'),

Note that this is the ResNet from mmdet, not the one in mmdet3d.
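
A quick way to see why the neck's in_channels is [1024, 2048]: with out_indices=(2, 3), ResNet-50 returns its stage-3 and stage-4 feature maps (strides 16 and 32). A minimal sketch of my own, feeding a dummy tensor through the mmdet backbone:

import torch
from mmdet.models.backbones import ResNet

backbone = ResNet(depth=50, num_stages=4, out_indices=(2, 3), style='pytorch')
backbone.eval()

x = torch.randn(1, 3, 256, 704)     # a BEVDet-style input resolution
with torch.no_grad():
    feats = backbone(x)

for f in feats:
    print(f.shape)
# torch.Size([1, 1024, 16, 44])  -> stride 16
# torch.Size([1, 2048, 8, 22])   -> stride 32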

img_neck: CustomFPN

# located in mmdet3d/models/necks/fpn.py
img_neck=dict(
        type='CustomFPN',
        in_channels=[1024, 2048],
        out_channels=256,
        num_outs=1,
        start_level=0,
        out_ids=[0]),

img_view_transformer: LSSViewTransformer

In the model definition:

img_view_transformer=dict(
        type='LSSViewTransformer',
        grid_config=grid_config,
        input_size=data_config['input_size'],
        in_channels=256,
        out_channels=numC_Trans,
        downsample=16),
# located in mmdet3d/models/necks/view_transformer.py
forward()

Forward pass

Returns torch.Tensor: bird's-eye-view feature in shape (B, C, H_BEV, W_BEV)

The input first goes through the depth estimation network to get the depth features:

x = self.depth_net(x) 
###
self.depth_net = nn.Conv2d(in_channels, self.D + self.out_channels, kernel_size=1, padding=0)
view_transform(self, input, depth, tran_feat)
view_transform_core(input, depth, tran_feat)
self.get_lidar_coor()

view_transform()

view_transform_core()
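
The core of the lift step is easy to write down in plain PyTorch: split the depth_net output into D depth logits and C context channels, softmax over depth, then take the outer product so each pixel's context feature is spread over its depth bins. A minimal sketch of that step only (my own illustration; the real view_transform_core additionally projects the frustum features into BEV via get_lidar_coor and voxel pooling):

import torch
import torch.nn as nn

B_N, C_in, H, W = 6, 256, 16, 44     # (batch * num_cams, channels, feature height, feature width)
D, C_out = 59, 64                    # number of depth bins, context channels

depth_net = nn.Conv2d(C_in, D + C_out, kernel_size=1, padding=0)

x = torch.randn(B_N, C_in, H, W)
x = depth_net(x)                                # (B*N, D + C, H, W)
depth = x[:, :D].softmax(dim=1)                 # (B*N, D, H, W)  per-pixel depth distribution
tran_feat = x[:, D:D + C_out]                   # (B*N, C, H, W)  context features

# Outer product: frustum feature volume.
frustum_feat = depth.unsqueeze(1) * tran_feat.unsqueeze(2)   # (B*N, C, D, H, W)
print(frustum_feat.shape)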

CustomResNet

# located in mmdet3d/models/backbones/resnet.py
img_bev_encoder_backbone=dict(
        type='CustomResNet',
        numC_input=numC_Trans, # 64
        num_channels=[numC_Trans * 2, numC_Trans * 4, numC_Trans * 8]),

FPN_LSS

# located in mmdet3d/models/necks/lss_fpn.py
img_bev_encoder_neck=dict(
        type='FPN_LSS',
        in_channels=numC_Trans * 8 + numC_Trans * 2, # 640 = 512 + 128: deepest BEV feature upsampled and concatenated with a shallower one
        out_channels=256), 

CenterHead

# located in mmdet3d/models/dense_heads/centerpoint_head.py
# the pts_ prefix comes from mmdet3d's point-cloud (pts) branch naming; BEVDet reuses that head for its BEV features
pts_bbox_head=dict(
        type='CenterHead',
        in_channels=256,
        tasks=[
            dict(num_class=10, class_names=['car', 'truck',
                                            'construction_vehicle',
                                            'bus', 'trailer',
                                            'barrier',
                                            'motorcycle', 'bicycle',
                                            'pedestrian', 'traffic_cone']),
        ],
        common_heads=dict(
            reg=(2, 2), height=(1, 2), dim=(3, 2), rot=(2, 2), vel=(2, 2)),
        share_conv_channel=64,
        bbox_coder=dict(
            type='CenterPointBBoxCoder',
            pc_range=point_cloud_range[:2],
            post_center_range=[-61.2, -61.2, -10.0, 61.2, 61.2, 10.0],
            max_num=500,
            score_threshold=0.1,
            out_size_factor=8,
            voxel_size=voxel_size[:2],
            code_size=9),
        separate_head=dict(
            type='SeparateHead', init_bias=-2.19, final_kernel=3),
        loss_cls=dict(type='GaussianFocalLoss', reduction='mean'),
        loss_bbox=dict(type='L1Loss', reduction='mean', loss_weight=0.25),
        norm_bbox=True),
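
The common_heads entry is worth decoding: each (a, b) pair means a output channels produced by a branch with b conv layers, and the regression targets together make up the 9-dim box of code_size=9. A small bookkeeping sketch of my own:

# Each entry is (output_channels, num_conv_layers) of a SeparateHead branch.
common_heads = dict(
    reg=(2, 2),      # (dx, dy) sub-pixel center offset on the BEV grid
    height=(1, 2),   # z of the box center
    dim=(3, 2),      # box size (w, l, h), predicted in log scale
    rot=(2, 2),      # (sin(yaw), cos(yaw))
    vel=(2, 2),      # (vx, vy) velocity
)

channels = sum(c for c, _ in common_heads.values())
print(channels)      # 10 raw regression channels ...
# ... which decode to the 9-dim box [x, y, z, w, l, h, yaw, vx, vy] (code_size=9),
# since (sin, cos) collapses into a single yaw angle.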

I have gone through the code of the main model modules: the image-view encoder, the view transformation, and the BEV encoder.

I now roughly understand how the mmdet3d framework runs: config, pipeline, module, registry. The hooks are still not entirely clear to me, and I probably need to review some basics for the detection-head part, such as CenterPoint and its heatmaps.

I set up the environment on a cloud server and ran the model once; logging the output makes the overall model structure easy to see.

Open question about the model structure:

The channel numbers in the image-encoder part of the architecture figure in the paper don't match up: the backbone is ResNet-50 and the neck is an FPN, but the channel counts differ from the figure. Did I misunderstand something?

The mmdet3d framework feels very powerful, but the learning curve is not low and the level of encapsulation is very high. Worth digging into further.
