BEVDet Code Reproduction in Practice

AI松子666

Last modified 2024-04-09 17:48:34

Tags: robotics, python, autonomous driving, computer vision, artificial intelligence

First published 2024-04-09 15:21:38

Article link: https://blog.csdn.net/qq_39523365/article/details/137553428

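1 Environment Setup

Assumed baseline: ubuntu-20.04, python-3.8, torch-1.10.0, cuda-11.3, cudnn-8.6

I have not tested other version combinations, so please do not ask whether they work. Beginners and anyone who would rather not wrestle with the environment can copy this setup as is.

1.1 Basic environment setup

1. Basic dependencies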

apt-get install -y vim libsm6 libxext6 libxrender-dev libgl1-mesa-glx git wget libssl-dev libopencv-dev libspdlog-dev

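2. Fetch the source code

# 1 Download the source; the default branch is currently dev2.1
git clone https://github.com/HuangJunJie2017/BEVDet.git

# 2 Enter the repository and check the current branch; it should print dev2.1
cd BEVDet
git branch --show-current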

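3. Set up the Python virtual environment for bevdet

# 1 Create the virtual environment
conda create -n bevdet python=3.8

# 2 Activate the environment
conda activate bevdet

# 3 Install torch inside the bevdet environment
pip install torch==1.10.0+cu113 torchvision==0.11.0+cu113 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

# 4 Install the OpenMMLab-related libraries inside the bevdet environment. mmcv-full takes a long time to build; as long as the machine has not frozen, this is normal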
pip install mmcv-full==1.5.3 onnxruntime-gpu==1.8.1 mmdet==2.25.1 mmsegmentation==0.25.0

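# 5 From the BEVDet project directory, install mmdet3d
pip install -v -e .
# -v is optional; without it the build log is not printed to the terminal.

# 6 Install the remaining dependencies: numpy==1.23.4, setuptools==58.2.0, etc.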
pip install pycuda lyft_dataset_sdk networkx==2.2 numba==0.53.0 numpy==1.23.4 nuscenes-devkit plyfile scikit-image tensorboard trimesh==2.35.39 setuptools==58.2.0 yapf==0.40.1
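
Since most of the errors below come from version mismatches, it can help to compare what is actually installed against the pins used in this post. The following is a minimal sketch, not part of BEVDet: the helper names `installed_version` and `check_pins` are made up for illustration, and the pin table simply repeats the versions from the install commands above.

```python
from importlib import metadata

# Pins copied from the install commands in this post (illustrative subset).
EXPECTED = {
    "torch": "1.10.0",
    "torchvision": "0.11.0",
    "mmcv-full": "1.5.3",
    "mmdet": "2.25.1",
    "mmsegmentation": "0.25.0",
    "numpy": "1.23.4",
    "numba": "0.53.0",
    "yapf": "0.40.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(expected, lookup=installed_version):
    """Return {package: (expected, found)} for every package that does not match."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Drop a local build tag such as "+cu113" before comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = check_pins(EXPECTED)
    for package, (wanted, found) in problems.items():
        print(f"{package}: expected {wanted}, found {found}")
    if not problems:
        print("all pinned packages match")
```

Run it inside the activated bevdet environment; any line it prints points at a package worth reinstalling before digging into the errors in the next section.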

1.2 Error summary

# Error 1
...
    from numba.np.ufunc import _internal
SystemError: initialization of _internal failed without raising an exception
# Fix: downgrade numpy
pip install numpy==1.23.4

# Error 2
ModuleNotFoundError: No module named 'spconv'
# Fix: the spconv build must match your CUDA version. I use cuda-11.3, so I installed:
pip install spconv-cu113

# Error 3
ModuleNotFoundError: No module named 'IPython'
# Fix
pip install IPython

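# Error 4
# Case 1: No module named 'projects.mmdet3d_plugin'
# Case 2: ModuleNotFoundError: No module named 'tools'
# Case 3: ModuleNotFoundError: No module named 'tools.data_converter'
# Both tools and projects.mmdet3d_plugin are imported as local modules,
# so a failed import means either PYTHONPATH has not taken effect or the module path is wrong
# Fix: update PYTHONPATH by running the following in the terminal of the current virtual environment
export PYTHONPATH=$PYTHONPATH:"./"
# If the error persists, check that the path is correct; an absolute path can be used instead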

# Error 5
TypeError: FormatCode() got an unexpected keyword argument 'verify'
# Fix: downgrade yapf
pip install yapf==0.40.1

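# Error 6
ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory
# Cause: the installed mmcv does not match the CUDA version; download the wheel from the official wheel index and install it offline
# Fix: see the mmcv-full installation tutorial in 1.4.1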

# Error 7
AttributeError: module 'distutils' has no attribute 'version'
# Fix: install an older setuptools release
pip install setuptools==58.4.0

# Error 8
# Inside docker, libGL.so.1 is reported missing
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
# Fix: installing ffmpeg is enough
apt-get install ffmpeg -y

# Error 9: pip fails while installing mmcv-full
subprocess.CalledProcessError: Command '['which', 'g++']' returned non-zero exit status 1.
      [end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for mmcv-full
# Fix: g++ and gcc are not installed; installing build-essential provides them
sudo apt-get install build-essential

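# Error 10: GPU memory exhausted during training. RuntimeError: CUDA out of memory
# Fix: in the config file, first set samples_per_gpu to 1 and workers_per_gpu to 0 to verify the environment.
# For real training, raise samples_per_gpu step by step until GPU memory is nearly full,
# and raise workers_per_gpu as far as CPU and host memory allow
# If GPU memory is still insufficient with 1 and 0, it is time for a bigger GPU
samples_per_gpu=1, workers_per_gpu=0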

# Error 11: ImportError: cannot import name 'bev_pool_v2_ext' from 'mmdet3d.ops.bev_pool_v2'
# Fix: go to the project directory and reinstall mmdet3d
pip install -v -e .

# Error 12: raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
# RuntimeError: DataLoader worker (pid(s) 4252, 17184) exited unexpectedly
# Fix: set workers_per_gpu=0 so data is loaded in the main process instead of worker subprocesses

# Error 13
  File "/home/zs/anaconda3/lib/python3.8/site-packages/mmcv/utils/ext_loader.py", line 15, in load_ext
    assert hasattr(ext, fun), f'{fun} miss in module {name}'
AssertionError: chamfer_distance_forward miss in module _ext
# Cause: mmcv was installed instead of mmcv-full
# Fix: install mmcv-full. The cuXXX tag in the URL should correspond to your CUDA version (this post's environment is cuda-11.3)
pip install mmcv-full==1.5.3 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.10/index.html
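
For the local import failures listed above (No module named 'tools' or 'projects.mmdet3d_plugin'), an alternative to exporting PYTHONPATH in every new shell is to add the repository root to `sys.path` at the top of your own script. This is only a sketch of that idea; `add_repo_root` is a hypothetical helper, not something the BEVDet repository provides.

```python
import sys
from pathlib import Path


def add_repo_root(repo_root):
    """Put repo_root at the front of sys.path if it is not already there."""
    root = str(Path(repo_root).resolve())
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Hypothetical usage at the top of a script stored in <repo>/tools/:
# add_repo_root(Path(__file__).resolve().parent.parent)
```

Using a resolved absolute path avoids the problem mentioned above, where a relative "./" stops working as soon as the script is launched from a different directory.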

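2 Preparing the data

1. Download the dataset
For download details, see section 2 (dataset preparation) of the post "Fast-BEV代码复现实践" (Fast-BEV code reproduction in practice).

We test with the mini dataset, but the source code expects the full dataset. After downloading, copy v1.0-mini and name the copy v1.0-trainval. The data directory then looks like this: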

data
  └──nuscenes
    ├── gts
    ├── maps
    ├── samples
    ├── sweeps
    ├── v1.0-mini
    └── v1.0-trainval  # copied folder

For real training, use the full nuScenes dataset to generate the bevdet data.
If v1.0-mini has not been copied to v1.0-trainval, you get the error AssertionError: Database version not found: ./data/nuscenes/v1.0-trainval
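
The copy step can also be scripted so that it is safe to run more than once. The sketch below is an illustration under the directory layout shown above; `mirror_mini_as_trainval` is a made-up helper name, and it deliberately leaves an existing v1.0-trainval folder alone in case it holds the real full-split metadata.

```python
import shutil
from pathlib import Path


def mirror_mini_as_trainval(nuscenes_root):
    """Copy <root>/v1.0-mini to <root>/v1.0-trainval unless it already exists."""
    root = Path(nuscenes_root)
    src = root / "v1.0-mini"
    dst = root / "v1.0-trainval"
    if not src.is_dir():
        raise FileNotFoundError(f"missing metadata folder: {src}")
    if dst.exists():
        # Leave an existing folder (possibly the real full split) untouched.
        return dst
    shutil.copytree(src, dst)
    return dst


# Example call from the project root:
# mirror_mini_as_trainval("./data/nuscenes")
```
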
2. Run tools/create_data_bevdet.py to generate the dataset

python tools/create_data_bevdet.py

When it finishes, two files are generated under ./data/nuscenes: bevdetv2-nuscenes_infos_train.pkl and bevdetv2-nuscenes_infos_val.pkl
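
Before starting training, a quick look inside the two generated files can confirm that they are not empty. The sketch below assumes, following the usual mmdet3d info-file convention, that each pickle holds a dict with an 'infos' list; that layout is an assumption here, so the helper (a hypothetical `summarize_info_file`) falls back to reporting only the top-level type when the key is absent.

```python
import pickle
from pathlib import Path


def summarize_info_file(path):
    """Return (top-level type name, number of entries under 'infos' or None)."""
    with open(path, "rb") as handle:
        data = pickle.load(handle)
    count = None
    # Assumed layout: {"infos": [...], "metadata": {...}}
    if isinstance(data, dict) and "infos" in data:
        count = len(data["infos"])
    return type(data).__name__, count


def report(nuscenes_root="./data/nuscenes"):
    """Print a one-line summary for the train and val info files."""
    for split in ("train", "val"):
        path = Path(nuscenes_root) / f"bevdetv2-nuscenes_infos_{split}.pkl"
        if path.is_file():
            print(path.name, summarize_info_file(path))
        else:
            print(path.name, "not found")
```

Calling `report()` from the project root prints one line per split. Only unpickle files you generated yourself, since loading a pickle executes code embedded in it.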

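3 Training

Download the pretrained model

# 1 Create the ckpts folder and enter it
mkdir ckpts && cd ckpts

# 2 Download resnet50-0676ba61.pth
wget https://download.pytorch.org/models/resnet50-0676ba61.pth

# 3 Return to the project root, since the commands below use paths relative to it
cd ..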

Edit the parameters in configs/bevdet/bevdet-r50.py
For real training, tune them to your hardware. While validating the environment, keep them small: samples_per_gpu=1 and workers_per_gpu=0 are recommended.

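# 1 Path to the pretrained weights
pretrained='./ckpts/resnet50-0676ba61.pth'
# 2 Data loading settings
samples_per_gpu=1,  # samples per GPU, i.e. the per-GPU batch size
workers_per_gpu=0,  # data-loading worker subprocesses per GPU (0 loads data in the main process);
                    # this affects data loading speed and efficiency
# 3 Number of training epochs
max_epochs = 2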
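
To get a feel for how samples_per_gpu changes the length of a run, the arithmetic can be written out. This is a rough sketch only: it ignores any padding or dropping of the last batch that the actual sampler may apply, and the sample count of 1000 in the usage comment is an arbitrary illustrative number, not the size of any nuScenes split.

```python
import math


def iterations_per_epoch(num_samples, samples_per_gpu, num_gpus=1):
    """Approximate optimizer steps per epoch for a given per-GPU batch size."""
    if samples_per_gpu < 1 or num_gpus < 1:
        raise ValueError("samples_per_gpu and num_gpus must be at least 1")
    # The effective batch size is the per-GPU batch times the number of GPUs.
    return math.ceil(num_samples / (samples_per_gpu * num_gpus))


def total_iterations(num_samples, samples_per_gpu, max_epochs, num_gpus=1):
    """Approximate total optimizer steps over the whole run."""
    return max_epochs * iterations_per_epoch(num_samples, samples_per_gpu, num_gpus)


# Illustrative numbers: 1000 samples, one GPU, max_epochs = 2
# total_iterations(1000, samples_per_gpu=1, max_epochs=2)  # 2000 steps
# total_iterations(1000, samples_per_gpu=8, max_epochs=2)  # 250 steps
```
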

Train

python tools/train.py ./configs/bevdet/bevdet-r50.py

Training for 2 epochs produces ./work_dirs with the following structure:

work_dirs
└── bevdet-r50
    ├── 20230914_154608.log
    ├── 20230914_154608.log.json
    ├── bevdet-r50.py
    ├── epoch_1_ema.pth
    ├── epoch_1.pth
    ├── epoch_2_ema.pth
    ├── epoch_2.pth
    ├── latest.pth -> epoch_2.pth
    └── tf_logs
        └── events.out.tfevents.1694677573.PC.211602.0
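
The .log.json file in this folder is convenient for checking whether the loss is going down without opening TensorBoard. The sketch below assumes the usual mmcv text-logger layout, one JSON object per line with "mode", "epoch", "iter" and "loss" keys on training lines; treat those key names as an assumption and check a line of your own log first. `read_train_losses` is a hypothetical helper.

```python
import json


def read_train_losses(lines):
    """Return (epoch, iter, loss) tuples from the lines of a .log.json file."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        # Lines without mode == "train" (environment info, validation) are skipped.
        if entry.get("mode") == "train" and "loss" in entry:
            records.append((entry.get("epoch"), entry.get("iter"), entry["loss"]))
    return records


# Example with the log file from the tree above:
# with open("work_dirs/bevdet-r50/20230914_154608.log.json") as handle:
#     for epoch, iteration, loss in read_train_losses(handle)[-5:]:
#         print(epoch, iteration, loss)
```
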

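4 Testing

The weights we trained ourselves give poor results, so use the officially trained weights instead. See readme.md for details; the official weights are hosted on Baidu Cloud.

This test uses the bevdet-r50 config; other configs work the same way.

# 1 Your own weights
python tools/test.py ./configs/bevdet/bevdet-r50.py work_dirs/bevdet-r50/latest.pth --eval mAP

# 2 Official weights
python tools/test.py ./configs/bevdet/bevdet-r50.py ckpts/bevdet-r50.pth --eval mAP

5 Visualization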

1. Generate the json file

# test.py must be given at least one of --out, --eval, --format-only, --show or --show-dir
# To choose where the json file is written, add --eval-options jsonfile_prefix=test_dirs
# If anything is unclear, read the test.py source to see how the arguments are parsed

# 1 Plain test
python tools/test.py ./configs/bevdet/bevdet-r50.py ckpts/bevdet-r50.pth --format-only

# 2 Test and save the json file
python tools/test.py ./configs/bevdet/bevdet-r50.py ckpts/bevdet-r50.pth --format-only --eval-options jsonfile_prefix=test_dirs
# The json file is saved under the test_dirs directory

# 3 Save the results directly in pkl format
python tools/test.py ./configs/bevdet/bevdet-r50.py ckpts/bevdet-r50.pth --out=./test_dirs/out.pkl

Running the first command above generates the file ./work_dir/results_nusc.json
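
Before rendering a video, it can be worth checking that the json actually contains predictions. The sketch below assumes the file follows the nuScenes detection submission layout, a top-level "results" dict mapping each sample token to a list of boxes with "detection_name" and "detection_score" fields. `count_detections` and `load_results` are hypothetical helpers written for this post.

```python
import json
from collections import Counter


def load_results(path):
    """Read a results json file into a dict."""
    with open(path, "r") as handle:
        return json.load(handle)


def count_detections(results, min_score=0.0):
    """Count predicted boxes per class, keeping boxes with score >= min_score."""
    counts = Counter()
    for boxes in results.get("results", {}).values():
        for box in boxes:
            if box.get("detection_score", 0.0) >= min_score:
                counts[box.get("detection_name", "unknown")] += 1
    return counts


# Example with the file generated above:
# print(count_detections(load_results("./work_dir/results_nusc.json"), min_score=0.3))
```

If every class count is zero at a moderate threshold, the checkpoint or config is the first thing to recheck.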

2. Turn the json file into a visualization

python tools/analysis_tools/vis.py ./work_dir/results_nusc.json

This generates the video file ./vis/vis.mp4. The results from bevdet-r50.pth look reasonably good.
