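OpenPCDet: Training BEVFusion/TransFusion_L on the nuScenes Dataset
Introduction
For OpenPCDet environment setup, see: [3D Object Detection] Environment Setup (OpenPCDet, MMdetection3d)
Source code: OpenPCDet: https://github.com/open-mmlab/OpenPCDet
For training on a custom dataset, see: [3D Object Detection] Training OpenPCDet on a Custom Dataset
1 Preparing the nuScenes Dataset
nuScenes dataset official website
First register and log in, then find the download section shown in the screenshot:
You can start with the v1.0-mini split for testing, and download the full dataset if your storage and bandwidth allow.
1.1 Dataset Preparation and Preprocessing
Arrange the nuScenes data in the structure OpenPCDet expects.
Ideally, place it under the OpenPCDet/data folder,
or use a symbolic link: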
ln -s /opt/data/DATASETS/nuscenes /opt/data/CNN_3D/OpenPCDet/data
OpenPCDet
├── data
│   ├── nuscenes
│   │   │── v1.0-trainval (or v1.0-mini if you use mini)
│   │   │   │── samples
│   │   │   │── sweeps
│   │   │   │── maps
│   │   │   │── v1.0-trainval
├── pcdet
├── tools
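Before generating the info files, it can help to confirm that the folder really matches the tree above. The following is a minimal sketch under that assumption; `check_nuscenes_layout` is a hypothetical helper, not part of OpenPCDet, and the example path is the one used in this post.

```python
from pathlib import Path

# Sub-folders expected inside data/nuscenes/<version>/, per the tree above.
REQUIRED_SUBDIRS = ("samples", "sweeps", "maps")


def check_nuscenes_layout(root, version="v1.0-mini"):
    """Return the expected folders that are missing (empty list if all exist)."""
    version_dir = Path(root) / version
    expected = [version_dir / name for name in REQUIRED_SUBDIRS]
    # The innermost folder carries the same name as the version.
    expected.append(version_dir / version)
    return [str(path) for path in expected if not path.is_dir()]


if __name__ == "__main__":
    missing = check_nuscenes_layout("/opt/data/CNN_3D/OpenPCDet/data/nuscenes")
    print("layout OK" if not missing else "missing: " + ", ".join(missing))
```

An empty result means all four folders were found; otherwise the returned paths show what still needs to be copied or linked.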
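1.2 Generating the Standard Data Format
Install the required dependency:
pip install nuscenes-devkit==1.0.5 -i https://pypi.tuna.tsinghua.edu.cn/simple
Modify the following files:
- OpenPCDet/pcdet/datasets/nuscenes/nuscenes_dataset.py
Change the local library imports to absolute paths: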
from pcdet.ops.roiaware_pool3d import roiaware_pool3d_utils
from pcdet.utils import common_utils
from pcdet.datasets.dataset import DatasetTemplate
Modify the argument defaults:
if __name__ == '__main__':
    import yaml
    import argparse
    from pathlib import Path
    from easydict import EasyDict
    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--cfg_file', type=str, default='/opt/data/CNN_3D/OpenPCDet/tools/cfgs/dataset_configs/nuscenes_dataset.yaml', help='specify the config of dataset')  # changed: config file path
    parser.add_argument('--func', type=str, default='create_nuscenes_infos', help='')
    parser.add_argument('--version', type=str, default='v1.0-mini', help='')  # changed: use the mini split
    parser.add_argument('--with_cam', action='store_true', default=True, help='use camera or not')  # default changed to True for multimodal (LiDAR + camera) data
    args = parser.parse_args()
Change the import path inside the create_nuscenes_info function:
def create_nuscenes_info(version, data_path, save_path, max_sweeps=10, with_cam=False):
    from nuscenes.nuscenes import NuScenes
    from nuscenes.utils import splits
    import nuscenes_utils  # changed
- OpenPCDet/tools/cfgs/dataset_configs/nuscenes_dataset.yaml
DATASET: 'NuScenesDataset'
DATA_PATH: '/opt/data/CNN_3D/OpenPCDet/data/nuscenes' # change 1: dataset path
VERSION: 'v1.0-mini' # 'v1.0-trainval' # change 2: dataset split
MAX_SWEEPS: 10
PRED_VELOCITY: True
SET_NAN_VELOCITY_TO_ZEROS: True
FILTER_MIN_POINTS_IN_GT: 1
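Run the code:
# 1. After the changes above, the script can be run directly
python nuscenes_dataset.py
# 2. Alternatively, use the commands from the official documentation
# LiDAR point clouds only
python -m pcdet.datasets.nuscenes.nuscenes_dataset --func create_nuscenes_infos \
    --cfg_file tools/cfgs/dataset_configs/nuscenes_dataset.yaml \
    --version v1.0-trainval
# Multimodal data (point clouds + images)
python -m pcdet.datasets.nuscenes.nuscenes_dataset --func create_nuscenes_infos \
    --cfg_file tools/cfgs/dataset_configs/nuscenes_dataset.yaml \
    --version v1.0-trainval \
    --with_cam
On success, the terminal output looks like this:
The following files are generated in the folder: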
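If you would rather drive the official command from a script, the argument list can be assembled programmatically. This is a minimal sketch that uses only the flags appearing in the commands above; `build_info_command` is a hypothetical helper name, and the default config path is the relative one from the official command.

```python
import sys


def build_info_command(version="v1.0-mini", with_cam=False,
                       cfg_file="tools/cfgs/dataset_configs/nuscenes_dataset.yaml"):
    """Assemble the argv list for the info-generation command shown above."""
    cmd = [
        sys.executable, "-m", "pcdet.datasets.nuscenes.nuscenes_dataset",
        "--func", "create_nuscenes_infos",
        "--cfg_file", cfg_file,
        "--version", version,
    ]
    if with_cam:
        cmd.append("--with_cam")  # multimodal: point clouds + images
    return cmd


if __name__ == "__main__":
    # Only print the command here. To launch it, pass the list to
    # subprocess.run(cmd, check=True) from the OpenPCDet root folder.
    print(" ".join(build_info_command("v1.0-mini", with_cam=True)))
```

Building a list rather than a shell string avoids quoting problems if the config path contains spaces.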
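2 Training BEVFusion
2.1 Downloading the Pretrained Weights
2.2 Modifying the Training Config File bevfusion.yaml
Two changes are needed; both are marked with comments in the config: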
CLASS_NAMES: ['car','truck', 'construction_vehicle', 'bus', 'trailer',
              'barrier', 'motorcycle', 'bicycle', 'pedestrian', 'traffic_cone']
DATA_CONFIG:
    _BASE_CONFIG_: /opt/data/CNN_3D/OpenPCDet/tools/cfgs/dataset_configs/nuscenes_dataset.yaml # change 1: path to the dataset config file
    POINT_CLOUD_RANGE: [-54.0, -54.0, -5.0, 54.0, 54.0, 3.0]
    CAMERA_CONFIG:
        USE_CAMERA: True
        IMAGE:
            FINAL_DIM: [256,704]
            RESIZE_LIM_TRAIN: [0.38, 0.55]
            RESIZE_LIM_TEST: [0.48, 0.48]
    DATA_AUGMENTOR:
        DISABLE_AUG_LIST: ['placeholder']
        AUG_CONFIG_LIST:
            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x', 'y']
            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]
            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.9, 1.1]
            - NAME: random_world_translation
              NOISE_TRANSLATE_STD: [0.5, 0.5, 0.5]
            - NAME: imgaug
              ROT_LIM: [-5.4, 5.4]
              RAND_FLIP: True
    DATA_PROCESSOR:
        - NAME: mask_points_and_boxes_outside_range
          REMOVE_OUTSIDE_BOXES: True
        - NAME: shuffle_points
          SHUFFLE_ENABLED: {
            'train': True,
            'test': True
          }
        - NAME: transform_points_to_voxels
          VOXEL_SIZE: [0.075, 0.075, 0.2]
          MAX_POINTS_PER_VOXEL: 10
          MAX_NUMBER_OF_VOXELS: {
            'train': 120000,
            'test': 160000
          }
        - NAME: image_calibrate
        - NAME: image_normalize
          mean: [0.485, 0.456, 0.406]
          std: [0.229, 0.224, 0.225]
MODEL:
    NAME: BevFusion
    VFE:
        NAME: MeanVFE
    BACKBONE_3D:
        NAME: VoxelResBackBone8x
        USE_BIAS: False
    MAP_TO_BEV:
        NAME: HeightCompression
        NUM_BEV_FEATURES: 256
    IMAGE_BACKBONE:
        NAME: SwinTransformer
        EMBED_DIMS: 96
        DEPTHS: [2, 2, 6, 2]
        NUM_HEADS: [3, 6, 12, 24]
        WINDOW_SIZE: 7
        MLP_RATIO: 4
        DROP_RATE: 0.
        ATTN_DROP_RATE: 0.
        DROP_PATH_RATE: 0.2
        PATCH_NORM: True
        OUT_INDICES: [1, 2, 3]
        WITH_CP: False
        CONVERT_WEIGHTS: True
        INIT_CFG:
            type: Pretrained
            checkpoint: /opt/data/CNN_3D/OpenPCDet/ckpt/swint-nuimages-pretrained.pth # change 2: path to the pretrained weights
    NECK:
        NAME: GeneralizedLSSFPN
        IN_CHANNELS: [192, 384, 768]
        OUT_CHANNELS: 256
        START_LEVEL: 0
        END_LEVEL: -1
        NUM_OUTS: 3
    VTRANSFORM:
        NAME: DepthLSSTransform
        IMAGE_SIZE: [256, 704]
        IN_CHANNEL: 256
        OUT_CHANNEL: 80
        FEATURE_SIZE: [32, 88]
        XBOUND: [-54.0, 54.0, 0.3]
        YBOUND: [-54.0, 54.0, 0.3]
        ZBOUND: [-10.0, 10.0, 20.0]
        DBOUND: [1.0, 60.0, 0.5]
        DOWNSAMPLE: 2
    FUSER:
        NAME: ConvFuser
        IN_CHANNEL: 336
        OUT_CHANNEL: 256
    BACKBONE_2D:
        NAME: BaseBEVBackbone
        LAYER_NUMS: [5, 5]
        LAYER_STRIDES: [1, 2]
        NUM_FILTERS: [128, 256]
        UPSAMPLE_STRIDES: [1, 2]
        NUM_UPSAMPLE_FILTERS: [256, 256]
        USE_CONV_FOR_NO_STRIDE: True
    DENSE_HEAD:
        CLASS_AGNOSTIC: False
        NAME: TransFusionHead
        USE_BIAS_BEFORE_NORM: False
        NUM_PROPOSALS: 200
        HIDDEN_CHANNEL: 128
        NUM_CLASSES: 10
        NUM_HEADS: 8
        NMS_KERNEL_SIZE: 3
        FFN_CHANNEL: 256
        DROPOUT: 0.1
        BN_MOMENTUM: 0.1
        ACTIVATION: relu
        NUM_HM_CONV: 2
        SEPARATE_HEAD_CFG:
            HEAD_ORDER: ['center', 'height', 'dim', 'rot', 'vel']
            HEAD_DICT: {
                'center': {'out_channels': 2, 'num_conv': 2},
                'height': {'out_channels': 1, 'num_conv': 2},
                'dim': {'out_channels': 3, 'num_conv': 2},
                'rot': {'out_channels': 2, 'num_conv': 2},
                'vel': {'out_channels': 2, 'num_conv': 2},
            }
        TARGET_ASSIGNER_CONFIG:
            FEATURE_MAP_STRIDE: 8
            DATASET: nuScenes
            GAUSSIAN_OVERLAP: 0.1
            MIN_RADIUS: 2
            HUNGARIAN_ASSIGNER:
                cls_cost: {'gamma': 2.0, 'alpha': 0.25, 'weight': 0.15}
                reg_cost: {'weight': 0.25}
                iou_cost: {'weight': 0.25}
        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'bbox_weight': 0.25,
                'hm_weight': 1.0,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.2, 0.2]
            }
            LOSS_CLS:
                use_sigmoid: True
                gamma: 2.0
                alpha: 0.25
        POST_PROCESSING:
            SCORE_THRESH: 0.0
            POST_CENTER_RANGE: [-61.2, -61.2, -10.0, 61.2, 61.2, 10.0]
    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False
        EVAL_METRIC: kitti
OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 3
    NUM_EPOCHS: 6
    OPTIMIZER: adam_cosineanneal
    LR: 0.0001
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9
    BETAS: [0.9, 0.999]
    MOMS: [0.9, 0.8052631]
    PCT_START: 0.4
    WARMUP_ITER: 500
    DECAY_STEP_LIST: [35, 45]
    LR_WARMUP: False
    WARMUP_EPOCH: 1
    GRAD_NORM_CLIP: 35
    LOSS_SCALE_FP16: 32
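Several numbers in this config depend on one another, so it is worth re-checking them if you change the point cloud range or voxel size. The sketch below recomputes the grid sizes from values copied out of the config above. The relationships it checks (LiDAR and camera branches producing the same BEV resolution, and the fuser input being the sum of both branches' channels) are an interpretation of how the fields fit together, not something stated in the config itself; the helper names are hypothetical.

```python
# Values copied from the bevfusion.yaml listing above.
POINT_CLOUD_RANGE = [-54.0, -54.0, -5.0, 54.0, 54.0, 3.0]
VOXEL_SIZE = [0.075, 0.075, 0.2]
FEATURE_MAP_STRIDE = 8         # DENSE_HEAD.TARGET_ASSIGNER_CONFIG
XBOUND = [-54.0, 54.0, 0.3]    # VTRANSFORM
DBOUND = [1.0, 60.0, 0.5]      # VTRANSFORM
VTRANSFORM_DOWNSAMPLE = 2
VTRANSFORM_OUT_CHANNEL = 80    # camera BEV channels
NUM_BEV_FEATURES = 256         # MAP_TO_BEV, LiDAR BEV channels
FUSER_IN_CHANNEL = 336


def num_cells(lower, upper, step):
    """Number of cells of width `step` between two bounds (rounded to absorb float error)."""
    return round((upper - lower) / step)


def voxel_grid(pc_range, voxel_size):
    """Voxel counts along x, y, z for a [x0, y0, z0, x1, y1, z1] range."""
    return [num_cells(pc_range[i], pc_range[i + 3], voxel_size[i]) for i in range(3)]


grid = voxel_grid(POINT_CLOUD_RANGE, VOXEL_SIZE)
lidar_bev = grid[0] // FEATURE_MAP_STRIDE
camera_bev = num_cells(*XBOUND) // VTRANSFORM_DOWNSAMPLE
depth_bins = num_cells(*DBOUND)

print("voxel grid (x, y, z):", grid)
print("LiDAR BEV size:", lidar_bev, "| camera BEV size:", camera_bev)
print("depth bins:", depth_bins)
print("fuser channels match:",
      VTRANSFORM_OUT_CHANNEL + NUM_BEV_FEATURES == FUSER_IN_CHANNEL)
```

If either BEV size or the channel sum stops matching after an edit, the fuser is the first place to expect a shape error.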
2.3 Training the BEVFusion Model
Modify train.py; the main settings to change are the following.
Note: batch_size=3; epochs=6
The first run failed with this error:
Error in collate_batch: key=img_process_infos
...
Traceback (most recent call last):
File "/opt/conda/envs/pcdet/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
data = fetcher.fetch(index)
File "/opt/conda/envs/pcdet/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
return self.collate_fn(data)
File "/opt/data/CNN_3D/OpenPCDet/tools/../pcdet/datasets/dataset.py", line 322, in collate_batch
raise TypeError
TypeError
Cause:
The installed numpy version is too new and is incompatible with the older code.
Fix:
pip install numpy==1.23.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
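To catch this before starting a long training run, the installed version can be compared against the pin above. This is a minimal sketch: `is_known_good_numpy` is a hypothetical helper, and the 1.24 cutoff is an assumption, since the text above only confirms that 1.23.0 works.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '1.23.0' into (1, 23); everything after major.minor is ignored."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_known_good_numpy(version, first_bad=(1, 24)):
    """True if `version` is older than `first_bad` (an assumed cutoff)."""
    return version_tuple(version) < first_bad


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed")
    else:
        status = "ok" if is_known_good_numpy(installed) else "consider numpy==1.23.0"
        print(f"numpy {installed}: {status}")
```

Comparing integer tuples rather than strings matters here: as strings, "1.9" would sort after "1.23".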
Training finishes as shown below (I only ran 6 epochs, so the results are not good):
2.4 Evaluating the BEVFusion Model
Command from the official documentation:
bash scripts/dist_test.sh ${NUM_GPUS} --cfg_file cfgs/nuscenes_models/bevfusion.yaml \
    --ckpt …/output/cfgs/nuscenes_models/bevfusion/default/ckpt/checkpoint_epoch_6.pth
What I used instead:
Modify test.py as follows:
def parse_config():
    parser = argparse.ArgumentParser(description='arg parser')
    parser.add_argument('--cfg_file', type=str, default='/media/ll/L/llr/a2023_my_3d/OpenPCDet/tools/cfgs/nuscenes_models/bevfusion.yaml', help='specify the config for training')  # config file path
    parser.add_argument('--batch_size', type=int, default=1, required=False, help='batch size for training')  # reduce batch_size if GPU memory runs out
    parser.add_argument('--ckpt', type=str, default='/media/ll/L/llr/a2023_my_3d/OpenPCDet/ckpt/cbgs_bevfusion.pth', help='checkpoint to start from')  # checkpoint file
Notes:
Following the same steps, every model under nuscenes_models can be run in the same way.
I also ran the transfusion_lidar model, as shown below:
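Instead of hard-coding the --ckpt default, the newest checkpoint can be looked up in the training output folder. This is a minimal sketch, assuming checkpoints follow the checkpoint_epoch_N.pth naming seen in the official command; `latest_checkpoint` is a hypothetical helper, not part of OpenPCDet.

```python
import re
from pathlib import Path

# Matches names such as checkpoint_epoch_6.pth and captures the epoch number.
CKPT_PATTERN = re.compile(r"checkpoint_epoch_(\d+)\.pth$")


def latest_checkpoint(ckpt_dir):
    """Return the checkpoint path with the highest epoch number, or None."""
    best_epoch, best_path = -1, None
    for path in Path(ckpt_dir).glob("checkpoint_epoch_*.pth"):
        match = CKPT_PATTERN.search(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path
```

Epochs are compared as integers, so checkpoint_epoch_10.pth is correctly preferred over checkpoint_epoch_6.pth. The returned path can be passed as --ckpt to either command above.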