07 MMDetection Hands-On Code Practice
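Date recorded: June 9, 2023
Environment Check
Because the virtual machine could not use the host machine's GPU, the environment for this session was set up under WSL2.
# Check the CUDA version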
(pytorch) yyy-username@DESKTOP-V8QQE2T:~/mmdetection$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
# Check the gcc version
(pytorch) yyy-username@DESKTOP-V8QQE2T:~/mmdetection$ gcc --version
gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# Download the mmdetection repository
git clone https://github.com/open-mmlab/mmdetection.git -b 3.x
Cloning into 'mmdetection'...
remote: Enumerating objects: 36544, done.
remote: Counting objects: 100% (1195/1195), done.
remote: Compressing objects: 100% (661/661), done.
remote: Total 36544 (delta 659), reused 889 (delta 520), pack-reused 35349
Receiving objects: 100% (36544/36544), 56.83 MiB | 3.70 MiB/s, done.
Resolving deltas: 100% (25594/25594), done.
# Install the mmdetection library
(pytorch) yyy-username@DESKTOP-V8QQE2T:~$ cd mmdetection/
(pytorch) yyy-username@DESKTOP-V8QQE2T:~/mmdetection$ pip install -e .
Obtaining file:///home/yyy-username/mmdetection
Preparing metadata (setup.py) ... done
Successfully built pycocotools
Installing collected packages: terminaltables, shapely, scipy, pycocotools, mmdet
Running setup.py develop for mmdet
Successfully installed mmdet-3.0.0 pycocotools-2.0.6 scipy-1.10.1 shapely-2.0.1 terminaltables-3.1.10
Writing an environment check script
import torch, torchvision
print('PyTorch version', torch.__version__)
print('CUDA available', torch.cuda.is_available())
import mmcv
from mmcv.ops import get_compiling_cuda_version, get_compiler_version
print('MMCV version', mmcv.__version__)
print('CUDA version', get_compiling_cuda_version())
print('Compiler version', get_compiler_version())
import mmdet
print('MMDetection version', mmdet.__version__)
# import mmpose
# print('mmpose version', mmpose.__version__)
from mmengine.utils import get_git_hash
from mmengine.utils.dl_utils import collect_env as collect_base_env
def collect_env():
    # Extend MMEngine's base environment info with the MMDetection version and git hash
    env_info = collect_base_env()
    env_info['MMDetection'] = f'{mmdet.__version__}+{get_git_hash()[:7]}'
    return env_info
if __name__ == "__main__":
    for name, val in collect_env().items():
        print(f'{name}:{val}')
The environment output is as follows
PyTorch version 1.13.0
CUDA available True
MMCV version 2.0.0
CUDA version 11.7
Compiler version GCC 9.3
MMDetection version 3.0.0
sys.platform:linux
Python:3.9.16 (main, Mar 8 2023, 14:00:05) [GCC 11.2.0]
CUDA available:True
numpy_random_seed:2147483648
GPU 0:NVIDIA GeForce GTX 1660 SUPER
CUDA_HOME:/usr/local/cuda
NVCC:Cuda compilation tools, release 11.7, V11.7.99
GCC:gcc (Ubuntu 11.3.0-1ubuntu1~22.04.1) 11.3.0
PyTorch:1.13.0
PyTorch compiling details:PyTorch built with:
- GCC 9.3
- C++ Version: 201402
- Intel(R) oneAPI Math Kernel Library Version 2023.1-Product Build 20230303 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: AVX2
- CUDA Runtime 11.7
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
- CuDNN 8.5
- Magma 2.6.1
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wunused-local-typedefs -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,
TorchVision:0.14.0
OpenCV:4.7.0
MMEngine:0.7.3
MMDetection:3.0.0+
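As a quick cross-check of the logs above, the CUDA release reported by nvcc (11.7) matches the CUDA runtime that PyTorch was built with (11.7). The sketch below shows one possible way to pull the release number out of the nvcc banner so that the comparison can be scripted. The helper name parse_nvcc_release and the hard-coded strings are assumptions made for illustration; they are not part of MMDetection or MMEngine.

```python
import re


def parse_nvcc_release(banner: str) -> str:
    """Return the 'major.minor' CUDA release found in `nvcc -V` output (hypothetical helper)."""
    match = re.search(r"release (\d+\.\d+)", banner)
    if match is None:
        raise ValueError("no CUDA release found in nvcc output")
    return match.group(1)


# Strings copied from the logs in this post.
nvcc_banner = "Cuda compilation tools, release 11.7, V11.7.99"
torch_cuda = "11.7"  # from 'CUDA Runtime 11.7' in the PyTorch build details

nvcc_cuda = parse_nvcc_release(nvcc_banner)
versions_match = nvcc_cuda == torch_cuda
print(f"nvcc CUDA {nvcc_cuda}, PyTorch CUDA {torch_cuda}, match: {versions_match}")
```

In a live environment the second string could be read from torch.version.cuda instead of being typed in by hand.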
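Visualization
Because images cannot be displayed when running from the command line, the visualization step has not been run yet. I will run it tomorrow after setting up a JupyterLab environment.
Writing the Config File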
_base_ = '../configs/rtmdet/rtmdet_tiny_8xb32-300e_coco.py'
data_root = '../data/cat_dataset/'
# Very important
metainfo = {
    # Class names. Note that classes must be a tuple, so even for a single class
    # you should write `'cat',` with the trailing comma. Many beginners get this wrong.
    'classes': ('cat',),
    'palette': [
        (220, 20, 60),
    ]
}
num_classes = 1
# Train for 40 epochs
max_epochs = 40
# Single-GPU training batch size = 12
train_batch_size_per_gpu = 12
# Adjust to suit your own machine
train_num_workers = 4
# Validation batch size is 1
val_batch_size_per_gpu = 1
val_num_workers = 2
# RTMDet training is split into 2 stages; the second stage switches the data augmentation pipeline
num_epochs_stage2 = 5
# The batch size changed, so the learning rate must change too; 0.004 is the learning rate for 8 GPUs x 32
base_lr = 12 * 0.004 / (32*8)
# Use COCO pretrained weights
load_from = 'https://download.openmmlab.com/mmdetection/v3.0/rtmdet/rtmdet_tiny_8xb32-300e_coco/rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth' # noqa
model = dict(
    # Because the dataset is very small and training is short, the backbone is frozen completely
    # Your own dataset may need the backbone unfrozen
    backbone=dict(frozen_stages=4),
    # Don't forget to change num_classes
    bbox_head=dict(num_classes=num_classes))
# Different datasets need different dataset arguments
train_dataloader = dict(
    batch_size=train_batch_size_per_gpu,
    num_workers=train_num_workers,
    pin_memory=False,
    dataset=dict(
        data_root=data_root,
        metainfo=metainfo,
        ann_file='annotations/trainval.json',
        data_prefix=dict(img='images/')))
val_dataloader = dict(
    batch_size=val_batch_size_per_gpu,
    num_workers=val_num_workers,
    dataset=dict(
        metainfo=metainfo,
        data_root=data_root,
        ann_file='annotations/test.json',
        data_prefix=dict(img='images/')))
test_dataloader = val_dataloader
# The default learning rate scheduler warms up for 1000 iterations, but the cat dataset is too small, so this is changed to 30 iterations
param_scheduler = [
    dict(
        type='LinearLR',
        start_factor=1.0e-5,
        by_epoch=False,
        begin=0,
        end=30),
    dict(
        type='CosineAnnealingLR',
        eta_min=base_lr * 0.05,
        begin=max_epochs // 2,  # max_epochs has changed as well
        end=max_epochs,
        T_max=max_epochs // 2,
        by_epoch=True,
        convert_to_iter_based=True),
]
optim_wrapper = dict(optimizer=dict(lr=base_lr))
# The epoch at which the second stage switches the pipeline has changed as well
_base_.custom_hooks[1].switch_epoch = max_epochs - num_epochs_stage2
val_evaluator = dict(ann_file=data_root + 'annotations/test.json')
test_evaluator = val_evaluator
# Changes to some logging settings
default_hooks = dict(
    checkpoint=dict(interval=10, max_keep_ckpts=2, save_best='auto'),  # also save the best-performing weights
    logger=dict(type='LoggerHook', interval=5))
train_cfg = dict(max_epochs=max_epochs, val_interval=10)
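The derived values in this config can be checked by hand. The sketch below uses only plain Python and makes no MMDetection calls. It reproduces the linear learning-rate scaling, the epoch at which the stage-2 pipeline switch happens, the epoch at which cosine annealing begins, and the tuple check behind the trailing-comma warning in metainfo. The function name scale_lr is a hypothetical helper introduced for illustration.

```python
def scale_lr(ref_lr: float, ref_batch: int, batch: int) -> float:
    """Linearly scale a reference learning rate to a new total batch size."""
    return ref_lr * batch / ref_batch


# Reference setting from the base config: 8 GPUs x 32 images per GPU, lr 0.004.
ref_total_batch = 8 * 32
base_lr = scale_lr(0.004, ref_total_batch, 12)  # single GPU, batch size 12
print(f"base_lr = {base_lr:.7f}")

max_epochs = 40
num_epochs_stage2 = 5
switch_epoch = max_epochs - num_epochs_stage2  # stage 2 pipeline starts here
cosine_begin = max_epochs // 2                 # cosine annealing starts here
eta_min = base_lr * 0.05                       # floor of the cosine schedule
print(f"switch_epoch = {switch_epoch}, cosine_begin = {cosine_begin}")

# A one-element tuple needs the trailing comma; without it the value is a str.
wrong_classes = ('cat')
right_classes = ('cat',)
print(type(wrong_classes).__name__, type(right_classes).__name__)
```

With the numbers used in this post, the scaled learning rate works out to 0.0001875, the pipeline switch lands on epoch 35, and cosine annealing covers epochs 20 to 40.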
Model Training and Inference
Training command
python tools/train.py configs.py
Training results
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.837
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.977
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.911
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.837
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.820
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.893
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.893
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.893
06/09 22:30:23 - mmengine - INFO - bbox_mAP_copypaste: 0.837 0.977 0.911 -1.000 -1.000 0.837
06/09 22:30:23 - mmengine - INFO - Epoch(val) [40][28/28] coco/bbox_mAP: 0.8370 coco/bbox_mAP_50: 0.9770 coco/bbox_mAP_75: 0.9110 coco/bbox_mAP_s: -1.0000 coco/bbox_mAP_m: -1.0000 coco/bbox_mAP_l: 0.8370 data_time: 0.0300 time: 0.0501
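The -1.000 entries in the small and medium rows are the value pycocotools reports when no ground-truth boxes fall in that size range. That is consistent with the all and large rows being identical here (0.837 AP, 0.893 AR). The COCO convention splits objects by area at 32x32 and 96x96 pixels. A minimal sketch of that bucketing follows. The function name coco_area_bucket and the example box sizes are assumptions made for illustration, and the sketch uses box width times height as the area, whereas COCO evaluation reads the area field of each annotation and handles the boundary values slightly differently.

```python
def coco_area_bucket(width: float, height: float) -> str:
    """Classify a box by area using the COCO small/medium/large thresholds."""
    area = width * height
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


# Hypothetical box sizes in pixels, not taken from the cat dataset.
for w, h in [(20, 20), (64, 48), (300, 200)]:
    print((w, h), coco_area_bucket(w, h))
```

If every annotated cat in the test split covers more than 96x96 pixels, all ground truth lands in the large bucket, which would produce exactly this pattern of -1.000 values.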
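New Trends in Detection
Open-Vocabulary Object Detection
Given an image and a vocabulary of category names, detect all objects in the image.
Grounding Object Detection
Given an image and a text description, predict where the objects mentioned in the text are located in the image.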
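Summary
I managed to train on the dataset by copying the tutorial's configuration step by step, but I still do not really know how to modify a config myself, so I will keep studying how configs are modified.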