Setting up the mmdetection environment: configuring mmdetection for Faster R-CNN
The mmdetection environment is installed following the official documentation:
https://github.com/open-mmlab/mmdetection/blob/master/docs/get_started.md
Problems encountered:
In step 3 (Install mmcv-full), my CUDA version was 10.1 and my PyTorch version was 1.7.1, so I used this command:
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.1/index.html
This is actually wrong, but no error is raised: pip silently installs the latest mmcv-full instead, which is not what I wanted. I wanted a prebuilt wheel for CUDA 10.1 + PyTorch 1.7.1, but no such combination is published.
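The find-links URL encodes the CUDA and PyTorch versions, which is why an unpublished combination fails silently: pip just finds no wheel at that index and falls back to the latest release. A small sketch of how that URL is assembled (the helper name is my own; always check the mmcv install docs for which combinations actually exist):

```python
def mmcv_index_url(cuda: str, torch: str) -> str:
    """Build the openmmlab find-links URL for a CUDA/PyTorch combination.

    Note: pip raises no error if no wheel exists at this index; it silently
    falls back to another mmcv-full build, so verify the combination first.
    """
    # e.g. cuda="10.2" becomes "cu102"; the torch version is used verbatim
    cu = "cu" + cuda.replace(".", "")
    return f"https://download.openmmlab.com/mmcv/dist/{cu}/torch{torch}/index.html"

print(mmcv_index_url("10.2", "1.6.0"))
# https://download.openmmlab.com/mmcv/dist/cu102/torch1.6.0/index.html
```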
Installation steps
- conda create -n mmdetection python=3.7  # create the virtual environment "mmdetection"
- conda activate mmdetection / source activate mmdetection  # (on an Ubuntu server it may be the latter)
- git clone https://github.com/open-mmlab/mmdetection.git  # download mmdetection
- cd mmdetection  # if mmdetection is already downloaded, skip steps 3 and 4
- conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.2 -c pytorch
- pip install mmcv-full==1.3.8 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.6.0/index.html
- pip install -r requirements/build.txt
- pip install -v -e .
The last two steps must be run inside the mmdetection directory.
Dataset
mmdetection supports the VOC dataset format, the COCO dataset format, and custom formats; here we use the VOC format:
-VOC2007
------Annotations
------ImageSets
---------Main
------JPEGImages
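The directory layout above can be created with the standard library; a minimal sketch (the root path "data" is a placeholder, adjust it to wherever your dataset lives):

```python
import os

def make_voc_skeleton(root: str) -> None:
    """Create the empty VOC2007 directory layout under `root`."""
    for sub in ("Annotations", "ImageSets/Main", "JPEGImages"):
        os.makedirs(os.path.join(root, "VOC2007", sub), exist_ok=True)

make_voc_skeleton("data")  # placeholder root directory
```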
1. If the dataset has not been split yet, the code below splits it (it generates train.txt, test.txt, and val.txt under VOC2007/ImageSets/Main; if those txt files already exist, this step can be skipped). Reference: link
import os
import random

trainval_percent = 0.8  # fraction of all samples used for trainval (the rest is test)
train_percent = 0.8     # fraction of trainval used for train (the rest is val)

xmlfilepath = 'Annotations'
txtsavepath = 'ImageSets/Main'
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)  # avoid shadowing the built-in `list`
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
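With both ratios at 0.8, a hypothetical dataset of 100 annotation files splits as follows (pure arithmetic, mirroring the tv/tr computation in the script above):

```python
num = 100                # hypothetical number of .xml annotation files
trainval_percent = 0.8
train_percent = 0.8

tv = int(num * trainval_percent)  # trainval size
tr = int(tv * train_percent)      # train size
# trainval=80, train=64, val=16, test=20
print(tv, tr, tv - tr, num - tv)
```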
Modifying the configuration file
Reference post: https://zhuanlan.zhihu.com/p/162730118
- cd mmdetection/configs/_base_/datasets
Open voc0712.py and modify it as follows; in the listing below, the commented-out lines show the original values and the lines beneath them are the edits:
# dataset settings
dataset_type = 'VOCDataset'
# data_root = 'data/VOCdevkit/'
data_root = '/data1/lirz/DATA/data_board_pascal/zyc-voc/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    # dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
    dict(type='Resize', img_scale=(640, 512), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        # img_scale=(1000, 600),
        img_scale=(640, 512),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=1,
    # workers_per_gpu=2,
    workers_per_gpu=1,
    train=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type=dataset_type,
            ann_file=[
                data_root + 'VOC2007/ImageSets/Main/trainval.txt',
                # data_root + 'VOC2012/ImageSets/Main/trainval.txt'
            ],
            # img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2012/'],
            img_prefix=[data_root + 'VOC2007/'],
            pipeline=train_pipeline)),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='mAP')
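As a sanity check on the Pad step: with size_divisor=32, each image side is rounded up to the next multiple of 32, so the (640, 512) scale chosen here passes through without padding. A minimal sketch of that rounding (the helper name is my own, not an mmcv API):

```python
def pad_to_divisor(size: int, divisor: int = 32) -> int:
    """Round `size` up to the nearest multiple of `divisor`,
    mirroring what Pad(size_divisor=32) does to each image side."""
    return (size + divisor - 1) // divisor * divisor

print(pad_to_divisor(640), pad_to_divisor(512))  # 640 512 -- already multiples of 32
print(pad_to_divisor(600))                       # 608 -- a 600-pixel side gets padded
```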