First, edit the config file, config/nanodet_custom_xml_dataset.yml.
The main items to change are:
class_names: &class_names ['aeroplane','bicycle','bird','boat','bottle','bus','car','cat','chair','cow','diningtable',
'dog','horse','motorbike','person','pottedplant','sheep','sofa','train','tvmonitor']
num_classes: 20
img_path: /media/sf_D_DRIVE/tmp/data/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/JPEGImages
ann_path: /media/sf_D_DRIVE/tmp/data/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/Annotations
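class_names and num_classes have to agree with what is actually in the XML annotations. As a rough sanity check, the sketch below collects every object name found in the annotation folder. The helper name collect_class_names is my own and not part of NanoDet, and it assumes the standard VOC layout where each object element has a direct name child.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_class_names(ann_dir):
    """Return the sorted object names found in the VOC-style XML files under ann_dir.

    Hypothetical helper for illustration; not part of NanoDet.
    """
    names = set()
    for xml_file in Path(ann_dir).glob("*.xml"):
        root = ET.parse(str(xml_file)).getroot()
        for obj in root.iter("object"):
            # findtext only looks at direct children, so <part><name> is skipped
            name = obj.findtext("name")
            if name:
                names.add(name.strip())
    return sorted(names)
```

Calling collect_class_names on the ann_path above and printing the result together with its length gives the values to compare against class_names and num_classes.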
For reference, the complete file:
#Config File example
save_dir: workspace/nanodet_m
model:
  weight_averager:
    name: ExpMovingAverager
    decay: 0.9998
  arch:
    name: NanoDetPlus
    detach_epoch: 10
    backbone:
      name: ShuffleNetV2
      model_size: 1.0x
      out_stages: [2,3,4]
      activation: LeakyReLU
    fpn:
      name: GhostPAN
      in_channels: [116, 232, 464]
      out_channels: 96
      kernel_size: 5
      num_extra_level: 1
      use_depthwise: True
      activation: LeakyReLU
    head:
      name: NanoDetPlusHead
      num_classes: 20
      input_channel: 96
      feat_channels: 96
      stacked_convs: 2
      kernel_size: 5
      strides: [8, 16, 32, 64]
      activation: LeakyReLU
      reg_max: 7
      norm_cfg:
        type: BN
      loss:
        loss_qfl:
          name: QualityFocalLoss
          use_sigmoid: True
          beta: 2.0
          loss_weight: 1.0
        loss_dfl:
          name: DistributionFocalLoss
          loss_weight: 0.25
        loss_bbox:
          name: GIoULoss
          loss_weight: 2.0
    # Auxiliary head, only use in training time.
    aux_head:
      name: SimpleConvHead
      num_classes: 20
      input_channel: 192
      feat_channels: 192
      stacked_convs: 4
      strides: [8, 16, 32, 64]
      activation: LeakyReLU
      reg_max: 7
class_names: &class_names ['aeroplane','bicycle','bird','boat','bottle','bus','car','cat','chair','cow','diningtable',
  'dog','horse','motorbike','person','pottedplant','sheep','sofa','train','tvmonitor']
data:
  train:
    name: XMLDataset
    class_names: *class_names
    img_path: /media/sf_D_DRIVE/tmp/data/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/JPEGImages
    ann_path: /media/sf_D_DRIVE/tmp/data/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/Annotations
    input_size: [320,320] #[w,h]
    keep_ratio: True
    pipeline:
      perspective: 0.0
      scale: [0.6, 1.4]
      stretch: [[1, 1], [1, 1]]
      rotation: 0
      shear: 0
      translate: 0.2
      flip: 0.5
      brightness: 0.2
      contrast: [0.8, 1.2]
      saturation: [0.8, 1.2]
      normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
  val:
    name: XMLDataset
    class_names: *class_names
    img_path: /media/sf_D_DRIVE/tmp/data/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/JPEGImages #Please fill in val image path
    ann_path: /media/sf_D_DRIVE/tmp/data/VOCtrainval_11-May-2012/VOCdevkit/VOC2012/Annotations #Please fill in val xml path
    input_size: [320,320] #[w,h]
    keep_ratio: True
    pipeline:
      normalize: [[103.53, 116.28, 123.675], [57.375, 57.12, 58.395]]
device:
  gpu_ids: [0] # Set like [0, 1, 2, 3] if you have multi-GPUs
  workers_per_gpu: 8
  batchsize_per_gpu: 96
schedule:
  # resume:
  # load_model: YOUR_MODEL_PATH
  optimizer:
    name: AdamW
    lr: 0.001
    weight_decay: 0.05
  warmup:
    name: linear
    steps: 500
    ratio: 0.0001
  total_epochs: 300
  lr_schedule:
    name: CosineAnnealingLR
    T_max: 300
    eta_min: 0.00005
  val_intervals: 10
grad_clip: 35
evaluator:
  name: CocoDetectionEvaluator
  save_key: mAP
log:
  interval: 10
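To get a feel for what the lr, T_max and eta_min values in the schedule mean, here is a small sketch of the closed-form cosine annealing curve. It assumes the CosineAnnealingLR entry behaves like the PyTorch scheduler of the same name, and it ignores the 500-step linear warmup. The function name cosine_lr is only for illustration.

```python
import math


def cosine_lr(epoch, base_lr=0.001, eta_min=0.00005, t_max=300):
    """Closed-form cosine annealing from base_lr down to eta_min over t_max epochs.

    Defaults are copied from the config above; warmup is not modeled.
    """
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max)) / 2


# Print the learning rate at the start, the middle and the end of training
for epoch in (0, 150, 300):
    print(epoch, f"{cosine_lr(epoch):.6f}")
```

With these settings the rate starts at 0.001, is about 0.000525 halfway through, and ends at 0.00005.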
Start training:
export PYTHONPATH=.
python tools/train.py config/nanodet_custom_xml_dataset.yml
In the VOC annotation file Annotations/2011_003353.xml, ymin is stored as a float, so the code fails at runtime with:
ValueError: invalid literal for int() with base 10:
I did not change the code; I simply edited the XML and replaced 45.7 with 45.
2011_006777.xml has the same problem.
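Instead of waiting for the next crash, the affected files can be found in one pass. The sketch below is my own helper, not part of NanoDet: it scans every bndbox coordinate, reports the files where int() would fail, and, when dry_run is False, truncates the value the same way as the manual edit (45.7 becomes 45). It assumes VOC-style xmin/ymin/xmax/ymax tags.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

BOX_TAGS = ("xmin", "ymin", "xmax", "ymax")


def fix_float_boxes(ann_dir, dry_run=True):
    """Return names of XML files with non-integer box coordinates.

    Hypothetical helper. With dry_run=False the files are rewritten in place
    with the coordinates truncated to integers.
    """
    changed = []
    for xml_file in sorted(Path(ann_dir).glob("*.xml")):
        tree = ET.parse(str(xml_file))
        dirty = False
        for box in tree.getroot().iter("bndbox"):
            for tag in BOX_TAGS:
                node = box.find(tag)
                if node is None or node.text is None:
                    continue
                text = node.text.strip()
                try:
                    int(text)  # the same conversion that raised the ValueError
                except ValueError:
                    node.text = str(int(float(text)))  # truncate, e.g. 45.7 -> 45
                    dirty = True
        if dirty:
            changed.append(xml_file.name)
            if not dry_run:
                tree.write(str(xml_file), encoding="utf-8")
    return changed
```

Running it first with the default dry_run=True only lists the files, which is a safe way to see how many annotations are affected before rewriting anything.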
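Of course, the config above is rather crude: it uses all of the data for both training and validation, which is clearly not reasonable.
A simple fix is to move part of the annotations into another folder, for example: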
mv *9.xml ../test/Annotations
Then point the validation ann_path at the newly created folder.
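The same split can be done from Python, which also creates the target folder if it does not exist yet. This is only a sketch of the mv command above; the function name move_val_annotations and the suffix parameter are my own.

```python
import shutil
from pathlib import Path


def move_val_annotations(src_dir, dst_dir, suffix="9"):
    """Move every XML file whose name ends in `suffix`.xml from src_dir to dst_dir.

    Hypothetical helper mirroring `mv *9.xml ../test/Annotations`.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)  # create the validation folder if needed
    moved = []
    for xml_file in sorted(Path(src_dir).glob(f"*{suffix}.xml")):
        shutil.move(str(xml_file), str(dst / xml_file.name))
        moved.append(xml_file.name)
    return moved
```

The returned list shows how many files ended up in the validation folder, which is a quick way to check that the split is about the size you wanted.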