Reference: download the source code and weights
1> Edit classname_to_id = {"name1": 1, "name2": 2}
as well as labelme_path and saved_coco_path
After running, the following folders are generated:
coco
——annotations
————instances_train2017.json
————instances_val2017.json
——images
————train2017
————val2017
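The layout above can also be created ahead of time with a few lines of Python. This is only a minimal sketch and not part of the conversion script; the function name make_coco_layout and its root argument are made up for illustration.

```python
from pathlib import Path

# Sub-directories of the layout shown above.
COCO_SUBDIRS = (
    "annotations",
    "images/train2017",
    "images/val2017",
)

def make_coco_layout(root):
    """Create the coco/ folder tree under `root` and return the created paths."""
    base = Path(root) / "coco"
    created = []
    for sub in COCO_SUBDIRS:
        path = base / sub
        path.mkdir(parents=True, exist_ok=True)  # safe to call repeatedly
        created.append(path)
    return created
```

The two instances_*.json files then go into coco/annotations, and the images into the matching train2017 / val2017 folders.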
2> Edit the configuration file config.py
COCO_CLASSES = ('name1', 'name2')
COCO_LABEL_MAP={1:1,2:2}
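train_images and valid_images in dataset_base
num_classes in coco_base_config: number of classes + 1 (the extra one is the background class)
max_iter in yolact_base_config: sets the number of training iterations
path under backbones: change it to the weight file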
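The class-related values in step 2 have to agree with each other. A minimal sketch of deriving them from a single tuple, assuming the category ids in the annotation files run from 1 to N as in the classname_to_id example of step 1 (NUM_CLASSES is just an illustrative name for the value that goes into num_classes):

```python
# Class names, in the same order as the ids used in classname_to_id.
COCO_CLASSES = ('name1', 'name2')

# Map annotation category id -> training label (both 1-based here).
COCO_LABEL_MAP = {i: i for i in range(1, len(COCO_CLASSES) + 1)}

# The network also predicts a background class, hence the +1.
NUM_CLASSES = len(COCO_CLASSES) + 1
```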
3> Start training
python train.py --config=yolact_base_config --batch_size=4 --save_interval=10000
Resume after an interruption
python train.py --config=yolact_base_config --batch_size=4 --resume=weights/yolact_base_x_x_interrupt.pth --save_interval=10000
4> Evaluate
python eval.py --trained_model=./xxx.pth --image=./xx.jpg
Runtime issues
1>size mismatch
In yolact.py, right below state_dict = torch.load(path), call state_dict.pop() on every weight/bias key named in the error
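Instead of listing the keys by hand, one option is to drop every checkpoint entry whose shape differs from the model's. A minimal sketch, assuming both dictionaries map parameter names to objects with a .shape attribute (as torch tensors have); drop_mismatched is a hypothetical helper, not part of yolact.py:

```python
def drop_mismatched(state_dict, model_state):
    """Remove entries of `state_dict` whose shape differs from `model_state`.

    Returns the list of removed keys so they can be logged.
    """
    removed = []
    for key in list(state_dict.keys()):  # copy keys, we mutate the dict
        if key in model_state and tuple(state_dict[key].shape) != tuple(model_state[key].shape):
            state_dict.pop(key)
            removed.append(key)
    return removed
```

In yolact.py this could be called as drop_mismatched(state_dict, self.state_dict()) before the weights are loaded. The popped keys will then be reported as missing, so the load_state_dict call may need strict=False.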
2> Expected a 'cuda' device type for generator but found 'cpu'
In train.py, add generator=torch.Generator(device='cuda') to the data.DataLoader() call
3> File "mtrand.pyx", in numpy.random.mtrand.RandomState.choice
ValueError: setting an array element with a sequence
Add import random to utils/augmentations.py
4>left = random.uniform(width - w):
uniform() missing 1 required positional argument: 'b'
Add 0 as the lower-bound argument: random.uniform(0, width - w)
5> random.randint(2):
randint() missing 1 required positional argument: 'b'
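Add 0 as the lower-bound argument. The standard-library randint includes its upper bound, so random.randint(0, 1) keeps the 50/50 behavior that numpy's randint(2) had.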
6>min_iou,max_iou=mode
not enough values to unpack(expected 2,got 1)
Change the code below mode = random.choices(self.sample_options) to:
if mode == [None]:
    return image, masks, boxes, labels
if mode is not None:
    min_iou, max_iou = mode[0][0], mode[0][1]
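Issues 4 to 6 share one cause: once import random is added, the name random refers to Python's standard library rather than numpy.random, and the two differ in their signatures. The sketch below uses only the standard library and shows forms that work; width, w and sample_options are placeholder values for illustration, not the ones from augmentations.py.

```python
import random

width, w = 550, 300  # placeholder sizes

# uniform needs both bounds: uniform(a, b).
left = random.uniform(0, width - w)

# randint(a, b) includes b, so (0, 1) gives a fair coin flip.
flip = random.randint(0, 1)

# choice returns one element; choices returns a list of k elements (k=1 by default).
sample_options = (None, (0.1, None), (0.3, None))
mode = random.choice(sample_options)
modes = random.choices(sample_options)

if mode is not None:
    min_iou, max_iou = mode  # unpacks cleanly because mode is a 2-tuple
```

With random.choice the result is a single option, so it can be compared against None and unpacked directly, without the mode[0] indexing that random.choices requires.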
7>RuntimeError: CUDA out of memory
Reduce batch_size
yolact++
Download the yolact++ weights and the DCNv2 source, and overwrite yolact/external/DCNv2 with the latter
Point CUDA_HOME at the local CUDA installation
export CUDA_HOME=/home/usr/local/cuda-x.x
Verify that CUDA is available in the current environment
nvcc -V
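The same two checks can be run from Python before building. A minimal sketch using only the standard library; cuda_env_report is a hypothetical helper name:

```python
import os
import shutil

def cuda_env_report():
    """Report whether CUDA_HOME points to an existing directory and whether nvcc is on PATH."""
    cuda_home = os.environ.get("CUDA_HOME")
    return {
        "cuda_home": cuda_home,
        "cuda_home_exists": bool(cuda_home) and os.path.isdir(cuda_home),
        "nvcc_path": shutil.which("nvcc"),  # None if nvcc is not found
    }

if __name__ == "__main__":
    for key, value in cuda_env_report().items():
        print(f"{key}: {value}")
```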
Build DCNv2
python ./external/DCNv2/setup.py build develop
Verify the build
python ./external/DCNv2/testcuda.py
If it prints "Backward is not reentrant" or runs normally, the build succeeded
Edit ./data/config.py
Add under DATASETS:
coco2017_custom_dataset = dataset_base.copy({
    'name': 'COCO 2017',
    'train_info': 'xx.json',
    'valid_info': 'xx.json',
    'class_names': ('xx', 'yy', 'zz')
})
Add under YOLACT++ CONFIGS:
yolact_coco_custom_config = yolact_plus_resnet50_config.copy({
    'name': 'yolact_coco_custom',
    'path': 'xx.pth',
    'dataset': coco2017_custom_dataset,
    'num_classes': len(coco2017_custom_dataset.class_names) + 1,
    'max_size': 550,
    'lr_steps': (280000, 600000, 700000, 750000),
    'max_iter': 300000,
})
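Three of the four lr_steps values above are larger than max_iter. Assuming a learning-rate step only takes effect when training actually reaches that iteration, a quick check like the following shows which steps would apply; active_lr_steps is a hypothetical helper, not part of YOLACT:

```python
def active_lr_steps(lr_steps, max_iter):
    """Return the learning-rate steps that fall before max_iter."""
    return tuple(step for step in lr_steps if step < max_iter)

# Values taken from the config above.
steps = active_lr_steps((280000, 600000, 700000, 750000), 300000)
print(steps)
```

If more than one decay step is wanted within a shortened schedule, lr_steps would need to be scaled down together with max_iter.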
In resnet101_backbone.copy, change path to the name of the weight file
Start training
python train.py --config=yolact_coco_custom_config
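Runtime issues
1> not compiled with GPU support; CUDNN_STATUS_NOT_INITIALIZED
torch, torchvision and torchaudio versions generally correspond as 1.x.0 / 0.(x+1).0 / 0.x.0
Make sure torch and CUDA are both installed and that their versions match (when reinstalling torch, install the GPU build for your CUDA version; installing from a whl file is recommended), then delete the previously compiled build directory, rebuild, and test again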
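The version rule of thumb above can be written down as a small helper. This is a sketch of that rule only, not an official compatibility table: it covers torch 1.x, always returns patch version 0, and expected_companions is a made-up name. The actual matching releases should still be checked against the PyTorch installation pages.

```python
def expected_companions(torch_version):
    """Apply the 1.x.0 -> 0.(x+1).0 / 0.x.0 rule of thumb to a torch 1.x version string."""
    # Strip a local build suffix such as "+cu113" before splitting.
    major, minor = torch_version.split("+")[0].split(".")[:2]
    if major != "1":
        raise ValueError("the rule of thumb only covers torch 1.x")
    x = int(minor)
    return {"torchvision": f"0.{x + 1}.0", "torchaudio": f"0.{x}.0"}
```

For example, expected_companions("1.8.0") gives torchvision 0.9.0 and torchaudio 0.8.0.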
2>_tkinter.cpython-38-x86_64-linux-gnu.so
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libpython3.x.so
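3> Training stops right after it starts
The transfer-learning setup did not take effect, so the script considers training already finished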