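Reproducing YOLOX for Beginners: A Step-by-Step Guide
I. Reproduction steps
Environment used in this post: CUDA 11.3, single GPU
Setting up a deep learning environment from scratch: see the post "Deep learning environment setup from scratch"
Beginner tutorials:
Detailed tutorial 1
Beginner tutorial 2
Datasets used in this post: COCO2017 and a custom UAV dataset (annotated with labelme), both converted to VOC format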
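II. Problems encountered (in the tutorials and during reproduction)
1. Do not install the default requirements directly into a fresh environment
What went wrong
pip install -r requirements.txt
Running this command in an environment that has no matching torch and torchvision yet leads to many version incompatibilities.
Recommended installation
(1) Install torch and torchvision from the official website, using the build that matches your CUDA version.
PyTorch official website
(2) Install the remaining packages one at a time
pip install <package> -i <mirror URL>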
pip install numpy -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install opencv_python -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install loguru -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install scikit-image -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install Pillow -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install thop -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install ninja -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install tabulate -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install tensorboard -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install tqdm -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pycocotools==2.0.2 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install onnxruntime==1.8.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install onnx==1.8.1 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install onnx-simplifier==0.3.5 -i https://pypi.tuna.tsinghua.edu.cn/simple
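Once torch and torchvision from step (1) are in place, a quick check can confirm that the installed build matches the local CUDA setup. The following is a minimal sketch: the helper name `describe_torch_env` is made up for illustration, and it only reads attributes that PyTorch exposes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`).

```python
def describe_torch_env():
    """Summarize the installed torch/torchvision build (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment yet
        return {"installed": False}

    info = {
        "installed": True,
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }
    try:
        import torchvision
        info["torchvision"] = torchvision.__version__
    except Exception as exc:  # missing or mismatched torchvision
        info["torchvision"] = "import failed: {}".format(exc)
    return info


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print("{}: {}".format(key, value))
```

For the environment in this post, `built_for_cuda` would be expected to read 11.3 and `cuda_available` to be True; anything else suggests the wrong wheel was installed.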
3. Type error when training on a custom dataset (labelme-to-voc)
invalid literal for int() with base 10
2022-07-08 15:42:22 | INFO | yolox.core.trainer:261 - epoch: 10/300, iter: 320/325, mem: 4909Mb, iter_time: 0.203s, data_time: 0.004s, total_loss: 1.6, iou_loss: 0.9, l1_loss: 0.0, conf_loss: 0.4, cls_loss: 0.3, lr: 6.235e-04, size: 512, ETA: 5:28:02
2022-07-08 15:42:23 | INFO | yolox.core.trainer:356 - Save weights to ./YOLOX_outputs/VOC0707
100%|##########| 325/325 [00:21<00:00, 14.80it/s]
2022-07-08 15:42:46 | INFO | yolox.evaluators.voc_evaluator:160 - Evaluate in main process...
Writing 0 VOC results file
Eval IoU : 0.50
2022-07-08 15:42:48 | INFO | yolox.core.trainer:196 - Training of experiment is done and the best AP is 0.00
2022-07-08 15:42:48 | ERROR | yolox.core.launch:147 - An error has been caught in function '_distributed_worker', process 'SpawnProcess-1' (1800038), thread 'MainThread' (139871294735552):
Traceback (most recent call last):
...
File "/home/sunxue/anaconda3/envs/py36-new/lib/python3.6/site-packages/yolox-0.3.0-py3.6-linux-x86_64.egg/yolox/evaluators/voc_eval.py", line 27, in parse_rec
int(bbox.find("xmin").text),
│ └ <method 'find' of 'xml.etree.ElementTree.Element' objects>
└ <Element 'bndbox' at 0x7f352dfa1b88>
ValueError: invalid literal for int() with base 10: '922.5190839694656'
The labels converted from labelme store box coordinates as floats, so the source code has to be modified:
yolox/evaluators/voc_eval.py, line 27
Change to:
int(float(bbox.find("xmin").text)),
int(float(bbox.find("ymin").text)),
int(float(bbox.find("xmax").text)),
int(float(bbox.find("ymax").text)),
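The effect of this change can be seen on a small standalone example. The sketch below parses a `bndbox` element that contains float strings the same way the patched lines do; `parse_bbox` is an illustrative name, not a YOLOX function. Note that `int(float(...))` truncates toward zero rather than rounding.

```python
import xml.etree.ElementTree as ET


def parse_bbox(bndbox):
    """Read xmin, ymin, xmax, ymax as ints, tolerating float strings."""
    tags = ("xmin", "ymin", "xmax", "ymax")
    return [int(float(bndbox.find(tag).text)) for tag in tags]


sample = ET.fromstring(
    "<bndbox><xmin>922.5190839694656</xmin><ymin>10.0</ymin>"
    "<xmax>1000.9</xmax><ymax>50</ymax></bndbox>"
)
print(parse_bbox(sample))  # [922, 10, 1000, 50]
```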
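4. KeyError: 'airplane' (a custom class name)
YOLOX needs to be reinstalled so that the installed package picks up the modified source (the traceback above shows the code running from the egg in site-packages):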
python setup.py install
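A KeyError on a class name typically means the code looked up a label that is missing from the class list it loaded. Before reinstalling, it can help to check which names the annotations actually use. The sketch below assumes a VOC-style `Annotations` folder of XML files; both function names are illustrative, and where the class list lives in your YOLOX checkout is not covered in this post.

```python
import glob
import os
import xml.etree.ElementTree as ET


def collect_label_names(annotation_dir):
    """Return the sorted set of <object><name> values found in the XML files."""
    names = set()
    for xml_path in glob.glob(os.path.join(annotation_dir, "*.xml")):
        root = ET.parse(xml_path).getroot()
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                names.add(name.strip())
    return sorted(names)


def missing_classes(annotation_dir, configured_classes):
    """List annotation labels that are absent from the configured class names."""
    configured = set(configured_classes)
    return [n for n in collect_label_names(annotation_dir) if n not in configured]
```

Calling `missing_classes(<your Annotations folder>, <your class tuple>)` should return an empty list once the class list and the annotations agree.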
III. Running YOLOX from the command line
Test the environment
python tools/demo.py image -f exps/default/yolox_s.py -c ./yolox_s.pth --path assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result --device gpu
Training
python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 1 -b 4 --fp16 -c yolox_s.pth
Resume training from the previous run's checkpoint
python tools/train.py -f exps/example/yolox_voc/yolox_voc_s.py -d 1 -b 4 -c YOLOX_outputs/yolox_voc_s/latest_ckpt.pth --resume --start_epoch=100
Test the trained model (on a single image)
python tools/demo.py image -f exps/example/yolox_voc/yolox_voc_s.py -c YOLOX_outputs/yolox_voc_s/latest_ckpt.pth --path ./assets/dog.jpg --conf 0.3 --nms 0.65 --tsize 640 --save_result --device gpu
Evaluate the trained model (on the validation set)
python tools/eval.py -f exps/example/yolox_voc/yolox_voc_s.py -c YOLOX_outputs/yolox_voc_s/best_ckpt.pth -d 1 -b 4 --conf 0.001 --fp16
IV. Processing the VOC dataset
1. Converting a labelme dataset to a VOC dataset (labelme2xml)
import os
import numpy as np
import codecs
import json
import glob
import cv2

# 1. Paths (raw strings so that backslashes are not read as escape sequences)
labelme_path = r"F:\data\datasets\VOC\LabelmeData/"  # original labelme annotations
saved_path = r"F:\data\datasets\VOC\VOC2007/"  # output directory

# 2. Create the required folders
dst_annotation_dir = os.path.join(saved_path, "Annotations")
dst_image_dir = os.path.join(saved_path, "JPEGImages")
dst_main_dir = os.path.join(saved_path, "ImageSets", "Main")
for dst_dir in (dst_annotation_dir, dst_image_dir, dst_main_dir):
    os.makedirs(dst_dir, exist_ok=True)

# 3. Collect the files to process
org_json_files = sorted(glob.glob(os.path.join(labelme_path, "*.json")))
# os.path.basename works with both "\\" and "/" separators on Windows
org_json_file_names = [os.path.splitext(os.path.basename(p))[0] for p in org_json_files]

# 4. labelme file to VOC dataset
for i, json_file_ in enumerate(org_json_files):
    with open(json_file_, "r", encoding="utf-8") as f:
        json_file = json.load(f)
    image_path = os.path.join(labelme_path, org_json_file_names[i] + ".jpg")
    img = cv2.imread(image_path)
    if img is None:  # cv2.imread returns None instead of raising
        raise FileNotFoundError(image_path)
    height, width, channels = img.shape
    # Images and annotations are renamed to a six-digit running index
    dst_image_path = os.path.join(dst_image_dir, "{:06d}.jpg".format(i))
    cv2.imwrite(dst_image_path, img)
    dst_annotation_path = os.path.join(dst_annotation_dir, "{:06d}.xml".format(i))
    with codecs.open(dst_annotation_path, "w", "utf-8") as xml:
        xml.write('<annotation>\n')
        xml.write('\t<folder>' + 'Pin_detection' + '</folder>\n')
        xml.write('\t<filename>' + "{:06d}.jpg".format(i) + '</filename>\n')
        # xml.write('\t<source>\n')
        # xml.write('\t\t<database>The UAV autolanding</database>\n')
        # xml.write('\t\t<annotation>UAV AutoLanding</annotation>\n')
        # xml.write('\t\t<image>flickr</image>\n')
        # xml.write('\t\t<flickrid>NULL</flickrid>\n')
        # xml.write('\t</source>\n')
        # xml.write('\t<owner>\n')
        # xml.write('\t\t<flickrid>NULL</flickrid>\n')
        # xml.write('\t\t<name>ChaojieZhu</name>\n')
        # xml.write('\t</owner>\n')
        xml.write('\t<size>\n')
        xml.write('\t\t<width>' + str(width) + '</width>\n')
        xml.write('\t\t<height>' + str(height) + '</height>\n')
        xml.write('\t\t<depth>' + str(channels) + '</depth>\n')
        xml.write('\t</size>\n')
        xml.write('\t\t<segmented>0</segmented>\n')
        for multi in json_file["shapes"]:
            # Bounding box = extent of the annotated points (floats, as stored by labelme)
            points = np.array(multi["points"])
            xmin = min(points[:, 0])
            xmax = max(points[:, 0])
            ymin = min(points[:, 1])
            ymax = max(points[:, 1])
            label = multi["label"]
            if xmax <= xmin or ymax <= ymin:
                continue  # skip degenerate boxes
            xml.write('\t<object>\n')
            xml.write('\t\t<name>' + label + '</name>\n')
            xml.write('\t\t<pose>Unspecified</pose>\n')
            xml.write('\t\t<truncated>1</truncated>\n')
            xml.write('\t\t<difficult>0</difficult>\n')
            xml.write('\t\t<bndbox>\n')
            xml.write('\t\t\t<xmin>' + str(xmin) + '</xmin>\n')
            xml.write('\t\t\t<ymin>' + str(ymin) + '</ymin>\n')
            xml.write('\t\t\t<xmax>' + str(xmax) + '</xmax>\n')
            xml.write('\t\t\t<ymax>' + str(ymax) + '</ymax>\n')
            xml.write('\t\t</bndbox>\n')
            xml.write('\t</object>\n')
            print(json_file_, xmin, ymin, xmax, ymax, label)
        xml.write('</annotation>')
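The script above writes the coordinates exactly as labelme stores them, that is, as floats, which is what triggers the `int()` error described in problem 3. An alternative to patching `voc_eval.py` is to convert to integers at this stage. The sketch below shows one way to do that; `points_to_bbox` is a hypothetical helper, and rounding outward plus clamping to the image bounds are assumptions added here, not part of the original script.

```python
import numpy as np


def points_to_bbox(points, width, height):
    """Return (xmin, ymin, xmax, ymax) as ints, or None for a degenerate box."""
    pts = np.asarray(points, dtype=float)
    # Round outward so the integer box still covers the annotated shape
    xmin = int(np.floor(pts[:, 0].min()))
    ymin = int(np.floor(pts[:, 1].min()))
    xmax = int(np.ceil(pts[:, 0].max()))
    ymax = int(np.ceil(pts[:, 1].max()))
    # Clamp to the image (assumption: 0-based pixel coordinates)
    xmin, ymin = max(xmin, 0), max(ymin, 0)
    xmax, ymax = min(xmax, width - 1), min(ymax, height - 1)
    if xmax <= xmin or ymax <= ymin:
        return None
    return xmin, ymin, xmax, ymax


print(points_to_bbox([[922.519, 10.2], [1000.9, 50.7]], 1920, 1080))
# (922, 10, 1001, 51)
```

In the conversion loop, the four `str(...)` calls would then receive integers, and shapes for which the helper returns None would be skipped.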
2. xml2txt
import os
import random

trainval_percent = 0.95  # share of all samples used for train+val; adjust as needed
train_percent = 0.95  # share of train+val used for train; adjust as needed
xmlfilepath = r"F:\data\datasets\VOC\uav_VOC2007\Annotations"
txtsavepath = r"F:\data\datasets\VOC\uav_VOC2007\ImageSets\Main"
os.makedirs(txtsavepath, exist_ok=True)

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)  # renamed from `list` to avoid shadowing the built-in
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = set(random.sample(trainval, tr))
trainval = set(trainval)  # sets make the membership tests below fast

# Write one file name (without extension) per line into the four split files
with open(os.path.join(txtsavepath, "trainval.txt"), "w") as ftrainval, \
        open(os.path.join(txtsavepath, "test.txt"), "w") as ftest, \
        open(os.path.join(txtsavepath, "train.txt"), "w") as ftrain, \
        open(os.path.join(txtsavepath, "val.txt"), "w") as fval:
    for i in indices:
        name = total_xml[i][:-4] + "\n"  # strip the ".xml" suffix
        if i in trainval:
            ftrainval.write(name)
            if i in train:
                ftrain.write(name)
            else:
                fval.write(name)
        else:
            ftest.write(name)
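The split above changes on every run because `random.sample` is unseeded. If a reproducible split is wanted, the same proportions can be computed with a seeded generator. This is a minimal sketch; `split_names` and its `seed` parameter are illustrative additions, not part of the original script.

```python
import random


def split_names(names, trainval_percent=0.95, train_percent=0.95, seed=0):
    """Split names into train/val/test with the same proportions as the script."""
    rng = random.Random(seed)  # local generator, so the global state is untouched
    shuffled = list(names)
    rng.shuffle(shuffled)
    tv = int(len(shuffled) * trainval_percent)  # size of train+val
    tr = int(tv * train_percent)  # size of train
    return {
        "train": sorted(shuffled[:tr]),
        "val": sorted(shuffled[tr:tv]),
        "test": sorted(shuffled[tv:]),
    }


splits = split_names(["{:06d}".format(i) for i in range(100)])
print({k: len(v) for k, v in splits.items()})
# {'train': 90, 'val': 5, 'test': 5}
```

With 100 samples and both ratios at 0.95, only 5 images end up in each of val and test, so the ratios may be worth lowering for small datasets.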
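V. Tips
1. Using TensorBoard
After the code has run, it produces visualization (event) files.
Open Anaconda Prompt and run the following command:
tensorboard --logdir=<path to the directory containing those files>
The command prints a URL; open it in a browser to view the TensorBoard page.
2. Specifying which GPUs to use for training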
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,3,5'  # comma-separated GPU indices
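The variable only takes effect if it is set before CUDA is initialized, so in practice it belongs at the very top of the script, before torch is imported. The value is a comma-separated list of device indices, and the selected devices are renumbered from 0 inside the process. The helper below is a small sketch that builds the string from a list; `set_visible_gpus` is a made-up name for illustration.

```python
import os


def set_visible_gpus(gpu_ids):
    """Set CUDA_VISIBLE_DEVICES from a list of integer GPU indices."""
    value = ",".join(str(int(g)) for g in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


print(set_visible_gpus([0, 3, 5]))  # 0,3,5
```

With this setting, physical GPUs 0, 3 and 5 appear to the training process as devices 0, 1 and 2.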