Fixing a warning that appears during training:
WARNING ⚠️ Box and segment counts should be equal, but got len(segments) = ****, len(boxes) = ****. To resolve this only boxes will be used and all segments will be removed. To avoid this please supply either a detect or segment dataset, not a detect-segment mixed dataset.
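While collecting data from open-source datasets on the Roboflow annotation site, I found that the downloaded datasets do not automatically separate segmentation data from detection data (Polygon and Bounding Box annotations are mixed together).
As a result, YOLOv8 automatically discards the segmentation annotations at run time, which leaves large gaps in the dataset and sharply lowers model accuracy (the Polygon part is thrown away).
Looking at the txt files, the segmentation annotations are stored as a class id followed by a list of polygon vertex coordinates, while the box annotations are stored in xywh form (class id, x center, y center, width, height).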
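The difference between the two line formats can be sketched as a small check. This is only an illustration: `label_kind` is a hypothetical helper name and the two sample lines are made up, not taken from the dataset. It uses the same rule as the scripts in this post (more than 5 values on a line means a polygon).

```python
def label_kind(line):
    """Classify one label line by how many values it holds."""
    values = line.split()
    if not values:
        return "empty"
    if len(values) == 5:
        return "box"        # class x_center y_center width height
    if len(values) > 5:
        return "polygon"    # class x1 y1 x2 y2 x3 y3 ...
    return "invalid"        # fewer than 5 values is neither format

# Made-up sample lines for illustration only
box_line = "0 0.5 0.5 0.2 0.3"
polygon_line = "0 0.1 0.1 0.9 0.1 0.9 0.9 0.1 0.9"

print(label_kind(box_line))
print(label_kind(polygon_line))
```

A file is "mixed" when some of its lines come back as "box" and others as "polygon", which is the situation the warning complains about.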
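To solve this, I came up with two approaches.
Approach 1
Pick out the segmentation annotations from the dataset (check whether a txt file contains any line with more than 5 numbers; if it does, move the label file and its image into new folders).
Usage: set AIM_DIR to the target path before running.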
import os

AIM_DIR = r"E:\yolov8\yolov8-fire\data\valid"
OLD_LABEL = os.path.join(AIM_DIR, "labels")
NEW_LABEL = os.path.join(AIM_DIR, "target_labels")
OLD_IMAGE = os.path.join(AIM_DIR, "images")
NEW_IMAGE = os.path.join(AIM_DIR, "target_images")

# Create the destination folders if they do not exist yet
os.makedirs(NEW_LABEL, exist_ok=True)
os.makedirs(NEW_IMAGE, exist_ok=True)

# Walk through every label file in the labels folder
for label_name in os.listdir(OLD_LABEL):
    old_label_path = os.path.join(OLD_LABEL, label_name)
    flag = False
    with open(old_label_path, "r") as f:
        for line in f:
            # A box line has 5 values; more than 5 means a polygon line
            if len(line.split()) > 5:
                flag = True
                break
    # Move the files only after the label file is closed, to avoid conflicts
    if flag:
        # Assumes the image is a .jpg with the same base name as the label
        image_name = os.path.splitext(label_name)[0] + ".jpg"
        old_image_path = os.path.join(OLD_IMAGE, image_name)
        new_image_path = os.path.join(NEW_IMAGE, image_name)
        new_label_path = os.path.join(NEW_LABEL, label_name)
        os.rename(old_label_path, new_label_path)
        os.rename(old_image_path, new_image_path)
        print(new_label_path)
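The script above assumes every image is a .jpg file. If a dataset mixes image formats, the lookup could be made more tolerant along the following lines. This is a sketch under assumptions: `find_image` and `IMAGE_EXTS` are hypothetical names introduced here, and the extension list is a guess that should be adjusted to the dataset.

```python
import os

# Extensions to try, in order; an assumed list, adjust it to the dataset
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".bmp")

def find_image(image_dir, label_name):
    """Return the path of the image that matches a label file, or None."""
    stem = os.path.splitext(label_name)[0]
    for ext in IMAGE_EXTS:
        candidate = os.path.join(image_dir, stem + ext)
        if os.path.exists(candidate):
            return candidate
    return None
```

In Approach 1 this would replace the hard-coded "jpg" suffix: look the image up with `find_image(OLD_IMAGE, label_name)` and skip the move when it returns None, so a missing image does not stop the loop with an exception.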
Approach 2
Convert the segmentation annotations directly into box annotations.
import os

AIM_DIR = r"E:\yolov8\yolov8-fire\data\train\labels"

# Walk through every label file in the folder
for label_name in os.listdir(AIM_DIR):
    label_path = os.path.join(AIM_DIR, label_name)
    file_data = ""
    with open(label_path, "r") as f:
        for line in f:
            nums = line.split()
            # More than 5 values means a polygon line: class x1 y1 x2 y2 ...
            if len(nums) > 5:
                head = nums[0]
                xs = [float(v) for v in nums[1::2]]
                ys = [float(v) for v in nums[2::2]]
                min_x, max_x = min(xs), max(xs)
                min_y, max_y = min(ys), max(ys)
                # Bounding box of the polygon in xywh form (center, width, height)
                x, y = (min_x + max_x) / 2, (min_y + max_y) / 2
                w, h = max_x - min_x, max_y - min_y
                line = f"{head} {x} {y} {w} {h}\n"
            file_data += line
    # Overwrite the label file in place with the converted lines
    with open(label_path, "w") as f:
        f.write(file_data)
    print(label_path)
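Because Approach 2 overwrites the label files in place, it may be worth checking the arithmetic on a single line first. The sketch below pulls the same bounding-box calculation into a standalone function; `polygon_to_box` is a hypothetical name and the square polygon is a made-up example, not data from the dataset.

```python
def polygon_to_box(line):
    """Convert 'class x1 y1 x2 y2 ...' to 'class x_center y_center width height'."""
    nums = line.split()
    if len(nums) <= 5:
        # Already a box line (or empty): return it unchanged, minus whitespace
        return line.strip()
    xs = [float(v) for v in nums[1::2]]   # every x coordinate
    ys = [float(v) for v in nums[2::2]]   # every y coordinate
    x = (min(xs) + max(xs)) / 2           # box center
    y = (min(ys) + max(ys)) / 2
    w = max(xs) - min(xs)                 # box size
    h = max(ys) - min(ys)
    return f"{nums[0]} {x} {y} {w} {h}"

# A square with corners at (0.25, 0.25) and (0.75, 0.75)
square = "0 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75"
print(polygon_to_box(square))  # 0 0.5 0.5 0.5 0.5
```

The resulting box is the tightest axis-aligned rectangle around the polygon, so the shape information is lost; that is acceptable for a detection model but the conversion cannot be undone, which is a reason to keep a copy of the original labels.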