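While looking for a dataset to train a model, I found an annotated dataset online in VOC format and wanted to convert it to YOLO format.
This takes some data processing with Office and Python: removing specific columns from the VOC dataset's train.txt and val.txt.
Only the part marked by the red box in the figure is kept, so the second column has to be deleted, along with images/ and .jpg.
Word's Find and Replace feature can delete images/ and .jpg.
The second column is then deleted with code.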
"""
Keep only the first column of the txt file
"""
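# -*- coding:utf-8 -*-
# Input: the split file after images/ and .jpg have been removed in Word
# src_path = "C:/Users/10974/Desktop/YH/DATASET/fire-smoke/train_qc.txt"
src_path = "C:/Users/10974/Desktop/YH/DATASET/fire-smoke/val_qc.txt"
# Output: the file that receives the extracted column
# dst_path = "C:/Users/10974/Desktop/YH/DATASET/fire-smoke/train.txt"
dst_path = "C:/Users/10974/Desktop/YH/DATASET/fire-smoke/val.txt"

rows = []  # not named "list", which would shadow the built-in type
with open(src_path, encoding='utf-8') as f:
    for line in f:
        a = line.split()  # split on whitespace; this also drops the trailing newline
        if not a:
            continue  # skip blank lines
        b = a[0:1]  # keep column 1 (to keep columns 2 and 3 instead, write b = a[1:3])
        rows.append(b)

# 'a' appends to an existing file, so running the script twice duplicates the lines;
# switch to 'w' to overwrite instead
with open(dst_path, 'a', encoding='utf-8') as out_file:
    for row in rows:
        out_file.write(' '.join(row) + '\n')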
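
The Word step and the column step can probably be combined into one pass in Python. The following is only a minimal sketch, assuming each line of the split file looks like `images/0001.jpg annotations/0001.xml` (the exact layout is a guess, since the figure is not reproduced here). The function names `image_stem` and `convert_split_file` are hypothetical, and unlike the script above the output file is overwritten rather than appended to.

```python
def image_stem(line, prefix="images/", suffix=".jpg"):
    """Return the first column of a line without the prefix and suffix.

    Returns None for a blank line. The line layout is an assumption.
    """
    columns = line.split()  # split on any whitespace
    if not columns:
        return None
    name = columns[0]  # keep only the first column
    if name.startswith(prefix):
        name = name[len(prefix):]
    if name.endswith(suffix):
        name = name[:-len(suffix)]
    return name


def convert_split_file(src_path, dst_path):
    """Write one cleaned image name per line; overwrites dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            stem = image_stem(line)
            if stem is not None:
                dst.write(stem + "\n")
```

It would be called once per split file, for example `convert_split_file("val_original.txt", "val.txt")`, where both paths are placeholders. Stripping only a leading `images/` and a trailing `.jpg` is a deliberate choice: a blanket find and replace would also alter file names that happen to contain those strings in the middle.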