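- Script purpose:
  Split the dataset into a training set and a validation set, and save the sample names to `train.txt` and `val.txt` respectively.
- The code is implemented as follows: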

    import os
    import random


    def main():
        random.seed(0)  # fix the random seed so the split is reproducible

        files_path = "./VOCdevkit/VOC2012/Annotations"
        assert os.path.exists(files_path), "path: '{}' does not exist.".format(files_path)

        val_rate = 0.5  # fraction of the samples assigned to the validation set

        # collect the sample names (without extension) and store them, sorted, in files_name
        files_name = sorted([file.split(".")[0] for file in os.listdir(files_path)])
        files_num = len(files_name)
        # a set keeps the membership test in the loop below O(1)
        val_index = set(random.sample(range(0, files_num), k=int(files_num * val_rate)))
        train_files = []
        val_files = []
        for index, file_name in enumerate(files_name):
            if index in val_index:
                val_files.append(file_name)
            else:
                train_files.append(file_name)

        try:
            # mode "x" raises FileExistsError instead of overwriting an existing file;
            # the with statement closes both files even if a write fails
            with open("train.txt", "x") as train_f, open("val.txt", "x") as eval_f:
                train_f.write("\n".join(train_files))
                eval_f.write("\n".join(val_files))
        except FileExistsError as e:
            print(e)
            exit(1)


    if __name__ == '__main__':
        main()

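To try the split logic without the VOC directory on disk, the same sampling scheme can be run on a small in-memory list. This is only a sketch: the helper name `split_names`, its parameters, and the `img_00x` sample names are made up for illustration and are not part of the original script. It uses a local `random.Random(seed)` generator in place of the global `random.seed(0)` call.

```python
import random


def split_names(names, val_rate=0.5, seed=0):
    """Hypothetical helper: return (train, val) lists for a list of sample names."""
    rng = random.Random(seed)  # local generator, leaves the global random state alone
    names = sorted(names)
    # pick int(len * val_rate) distinct indices for the validation set
    val_index = set(rng.sample(range(len(names)), k=int(len(names) * val_rate)))
    train = [n for i, n in enumerate(names) if i not in val_index]
    val = [n for i, n in enumerate(names) if i in val_index]
    return train, val


# made-up sample names, for illustration only
names = ["img_001", "img_002", "img_003", "img_004", "img_005"]
train, val = split_names(names)
print(len(train), len(val))  # 3 2, because int(5 * 0.5) == 2
print("\n".join(val))        # one name per line, no trailing newline
```

Because the seed is fixed, calling `split_names(names)` again returns the same two lists, which is the point of seeding in the script above. Note that `int(files_num * val_rate)` rounds down, so with an odd number of samples the training set gets the extra one.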
"\n".join(train_files)
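This expression joins the elements of `train_files` into a single string, placing "\n" between each pair of adjacent elements, and returns that string. Each sample name therefore ends up on its own line, with no trailing newline after the last one.
- Script author:
  wz on Bilibili (b站)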
Script: splitting a dataset into a training set and a validation set