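Purpose: split a text file into two parts by a given ratio.
Input: path of the full file A, paths of the split files B and C, and the split ratio.
Output: two new text files, B and C.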
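import random


def split_file_randomly(input_file, output_file1, output_file2, ratio):
    """Shuffle the lines of input_file and split them between two output files by ratio."""
    with open(input_file, 'r') as file:
        lines = file.readlines()
    # Give the last line a newline if it lacks one, so shuffling cannot glue it to another line.
    if lines and not lines[-1].endswith('\n'):
        lines[-1] += '\n'
    random.shuffle(lines)
    total_lines = len(lines)
    # Multiply before dividing: 100 * (29 / 100) is 28.999999999999996, which int() would cut to 28.
    split_index = int(total_lines * ratio[0] // sum(ratio))
    lines1 = lines[:split_index]
    lines2 = lines[split_index:]
    with open(output_file1, 'w') as file:
        file.writelines(lines1)
    with open(output_file2, 'w') as file:
        file.writelines(lines2)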


if __name__ == "__main__":
    input_file = './All.txt'
    output_file1 = './train.txt'
    output_file2 = './eval.txt'
    # ratio is given as output_file1 : output_file2
    ratio = (9, 1)
    split_file_randomly(input_file, output_file1, output_file2, ratio)
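
Note: the script gives a different split on every run. For a train/eval split it can help to get the same split each time. A minimal sketch of one way to do that, assuming a hypothetical helper split_lines with a seed parameter (neither name is in the script above) and an integer ratio:

```python
import random


def split_lines(lines, ratio, seed=None):
    """Return (part1, part2): a shuffled copy of lines split by ratio.

    split_lines and seed are hypothetical names used for illustration only.
    """
    rng = random.Random(seed)  # private generator; the global random state is untouched
    shuffled = list(lines)     # copy, so the caller's list keeps its order
    rng.shuffle(shuffled)
    # Integer arithmetic; assumes ratio holds integers such as (9, 1).
    split_index = len(shuffled) * ratio[0] // sum(ratio)
    return shuffled[:split_index], shuffled[split_index:]


if __name__ == "__main__":
    sample = [f"line {i}\n" for i in range(10)]
    train, evaluation = split_lines(sample, (9, 1), seed=42)
    print(len(train), len(evaluation))  # prints: 9 1
```

Calling it twice with the same seed and the same input returns the same two lists; with seed=None the split varies from run to run, as in the script above.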