Python: converting a txt text file to CSV
txt format:
Each line holds one text sample and its label, separated by "\t".
Code example:
# txt_path and csv_path are the input and output file paths; set them before running.
corpus_sentences = []
chi_labels = []
with open(txt_path, mode='r', encoding='utf-8') as fIn:
    for line in fIn:
        top2column = line.rstrip('\n').split('\t')[:2]  # keep the first two columns
        if len(top2column) < 2:
            continue  # skip blank or malformed lines
        corpus_sentences.append(top2column[0])
        chi_labels.append(top2column[1])
with open(csv_path, 'w', encoding='utf-8') as t:
    # header row, then one tab-separated row per sample
    t.write('corpus_sentences' + '\t' + 'chi_labels' + '\n')
    for i in range(len(chi_labels)):
        t.write(str(corpus_sentences[i]) + '\t' + str(chi_labels[i]) + '\n')
This produces the (tab-separated) CSV file, which can be inspected with the following code:
import pandas as pd

df = pd.read_csv(csv_path, encoding='utf-8', sep='\t')
corpus_sentences = df['corpus_sentences'].tolist()
chi_labels = df['chi_labels'].tolist()
print(chi_labels[:3])  # first three labels
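Writing rows by string concatenation works as long as the text itself contains no tabs or quote characters. As a possible alternative, here is a minimal sketch that wraps the same conversion in a function and lets the standard-library `csv.writer` handle quoting. The function name `txt_to_tsv`, its `header` parameter, and the sample data are assumptions made for illustration and are not part of the original code.

```python
import csv
import os
import tempfile


def txt_to_tsv(txt_path, csv_path, header=("corpus_sentences", "chi_labels")):
    """Copy the first two tab-separated columns of txt_path into csv_path.

    Hypothetical helper for illustration. Returns the number of data rows written.
    """
    rows_written = 0
    with open(txt_path, mode="r", encoding="utf-8") as f_in, \
            open(csv_path, mode="w", encoding="utf-8", newline="") as f_out:
        # csv.writer quotes fields that contain tabs, quotes, or newlines
        writer = csv.writer(f_out, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        for line in f_in:
            columns = line.rstrip("\n").split("\t")[:2]  # keep the first two columns
            if len(columns) < 2:
                continue  # skip blank or malformed lines
            writer.writerow(columns)
            rows_written += 1
    return rows_written


if __name__ == "__main__":
    # Made-up sample data, used only to exercise the helper.
    with tempfile.TemporaryDirectory() as tmp_dir:
        txt_path = os.path.join(tmp_dir, "sample.txt")
        csv_path = os.path.join(tmp_dir, "sample.csv")
        with open(txt_path, "w", encoding="utf-8") as f:
            f.write("good movie\tpositive\n\nbad plot\tnegative\textra\n")
        print(txt_to_tsv(txt_path, csv_path))
        with open(csv_path, encoding="utf-8", newline="") as f:
            print(list(csv.reader(f, delimiter="\t")))
```

The output file keeps the same header names, so it can still be read with `pd.read_csv(csv_path, encoding='utf-8', sep='\t')`.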