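While working through the Chatbot Tutorial in the PyTorch tutorials and reproducing the author's code, I ran into this problem: when the formatted conversation data is written to a file, extra blank lines appear. The author's code is: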
# Write new csv file
print("\nWriting newly formatted file...")
with open(datafile, 'w', encoding='utf-8') as outputfile:
    writer = csv.writer(outputfile, delimiter=delimiter)
    for pair in extractSentencePairs(conversations):
        writer.writerow(pair)
The output is (note the doubled \r at the end of each line):
Sample lines from file:
b"Can we make this quick? Roxanne Korrine and Andrew Barrett are having an incredibly horrendous public break- up on the quad. Again.\tWell, I thought we'd start with pronunciation, if that's okay with you.\r\r\n"
b"Well, I thought we'd start with pronunciation, if that's okay with you.\tNot the hacking and gagging and spitting part. Please.\r\r\n"
b"Not the hacking and gagging and spitting part. Please.\tOkay... then how 'bout we try out some French cuisine. Saturday? Night?\r\r\n"
b"You're asking me out. That's so cute. What's your name again?\tForget it.\r\r\n"
b"No, no, it's my fault -- we didn't have a proper introduction ---\tCameron.\r\r\n"
b"Cameron.\tThe thing is, Cameron -- I'm at the mercy of a particularly hideous breed of loser. My sister. I can't date until she does.\r\r\n"
b"The thing is, Cameron -- I'm at the mercy of a particularly hideous breed of loser. My sister. I can't date until she does.\tSeems like she could get a date easy enough...\r\r\n"
b'Why?\tUnsolved mystery. She used to be really popular when she started high school, then it was just like she got sick of it or something.\r\r\n'
b"Unsolved mystery. She used to be really popular when she started high school, then it was just like she got sick of it or something.\tThat's a shame.\r\r\n"
b'Gosh, if only we could find Kat a boyfriend...\tLet me see what I can do.\r\r\n'
After consulting LU敏's blog, I found that the call to open() needs one more argument: newline=''
# Write new csv file
print("\nWriting newly formatted file...")
with open(datafile, 'w', newline='', encoding='utf-8') as outputfile:
    writer = csv.writer(outputfile, delimiter=delimiter)
    for pair in extractSentencePairs(conversations):
        writer.writerow(pair)
Now the output no longer has blank lines:
Sample lines from file:
b"Can we make this quick? Roxanne Korrine and Andrew Barrett are having an incredibly horrendous public break- up on the quad. Again.\tWell, I thought we'd start with pronunciation, if that's okay with you.\r\n"
b"Well, I thought we'd start with pronunciation, if that's okay with you.\tNot the hacking and gagging and spitting part. Please.\r\n"
b"Not the hacking and gagging and spitting part. Please.\tOkay... then how 'bout we try out some French cuisine. Saturday? Night?\r\n"
b"You're asking me out. That's so cute. What's your name again?\tForget it.\r\n"
b"No, no, it's my fault -- we didn't have a proper introduction ---\tCameron.\r\n"
b"Cameron.\tThe thing is, Cameron -- I'm at the mercy of a particularly hideous breed of loser. My sister. I can't date until she does.\r\n"
b"The thing is, Cameron -- I'm at the mercy of a particularly hideous breed of loser. My sister. I can't date until she does.\tSeems like she could get a date easy enough...\r\n"
b'Why?\tUnsolved mystery. She used to be really popular when she started high school, then it was just like she got sick of it or something.\r\n'
b"Unsolved mystery. She used to be really popular when she started high school, then it was just like she got sick of it or something.\tThat's a shame.\r\n"
b'Gosh, if only we could find Kat a boyfriend...\tLet me see what I can do.\r\n'
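To check a written file without eyeballing the sample lines, one option is to read it back in binary mode and count the lines that still end in \r\r\n. The following is only a minimal sketch: the helper names count_doubled_cr and write_pairs, the file name pairs.txt, and the two sample pairs are all made up for illustration and are not part of the tutorial.

```python
import csv
import os
import tempfile


def count_doubled_cr(path):
    """Count lines in the file that end with a doubled CR before the LF."""
    # Binary mode: bytes are returned exactly as stored, no newline translation.
    with open(path, "rb") as f:
        return sum(1 for line in f if line.endswith(b"\r\r\n"))


def write_pairs(path, pairs):
    """Write tab-separated pairs, opening the file with newline=''."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter="\t").writerows(pairs)


# Hypothetical demo data written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "pairs.txt")
write_pairs(demo_path, [("Why?", "Unsolved mystery."), ("Cameron.", "Forget it.")])
doubled = count_doubled_cr(demo_path)
print(doubled)
```

With newline='' the count should be zero on both Windows and Linux, since the file object then passes the writer's line terminator through untouched.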
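According to the comments on that post, the tutorial code should run fine on Linux as written, because "a Linux line break is only LF, which pairs up with the extra CR" (by zhzyx). In other words, csv.writer ends every row with \r\n by default. On Linux a text-mode file writes those characters through unchanged, whereas on Windows a file opened without newline='' translates the \n into \r\n, which produces \r\r\n.
I ran the experiment on Windows, so adding the newline='' argument mentioned above fixes the blank lines there.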
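The behaviour described above can be reproduced on any operating system by wrapping an in-memory buffer in io.TextIOWrapper. This is a sketch under one assumption: passing newline='\r\n' stands in for the Windows default (where '\n' is translated to '\r\n' on write), while newline='' disables translation. The function name written_bytes and the sample row are invented for the demonstration.

```python
import csv
import io


def written_bytes(newline):
    """Write one row through csv.writer and return the raw bytes produced."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    # csv.writer terminates each row with '\r\n' by default.
    csv.writer(text, delimiter="\t").writerow(["Why?", "Unsolved mystery."])
    text.flush()
    data = raw.getvalue()
    text.detach()  # release the buffer so the wrapper does not close it later
    return data


# newline='\r\n' mimics Windows text mode: the '\n' becomes '\r\n' again.
windows_like = written_bytes("\r\n")
# newline='' turns translation off, which is what the csv module expects.
untranslated = written_bytes("")

print(windows_like)
print(untranslated)
```

The first result ends in \r\r\n and the second in \r\n, matching the two sample outputs shown earlier.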