I wrote a Python script that reads the contents of emails and appends them to a list, and I call this Python script from R.
Here is my Python script:{
import glob


def parseOutText(f):
    """Split an email into [headers, rest] at the "Bcc:" header."""
    f.seek(0)
    all_text = f.read()
    # Split off the metadata (everything before "Bcc:")
    content = all_text.split("Bcc:")
    return content


def main():
    path = "D:/Hadoop/practice/machine_learning/mails/*.txt"
    files = glob.glob(path)
    file_list = []
    for each_file in files:
        # The context manager closes the file even if parsing fails
        with open(each_file, "r") as ff:
            text = parseOutText(ff)
        file_list.append(text)
    print(file_list)
    print(len(file_list))


if __name__ == "__main__":
    main()
}
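To make the shape of the result easier to see, here is a minimal sketch that runs the same splitting logic on an in-memory email instead of a file. The `sample` string is a made-up, shortened email used only for illustration, and `io.StringIO` stands in for the opened file.

```python
import io


def parseOutText(f):
    """Same logic as in the script: split the text at the "Bcc:" header."""
    f.seek(0)
    all_text = f.read()
    return all_text.split("Bcc:")


# Hypothetical, shortened email used only for illustration.
sample = "From: xxx@xxx.com\nTo: xyz@xxx.com\nBcc: test@xxx.com\n\nHi,\n"

parts = parseOutText(io.StringIO(sample))
print(len(parts))  # number of pieces produced by the split
print(parts)       # [text before "Bcc:", text after "Bcc:"]
```

Each file therefore contributes one inner list with two strings, which is why the printed result is a list of lists.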
Its output:
[['From: xxx@xxx.com\nTo: xyz@xxx.com\nSubject: Hi\nCc:
abc@xxx.com\nMime-Version: 1.0\nContent-Transfer-Encoding: 7bit\n', '
test@xxx.com\n\nHi,\n\nYour problem is resolved. \n\nPlease reply to
this email and let us know if it is not working.\n\nThank you
\nCCD.'], ['From: abc@xxx.com\nTo: test2@xxx.com\nSubject: Hi\nCc:
xyz@xxx.com\nMime-Version: 1.0\nContent-Transfer-Encoding: 7bit\n', '
test@xxx.com\n\nHi,\n\nThis will not work out unless and until you
work harder.\n\nThank you \nCCD.']]
2
R code:
^{pr2}$
R output:
[[1]] [1] "[['From: xxx@xxx.com\nTo: xyz@xxx.com\nSubject: Hi\nCc:
abc@xxx.com\nMime-Version: 1.0\nContent-Transfer-Encoding: 7bit\n',
' test@xxx.com\n\nHi,\n\nYour problem is resolved. \n\nPlease
reply to this email and let us know if it is not working.\n\nThank
you \nCCD.'], ['From: abc@xxx.com\nTo: test2@xxx.com\nSubject:
Hi\nCc: xyz@xxx.com\nMime-Version: 1.0\nContent-Transfer-Encoding:
7bit\n', ' test@xxx.com\n\nHi,\n\nThis will not work out unless
and until you work harder.\n\nThank you \nCCD.']]"
print(length(output))
[1] 1
How can I get the two emails as two elements of the same list?
First email:
From: xxx@xxx.com
To: xyz@xxx.com
Subject: Hi
Cc: abc@xxx.com
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Bcc: test@xxx.com
Hi,
Your problem is resolved.
Please reply to this email and let us know if it is not working.
Thank you
CCD.
Second email:
From: abc@xxx.com
To: test2@xxx.com
Subject: Hi
Cc: xyz@xxx.com
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Bcc: test@xxx.com
Hi,
This will not work out unless and until you work harder.
Thank you
CCD.
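The R output above shows the whole printed Python list arriving as a single string, which would explain why `length(output)` is 1. One possible direction, offered only as a sketch, is to have the Python side print JSON instead of Python's own list representation, so that the R side can parse it with a JSON reader (for example `fromJSON` from the jsonlite package, assuming it is installed) and get one entry per email. The helper names `split_email` and `emails_to_json` below are hypothetical and not part of the original script.

```python
import glob
import json


def split_email(text):
    """Split raw email text into header and body parts at the "Bcc:" header."""
    headers, _, rest = text.partition("Bcc:")
    return {"headers": headers, "body": rest}


def emails_to_json(pattern):
    """Return a JSON array (as a string) with one object per matching file."""
    emails = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "r") as handle:
            emails.append(split_email(handle.read()))
    return json.dumps(emails)


if __name__ == "__main__":
    # Path taken from the question; adjust it to the local mail directory.
    print(emails_to_json("D:/Hadoop/practice/machine_learning/mails/*.txt"))
```

Because the output is a JSON array with one object per file, the number of emails no longer depends on how R splits the captured text into lines.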