Parsing .eml files in Python: reading .eml files with emaildata 0.3.4 on Python 3.6

I am using Python 3.6.1 and I want to read in email files (.eml) for processing. I am using the emaildata 0.3.4 package; however, whenever I try to import the Text class as in the documentation, I get a module error:

import email
from emaildata.text import Text
>>> ModuleNotFoundError: No module named 'cStringIO'

When I tried to correct this using this update, I got the next error, relating to mimetools:

>>> ModuleNotFoundError: No module named 'mimetools'

Is it possible to use emaildata 0.3.4 with Python 3.6 to parse .eml files? Or are there any other packages I can use to parse .eml files? Thanks.

Solution

Using the email package from the standard library, we can read in the .eml files. Open the file in binary mode and parse it with the BytesParser class. Then call the get_body() method with a preference for plain text, and call get_content() on the result to get the text of the email.

import email
from email import policy
from email.parser import BytesParser
import glob

file_list = glob.glob('*.eml')  # returns list of files
with open(file_list[2], 'rb') as fp:  # select a specific email file from the list
    msg = BytesParser(policy=policy.default).parse(fp)
text = msg.get_body(preferencelist=('plain',)).get_content()  # ('plain',) is a one-element tuple
print(text)  # print the email content

>>> "Hi,
>>> This is an email
>>> Regards,
>>> Mister. E"

Granted, this is a simplified example with no handling of HTML or attachments. But it does essentially what the question asks, which is also what I wanted to do.
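For messages that do carry HTML or attachments, the same standard-library API can be pushed a little further. The following is only a sketch under stated assumptions: the helper name summarize_message and the in-memory sample message are made up for illustration, and it falls back to the HTML part when no plain-text part exists.

```python
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser


def summarize_message(raw_bytes):
    """Return (body_text, body_subtype, attachment_filenames) for raw .eml bytes.

    summarize_message is a hypothetical helper, not part of the email package.
    """
    msg = BytesParser(policy=policy.default).parsebytes(raw_bytes)

    # Prefer plain text, then HTML; get_body() returns None if neither exists.
    body = msg.get_body(preferencelist=('plain', 'html'))
    text = body.get_content() if body is not None else ''
    subtype = body.get_content_subtype() if body is not None else None

    # iter_attachments() yields the parts that are not candidate body parts.
    names = [part.get_filename() for part in msg.iter_attachments()]
    return text, subtype, names


# Build a small sample message in memory so the sketch runs without any files.
sample = EmailMessage()
sample['Subject'] = 'Demo'
sample['From'] = 'sender@example.com'
sample['To'] = 'receiver@example.com'
sample.set_content('Hi,\nThis is an email\n')
sample.add_alternative('<p>Hi,<br>This is an email</p>', subtype='html')
sample.add_attachment(b'\x00\x01\x02', maintype='application',
                      subtype='octet-stream', filename='data.bin')

text, subtype, names = summarize_message(sample.as_bytes())
print(subtype, names)
print(text)
```

For a file on disk, read it in binary mode and pass the bytes to the helper, or switch the helper to BytesParser.parse(fp) as in the answer above.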

Here is how you would iterate over several emails and save each as a plain text file:

import glob
import os
from email import policy
from email.parser import BytesParser

file_list = glob.glob('*.eml')  # returns list of files
for file in file_list:
    with open(file, 'rb') as fp:
        msg = BytesParser(policy=policy.default).parse(fp)
    fnm = os.path.splitext(file)[0] + '.txt'  # same base name, .txt extension
    txt = msg.get_body(preferencelist=('plain',)).get_content()
    with open(fnm, 'w') as f:
        print(txt, file=f)  # write the plain-text body to the file
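One thing the loop does not cover is a message without a text/plain part: get_body() then returns None and the chained get_content() call raises AttributeError. A possible variant is sketched below, assuming that skipping such messages is acceptable; the function name convert_eml_folder and the temporary demo folder are illustrative only.

```python
import tempfile
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser
from pathlib import Path


def convert_eml_folder(folder):
    """Write a .txt file next to each .eml file in folder; return the paths written.

    convert_eml_folder is a hypothetical helper name used for this sketch.
    """
    written = []
    for eml_path in sorted(Path(folder).glob('*.eml')):
        with open(eml_path, 'rb') as fp:
            msg = BytesParser(policy=policy.default).parse(fp)

        body = msg.get_body(preferencelist=('plain',))
        if body is None:
            # No text/plain part (for example an HTML-only message): skip it.
            continue

        txt_path = eml_path.with_suffix('.txt')
        # An explicit encoding avoids depending on the platform default.
        txt_path.write_text(body.get_content(), encoding='utf-8')
        written.append(txt_path)
    return written


# Demonstration on a temporary folder holding one generated message.
with tempfile.TemporaryDirectory() as tmp:
    demo = EmailMessage()
    demo['Subject'] = 'Demo'
    demo.set_content('Hi,\nThis is an email\n')
    (Path(tmp) / 'demo.eml').write_bytes(demo.as_bytes())

    results = convert_eml_folder(tmp)
    saved_name = results[0].name
    saved_text = results[0].read_text(encoding='utf-8')
    print(saved_name)
```

In real use, call convert_eml_folder('.') from the directory that holds the .eml files.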
