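Most material online on this topic is near-identical. I also tried a number of packages (yagmail, zmail, exchangelib), and they all work in much the same way. The notes below use win32com as the example.
1. Sending mail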
import win32com.client as win32
outlook = win32.Dispatch('Outlook.Application')
mail_item = outlook.CreateItem(0) # 0: create a mail item
# Add more recipients the same way
mail_item.Recipients.Add('xx@phfund.com.cn')
mail_item.Recipients.Add('yy@phfund.com.cn')
mail_item.Subject = 'Mail Test'
mail_item.BodyFormat = 2 # 2: Html format
mail_item.HTMLBody = '''
<H2>Hello, This is a test mail.</H2>
Hello Guys.
'''
# Add more attachments the same way
mail_item.Attachments.Add('C:\\Users\\xx\\Desktop\\慢慢慢.txt')
mail_item.Attachments.Add('C:\\Users\\xx\\Desktop\\慢慢慢.jpg')
mail_item.Attachments.Add('C:\\Users\\xx\\Desktop\\慢慢慢.xlsx')
mail_item.Send()
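The sending steps above can be wrapped in a small function. This is only a sketch: `send_mail` and its parameter names are made up for illustration, and the Outlook application object is passed in so the function itself does not need to import win32com. It uses only the calls already shown above (`CreateItem`, `Recipients.Add`, `Attachments.Add`, `Send`).

```python
def send_mail(outlook_app, recipients, subject, html_body, attachments=()):
    """Create and send one HTML mail through an Outlook Application object.

    outlook_app is expected to be the object returned by
    win32com.client.Dispatch('Outlook.Application'). send_mail and its
    parameter names are hypothetical helpers, not part of win32com.
    """
    mail_item = outlook_app.CreateItem(0)   # 0: mail item
    for address in recipients:
        mail_item.Recipients.Add(address)
    mail_item.Subject = subject
    mail_item.BodyFormat = 2                # 2: HTML format
    mail_item.HTMLBody = html_body
    for path in attachments:
        mail_item.Attachments.Add(path)     # pass a full file path
    mail_item.Send()
    return mail_item
```

On a Windows machine with Outlook installed, it would be called roughly as `send_mail(win32.Dispatch('Outlook.Application'), ['xx@phfund.com.cn'], 'Mail Test', '<H2>Hello</H2>')`.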
2. Reading a table from the mail body
Note that indexes here start at 1, unlike Python's 0-based indexing.
from win32com.client.gencache import EnsureDispatch as Dispatch
import pandas as pd
import datetime
from bs4 import BeautifulSoup
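outlook = Dispatch("Outlook.Application").GetNamespace("MAPI") # get the MAPI namespace
inbox = outlook.Folders["xx@phfund.com.cn"].Folders["收件箱"] # open the inbox ("收件箱" is the inbox folder name in a Chinese Outlook)
Mail_Messages = inbox.Items # all items in the inbox
Mail_Messages.Sort("[ReceivedTime]", True) # sort by received time, newest first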
# Method 1: loop over the items to find the target mail
for mail in Mail_Messages:
    if hasattr(mail, 'SenderName'): # skip items that have no SenderName
        # match on sender, received date (within the last 7 days) and subject
        if mail.SenderName == 'aa' and mail.ReceivedTime.date() > \
                (datetime.datetime.now() - datetime.timedelta(days=7)).date() \
                and '一对一' in mail.Subject:
            break
# Method 2: filter the items with Restrict
target_mail = Mail_Messages.Restrict("[SenderName]='aa'")[1]
# Mail_Messages.Restrict("[SentOn] > '5/31/2017 08:00 AM'")
a = Mail_Messages.Restrict("[ReceivedTime]>'05/31/2017' and [SenderName]='aa'")
# The mail body contains a table
mail_body = mail.HTMLBody
html_body = BeautifulSoup(mail_body,'html.parser')
html_tables = html_body.find_all('table')
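# pandas 2.1+ deprecates passing literal HTML, so wrap the string in StringIO
from io import StringIO
df = pd.read_html(StringIO(str(html_tables)), header=0)[0]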
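The Restrict filter strings in method 2 are hard-coded. A minimal sketch of building the same kind of filter from Python values follows. `quote_filter_value` and `build_restrict_filter` are hypothetical helper names, and the date format (month/day/year, 12-hour clock, no seconds) assumes a US-style Windows date setting like the one in the examples above. Doubling an embedded single quote is the escaping I understand Outlook filters to expect.

```python
import datetime


def quote_filter_value(value):
    """Wrap a value in single quotes, doubling any embedded single quote."""
    return "'" + str(value).replace("'", "''") + "'"


def build_restrict_filter(sender_name, received_after):
    """Build a Restrict filter on sender name and received time.

    Hypothetical helper: the date is written as month/day/year with a
    12-hour clock and no seconds, which assumes a US-style date setting.
    """
    stamp = received_after.strftime("%m/%d/%Y %I:%M %p")
    return ("[ReceivedTime] > " + quote_filter_value(stamp)
            + " and [SenderName] = " + quote_filter_value(sender_name))


# Filter for mails from 'aa' received within the last 7 days
since = datetime.datetime.now() - datetime.timedelta(days=7)
print(build_restrict_filter("aa", since))
```

The result can be passed straight to `Mail_Messages.Restrict(...)`, which replaces the Python loop in method 1 for the sender and date conditions.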
3. Reading files from attachments
An attachment has to be saved to disk first and then read according to its format.
target_mail = Mail_Messages.Restrict("[ReceivedTime]>'05/31/2017' and [SenderName]='xx'")[1]
target_mail.Attachments[1].SaveAsFile('D:\\pycharm_code\\邮件读取\\t.xlsx')
df = pd.read_excel('D:\\pycharm_code\\邮件读取\\t.xlsx')
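To avoid leaving downloaded files around, the save-then-read step can go through a temporary folder. This is a sketch, not a way to read the attachment without saving it: it still calls `SaveAsFile`, and `read_attachment` is a hypothetical helper name. It assumes the attachment object exposes `FileName` and `SaveAsFile`, as Outlook attachments do.

```python
import os
import tempfile


def read_attachment(attachment, reader):
    """Save an attachment to a temporary folder, read it, then clean up.

    attachment is expected to be an item of mail.Attachments; reader is any
    function that takes a file path and returns the loaded data, for
    example pandas.read_excel. read_attachment is a hypothetical helper.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        # SaveAsFile needs a full path; keep the attachment's own file name
        path = os.path.join(tmp_dir, attachment.FileName)
        attachment.SaveAsFile(path)
        # The reader must load everything before the folder is deleted
        return reader(path)
```

With the objects from the code above, the call would look like `df = read_attachment(target_mail.Attachments[1], pd.read_excel)`.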
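4. Current limitations
- 1. I have not yet worked out how to filter the target mail by keyword. I tried
.restrict([subject] contains 'xx')
and .restrict('xx' in [subject])
and .restrict([subject_contains] = 'xx'), none of which worked.
- 2. Attachments have to be downloaded before they can be read; they cannot be read directly.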
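For the first limitation, one approach that may work is a DASL filter, which Restrict accepts when the string starts with `@SQL=` and which supports `like` with `%` wildcards. The sketch below only builds the filter string. `subject_contains_filter` is a hypothetical helper name, and I have not verified the filter against every Outlook version, so treat it as an assumption to test.

```python
# DASL name of the subject property, wrapped in double quotes
SUBJECT_PROPERTY = '"urn:schemas:httpmail:subject"'


def subject_contains_filter(keyword):
    """Build a DASL filter matching mails whose subject contains keyword.

    Hypothetical helper. Single quotes in the keyword are doubled; other
    characters are passed through unchanged.
    """
    escaped = keyword.replace("'", "''")
    return "@SQL=" + SUBJECT_PROPERTY + " like '%" + escaped + "%'"


print(subject_contains_filter("xx"))
```

It would be used as `Mail_Messages.Restrict(subject_contains_filter('一对一'))`. As far as I know, a DASL filter cannot be mixed with the bracket-style `[SenderName]='aa'` syntax in the same string, so the other conditions would need a second Restrict call on the result.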
If you know a better way, feel free to message me.