【趣味项目之Excel报表汇总】Python+pandas+xlwings实现批量提取表格信息汇总到表格并发送到邮箱


前言

利用洗完澡睡前的时间,花了三四天,做了这个程序来作为我的实习作业。
可以实现批量提取规定格式的表格,按时间排序后按指定格式汇总到表格文件,然后自动发送到邮箱,还为程序还搭了一个简单的GUI,可以手动选择文件夹来作为工作区。

一、分功能详解

1.GUI

GUI使用了比较简单可以快速建立的pysimplegui,甚至比tkinter还简单多了,主体是一个进度条
请添加图片描述
再选择文件夹后点击运行即可启动功能实现的线程
这里提到线程,没错!必须使用多线程来实现gui,并且pysimplegui是占据主线程的,这一点在官方文档也说的很明白(不得不说pysimplegui的文档做的很不错)

def init_gui():
    sg.theme('SystemDefaultForReal')
    layout = [[sg.Text('当前文件夹:'), sg.Text('', key='text_path')],
              [sg.Text('任务完成进度')],
              [sg.ProgressBar(100, orientation='h', size=(50, 20), key='progressbar')],
              [sg.Text('', key='progressname')],
              [sg.FolderBrowse('打开文件夹', key='folder', target='text_path'), sg.Button('运行'), sg.Cancel()]]
    window = sg.Window('执行进度', layout)
    progress_bar = window['progressbar']
    progress_name = window['progressname']
    return progress_bar, progress_name, window

对进度条和当前文件的更新:
需要在main中设置循环体,循环体中

	progress_bar.update_bar(progress_point)
    progress_name.update(progress_name_str)```

2.多线程

def work():
    global complete_flag
    pythoncom.CoInitialize()
    workbook, worksheet, app = init_worksheet()
    worksheet = read_excel(folder_path, worksheet, workbook, app)
    sort_elem()
    sendemail()
    complete_flag = 2

为了增强代码的可维护性,在封装上下了一点点功夫,将工作函数按流程封装到这个函数中
下面这句是为了解决在多线程中xlwings会报错的问题

	pythoncom.CoInitialize()

我们用保护模式为工作函数建立线程,保护模式就是主线程关闭这个线程就关闭

	worker_task = threading.Thread(target=work)
    worker_task.setDaemon(True)

在主线程中必不可少的进行挂起,不然工作线程是没法运行的

	time.sleep(0.1)

3.Excel处理

咱们读取数据用的是pandas,pandas提供了比较强大方便的查询功能,而进行写入使用的是xlwings

3.1 初始化xlwings

首先是xlwings初始化,xlwings会打开Excel创建一个新的表格,我们在表格上进行编辑

def init_worksheet():
    global progress_point, progress_name_str
    progress_point = 10
    progress_name_str = 'Excel初始化中'
    app = xw.App(visible=False, add_book=False)
    workbook = app.books.add()
    worksheet = workbook.sheets.add('工作记录')
    worksheet.range('A1').value = '报修时间'
    worksheet.range('B1').value = '报修部门'
    worksheet.range('C1').value = '报修班组'
    worksheet.range('D1').value = '报修内容'
    worksheet.range('E1').value = '维修判定'
    worksheet.range('F1').value = '材料更换'
    worksheet.range('G1').value = '维修人员'
    return workbook, worksheet, app

3.2 os库批量读取Excel表格

利用os库实现对表格文件的遍历,在此函数中调用函数让表格元素插入新表格

def read_excel(folder_path, worksheet, workbook, app):
    global progress_point, progress_name_str
    file_list = os.listdir(folder_path)
    per_point = 60 / len(file_list)
    for i in file_list:
        if i == '总工单.xlsx':
            continue
        progress_point += per_point
        progress_name_str = i
        worksheet = read_elem(worksheet, i)
    workbook.save(output_path)
    workbook.close()
    app.quit()
    return worksheet

3.3 pandas库读取表格元素

用pandas库将读取表格文件建立dataframe,再按位置提取出来,提取出来的元素插入新表格中。
这一步是要对处理的表格进行定制的,这也是我一直想改进的地方,但是感觉太麻烦了

def read_elem(worksheet, filename):
    data = pd.DataFrame(pd.read_excel(folder_path + '\\' + filename))
    time = data.iloc[1][1]  # 行列
    content = data.iloc[2][1]
    department = data.iloc[0][1]
    classes = data.iloc[0][3]
    worker = data.iloc[0][5]
    result = data.iloc[3][1]
    goods = data.iloc[4][1]
    worksheet = insert_elem(worksheet, time, department, classes, content, result, goods, worker)

    return worksheet

3.4 xlwings插入元素到表格

咱按照位置给元素插到一排去,这也是汇总的核心了

def insert_elem(worksheet, time, department, classes, content, result, goods, worker):
    global count
    count += 1
    worksheet.range('A' + str(count)).value = time
    worksheet.range('B' + str(count)).value = department
    worksheet.range('C' + str(count)).value = classes
    worksheet.range('D' + str(count)).value = content
    worksheet.range('E' + str(count)).value = result
    worksheet.range('F' + str(count)).value = goods
    worksheet.range('G' + str(count)).value = worker

    return worksheet

3.5 pandas库给表格分类

前面保存的文件只是中间文件,将已经输出的文件再次读取,利用sort_values对时间进行排序

def sort_elem():
    global progress_point, progress_name_str
    progress_point = 90
    progress_name_str = 'Excel文件排序中'
    data = pd.DataFrame(pd.read_excel(output_path))
    data = data.sort_values(by='报修时间', ascending=True)
    data.to_excel(output_path, index=False)
    shape_excel()

3.6 xlwings给设置表格外观

将排序好的表格文件读取进行表格形状的设置,如果不设置的话会比较不好看,我们可以按照自己想要的样子进行设置表格属性,这里不涉及循环,时间复杂度低。

def shape_excel():
    app = xw.App(visible=False, add_book=False)
    workbook = app.books.open(output_path)
    worksheet = workbook.sheets[0]
    value = worksheet.range('A1').expand('down')
    value.column_width = 16  # 宽度
    value = worksheet.range('D1').expand('down')
    value.column_width = 20  # 宽度
    value = worksheet.range('E1').expand('down')
    value.column_width = 30  # 宽度
    value = worksheet.range('F1').expand('down')
    value.column_width = 16  # 宽度
    value = worksheet.range('A1').expand('right')
    value.row_height = 30  # 行高 磅数
    value = worksheet.range('A2').expand('table')
    value.row_height = 20  # 行高 磅数
    workbook.save(output_path)
    workbook.close()
    app.quit()

4. 发送邮件

以下涉及到的信息都是虚构的,这段代码借鉴了一些博主的方法

def sendemail():
    global progress_point, progress_name_str
    progress_name_str = '邮件发送中'
    host_server = 'smtp.126.com'  # 126邮箱smtp服务器
    sender_email = 'email666hhh@126.com'  # 发件人邮箱
    pwd = '去邮箱申请一个stmp的授权码'
    receiver = ['1********6@qq.com']  # 收件人邮箱
    mail_title = '通信信息班7-8月工单汇总'  # 邮件标题
    mail_content = "大王,本月报表请过目!"  # 邮件正文内容
    msg = MIMEMultipart()
    msg["Subject"] = Header(mail_title, 'utf-8')
    msg["From"] = sender_email
    msg["To"] = ";".join(receiver)
    msg.attach(MIMEText(mail_content, 'plain'))  # html/plain
    attachment = MIMEApplication(open(output_path, 'rb').read())
    attachment["Content-Type"] = 'application/octet-stream'
    basename = "通信信息班7-8月工单汇总.xlsx"  # 给附件重命名 不能改文件扩展名
    attachment.add_header('Content-Disposition', 'attachment',
                          filename=('utf-8', '', basename))  # 注意:此处basename要转换为gbk编码,否则中文会有乱码。
    msg.attach(attachment)
    try:
        smtp = SMTP_SSL(host_server)  # ssl登录连接到邮件服务器
        smtp.set_debuglevel(1)  # 0是关闭,1是开启debug
        smtp.ehlo(host_server)  # 跟服务器打招呼,告诉它我们准备连接,最好加上这行代码
        smtp.login(sender_email, pwd)
        smtp.sendmail(sender_email, receiver, msg.as_string())
        smtp.quit()
        print("邮件发送成功")
        progress_point = 100
        progress_name_str = '任务完成'
    except smtplib.SMTPException:
        print("无法发送邮件")

二、执行结果

1.执行结果

下面表格纯属虚构
这样的表格
请添加图片描述
试验了二十张
请添加图片描述
汇总结果
请添加图片描述
邮箱收到了请添加图片描述
GUI界面如下
请添加图片描述

2.代码

import numpy as np
import time
import pandas as pd
import xlwings as xw
import os
import os.path
import smtplib
import string
from smtplib import SMTP_SSL
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from email.header import Header
from email.mime.application import MIMEApplication  # 用于添加附件
import PySimpleGUI as sg
import threading
import pythoncom
import win32

complete_flag = 0
count = 1
progress_point = 0
progress_name_str = ''
folder_path = ''
output_path = folder_path + r'\总工单.xlsx'


def init_gui():
    sg.theme('SystemDefaultForReal')
    layout = [[sg.Text('当前文件夹:'), sg.Text('', key='text_path')],
              [sg.Text('任务完成进度')],
              [sg.ProgressBar(100, orientation='h', size=(50, 20), key='progressbar')],
              [sg.Text('', key='progressname')],
              [sg.FolderBrowse('打开文件夹', key='folder', target='text_path'), sg.Button('运行'), sg.Cancel()]]
    window = sg.Window('执行进度', layout)
    progress_bar = window['progressbar']
    progress_name = window['progressname']
    return progress_bar, progress_name, window


def init_worksheet():
    global progress_point, progress_name_str
    progress_point = 10
    progress_name_str = 'Excel初始化中'
    app = xw.App(visible=False, add_book=False)
    workbook = app.books.add()
    worksheet = workbook.sheets.add('工作记录')
    worksheet.range('A1').value = '报修时间'
    worksheet.range('B1').value = '报修部门'
    worksheet.range('C1').value = '报修班组'
    worksheet.range('D1').value = '报修内容'
    worksheet.range('E1').value = '维修判定'
    worksheet.range('F1').value = '材料更换'
    worksheet.range('G1').value = '维修人员'
    return workbook, worksheet, app


def insert_elem(worksheet, time, department, classes, content, result, goods, worker):
    global count
    count += 1
    worksheet.range('A' + str(count)).value = time
    worksheet.range('B' + str(count)).value = department
    worksheet.range('C' + str(count)).value = classes
    worksheet.range('D' + str(count)).value = content
    worksheet.range('E' + str(count)).value = result
    worksheet.range('F' + str(count)).value = goods
    worksheet.range('G' + str(count)).value = worker

    return worksheet


def read_elem(worksheet, filename):
    data = pd.DataFrame(pd.read_excel(folder_path + '\\' + filename))
    time = data.iloc[1][1]  # 行列
    content = data.iloc[2][1]
    department = data.iloc[0][1]
    classes = data.iloc[0][3]
    worker = data.iloc[0][5]
    result = data.iloc[3][1]
    goods = data.iloc[4][1]
    worksheet = insert_elem(worksheet, time, department, classes, content, result, goods, worker)

    return worksheet


def read_excel(folder_path, worksheet, workbook, app):
    global progress_point, progress_name_str
    file_list = os.listdir(folder_path)
    per_point = 60 / len(file_list)
    for i in file_list:
        if i == '总工单.xlsx':
            continue
        progress_point += per_point
        progress_name_str = i
        worksheet = read_elem(worksheet, i)
    workbook.save(output_path)
    workbook.close()
    app.quit()
    return worksheet


def sort_elem():
    global progress_point, progress_name_str
    progress_point = 90
    progress_name_str = 'Excel文件排序中'
    data = pd.DataFrame(pd.read_excel(output_path))
    data = data.sort_values(by='报修时间', ascending=True)
    data.to_excel(output_path, index=False)
    shape_excel()


def shape_excel():
    app = xw.App(visible=False, add_book=False)
    workbook = app.books.open(output_path)
    worksheet = workbook.sheets[0]
    value = worksheet.range('A1').expand('down')
    value.column_width = 16  # 宽度
    value = worksheet.range('D1').expand('down')
    value.column_width = 20  # 宽度
    value = worksheet.range('E1').expand('down')
    value.column_width = 30  # 宽度
    value = worksheet.range('F1').expand('down')
    value.column_width = 16  # 宽度
    value = worksheet.range('A1').expand('right')
    value.row_height = 30  # 行高 磅数
    value = worksheet.range('A2').expand('table')
    value.row_height = 20  # 行高 磅数
    workbook.save(output_path)
    workbook.close()
    app.quit()


def sendemail():
    global progress_point, progress_name_str
    progress_name_str = '邮件发送中'
    host_server = 'smtp.126.com'  # 126邮箱smtp服务器
    sender_email = 'esfsfsfc@126.com'  # 发件人邮箱
    pwd = '**********'
    receiver = ['1*******6@qq.com']  # 收件人邮箱
    mail_title = '通信信息班7-8月工单汇总'  # 邮件标题
    mail_content = "大王,本月报表请过目!"  # 邮件正文内容
    msg = MIMEMultipart()
    msg["Subject"] = Header(mail_title, 'utf-8')
    msg["From"] = sender_email
    msg["To"] = ";".join(receiver)
    msg.attach(MIMEText(mail_content, 'plain'))  # html/plain
    attachment = MIMEApplication(open(output_path, 'rb').read())
    attachment["Content-Type"] = 'application/octet-stream'
    basename = "通信信息班7-8月工单汇总.xlsx"  # 给附件重命名 不能改文件扩展名
    attachment.add_header('Content-Disposition', 'attachment',
                          filename=('utf-8', '', basename))  # 注意:此处basename要转换为gbk编码,否则中文会有乱码。
    msg.attach(attachment)
    try:
        smtp = SMTP_SSL(host_server)  # ssl登录连接到邮件服务器
        smtp.set_debuglevel(1)  # 0是关闭,1是开启debug
        smtp.ehlo(host_server)  # 跟服务器打招呼,告诉它我们准备连接,最好加上这行代码
        smtp.login(sender_email, pwd)
        smtp.sendmail(sender_email, receiver, msg.as_string())
        smtp.quit()
        print("邮件发送成功")
        progress_point = 100
        progress_name_str = '任务完成'
    except smtplib.SMTPException:
        print("无法发送邮件")


def work():
    global complete_flag
    pythoncom.CoInitialize()
    workbook, worksheet, app = init_worksheet()
    worksheet = read_excel(folder_path, worksheet, workbook, app)
    sort_elem()
    sendemail()
    complete_flag = 2


if __name__ == "__main__":
    progress_bar, progress_name, window = init_gui()
    worker_task = threading.Thread(target=work)
    worker_task.setDaemon(True)
    while True:
        event, values = window.read(timeout=10)
        time.sleep(0.1)
        if event == 'Cancel' or complete_flag == 2:
            break
        if values['folder']:
            folder_path = values['folder']
            complete_flag = 1
        if event == '运行' and complete_flag == 1:
            worker_task.start()
        if event == '运行' and complete_flag == 0:
            sg.popup('请先选择文件夹后再点击运行', title='Error', auto_close=True, auto_close_duration=5, grab_anywhere=True)
        progress_bar.update_bar(progress_point)
        progress_name.update(progress_name_str)

  • 5
    点赞
  • 23
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

我叫方程

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值