0、前言
因产品的同学试着用python统计Excel表格,但百度的程序又一直有各种问题, 所以叫我帮用python实现一个统计Excel表格程序;我之前也没有写过python操作Excel表格的程序,快速搜索实现了一下,还是发现一些问题,所以分享一下。
1、需求
我们使用TAPD管理项目,这里从TAPD导出的原Excel报表,统计后再生成的新Excel表。
原Excel报表是任务(标题)的信息列表,需要按需求分类分别统计各处理人的预估工时,如下(图1原始Excel表格有200多行、图2统计生成的新Excel表格):


2、python操作excel表格说明
python操作excel表格有不少组件,如:xlwt、openpyxl、pandas、xlrd、xlwings
一些组件在使用时有问题,如下:
https://blog.csdn.net/weixin_42555985/article/details/102872781
https://www.jianshu.com/p/8640abf11297
这里选用xlwings,xlwings使用接近VBA的语法从Python自动与Excel交互,注意是使用到Excel软件;
官方文档及一些参考如下:
https://docs.xlwings.org/en/stable/index.html
https://blog.csdn.net/whalefall/article/details/102665002
http://www.dszhp.com/xlwings-range.html
https://www.cnblogs.com/cyanrose/p/12059040.html
https://blog.csdn.net/lh_hebine/article/details/104559382
当然xlwing使用中也发现一个问题,range().expand()选择表格范围时无法识别一行/一列中的空值,遇到空值默认读取终止:
https://blog.csdn.net/weixin_44781801/article/details/88692982
下面程序会跳过一行/一列的范围选择问题。
3、python xlwings操作excel表格程序
目录下有111.py脚本、原始excel表格222.xls和生成的excel表格333.xls;
直接运行程序: $ python 111.py
即可实现222.xls --统计、生成--> 333.xls,效果如前面的excel表格图片;
下面是python 111.py脚本程序,可在python2.7运行:
# -*- coding: utf-8 -*-
import xlwings as xw
import sys
class Story(object):
def __init__(self, handler, workingHours, demandClassification):
self.handler = handler
self.workingHours = workingHours
self.demandClassification = demandClassification
def get_handler(self):
return self.handler
def get_workingHours(self):
return self.workingHours
def get_demandClassification(self):
return self.demandClassification
def __str__(self):
return "处理人:" + self.handler + ", 预估工时:" + str(self.workingHours) + ", 需求分类:" + self.demandClassification
filePath = r'222.xls'
newFilePath = r'333.xls'
reload(sys)
sys.setdefaultencoding('utf8')
app = None
wb = None
wbNew = None
appNew = None
try:
app = xw.App(visible=False, add_book=False)
app.display_alerts = False
app.screen_updating = False
wb = app.books.open(filePath)
sht = wb.sheets.active
# shtName=sht.name
# print(shtName)
rowCount = sht.range('A1').expand('table').rows.count
# print(rowCount)
data = sht.range('A3:G' + str(rowCount)).value
# print(data)
storyList = list()
for i in range(len(data)):
# print(data[i])
# for j in range(len(data[i])):
# print(data[i][j])
handler = data[i][1]
workingHours = data[i][4]
demandClassification = data[i][5]
if handler is None:
handler = "未安排处理人"
if workingHours is None:
workingHours = 0
if demandClassification is None:
demandClassification = "未知需求分类"
story = Story(str(handler), int(str(workingHours)),
str(demandClassification))
storyList.append(story)
# print(str(story))
if len(storyList) > 0:
handlerSet = set()
demandClassificationSet = set()
workingHoursMap = dict()
for story in storyList:
handlerSet.add(story.get_handler())
demandClassificationSet.add(story.get_demandClassification())
key = str(story.get_handler() + story.get_demandClassification())
if workingHoursMap.get(key) is None:
workingHoursMap[key] = 0
workingHoursMap[key] = workingHoursMap.get(
key) + story.get_workingHours()
handlerList = list(handlerSet)
demandClassificationList = list(demandClassificationSet)
workingHoursList = []
handlerGbkList = []
demandClassificationGbkList = []
change = False
for j in range(len(demandClassificationList)):
workingHoursList.append([])
for k in range(len(handlerList)):
key = str(handlerList[k] + demandClassificationList[j])
workingHoursList[j].append(workingHoursMap.get(key))
if change is False:
handlerGbkList.append(handlerList[k].encode("gbk"))
change = True
demandClassificationGbkList.append(
demandClassificationList[j].encode("gbk"))
try:
appNew = xw.App(visible=False, add_book=False)
appNew.display_alerts = False
appNew.screen_updating = False
wbNew = appNew.books.add()
shtNew = wbNew.sheets['sheet1']
shtNew.range('A2').options(
transpose=True).value = demandClassificationGbkList
shtNew.range('B1').value = handlerGbkList
shtNew.range('B2').value = workingHoursList
wbNew.save(newFilePath)
finally:
if wbNew is not None:
wbNew.close()
if appNew is not None:
appNew.quit()
finally:
if wb is not None:
wb.close()
if app is not None:
app.quit()