Fintech系列(三) -- python对excel操作模块汇总||推荐指数||用法示例

Working with Excel Files in Python 总览

http://www.python-excel.org/

读写Excel的python第三方开源模块

下面这些插件不要求Windows,和安装 Excel

  • openpyxl:The recommended package for reading and writing Excel 2010 files (ie: .xlsx)
  • xlsxwriter:An alternative package for writing data, formatting information and, in particular, charts in the Excel 2010 format (ie: .xlsx)
  • pyxlsb:This package allows you to read Excel files in the xlsb format.
  • pylightxl: This package allows you to read xlsx and xlsm files and write xlsx files.
  • xlrd: This package is for reading data and formatting information from older Excel files (ie: .xls)
  • xlwt: This package is for writing data and formatting information to older Excel files (ie: .xls)
  • xlutils: This package collects utilities that require both xlrd and xlwt, including the ability to copy and modify or filter existing excel files. (NB: In general, these use cases are now covered by openpyxl!)

Excel写操作插件

下面这些模块要求安装MS Excel

  • PyXLL: PyXLL is a commercial product that enables writing Excel add-ins in Python with no VBA. Python functions can be exposed as worksheet functions (UDFs), macros, menus and ribbon tool bars.
  • xlwings:xlwings is an open-source library to automate Excel with Python instead of VBA and works on Windows and macOS: you can call Python from Excel and vice versa and write UDFs in Python (Windows only). xlwings PRO is a commercial add-on with additional functionality.

pandas:推荐指数 *****

https://www.pypandas.cn/
Pandas是一个开源的,BSD许可的库,为python编程语言提供了高性能,易于使用的数据结构和数据分析工具。Python在数据处理和准备方面一直做得很好,但在数据分析和建模方面就没那么好了。Pandas帮助填补了这一空白,使您能够在Python中执行整个数据分析工作流程,而不必切换到更特定于领域的语言,如R。
用于在内存数据结构和不同格式之间读写数据的工具:CSV和文本文件、Microsoft Excel、SQL数据库和快速HDF 5格式。
Python与Pandas在广泛的学术和商业领域中使用,包括金融,神经科学,经济学,统计学,广告,网络分析,等等。

在这里插入图片描述
1

Excel writer engines:By default, pandas uses the XlsxWriter for .xlsx, openpyxl for .xlsm, and xlwt for .xls files.
用法:

df.to_excel('path_to_file.xlsx', sheet_name='Sheet1')

# In order to write separate DataFrames to separate sheets in a single Excel file, one can pass an ExcelWriter.

with pd.ExcelWriter('path_to_file.xlsx') as writer:
    df1.to_excel(writer, sheet_name='Sheet1')
    df2.to_excel(writer, sheet_name='Sheet2')

读的方法: The read_excel() method can read Excel 2003 (.xls) files using the xlrd Python module. Excel 2007+ (.xlsx) files can be read using either xlrd or openpyxl. Binary Excel (.xlsb) files can be read using pyxlsb.

# using the ExcelFile class
data = {}
with pd.ExcelFile('path_to_file.xls') as xls:
    data['Sheet1'] = pd.read_excel(xls, 'Sheet1', index_col=None,
                                   na_values=['NA'])
    data['Sheet2'] = pd.read_excel(xls, 'Sheet2', index_col=None,
                                   na_values=['NA'])

# equivalent using the read_excel function
data = pd.read_excel('path_to_file.xls', ['Sheet1', 'Sheet2'],
                     index_col=None, na_values=['NA'])

one example:比如你想获取蔚来汽车,理想汽车,小鹏汽车从上市到昨天的股票价格历史记录,可以使用如下代码将数据存在 f’US_stock_analysis_{now_time}.xlsx’ excel文件中的各自单独的sheet页面。

import akshare as ak
import datetime as dt
import pandas as pd


stock_list = ['NIO','LI','XPEV']
now_time = dt.datetime.now().strftime('%F')

data = {}
for symbol in stock_list:
    # get/print the data, and write to excel
    data[symbol] = ak.stock_us_daily(symbol=symbol)

with pd.ExcelWriter(f'US_stock_analysis_{now_time}.xlsx') as writer:
    #hist_df = ak.stock_us_daily(symbol=symbol)  # Get U.S. NIO's price info
    for symbol in stock_list:
        print(data[symbol])
        data[symbol].to_excel(writer, sheet_name=symbol)

具体这个例子的使用场景,可以结合参考我的另一篇文章打造属于你自己的私人大数据顾问(科技|财经|生活购物方面)
在这里插入图片描述
在这里插入图片描述

win32com:推荐指数 ****

大部分Windows环境的开发人员都会选择Microsoft Excel API。实际上不仅仅是Python,几乎各种语言都有相应的方法使用它,因为核心的逻辑完全是由Microsft Excel自身提供的。语言相关的部分只是负责跟Windows的COM组件进行通信。从命名上就可以看出,这是一个处理 windows 应用的扩展,Excel 只是该库能实现的一小部分功能。该库还支持 office 的众多操作。需要注意的是,该库不单独存在,可通过安装 pypiwin32 或者 pywin32 获取。当然你还必须要安装某一个版本的Microsoft Office Excel,它内部的DLL负责实际的操作。
在Python中首先需要安装Python for Windows extensions(pywin32),具体的文档可以查阅activePython3.8.2 Documentation win part
在这里插入图片描述
pywin32的API文档,并不是很好找。也可以git clone https://github.com/mhammond/pywin32, 下载源代码后。在本地打开 xxx/pywin32/com/win32com/index.html, 参阅使用说明。也可以通过源代码中的demo文件夹下的例子进行参考使用。2
下面是pywin32的使用例子3,更多的例子可以在https://github.com/pythonexcels/examples找到。

#
# Add a workbook and save to My Documents / Documents Library
# For really old versions of Excel, use the .xls file extension
#
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Add()
wb.SaveAs('add_a_workbook.xlsx')
excel.Application.Quit()


#
# Open an existing workbook
#
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open('workbook1.xlsx')
# Alternately, specify the full path to the workbook
# wb = excel.Workbooks.Open(r'C:\myfiles\excel\workbook2.xlsx')
excel.Visible = True

#
# Add a workbook, add a worksheet,
# name it 'MyNewSheet' and save
#
import win32com.client as win32
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Add()
ws = wb.Worksheets.Add()
ws.Name = "MyNewSheet"
wb.SaveAs('add_a_worksheet.xlsx')
excel.Application.Quit()

openpyxl:推荐指数 ***

https://openpyxl.readthedocs.io/en/stable/index.html
openpyxl 是一个python第三方库,他可以读、写 Excel 2010 的xlsx/xlsm/xltx/xltm格式文件。这个模块的优势在于,对excel单元格样式的设置方面比较详细。

示例代码

from openpyxl import Workbook
wb = Workbook()

# grab the active worksheet
ws = wb.active

# Data can be assigned directly to cells
ws['A1'] = 42

# Rows can also be appended
ws.append([1, 2, 3])

# Python types will automatically be converted
import datetime
ws['A2'] = datetime.datetime.now()

# Save the file
wb.save("sample.xlsx")

xlsxwriter: 推荐指数 ***

https://xlsxwriter.readthedocs.io
xlsxwriter拥有丰富的特性,支持图片/表格/图表/筛选/格式/公式等,功能与openpyxl相似,优点是相比 openpyxl 还支持 VBA 文件导入,迷你图等功能,缺点是只能用来写文件,不能打开/修改已有文件,意味着使用 xlsxwriter 需要从零开始。

在这里插入图片描述

示例代码可完成上图功能

import xlsxwriter
# Create an new Excel file and add a worksheet.
workbook = xlsxwriter.Workbook('demo.xlsx')
worksheet = workbook.add_worksheet()

# Widen the first column to make the text clearer.
worksheet.set_column('A:A', 20)

# Add a bold format to use to highlight cells.
bold = workbook.add_format({'bold': True})

# Write some simple text.
worksheet.write('A1', 'Hello')

# Text with formatting.
worksheet.write('A2', 'World', bold)

# Write some numbers, with row/column notation.
worksheet.write(2, 0, 123)
worksheet.write(3, 0, 123.456)

# Insert an image.
worksheet.insert_image('B5', 'logo.png')

workbook.close()

xlutils/xlrd/xlwt:推荐指数 **

xlrd,xlwt主要是针对Office 2013或更早版本的XLS文件格式。xlutils使用需要依赖xlrd和xlwt,扩充这两个库的功能。XlsxWriter和OpenPyXL都不支持XLS格式,从这个角度看,xlrd,xlwt仍然有一定的不可替代性。对XLSX支持比较差。功能很弱

xlutils

https://pypi.org/project/xlutils/

xlrd

https://pypi.org/project/xlrd/

xlwt

https://pypi.org/project/xlwt/

示例代码

import xlwt
    from datetime import datetime
    style0 = xlwt.easyxf('font: name Times New Roman, color-index red, bold on',
      num_format_str='#,##0.00')
    style1 = xlwt.easyxf(num_format_str='D-MMM-YY')
    wb = xlwt.Workbook()
    ws = wb.add_sheet('A Test Sheet')
    ws.write(0, 0, 1234.56, style0)
    ws.write(1, 0, datetime.now(), style1)
    ws.write(2, 0, "hello world")
    ws.write(3, 0, 1)
    ws.write(3, 1, 3)
    ws.write(3, 2, xlwt.Formula("A4+B4"))
    wb.save('example.xls')

pyexcel: 推荐指数 **

https://pypi.org/project/pyexcel/
该工具如果是公司使用需要支付一定费用,大概$10,个人使用的话是免费的。

示例代码:

import pyexcel as p # make sure you had pyexcel-xls installed
    a_list_of_dictionaries = [
    {
    "Name": 'Adam',
    "Age": 28
    },
    {
    "Name": 'Beatrice',
    "Age": 29
    },
    {
    "Name": 'Ceri',
    "Age": 30
    },
    {
    "Name": 'Dean',
    "Age": 26
    }
    ]
    pyexcel.save_as(records=a_list_of_dictionaries, dest_file_name="your_file.xls")

xlwings:推荐指数 **

http://docs.xlwings.org/en/stable/

Xlwings 是BSD-licensed python库。他提供了从python和excel自动化和交互操作。语法和VBA很接近。支持Numpy arrays和Pandas Series/DataFrames.

import xlwings as xw

#连接到excel
workbook = xw.Book(r'path/myexcel.xlsx')#连接excel文件
#连接到指定单元格
data_range = workbook.sheets('Sheet1').range('A1')
#写入数据
data_range.value = [1,2,3]
#保存
workbook.save()

4 5 6 7

参考资料:


  1. pandas API IO doc ↩︎

  2. https://www.cnblogs.com/paul-liang/p/9187503.html ↩︎

  3. Python Excel Mini Cookbook ↩︎

  4. http://pythonexcels.com ↩︎

  5. https://cloud.tencent.com/developer/article/1587491 ↩︎

  6. https://www.cnblogs.com/aziji/p/9929038.html ↩︎

  7. https://www.cnblogs.com/hushaojun/p/7792503.html ↩︎

  • 0
    点赞
  • 2
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值