df = pd.read_excel('./xxx.xlsx')
Here the default is engine=xlrd.
xlrd 2.0 does not support xlsx, so the read fails with:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib64/python3.6/site-packages/pandas/util/_decorators.py", line 296, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib64/python3.6/site-packages/pandas/io/excel/_base.py", line 304, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/usr/local/lib64/python3.6/site-packages/pandas/io/excel/_base.py", line 867, in __init__
    self._reader = self._engines[engine](self._io)
  File "/usr/local/lib64/python3.6/site-packages/pandas/io/excel/_xlrd.py", line 22, in __init__
    super().__init__(filepath_or_buffer)
  File "/usr/local/lib64/python3.6/site-packages/pandas/io/excel/_base.py", line 353, in __init__
    self.book = self.load_workbook(filepath_or_buffer)
  File "/usr/local/lib64/python3.6/site-packages/pandas/io/excel/_xlrd.py", line 37, in load_workbook
    return open_workbook(filepath_or_buffer)
  File "/home/koyasu/.local/lib/python3.6/site-packages/xlrd/__init__.py", line 170, in open_workbook
    raise XLRDError(FILE_FORMAT_DESCRIPTIONS[file_format]+'; not supported')
xlrd.biffh.XLRDError: Excel xlsx file; not supported
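The traceback comes down to an engine/format mismatch. As a rough sketch, the choice can be made explicit from the file extension and the installed xlrd version. The helper name pick_engine is hypothetical (not part of pandas or xlrd), and it assumes openpyxl is installed whenever it is returned.

```python
import os


def pick_engine(path, xlrd_version):
    """Return an engine name to pass to pandas.read_excel (hypothetical helper).

    path: file name; only its extension is inspected.
    xlrd_version: installed xlrd version string, e.g. "2.0.1".
    """
    ext = os.path.splitext(path)[1].lower()
    xlrd_major = int(xlrd_version.split(".")[0])
    if ext == ".xls":
        return "xlrd"  # legacy binary format, still handled by xlrd
    if ext == ".xlsx":
        # xlrd 2.0 rejects xlsx, so use openpyxl from that version on
        return "xlrd" if xlrd_major < 2 else "openpyxl"
    raise ValueError("unsupported extension: %r" % ext)
```

With pandas this would be used as pd.read_excel(path, engine=pick_engine(path, xlrd.__version__)).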
Use openpyxl instead:
df = pd.read_excel('./xxx.xlsx', engine='openpyxl')
This works in most cases, but sometimes it only prints a warning and returns an empty DataFrame:
>>> df = pd.read_excel("xxxx.xlsx", engine="openpyxl")
/usr/local/lib/python3.6/site-packages/openpyxl/styles/stylesheet.py:226: UserWarning: Workbook contains no default style, apply openpyxl's default
  warn("Workbook contains no default style, apply openpyxl's default")
>>> df
Empty DataFrame
Columns: [Column1]
Index: []
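Because no exception is raised in this case, a script can carry on with an empty DataFrame without noticing. A minimal sketch of catching it, using only the standard warnings module: both helper names are hypothetical, and the check assumes the warning text matches what is shown above. It does not claim the warning is the cause of the empty result, only that the two appeared together here.

```python
import warnings


def call_and_collect_warnings(func, *args, **kwargs):
    """Call func and return (result, [messages of warnings raised during the call])."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeated ones
        result = func(*args, **kwargs)
    return result, [str(w.message) for w in caught]


def looks_like_silent_failure(is_empty, messages):
    """True when the result is empty and the 'no default style' warning was seen."""
    return is_empty and any("no default style" in m for m in messages)


# Intended use with pandas (assumption, not executed here):
#   df, msgs = call_and_collect_warnings(pd.read_excel, "xxxx.xlsx", engine="openpyxl")
#   if looks_like_silent_failure(df.empty, msgs):
#       raise RuntimeError("openpyxl returned no rows for xxxx.xlsx")
```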
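Searching online for a fix turned up either paywalled articles or advice to open the file in Excel and save it again.
Opening and re-saving in Excel is obviously not an option a programmer wants to consider.
xlrd 1.2.0 and earlier can open xlsx files.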
pip3 uninstall xlrd
pip3 install xlrd==1.2.0
The file is now read successfully.
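With xlrd 1.2.0 installed, both engines are available, so one possible approach is to try openpyxl first and fall back to xlrd when the result is empty. This is only a sketch: read_with_fallback is a hypothetical helper, the reader is passed in (for example pd.read_excel), and it assumes a pandas/xlrd combination like the one in this post where engine='xlrd' accepts xlsx.

```python
def read_with_fallback(reader, path, engines=("openpyxl", "xlrd")):
    """Try each engine in order and return (result, engine) for the first non-empty result.

    reader: a callable such as pandas.read_excel, taking (path, engine=...).
    """
    last_error = None
    for engine in engines:
        try:
            result = reader(path, engine=engine)
        except Exception as exc:  # e.g. XLRDError when xlrd >= 2.0 meets xlsx
            last_error = exc
            continue
        if len(result) > 0:  # for a DataFrame, len() is the row count
            return result, engine
    if last_error is not None:
        raise last_error
    raise ValueError("every engine returned an empty result for %r" % path)
```

Note that a sheet which is genuinely empty also ends in ValueError here, so this only fits files that are expected to contain rows.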