1. Python version and environment
- Python 3.9.0
- Jupyter Notebook
2. Code
import pandas as pd
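# Path of the Excel file to read; a simple English (ASCII) path is best
# The r prefix makes this a raw string, so backslashes are not treated as escape characters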
file_path = r"E:/project/data.xlsx"
sheet1 = pd.read_excel(io=file_path)
# head() shows the first 5 rows of the sheet by default
sheet1.head()
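The code simply reads the contents of an Excel file, and there is nothing wrong with it. On Python versions earlier than 3.9 it reads the file and displays the first 5 rows as expected.
3. Error on Python 3.9.0
With the code above, Python 3.9.0 raises the following error: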
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-9d38e4d56bbe> in <module>
1 import pandas as pd
2 file_path = r"E:/project/data.xlsx"
----> 3 sheet1 = pd.read_excel(io=file_path)
4 sheet1.head()
c:\python\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
294 )
295 warnings.warn(msg, FutureWarning, stacklevel=stacklevel)
--> 296 return func(*args, **kwargs)
297
298 return wrapper
c:\python\lib\site-packages\pandas\io\excel\_base.py in read_excel(io, sheet_name, header, names, index_col, usecols, squeeze, dtype, engine, converters, true_values, false_values, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, parse_dates, date_parser, thousands, comment, skipfooter, convert_float, mangle_dupe_cols)
302
303 if not isinstance(io, ExcelFile):
--> 304 io = ExcelFile(io, engine=engine)
305 elif engine and engine != io.engine:
306 raise ValueError(
c:\python\lib\site-packages\pandas\io\excel\_base.py in __init__(self, path_or_buffer, engine)
865 self._io = stringify_path(path_or_buffer)
866
--> 867 self._reader = self._engines[engine](self._io)
868
869 def __fspath__(self):
c:\python\lib\site-packages\pandas\io\excel\_xlrd.py in __init__(self, filepath_or_buffer)
20 err_msg = "Install xlrd >= 1.0.0 for Excel support"
21 import_optional_dependency("xlrd", extra=err_msg)
---> 22 super().__init__(filepath_or_buffer)
23
24 @property
c:\python\lib\site-packages\pandas\io\excel\_base.py in __init__(self, filepath_or_buffer)
351 self.book = self.load_workbook(filepath_or_buffer)
352 elif isinstance(filepath_or_buffer, str):
--> 353 self.book = self.load_workbook(filepath_or_buffer)
354 elif isinstance(filepath_or_buffer, bytes):
355 self.book = self.load_workbook(BytesIO(filepath_or_buffer))
c:\python\lib\site-packages\pandas\io\excel\_xlrd.py in load_workbook(self, filepath_or_buffer)
35 return open_workbook(file_contents=data)
36 else:
---> 37 return open_workbook(filepath_or_buffer)
38
39 @property
c:\python\lib\site-packages\xlrd\__init__.py in open_workbook(filename, logfile, verbosity, use_mmap, file_contents, encoding_override, formatting_info, on_demand, ragged_rows)
128 if 'xl/workbook.xml' in component_names:
129 from . import xlsx
--> 130 bk = xlsx.open_workbook_2007_xml(
131 zf,
132 component_names,
c:\python\lib\site-packages\xlrd\xlsx.py in open_workbook_2007_xml(zf, component_names, logfile, verbosity, use_mmap, formatting_info, on_demand, ragged_rows)
810 del zflo
811 zflo = zf.open(component_names['xl/workbook.xml'])
--> 812 x12book.process_stream(zflo, 'Workbook')
813 del zflo
814 props_name = 'docprops/core.xml'
c:\python\lib\site-packages\xlrd\xlsx.py in process_stream(self, stream, heading)
264 self.tree = ET.parse(stream)
265 getmethod = self.tag2meth.get
--> 266 for elem in self.tree.iter() if Element_has_iter else self.tree.getiterator():
267 if self.verbosity >= 3:
268 self.dump_elem(elem)
AttributeError: 'ElementTree' object has no attribute 'getiterator'
4. Cause and solution
- Cause
The last line of the error message is:
AttributeError: 'ElementTree' object has no attribute 'getiterator'
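In other words, the ElementTree object has no method named getiterator.
The traceback points to xlsx.py, which belongs to xlrd, a widely used library for reading Excel files.
In principle, a library installed with pip install xlrd should match our Python version, so a method going missing inside the library should not happen.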
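So I looked it up and found the following note (judging by its wording, it is a merge request description from another project rather than the Python documentation itself):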
xml.etree.ElementTree.Element.getiterator() has been deprecated since Python 2.7, and has been removed in Python 3.9. Replace all instances of Element.getiterator(tag) with Element.iter(tag) in ase/io/exciting.py. This MR also removes extraneous whitespace on otherwise empty lines in that file
In short: xml.etree.ElementTree.Element.getiterator() has been deprecated since Python 2.7 and was removed in Python 3.9, so every Element.getiterator(tag) call should be replaced with Element.iter(tag).
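To check this on your own interpreter, here is a minimal sketch that uses only the standard library. The XML string and the variable names are made up for illustration; xlrd itself parses xl/workbook.xml from inside the .xlsx archive.

```python
import io
import sys
import xml.etree.ElementTree as ET

# Illustrative XML only, not the real contents of an .xlsx workbook.
xml_text = "<workbook><sheets><sheet name='Sheet1'/></sheets></workbook>"
tree = ET.parse(io.StringIO(xml_text))

# getiterator() was removed in Python 3.9, so this is False on 3.9 and later.
has_getiterator = hasattr(tree, "getiterator")
print(sys.version_info[:2], "has getiterator:", has_getiterator)

# iter() is the replacement: it walks the root and all descendants in document order.
tags = [elem.tag for elem in tree.iter()]
print(tags)
```

On Python 3.9 or later the first print should report False, while the iter() loop works the same way on older Python 3 versions.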
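At this point the case is basically closed: it is not the code, not the Excel file, and not the path (ruling those out one by one was exhausting). What I still do not understand is why the xlrd library I had just updated was not changed to match once Element.getiterator() was removed (and I do not dare ask).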
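- Solution
Locate your xlrd\xlsx.py: Win+R -> cmd -> pip show xlrd (the Location field is the site-packages directory that contains the xlrd folder).
Open xlsx.py and find-and-replace every getiterator() call with iter() (back up xlsx.py first), then save.
- Resolved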
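The manual backup and find-and-replace described in the solution can also be scripted. The following is only a sketch: patch_getiterator is a hypothetical helper name, not part of xlrd or pandas, and it assumes that the only change needed is turning .getiterator( into .iter( in the file you pass in.

```python
import shutil
from pathlib import Path


def patch_getiterator(xlsx_py: Path) -> int:
    """Back up xlsx_py, then replace every .getiterator( call with .iter(.

    Hypothetical helper for illustration. Returns the number of replacements;
    0 means nothing matched and the file was left untouched.
    """
    data = xlsx_py.read_bytes()
    count = data.count(b".getiterator(")
    if count == 0:
        return 0
    # Keep a copy next to the original so the change can be undone.
    backup = xlsx_py.with_name(xlsx_py.name + ".bak")
    shutil.copy2(xlsx_py, backup)
    # Work on bytes so the file's encoding and line endings are preserved.
    xlsx_py.write_bytes(data.replace(b".getiterator(", b".iter("))
    return count
```

To use it, pass the full path of xlsx.py inside the xlrd folder reported by pip show xlrd. Running it a second time returns 0 and does not overwrite the backup.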