Scenario:
Reading a pickle file (containing a DataFrame) in Python with pandas.
Problem description:
Reading the pickle file (in this case a .pkl file compressed with gzip) fails with: ValueError: unsupported pickle protocol: 5
The code used to read the pickle file:
import pandas as pd
df = pd.read_pickle("filename.pkl", compression='gzip')
Root cause analysis:
The error clearly comes from pickle itself, so I went straight to the official pickle documentation.
It says:
There are currently 6 different protocols which can be used for pickling. The higher the protocol used, the more recent the version of Python needed to read the pickle produced.
Protocol version 0 is the original “human-readable” protocol and is backwards compatible with earlier versions of Python.
Protocol version 1 is an old binary format which is also compatible with earlier versions of Python.
Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes. Refer to PEP 307 for information about improvements brought by protocol 2.
Protocol version 3 was added in Python 3.0. It has explicit support for bytes objects and cannot be unpickled by Python 2.x. This was the default protocol in Python 3.0–3.7.
Protocol version 4 was added in Python 3.4. It adds support for very large objects, pickling more kinds of objects, and some data format optimizations. It is the default protocol starting with Python 3.8. Refer to PEP 3154 for information about improvements brought by protocol 4.
Protocol version 5 was added in Python 3.8. It adds support for out-of-band data and speedup for in-band data. Refer to PEP 574 for information about improvements brought by protocol 5.
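So protocol 5 only exists in Python 3.8 and later. That means the file was originally serialized with protocol 5 by pickle.dump under Python 3.8 or newer, and an older interpreter (3.6 in my case) cannot read it.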
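To confirm this diagnosis before reinstalling anything, one option is to compare the protocol recorded in the file with the highest protocol the running interpreter supports. The sketch below is only an illustration: the helper names `pickle_protocol` and `can_load` are hypothetical, and it relies on the fact that pickles written with protocol 2 or higher begin with a `\x80` PROTO opcode followed by the protocol number.

```python
import gzip
import pickle


def pickle_protocol(path, compressed=True):
    """Return the protocol number stored in a pickle file, or None.

    Hypothetical helper for illustration. Protocols 0 and 1 do not
    write a PROTO opcode, so None is returned for them.
    """
    opener = gzip.open if compressed else open
    with opener(path, "rb") as f:
        header = f.read(2)
    # Protocol >= 2 streams start with b"\x80" followed by the protocol byte.
    if len(header) == 2 and header[0] == 0x80:
        return header[1]
    return None


def can_load(path, compressed=True):
    """True if this interpreter supports the protocol used by the file."""
    proto = pickle_protocol(path, compressed)
    return proto is None or proto <= pickle.HIGHEST_PROTOCOL


# Example usage (assumes the "filename.pkl" from this post exists):
# print(pickle.HIGHEST_PROTOCOL, pickle_protocol("filename.pkl"))
```

On Python 3.6, `pickle.HIGHEST_PROTOCOL` is 4, so a file whose header reports 5 would be rejected, which matches the error above.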
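Solution:
Install Python 3.8 or later so that the file can be loaded. I uninstalled my old Python 3.6, installed a newer version, and the read succeeded. Next up: accessing the data in the DataFrame.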
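If upgrading the reading environment is not possible, a different approach (not the one used in this post) is to re-save the file with an older protocol from a machine that does have Python 3.8+. The following is a minimal sketch under that assumption; `resave_with_protocol` is a hypothetical helper name, and loading a pickled DataFrame this way still requires pandas to be installed where the function runs.

```python
import gzip
import pickle


def resave_with_protocol(src, dst, protocol=4):
    """Load a gzip-compressed pickle and write it back with an older protocol.

    Hypothetical helper: must be run on an interpreter that can read
    the source file (Python 3.8+ for protocol 5).
    """
    with gzip.open(src, "rb") as f:
        obj = pickle.load(f)
    with gzip.open(dst, "wb") as f:
        # Protocol 4 can be read by Python 3.4 and later.
        pickle.dump(obj, f, protocol=protocol)


# Example usage (file names are placeholders):
# resave_with_protocol("filename.pkl", "filename_p4.pkl")
```

With pandas, the same effect should be achievable when writing the file in the first place via `df.to_pickle("filename_p4.pkl", compression="gzip", protocol=4)`. Note that this only addresses the pickle protocol; a large gap between pandas versions on the two machines may still cause problems when unpickling.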