Python gzip: OverflowError size does not fit in an int

I am trying to serialize a large Python object, a tuple of NumPy arrays, using pickle/cPickle and gzip. The procedure works up to a certain data size; beyond that I get the following error:

    --> 121 cPickle.dump(dataset_pickle, f)

    ***/gzip.pyc in write(self, data)
        238 print(type(self.crc))
        239 print(self.crc)
    --> 240 self.crc = zlib.crc32(data, self.crc) & 0xffffffffL
        241 self.fileobj.write( self.compress.compress(data) )

    OverflowError: size does not fit in an int

The NumPy array is around 1.5 GB, and the string passed to zlib.crc32 exceeds 2 GB. I am working on a 64-bit machine and my Python is also 64-bit:

    >>> import sys
    >>> sys.maxsize
    9223372036854775807

Is this a bug in Python, or am I doing something wrong? Are there any good alternatives for compressing and serializing NumPy arrays? I am looking at numpy.savez, PyTables and HDF5 right now, but it would be good to know why I am having this problem, since I have enough memory.

Update: I remember reading somewhere that this could be caused by using an old version of NumPy (and I was using one), but I have since switched entirely to numpy.save/savez, which is actually faster than cPickle (at least in my case).
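For readers who want to try the numpy.savez route mentioned in the update, here is a minimal sketch. It assumes the dataset is a tuple of arrays and uses numpy.savez_compressed, the compressed variant of savez. The helper names save_arrays and load_arrays and the file name dataset.npz are made up for illustration, and the tiny arrays only stand in for the real data.

```python
import os
import tempfile

import numpy as np


def save_arrays(path, arrays):
    """Store a tuple of arrays in one compressed .npz archive (hypothetical helper)."""
    # Arrays passed positionally are stored under the keys arr_0, arr_1, ...
    np.savez_compressed(path, *arrays)


def load_arrays(path):
    """Read the arrays back in their original order (hypothetical helper)."""
    with np.load(path) as archive:
        return tuple(archive["arr_%d" % i] for i in range(len(archive.files)))


# Small stand-in data; the arrays in the question were around 1.5 GB.
dataset = (
    np.arange(12, dtype=np.float64).reshape(3, 4),
    np.ones(5, dtype=np.int32),
)
path = os.path.join(tempfile.mkdtemp(), "dataset.npz")
save_arrays(path, dataset)
restored = load_arrays(path)
```

Whether this is faster or smaller than gzip plus cPickle depends on the data, as the update itself notes.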

Tags: python, numpy, serialization, gzip, pickle


asked May 21 '15 at 14:07 by gsmafra, edited Feb 28 '16 at 15:24

1 Answer (accepted)

This seems to be a bug in Python 2.7.

From inspecting the bug report, it does not look like there is a pending fix for it. Your best bet would be to move to Python 3, which apparently does not exhibit this bug.
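If moving off the affected interpreter is not an option, one possible workaround is to keep every individual write to the gzip stream small, since the traceback shows the failure inside a single zlib.crc32 call on an oversized buffer. The sketch below assumes that splitting the pickled bytes into slices avoids that call. It is written for and only checked on Python 3, and dump_gzip_chunked and load_gzip are hypothetical helpers, not part of gzip or pickle.

```python
import gzip
import os
import pickle
import tempfile


def dump_gzip_chunked(obj, path, chunk_size=1 << 30):
    """Pickle obj, then write the bytes to a gzip file in bounded slices."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    view = memoryview(payload)  # slicing a memoryview does not copy the data
    with gzip.open(path, "wb") as f:
        for start in range(0, len(view), chunk_size):
            # Each call hands the gzip writer at most chunk_size bytes.
            f.write(view[start:start + chunk_size])


def load_gzip(path):
    """Read the pickled object back from the gzip file."""
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Tiny stand-in object; a tuple of NumPy arrays would be passed the same way.
sample = {"values": list(range(1000)), "label": "demo"}
path = os.path.join(tempfile.mkdtemp(), "sample.pkl.gz")
dump_gzip_chunked(sample, path, chunk_size=64)
restored = load_gzip(path)
```

Note the trade-off: pickle.dumps builds the whole serialized payload in memory before anything is written, so peak memory use is roughly the object plus its pickle.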

answered Jul 4 '16 at 5:34 by Perennial

Comment: Looks like the issue was closed. – Francisco Couzo, Nov 1 '16 at 18:58
