Is there a limit on the size of a file Python can read? File size limit for read()?

Most recent recommended article published on 2022-09-05 01:03:25

weixin_39672979

Article link: https://blog.csdn.net/weixin_39672979/article/details/111448771

I'm running into a problem while trying to load large files using Python 3.5. Calling read() with no arguments sometimes raises OSError: Invalid argument. Reading only part of the file works fine, and I've determined that it starts to fail somewhere around 2.2 GB. Example code is below:

>>> import sys

>>> sys.version

'3.5.1 (v3.5.1:37a07cee5969, Dec 5 2015, 21:12:44) \n[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)]'

>>> x = open('/Users/username/Desktop/large.txt', 'r').read()

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

OSError: [Errno 22] Invalid argument

>>> x = open('/Users/username/Desktop/large.txt', 'r').read(int(2.1*10**9))

>>> x = open('/Users/username/Desktop/large.txt', 'r').read(int(2.2*10**9))

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

OSError: [Errno 22] Invalid argument

I also noticed that this does not happen in Python 2.7. Here is the same code run in Python 2.7:

>>> import sys

>>> sys.version

'2.7.10 (default, Aug 22 2015, 20:33:39) \n[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.1)]'

>>> x = open('/Users/username/Desktop/large.txt', 'r').read(int(2.1*10**9))

>>> x = open('/Users/username/Desktop/large.txt', 'r').read(int(2.2*10**9))

>>> x = open('/Users/username/Desktop/large.txt', 'r').read()

>>>

I am using OS X El Capitan 10.11.1.

Is this a bug, or should I use another method for reading the files?

Solution

Yes, you have bumped into a bug.

The good news is that someone else has also found it and already created an issue for it in the Python bug tracker: Issue24658 - open().write() fails on 2 GB+ data (OS X). It seems to be platform dependent (OS X only) and is reproducible with both read and write. Apparently the problem lies in the way fread.c is implemented in the libc for OS X.

The bad news is that, at the time of writing, the issue is still open (and currently inactive), so you'll have to wait until it is resolved. Either way, you can take a look at the discussion there if you're interested in the specifics.

Regardless, I'm pretty sure you can sidestep the issue until it is fixed by reading the file in chunks and joining the chunks during processing, and by doing the same in reverse when writing. It's clunky, but it might do the trick.
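The chunked workaround could look roughly like the sketch below. It is only an illustration, not a tested fix for the bug: the helper names read_in_chunks and write_in_chunks and the 1 GiB CHUNK_SIZE are hypothetical choices, and the file is opened in binary mode (the question used text mode) so that each read() request stays a known number of bytes below the size at which the failures appear.

```python
# Hypothetical chunk size: 1 GiB, chosen only because it is well below the
# roughly 2 GB mark at which the failures in the question start to appear.
CHUNK_SIZE = 1 << 30


def read_in_chunks(path, chunk_size=CHUNK_SIZE):
    """Yield the file's contents as bytes objects of at most chunk_size bytes."""
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            yield chunk


def write_in_chunks(path, data, chunk_size=CHUNK_SIZE):
    """Write a bytes-like object with several smaller write() calls."""
    view = memoryview(data)  # slicing a memoryview does not copy the data
    with open(path, 'wb') as f:
        for start in range(0, len(view), chunk_size):
            f.write(view[start:start + chunk_size])
```

To get the whole file in memory, as the original read() call did, the chunks can be joined with b''.join(read_in_chunks('/Users/username/Desktop/large.txt')). If text is needed, the result can then be decoded, for example with .decode('utf-8'), assuming that is the file's encoding. Where possible, processing each chunk as it arrives avoids holding several gigabytes in memory at once.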