Python file too large to read: reading large files in Python

read(size=-1)

Read up to size bytes from the object and return them. As a convenience, if size is unspecified or -1, all bytes until EOF are returned. Otherwise, only one system call is ever made. Fewer than size bytes may be returned if the operating system call returns fewer than size bytes.

If 0 bytes are returned, and size was not 0, this indicates end of file. If the object is in non-blocking mode and no bytes are available, None is returned.
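As a rough illustration of the read(size) behavior described above, the sketch below pulls a binary stream through memory in bounded blocks instead of loading it whole. The helper name iter_chunks and the 64 KiB default are assumptions made for this example, not part of the original post, and the loop assumes a blocking stream (in non-blocking mode read() may return None, which this loop would treat as the end).

```python
import io


def iter_chunks(stream, chunk_size=64 * 1024):
    """Yield successive blocks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty bytes object signals end of file
            break
        yield chunk


# Demonstration on an in-memory stream; a file opened with open(path, "rb")
# can be passed in exactly the same way.
demo = io.BytesIO(b"abcdefghij")
chunks = list(iter_chunks(demo, chunk_size=4))
```

Only one block is held in memory at a time, so peak memory use is bounded by chunk_size rather than by the file size.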

readline(size=-1)

Read and return one line from the stream. If size is specified, at most size bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.
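The effect of the size argument can be seen on a small in-memory binary stream. This is only a sketch: the sample data and variable names are made up for illustration.

```python
import io

stream = io.BytesIO(b"first line\nsecond line\n")

head = stream.readline(5)      # stops after 5 bytes, before the terminator
rest = stream.readline()       # continues with the remainder of the same line
following = stream.readline()  # the next complete line, terminator included
at_eof = stream.readline()     # an empty bytes object once the stream is exhausted
```

A size limit is a useful guard when a file might contain a single extremely long line, since an unbounded readline() would otherwise read that whole line into memory.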

readlines(hint=-1)

Read and return a list of lines from the stream. hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

Note that it's already possible to iterate on file objects using for line in file: ... without calling file.readlines().
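Iterating directly over the file object, as the note above mentions, keeps only one line in memory at a time. A minimal sketch follows; the function name count_nonblank_lines and the temporary sample file are assumptions used purely for demonstration.

```python
import os
import tempfile


def count_nonblank_lines(path, encoding="utf-8"):
    """Count lines containing non-whitespace text without loading the whole file."""
    count = 0
    with open(path, encoding=encoding) as f:
        for line in f:  # the file object yields one line per iteration
            if line.strip():
                count += 1
    return count


# Build a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("alpha\n\nbeta\ngamma\n")

result = count_nonblank_lines(tmp.name)
os.remove(tmp.name)
```

The same loop works unchanged on a multi-gigabyte file, because nothing accumulates except the counter.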

Processing a 7 GB text file:

# File: readline-example-3.py

with open("sample.txt") as file:
    while True:
        # Read a batch of lines totalling roughly 100,000 bytes/characters.
        lines = file.readlines(100000)
        if not lines:
            break
        for line in lines:
            pass  # do something

This approach processed about 96,900 lines of text per second. Other authors suggest using islice():

from itertools import islice

with open(...) as f:
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        # process next_n_lines

list(islice(f, n)) returns a list of the next n lines of the file f. Using it inside a loop gives you the file in chunks of n lines.
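The chunking loop can also be wrapped in a reusable generator. The sketch below is one possible packaging, not something from the original post: the name batches and the ValueError check are assumptions, and a plain list stands in for an open file so the example runs on its own.

```python
from itertools import islice


def batches(iterable, n):
    """Yield lists of up to n consecutive items until the iterable is exhausted."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(iterable)  # one shared iterator, so each islice call resumes where the last stopped
    while True:
        batch = list(islice(it, n))
        if not batch:
            return
        yield batch


# A list of strings stands in for an open text file here.
sample_lines = ["l1\n", "l2\n", "l3\n", "l4\n", "l5\n"]
grouped = list(batches(sample_lines, 2))
```

With a real file this would be used as for chunk in batches(f, n): ..., where the last chunk may hold fewer than n lines.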

itertools.islice(iterable, start, stop[, step])

Make an iterator that returns selected elements from the iterable. If start is non-zero, then elements from the iterable are skipped until start is reached. Afterward, elements are returned consecutively unless step is set higher than one which results in items being skipped. If stop is None, then iteration continues until the iterator is exhausted, if at all; otherwise, it stops at the specified position. Unlike regular slicing, islice() does not support negative values for start, stop, or step. Can be used to extract related fields from data where the internal structure has been flattened (for example, a multi-line report may list a name field on every third line). Roughly equivalent to:

import sys

def islice(iterable, *args):
    # islice('ABCDEFG', 2) --> A B
    # islice('ABCDEFG', 2, 4) --> C D
    # islice('ABCDEFG', 2, None) --> C D E F G
    # islice('ABCDEFG', 0, None, 2) --> A C E G
    s = slice(*args)
    start, stop, step = s.start or 0, s.stop or sys.maxsize, s.step or 1
    it = iter(range(start, stop, step))
    try:
        nexti = next(it)
    except StopIteration:
        # Consume *iterable* up to the *start* position.
        for i, element in zip(range(start), iterable):
            pass
        return
    try:
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)
    except StopIteration:
        # Consume to *stop*.
        for i, element in zip(range(i + 1, stop), iterable):
            pass

If start is None, then iteration starts at zero. If step is None, then the step defaults to one.
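The "name field on every third line" case mentioned in the description can be sketched with the real itertools.islice as follows. The report contents are invented sample data.

```python
from itertools import islice

# A flattened report: name, age, city repeated for each record (made-up data).
report = ["Alice", "30", "Paris", "Bob", "25", "Oslo", "Carol", "41", "Lima"]

names = list(islice(report, 0, None, 3))  # start at 0, no stop, take every third item
first_two = list(islice(report, 2))       # a single argument is treated as stop
middle = list(islice(report, 2, 4))       # items at positions 2 and 3
```

Because islice() works on any iterator, the same calls apply to an open file object, where they select lines without reading the rest of the file into memory.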
