read(size=-1)
Read up to size bytes from the object and return them. As a convenience, if size is unspecified or -1, all bytes until EOF are returned. Otherwise, only one system call is ever made. Fewer than size bytes may be returned if the operating system call returns fewer than size bytes.
If 0 bytes are returned, and size was not 0, this indicates end of file. If the object is in non-blocking mode and no bytes are available, None is returned.
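The end-of-file rule for read() can be illustrated with a minimal sketch that reads a stream in fixed-size chunks until an empty bytes object comes back. It uses io.BytesIO as an in-memory stand-in for a real file; the helper name read_chunks and the chunk size of 4 are assumptions made for this example only.

```python
import io

def read_chunks(stream, size):
    """Yield successive chunks of at most `size` bytes until end of file.

    `read_chunks` is a hypothetical helper, not part of the io module.
    """
    while True:
        chunk = stream.read(size)
        if chunk is None:
            # Non-blocking raw stream with no data available right now.
            # This sketch simply stops; a real program would wait and retry.
            break
        if not chunk:
            # 0 bytes returned for a non-zero size: end of file.
            break
        yield chunk

stream = io.BytesIO(b"abcdefghij")
chunks = list(read_chunks(stream, 4))
print(chunks)  # [b'abcd', b'efgh', b'ij']

# With size left unspecified, everything up to EOF is returned at once.
everything = io.BytesIO(b"abc").read()
print(everything)  # b'abc'
```

The last chunk is shorter than the requested size, which is expected: size is an upper bound, not a guarantee.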
readline(size=-1)
Read and return one line from the stream. If size is specified, at most size bytes will be read.
The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.
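The following sketch shows both points: a size cap that stops readline() in the middle of a line, and the effect of the newline argument to open() on a text file. The temporary file and its name sample.txt are assumptions for the example; only standard-library calls are used.

```python
import io
import os
import tempfile

# Binary stream: the terminator is always b"\n".
binary = io.BytesIO(b"first line\nsecond\n")
capped = binary.readline(5)  # size=5 stops in the middle of the line
rest = binary.readline()     # continues where the capped read stopped
second = binary.readline()
at_eof = binary.readline()   # empty bytes object at end of stream

# Text file: the newline argument to open() selects the recognized terminators.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"one\r\ntwo\n")
    with open(path, newline=None) as f:  # universal newlines, translated to "\n"
        translated = f.readline()
    with open(path, newline="") as f:    # universal newlines, left untranslated
        untranslated = f.readline()

print(capped, rest, second, at_eof)  # b'first' b' line\n' b'second\n' b''
print(repr(translated), repr(untranslated))  # 'one\n' 'one\r\n'
```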
readlines(hint=-1)
Read and return a list of lines from the stream. hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
Note that it’s already possible to iterate on file objects using for line in file: ... without calling file.readlines().
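A small sketch of how hint behaves, using io.StringIO as a stand-in for a text file. The sample data (five lines of six characters each) and the hint value of 10 are assumptions chosen so that the cutoff falls clearly between two lines.

```python
import io

# Five lines, each 6 characters long including the newline.
text = "".join(f"line{i}\n" for i in range(5))

stream = io.StringIO(text)
# The first line (6 characters) stays under the hint; the second pushes the
# running total to 12, past the hint of 10, so reading stops after it.
first_batch = stream.readlines(10)
# Without a hint, the rest of the stream is returned.
remaining = stream.readlines()

# Iterating over the stream yields the same lines one at a time,
# without first collecting them in a list.
iterated = [line for line in io.StringIO(text)]

print(first_batch)  # ['line0\n', 'line1\n']
print(remaining)    # ['line2\n', 'line3\n', 'line4\n']
```

Note that hint bounds the total size only approximately: the line that crosses the threshold is still included in the returned list.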
Processing a 7 GB text file:
# File: readline-example-3.py
# Read the file in batches of lines, using a size hint of 100000 per batch.
with open("sample.txt") as file:
    while True:
        lines = file.readlines(100000)
        if not lines:
            break
        for line in lines:
            pass  # do something
The code above processes 96,900 lines of text per second. Other authors suggest using islice():
from itertools import islice

# n is the number of lines to read per chunk.
with open(...) as f:
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        # process next_n_lines
list(islice(f, n)) will return a list of the next n lines of the file f. Using it inside a loop will give you the file in chunks of n lines.
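The islice() loop can be wrapped in a generator so the chunking logic is reusable. This is a minimal sketch rather than anything prescribed by the sources quoted here: the name chunks_of_lines is hypothetical, and io.StringIO stands in for a real file.

```python
import io
from itertools import islice

def chunks_of_lines(f, n):
    """Yield lists of up to n lines from the file-like object f.

    `chunks_of_lines` is a hypothetical helper written for this example.
    """
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            # islice produced nothing, so the file is exhausted.
            break
        yield next_n_lines

f = io.StringIO("a\nb\nc\nd\ne\n")
chunks = list(chunks_of_lines(f, 2))
print(chunks)  # [['a\n', 'b\n'], ['c\n', 'd\n'], ['e\n']]
```

Because islice() pulls lines lazily from the file iterator, only n lines are held in memory at a time, and the final chunk may be shorter than n.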
itertools.islice(iterable, start, stop[, step])
Make an iterator that returns selected elements from the iterable. If start is non-zero, then elements from the iterable are skipped until start is reached. Afterward, elements are returned consecutively unless step is set higher than one which results in items being skipped. If stop is None, then iteration continues until the iterator is exhausted, if at all; otherwise, it stops at the specified position. Unlike regular slicing, islice() does not support negative values for start, stop, or step. Can be used to extract related fields from data where the internal structure has been flattened (for example, a multi-line report may list a name field on every third line). Roughly equivalent to:
import sys

def islice(iterable, *args):
    # islice('ABCDEFG', 2) --> A B
    # islice('ABCDEFG', 2, 4) --> C D
    # islice('ABCDEFG', 2, None) --> C D E F G
    # islice('ABCDEFG', 0, None, 2) --> A C E G
    s = slice(*args)
    start, stop, step = s.start or 0, s.stop or sys.maxsize, s.step or 1
    it = iter(range(start, stop, step))
    try:
        nexti = next(it)
    except StopIteration:
        # Consume *iterable* up to the *start* position.
        for i, element in zip(range(start), iterable):
            pass
        return
    try:
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)
    except StopIteration:
        # Consume to *stop*.
        for i, element in zip(range(i + 1, stop), iterable):
            pass
If start is None, then iteration starts at zero. If step is None, then the step defaults to one.
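The rules for start, stop, and step can be checked with the real itertools.islice. The sketch below also includes the "name field on every third line" case mentioned in the description; the report data is made up for illustration.

```python
from itertools import islice

letters = "ABCDEFG"
first_two = list(islice(letters, 2))              # stop only
middle = list(islice(letters, 2, 4))              # start and stop
tail = list(islice(letters, 2, None))             # stop=None: run to exhaustion
every_other = list(islice(letters, 0, None, 2))   # step of 2 skips items
defaults = list(islice(letters, None, 3, None))   # start=None -> 0, step=None -> 1

# Flattened report (made-up data): the name sits on every third line.
report = ["Alice", "30", "Paris", "Bob", "25", "Oslo"]
names = list(islice(report, 0, None, 3))

# Unlike regular slicing, negative values are rejected.
negative_rejected = False
try:
    islice(letters, -1)
except ValueError:
    negative_rejected = True

print(first_two, middle, tail, every_other, defaults)
print(names)  # ['Alice', 'Bob']
```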