In Python, for a binary file, I can write this:
    buf_size = 1024 * 64  # this is an important size...
    with open(file, "rb") as f:
        while True:
            data = f.read(buf_size)
            if not data:
                break
            # deal with the data....
For a text file that I want to read line by line, I can write this:
    with open(file, "r") as file:
        for line in file:
            # deal with each line....
which is shorthand for:
    with open(file, "r") as file:
        for line in iter(file.readline, ""):
            # deal with each line....
That idiom is documented in PEP 234, but I have not been able to find a similar idiom for binary files.
I have tried:
    >>> with open('dups.txt','rb') as f:
    ...     for chunk in iter(f.read,''):
    ...         i+=1
    ...
    >>> i
    1  # 30 MB file, i==1 means read in one go...
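I tried iter(f.read(buf_size), '') instead, but that raises an error because of the parentheses after the callable in iter(): f.read(buf_size) is called immediately, so iter() receives the data rather than a callable.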
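I know I can write a function, but is there a way to use the default idiom, along the lines of for chunk in file:, with a buffer size instead of line-oriented reading?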
Thanks for putting up with a Python newbie trying to write his first nontrivial and idiomatic Python script.
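For anyone reading along, here is one possible approach, offered as a sketch under assumptions and not as the canonical idiom. functools.partial can bind the buffer size to f.read without calling it, which gives two-argument iter() the zero-argument callable it expects. The helper name read_chunks and the in-memory io.BytesIO stand-in are hypothetical and used only for illustration. In Python 3 the sentinel has to be b"", because a file opened in "rb" mode returns bytes.

```python
import io
from functools import partial

BUF_SIZE = 1024 * 64  # chunk size in bytes; same value as buf_size above


def read_chunks(f, size=BUF_SIZE):
    """Return an iterator over chunks of at most `size` bytes read from `f`.

    `f` is assumed to be a binary file object (hypothetical helper, not a
    standard-library function).
    """
    # partial(f.read, size) is a zero-argument callable, which is what
    # two-argument iter() requires; b"" is what read() returns at end of file.
    return iter(partial(f.read, size), b"")


# In-memory stand-in for a real file opened with open(path, "rb").
demo = io.BytesIO(b"x" * (BUF_SIZE * 2 + 10))
chunk_sizes = [len(chunk) for chunk in read_chunks(demo)]
print(chunk_sizes)  # [65536, 65536, 10]
```

With a real file this would read for chunk in read_chunks(f): inside the with block, which keeps the for loop shape of the line-oriented version.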