Processing Very Large Files in Python

Reference: http://www.nikhilgopal.com/2010/12/dealing-with-large-files-in-python.html and Stack Overflow.

Line-by-line approaches:

1. Use the with statement:

with open('somefile.txt', 'r') as FILE:
    for line in FILE:
        pass  # operation on each line

This is equivalent to:

 

for line in open('somefile.txt'):
    process_data(line)
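As a quick sketch of why this pattern scales: iterating the file object yields one line at a time, so memory use stays constant no matter how large the file is. The file path and contents below are invented for illustration.

```python
import os
import tempfile

# Build a small stand-in for "somefile.txt" (path and contents are made up).
path = os.path.join(tempfile.mkdtemp(), 'somefile.txt')
with open(path, 'w') as f:
    f.write('alpha\nbeta\ngamma\n')

# Only one line is held in memory at any moment.
count = 0
with open(path, 'r') as FILE:
    for line in FILE:
        count += 1

print(count)  # 3
```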

 

2. Use the fileinput module:

import fileinput

for line in fileinput.input('somefile.txt'):
    pass  # operation on each line
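One thing the snippet above doesn't show: fileinput can also chain several files into a single line stream, with filename() and filelineno() reporting where each line came from. A minimal sketch (the file names and contents are invented):

```python
import fileinput
import os
import tempfile

# Two small sample files (names and contents are invented).
d = tempfile.mkdtemp()
paths = []
for name, text in [('a.txt', 'one\ntwo\n'), ('b.txt', 'three\n')]:
    p = os.path.join(d, name)
    with open(p, 'w') as fh:
        fh.write(text)
    paths.append(p)

# fileinput chains the files into one stream of lines;
# filename() and filelineno() track the current position.
seen = []
for line in fileinput.input(paths):
    seen.append((os.path.basename(fileinput.filename()),
                 fileinput.filelineno(),
                 line.rstrip()))
fileinput.close()

print(seen)
```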

3. Maintain a buffer with an explicit size (both readlines and read accept a size hint):

BUFFER = int(10E6)  # 10-megabyte buffer
file = open('somefile.txt', 'r')
text = file.readlines(BUFFER)
while text:
    for t in text:
        pass  # operation on each line t
    text = file.readlines(BUFFER)
file.close()
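To see what the BUFFER argument actually does: readlines(hint) returns whole lines whose combined size is roughly hint bytes, so the loop walks the file one bounded batch at a time. The sketch below uses a deliberately tiny hint on an invented sample file so the batching is visible:

```python
import os
import tempfile

# A sample file of 1000 ten-byte lines (invented for illustration).
path = os.path.join(tempfile.mkdtemp(), 'somefile.txt')
with open(path, 'w') as fh:
    for i in range(1000):
        fh.write('line %04d\n' % i)

BUFFER = 100  # tiny on purpose, so each batch holds only ~10 lines
total_lines = 0
batches = 0
with open(path, 'r') as fh:
    batch = fh.readlines(BUFFER)
    while batch:
        total_lines += len(batch)
        batches += 1
        batch = fh.readlines(BUFFER)

print(total_lines)  # 1000 -- every line is seen exactly once
```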

4. Combine method 3 with yield:

def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data


f = open('really_big_file.dat')
for piece in read_in_chunks(f):
    process_data(piece)
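A common use of this generator is checksumming a file without ever loading it whole. The sketch below reuses read_in_chunks with the file opened in binary mode, so each chunk is bytes (the file name and contents are made up):

```python
import hashlib
import os
import tempfile

def read_in_chunks(file_object, chunk_size=1024):
    """Lazy generator: read a file piece by piece."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data

# A made-up binary file of 5000 bytes.
path = os.path.join(tempfile.mkdtemp(), 'really_big_file.dat')
with open(path, 'wb') as fh:
    fh.write(b'x' * 5000)

# Feed the hash one chunk at a time; memory stays bounded by chunk_size.
digest = hashlib.md5()
with open(path, 'rb') as fh:
    for piece in read_in_chunks(fh):
        digest.update(piece)

print(digest.hexdigest())
```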

5. Use iter with a sentinel:

f = open('really_big_file.dat')

def read1k():
    return f.read(1024)

# In text mode read() returns '' at EOF; for a binary file use b'' as the sentinel.
for piece in iter(read1k, ''):
    process_data(piece)

Another variant:

f = ...  # file-like object, i.e. supporting a read(size) method and
         # returning an empty string '' when there is nothing left to read

def chunked(file, chunk_size):
    return iter(lambda: file.read(chunk_size), '')

for data in chunked(f, 65536):
    pass  # process the data
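The same sentinel idiom can be written with functools.partial instead of a lambda. Note again that the sentinel must match the read type: '' for text mode, b'' for binary. A sketch with an invented binary file:

```python
import functools
import os
import tempfile

# An invented 600-byte binary file.
path = os.path.join(tempfile.mkdtemp(), 'really_big_file.dat')
with open(path, 'wb') as fh:
    fh.write(b'abcdef' * 100)

# iter(callable, sentinel) keeps calling read(256) until it returns b''.
chunks = []
with open(path, 'rb') as fh:
    for data in iter(functools.partial(fh.read, 256), b''):
        chunks.append(data)

print(len(chunks), sum(len(c) for c in chunks))  # 3 600
```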

 

Reposted from: https://www.cnblogs.com/warbean/p/3461452.html
