Suppose we need to extract every log line containing 'error' from a large 4 GB file, test.log:
- The first idea most people have is to open the file and check it line by line, which took about 19 s:
    import time

    start_time = time.time()
    # readlines() loads every line into a list before the loop starts
    with open('/Users/test/Downloads/test.log') as f:
        for line in f.readlines():
            if 'error' in line:
                print(line)
    end_time = time.time()
    print("cost time is: %s" % (end_time - start_time))
    # cost time is: 19.1660439968
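- List comprehension: most of the work is done inside the Python interpreter, so it is much faster than the equivalent statement form, especially on large files. On the same file, the list comprehension took about 4.5 s, a large improvement: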
    import time

    start_time = time.time()
    # iterate over the file object directly and keep only the matching lines
    lines = [line for line in open('/Users/test/Downloads/test.log') if 'error' in line]
    for line in lines:
        print(line)
    end_time = time.time()
    print("cost time is: %s" % (end_time - start_time))
    # cost time is: 4.45141601562
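
The timings above come from one machine and one file, so it may help to rerun the comparison locally. Below is a minimal sketch, not the original benchmark. It writes a small throwaway log file and times both approaches on it. The helper names (`find_with_loop`, `find_with_comprehension`, `make_sample_log`, `timed`), the sample line format, and the file size are all assumptions made for illustration. Unlike the snippets above, both functions collect the matches instead of printing them, so that printing does not dominate the measurement.

```python
import os
import tempfile
import time


def find_with_loop(path, keyword="error"):
    """Collect matching lines with an explicit for loop over readlines()."""
    matches = []
    with open(path) as f:
        for line in f.readlines():  # loads every line into memory first
            if keyword in line:
                matches.append(line)
    return matches


def find_with_comprehension(path, keyword="error"):
    """Collect matching lines with a list comprehension over the file object."""
    with open(path) as f:
        return [line for line in f if keyword in line]


def make_sample_log(n_lines=200000):
    """Write a throwaway log file (hypothetical content) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".log")
    with os.fdopen(fd, "w") as f:
        for i in range(n_lines):
            # every tenth line is an error line; the rest are info lines
            level = "error" if i % 10 == 0 else "info"
            f.write("%d %s sample message\n" % (i, level))
    return path


def timed(func, path):
    """Return (result, elapsed seconds) for a single call of func(path)."""
    start = time.time()
    result = func(path)
    return result, time.time() - start


if __name__ == "__main__":
    sample = make_sample_log()
    try:
        loop_result, loop_cost = timed(find_with_loop, sample)
        comp_result, comp_cost = timed(find_with_comprehension, sample)
        print("for loop:           %.4f s, %d matches" % (loop_cost, len(loop_result)))
        print("list comprehension: %.4f s, %d matches" % (comp_cost, len(comp_result)))
    finally:
        os.remove(sample)  # clean up the temporary file
```

The absolute numbers will differ from the 19 s and 4.5 s reported above, since they depend on file size, disk cache, and Python version. For a real measurement, point both functions at the actual log file and repeat each run a few times.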