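Ops work often involves reading and writing large text files. For files of a few GB, reading and writing them with a generator is fast.
For files in the tens of GB, split them into N smaller files first, then process those.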
# coding:utf-8
"""
Huang Ge's Python remote video training course
https://github.com/pythonpeixun/article/blob/master/index.md
Huang Ge's Python training: preview video links
https://github.com/pythonpeixun/article/blob/master/python_shiping.md
"""
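import time

start_time = time.time()


def find_ip(path):
    """Yield the text that precedes '"Sogou web spider' on each matching line."""
    # Iterating over the file object reads one line at a time,
    # so the whole file is never loaded into memory at once.
    with open(path) as f:
        for line in f:
            s = line.find('"Sogou web spider')
            if s >= 0:
                yield line[:s].strip()


# A set keeps only the distinct values, so only those are held in memory.
unique_items = set(find_ip("bigfile.txt"))
for item in unique_items:
    print(item)
print(time.time() - start_time, "seconds")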
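
The script above covers the few-GB case. For the tens-of-GB case mentioned at the top, the split step might look roughly like the sketch below. It is only an illustration under some assumptions: the names `split_file`, `lines_per_chunk` and `out_dir`, and the `chunk_00000.txt` naming scheme, are made up here and are not from the original post. It also splits by a fixed number of lines per file rather than into a fixed number N of files, because that needs only one pass over the input.

```python
import os


def split_file(path, lines_per_chunk, out_dir):
    """Split a text file into chunk files of at most `lines_per_chunk` lines.

    Hypothetical helper: returns the chunk file paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    try:
        # Binary mode copies the bytes unchanged and avoids decoding errors.
        with open(path, "rb") as src:
            for i, line in enumerate(src):
                # Start a new chunk file every `lines_per_chunk` lines.
                if i % lines_per_chunk == 0:
                    if out is not None:
                        out.close()
                    name = "chunk_%05d.txt" % len(chunk_paths)
                    chunk_path = os.path.join(out_dir, name)
                    chunk_paths.append(chunk_path)
                    out = open(chunk_path, "wb")
                out.write(line)
    finally:
        # Close the last chunk even if reading or writing failed.
        if out is not None:
            out.close()
    return chunk_paths
```

Each returned path can then be passed to `find_ip` in turn, adding the results to a single set, so that only one chunk is being read at any moment. Splitting on line boundaries matters here: a line is never cut in half, so a match cannot be lost at a chunk edge.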