Ah, I misread the question, so here's a rewrite. I admit that using filehandler.readlines() below is a slap in my own face~
Actually, if the only concern is that the generated file gets too big, *nix ships with a small built-in utility, split, that is perfect for this: it can cut a large file into as many small pieces as you like.
In the code below, if you don't need the output split across files, you can simplify the write2file function accordingly, and the id_generator function plus its related imports (random, string) can be dropped.
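For example, a quick sketch of how split works (the file names and the -l value here are made up for illustration):

```shell
# Create a 10-line demo wordlist, then split it into 4-line pieces
# named parts_aa, parts_ab, parts_ac (default alphabetic suffixes)
seq 1 10 > words.txt
split -l 4 words.txt parts_
ls parts_*
```

The last piece simply holds the remainder (2 lines here), so no data is lost.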
def write2file(item):
    with open("dict.txt", "a") as fh, open("file1.txt", "r") as f1:
        for i in f1.readlines():
            for j in item:
                fh.write("{}{}\n".format(i.strip(), j))
import random
import string
from multiprocessing.dummy import Pool

def id_generator(size=8, chars=string.ascii_letters + string.digits):
    # random 8-character suffix so each worker writes its own output file
    return ''.join(random.choice(chars) for _ in range(size))

def generate_index(n, step=5):
    # yield (start, stop) slice bounds covering 0..n in chunks of `step`
    for i in range(0, n, step):
        if i + step < n:
            yield i, i + step
        else:
            yield i, None

def write2file(item):
    ext_id = id_generator()
    with open("dict_{}.txt".format(ext_id), "w") as fh, open("file1.txt", "r") as f1:
        for i in f1.readlines():
            for j in item:
                fh.write("{}{}\n".format(i.strip(), j))

def multi_process(lst):
    pool = Pool()
    pool.map(write2file, lst)  # was b_lst: use the parameter, not the global
    pool.close()
    pool.join()

if __name__ == "__main__":
    with open("file2.txt") as f2:
        _b_lst = [_.strip() for _ in f2.readlines()]
    b_lst = (_b_lst[i:j] for i, j in generate_index(len(_b_lst), 5))
    multi_process(b_lst)
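As a sanity check, the chunking logic of generate_index can be exercised on its own (the word list here is made up for illustration):

```python
def generate_index(n, step=5):
    # Yield (start, stop) slice bounds covering 0..n in chunks of `step`;
    # the final chunk uses None so the slice runs to the end of the list.
    for i in range(0, n, step):
        if i + step < n:
            yield i, i + step
        else:
            yield i, None

words = ["w{}".format(k) for k in range(12)]
chunks = [words[i:j] for i, j in generate_index(len(words), 5)]
print([len(c) for c in chunks])  # [5, 5, 2]
```

Each chunk becomes one task for the pool, so 12 suffixes produce three output files.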
The result, as pictured, is a batch of text files named dict_ plus an 8-character random string.
The contents of one of them, dict_3txVnToL.txt:
zhangwei123
zhangwei123456
zhangwei@123
zhangwei888
zhangwei999
wangwei123
wangwei123456
wangwei@123
wangwei888
wangwei999
...
Below is the old answer
To satisfy your craving, here's the code:
with open("file1") as f1, open("file2") as f2, open("new", "w") as new:
    b = f2.readline().strip()
    while b:
        a = f1.readline().strip()
        for i in range(5):
            if b:
                new.write("{}{}\n".format(a, b))
            else:
                break
            b = f2.readline().strip()
It only reads line by line, so it can handle a file of any size; energy-saving and eco-friendly. Sample result:
$ head new
zhangwei123
zhangwei123456
zhangwei@123
zhangwei888
zhangwei999
wangwei666
wangwei2015
wangwei2016
wangwei521
wangwei123
PS: As the answer above says, avoid the readlines method wherever possible; with limited memory, hitting a huge file would be a disaster.
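A minimal sketch of the memory-friendly pattern (the file name is made up for the demo): iterate the file object directly instead of calling readlines(), so only one line is held in memory at a time:

```python
# Write a tiny demo file, then stream it back line by line.
with open("demo_words.txt", "w") as f:
    f.write("zhangwei\nwangwei\n")

with open("demo_words.txt") as f:
    for line in f:           # lazy: the file is read one line at a time
        print(line.strip())
```

readlines() loads the entire file into a list up front; the plain for loop over the file object never does.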