[Python] Reading Large Files Line by Line with a Generator

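We rarely need to read files that are one or several gigabytes in size. But if you had to read a 500 GB file, loading it with f.read() is out of the question, because the whole file would not fit in memory. A file larger than the available memory has to be read from disk into memory piece by piece, and this is where generators become especially useful. Here is the code; it is very simple.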

A generator that reads a file line by line:

def read_file(file):
    with open(file, mode='r', encoding='utf8') as f:
        while True:
            one_line = f.readline()
            if not one_line:    # readline() returns '' only at end of file
                return
            yield one_line.strip()  # a blank line is yielded as ''
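As a variation on the generator above, here is a minimal sketch of two related approaches. The names iter_lines and iter_chunks are hypothetical and chosen only for illustration. The first relies on the fact that a Python file object is itself a lazy iterator over lines; the second reads fixed-size binary chunks, which may help when a file contains no line breaks at all.

```python
def iter_lines(path, encoding="utf8"):
    """Yield each line of a text file without its trailing newline."""
    with open(path, mode="r", encoding=encoding) as f:
        for line in f:  # the file object reads lazily, one line at a time
            yield line.rstrip("\n")


def iter_chunks(path, chunk_size=1024 * 1024):
    """Yield fixed-size binary chunks (hypothetical helper, 1 MiB by default)."""
    with open(path, mode="rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # b"" means end of file
                break
            yield chunk
```

In both cases only one line or one chunk is held in memory at a time, so memory use should stay roughly constant regardless of file size.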

Reading a CSV file line by line

Part of a file named student_info.csv looks like this:

Name,Date,English,Math,Chinese,Money,Other_1,Other_2,Other_3
XiaoMing,2020-07-20T01:07:00Z,42,93,0.45,3077,5739,0.54,1
XiaoHu,2020-07-20T01:07:31Z,320,852,0.38,18874,37143,0.51,1
XiaoWang,2020-07-20T01:07:48Z,38,118,0.32,3581,34875,0.1,1
XiaoYe,2020-07-20T01:12:48Z,312,477,0.65,3210,4935,0.65,1
XiaoAn,2020-07-20T01:14:04Z,163,263,0.62,2152,4117,0.52,1
XiaoChen,2020-07-20T01:17:30Z,8,10,0.8,423,777,0.54,0.98
XiaoPeng,2020-07-20T01:17:44Z,5,9,0.56,1053,1398,0.75,1
XiaoHong,2020-07-20T01:19:33Z,392,797,0.49,8969,15366,0.58,1
XiaoNing,2020-07-20T01:20:59Z,24,41,0.59,1387,2677,0.52,1
XiaoJing,2020-07-20T01:23:22Z,16,53,0.3,696,1378,0.51,1
XiaoChong,2020-07-20T01:23:45Z,76,111,0.68,2127,2713,0.78,1
XiaoMing,2020-07-20T01:24:01Z,52,135,0.39,3251,6695,0.49,0.99

Now use the generator to read the file line by line and print the first 5 rows, using only the first 6 columns:

import csv
from collections import namedtuple

def read_file(file):
    with open(file, mode='r', encoding='utf8') as f:
        while True:
            one_line = f.readline()
            if not one_line:    # readline() returns '' only at end of file
                return
            yield one_line.strip()


lines = read_file("student_info.csv")   # lines is a generator
csv_reader = csv.reader(lines)

header = next(csv_reader)[:6]   # use only the first 6 columns
print(header)

# Student = namedtuple("Student", header)
Student = namedtuple("Student", "Name Date English Math Chinese Money")

rows = (row for row in csv_reader if row)   # skip blank lines
for index, row in enumerate(rows):
    student = Student._make(row[:6])  # map the first 6 columns onto the namedtuple
    print(row[:4])      # print the first 4 columns
    print(student.Name, student.Date, student.English, student.Math, student.Chinese, student.Money)
    if index == 4:      # stop after the first 5 rows
        break
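The same "first few rows, first few columns" idea can also be sketched with itertools.islice, passing the open file object straight to csv.reader. This is only an illustrative alternative: the function name first_rows and its parameters n_rows and n_cols are hypothetical, and the sketch assumes the file has a header row.

```python
import csv
from itertools import islice


def first_rows(path, n_rows=5, n_cols=6):
    """Return the header and the first n_rows data rows, cut to n_cols columns."""
    # newline="" is what the csv module recommends when opening files for it
    with open(path, mode="r", encoding="utf8", newline="") as f:
        reader = csv.reader(f)           # reads lazily from the file object
        header = next(reader)[:n_cols]   # assumes the first row is the header
        # islice stops after n_rows rows, so the rest of the file is never read
        rows = [row[:n_cols] for row in islice(reader, n_rows)]
    return header, rows
```

Calling first_rows("student_info.csv") would return the six column names and at most five truncated rows. Because islice stops pulling from the reader early, no index bookkeeping or explicit break is needed.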