Reading large data in chunks with Python to avoid running out of memory

weixin520520

Category: Python interview questions. Tags: python

Article link: https://blog.csdn.net/weixin520520/article/details/105451313

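import pandas as pd


def read_data(file_name, chunk_size=1000):
    '''
    Read a CSV file in chunks and return it as a single DataFrame.

    file_name: path to the CSV file
    chunk_size: number of rows to read per chunk
    '''
    chunks = []
    # The with-block closes the file handle even if parsing fails
    with open(file_name, 'rb') as inputfile:
        # iterator=True returns a TextFileReader instead of loading the whole file
        reader = pd.read_csv(inputfile, iterator=True)
        while True:
            try:
                chunks.append(reader.get_chunk(chunk_size))
            except StopIteration:
                # Raised once the reader has no rows left
                print("Iteration is stopped.")
                break
    # Join the chunks into one DataFrame with a fresh 0..n-1 index
    return pd.concat(chunks, ignore_index=True)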
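
Note that read_data still concatenates every chunk, so the complete table ends up in memory once the function returns. When only an aggregate is needed, each chunk can be processed and then discarded, which keeps memory use bounded by the chunk size. The following is a minimal sketch of that idea, not part of the original post: the helper name sum_column_in_chunks and its column parameter are hypothetical, and it assumes the chosen column is numeric.

```python
import pandas as pd


def sum_column_in_chunks(file_name, column, chunk_size=1000):
    """Hypothetical helper: sum one numeric column without loading the whole file."""
    total = 0
    rows = 0
    # Passing chunksize makes read_csv yield DataFrames of at most chunk_size rows
    for chunk in pd.read_csv(file_name, chunksize=chunk_size):
        total += chunk[column].sum()  # aggregate this chunk only
        rows += len(chunk)            # the chunk can be garbage-collected afterwards
    return total, rows
```

Calling sum_column_in_chunks("data.csv", "price") would return the column total and the row count, where the file name and column name are placeholders for your own data.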