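In machine learning work, when the file to be processed is a gigabyte or more in size, calling pandas.read_csv(r'D:\format\total_csv_till20180405.csv') directly is error-prone and the file may fail to load. In that case, the following code can be used to read the file chunk by chunk and concatenate the chunks into a single DataFrame.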
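import pandas as pd
# iterator=True returns a TextFileReader instead of loading the whole file at once
reader = pd.read_csv(r'D:\total_csv_till20180405.csv', iterator=True)
loop = True
chunkSize = 100000  # number of rows to read per chunk
chunks = []
while loop:
    try:
        chunk = reader.get_chunk(chunkSize)
        chunks.append(chunk)
    except StopIteration:
        # get_chunk raises StopIteration once the file is exhausted
        loop = False
        print("Iteration is stopped.")
# Concatenate all chunks into one DataFrame with a fresh 0..n-1 index
df = pd.concat(chunks, ignore_index=True)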
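
The same chunked read can also be written with the chunksize parameter of read_csv, which returns an iterator that a for loop or list comprehension can consume directly, so no explicit StopIteration handling is needed. The following is a minimal sketch, not the only way to do it: the helper name read_csv_in_chunks and the small demo file are hypothetical and exist only for illustration, and using the reader as a context manager assumes pandas 1.2 or later.

```python
import os
import tempfile

import pandas as pd


def read_csv_in_chunks(path, chunk_size=100_000):
    """Read a CSV in pieces of chunk_size rows and return one DataFrame.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Passing chunksize makes read_csv return an iterator of DataFrames.
    # The with-block closes the underlying file handle (pandas >= 1.2).
    with pd.read_csv(path, chunksize=chunk_size) as reader:
        chunks = [chunk for chunk in reader]
    # ignore_index=True rebuilds a continuous 0..n-1 index across chunks.
    return pd.concat(chunks, ignore_index=True)


# Demonstration on a small throwaway file (made-up data, 10 rows).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.csv")
    pd.DataFrame({"a": range(10), "b": range(10, 20)}).to_csv(demo_path, index=False)
    demo_df = read_csv_in_chunks(demo_path, chunk_size=3)

print(demo_df.shape)
```

Note that either version still builds the full DataFrame in memory at the end. If the concatenated result is itself too large, each chunk can be filtered or aggregated inside the loop before it is appended.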