1. Reading data in batches
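import pandas as pd

# train_file and test_file are the paths of the source CSV files (defined elsewhere).
# With chunksize set, read_csv returns an iterator of DataFrames, not a single DataFrame.
df_train_org = pd.read_csv(train_file, chunksize=10000000, iterator=True)
df_test_org = pd.read_csv(test_file, chunksize=10000000, iterator=True)
for i, chunk in enumerate(df_train_org):
    # Each chunk is a DataFrame with at most 10000000 rows (and n columns).
    # The first chunk creates the file and writes the header; later chunks are appended.
    chunk.to_csv('train_sample.csv', mode='w' if i == 0 else 'a', header=(i == 0), index=False)
for i, chunk in enumerate(df_test_org):
    # Each chunk is a DataFrame with at most 10000000 rows (and n columns).
    chunk.to_csv('test_sample.csv', mode='w' if i == 0 else 'a', header=(i == 0), index=False)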
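The chunked read-and-write pattern above can be tried on a small file first. The following is a minimal sketch, not the original author's code: the helper name `copy_csv_in_chunks`, the demo file names, and the tiny `chunksize=4` are all assumptions chosen for illustration.

```python
import os
import tempfile

import pandas as pd


def copy_csv_in_chunks(src_path, dst_path, chunksize):
    """Copy src_path to dst_path one chunk at a time; return the number of rows written.

    Hypothetical helper for illustration only.
    """
    total_rows = 0
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows.
    for i, chunk in enumerate(pd.read_csv(src_path, chunksize=chunksize)):
        # The first chunk creates the file with a header; later chunks are appended.
        chunk.to_csv(dst_path, mode="w" if i == 0 else "a", header=(i == 0), index=False)
        total_rows += len(chunk)
    return total_rows


# Build a small demo CSV in a temporary directory (file names are made up).
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "demo_in.csv")
dst = os.path.join(tmp_dir, "demo_out.csv")
pd.DataFrame({"a": range(10), "b": range(10, 20)}).to_csv(src, index=False)

rows_written = copy_csv_in_chunks(src, dst, chunksize=4)
```

Only one chunk is held in memory at a time, so peak memory depends on `chunksize` rather than on the size of the whole file.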
Reading CSV data with pandas