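Once the data volume gets large, Excel files can no longer keep up.
This post compares how fast pandas reads the same table from three file formats: pickle, CSV, and Excel.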
# -*- coding: UTF-8 -*-
import time
import pandas as pd
"""
Read-speed comparison:
csv vs. excel vs. pkl
"""
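# time.clock() was removed in Python 3.8; time.perf_counter() is the
# high-resolution replacement for measuring elapsed time.

# Time reading the pickle file
start = time.perf_counter()
df = pd.read_pickle('table.pkl')
elapsed = time.perf_counter() - start
print("PKL Time used:", elapsed)

# Time reading the CSV file
start = time.perf_counter()
df = pd.read_csv('table.csv')
elapsed = time.perf_counter() - start
print("CSV Time used:", elapsed)

# Time reading the Excel file (.xlsx needs an engine such as openpyxl installed)
start = time.perf_counter()
df = pd.read_excel('table.xlsx')
elapsed = time.perf_counter() - start
print("EXCEL Time used:", elapsed)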
Output (times in seconds):
PKL Time used: 0.0913808
CSV Time used: 0.2128232
EXCEL Time used: 10.9964416
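pickle wins hands down: in this run it was roughly 2.3 times faster than CSV and roughly 120 times faster than Excel. The reference link has a more detailed comparison by an expert.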
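A single timed read per format can be noisy, since the first read of a file may include disk-cache effects. Below is a minimal sketch of a repeatable version of the same comparison. The helper names `best_time`, `compare`, and `run_benchmark` are hypothetical and not part of the original script; the file names `table.pkl`, `table.csv`, and `table.xlsx` are the ones used above and are assumed to exist in the working directory.

```python
import time


def best_time(func, repeats=5):
    """Call func() `repeats` times and return the fastest run in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare(readers, repeats=5):
    """Map each label in `readers` (label -> zero-argument callable) to its best time."""
    return {label: best_time(reader, repeats) for label, reader in readers.items()}


def run_benchmark(repeats=5):
    """Time the three reads from the post; assumes the table.* files exist."""
    import pandas as pd  # imported here so the helpers above stay stdlib-only

    results = compare(
        {
            "PKL": lambda: pd.read_pickle("table.pkl"),
            "CSV": lambda: pd.read_csv("table.csv"),
            "EXCEL": lambda: pd.read_excel("table.xlsx"),
        },
        repeats,
    )
    for label, seconds in results.items():
        print(f"{label} Time used: {seconds:.4f}")
    return results
```

Calling `run_benchmark()` from the directory that holds the three files prints one line per format in the same style as the output above. Reporting the minimum over several runs is one common way to reduce timing noise; the absolute numbers will still depend on the machine and on the size of the table.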