Code:
import pandas as pd
filePath = "/Users/zhangjie/Documents/zhangjie/self/data/excel_python/data/data.csv"
df = pd.read_csv(filePath, "gdb");
df.info()
Running it prints the following warning:
/Users/zhangjie/Documents/zhangjie/self/project/python/LearnExcelPython.py:5: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
df = pd.read_csv(filePath, "gdb");
<class 'pandas.core.frame.DataFrame'>
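After looking this up, the fix is to pass engine='python' (the default engine is 'c'). The warning itself gives the reason: the c engine does not support separators longer than one character (other than '\s+'), so pandas treats "gdb" as a regular expression and falls back to the python engine.
The modified code is as follows: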
import pandas as pd
filePath = "/Users/zhangjie/Documents/zhangjie/self/data/excel_python/data/data.csv"
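# Pass the separator by keyword (pandas 2.0 and later no longer accept it positionally)
# and select the python engine explicitly, because "gdb" is a multi-character separator.
df = pd.read_csv(filePath, sep="gdb", engine='python')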
df.info()
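The behavior can be checked on a small in-memory sample. This is only a sketch: the sample rows and the helper name read_and_collect_warnings are made up for illustration and are not part of the original data. It records warnings to compare the default engine with engine='python' for the same "gdb" separator.

```python
import io
import warnings

import pandas as pd

# Hypothetical sample data that uses "gdb" as the field separator.
SAMPLE = "namegdbage\nalicegdb30\nbobgdb25\n"


def read_and_collect_warnings(**kwargs):
    """Read SAMPLE with sep="gdb"; return the DataFrame and the warning categories raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is filtered out
        df = pd.read_csv(io.StringIO(SAMPLE), sep="gdb", **kwargs)
    return df, [w.category for w in caught]


# Default engine: pandas falls back to the python engine and warns about it.
df_default, warned_default = read_and_collect_warnings()

# Explicit python engine: same result, no fallback warning.
df_python, warned_python = read_and_collect_warnings(engine="python")

print(pd.errors.ParserWarning in warned_default)
print(pd.errors.ParserWarning in warned_python)
print(df_python)
```

Both calls parse the data the same way; the only difference is whether pandas has to warn about the fallback.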
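However, the two engines differ a lot in speed: when pandas reads text, engine='python' took about twice as long as engine='c'.
Prefer engine='c' over engine='python' whenever possible: on a 5 GB text file, the c engine finished quickly, while the python engine took more than an hour.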
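One possible way to keep the c engine with a file like this is to rewrite the multi-character separator as a single character first. The sketch below is an assumption, not something tested on the 5 GB file: convert_separator is a hypothetical helper, the file is assumed to be UTF-8, and it only works if "gdb" never appears inside a field and the replacement character (here the ASCII unit separator "\x1f") never appears in the data.

```python
import os
import tempfile

import pandas as pd


def convert_separator(src_path, dst_path, old_sep="gdb", new_sep="\x1f"):
    """Stream src_path to dst_path line by line, replacing old_sep with new_sep.

    Only one line is held in memory at a time, so large files are not loaded whole.
    """
    with open(src_path, "r", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.replace(old_sep, new_sep))


# Demo on a tiny temporary file with made-up rows.
with tempfile.TemporaryDirectory() as tmp:
    raw = os.path.join(tmp, "raw.csv")
    converted = os.path.join(tmp, "converted.csv")
    with open(raw, "w", encoding="utf-8") as f:
        f.write("namegdbage\nalicegdb30\nbobgdb25\n")

    convert_separator(raw, converted)

    # A single-character separator is supported by the c engine.
    df_c = pd.read_csv(converted, sep="\x1f", engine="c")

print(df_c)
```

The conversion costs one extra pass over the file and extra disk space, so whether it pays off depends on how much slower the python engine is on the actual data.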