Reading a data file with pandas fails with the following error:
File "pandas/_libs/parsers.pyx", line 876, in pandas._libs.parsers.TextReader.read
File "pandas/_libs/parsers.pyx", line 891, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas/_libs/parsers.pyx", line 945, in pandas._libs.parsers.TextReader._read_rows
File "pandas/_libs/parsers.pyx", line 932, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas/_libs/parsers.pyx", line 2112, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 31 fields in line 107943, saw 32
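As the message shows, the error occurs at line 107943 of the file being read: the parser expected 31 fields but found 32 on that line.
In other words, every other line in the data source has 31 fields, while line 107943 has 32. A DataFrame is structured (tabular) data, so a row with extra fields is not allowed.
As a quick workaround, you can tell pandas to skip the offending line: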
before:
test_df=pd.read_csv(testDataPath,encoding='utf-8',sep=';')
after:
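test_df=pd.read_csv(testDataPath,encoding='utf-8',sep=';',error_bad_lines=False)
Note: error_bad_lines was deprecated in pandas 1.3 and removed in pandas 2.0. On those versions, pass on_bad_lines='skip' instead.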
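The behaviour can be reproduced on a tiny in-memory sample. The sketch below is only an illustration: the sample text and variable names are made up, and it assumes pandas 1.3 or later, where `on_bad_lines="skip"` is available.

```python
import io

import pandas as pd

# Hypothetical sample: the header and most rows have 3 fields,
# but the third line of the text has 4.
raw = "a;b;c\n1;2;3\n4;5;6;7\n8;9;10\n"

# Default behaviour: the row with an extra field raises ParserError.
try:
    pd.read_csv(io.StringIO(raw), sep=";")
    error_message = None
except pd.errors.ParserError as exc:
    error_message = str(exc)

# Workaround (pandas >= 1.3): silently drop rows with too many fields.
df = pd.read_csv(io.StringIO(raw), sep=";", on_bad_lines="skip")

print(error_message)
print(df)
```

Keep in mind that the skipped row is dropped without any trace in the resulting DataFrame, so the row count will be lower than the number of data lines in the file.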
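That said, this only treats the symptom, not the root cause. The proper fix is to look at that specific line in the data source and correct the data itself.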
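To locate the problem rows before fixing the source, one option is to count the fields on every line with the standard `csv` module. This is a rough sketch, not part of the original workflow: `find_bad_lines` is a hypothetical helper, it treats the most common field count as the "expected" one, and it assumes the file uses standard double-quote quoting.

```python
import csv
import io
from collections import Counter


def find_bad_lines(lines, sep=";"):
    """Return (expected_count, [(line_number, field_count), ...]).

    `lines` is any iterable of text lines, e.g. an open file object.
    The expected count is taken to be the most common field count.
    """
    reader = csv.reader(lines, delimiter=sep)
    # reader.line_num is the number of source lines read so far,
    # so it points at the physical line where each record ends.
    counts = [(reader.line_num, len(row)) for row in reader if row]
    if not counts:
        return None, []
    expected = Counter(n for _, n in counts).most_common(1)[0][0]
    bad = [(line_no, n) for line_no, n in counts if n != expected]
    return expected, bad


# Hypothetical sample standing in for the real file.
sample = "a;b;c\n1;2;3\n4;5;6;7\n8;9;10\n"
expected, bad = find_bad_lines(io.StringIO(sample))
print(expected, bad)
```

For a real file, the same helper could be called as `find_bad_lines(open(testDataPath, encoding='utf-8', newline=''))`. The reported line numbers can then be opened in an editor to see whether the extra field comes from a stray separator inside a value.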