1. UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc9 in position 0: invalid continuation byte
pandas raises this error when reading a file:
data = pd.read_csv('D:/code/data/rating22.csv')
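Solution: open the file in Notepad++ and re-save it with UTF-8 encoding. pandas decodes CSV files as UTF-8 by default, so a file saved in another encoding triggers this error. The full traceback was: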
Traceback (most recent call last):
  File "C:/xin/code/gitlab/datascience-py/search-sort/read-rating.py", line 5, in <module>
    data = pd.read_csv('D:/code/data/rating22-2.csv')
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 678, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 440, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 787, in __init__
    self._make_engine(self.engine)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1014, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\parsers.py", line 1708, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 539, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas\_libs\parsers.pyx", line 767, in pandas._libs.parsers.TextReader._get_header
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc9 in position 0: invalid continuation byte
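Instead of re-saving the file in an editor, it may be enough to find out which encoding the file really uses and pass that to pandas. The sketch below is only an illustration: the helper name `detect_encoding` and the candidate list are assumptions, not part of pandas. It also assumes the file is either UTF-8 or a Chinese legacy encoding (GB18030, a superset of GBK), which is a guess based on the bytes 0xc9 and 0xd0 in the error messages.

```python
def detect_encoding(path, candidates=("utf-8", "gb18030")):
    """Return the first candidate encoding that decodes the whole file.

    Hypothetical helper: the candidate list is an assumption and should
    be adjusted to the encodings you actually expect.
    """
    for enc in candidates:
        try:
            # Read in chunks so that large files are not held in memory.
            with open(path, "r", encoding=enc, newline="") as f:
                while f.read(1 << 20):
                    pass
            return enc
        except UnicodeDecodeError:
            # This candidate cannot decode the file; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode " + path)
```

The result can then be passed to pandas through the `encoding` parameter, for example `pd.read_csv(path, encoding=detect_encoding(path))`. Note that a successful decode does not prove the guess is right, so it is worth checking the first few rows by eye.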
2. UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 0: invalid continuation byte
This is the same encoding problem: pandas cannot read the file.
data = pd.read_csv('D:/code/data/original-data/item-sort/11-8-11-7-new-rule2-orl.csv', header=None)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 0: invalid continuation byte
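Solution: convert the file to UTF-8.
The file is too large for Notepad++, which I normally use, so I opened it in Sublime Text and re-saved it with UTF-8 encoding.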
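The Sublime Text build downloaded from the official site does not support Chinese by default; for how to switch the encoding, see
https://blog.csdn.net/qq_22260641/article/details/70666960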
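When a file is too large for an editor, the conversion to UTF-8 can also be done with a short script that streams the file in chunks. This is a minimal sketch under assumptions: the function name `convert_to_utf8` is hypothetical, and the default source encoding `gb18030` is a guess that has to match the file's real encoding.

```python
import shutil


def convert_to_utf8(src, dst, src_encoding="gb18030"):
    """Copy src to dst, re-encoding the text as UTF-8.

    src_encoding is an assumption; a wrong value raises
    UnicodeDecodeError or produces garbled text.
    """
    # newline="" keeps the original line endings unchanged.
    with open(src, "r", encoding=src_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        # Copy about 1M characters at a time instead of loading the whole file.
        shutil.copyfileobj(fin, fout, 1 << 20)
```

Writing to a separate output file keeps the original intact in case the guessed source encoding turns out to be wrong.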
3. Changing the encoding with IDEA
You can also copy the file into an IDEA project and change its encoding there.