Problem:
for line in reader:
UnicodeDecodeError: 'gbk' codec can't decode bytes in position 461-462: illegal multibyte sequence
1. I changed the open() call:
fObj = open('a.csv', 'rU', encoding='UTF-8')
This raised a different error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb1 in position 461: invalid start byte
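Both codecs fail at the same offset, so it can help to test the raw bytes against several candidate encodings before guessing again. The following is only a sketch: the helper name try_decodings and the candidate list are assumptions for illustration, and the sample bytes stand in for the contents of a.csv read in 'rb' mode.

```python
def try_decodings(data, candidates=("utf-8", "gbk", "gb18030", "cp1252")):
    """Map each candidate encoding to None (decoded) or a short error string."""
    results = {}
    for name in candidates:
        try:
            data.decode(name)
            results[name] = None
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first byte that failed to decode
            results[name] = f"{exc.reason} at byte {exc.start}"
    return results


# Sample bytes for illustration; for a real file use open(path, 'rb').read()
sample = "汉字".encode("gbk")
for name, error in try_decodings(sample).items():
    print(name, "->", "ok" if error is None else error)
```

A clean decode does not prove the encoding is the right one, since single-byte codecs such as cp1252 accept almost any byte sequence. The decoded text still has to be checked by eye.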
2. Check the system locale:
>>> import locale
>>> locale.getdefaultlocale()
('zh_CN', 'cp936')
cp936: Windows code page 936, i.e. GBK (a superset of GB2312). This is why open() used the 'gbk' codec when no encoding was given.
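The link between the locale and the first error can be checked from Python itself. This is a small sketch, assuming Python 3 running without UTF-8 mode; the variable name default_encoding is only for illustration.

```python
import codecs
import locale

# Encoding that open() uses in text mode when encoding= is omitted
default_encoding = locale.getpreferredencoding(False)
print("open() default:", default_encoding)

# Python registers cp936 as an alias of its gbk codec
print(codecs.lookup("cp936").name)  # gbk
```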
3. Opened the file, saved it again as UTF-8, and reran the program. Result: OK.
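The re-save step can also be scripted instead of done in an editor. This is a minimal sketch, assuming the file's real encoding is already known; the function name transcode_to_utf8 and the file names in the usage comment are hypothetical.

```python
def transcode_to_utf8(src_path, dst_path, src_encoding):
    """Rewrite src_path as UTF-8 at dst_path, decoding it as src_encoding."""
    # newline="" keeps the original line endings untouched on both sides
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


# Hypothetical usage:
# transcode_to_utf8("a.csv", "a_utf8.csv", "gb18030")
```

After the conversion, the new file can be opened with encoding='utf-8' and passed to csv.reader as before.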
#Chinese (Simplified, China)
#sys.getdefaultencoding(): gbk
#sys.getfilesystemencoding(): mbcs
#locale.getdefaultlocale(): ('zh_CN', 'cp936')
#locale.getpreferredencoding(): cp936
#'\xba\xba'.decode('mbcs'): u'\u6c49'
#English (United States)
#sys.getdefaultencoding(): UTF-8
#sys.getfilesystemencoding(): mbcs
#locale.getdefaultlocale(): ('zh_CN', 'cp1252')
#locale.getpreferredencoding(): cp1252
#'\xba\xba'.decode('mbcs'): u'\xba\xba'
#German (Germany)
#sys.getdefaultencoding(): gbk
#sys.getfilesystemencoding(): mbcs
#locale.getdefaultlocale(): ('zh_CN', 'cp1252')
#locale.getpreferredencoding(): cp1252
#'\xba\xba'.decode('mbcs'): u'\xba\xba'
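The values above appear to come from Python 2 (str.decode and u'' literals). A rough Python 3 equivalent is sketched below. The helper name encoding_report is an assumption, and the byte pair is decoded with explicit codecs because 'mbcs' exists only on Windows and follows the current ANSI code page.

```python
import locale
import sys


def encoding_report():
    """Collect the interpreter's encoding settings as a dict."""
    return {
        "sys.getdefaultencoding()": sys.getdefaultencoding(),
        "sys.getfilesystemencoding()": sys.getfilesystemencoding(),
        "locale.getpreferredencoding(False)": locale.getpreferredencoding(False),
    }


for name, value in encoding_report().items():
    print(f"{name}: {value}")

# The same two bytes read under the two code pages listed above
print(repr(b"\xba\xba".decode("gbk")))     # '汉'
print(repr(b"\xba\xba".decode("cp1252")))  # 'ºº'
```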