I am reading a CSV file that is UTF-8 encoded:
    ifile = open(fname, "r")
    for row in csv.reader(ifile):
        name = row[0]
        print repr(row[0])
This works fine and prints what I expect it to print, a UTF-8 encoded str:
> '\xc3\x81lvaro Salazar'
> '\xc3\x89lodie Yung'
...
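Furthermore, when I simply print the str (rather than its repr()), the output displays fine (which I don't understand either; shouldn't this cause an error?):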
> Álvaro Salazar
> Élodie Yung
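For reference, the two outputs above are the same data seen two ways. In Python 2, printing a str writes its bytes unchanged, so a terminal that expects UTF-8 renders them as the accented characters. A minimal sketch of the bytes/text distinction, using a hard-coded byte string as a stand-in for row[0] (an assumption, since the actual CSV file is not shown) and written so that it runs under both Python 2.7 and Python 3:

```python
# Hypothetical stand-in for row[0]; the real CSV contents are not shown.
raw = b'\xc3\x81lvaro Salazar'

# Decoding turns the UTF-8 bytes into text (unicode in Python 2, str in Python 3).
text = raw.decode('utf-8')

# 'Á' occupies two bytes in UTF-8 but is a single character once decoded,
# so the byte string is one unit longer than the decoded text.
print(len(raw))
print(len(text))
```

The decoding step itself succeeds here, which suggests that the bytes read from the file are valid UTF-8.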
But when I try to convert my UTF-8 encoded strs to unicode:
    ifile = open(fname, "r")
    for row in csv.reader(ifile):
        name = row[0]
        print unicode(name, 'utf-8')  # or name.decode('utf-8')
I get the infamous:
Traceback (most recent call last):
  File "scripts/script.py", line 33, in <module>
    print unicode(fullname, 'utf-8')
UnicodeEncodeError: 'ascii' codec can't encode chara
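
Note that the exception is a UnicodeEncodeError, not a decode error. One plausible reading, offered as an assumption because the post does not show the environment, is that decoding succeeded and the failure happens when print encodes the resulting unicode object for a standard output whose encoding is ASCII (for example when output is piped or the locale is C). The sketch below reproduces that mechanism without touching stdout; `encode_for_output` is a hypothetical helper introduced only for illustration, and the byte string is a stand-in for the CSV field.

```python
from __future__ import print_function  # keeps print() identical on Python 2.7 and 3

# Hypothetical stand-in for the decoded CSV field.
text = b'\xc3\x81lvaro Salazar'.decode('utf-8')


def encode_for_output(value, encoding='ascii', errors='strict'):
    """Hypothetical helper: encode text as a stream with this encoding would need it."""
    return value.encode(encoding, errors)


try:
    # ASCII cannot represent 'Á', so this raises UnicodeEncodeError.
    encode_for_output(text)
except UnicodeEncodeError as exc:
    print('encoding to ASCII failed:', exc.reason)

# Encoding explicitly to UTF-8 succeeds and gives back the original bytes.
utf8_bytes = encode_for_output(text, 'utf-8')
```

If this reading applies, encoding explicitly before printing (for example `print name.decode('utf-8').encode('utf-8')` in Python 2) or setting the `PYTHONIOENCODING` environment variable to `utf-8` are two common ways to avoid the implicit ASCII step.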