I have the following lines in my code:
outs = codecs.getwriter('utf-8')(sys.stdout)
# dJSON contains JSON message with non-ASCII chars
outs.write(json.dumps(dJSON,encoding='utf-8', ensure_ascii=False, indent=indent_val))
I am getting the following exception:
outs.write(json.dumps(dJSON,encoding='utf-8', ensure_ascii=False, indent=indent_val))
File "/usr/lib/python2.7/json/__init__.py", line 238, in dumps
**kw).encode(obj)
File "/usr/lib/python2.7/json/encoder.py", line 204, in encode
return ''.join(chunks)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 27: ordinal not in range(128)
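I thought that by specifying encoding='utf-8' in the json.dumps call, I would avoid this type of problem. Why am I still getting the error?
Solution
My guess is that the dJSON object does not contain pure unicode; it contains a mix of unicode and strings already encoded as UTF-8. When the encoder joins its output chunks, Python 2 implicitly decodes the byte strings as ASCII, which fails on any non-ASCII byte. For example, this fails: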
>>> d = {u'name':u'पाइथन'.encode('utf-8')}
>>> json.dumps(d, encoding='utf-8', ensure_ascii=False)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 238, in dumps
**kw).encode(obj)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 204, in encode
return ''.join(chunks)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 1: ordinal not in range(128)
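The traceback points at ''.join(chunks), where Python 2 has to combine unicode and byte-string chunks and does so by decoding the bytes as ASCII. A minimal sketch that makes that same decode explicit, so it behaves the same on Python 2 and Python 3 (the variable names are only illustrative):

```python
# UTF-8 bytes of the first character of the value above;
# 0xe0 is the byte named in the traceback.
utf8_bytes = b'\xe0\xa4\xaa'

try:
    # Python 2 performs this decode implicitly when joining str with unicode.
    utf8_bytes.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(repr(error))
```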
But this works (everything unicode)
>>> d = {u'name':u'पाइथन'}
>>> json.dumps(d, encoding='utf-8', ensure_ascii=False)
u'{"name": "\u092a\u093e\u0907\u0925\u0928"}'
Though this also works (everything string)
>>> d = {'name':u'पाइथन'.encode('utf-8')}
>>> json.dumps(d, encoding='utf-8', ensure_ascii=False)
'{"name": "\xe0\xa4\xaa\xe0\xa4\xbe\xe0\xa4\x87\xe0\xa4\xa5\xe0\xa4\xa8"}'
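If the mix cannot be avoided at the source, one possible workaround is to decode every byte string before calling json.dumps. The sketch below assumes all byte strings in the structure are UTF-8; to_unicode is a hypothetical helper written for this example, not part of the json module. It is written to run on both Python 2.7 and Python 3.

```python
import json

def to_unicode(obj, encoding='utf-8'):
    """Recursively decode byte strings so the structure holds only text."""
    if isinstance(obj, bytes):  # `bytes` is an alias of `str` on Python 2
        return obj.decode(encoding)
    if isinstance(obj, dict):
        return dict((to_unicode(k, encoding), to_unicode(v, encoding))
                    for k, v in obj.items())
    if isinstance(obj, (list, tuple)):
        return [to_unicode(item, encoding) for item in obj]
    return obj  # numbers, booleans, None and text pass through unchanged

# Same mixed dictionary as the failing example: unicode key, UTF-8 encoded value.
mixed = {u'name': u'\u092a\u093e\u0907\u0925\u0928'.encode('utf-8')}
clean = to_unicode(mixed)

# With only unicode inside, ensure_ascii=False no longer triggers the ASCII decode.
text = json.dumps(clean, ensure_ascii=False)
```

The resulting text can then be passed to outs.write as in the question, since the codecs writer encodes unicode to UTF-8 on output.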
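Summary: when using json.dumps to write a JSON message containing non-ASCII characters to stdout, the 'ascii' codec fails to decode the data, even though 'utf-8' encoding was specified. The likely cause is that the dJSON object mixes unicode with strings already encoded as UTF-8. Using only unicode objects, or only UTF-8 encoded strings, avoids the error.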
