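Edit:
The print below shows the value I expect.
(sys.stdout.encoding and sys.stdin.encoding are both "UTF-8".)
Why is the variable's value different from the printed value? I need to get the raw value into a variable.
>>username = 'Jo\xc3\xa3o'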
>>username.decode('utf-8').encode('latin-1')
'Jo\xe3o'
>>print username.decode('utf-8').encode('latin-1')
João
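The difference in the session above is between the escaped form the prompt echoes (the repr of the string) and the characters that print writes to the terminal. A minimal sketch of that, assuming the same bytes as above; `inspect_utf8` is a hypothetical helper name, and the byte literal is written with a `b` prefix so the snippet also runs on Python 3:

```python
import unicodedata

def inspect_utf8(raw):
    """Decode UTF-8 bytes and report what the escaped repr hides."""
    text = raw.decode('utf-8')  # byte string -> text string
    return {
        'byte_count': len(raw),    # 'ã' occupies two bytes in UTF-8
        'char_count': len(text),   # but it is a single character
        'char_names': [unicodedata.name(ch) for ch in text],
        'latin1_bytes': text.encode('latin-1'),  # one byte per character
    }

info = inspect_utf8(b'Jo\xc3\xa3o')
print(info['char_names'][2])
```

The byte `\xe3` in the Latin-1 result and the two bytes `\xc3\xa3` in the UTF-8 input both stand for the same character; only the encoding differs.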
Original question:
I'm having trouble querying a DB and decoding the value in Python.
I confirmed my database character set using select property_value from database_properties where property_name='NLS_CHARACTERSET';
'''AL32UTF8 stores characters beyond U+FFFF as four bytes (exactly as Unicode defines
UTF-8). Oracle’s “UTF8” stores these characters as a sequence of two UTF-16 surrogate
characters encoded using UTF-8 (or six bytes per character)'''
os.environ["NLS_LANG"] = ".AL32UTF8"
....
conn_data = str('%s/%s@%s') % (db_usr, db_pwd, db_sid)
sql = "select user_name from apex.users where user_id = '%s'" % userid
...
cursor.execute(sql)
ldap_username = cursor.fetchone()
...
Where print ldap_username gives:
>>'Jo\xc3\xa3o'
I've tried both of these (same result):
ldap_username.decode('utf-8')
>>u'Jo\xe3o'
unicode(ldap_username, 'utf-8')
>>u'Jo\xe3o'
Where u'João'.encode('utf-8')
>>'Jo\xc3\xa3o'
How can I get the query result back to the proper "João"?
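For reference, a rough sketch of pulling the name out of a fetched row and decoding it. It assumes the cursor follows the Python DB-API 2.0 (the driver is not named above), under which fetchone() returns a row sequence or None rather than a bare string. `first_column_as_text` is a hypothetical helper, and the tuple is only a stand-in for a real fetched row:

```python
def first_column_as_text(row, encoding='utf-8'):
    """Return the first column of a fetched row as text, or None if no row."""
    if row is None:      # DB-API fetchone() returns None when nothing matches
        return None
    value = row[0]       # a row is a sequence, one item per selected column
    if isinstance(value, bytes):
        value = value.decode(encoding)  # byte string -> text string
    return value

row = (b'Jo\xc3\xa3o',)  # stand-in for cursor.fetchone()
name = first_column_as_text(row)
```

The decoded value compares equal to u'Jo\xe3o', which is the four-character text "João"; the `\xe3` escape only appears when the string is shown by its repr.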