This is a common problem, so here is a relatively thorough illustration.
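For non-unicode strings (i.e. those without the u prefix seen in u'\xc4pple'), you must decode from the native encoding (iso8859-1/latin1, unless modified with the enigmatic ^{} function) to unicode, then encode to a character set that can display the characters you want; in this case I'd recommend ^{}.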
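The decode-then-encode round trip described above can be sketched as follows. This is only a minimal illustration, not the author's code: the helper name `latin1_to_utf8` is hypothetical, and UTF-8 as the target encoding is an assumption made for the example. Byte literals (`b"..."`) are used so the same lines behave identically on Python 2.7 and Python 3.

```python
def latin1_to_utf8(raw_bytes):
    """Hypothetical helper: re-encode iso-8859-1 bytes as UTF-8 bytes."""
    # Step 1: decode the bytes into a unicode object, naming the
    # encoding the bytes are actually in.
    text = raw_bytes.decode("iso-8859-1")
    # Step 2: encode the unicode object into the target encoding
    # (UTF-8 is assumed here).
    return text.encode("utf-8")


latin1_bytes = b"\xc4pple"                 # one byte, 0xC4, then "pple"
utf8_bytes = latin1_to_utf8(latin1_bytes)  # U+00C4 becomes the two bytes C3 84
```

The intermediate unicode object is the important part: it holds the character itself rather than any particular byte encoding, which is why it can then be encoded to whichever character set is needed.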
First, here is a handy utility function that helps illustrate the patterns of Python 2.7 strings and unicode:
>>> def tell_me_about(s): return (type(s), s)
A plain string
>>> v = "\xC4pple" # iso-8859-1 aka latin1 encoded string
>>> tell_me_about(v)
(<type 'str'>, '\xc4pple')
>>> v
'\xc4pple' # representation in memory
>>> print v
?pple # the raw byte '\xc4' is written to the terminal unchanged
# note that '\xc4' is "Ä" in iso-8859-1, so a "?" here suggests the
# terminal is not using iso-8859-1 and cannot render the lone byte.
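To check how the single byte 0xC4 is interpreted under different encodings, a small sketch like the one below can help. It is an illustration only, and the variable names are made up for this example. It assumes the comparison of interest is iso-8859-1 versus UTF-8, and it runs on both Python 2.7 and Python 3.

```python
raw = b"\xc4pple"

# Under iso-8859-1 every byte maps to exactly one character,
# so decoding always succeeds; 0xC4 maps to U+00C4.
as_latin1 = raw.decode("iso-8859-1")

# Under UTF-8, 0xC4 announces a two-byte sequence, but the next
# byte ("p") is not a continuation byte, so strict decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# With the "replace" error handler the bad byte becomes U+FFFD,
# the Unicode replacement character.
lenient = raw.decode("utf-8", "replace")
```

What a terminal actually shows for such a byte depends on the terminal's own encoding, so the printed result can differ between systems.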
Decoding an iso8859-1 string - convert a plain string to unicode
>>> uv = v.decode("iso-8859-1")
>>> uv
u'\xc4pple&#