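Python represents strings internally as Unicode, so conversions between encodings normally use Unicode as the intermediate form.
First decode() the string from its original encoding into Unicode, then encode() the Unicode string into the target encoding.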
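The decode-then-encode round trip can be sketched as follows. This is a minimal illustration for Python 3; the helper name transcode and the choice of GBK as the source encoding are assumptions made for the example, not part of the original notes.

```python
def transcode(data: bytes, source_encoding: str, target_encoding: str) -> bytes:
    """Convert bytes from one encoding to another via a Unicode str."""
    text = data.decode(source_encoding)   # bytes -> Unicode str
    return text.encode(target_encoding)   # Unicode str -> bytes


# Hypothetical input: the two characters "中文" stored as GBK bytes.
gbk_bytes = "中文".encode("gbk")
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")

print(gbk_bytes)    # b'\xd6\xd0\xce\xc4'
print(utf8_bytes)   # b'\xe4\xb8\xad\xe6\x96\x87'
```

The same text has a different byte sequence in each encoding, which is why the source encoding has to be known before decoding.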
encode(...)
S.encode(encoding='utf-8', errors='strict') -> bytes
Encode S using the codec registered for encoding. Default encoding
is 'utf-8'. errors may be given to set a different error
handling scheme. Default is 'strict' meaning that encoding errors raise
a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
'xmlcharrefreplace' as well as any other name registered with
codecs.register_error that can handle UnicodeEncodeErrors.
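To make the errors parameter concrete, here is a small sketch for Python 3 that encodes a mixed Chinese/ASCII string to ASCII with each handler named in the help text. The sample string and the variable names are assumptions chosen for illustration.

```python
text = "中文 abc"  # sample string; the Chinese characters cannot be encoded as ASCII

# Each non-strict handler deals with the unencodable characters differently.
results = {}
for handler in ("ignore", "replace", "xmlcharrefreplace"):
    results[handler] = text.encode("ascii", errors=handler)

print(results["ignore"])             # b' abc'
print(results["replace"])            # b'?? abc'
print(results["xmlcharrefreplace"])  # b'&#20013;&#25991; abc'

# With the default 'strict' handler the same call raises UnicodeEncodeError.
try:
    text.encode("ascii")
    strict_raised = False
except UnicodeEncodeError:
    strict_raised = True

print(strict_raised)  # True
```

'ignore' and 'replace' lose information, while 'xmlcharrefreplace' keeps the code points in a form that HTML or XML consumers can restore.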
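If the default encoding is not utf8, printing Chinese text can fail. This affects Python 2, where the default encoding is ASCII.
In Python 2, add the following at the top of the file:
import sys
reload(sys)  # Python 2 only: site.py removes setdefaultencoding at startup; reload() restores it
sys.setdefaultencoding('utf8')
That is all that is needed. Python 3 has no sys.setdefaultencoding, and its default encoding is always utf-8, so this step does not apply there.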
To check the default encoding that Python uses:
>>> import sys
>>> print(sys.getdefaultencoding())
utf-8
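sys.getdefaultencoding() reports Python's own default for str/bytes conversion, which is not necessarily the encoding the operating system or terminal uses. The sketch below, written for Python 3, collects the three values side by side. The function name describe_encodings is a hypothetical helper introduced only for this example.

```python
import locale
import sys


def describe_encodings() -> dict:
    """Collect the encodings Python reports for itself, the locale and stdout."""
    return {
        # Default used by str.encode() / bytes.decode(); always 'utf-8' in Python 3.
        "python_default": sys.getdefaultencoding(),
        # Encoding suggested by the OS locale settings; varies by system.
        "locale_preferred": locale.getpreferredencoding(False),
        # Encoding of the standard output stream; may be None if stdout was replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


info = describe_encodings()
for name, value in info.items():
    print(name, value)
```

If Chinese text fails to print, the stdout value is usually the one to inspect, since it can differ from Python's default.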