Python Encoding Conversion

weixin_39596835


In Python 2, a string is either a byte string (str) or a Unicode string (unicode).

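A byte-string literal holds its bytes exactly as they appear in the source, so its encoding is the encoding of the source file itself (or of the console, when it is typed at the interactive prompt).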

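To convert between encodings, use Unicode as the intermediate form: first decode the byte string from its current encoding into a Unicode string, then encode that Unicode string into the target encoding.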

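Decode with the same encoding that was used to encode; a mismatched codec either raises an error or produces garbled text.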

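The default encoding can also differ from one environment to another. In the Windows command prompt (DOS) session used below, sys.getdefaultencoding() reports ascii, while the console itself encodes typed text as GBK.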
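The decode-then-encode route described above can be sketched as follows. This is a minimal illustration rather than the author's code: the byte values are the GBK and UTF-8 forms of 中文 that appear later in this post, written as escapes so the snippet does not depend on the source file's encoding, and the variable names are made up for the example.

```python
# GBK bytes for the two characters, written as escapes (assumed input).
gbk_bytes = b'\xd6\xd0\xce\xc4'

# Step 1: decode from the source encoding into a Unicode string.
text = gbk_bytes.decode('gbk')

# Step 2: encode the Unicode string into the target encoding.
utf8_bytes = text.encode('utf-8')
```

The reverse direction works the same way: decode the UTF-8 bytes with utf-8, then encode the result with gbk.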

In the DOS environment:

1. Get the system default encoding:

>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>>
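sys.getdefaultencoding() only reports the codec Python uses for implicit conversions; it says nothing about how the console encodes what you type. The sketch below, which uses only the standard library, collects both values side by side. The results depend on the interpreter version and the environment, so no output is shown.

```python
import locale
import sys

# Codec used for implicit conversions ('ascii' in the session above).
default_encoding = sys.getdefaultencoding()

# Encoding the terminal reports for standard output; this can be None
# when output is redirected to a file or pipe.
console_encoding = getattr(sys.stdout, 'encoding', None)

# Encoding the operating system locale prefers for text data.
locale_encoding = locale.getpreferredencoding()

print('default=%s console=%s locale=%s'
      % (default_encoding, console_encoding, locale_encoding))
```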

Byte string:

>>> s="abc"
>>> type(s)
<type 'str'>

Unicode string:

>>> s=u"中文"
>>> type(s)
<type 'unicode'>
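One practical way to see the difference between the two types is to compare lengths: a byte string counts bytes, while a Unicode string counts characters. A small sketch, using escaped literals so that it does not depend on the console encoding (the variable names are illustrative):

```python
gbk_bytes = b'\xd6\xd0\xce\xc4'   # GBK bytes of two Chinese characters
text = u'\u4e2d\u6587'            # the same two characters as a Unicode string

byte_count = len(gbk_bytes)       # length in bytes
char_count = len(text)            # length in characters
```

Here the byte string has length 4 and the Unicode string has length 2, because GBK stores each of these characters in two bytes.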

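2. Converting English (ASCII) strings: an ASCII-only string can be decoded or encoded with any ASCII-compatible codec. Note that "unicode" is not a codec name, so it raises LookupError.

>>> s="abc" # ASCII-only byte string: any ASCII-compatible codec can decode it
>>> s.decode()
u'abc'
>>> s.decode("gbk")
u'abc'
>>> s.decode("ascii")
u'abc'
>>> s.decode("utf-8")
u'abc'
>>> s.decode("gb2312")
u'abc'
>>> s.decode("unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: unicode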

>>> s="abc" # encode also works: Python 2 first decodes the byte string as ASCII implicitly
>>> s.encode()
'abc'
>>> s.encode("gbk")
'abc'
>>> s.encode("ascii")
'abc'
>>> s.encode("utf-8")
'abc'
>>> s.encode("gb2312")
'abc'
>>> s.encode("unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: unicode

>>>

>>> s=u"abc" # ASCII-only Unicode string: decode works because Python 2 first encodes it as ASCII implicitly
>>> s.decode()
u'abc'
>>> s.decode("gbk")
u'abc'
>>> s.decode("ascii")
u'abc'
>>> s.decode("utf-8")
u'abc'
>>> s.decode("gb2312")
u'abc'
>>> s.decode("unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: unicode

>>> s=u"abc" # ASCII-only Unicode string: any ASCII-compatible codec can encode it
>>> s.encode()
'abc'
>>> s.encode("gbk")
'abc'
>>> s.encode("ascii")
'abc'
>>> s.encode("utf-8")
'abc'
>>> s.encode("gb2312")
'abc'
>>> s.encode("unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: unicode

>>>
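The sessions above work because ascii, gbk, gb2312 and utf-8 all map the ASCII range to the same single bytes. The sketch below checks that, and adds two cases that are not in the original sessions: utf-16, which is not ASCII-compatible, and a codecs.lookup call that confirms "unicode" is not a registered codec name. The variable names are illustrative.

```python
import codecs

ascii_text = u'abc'
compatible = ['ascii', 'gbk', 'gb2312', 'utf-8']

# Every ASCII-compatible codec produces the same three bytes.
encoded = [ascii_text.encode(name) for name in compatible]
all_same = all(item == b'abc' for item in encoded)

# utf-16 uses two bytes per character plus a byte-order mark, so it differs.
utf16_differs = ascii_text.encode('utf-16') != b'abc'

# "unicode" is not a codec name, hence the LookupError seen above.
try:
    codecs.lookup('unicode')
    unicode_is_codec = True
except LookupError:
    unicode_is_codec = False
```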

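3. Encoding and decoding Chinese text:

(1) The DOS console encoding is GBK, so a byte string typed there holds GBK bytes and can only be decoded as gbk or gb2312;

(2) A Unicode string containing Chinese can only be encoded, not decoded.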

>>> s="中文" # the DOS console encoding is GBK, so this byte string can only be decoded as gbk/gb2312
>>> s.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd6 in position 0: ordinal not in range(128)
>>> s.decode("gbk")
u'\u4e2d\u6587'
>>> s.decode("ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd6 in position 0: ordinal not in range(128)
>>> s.decode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd6 in position 0: invalid continuation byte
>>> s.decode("gb2312")
u'\u4e2d\u6587'
>>> s.decode("unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: unicode

>>>
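The failures above are what a mismatched codec looks like in practice. When the source encoding is not known for certain, one common approach is to try a short list of candidates in order. The helper below is a hypothetical sketch (decode_with_fallback and its candidate list are not from the original post). Keep in mind that a wrong codec does not always raise, so a successful decode is not proof that the guess was right.

```python
def decode_with_fallback(data, candidates=('utf-8', 'gbk')):
    """Return (text, codec_name) for the first candidate that decodes data."""
    for name in candidates:
        try:
            return data.decode(name), name
        except UnicodeDecodeError:
            # This codec rejected the bytes; try the next candidate.
            continue
    raise ValueError('none of the candidate encodings could decode the data')


# The GBK bytes from the session above: utf-8 rejects them, gbk accepts them.
gbk_bytes = b'\xd6\xd0\xce\xc4'
text, used = decode_with_fallback(gbk_bytes)
```

Putting the stricter codec (utf-8) first is a deliberate choice in this sketch: it rejects most non-UTF-8 input, whereas a more permissive codec tried first could silently accept the wrong bytes.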

>>> s="中文" # the console encoding is GBK, so decode as gbk/gb2312 first and then encode; calling encode directly triggers an implicit ASCII decode, which fails
>>> s.encode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd6 in position 0: ordinal not in range(128)
>>> s.encode("gbk")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd6 in position 0: ordinal not in range(128)
>>> s.encode("ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd6 in position 0: ordinal not in range(128)
>>> s.encode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd6 in position 0: ordinal not in range(128)
>>> s.encode("gb2312")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd6 in position 0: ordinal not in range(128)
>>> s.encode("unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: unicode

>>>
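Each encode call above that names a real codec fails with a UnicodeDecodeError rather than an encode error, because Python 2 first tries to turn the byte string into Unicode with the ascii codec. Making that decode step explicit avoids the problem. The function below is an illustrative helper (its name and parameters are assumptions, not part of the original post), written so that it accepts either a byte string or a Unicode string:

```python
def to_encoding(data, source_encoding, target_encoding):
    """Return data as bytes in target_encoding.

    Byte strings are decoded explicitly with source_encoding first;
    Unicode strings are encoded directly.
    """
    if isinstance(data, bytes):
        data = data.decode(source_encoding)
    return data.encode(target_encoding)


# GBK bytes and the equivalent Unicode string both end up as UTF-8 bytes.
from_bytes = to_encoding(b'\xd6\xd0\xce\xc4', 'gbk', 'utf-8')
from_text = to_encoding(u'\u4e2d\u6587', 'gbk', 'utf-8')
```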

>>> s=u"中文" # a Unicode string containing Chinese can only be encoded; decode triggers an implicit ASCII encode, which fails
>>> s.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> s.decode("gbk")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> s.decode("ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> s.decode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> s.decode("gb2312")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> s.decode("unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: unicode

>>>

>>> s=u"中文" # a Unicode string containing Chinese can be encoded with any codec that covers Chinese
>>> s.encode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> s.encode("gbk")
'\xd6\xd0\xce\xc4'
>>> s.encode("ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
>>> s.encode("utf-8")
'\xe4\xb8\xad\xe6\x96\x87'
>>> s.encode("gb2312")
'\xd6\xd0\xce\xc4'
>>> s.encode("unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: unicode

>>>
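When the target codec cannot represent a character, as with ascii above, encode accepts an optional second argument naming an error handler. This goes slightly beyond the original sessions; the handlers shown are standard ones, and which one is appropriate depends on whether losing data is acceptable. The variable names are illustrative.

```python
text = u'\u4e2d\u6587'

# Replace each unencodable character with '?'.
replaced = text.encode('ascii', 'replace')

# Drop unencodable characters entirely.
ignored = text.encode('ascii', 'ignore')

# Keep the information as Python-style \uXXXX escapes.
escaped = text.encode('ascii', 'backslashreplace')

# Keep the information as XML numeric character references.
as_xml = text.encode('ascii', 'xmlcharrefreplace')
```

Only the last two handlers keep enough information to recover the original characters; replace and ignore are lossy.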
