Python strings outside the 128 range


CVRunner


Hi,

Could anyone explain to me how the Python string "é" is mapped to the binary code "\xe9" in my Python interpreter? "é" is not present in the 7-bit ASCII table that is the default encoding, right? So is the mapping "é" -> "\xe9" portable, or is it (site-)configuration dependent? Can anyone get something different from "é" when 'print "\xe9"' is executed? If the process is config-dependent, what kind of config info is used?

Regards,

SB

Solution

Sébastien Boisgérault wrote:

> Hi,
>
> Could anyone explain to me how the Python string "é" is mapped to
> the binary code "\xe9" in my Python interpreter? "é" is not present
> in the 7-bit ASCII table that is the default encoding, right? So is
> the mapping "é" -> "\xe9" portable, or is it (site-)configuration
> dependent? Can anyone get something different from "é" when
> 'print "\xe9"' is executed? If the process is config-dependent,
> what kind of config info is used?

The default encoding has nothing to do with this. "\xe9" is just a byte. You can write it into a file (which is basically what the terminal is), and no default encoding whatsoever is in the mix.

The default encoding comes into play when you write unicode(!) strings to a file. Then the unicode string is converted to a byte string using the default encoding, which will fail miserably (with a UnicodeEncodeError) if the default encoding is ascii (as it is supposed to be) and your unicode string contains any "funny" (non-ASCII) characters.

But even if you encode the unicode string explicitly with an encoding like latin1 or utf-8, the resulting byte strings will just be written to the file. And it is a totally different question (and actually not controllable by you/Python) whether the terminal will interpret the bytes correctly or not.

Diez
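A minimal sketch of the distinction Diez describes, between a byte string and a unicode string that has to be encoded before it can be written anywhere. The thread uses Python 2; this sketch assumes Python 3, where the unicode string type is `str` and the byte string type is `bytes`. The variable names are illustrative only.

```python
# Assumption: Python 3, where str is the unicode string type
# and bytes is the byte string type.
text = "\u00e9"  # the single character é as a unicode string

# Encoding explicitly gives different byte strings for the same character.
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")

# Encoding with ascii fails, because é is outside the 7-bit range.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(latin1_bytes, utf8_bytes, ascii_ok)
# prints: b'\xe9' b'\xc3\xa9' False
```

Whichever byte string is written, what appears on screen is still up to the terminal that receives those bytes.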

Sébastien Boisgérault wrote:

> Could anyone explain to me how the Python string "é" is mapped to
> the binary code "\xe9" in my Python interpreter?

in the iso-8859-1 character set, the character é is represented by the code 0xE9 (233 in decimal). there's no mapping going on here; there's only one character in the string. how it appears on your screen depends on how you print it, and what encoding your terminal is using.

>>> s = "é"
>>> len(s)
1
>>> ord(s)
233
>>> hex(ord(s))
'0xe9'
>>> s
'\xe9'
>>> print repr(s)
'\xe9'
>>> print s
é
>>> print chr(233)
é
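To illustrate the last point, that what you see depends on the encoding used to interpret the byte, here is a small sketch that decodes the same byte 0xE9 under a few encodings. It assumes Python 3 (the session above is Python 2), and the choice of cp437 as the contrasting encoding is only an example.

```python
# Assumption: Python 3. The same single byte, interpreted three ways.
raw = b"\xe9"

as_latin1 = raw.decode("latin-1")  # é (U+00E9) in iso-8859-1
as_cp437 = raw.decode("cp437")     # Θ (U+0398) in the old DOS code page

# On its own, 0xE9 is not valid UTF-8: it starts a multi-byte sequence
# that is never completed.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(hex(ord(as_latin1)), hex(ord(as_cp437)), utf8_ok)
# prints: 0xe9 0x398 False
```

So a terminal set to something other than iso-8859-1 can indeed show something different from "é" for the byte "\xe9".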

Fredrik Lundh wrote:

> in the iso-8859-1 character set, the character é is represented by
> the code 0xE9 (233 in decimal). there's no mapping going on here;
> there's only one character in the string. how it appears on your
> screen depends on how you print it, and what encoding your terminal
> is using.

Crystal clear. Thanks!

SB
