Valid octal in Python: the correct way to octal-escape UTF-8 characters in Python

I need to get the octal escape sequence for UTF-8 characters in Python and was wondering whether there's any simpler way of doing what I want to do, e.g. something in the standard library that I overlooked. I have a makeshift string manipulation function but I'm hoping there is a better solution.

I want to get from (e.g.): 𐅥

To: \360\220\205\245

Right now I'm doing this:

char = '\U00010165'  # this is how Python hands it over to me
char = str(char.encode())
# char = "b'\xf0\x90\x85\xa5'"
arr = char[4:-1].split("\\x")
# arr = ['f0', '90', '85', 'a5']
char = ''
for i in arr:
    char += '\\' + str(oct(int(i, 16)))
# char = \0o360\0o220\0o205\0o245
char = char.replace("0o", "")

Any suggestions?

Solution

Use format(i, '03o') to format an integer as a zero-padded, three-digit octal number without the leading 0o prefix, or use str.format() to include the literal backslash as well:

>>> format(16, '03o')
'020'
>>> '\\{:03o}'.format(16)
'\\020'

Then loop over the encoded bytes value; in Python 3, iterating over a bytes object yields each byte as an integer:

char = ''.join(['\\{:03o}'.format(c) for c in char.encode('utf8')])

Demo:

>>> char = '\U00010165'
>>> ''.join(['\\{:03o}'.format(c) for c in char.encode('utf8')])
'\\360\\220\\205\\245'
>>> print(''.join(['\\{:03o}'.format(c) for c in char.encode('utf8')]))
\360\220\205\245
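
If this is needed in more than one place, the one-liner could be wrapped in a small helper. The following is only a sketch: the names octal_escape and octal_unescape are hypothetical (not from the standard library), and the reverse function is there only to check the round trip, assuming the escaped string contains nothing but the octal escapes produced by octal_escape.

```python
def octal_escape(text, encoding="utf-8"):
    """Return text as backslash-octal escapes, one per encoded byte."""
    # Iterating over a bytes object in Python 3 yields integers 0-255.
    return "".join("\\{:03o}".format(byte) for byte in text.encode(encoding))


def octal_unescape(escaped, encoding="utf-8"):
    """Reverse octal_escape (hypothetical helper, used here as a check)."""
    # unicode_escape maps each \ooo to the code point with that value;
    # latin-1 then turns code points 0-255 back into the same byte values.
    raw = escaped.encode("ascii").decode("unicode_escape").encode("latin-1")
    return raw.decode(encoding)


escaped = octal_escape("\U00010165")
print(escaped)  # \360\220\205\245
assert octal_unescape(escaped) == "\U00010165"

# The 03 in the format spec zero-pads small byte values to three digits.
print(octal_escape("\n"))  # \012
```

The zero padding matters because an unpadded escape such as \12 followed by a digit character would be read as a different, longer octal escape.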
