How to split a line at a non-printing ASCII character in Python

How can I split a line in Python at a non-printing ASCII character (such as the long minus sign, hex 0x97, octal 227)?

I won't need the character itself. The information after it will be saved as a variable.

Solution

You can use re.split.

>>> import re

>>> re.split(r'\W+', 'Words, words, words.')

['Words', 'words', 'words', '']

Adjust the pattern so that it matches only the characters you want to split on; everything it matches is dropped from the result.
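For the character named in the question, a minimal Python 3 sketch might look like the following. The helper name `text_after` and the sample line are made up for illustration, and it assumes the line is already a `str` in which the separator appears as the code point `\x97`.

```python
import re

def text_after(line, sep="\x97"):
    """Return the text after the first occurrence of sep, or None if sep is absent."""
    # re.escape keeps the separator literal even if it is a regex metacharacter.
    parts = re.split(re.escape(sep), line, maxsplit=1)
    return parts[1] if len(parts) == 2 else None

# Hypothetical input: a label, the separator, then the part to keep.
value = text_after("label\x97payload")
print(value)
```

Using `maxsplit=1` keeps any later occurrences of the separator inside the saved value rather than splitting on them as well.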

Example with the long minus (a Python 2 byte string):

>>> # \xe2\x80\x93 is the UTF-8 encoding of the en dash (U+2013)

>>> s = 'hello – world'

>>> s

'hello \xe2\x80\x93 world'

>>> import re

>>> re.split("\xe2\x80\x93", s)

['hello ', ' world']

Or, the same with a unicode string:

>>> # \u2013 is the en dash (also called a long dash or long minus)

>>> s = u'hello – world'

>>> s

u'hello \u2013 world'

>>> import re

>>> re.split(u"\u2013", s)

[u'hello ', u' world']
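The sessions above are Python 2. In Python 3 the same idea could be sketched as below, without a regular expression. This assumes the raw byte 0x97 comes from Windows-1252 text, where it decodes to the em dash (U+2014); the question does not state the encoding, so treat that as a guess, and the sample bytes are made up.

```python
# Hypothetical raw line, as it might be read from a file opened in binary mode.
raw = b"hello \x97 world"

# Option 1: split the bytes directly at the first 0x97.
head, sep, tail = raw.partition(b"\x97")

# Option 2: decode first (assuming Windows-1252), then split at the em dash.
text = raw.decode("cp1252")
head_s, sep_s, tail_s = text.partition("\u2014")

# The part after the separator, saved as a variable and stripped of spaces.
info = tail_s.strip()
print(info)
```

`partition` returns the text before the separator, the separator itself, and the text after it; if the separator is missing, the last two items are empty, so checking `sep` tells you whether a split happened.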
