Python regex matching any character: a regular expression that matches a string starting with any characters, followed by a hyphen...

Latest recommended article published on 2022-07-22 15:13:06

weixin_39907850

Article tags: Python regex matching any character

Try this code:

import re

text = "BBC \xe2 abc - Here is the text"

m = re.search(r'^(.*? [-\xe2] )?(.*)', text, re.UNICODE)
# or, equivalently:
# m = re.match(r'(.*? [-\xe2] )?(.*)', text, re.UNICODE)

# You don't really need re.UNICODE (it is already the default for str
# patterns in Python 3), but if you want to use Unicode characters it is
# better to consider à a letter :-), hence re.UNICODE.

# group(1) contains the part before the hyphen
# (including the separator and the spaces around it)
if m.group(1) is not None:
    print(m.group(1))

# group(2) contains the part after the hyphen, or the whole string
# if there is no hyphen
print(m.group(2))
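To see how the two groups behave on different inputs, the same pattern can be wrapped in a small helper. This is only a minimal sketch for Python 3: the name split_prefix and the second and third sample strings are hypothetical and were chosen for illustration.

```python
import re

# Same pattern as in the code above, compiled once.
PATTERN = re.compile(r'^(.*? [-\xe2] )?(.*)')


def split_prefix(text):
    """Return (prefix, rest); prefix is None when no separator is found.

    split_prefix is a hypothetical helper name used for illustration.
    """
    # Every part of the pattern can match the empty string,
    # so match() always returns a match object here.
    m = PATTERN.match(text)
    return m.group(1), m.group(2)


samples = [
    "BBC \xe2 abc - Here is the text",  # separator is \xe2
    "BBC - Here is the text",           # separator is a plain hyphen
    "Here is the text",                 # no separator at all
]
for s in samples:
    print(split_prefix(s))
```

For the last sample the optional group does not take part in the match, so the first element of the tuple is None and the second is the whole string.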

Explanation of the regular expression:

^ is the beginning of the string (re.match always starts at the beginning of the string, so it does not need the ^).

(...) creates a capturing group (something that will be available through group(...)).

(...)? is an optional group.

[-\xe2] matches one character that is either - or \xe2 (you can put any number of characters inside the [], e.g. [abc] means a or b or c).

.*? [-\xe2] (there is a space after the ]) matches any characters, followed by a space, a hyphen or \xe2, and another space.

The *? means that the * is "lazy": it tries to catch the minimum possible number of characters. For the string ABC - DEF - GHI, .* - would catch ABC - DEF -, while .*? - would catch only ABC -.
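The difference between the greedy and the lazy quantifier can be checked directly on the ABC - DEF - GHI example. This is a small sketch for Python 3; the variable names are arbitrary, and the patterns include the trailing space, as the full expression does.

```python
import re

text = "ABC - DEF - GHI"

# Greedy: .* grabs as much as possible, then backtracks to the LAST " - ".
greedy = re.match(r'.* - ', text).group(0)

# Lazy: .*? grabs as little as possible and stops at the FIRST " - ".
lazy = re.match(r'.*? - ', text).group(0)

print(repr(greedy))  # 'ABC - DEF - '
print(repr(lazy))    # 'ABC - '
```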

So:

(.*? [-\xe2] )? means the string may start with any characters followed by a hyphen (or \xe2) surrounded by spaces. If it does, that part goes into group(1); if not, group(1) will be None.

(.*) means it is then followed by any characters. You don't need the $ (the end of the string, the opposite of ^) because * always eats all the characters it can (it is a greedy operator).
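The claim that the $ is unnecessary can be checked with a short comparison. This is a sketch for Python 3 under the assumption of single-line input; the sample strings are hypothetical.

```python
import re

text = "BBC - Here is the text"

without_anchor = re.match(r'(.*? [-\xe2] )?(.*)', text)
with_anchor = re.match(r'(.*? [-\xe2] )?(.*)$', text)

# Both matches capture the same groups, and the match without $
# already ends at the end of the string.
print(without_anchor.groups() == with_anchor.groups())
print(without_anchor.end() == len(text))

# Caveat: "." does not match a newline unless re.DOTALL is passed,
# so on multi-line input (.*) stops at the first line break.
multi = "BBC - first line\nsecond line"
first_line_only = re.match(r'(.*? [-\xe2] )?(.*)', multi).group(2)
print(first_line_only)
```

For multi-line input, passing re.DOTALL lets (.*) run to the end of the whole string.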
