Python regular expressions and pandas: counting regex matches of compound words in a string...

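I have a dictionary of regular expressions keyed by topic, and I want to count, for each topic, how many matches occur in a string column. Some of the terms in the dictionary are compound words (multi-word terms).

import pandas as pd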

terms = {'animals': '(fox|russian brown deer|bald eagle|arctic fox)',
         'people': '(John Adams|Rob|Steve|Superman|Super man)',
         'games': '(basketball|basket ball|bball)'}

df = pd.DataFrame({
    'Score': [4, 6, 2, 7, 8],
    'Foo': ['Superman was looking for a russian brown deer.',
            'John adams started to play basket ball with rob yesterday before steve called him',
            'Basketball or bball is a sport played by Steve afterschool',
            'The bald eagle flew pass the arctic fox three times',
            'The fox was sptted playing basket ball?']
})

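To count the matches, I could use code similar to that in the question "Python pandas count number of Regex matches in a string". However, that approach splits the string on whitespace and then counts the individual tokens, so compound terms are never matched. What alternative would also count compound words whose parts are joined by spaces?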


The final result should be:

                                                  Foo  Score  animals  people  games
0      Superman was looking for a russian brown deer.      4        1       1      0
1   John adams started to play basket ball with ro...      6        0       3      1
2   Basketball or bball is a sport played by Steve...      2        0       1      2
3  The bald eagle flew pass the arctic fox three t...      7        3       0      0
4             The fox was sptted playing basket ball?      8        1       0      1

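Note that for row 3, the word "fox" inside "arctic fox" and the term "arctic fox" itself should each be counted once (two in total) toward the animals column; together with "bald eagle" this gives the 3 shown above.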
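One possible way to meet these requirements is sketched below. It is only a sketch under stated assumptions, not an answer taken from the original thread. The helper name `count_terms` is hypothetical. It assumes every pattern is a plain `(a|b|c)` alternation of literal terms, that matching should ignore case (the expected output counts "John adams" and "rob"), and that terms should match on word boundaries. Each alternative is counted separately, so "fox" and "arctic fox" can both be counted in the same span of text.

```python
import re

import pandas as pd

terms = {
    "animals": "(fox|russian brown deer|bald eagle|arctic fox)",
    "people": "(John Adams|Rob|Steve|Superman|Super man)",
    "games": "(basketball|basket ball|bball)",
}

df = pd.DataFrame({
    "Score": [4, 6, 2, 7, 8],
    "Foo": [
        "Superman was looking for a russian brown deer.",
        "John adams started to play basket ball with rob yesterday before steve called him",
        "Basketball or bball is a sport played by Steve afterschool",
        "The bald eagle flew pass the arctic fox three times",
        "The fox was sptted playing basket ball?",
    ],
})


def count_terms(text, pattern):
    """Sum the case-insensitive, whole-word matches of each alternative.

    Assumption: `pattern` is a simple "(a|b|c)" alternation of literal terms.
    """
    alternatives = pattern.strip("()").split("|")
    total = 0
    for alt in alternatives:
        # Search for each term on its own so overlapping terms
        # such as "fox" and "arctic fox" are both counted.
        term_re = r"\b" + re.escape(alt) + r"\b"
        total += len(re.findall(term_re, text, flags=re.IGNORECASE))
    return total


# Add one count column per topic.
for topic, pattern in terms.items():
    df[topic] = df["Foo"].apply(lambda text, p=pattern: count_terms(text, p))

print(df[["Score", "animals", "people", "games"]])
```

Searching the whole string for each term, instead of splitting on whitespace first, is what lets multi-word terms such as "russian brown deer" match. The word boundaries are a design choice: they keep a short term such as "Rob" from matching inside a longer word, and they can be dropped if substring matches are wanted.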
