Exact string matching in Python: exact match of strings in pandas


I have a DataFrame df with a column A, for example:

A

0 Good to 1. Good communication EI : tathagata.kar@ae.com

1 SAP ECC Project System EI: ram.vaddadi@ae.com

2 EI : ravikumar.swarna Role:SSE Minimum Skill

I have a list of strings:

ls=['tathagata.kar@ae.com','a.kar@ae.com']

Now I want to find which of these strings occur in the column, so I filter like this:

for i in range(len(ls)):
    df1 = df[df['A'].str.contains(ls[i])]
    if len(df1.columns != 0):
        print(ls[i])

I get the output

tathagata.kar@ae.com

a.kar@ae.com

But I need only tathagata.kar@ae.com

How can this be achieved?

As you can see, I've tried str.contains, but I need something that does an exact match.

Solution

You could simply use ==

string_a == string_b

It should return True if the two strings are equal. But this does not solve your issue.

Edit 2: You should use len(df1.index) instead of len(df1.columns). Indeed, len(df1.columns) will give you the number of columns, and not the number of rows.
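To illustrate the difference, here is a minimal sketch that rebuilds the question's data (the address nobody@example.com is a made-up value chosen so that no row matches). The filtered frame keeps its column even when it has no rows, so a test on the column count can never detect an empty result:

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({"A": [
    "Good to 1. Good communication EI : tathagata.kar@ae.com",
    "SAP ECC Project System EI: ram.vaddadi@ae.com",
    "EI : ravikumar.swarna Role:SSE Minimum Skill",
]})

# A filter that matches no row (regex=False treats the text literally).
# "nobody@example.com" is a hypothetical address used only for this demo.
empty = df[df["A"].str.contains("nobody@example.com", regex=False)]

n_cols = len(empty.columns)  # the column survives the filter
n_rows = len(empty.index)    # the rows do not
print(n_cols, n_rows)
```

Checking n_rows (or empty.empty) is what tells you whether anything matched.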

Edit 3: After reading your second post, I've understood your problem. The solution you propose could lead to some errors.

For instance, if you have:

ls=['tathagata.kar@ae.com','a.kar@ae.com', 'tathagata.kar@ae.co']

the first and the third element will match str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i])

And this is an unwanted behaviour.
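A small sketch of that failure, using the question's data and the prefix-only pattern quoted above (the variable names prefix and matched are just for illustration):

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({"A": [
    "Good to 1. Good communication EI : tathagata.kar@ae.com",
    "SAP ECC Project System EI: ram.vaddadi@ae.com",
    "EI : ravikumar.swarna Role:SSE Minimum Skill",
]})

ls = ['tathagata.kar@ae.com', 'a.kar@ae.com', 'tathagata.kar@ae.co']

# Pattern that only constrains what comes BEFORE the address.
prefix = r'(?:\s|^|Ei:|EI:|EI-)'

# Collect every candidate that matches at least one row.
matched = [s for s in ls if df['A'].str.contains(prefix + s).any()]
print(matched)
```

The truncated address 'tathagata.kar@ae.co' is reported as a match because nothing in the pattern says where the address must stop.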

You could add a check on the end of the string: str.contains(r'(?:\s|^|Ei:|EI:|EI-)'+ls[i]+r'(?:\s|$)')

Like this:

for i in range(len(ls)):
    df1 = df[df['A'].str.contains(r'(?:\s|^|Ei:|EI:|EI-)' + ls[i] + r'(?:\s|$)')]
    if len(df1.index) != 0:
        print(ls[i])

(Remove the parentheses around the argument of print if you use Python 2.7.)
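One further caveat: the "." in an e-mail address is a regex metacharacter that matches any character, so it is safer to escape each string before building the pattern. Below is a possible sketch of the whole approach wrapped in a helper; find_exact is a hypothetical name, not a pandas function, and the prefix and suffix alternatives are the ones used above, which assume an address is always delimited by whitespace, the start or end of the cell, or one of the "EI" markers.

```python
import re
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({"A": [
    "Good to 1. Good communication EI : tathagata.kar@ae.com",
    "SAP ECC Project System EI: ram.vaddadi@ae.com",
    "EI : ravikumar.swarna Role:SSE Minimum Skill",
]})

ls = ['tathagata.kar@ae.com', 'a.kar@ae.com', 'tathagata.kar@ae.co']


def find_exact(frame, column, candidates):
    """Return the candidates that occur as a whole token in frame[column].

    Hypothetical helper: the delimiters are an assumption taken from
    the pattern in the answer, not a general e-mail tokenizer.
    """
    found = []
    for s in candidates:
        # re.escape makes "." (and any other metacharacter) literal.
        pattern = r'(?:\s|^|Ei:|EI:|EI-)' + re.escape(s) + r'(?:\s|$)'
        if frame[column].str.contains(pattern).any():
            found.append(s)
    return found


result = find_exact(df, 'A', ls)
print(result)
```

With the question's data this keeps only the full address and rejects both the suffix 'a.kar@ae.com' and the truncated 'tathagata.kar@ae.co'.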
