Python keyword list: Python keyword matching (keyword list against a column)

Suppose I have this dataset:

Name Value

0 K Ieatapple

1 Y bananaisdelicious

2 B orangelikesomething

3 Q bluegrape

4 C appleislike

and I have a keyword list like

[apple, banana]

In this dataset, I want to match the column 'Value' against the keyword list.

*By matching I mean checking whether any keyword in the list appears in 'Value'.

I would like to see how well the keywords in the list match the column,

so I want to find out what the match rate is.

Ultimately, what I want to know is

'Finding the match rate between keywords and a column'

as a percentage and, if possible, as a filtered dataframe.

Thank you.

Edit

In my real dataset, the keywords appear inside sentences,

e.g.,

Ilikeapplethanbananaandorange

so it doesn't work if I use exact keyword-to-keyword (1:1) matching.

Solution

Use str.contains to match words to your sentences:

keywords = ['apple', 'banana']

df['Value'].str.contains("|".join(keywords)).sum() / len(df)

# 0.6

Or if you want to keep the rows:

df[df['Value'].str.contains("|".join(keywords))]

Name Value

0 K Ieatapple

1 Y bananaisdelicious

4 C appleislike
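
For reference, the two snippets above can be combined into one self-contained sketch. The DataFrame is rebuilt from the sample data in the question; the variable names mask, match_rate and filtered are only illustrative. Taking the mean of a boolean mask is equivalent to sum() / len(df).

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Name": ["K", "Y", "B", "Q", "C"],
    "Value": [
        "Ieatapple",
        "bananaisdelicious",
        "orangelikesomething",
        "bluegrape",
        "appleislike",
    ],
})
keywords = ["apple", "banana"]

# True for every row whose 'Value' contains at least one keyword
mask = df["Value"].str.contains("|".join(keywords))

# The mean of a boolean Series is the fraction of True values
match_rate = mask.mean()

# Keep only the matching rows
filtered = df[mask]

print(f"{match_rate:.0%}")
print(filtered)
```

Multiplying match_rate by 100 gives the percentage asked for in the question.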

More details

The pipe | is the OR operator in regular expressions.

So we join our list of words with a pipe to match any one of these words:

>>> keywords = ['apple', 'banana']

>>> "|".join(keywords)

'apple|banana'

So the regular expression now states:

match rows where the sentence contains "apple" OR "banana"
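
Because the joined string is treated as a regular expression, keywords that contain characters such as +, $ or . would be interpreted as regex syntax. A minimal sketch of one way to guard against that, assuming your keywords are meant as literal text: escape each keyword with re.escape before joining. The helper name keyword_match_mask and the sample values are hypothetical; na=False makes missing values count as non-matches and case=False ignores letter case.

```python
import re

import pandas as pd


def keyword_match_mask(series, keywords, case=True):
    """Return a boolean mask: True where a value contains any keyword literally."""
    # Escape regex metacharacters so each keyword is matched as plain text
    pattern = "|".join(re.escape(k) for k in keywords)
    # na=False treats missing values as "no match" instead of propagating NaN
    return series.str.contains(pattern, case=case, na=False, regex=True)


# Hypothetical values, including a missing entry and regex metacharacters
values = pd.Series(["price is $5", "C++ rocks", None, "plain text"])
mask = keyword_match_mask(values, ["$5", "c++"], case=False)
print(mask.tolist())
```

Note that an empty keyword list would produce an empty pattern, which matches every non-missing row, so it is worth checking for that case before calling the helper.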
