How to evaluate multiple conditions in Python: how do you group by a column with Python pandas and compute values by condition?...

阿内本人

Published 2020-12-08 11:02:36

Input:

import pandas as pd

df = pd.DataFrame({
    'BusId': ['abc1', 'abc2', 'abc3', 'abc1', 'abc2', 'abc4'],
    'Fair': [5, 6, 7, 10, 5, 4]
})

I need to group by BusId and produce the following output:

Output:

BusId  Count of Fair>=5  Count of Fair>=10
abc1   2                 1
abc2   2                 0
abc3   1                 0
abc4   0                 0

Thanks for the help.

Solution

Use agg on the grouped Series with two helper functions, each counting the values at or above one of your thresholds.

Note that passing a dict of output names to agg on a grouped Series, as done here, was deprecated in pandas 0.20 and removed in pandas 1.0, where it raises a SpecificationError.

df.groupby('BusId').Fair.agg({
    'gt5': lambda x: (x >= 5).sum(),
    'gt10': lambda x: (x >= 10).sum()
})

       gt5  gt10
BusId
abc1     2     1
abc2     2     0
abc3     1     0
abc4     0     0
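
Since pandas 1.0 and later reject the dict-of-names form on a grouped Series, the same counts can be sketched with named aggregation (available since pandas 0.25), where each keyword becomes an output column. This is a minimal sketch rather than the original answer's code; the variable name result is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'BusId': ['abc1', 'abc2', 'abc3', 'abc1', 'abc2', 'abc4'],
    'Fair': [5, 6, 7, 10, 5, 4],
})

# Named aggregation: keyword = output column name, value = aggregation function.
# Summing a boolean mask counts the True values in each group.
result = df.groupby('BusId')['Fair'].agg(
    gt5=lambda x: (x >= 5).sum(),
    gt10=lambda x: (x >= 10).sum(),
)
print(result)
```

The column names gt5 and gt10 are kept from the answer above, although both conditions are "greater than or equal to".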

You can also avoid lambda by first adding boolean columns and then summing them per group:

out = df.assign(gt5=df.Fair.ge(5), gt10=df.Fair.ge(10))
out.groupby('BusId').agg({'gt5': 'sum', 'gt10': 'sum'}).astype(int)

       gt5  gt10
BusId
abc1     2     1
abc2     2     0
abc3     1     0
abc4     0     0
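
The boolean-column idea extends to any number of thresholds. The sketch below wraps it in a helper; the function name count_at_least, its parameters, and the ge5/ge10 column names are hypothetical and not part of the original answer.

```python
import pandas as pd

def count_at_least(df, key, col, thresholds):
    """Hypothetical helper: per-group count of rows where df[col] >= t."""
    # One boolean column per threshold, sharing df's index.
    flags = pd.DataFrame({f'ge{t}': df[col].ge(t) for t in thresholds})
    # Group the flags by the key column and sum the True values.
    return flags.groupby(df[key]).sum().astype(int)

df = pd.DataFrame({
    'BusId': ['abc1', 'abc2', 'abc3', 'abc1', 'abc2', 'abc4'],
    'Fair': [5, 6, 7, 10, 5, 4],
})

counts = count_at_least(df, 'BusId', 'Fair', [5, 10])
print(counts)
```

Grouping the flag frame by df[key] directly avoids adding temporary columns to the original frame.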

On this six-row frame, the second approach is slightly faster (about 3.8 ms versus 5.1 ms per loop):

%%timeit
df.groupby('BusId').Fair.agg({
    'gt5': lambda x: (x >= 5).sum(),
    'gt10': lambda x: (x >= 10).sum()
})

5.05 ms ± 69 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
out = df.assign(gt5=df.Fair.ge(5), gt10=df.Fair.ge(10))
out.groupby('BusId').agg({'gt5': 'sum', 'gt10': 'sum'}).astype(int)

3.76 ms ± 44.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
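
The %%timeit magic only works in IPython or Jupyter. To repeat the comparison in a plain script, something like the following sketch could be used. The synthetic frame, its size n, and the function names with_lambdas and with_flags are assumptions made for illustration, and the lambda version uses named aggregation so that it runs on current pandas. Timings depend on the machine and the pandas version, so no expected numbers are shown.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data; the size and value range are assumptions, not from the post.
rng = np.random.default_rng(0)
n = 10_000
big = pd.DataFrame({
    'BusId': rng.choice(['abc1', 'abc2', 'abc3', 'abc4'], size=n),
    'Fair': rng.integers(0, 15, size=n),
})

def with_lambdas(df):
    # First approach, written with named aggregation.
    return df.groupby('BusId')['Fair'].agg(
        gt5=lambda x: (x >= 5).sum(),
        gt10=lambda x: (x >= 10).sum(),
    )

def with_flags(df):
    # Second approach: boolean columns, then a per-group sum.
    out = df.assign(gt5=df.Fair.ge(5), gt10=df.Fair.ge(10))
    return out.groupby('BusId').agg({'gt5': 'sum', 'gt10': 'sum'}).astype(int)

# Sanity check: both approaches must produce the same counts.
assert (with_lambdas(big).to_numpy() == with_flags(big).to_numpy()).all()

for fn in (with_lambdas, with_flags):
    seconds = timeit.timeit(lambda: fn(big), number=10) / 10
    print(f'{fn.__name__}: {seconds * 1e3:.2f} ms per call')
```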
