Evaluating multiple conditions in Python: how do I group by a column with pandas and count values that meet a condition?...

Input:

import pandas as pd

df = pd.DataFrame({
    'BusId': ['abc1', 'abc2', 'abc3', 'abc1', 'abc2', 'abc4'],
    'Fair': [5, 6, 7, 10, 5, 4]
})

I need to group by BusId and get the following output:

Output:

BusId  Count of Fair>=5  Count of Fair>=10
abc1   2                 1
abc2   2                 0
abc3   1                 0
abc4   0                 0

Thanks for the help.

Solution

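Use agg on the grouped Fair series with two helper functions, each counting the values at or above one of your thresholds.

Note that passing a dict to agg on a grouped Series to rename the outputs, as done here, was deprecated in pandas 0.20 and removed in 1.0, where it raises a SpecificationError. On current versions, use named aggregation (keyword arguments to agg) instead.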

df.groupby('BusId').Fair.agg({
    'gt5': lambda x: (x >= 5).sum(),
    'gt10': lambda x: (x >= 10).sum()
})

       gt5  gt10
BusId
abc1     2     1
abc2     2     0
abc3     1     0
abc4     0     0
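For current pandas versions, the same counts can be written with named aggregation. This is a minimal sketch, assuming pandas 0.25 or later; it reuses the gt5 and gt10 names from the answer (which count values greater than or equal to the threshold), and the variable name counts is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "BusId": ["abc1", "abc2", "abc3", "abc1", "abc2", "abc4"],
    "Fair": [5, 6, 7, 10, 5, 4],
})

# Named aggregation: each keyword becomes the name of an output column.
counts = df.groupby("BusId")["Fair"].agg(
    gt5=lambda s: (s >= 5).sum(),    # rows in the group with Fair >= 5
    gt10=lambda s: (s >= 10).sum(),  # rows in the group with Fair >= 10
)
print(counts)
```

Each lambda receives one group's Fair values as a Series, so summing the boolean mask gives the number of rows that pass the threshold.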

You can also avoid the lambdas by computing the boolean columns first and then summing them per group:

out = df.assign(gt5=df.Fair.ge(5), gt10=df.Fair.ge(10))
out.groupby('BusId').agg({'gt5': 'sum', 'gt10': 'sum'}).astype(int)

       gt5  gt10
BusId
abc1     2     1
abc2     2     0
abc3     1     0
abc4     0     0

On this six-row frame, the second approach is slightly faster (about 3.8 ms versus 5.1 ms per loop):

%%timeit
df.groupby('BusId').Fair.agg({
    'gt5': lambda x: (x >= 5).sum(),
    'gt10': lambda x: (x >= 10).sum()
})

5.05 ms ± 69 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
out = df.assign(gt5=df.Fair.ge(5), gt10=df.Fair.ge(10))
out.groupby('BusId').agg({'gt5': 'sum', 'gt10': 'sum'}).astype(int)

3.76 ms ± 44.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
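The timings above come from a six-row frame, where fixed overhead dominates. The following is a rough sketch for repeating the comparison on a larger frame outside a notebook. The synthetic data, sizes, seed, and the function names with_lambdas and with_flags are all assumptions made for illustration and are not part of the original answer. No timings are quoted, since they depend on the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data (hypothetical sizes): 100k rows spread over up to 1k bus ids.
rng = np.random.default_rng(0)
big = pd.DataFrame({
    "BusId": rng.integers(0, 1_000, size=100_000).astype(str),
    "Fair": rng.integers(0, 15, size=100_000),
})

def with_lambdas(frame):
    # One Python-level call per group and per output column.
    return frame.groupby("BusId")["Fair"].agg(
        gt5=lambda s: (s >= 5).sum(),
        gt10=lambda s: (s >= 10).sum(),
    )

def with_flags(frame):
    # Vectorised comparisons first, then a built-in grouped sum.
    flags = frame.assign(gt5=frame["Fair"].ge(5), gt10=frame["Fair"].ge(10))
    return flags.groupby("BusId")[["gt5", "gt10"]].sum().astype(int)

# Both approaches should agree before their speed is compared.
assert (with_lambdas(big).to_numpy() == with_flags(big).to_numpy()).all()

for fn in (with_lambdas, with_flags):
    seconds = timeit.timeit(lambda: fn(big), number=3)
    print(f"{fn.__name__}: {seconds / 3:.4f} s per call")
```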
