pandas: how to remove the first rows before a specific value

I am trying to remove all rows that come before an initial value within each group. For instance, if my max_value = 250, then all rows in a group before the first value above 250 should be removed. If a value of 250 or less appears again later in that group, it is not removed.

import pandas as pd

df = pd.DataFrame({
    'date': ['2019-01-01', '2019-02-01', '2019-03-01', '2019-04-01',
             '2019-01-01', '2019-02-01', '2019-03-01', '2019-04-01',
             '2019-01-01', '2019-02-01', '2019-03-01', '2019-04-01'],
    'Asset': ['Asset A', 'Asset A', 'Asset A', 'Asset A', 'Asset A', 'Asset A',
              'Asset B', 'Asset B', 'Asset B', 'Asset B', 'Asset B', 'Asset B'],
    'Monthly Value': [100, 200, 300, 400, 500, 600, 100, 200, 300, 200, 300, 200]
})

unique_list = list(df['Asset'].unique())
max_value = 250

print(df)

          date    Asset  Monthly Value
0   2019-01-01  Asset A            100
1   2019-02-01  Asset A            200
2   2019-03-01  Asset A            300
3   2019-04-01  Asset A            400
4   2019-01-01  Asset A            500
5   2019-02-01  Asset A            600
6   2019-03-01  Asset B            100
7   2019-04-01  Asset B            200
8   2019-01-01  Asset B            300
9   2019-02-01  Asset B            200
10  2019-03-01  Asset B            300
11  2019-04-01  Asset B            200

If the threshold (max_value) is 250, then the dataframe should look like the one below. Notice that, for each group, all rows before the first value above 250 are removed. If a value below 250 shows up again after that point, it is kept. Any help would be appreciated.

          date    Asset  Monthly Value
2   2019-03-01  Asset A            300
3   2019-04-01  Asset A            400
4   2019-01-01  Asset A            500
5   2019-02-01  Asset A            600
8   2019-01-01  Asset B            300
9   2019-02-01  Asset B            200
10  2019-03-01  Asset B            300
11  2019-04-01  Asset B            200

Solution

This should do the trick:

df[df.groupby('Asset')['Monthly Value'].apply(lambda x: x.gt(max_value).cumsum().ne(0))]

Yields:

          date    Asset  Monthly Value
2   2019-03-01  Asset A            300
3   2019-04-01  Asset A            400
4   2019-01-01  Asset A            500
5   2019-02-01  Asset A            600
8   2019-01-01  Asset B            300
9   2019-02-01  Asset B            200
10  2019-03-01  Asset B            300
11  2019-04-01  Asset B            200
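
To see why the mask works, it may help to build it one step at a time. The sketch below is an equivalent formulation rather than the exact code from the answer: it applies a grouped cumsum directly instead of going through apply, on the assumption that this keeps the mask aligned with the original index (depending on the pandas version, groupby(...).apply may add the group keys to the result's index, which can break the boolean indexing). The intermediate names above and seen are made up for illustration, and the date column is left out for brevity.

```python
import pandas as pd

df = pd.DataFrame({
    'Asset': ['Asset A'] * 6 + ['Asset B'] * 6,
    'Monthly Value': [100, 200, 300, 400, 500, 600, 100, 200, 300, 200, 300, 200],
})
max_value = 250

# 1 where the value exceeds the threshold, 0 otherwise
above = df['Monthly Value'].gt(max_value).astype(int)

# Running count of above-threshold rows within each asset:
# it stays at 0 until the first such row and never drops back to 0
seen = above.groupby(df['Asset']).cumsum()

# Keep every row from the first above-threshold row onwards
result = df[seen.ne(0)]
print(result)
```

Because the running count never decreases, a later value below the threshold (for example row 9 of Asset B) still has a non-zero count and is kept.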

Additionally, if you store your max values in a dictionary like max_value = {'Asset A': 250, 'Asset B': 250}, you can do the following to achieve the same result:

df[df.groupby('Asset')['Monthly Value'].apply(lambda x: x.gt(max_value[x.name]).cumsum().ne(0))]
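
As a rough alternative for the dictionary case, the per-asset threshold can be looked up for every row with map and then compared column-wise, which avoids the lambda. This is only a sketch: the thresholds below are hypothetical and deliberately different from each other so that the effect is visible, and the names threshold and above are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Asset': ['Asset A'] * 6 + ['Asset B'] * 6,
    'Monthly Value': [100, 200, 300, 400, 500, 600, 100, 200, 300, 200, 300, 200],
})

# Hypothetical per-asset thresholds (not from the original question)
max_value = {'Asset A': 450, 'Asset B': 250}

# Look up each row's threshold from its asset name
threshold = df['Asset'].map(max_value)

# 1 where the value exceeds its own asset's threshold, 0 otherwise
above = df['Monthly Value'].gt(threshold).astype(int)

# Keep rows from the first above-threshold row of each asset onwards
result = df[above.groupby(df['Asset']).cumsum().ne(0)]
print(result)
```

One thing to watch for: an asset that is missing from the dictionary maps to NaN, the comparison is then False for all of its rows, and the whole group is dropped.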
