Comparing maxima over time in Python: finding each day's maximum and its timestamp (yyyy:mm:dd hh:mm:ss) in pandas

I have about 150 MB of minute-wise measurements covering every day for two years; a sample is given below. I want to create a new data frame containing each day's maximum together with its timestamp.

My sample data is:

DateTime Power

01-Aug-16 10:43:00.000 229.9607961

01-Aug-16 10:43:23.000 230.9030781

01-Aug-16 10:44:00.000 231.716212

01-Aug-16 10:45:00.000 232.4485882

01-Aug-16 10:46:00.000 233.2739154

02-Aug-16 09:42:00.000 229.6851724

02-Aug-16 09:43:00.000 230.9163998

02-Aug-16 09:43:06.000 230.9883337

02-Aug-16 09:44:00.000 231.2569098

02-Aug-16 09:49:00.000 229.5774805

02-Aug-16 09:50:00.000 229.8758693

02-Aug-16 09:51:00.000 229.9825204

03-Aug-16 10:09:00.000 231.3605982

03-Aug-16 10:10:00.000 231.6827163

03-Aug-16 10:11:00.000 231.1580262

03-Aug-16 10:12:00.000 230.4054286

03-Aug-16 10:13:00.000 229.6507959

03-Aug-16 10:13:02.000 229.6268353

03-Aug-16 10:14:00.000 230.4584964

03-Aug-16 10:15:00.000 230.9004206

03-Aug-16 10:16:00.000 231.189036

My present code is:

max_per_day = df.groupby(pd.Grouper(key='time',freq='D')).max()

print(max_per_day)

My present output is:

time

2016-08-01 237.243835

2016-08-02 239.658539

2016-08-03 237.424683

2016-08-04 236.790695

2016-08-05 240.163910

At present it outputs only the date (yyyy-mm-dd) and the value, but I also want the time (hh:mm or hh:mm:ss) of each maximum. I tried the following code:

max_pmpp_day = df.loc[df.groupby(pd.Grouper(freq='D')).idxmax().iloc[:,0]]

The output is:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Int64Index'

I tried @jezrael's answer:

df['DateTime'] = pd.to_datetime(df['time'])

s = df.groupby(pd.Grouper(key='DateTime', freq='D'))['Pmpp'].transform('max')

df = df[df['Pmpp'] == s]

print(df)

and the output is:

time Pmpp DateTime

34 2016-08-01 11:11:00 237.243835 2016-08-01 11:11:00

434 2016-08-02 13:30:02 239.658539 2016-08-02 13:30:02

648 2016-08-03 12:39:00 237.424683 2016-08-03 12:39:00

Solution

You can use GroupBy.transform or Resampler.transform to return each day's maximum as a new Series aligned with the original rows, then keep the rows where the original column equals it:

df['DateTime'] = pd.to_datetime(df['DateTime'])

s = df.groupby(pd.Grouper(key='DateTime', freq='D'))['Power'].transform('max')

#alternative

#s = df.resample('D', on='DateTime')['Power'].transform('max')

df = df[df['Power'] == s]

print (df)

DateTime Power

4 2016-08-01 10:46:00 233.273915

8 2016-08-02 09:44:00 231.256910

13 2016-08-03 10:10:00 231.682716
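For anyone who wants to try the transform approach without the full file, here is a minimal self-contained sketch on a few rows taken from the question's sample. The names `sample`, `daily_max` and `peaks` are only illustrative, and the explicit `format` string is an assumption about how the raw text looks (English month abbreviations, milliseconds after the dot).

```python
import pandas as pd

# A few rows from the question's sample data (abridged, two per day).
raw = [
    ("01-Aug-16 10:45:00.000", 232.4485882),
    ("01-Aug-16 10:46:00.000", 233.2739154),
    ("02-Aug-16 09:44:00.000", 231.2569098),
    ("02-Aug-16 09:49:00.000", 229.5774805),
    ("03-Aug-16 10:10:00.000", 231.6827163),
    ("03-Aug-16 10:11:00.000", 231.1580262),
]
sample = pd.DataFrame(raw, columns=["DateTime", "Power"])

# Assumed input format: day-month abbreviation-two digit year, then time.
sample["DateTime"] = pd.to_datetime(
    sample["DateTime"], format="%d-%b-%y %H:%M:%S.%f"
)

# Broadcast each day's maximum back onto every row of that day.
daily_max = sample.groupby(pd.Grouper(key="DateTime", freq="D"))["Power"].transform("max")

# Keep only the rows whose value equals their day's maximum.
peaks = sample[sample["Power"] == daily_max]
print(peaks)
```

Because the filter is an equality test, every row that ties for a day's maximum is kept, so a day can contribute more than one row.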

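Or set the column as a DatetimeIndex first and use idxmax after the groupby to get the timestamp of each daily maximum. pd.Grouper(freq='D') without a key groups on the index, which is why the attempt in the question raised a TypeError on an integer index: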

df['DateTime'] = pd.to_datetime(df['DateTime'])

df = df.set_index('DateTime')

df = df.loc[df.groupby(pd.Grouper(freq='D'))['Power'].idxmax()]

print (df)

Power

DateTime

2016-08-01 10:46:00 233.273915

2016-08-02 09:44:00 231.256910

2016-08-03 10:10:00 231.682716
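One thing to watch with two years of data is calendar days that have no rows at all. Grouping with freq='D' creates a bin for every day in the range, and for an empty bin idxmax has no label to return, which may show up as a missing label or an error depending on the pandas version. A possible workaround, sketched below with made-up values and illustrative names (`readings`, `day`, `peak_labels`, `daily_peaks`), is to group on the index floored to midnight, so that only days present in the data form groups.

```python
import pandas as pd

# Toy data with a gap: nothing was recorded on 2016-08-02.
readings = pd.DataFrame(
    {"Power": [229.96, 233.27, 231.36, 231.68, 231.16]},
    index=pd.to_datetime([
        "2016-08-01 10:43:00", "2016-08-01 10:46:00",
        "2016-08-03 10:09:00", "2016-08-03 10:10:00", "2016-08-03 10:11:00",
    ]),
)
readings.index.name = "DateTime"

# normalize() floors each timestamp to midnight, giving one key per day
# that actually occurs in the data.
day = readings.index.normalize()

# idxmax returns the index label (the timestamp) of the first maximum per day.
peak_labels = readings.groupby(day)["Power"].idxmax()

# Look the full rows up again by those timestamps.
daily_peaks = readings.loc[peak_labels]
print(daily_peaks)
```

Unlike the equality filter, idxmax returns a single label per group, so ties within a day resolve to the first occurrence.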

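A solution from @Jon Clements, thank you. Sorting by Power puts each day's largest value last within its group, so taking the last row per day gives the maximum along with its timestamp: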

df = (df.sort_values('Power')
        .groupby(df.DateTime.dt.to_period('D'))
        .last()
        .reset_index(drop=True))
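A quick way to check this sort-then-last idea is to run it on four rows from the question's sample. This is only a sketch: the frame is built by hand here, and `result` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2016-08-01 10:45:00", "2016-08-01 10:46:00",
        "2016-08-02 09:44:00", "2016-08-02 09:49:00",
    ]),
    "Power": [232.4485882, 233.2739154, 231.2569098, 229.5774805],
})

result = (
    df.sort_values("Power")                       # ascending: largest value ends up last
      .groupby(df["DateTime"].dt.to_period("D"))  # one group per calendar day
      .last()                                     # last row of each group = daily maximum
      .reset_index(drop=True)                     # drop the period index, keep the columns
)
print(result)
```

The grouping key is a separate Series of daily periods, so the original DateTime column survives in the result and carries the full timestamp of each maximum.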
