Fast selection of a time interval in a pandas DataFrame / Series

My problem is that I want to filter a DataFrame so that it only includes times within the interval [start, end). If I do not care about the day, I would like to filter only on the start and end time of each day. I have a solution for this, but it is slow. So my question is whether there is a faster way to do time-based filtering.

Example

import time
import pandas as pd

index = pd.date_range(start='2012-11-05 01:00:00', end='2012-11-05 23:00:00', freq='1s').tz_localize('UTC')
df = pd.DataFrame(range(len(index)), index=index, columns=['Number'])

# select from 1 to 2 am, including the day
now = time.time()
df2 = df.loc['2012-11-05 01:00:00':'2012-11-05 02:00:00']
print('Took %s seconds' % (time.time() - now))  # 0.0368609428406

# select from 1 to 2 am, for every day
now = time.time()
selector = (df.index.hour >= 1) & (df.index.hour < 2)
df3 = df[selector]
print('Took %s seconds' % (time.time() - now))  # Took 0.0699911117554

As you can see, if I ignore the day (second case) it takes almost twice as long. The computation time increases rapidly if the index spans several days, e.g. from 5 to 7 Nov:

index = pd.date_range(start='2012-11-05 01:00:00', end='2012-11-07 23:00:00', freq='1s').tz_localize('UTC')

So, to summarize: is there a faster method to filter by time of day, across many days?

Thx

Solution

You need the between_time method.

In [14]: %timeit df.between_time(start_time='01:00', end_time='02:00')

100 loops, best of 3: 10.2 ms per loop

In [15]: %timeit selector=(df.index.hour>=1) & (df.index.hour<2); df[selector]

100 loops, best of 3: 18.2 ms per loop

I ran these tests with 5th to 7th November as the index.
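The comparison above can be reproduced outside IPython with the standard timeit module. This is only a rough sketch: the helper names by_mask and by_between_time are made up for illustration, and absolute timings (and even which approach wins) depend on the pandas version and the machine.

```python
import timeit

import pandas as pd

# Same index as in the question: one row per second, 5th to 7th November, UTC.
index = pd.date_range("2012-11-05 01:00:00", "2012-11-07 23:00:00", freq="s", tz="UTC")
df = pd.DataFrame({"Number": range(len(index))}, index=index)


def by_mask(frame):
    """Boolean mask on the hour: keeps 01:00:00 up to 01:59:59 on every day."""
    hours = frame.index.hour
    return frame[(hours >= 1) & (hours < 2)]


def by_between_time(frame):
    """between_time with its defaults: both 01:00:00 and 02:00:00 are kept."""
    return frame.between_time("01:00", "02:00")


runs = 20  # arbitrary repeat count, chosen only to smooth out noise
mask_seconds = timeit.timeit(lambda: by_mask(df), number=runs) / runs
between_seconds = timeit.timeit(lambda: by_between_time(df), number=runs) / runs

print("mask:         %.4f s per call" % mask_seconds)
print("between_time: %.4f s per call" % between_seconds)
```

Note that the two selections are not identical: with its default arguments between_time also returns the 02:00:00 row of each day, which the hour mask excludes.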

Documentation

Definition: df.between_time(self, start_time, end_time, include_start=True, include_end=True)
Docstring:
Select values between particular times of the day (e.g., 9:00-9:30 AM)

Parameters
----------
start_time : datetime.time or string
end_time : datetime.time or string
include_start : boolean, default True
include_end : boolean, default True

Returns
-------
values_between_time : type of caller
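Since the question asks for the half-open interval [start, end) and between_time includes both endpoints by default, the end time has to be excluded explicitly. The sketch below does that in a way that does not depend on the keyword arguments quoted above (as far as I know, newer pandas releases, 1.4 and later, replaced include_start / include_end with a single inclusive argument, so inclusive="left" should achieve the same thing there). The helper name between_time_half_open is hypothetical, and the one-row-per-minute frame is just small illustrative data.

```python
import datetime as dt

import pandas as pd

# Illustrative data: one row per minute across three full days, UTC.
index = pd.date_range("2012-11-05 00:00", "2012-11-07 23:59", freq="min", tz="UTC")
df = pd.DataFrame({"Number": range(len(index))}, index=index)


def between_time_half_open(frame, start, end):
    """Rows whose time of day lies in [start, end). Hypothetical helper."""
    # between_time keeps both endpoints by default ...
    selected = frame.between_time(start, end)
    # ... so drop the rows that fall exactly on the end time.
    return selected[selected.index.time != end]


result = between_time_half_open(df, dt.time(1, 0), dt.time(2, 0))
print(result.index.min(), result.index.max(), len(result))
```

Passing datetime.time objects rather than strings keeps the comparison against index.time straightforward, because both sides are then plain time-of-day values.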
