pandas time series/date functionality

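# Imports assumed throughout the session below:
#   from datetime import datetime
#   from numpy.random import randn
#   from pandas import DataFrame, DatetimeIndex, Series, bdate_range, date_range, to_datetime
# 72 hours starting at midnight on January 1st, 2011
In [1]: rng = date_range('1/1/2011', periods=72, freq='H')
In [3]: ts = Series(randn(len(rng)), index=rng)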
In [4]: ts.head()
Out[4]:
2011-01-01 00:00:00 0.469112
2011-01-01 01:00:00 -0.282863
2011-01-01 02:00:00 -1.509059
2011-01-01 03:00:00 -1.135632
2011-01-01 04:00:00 1.212112
Freq: H, dtype: float64

Change frequency and fill gaps:

# to 45 minute frequency and forward fill
In [5]: converted = ts.asfreq('45Min', method='pad')
In [6]: converted.head()
Out[6]:
2011-01-01 00:00:00 0.469112
2011-01-01 00:45:00 0.469112
2011-01-01 01:30:00 -0.282863
2011-01-01 02:15:00 -1.509059
2011-01-01 03:00:00 -1.135632
Freq: 45T, dtype: float64
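To make the forward fill easier to see, here is a minimal sketch that swaps the random values for a counter (value i at hour i), which is my own substitution and not part of the original session. It also passes offset objects (`pd.offsets.Hour()`, `pd.offsets.Minute(45)`) instead of alias strings, on the assumption that alias spellings differ between pandas releases.

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random series above: value i at hour i.
rng = pd.date_range("2011-01-01", periods=72, freq=pd.offsets.Hour())
ts = pd.Series(np.arange(72, dtype="float64"), index=rng)

# Move to a 45-minute grid; each new timestamp takes the most recent
# hourly observation at or before it (forward fill).
converted = ts.asfreq(pd.offsets.Minute(45), method="ffill")

# Without a fill method, timestamps absent from the hourly index become NaN.
sparse = ts.asfreq(pd.offsets.Minute(45))

print(converted.head())
print(sparse.head())
```

With the counter values, 00:45 repeats the 00:00 value and 01:30 picks up the 01:00 value, which is the same pattern visible in Out[6].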

Resample:

# Daily means
In [7]: ts.resample('D', how='mean')
Out[7]:
2011-01-01 -0.319569
2011-01-02 -0.337703
2011-01-03 0.117258
Freq: D, dtype: float64
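Later pandas releases dropped the `how=` keyword; there `resample` returns a grouper-like object and the aggregation is chained as a method call. A minimal sketch of that form, again using a counter series of my own in place of the random data:

```python
import numpy as np
import pandas as pd

# Value i at hour i, covering three full days.
rng = pd.date_range("2011-01-01", periods=72, freq=pd.offsets.Hour())
ts = pd.Series(np.arange(72, dtype="float64"), index=rng)

# Daily means: the chained equivalent of resample('D', how='mean').
daily_mean = ts.resample("D").mean()

# Several aggregations per day at once.
daily_stats = ts.resample("D").agg(["min", "max"])

print(daily_mean)
print(daily_stats)
```

Each day holds 24 consecutive integers, so the means come out as the midpoints of 0..23, 24..47 and 48..71.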

Converting to Timestamps

In [16]: to_datetime(Series(['Jul 31, 2009', '2010-01-10', None]))
Out[16]:
0 2009-07-31
1 2010-01-10
2 NaT
dtype: datetime64[ns]
In [17]: to_datetime(['2005/11/23', '2010.12.31'])
Out[17]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2005-11-23, 2010-12-31]
Length: 2, Freq: None, Timezone: None
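A small sketch of the same two conversions with the `pd.` namespace spelled out. I keep one date layout per call and pass `format=` explicitly for the list, on the assumption that newer pandas versions are less forgiving about mixed layouts than the session above.

```python
import pandas as pd

# A Series in, a datetime Series out; missing entries become NaT.
s = pd.to_datetime(pd.Series(["2009-07-31", "2010-01-10", None]))

# A list in, a DatetimeIndex out; format= states the layout explicitly.
idx = pd.to_datetime(["2005/11/23", "2010/12/31"], format="%Y/%m/%d")

print(s)
print(idx)
```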

If your dates put the day first (European style), pass the dayfirst flag:

In [18]: to_datetime(['04-01-2012 10:00'], dayfirst=True)
Out[18]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-01-04 10:00:00]
Length: 1, Freq: None, Timezone: None
In [19]: to_datetime(['14-01-2012', '01-14-2012'], dayfirst=True)
Out[19]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-01-14, 2012-01-14]
Length: 2, Freq: None, Timezone: None
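Note what In [19] shows: '01-14-2012' still parses as 14 January, so in that session dayfirst acts as a preference rather than a strict rule. Where the layout is known in advance, an explicit `format=` string is the safer choice. A minimal sketch comparing the three readings of one ambiguous string:

```python
import pandas as pd

# dayfirst=True: read "04-01-2012" as 4 January rather than 1 April.
day_first = pd.to_datetime("04-01-2012 10:00", dayfirst=True)

# Spelling the layout out removes the ambiguity altogether.
explicit = pd.to_datetime("04-01-2012 10:00", format="%d-%m-%Y %H:%M")

# Default (month-first) reading of the same string, for contrast.
month_first = pd.to_datetime("04-01-2012 10:00")

print(day_first, explicit, month_first)
```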

Invalid Data

In [20]: to_datetime(['2009-07-31', 'asd'])
Out[20]: array(['2009-07-31', 'asd'], dtype=object)
In [21]: to_datetime(['2009-07-31', 'asd'], coerce=True)
Out[21]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2009-07-31, NaT]
Length: 2, Freq: None, Timezone: None
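In later pandas releases the `coerce=True` flag was replaced by `errors='coerce'`. As far as I know the default also changed to `errors='raise'`, so the silent pass-through seen in Out[20] should not be relied on. A sketch under those assumptions:

```python
import pandas as pd

raw = ["2009-07-31", "asd"]

# errors="coerce": unparseable entries become NaT instead of failing.
coerced = pd.to_datetime(raw, errors="coerce")

# With the default error handling, recent releases are expected to
# reject the bad entry instead of returning the input unchanged.
try:
    pd.to_datetime(raw)
    raised = False
except ValueError:
    raised = True

print(coerced)
print("default raised:", raised)
```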

Generating Ranges of Timestamps

In [27]: dates = [datetime(2012, 5, 1), datetime(2012, 5, 2), datetime(2012, 5, 3)]
In [28]: index = DatetimeIndex(dates)
In [29]: index # Note the frequency information
Out[29]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-05-01, ..., 2012-05-03]
Length: 3, Freq: None, Timezone: None
In [32]: index = date_range('2000-1-1', periods=1000, freq='M')
In [33]: index
Out[33]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2000-01-31, ..., 2083-04-30]
Length: 1000, Freq: M, Timezone: None
In [34]: index = bdate_range('2012-1-1', periods=250)
In [35]: index
Out[35]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-01-02, ..., 2012-12-14]
Length: 250, Freq: B, Timezone: None
In [36]: start = datetime(2011, 1, 1)
In [37]: end = datetime(2012, 1, 1)
In [38]: rng = date_range(start, end)
In [39]: rng
Out[39]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2011-01-01, ..., 2012-01-01]
Length: 366, Freq: D, Timezone: None
In [40]: rng = bdate_range(start, end)
In [41]: rng
Out[41]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2011-01-03, ..., 2011-12-30]
Length: 260, Freq: B, Timezone: None
In [42]: date_range(start, end, freq='BM')
Out[42]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2011-01-31, ..., 2011-12-30]
Length: 12, Freq: BM, Timezone: None
In [43]: date_range(start, end, freq='W')
Out[43]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2011-01-02, ..., 2012-01-01]
Length: 53, Freq: W-SUN, Timezone: None
In [44]: bdate_range(end=end, periods=20)
Out[44]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2011-12-05, ..., 2011-12-30]
Length: 20, Freq: B, Timezone: None
In [45]: bdate_range(start=start, periods=20)
Out[45]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2011-01-03, ..., 2011-01-28]
Length: 20, Freq: B, Timezone: None
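The range examples above can be collected into one script. This is a sketch, not the original session: it uses offset objects (`pd.offsets.Week(weekday=6)` for Sunday-anchored weeks, `pd.offsets.BMonthEnd()` for business month ends) in place of the 'W' and 'BM' alias strings, assuming that the objects are more stable across pandas versions than the aliases.

```python
from datetime import datetime

import pandas as pd

start = datetime(2011, 1, 1)
end = datetime(2012, 1, 1)

# Calendar days, both endpoints included.
daily = pd.date_range(start, end)

# Business days only (Monday to Friday).
business = pd.bdate_range(start, end)

# Weekly on Sundays, and the last business day of each month.
weekly = pd.date_range(start, end, freq=pd.offsets.Week(weekday=6))
bmonth_end = pd.date_range(start, end, freq=pd.offsets.BMonthEnd())

# Any two of start, end and periods are enough to pin a range down.
last_20 = pd.bdate_range(end=end, periods=20)

for name, r in [("daily", daily), ("business", business),
                ("weekly", weekly), ("bmonth_end", bmonth_end),
                ("last_20", last_20)]:
    print(name, len(r), r[0].date(), r[-1].date())
```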

DatetimeIndex

In [56]: dft = DataFrame(randn(100000, 1), columns=['A'], index=date_range('20130101', periods=100000, freq='T'))
In [57]: dft
Out[57]:
A
2013-01-01 00:00:00 0.176444
2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-03-11 10:33:00 -0.293083
2013-03-11 10:34:00 -0.059881
2013-03-11 10:35:00 1.252450
2013-03-11 10:36:00 0.046611
2013-03-11 10:37:00 0.059478
2013-03-11 10:38:00 -0.286539
2013-03-11 10:39:00 0.841669
[100000 rows x 1 columns]
In [58]: dft['2013']
Out[58]:
A
2013-01-01 00:00:00 0.176444
2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-03-11 10:33:00 -0.293083
2013-03-11 10:34:00 -0.059881
2013-03-11 10:35:00 1.252450
2013-03-11 10:36:00 0.046611
2013-03-11 10:37:00 0.059478
2013-03-11 10:38:00 -0.286539
2013-03-11 10:39:00 0.841669
[100000 rows x 1 columns]
In [59]: dft['2013-1':'2013-2']
Out[59]:
A
2013-01-01 00:00:00 0.176444
2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-02-28 23:53:00 0.103114
2013-02-28 23:54:00 -1.303422
2013-02-28 23:55:00 0.451943
2013-02-28 23:56:00 0.220534
2013-02-28 23:57:00 -1.624220
2013-02-28 23:58:00 0.093915
2013-02-28 23:59:00 -1.087454
[84960 rows x 1 columns]
In [60]: dft['2013-1':'2013-2-28']
Out[60]:
A
2013-01-01 00:00:00 0.176444
2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-02-28 23:53:00 0.103114
2013-02-28 23:54:00 -1.303422
2013-02-28 23:55:00 0.451943
2013-02-28 23:56:00 0.220534
2013-02-28 23:57:00 -1.624220
2013-02-28 23:58:00 0.093915
2013-02-28 23:59:00 -1.087454
[84960 rows x 1 columns]
In [61]: dft['2013-1':'2013-2-28 00:00:00']
Out[61]:
A
2013-01-01 00:00:00 0.176444
2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-02-27 23:54:00 0.897051
2013-02-27 23:55:00 -0.309230
2013-02-27 23:56:00 1.944713
2013-02-27 23:57:00 0.369265
2013-02-27 23:58:00 0.053071
2013-02-27 23:59:00 -0.019734
2013-02-28 00:00:00 1.388189
[83521 rows x 1 columns]
In [62]: dft['2013-1-15':'2013-1-15 12:30:00']
Out[62]:
A
2013-01-15 00:00:00 0.501288
2013-01-15 00:01:00 -0.605198
2013-01-15 00:02:00 0.215146
2013-01-15 00:03:00 0.924732
2013-01-15 00:04:00 -2.228519
2013-01-15 00:05:00 1.517331
2013-01-15 00:06:00 -1.188774
... ...
2013-01-15 12:24:00 1.358314
2013-01-15 12:25:00 -0.737727
2013-01-15 12:26:00 1.838323
2013-01-15 12:27:00 -0.774090
2013-01-15 12:28:00 0.622261
2013-01-15 12:29:00 -0.631649
2013-01-15 12:30:00 0.193284
[751 rows x 1 columns]
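The row counts above follow from the resolution of each string: '2013-2' stands for the whole month, while '2013-2-28 00:00:00' stops at that exact minute. The sketch below reproduces the counts on a minute-frequency frame. It uses `.loc` and a counter column, both my own choices; as far as I know, newer pandas releases no longer accept a bare `dft['2013']` for row selection.

```python
import numpy as np
import pandas as pd

# 100000 minutes starting at midnight on 2013-01-01.
idx = pd.date_range("2013-01-01", periods=100000, freq=pd.offsets.Minute())
dft = pd.DataFrame({"A": np.arange(100000, dtype="float64")}, index=idx)

# Month-resolution endpoints: all of January and February.
whole_months = dft.loc["2013-01":"2013-02"]

# Second-resolution end point: stops at midnight on 28 February.
to_midnight = dft.loc["2013-01":"2013-02-28 00:00:00"]

# Mixed resolutions: start of the 15th up to 12:30 that day, inclusive.
half_day = dft.loc["2013-01-15":"2013-01-15 12:30:00"]

print(len(whole_months), len(to_midnight), len(half_day))
```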

Datetime Indexing

In [64]: dft[datetime(2013, 1, 1):datetime(2013,2,28)]
Out[64]:
                    A
2013-01-01 00:00:00 0.176444
2013-01-01 00:01:00 0.403310
2013-01-01 00:02:00 -0.154951
2013-01-01 00:03:00 0.301624
2013-01-01 00:04:00 -2.179861
2013-01-01 00:05:00 -1.369849
2013-01-01 00:06:00 -0.954208
... ...
2013-02-27 23:54:00 0.897051
2013-02-27 23:55:00 -0.309230
2013-02-27 23:56:00 1.944713
2013-02-27 23:57:00 0.369265
2013-02-27 23:58:00 0.053071
2013-02-27 23:59:00 -0.019734
2013-02-28 00:00:00 1.388189
[83521 rows x 1 columns]
In [65]: dft[datetime(2013, 1, 1, 10, 12, 0):datetime(2013, 2, 28, 10, 12, 0)]
Out[65]:
A
2013-01-01 10:12:00 -0.246733
2013-01-01 10:13:00 -1.429225
2013-01-01 10:14:00 -1.265339
2013-01-01 10:15:00 0.710986
2013-01-01 10:16:00 -0.818200
2013-01-01 10:17:00 0.543542
2013-01-01 10:18:00 1.577713
... ...
2013-02-28 10:06:00 0.311249
2013-02-28 10:07:00 2.366080
2013-02-28 10:08:00 -0.490372
2013-02-28 10:09:00 0.373340
2013-02-28 10:10:00 0.638442
2013-02-28 10:11:00 1.330135
2013-02-28 10:12:00 -0.945450
[83521 rows x 1 columns]
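Unlike partial strings, datetime objects are exact endpoints: datetime(2013, 2, 28) means midnight on the 28th, not the whole day, which is why Out[64] has 83521 rows rather than 84960. A minimal sketch of the difference, using `.loc` and a counter column of my own:

```python
from datetime import datetime

import numpy as np
import pandas as pd

idx = pd.date_range("2013-01-01", periods=100000, freq=pd.offsets.Minute())
dft = pd.DataFrame({"A": np.arange(100000, dtype="float64")}, index=idx)

# datetime endpoints are taken literally: the slice ends at 2013-02-28 00:00.
exact = dft.loc[datetime(2013, 1, 1):datetime(2013, 2, 28)]

# A day-resolution string covers the whole of 28 February.
by_string = dft.loc["2013-01-01":"2013-02-28"]

print(len(exact), exact.index[-1])
print(len(by_string), by_string.index[-1])
```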

DateOffset objects

In [69]: d = datetime(2008, 8, 18, 9, 0)
In [70]: from pandas.tseries.offsets import *
In [71]: d + DateOffset(months=4, days=5)
Out[71]: Timestamp('2008-12-23 09:00:00')
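A self-contained version of the offset arithmetic, as a sketch. The starting value 2008-08-18 09:00 is the one that yields the result shown in Out[71]; the explicit `pd.DateOffset` spelling replaces the star import.

```python
from datetime import datetime

import pandas as pd

d = datetime(2008, 8, 18, 9, 0)

# Calendar-aware shift: four months forward, then five days.
shifted = d + pd.DateOffset(months=4, days=5)

print(shifted)
```

The months are applied on the calendar (18 August to 18 December) before the five days are added, and the result comes back as a pandas Timestamp.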