Python for Data Analysis: Time Series, Part 2

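1. Time Series with Duplicate Indices

Several observations can fall on the same timestamp. For example:

import numpy as np
import pandas as pd
from pandas import Series

dates=pd.DatetimeIndex(['1/1/2000','1/2/2000','1/2/2000','1/2/2000','1/3/2000'])
dup_ts=Series(np.arange(5),index=dates)
print(dup_ts)

Output: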

2000-01-01    0
2000-01-02    1
2000-01-02    2
2000-01-02    3
2000-01-03    4
dtype: int32

Checking the index's is_unique attribute tells us whether its labels are unique:

print(dup_ts.index.is_unique)

Output:

False
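
is_unique only says that duplicates exist, not where they are. As a minimal sketch (the variable names dup_mask and repeated are illustrative, not from the original), Index.duplicated can be used to locate the repeated timestamps:

```python
import numpy as np
import pandas as pd

dates = pd.DatetimeIndex(['1/1/2000', '1/2/2000', '1/2/2000', '1/2/2000', '1/3/2000'])
dup_ts = pd.Series(np.arange(5), index=dates)

# keep=False marks every occurrence of a repeated timestamp, not just the later ones
dup_mask = dup_ts.index.duplicated(keep=False)

# The distinct timestamps that appear more than once
repeated = dup_ts.index[dup_mask].unique()
print(dup_mask)
print(repeated)
```

Here only 2000-01-02 is repeated, so the mask is True for the three middle rows.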

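Indexing into this time series produces either a scalar value or a slice, depending on whether the selected timestamp is duplicated:

print(dup_ts['1/3/2000'])  # not duplicated
print(dup_ts['1/2/2000'])  # duplicated

Output: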

4
2000-01-02    1
2000-01-02    2
2000-01-02    3
dtype: int32
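
Because the return type depends on the data, code that needs a predictable result may prefer a form that always returns a Series. One possible approach, sketched here under the assumption that the index is sorted (as it is in this example), is to use a label slice with the same start and end:

```python
import numpy as np
import pandas as pd

dates = pd.DatetimeIndex(['1/1/2000', '1/2/2000', '1/2/2000', '1/2/2000', '1/3/2000'])
dup_ts = pd.Series(np.arange(5), index=dates)

# A label slice returns a Series whether or not the timestamp is duplicated
single = dup_ts.loc['2000-01-03':'2000-01-03']
multi = dup_ts.loc['2000-01-02':'2000-01-02']

print(type(single).__name__, len(single))
print(type(multi).__name__, len(multi))
```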

Suppose you want to aggregate the data that share a non-unique timestamp. One way is to use groupby and pass level=0 (the index's only level):

grouped=dup_ts.groupby(level=0)
print(grouped.mean())
print(grouped.count())

Output:

2000-01-01    0
2000-01-02    2
2000-01-03    4
dtype: int32
2000-01-01    1
2000-01-02    3
2000-01-03    1
dtype: int64
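
The same grouped object supports other reductions as well. The sketch below is illustrative only (the names summary and deduped are not from the original): it computes several aggregates at once and shows an alternative that simply keeps the first observation per timestamp. Note that in recent pandas versions mean() on integer data returns floats (0.0, 2.0, 4.0), so the printed dtype may differ from the output above.

```python
import numpy as np
import pandas as pd

dates = pd.DatetimeIndex(['1/1/2000', '1/2/2000', '1/2/2000', '1/2/2000', '1/3/2000'])
dup_ts = pd.Series(np.arange(5), index=dates)

grouped = dup_ts.groupby(level=0)

# Several aggregates per timestamp, returned as a DataFrame
summary = grouped.agg(['first', 'last', 'sum'])
print(summary)

# Alternative to aggregating: keep only the first observation per timestamp
deduped = dup_ts[~dup_ts.index.duplicated(keep='first')]
print(deduped)
```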


2. Date Ranges, Frequencies, and Shifting

import datetime

dates=[datetime.datetime(2011,1,2),datetime.datetime(2011,1,5),
       datetime.datetime(2011,1,7),datetime.datetime(2011,1,8),
       datetime.datetime(2011,1,10),datetime.datetime(2011,1,12)]
ts=Series(np.random.randn(6),index=dates)
print(ts)
# Convert to a fixed daily frequency; days without an observation become NaN
print(ts.resample('D').asfreq())

Output:

2011-01-02    1.068995
2011-01-05    0.564281
2011-01-07    1.910822
2011-01-08   -0.339067
2011-01-10   -1.671388
2011-01-12   -0.679710
dtype: float64
2011-01-02    1.068995
2011-01-03         NaN
2011-01-04         NaN
2011-01-05    0.564281
2011-01-06         NaN
2011-01-07    1.910822
2011-01-08   -0.339067
2011-01-09         NaN
2011-01-10   -1.671388
2011-01-11         NaN
2011-01-12   -0.679710
Freq: D, dtype: float64
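
The NaN rows above are the days that had no observation. As a small sketch with made-up values (1.0, 2.0, 3.0 are illustrative, not the random data above), the gaps can either be left as NaN with asfreq() or filled from the previous observation with ffill():

```python
import datetime
import pandas as pd

dates = [datetime.datetime(2011, 1, 2),
         datetime.datetime(2011, 1, 5),
         datetime.datetime(2011, 1, 7)]
ts = pd.Series([1.0, 2.0, 3.0], index=dates)

# Fixed daily frequency, missing days left as NaN
daily = ts.resample('D').asfreq()

# Fixed daily frequency, missing days filled with the last known value
filled = ts.resample('D').ffill()

print(daily)
print(filled)
```

Which of the two is appropriate depends on whether a carried-forward value is meaningful for the data at hand.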


3. Generating Date Ranges

index=pd.date_range('4/1/2012','6/1/2012')
print(index)

Output:

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20',
               '2012-04-21', '2012-04-22', '2012-04-23', '2012-04-24',
               '2012-04-25', '2012-04-26', '2012-04-27', '2012-04-28',
               '2012-04-29', '2012-04-30', '2012-05-01', '2012-05-02',
               '2012-05-03', '2012-05-04', '2012-05-05', '2012-05-06',
               '2012-05-07', '2012-05-08', '2012-05-09', '2012-05-10',
               '2012-05-11', '2012-05-12', '2012-05-13', '2012-05-14',
               '2012-05-15', '2012-05-16', '2012-05-17', '2012-05-18',
               '2012-05-19', '2012-05-20', '2012-05-21', '2012-05-22',
               '2012-05-23', '2012-05-24', '2012-05-25', '2012-05-26',
               '2012-05-27', '2012-05-28', '2012-05-29', '2012-05-30',
               '2012-05-31', '2012-06-01'],
              dtype='datetime64[ns]', freq='D')
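
Rather than reading the full listing, a few properties of the index can be checked directly. This is a minimal sketch that confirms both endpoints are included and the frequency is daily:

```python
import pandas as pd

index = pd.date_range('4/1/2012', '6/1/2012')

# Both the start and the end date are part of the range
print(index[0], index[-1])

# 30 days in April + 31 days in May + 1 June
print(len(index))

# Frequency inferred by date_range when none is given
print(index.freqstr)
```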

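By default, date_range generates daily timestamps. If you pass only a start date or only an end date, you must also pass the number of periods to generate:

print(pd.date_range(start='4/1/2012',periods=20))
print(pd.date_range(end='6/1/2012',periods=20))

Output: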

DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
               '2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
               '2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
               '2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
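               '2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20'],
              dtype='datetime64[ns]', freq='D')
DatetimeIndex(['2012-05-13', '2012-05-14', '2012-05-15', '2012-05-16',
               '2012-05-17', '2012-05-18', '2012-05-19', '2012-05-20',
               '2012-05-21', '2012-05-22', '2012-05-23', '2012-05-24',
               '2012-05-25', '2012-05-26', '2012-05-27', '2012-05-28',
               '2012-05-29', '2012-05-30', '2012-05-31', '2012-06-01'],
              dtype='datetime64[ns]', freq='D')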

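The start and end dates act as inclusive anchors for the generated periods. A short sketch (the names from_start and to_end are illustrative) makes the boundaries explicit:

```python
import pandas as pd

# 20 daily timestamps counted forward from the start date
from_start = pd.date_range(start='4/1/2012', periods=20)

# 20 daily timestamps counted backward from the end date
to_end = pd.date_range(end='6/1/2012', periods=20)

print(from_start[0], from_start[-1])
print(to_end[0], to_end[-1])
```

With start given, the range runs from 2012-04-01 to 2012-04-20; with end given, it runs from 2012-05-13 to 2012-06-01.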