TSAP(4): Time series sampling [asfreq() vs. resample()]

TSAP: Time Series Analysis with Python

import pandas as pd
import numpy as np
rng = pd.date_range('1/1/2011', periods=10, freq='H')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
# Hourly frequency: one observation per hour
ts
    2011-01-01 00:00:00   -1.065583
    2011-01-01 01:00:00   -0.586701
    2011-01-01 02:00:00   -0.554193
    2011-01-01 03:00:00   -0.316603
    2011-01-01 04:00:00    0.534045
    2011-01-01 05:00:00   -0.764800
    2011-01-01 06:00:00    0.196573
    2011-01-01 07:00:00    0.201643
    2011-01-01 08:00:00   -0.694384
    2011-01-01 09:00:00    0.555979
    Freq: H, dtype: float64
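# Change the frequency to 45-minute intervals; method='pad' fills each new
# timestamp with the most recent earlier observation (forward fill)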
converted = ts.asfreq('45Min', method='pad')

converted
2011-01-01 00:00:00   -1.065583
2011-01-01 00:45:00   -1.065583
2011-01-01 01:30:00   -0.586701
2011-01-01 02:15:00   -0.554193
2011-01-01 03:00:00   -0.316603
2011-01-01 03:45:00   -0.316603
2011-01-01 04:30:00    0.534045
2011-01-01 05:15:00   -0.764800
2011-01-01 06:00:00    0.196573
2011-01-01 06:45:00    0.196573
2011-01-01 07:30:00    0.201643
2011-01-01 08:15:00   -0.694384
2011-01-01 09:00:00    0.555979
Freq: 45T, dtype: float64

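asfreq() changes the frequency at which the time points are sampled.

The method parameter sets how timestamps without a matching observation are filled (None, the default, leaves them as NaN):

  • method : {'backfill', 'bfill', 'pad', 'ffill', None}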
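
The effect of each option is easier to see on a small deterministic series than on random data. The sketch below is illustrative only: the series `demo` and its values 0.0, 1.0, 2.0 are assumptions made for this example and do not appear in the original post.

```python
import pandas as pd

# Hypothetical hourly series with easy-to-follow values (not the post's random data)
idx = pd.date_range('2011-01-01', periods=3, freq='h')
demo = pd.Series([0.0, 1.0, 2.0], index=idx)

# New 45-minute grid: 00:00, 00:45, 01:30
no_fill = demo.asfreq('45min')                      # method=None: 00:45 and 01:30 become NaN
padded = demo.asfreq('45min', method='pad')         # last observation at or before each timestamp
backfilled = demo.asfreq('45min', method='bfill')   # next observation at or after each timestamp
constant = demo.asfreq('45min', fill_value=-1.0)    # fixed value for the newly created timestamps

print(no_fill.tolist())      # [0.0, nan, nan]
print(padded.tolist())       # [0.0, 0.0, 1.0]
print(backfilled.tolist())   # [0.0, 1.0, 2.0]
print(constant.tolist())     # [0.0, -1.0, -1.0]
```

Only 00:00 exists in the original index, so it keeps its value in every variant; the other two timestamps show how each option differs.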
# backfill: a new timestamp with no observation takes the next later observation (backward fill)
ts.asfreq('45Min', method='backfill') 

2011-01-01 00:00:00 -1.065583
2011-01-01 00:45:00 -0.586701
2011-01-01 01:30:00 -0.554193
2011-01-01 02:15:00 -0.316603
2011-01-01 03:00:00 -0.316603
2011-01-01 03:45:00 0.534045
2011-01-01 04:30:00 -0.764800
2011-01-01 05:15:00 0.196573
2011-01-01 06:00:00 0.196573
2011-01-01 06:45:00 0.201643
2011-01-01 07:30:00 -0.694384
2011-01-01 08:15:00 0.555979
2011-01-01 09:00:00 0.555979
Freq: 45T, dtype: float64

# bfill is an alias for backfill, so the result is identical
ts.asfreq('45Min', method='bfill')

2011-01-01 00:00:00 -1.065583
2011-01-01 00:45:00 -0.586701
2011-01-01 01:30:00 -0.554193
2011-01-01 02:15:00 -0.316603
2011-01-01 03:00:00 -0.316603
2011-01-01 03:45:00 0.534045
2011-01-01 04:30:00 -0.764800
2011-01-01 05:15:00 0.196573
2011-01-01 06:00:00 0.196573
2011-01-01 06:45:00 0.201643
2011-01-01 07:30:00 -0.694384
2011-01-01 08:15:00 0.555979
2011-01-01 09:00:00 0.555979
Freq: 45T, dtype: float64

# ffill is an alias for pad: forward fill
# e.g. 01:30:00 takes the value observed at 01:00:00
ts.asfreq('45Min', method='ffill')

2011-01-01 00:00:00 -1.065583
2011-01-01 00:45:00 -1.065583
2011-01-01 01:30:00 -0.586701
2011-01-01 02:15:00 -0.554193
2011-01-01 03:00:00 -0.316603
2011-01-01 03:45:00 -0.316603
2011-01-01 04:30:00 0.534045
2011-01-01 05:15:00 -0.764800
2011-01-01 06:00:00 0.196573
2011-01-01 06:45:00 0.196573
2011-01-01 07:30:00 0.201643
2011-01-01 08:15:00 -0.694384
2011-01-01 09:00:00 0.555979
Freq: 45T, dtype: float64

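# Switch to a lower frequency (90 minutes). Every 90-minute timestamp already
# exists in converted, so ffill has nothing to fill here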
converted.asfreq('90Min', method = 'ffill')

2011-01-01 00:00:00 -1.065583
2011-01-01 01:30:00 -0.586701
2011-01-01 03:00:00 -0.316603
2011-01-01 04:30:00 0.534045
2011-01-01 06:00:00 0.196573
2011-01-01 07:30:00 0.201643
2011-01-01 09:00:00 0.555979
Freq: 90T, dtype: float64
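
One way to read the asfreq() results above is as a reindex onto a regular grid that runs from the first to the last timestamp. The following is a minimal sketch of that interpretation, not a statement about pandas internals; `demo`, `new_index`, and the integer-like values are assumptions introduced for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly series with values 0.0 .. 9.0 at 00:00 .. 09:00
idx = pd.date_range('2011-01-01', periods=10, freq='h')
demo = pd.Series(np.arange(10, dtype=float), index=idx)

# Build the 90-minute grid by hand, then reindex with forward fill
new_index = pd.date_range(demo.index.min(), demo.index.max(), freq='90min')
via_reindex = demo.reindex(new_index, method='ffill')

# asfreq() with the same frequency and fill method
via_asfreq = demo.asfreq('90min', method='ffill')

print(via_asfreq.tolist())             # [0.0, 1.0, 3.0, 4.0, 6.0, 7.0, 9.0]
print(via_asfreq.equals(via_reindex))  # True
```

Timestamps such as 01:30 and 04:30 do not exist in the hourly series, so they take the values from 01:00 and 04:00; no aggregation happens at any point.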

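resample() vs. asfreq()

# asfreq('D') keeps only the value at 2011-01-01 00:00:00, so the sum is that single value
ts.asfreq('D').sum()
-1.0655834142614131
# resample('D') groups all ten hourly values of the day and sums them
ts.resample('D').sum()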

2011-01-01 -2.494026
Freq: D, dtype: float64

ts.asfreq('2H')

2011-01-01 00:00:00 -1.065583
2011-01-01 02:00:00 -0.554193
2011-01-01 04:00:00 0.534045
2011-01-01 06:00:00 0.196573
2011-01-01 08:00:00 -0.694384
Freq: 2H, dtype: float64

ts.resample('2H').sum()
2011-01-01 00:00:00   -1.652284
2011-01-01 02:00:00   -0.870797
2011-01-01 04:00:00   -0.230756
2011-01-01 06:00:00    0.398216
2011-01-01 08:00:00   -0.138405
Freq: 2H, dtype: float64

What is the difference between .resample() and .asfreq()?

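  • asfreq(): takes the value at each sampled time point
  • resample(): takes all the values within each time interval, which are then aggregated (for example with sum())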
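
The two behaviours can be compared side by side on a series whose values are easy to add up by hand. This is a small sketch under the assumption of a hypothetical series `demo` holding 0.0 to 9.0; none of these numbers come from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly series with values 0.0 .. 9.0 at 00:00 .. 09:00
idx = pd.date_range('2011-01-01', periods=10, freq='h')
demo = pd.Series(np.arange(10, dtype=float), index=idx)

point_values = demo.asfreq('2h')           # value at 00:00, 02:00, ... only
bin_sums = demo.resample('2h').sum()       # sum over each 2-hour interval
bin_firsts = demo.resample('2h').first()   # first value in each 2-hour interval

print(point_values.tolist())  # [0.0, 2.0, 4.0, 6.0, 8.0]
print(bin_sums.tolist())      # [1.0, 5.0, 9.0, 13.0, 17.0]
print(bin_firsts.tolist())    # [0.0, 2.0, 4.0, 6.0, 8.0]

# Daily frequency: asfreq keeps one point, resample aggregates the whole day
daily_point = demo.asfreq('D')
daily_sum = demo.resample('D').sum()
print(daily_point.sum())      # 0.0
print(daily_sum.iloc[0])      # 45.0
```

In this example resample('2h').first() reproduces asfreq('2h'), because the first value in each interval sits exactly on the sampled time point. Any other aggregation, such as sum(), uses the values that asfreq() discards.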