Machine Learning - Data Science Libraries 07: pandas Time Series

eddiechen10081

Published 2022-09-26 08:50:06

Original link: https://blog.csdn.net/weixin_46274168/article/details/109590534

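Overview

A time series is a sequence of data points indexed by time, typically spanning some time range.

Time series are an important form of data in virtually every industry: many statistics, and the patterns within them, are closely tied to time. pandas makes working with time series straightforward.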

Generating

# start, end, and freq together generate the time indexes between start and end at frequency freq
# start, periods, and freq together generate periods time indexes, starting at start, at frequency freq
pd.date_range(start=None, end=None, periods=None, freq='D')

For the freq aliases, see https://pandas.pydata.org/docs/user_guide/timeseries.html#timeseries-offset-aliases

# Examples
# https://www.w3resource.com/pandas/series/series-resample.php
freq = "3D"   # every 3 days
freq = "30s"  # every 30 seconds

Documentation: https://pandas.pydata.org/docs/reference/api/pandas.date_range.html
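The two parameter combinations described above can be sketched as follows. The dates are arbitrary illustration values, not taken from the original data.

```python
import pandas as pd

# start + end + freq: every 3rd day between the two endpoints (both inclusive)
by_range = pd.date_range(start="2017-01-01", end="2017-01-10", freq="3D")

# start + periods + freq: exactly 4 timestamps, spaced 30 seconds apart
by_count = pd.date_range(start="2017-01-01", periods=4, freq="30s")

print(by_range)
print(by_count)
```

Only three of start, end, periods, and freq should be given at once; the remaining one is derived from the others.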

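Using a time series in a DataFrame

import numpy as np
import pandas as pd

dr = pd.date_range("20170101", periods=10)       # 10 daily timestamps
df = pd.DataFrame(np.random.rand(10), index=dr)  # random values indexed by date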
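Once a DataFrame carries a DatetimeIndex, rows can be selected by date labels. A minimal sketch, assuming a column named value (a hypothetical name) filled with deterministic numbers instead of random ones so the result is easy to check:

```python
import numpy as np
import pandas as pd

dr = pd.date_range("20170101", periods=10)  # daily frequency by default
df = pd.DataFrame({"value": np.arange(10)}, index=dr)

# Label-based slicing on a DatetimeIndex includes both endpoints.
first_week = df.loc["2017-01-01":"2017-01-07"]

# An exact Timestamp label selects a single row.
jan_5 = df.loc[pd.Timestamp("2017-01-05"), "value"]

print(first_week)
print(jan_5)
```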

# DatetimeIndex

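Converting to pandas datetime objects

Mind the format of your data; when it cannot be inferred reliably, pass a template string through format.

di = pd.to_datetime(df["timeStamp"], format="")  # format is left blank as in the original; supply a strftime-style template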
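A small sketch of the conversion with an explicit template. The sample strings and the template "%Y-%m-%d %H:%M:%S" are assumptions for illustration; the original does not say what the timeStamp column looks like.

```python
import pandas as pd

# Hypothetical raw data: timestamps stored as plain strings.
raw = pd.DataFrame({"timeStamp": ["2015-12-10 17:40:00", "2015-12-10 17:45:30"]})

# Parse the strings with an explicit strftime-style template.
parsed = pd.to_datetime(raw["timeStamp"], format="%Y-%m-%d %H:%M:%S")

# Use the parsed values as the index to get a DatetimeIndex.
indexed = raw.set_index(parsed)

print(parsed.dtype)
print(indexed.index)
```

Passing format explicitly avoids ambiguous guesses (for example day-first versus month-first) and raises an error when a value does not match the template.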


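# pandas resampling

Resampling is the process of converting a time series from one frequency to another. Converting high-frequency data to a lower frequency is downsampling; converting low-frequency data to a higher frequency is upsampling.

pandas provides the resample method to perform this frequency conversion.

For rule, see https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.resample.html

When rule is a string, it can be treated as a freq alias.

out = df.resample(rule)  # df needs a datetime-like index; this returns a Resampler, so follow it with an aggregation such as .mean()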
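Both directions can be sketched on a tiny frame. The column name value and the choice of sum and ffill are illustration choices, not something the original prescribes.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2017-01-01", periods=6, freq="D")
df = pd.DataFrame({"value": np.arange(6)}, index=idx)

# Downsampling: daily -> 3-day bins, aggregating each bin with a sum.
down = df.resample("3D").sum()

# Upsampling: 3-day -> daily, filling the new rows with the last known value.
up = down.resample("D").ffill()

print(down)
print(up)
```

Downsampling always needs an aggregation (sum, mean, count, ...), while upsampling needs a fill or interpolation strategy, because the new rows have no data of their own.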


# PeriodIndex

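The DatetimeIndex covered so far can be thought of as a set of timestamps (points in time), whereas the PeriodIndex introduced here can be thought of as a set of time spans.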
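The difference between a point in time and a span can be sketched with the scalar types behind the two indexes, pd.Timestamp and pd.Period. The dates are arbitrary examples.

```python
import pandas as pd

ts = pd.Timestamp("2017-01-15 08:30")  # a single instant
p = pd.Period("2017-01", freq="M")     # the whole month of January 2017

# A period has a start and an end; a timestamp either falls inside it or not.
inside = p.start_time <= ts <= p.end_time

print(p.start_time, p.end_time)
print(inside)
```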

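Creating

# The field keywords below are, as far as is known, deprecated in the constructor since pandas 2.2;
# pd.PeriodIndex.from_fields accepts the same keywords there.
pi = pd.PeriodIndex(year=data["year"], month=data["month"], day=data["day"], hour=data["hour"], freq=freq)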
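Because the field-based constructor behaves differently across pandas versions, here is a sketch of an alternative route that does not depend on it: assemble timestamps from the component columns and convert them to periods. This swaps in a different technique from the one above, and the sample values in data are hypothetical.

```python
import pandas as pd

# Hypothetical component columns, mirroring data["year"], data["month"], ...
data = pd.DataFrame({
    "year": [2013, 2013],
    "month": [1, 1],
    "day": [1, 1],
    "hour": [0, 1],
})

# pd.to_datetime can assemble datetimes from year/month/day/hour columns.
stamps = pd.to_datetime(data[["year", "month", "day", "hour"]])

# Convert the timestamps into hourly periods.
pi = pd.DatetimeIndex(stamps).to_period("h")

print(pi)
```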



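## Example: downsampling periods

Set the periods as the index of a DataFrame, then resample.

df2 = df.set_index(periods).resample("10D")  # returns a Resampler; apply an aggregation such as .mean() to get the result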
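A runnable sketch of the same idea follows. It assumes daily periods and a column named value (both hypothetical), and it converts the PeriodIndex to timestamps before resampling. That conversion is an addition here, chosen because some recent pandas versions warn when resample is called directly on a PeriodIndex.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 20 daily periods, one observation each.
periods = pd.period_range("2013-01-01", periods=20, freq="D")
df = pd.DataFrame({"value": np.ones(20)}).set_index(periods)

# Convert each period to its start timestamp, then downsample to 10-day bins.
df2 = df.to_timestamp().resample("10D").sum()

print(df2)
```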


> Reference: https://stackoverflow.com/questions/55287911/resample-by-periodindex-using-kind-parameter
