import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime
from datetime import timedelta
Get the current time
now = datetime.now()
print(now)
print(now.year, now.month, now.day)
2020 5 7
Compute the interval between two dates
data1 = datetime(2020, 5, 20)
data2 = datetime(2020, 6, 20)
delta = data2 - data1
print(delta)
31 days, 0:00:00
The difference in seconds
print(delta.total_seconds())
2678400.0
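The script imports `timedelta` but never uses it. As a minimal sketch of the reverse operation, the snippet below adds an interval to a date instead of subtracting two dates. The names `start`, `end`, and `gap` are illustrative and do not appear in the original.

```python
from datetime import datetime, timedelta

# Illustrative names; the dates mirror the ones used above.
start = datetime(2020, 5, 20)

# Adding a timedelta to a datetime yields a new datetime.
end = start + timedelta(days=31)
print(end)

# Subtracting two datetimes yields a timedelta again.
gap = end - start
print(gap.days, gap.total_seconds())
```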
Format a datetime as a string
date = datetime(2016, 3, 20, 8, 30)
print(date.strftime("%Y/%m/%d %H:%M:%S"))
2016/03/20 08:30:00
Create a time series
dates = [datetime(2016, 3, 1), datetime(2016, 3, 2), datetime(2016, 3, 3), datetime(2016, 3, 4)]
s = pd.Series(np.random.randn(4), index=dates)
print(s)
2016-03-01 1.291515
2016-03-02 -0.860440
2016-03-03 2.306542
2016-03-04 0.038444
Generate a DatetimeIndex with date_range
print(pd.date_range('20160320', '20160330'))
DatetimeIndex(['2016-03-20', '2016-03-21', '2016-03-22', '2016-03-23',
'2016-03-24', '2016-03-25', '2016-03-26', '2016-03-27',
'2016-03-28', '2016-03-29', '2016-03-30'],
dtype='datetime64[ns]', freq='D')
Monthly frequency (month end; pandas 2.2 and later spell this alias 'ME' and deprecate 'M')
print(pd.date_range(start='20160320', periods=10, freq='M'))
Pandas Periods
Add 2 to a monthly Period
p = pd.Period(2010, freq='M')
print(p+2)
2010-03
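Convert the frequency
Convert from annual to monthly, taking the start of the year
a = pd.Period('2016', freq='Y')  # assumed definition; the source uses a without defining it
print(a.asfreq('M', how='start'))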
2016-01
Quarterly Periods with a fiscal year ending in January
p = pd.Period('2016Q4', 'Q-JAN')
print(p.asfreq('M', how='start'), p.asfreq('M', how='end'))
2015-11 2016-01
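To see why 2016Q4 under 'Q-JAN' maps to November 2015 through January 2016, it may help to list all four quarters of that fiscal year. The sketch below uses only `pd.Period` and `asfreq`; the `quarters` dictionary is an illustrative name, not part of the original.

```python
import pandas as pd

# 'Q-JAN' means the fiscal year ends in January,
# so fiscal year 2016 runs from February 2015 through January 2016.
quarters = {}
for label in ['2016Q1', '2016Q2', '2016Q3', '2016Q4']:
    p = pd.Period(label, freq='Q-JAN')
    first = p.asfreq('M', how='start')  # first month of the quarter
    last = p.asfreq('M', how='end')     # last month of the quarter
    quarters[label] = (str(first), str(last))
    print(label, first, last)
```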
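Resampling is the process of converting a time series from one frequency to another.
Aggregating high-frequency (short-interval) data to a lower frequency (longer interval) is called downsampling.
Converting low-frequency data to a higher frequency is called upsampling.
Some resampling is neither downsampling nor upsampling, for example converting W-WED (weekly on Wednesday) to W-FRI.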
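The three cases can be sketched with small series of known values so the results are easy to check by hand. The variable names (`minute`, `five`, `weekly`, and so on) are illustrative, and forward fill is only one of several possible ways to fill the gaps created by upsampling.

```python
import numpy as np
import pandas as pd

# Downsampling: twelve 1-minute values (0..11) summed into 5-minute bins.
minute = pd.Series(np.arange(12),
                   index=pd.date_range('2016-04-25 09:30', periods=12, freq='min'))
down = minute.resample('5min').sum()
print(down)

# Upsampling: two 5-minute values expanded to 1-minute steps.
# The new rows have no data, so they are forward-filled here.
five = pd.Series([10, 35],
                 index=pd.date_range('2016-04-25 09:30', periods=2, freq='5min'))
up = five.resample('min').ffill()
print(up)

# Neither: weekly-on-Wednesday values relabelled to weekly-on-Friday bins.
weekly = pd.Series([1, 2, 3],
                   index=pd.date_range('2016-04-06', periods=3, freq='W-WED'))
shifted = weekly.resample('W-FRI').sum()
print(shifted)
```

In the last case the number of rows stays the same; only the bin labels move from Wednesdays to the following Fridays.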
Converting between Timestamp and Period
s = pd.Series(np.random.randn(5), index=pd.date_range('2016-04-01', periods=5, freq='M'))
print(s.to_period())
2016-04 -1.526392
2016-05 0.391265
2016-06 -0.820315
2016-07 0.517220
2016-08 0.011468
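The example above only goes from timestamps to periods. A minimal sketch of the round trip follows, using `to_timestamp` for the way back. It assumes a month-start index ('MS') and fixed values so the result is reproducible; the names `idx`, `ps`, and `back` are illustrative.

```python
import pandas as pd

# Month-start timestamps with fixed values (assumed data, for reproducibility).
idx = pd.date_range('2016-04-01', periods=5, freq='MS')
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# Timestamp -> Period: each timestamp becomes the month that contains it.
ps = s.to_period('M')
print(ps.index)

# Period -> Timestamp: by default each period maps to its start.
back = ps.to_timestamp()
print(back.index)
```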
Analyze per-minute trading volume by summing it into 5-minute bins
ts = pd.Series(np.random.randint(0, 50, 60), index=pd.date_range('2016-04-25 09:30', periods=60, freq='min'))
print(ts.resample('5min').sum())
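O: Open, the opening rate (first value in each bin)
H: High, the highest rate in the bin
L: Low, the lowest rate in the bin
C: Close, the closing rate (last value in each bin)
print(ts.resample('5min').ohlc())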
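Because the series in this script is random, the OHLC output is hard to verify by eye. The sketch below uses ten hand-picked values (an assumption for illustration, not data from the original) so each column can be checked against its bin.

```python
import pandas as pd

# Ten 1-minute values chosen by hand; two 5-minute bins result.
values = [3, 7, 1, 5, 4, 8, 2, 9, 6, 0]
ts_demo = pd.Series(values,
                    index=pd.date_range('2016-04-25 09:30', periods=10, freq='min'))

# ohlc() returns one row per bin with columns open, high, low, close.
bars = ts_demo.resample('5min').ohlc()
print(bars)
```

For the first bin, the values 3, 7, 1, 5, 4 give open 3, high 7, low 1, close 4.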
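Resampling with the groupby method
Group by month and sum (the output below comes from a ts whose index spans March to June)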
print(ts.groupby(lambda x: x.month).sum())
3 870
4 759
5 882
6 236
dtype: int32
Second approach: group by monthly Period
print(ts.groupby(ts.index.to_period('M')).sum())
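The two groupby approaches can be compared on a series with known totals. The sketch below assumes a daily series of ones covering 100 days from 2016-03-01, so each monthly sum equals the number of days counted in that month; `ts_daily`, `by_month`, and `by_period` are illustrative names.

```python
import pandas as pd

# 100 daily observations, each equal to 1 (assumed data).
idx = pd.date_range('2016-03-01', periods=100, freq='D')
ts_daily = pd.Series(1, index=idx)

# Approach 1: group by the integer month number of each timestamp.
by_month = ts_daily.groupby(lambda x: x.month).sum()
print(by_month)

# Approach 2: group by the monthly Period each timestamp falls in.
by_period = ts_daily.groupby(ts_daily.index.to_period('M')).sum()
print(by_period)
```

The month-number key merges the same month from different years, while the Period key keeps the year, which makes it the safer choice for data spanning more than twelve months.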