I. Date and Time Data Types
1. The datetime data type
from datetime import datetime
now = datetime.now()
print(now)
print(now.year, now.month, now.day)
2018-01-29 12:42:51.378000
2018 1 29
2. Converting between strings and datetime
stamp = datetime(2017, 1, 3)
print(str(stamp))
2017-01-03 00:00:00
value = '2017-01-04'
print(datetime.strptime(value, '%Y-%m-%d'))
2017-01-04 00:00:00
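As a small supplement, the conversion in both directions can be sketched with strftime and strptime from the standard library. The date strings, format strings, and variable names below are illustrative assumptions, not part of the original notes.

```python
from datetime import datetime

stamp = datetime(2017, 1, 3)

# datetime -> string: strftime formats with an explicit pattern.
text = stamp.strftime('%Y-%m-%d')
iso_text = stamp.isoformat()

# string -> datetime: strptime needs a pattern that matches the input exactly.
# Here the (hypothetical) input uses day/month/year order plus a time.
parsed = datetime.strptime('04/01/2017 13:30', '%d/%m/%Y %H:%M')

# Subtracting two datetime objects yields a timedelta.
delta = parsed - stamp

print(text)
print(iso_text)
print(parsed)
print(delta.days, delta.seconds)
```

If the pattern does not match the string, strptime raises ValueError, so the format should be chosen to mirror the incoming data.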
II. Time Series Basics
The most basic kind of time series is a Series indexed by timestamps:
import numpy as np
import pandas as pd
dates = [datetime(2017, 1, 2), datetime(2017, 1, 5), datetime(2017, 1, 7), datetime(2017, 1, 8),
         datetime(2017, 1, 10), datetime(2017, 1, 12)]
ts = pd.Series(np.random.randn(6), index=dates)
print(ts)
print(ts.index)
2017-01-02 1.891046
2017-01-05 -0.580931
2017-01-07 -1.205874
2017-01-08 1.581906
2017-01-10 -1.345287
2017-01-12 -0.856211
dtype: float64
DatetimeIndex(['2017-01-02', '2017-01-05', '2017-01-07', '2017-01-08','2017-01-10', '2017-01-12'],dtype='datetime64[ns]', freq=None)
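A timestamp-indexed Series can also be built from date strings rather than datetime objects. The sketch below assumes pd.to_datetime is used for parsing; the sample strings and the names raw_dates and ts_demo are hypothetical.

```python
import pandas as pd

# Hypothetical sample strings; pd.to_datetime parses a list into a DatetimeIndex.
raw_dates = ['2017-01-02', '2017-01-05', '2017-01-07']
idx = pd.to_datetime(raw_dates)

# Fixed values instead of random ones, so the result is reproducible.
ts_demo = pd.Series([1.0, 2.0, 3.0], index=idx)

# Each label in the index is a pandas Timestamp.
first_label = ts_demo.index[0]

print(type(idx).__name__)
print(first_label)
```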
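1. Indexing
A value can be looked up by a Timestamp taken from the index or by a date string that pandas can parse. The outputs below come from a different run than the listing above, so the random values do not match it.
stamp = ts.index[2]
print(ts[stamp])
2.17794314871
print(ts['20170107'])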
2.17794314871
2. Slicing
print(ts['1/2/2017':'1/8/2017'])
2017-01-02 1.598877
2017-01-05 -1.510436
2017-01-07 0.231524
2017-01-08 -0.472447
dtype: float64
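The indexing and slicing behavior above can be checked on fixed data. This is a minimal sketch assuming a sorted DatetimeIndex; the values and the name ts_demo are made up for illustration.

```python
from datetime import datetime

import pandas as pd

dates = [datetime(2017, 1, 2), datetime(2017, 1, 5), datetime(2017, 1, 7),
         datetime(2017, 1, 8), datetime(2017, 1, 10), datetime(2017, 1, 12)]
# Fixed values (an assumption) so each lookup has a predictable result.
ts_demo = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0, 60.0], index=dates)

# A string with day resolution matches one label exactly and returns a scalar.
one_day = ts_demo['20170107']

# Slice bounds do not have to exist in the index; both ends are inclusive.
window = ts_demo['2017-01-06':'2017-01-09']

# A year-month string selects every row in that month.
january = ts_demo.loc['2017-01']

print(one_day)
print(window)
print(len(january))
```

Slicing with bounds that are absent from the index relies on the index being sorted, which is the case here.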
3. Date ranges
pandas.date_range generates a DatetimeIndex of a fixed length; by default it produces one time point per day.
index = pd.date_range('4/1/2017', '4/20/2017')
print(index)
DatetimeIndex(['2017-04-01', '2017-04-02', '2017-04-03', '2017-04-04',
'2017-04-05', '2017-04-06', '2017-04-07', '2017-04-08',
'2017-04-09', '2017-04-10', '2017-04-11', '2017-04-12',
'2017-04-13', '2017-04-14', '2017-04-15', '2017-04-16',
'2017-04-17', '2017-04-18', '2017-04-19', '2017-04-20'],
dtype='datetime64[ns]', freq='D')
Pass a start or end date together with periods (here an end date and 10 periods):
index = pd.date_range(end='4/1/2017', periods=10)
print(index)
DatetimeIndex(['2017-03-23', '2017-03-24', '2017-03-25', '2017-03-26',
'2017-03-27', '2017-03-28', '2017-03-29', '2017-03-30',
'2017-03-31', '2017-04-01'],
dtype='datetime64[ns]', freq='D')
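Pass a specific frequency:
# 'BM' = last business day of each month; pandas 2.2 and later spell this alias 'BME'
index = pd.date_range('3/1/2017', '10/1/2017', freq='BM')
print(index)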
DatetimeIndex(['2017-03-31', '2017-04-28', '2017-05-31', '2017-06-30',
'2017-07-31', '2017-08-31', '2017-09-29'],
dtype='datetime64[ns]', freq='BM')
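The different ways of calling date_range can be compared side by side. The sketch below uses dates chosen for illustration and the aliases 'B' (business day) and 'W-MON' (weekly on Monday), which are picked here as examples and do not appear in the original notes.

```python
import pandas as pd

# Three ways to describe the same ten daily points.
by_bounds = pd.date_range('2017-04-01', '2017-04-10')
by_start = pd.date_range(start='2017-04-01', periods=10)
by_end = pd.date_range(end='2017-04-10', periods=10)

# A frequency alias changes which points inside the range are kept.
business_days = pd.date_range('2017-04-01', '2017-04-10', freq='B')
mondays = pd.date_range('2017-04-01', '2017-04-30', freq='W-MON')

print(by_bounds.equals(by_start), by_bounds.equals(by_end))
print(len(business_days))
print([d.day for d in mondays])
```

Since 1 April 2017 was a Saturday, the business-day range starts on 3 April.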
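4. Resampling
rng = pd.date_range('1/1/2017', periods=100, freq='D')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
# The how= argument is no longer accepted by resample in current pandas; call the aggregation method instead.
# 'M' = month end; pandas 2.2 and later spell this alias 'ME'.
print(ts.resample('M').mean())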
2017-01-31 -0.080247
2017-02-28 -0.119564
2017-03-31 -0.113525
2017-04-30 -0.131885
Freq: M, dtype: float64
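To make the monthly aggregation easier to verify, here is a sketch on deterministic data. It assumes the 'MS' (month start) alias rather than the month-end alias used above, because 'MS' is spelled the same across pandas versions; the result is therefore labeled with the first day of each month. The groupby comparison is an added cross-check, not something from the original notes.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2017-01-01', periods=100, freq='D')
# Values 0..99 instead of random numbers, so the monthly means are predictable.
ts_demo = pd.Series(np.arange(100, dtype=float), index=rng)

# Resample daily data to monthly means, labeled by month start.
monthly = ts_demo.resample('MS').mean()

# Cross-check: grouping by (year, month) gives the same numbers.
by_group = ts_demo.groupby([ts_demo.index.year, ts_demo.index.month]).mean()

print(monthly)
print(np.allclose(monthly.values, by_group.values))
```

The last bin covers only 1 to 10 April, so its mean is taken over ten values rather than a full month.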
5. Downsampling
rng = pd.date_range('1/1/2017', periods=12, freq='T')  # 'T' = one minute ('min' in newer pandas)
ts = pd.Series(np.arange(12), index=rng)
print(ts)
2017-01-01 00:00:00 0
2017-01-01 00:01:00 1
2017-01-01 00:02:00 2
2017-01-01 00:03:00 3
2017-01-01 00:04:00 4
2017-01-01 00:05:00 5
2017-01-01 00:06:00 6
2017-01-01 00:07:00 7
2017-01-01 00:08:00 8
2017-01-01 00:09:00 9
2017-01-01 00:10:00 10
2017-01-01 00:11:00 11
print(ts.resample('5min').sum())
2017-01-01 00:00:00 10
2017-01-01 00:05:00 35
2017-01-01 00:10:00 21
Freq: 5T, dtype: int32
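When downsampling, it matters which edge of each bin is closed and which edge labels it. The sketch below reuses the same twelve one-minute values and compares the default binning with closed='right', label='right'; these two parameters are not discussed in the original notes and are shown here only as an illustration.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2017-01-01', periods=12, freq='min')
ts_demo = pd.Series(np.arange(12), index=rng)

# Default for '5min': bins are closed on the left and labeled by the left edge,
# so 00:00 covers minutes 0 through 4.
left = ts_demo.resample('5min').sum()

# Closed and labeled on the right: 00:05 covers minutes 1 through 5,
# and minute 0 falls into a bin of its own labeled 00:00.
right = ts_demo.resample('5min', closed='right', label='right').sum()

print(left)
print(right)
```

With right-closed bins the twelve points spread over four bins instead of three, which is why the two results have different lengths.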