1. Create a Series indexed by timestamps -> DatetimeIndex
- Pass a list of datetime objects as the index
- Use pd.date_range()
from datetime import datetime
import pandas as pd
import numpy as np
# Pass a list of datetime objects as the index
date_list = [datetime(2017, 2, 18), datetime(2017, 2, 19),
             datetime(2017, 2, 25), datetime(2017, 2, 26),
             datetime(2017, 3, 4), datetime(2017, 3, 5)]
time_s = pd.Series(np.random.randn(6), index=date_list)
print(time_s)
print(type(time_s.index))  # pandas converts the datetime list to a DatetimeIndex
print(type(time_s))
answer:
2017-02-18 -0.543551
2017-02-19 -0.759103
2017-02-25 0.058956
2017-02-26 0.275448
2017-03-04 -0.957346
2017-03-05 -1.143108
dtype: float64
<class 'pandas.tseries.index.DatetimeIndex'>
<class 'pandas.core.series.Series'>
# Use pd.date_range()
dates = pd.date_range('2017-02-18',   # start date
                      periods=5,      # number of periods to generate
                      freq='W-SAT')   # frequency: weekly, anchored on Saturday
print(dates)
print(pd.Series(np.random.randn(5), index=dates))
answer:
DatetimeIndex(['2017-02-18', '2017-02-25', '2017-03-04', '2017-03-11',
'2017-03-18'],
dtype='datetime64[ns]', freq='W-SAT')
2017-02-18 -0.921937
2017-02-25 0.722167
2017-03-04 -0.171531
2017-03-11 -1.104664
2017-03-18 1.259994
Freq: W-SAT, dtype: float64
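The two construction routes above can be checked with a small reproducible sketch. The fixed values and the names s, is_dt_index and day_names are illustrative assumptions, not part of the original notes.

```python
from datetime import datetime
import pandas as pd

# Fixed (hypothetical) values instead of random ones, so the result is reproducible
date_list = [datetime(2017, 2, 18), datetime(2017, 2, 19), datetime(2017, 2, 25)]
s = pd.Series([1.0, 2.0, 3.0], index=date_list)

# A list of datetime objects is converted to a DatetimeIndex automatically
is_dt_index = isinstance(s.index, pd.DatetimeIndex)
print(is_dt_index)

# With freq='W-SAT', every generated date falls on a Saturday
dates = pd.date_range('2017-02-18', periods=5, freq='W-SAT')
day_names = dates.day_name().tolist()
print(day_names)
```

Checking isinstance rather than the printed class path is more robust, since the module path of DatetimeIndex differs between pandas versions.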
2. Indexing
- By position
- By index value (a datetime object)
- By a parseable date string
- By year or month (partial date string)
- By slicing
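# By position (.iloc is explicit; plain time_s[0] is deprecated in recent pandas)
print(time_s.iloc[0])
# By index value
print(time_s[datetime(2017, 2, 18)])
# By any date string that pandas can parse
print(time_s['2017/02/18'])
# By year or month: selects every row in February 2017
print(time_s['2017-2'])
# By slicing: the start label is included
print(time_s['2017-2-26':])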
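Because time_s holds random numbers, the results of these lookups are hard to compare by eye. The sketch below repeats the five indexing styles on a Series with fixed, made-up values (the name s and the numbers 10 to 60 are assumptions for illustration).

```python
from datetime import datetime
import pandas as pd

# Same six dates as date_list, with hypothetical fixed values
idx = pd.to_datetime(['2017-02-18', '2017-02-19', '2017-02-25',
                      '2017-02-26', '2017-03-04', '2017-03-05'])
s = pd.Series([10, 20, 30, 40, 50, 60], index=idx)

by_position = s.iloc[0]                  # first row
by_datetime = s[datetime(2017, 2, 18)]   # exact label as a datetime object
by_string = s['2017/02/18']              # exact label as a parseable string
february = s.loc['2017-02']              # partial string: all rows in February 2017
from_feb_26 = s.loc['2017-02-26':]       # slice: start label is included

print(by_position, by_datetime, by_string)
print(february.tolist())
print(from_feb_26.tolist())
```

A string that names a full day returns a single value here, while a string that names only a month returns a sub-Series.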
3. Filtering: time_series.truncate()
# Drop every row before 2017-02-25 (the boundary date itself is kept)
time_s.truncate(before='2017-2-25')
answer:
2017-02-25 -0.814823
2017-02-26 0.118521
2017-03-04 0.479252
2017-03-05 -0.052865
dtype: float64
# Drop every row after 2017-02-25 (the boundary date itself is kept)
time_s.truncate(after='2017-2-25')
answer:
2017-02-18 -0.466020
2017-02-19 -0.914442
2017-02-25 -0.814823
dtype: float64
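As a rough check of the boundary behaviour, the sketch below applies truncate() to a Series with fixed, made-up values and compares a two-sided truncate with the equivalent label slice. The names s, window and same are assumptions for illustration.

```python
import pandas as pd

idx = pd.to_datetime(['2017-02-18', '2017-02-19', '2017-02-25',
                      '2017-02-26', '2017-03-04', '2017-03-05'])
s = pd.Series([10, 20, 30, 40, 50, 60], index=idx)

kept_from = s.truncate(before='2017-02-25')   # rows on or after 2017-02-25
kept_until = s.truncate(after='2017-02-25')   # rows on or before 2017-02-25

# before and after can be combined; both boundaries are inclusive
window = s.truncate(before='2017-02-19', after='2017-02-26')
same = window.equals(s.loc['2017-02-19':'2017-02-26'])

print(kept_from.tolist())
print(kept_until.tolist())
print(window.tolist(), same)
```

truncate() expects a sorted index, which holds for the dates used here.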
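4. Generating date ranges
- With both a start and an end date, the range defaults to daily frequency
- With only a start or only an end date, also pass the number of periods
- Normalize timestamps to midnight with normalize=True
# With both a start and an end date, the range defaults to daily frequency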
date_index = pd.date_range('2017/02/18', '2017/03/18')
print(date_index)
answer:
DatetimeIndex(['2017-02-18', '2017-02-19', '2017-02-20', '2017-02-21',
'2017-02-22', '2017-02-23', '2017-02-24', '2017-02-25',
'2017-02-26', '2017-02-27', '2017-02-28', '2017-03-01',
'2017-03-02', '2017-03-03', '2017-03-04', '2017-03-05',
'2017-03-06', '2017-03-07', '2017-03-08', '2017-03-09',
'2017-03-10', '2017-03-11', '2017-03-12', '2017-03-13',
'2017-03-14', '2017-03-15', '2017-03-16', '2017-03-17',
'2017-03-18'],
dtype='datetime64[ns]', freq='D')
# With only a start or only an end date, also pass the number of periods
print(pd.date_range(start='2017/02/18', periods=10))
answer:
DatetimeIndex(['2017-02-18', '2017-02-19', '2017-02-20', '2017-02-21',
'2017-02-22', '2017-02-23', '2017-02-24', '2017-02-25',
'2017-02-26', '2017-02-27'],
dtype='datetime64[ns]', freq='D')
print(pd.date_range(end='2017/03/18', periods=10))
answer:
DatetimeIndex(['2017-03-09', '2017-03-10', '2017-03-11', '2017-03-12',
'2017-03-13', '2017-03-14', '2017-03-15', '2017-03-16',
'2017-03-17', '2017-03-18'],
dtype='datetime64[ns]', freq='D')
# Normalize timestamps: normalize=True resets the time of day to midnight
print(pd.date_range(start='2017/02/18 12:13:14', periods=10))
print(pd.date_range(start='2017/02/18 12:13:14', periods=10, normalize=True))
answer:
DatetimeIndex(['2017-02-18 12:13:14', '2017-02-19 12:13:14',
'2017-02-20 12:13:14', '2017-02-21 12:13:14',
'2017-02-22 12:13:14', '2017-02-23 12:13:14',
'2017-02-24 12:13:14', '2017-02-25 12:13:14',
'2017-02-26 12:13:14', '2017-02-27 12:13:14'],
dtype='datetime64[ns]', freq='D')
DatetimeIndex(['2017-02-18', '2017-02-19', '2017-02-20', '2017-02-21',
'2017-02-22', '2017-02-23', '2017-02-24', '2017-02-25',
'2017-02-26', '2017-02-27'],
dtype='datetime64[ns]', freq='D')
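A short sketch of the points in this section: both endpoints of a start/end range are included, and normalize=True moves each timestamp to midnight. The variable names are illustrative assumptions.

```python
import pandas as pd

# Start and end are both included: 2017-02-18 through 2017-03-18 is 29 days
daily = pd.date_range('2017/02/18', '2017/03/18')
n_days = len(daily)

# Without normalize the time of day is carried along; with it, it is reset
raw = pd.date_range(start='2017/02/18 12:13:14', periods=3)
norm = pd.date_range(start='2017/02/18 12:13:14', periods=3, normalize=True)

first_raw = str(raw[0])
first_norm = str(norm[0])
print(n_days)
print(first_raw)
print(first_norm)
```

The same result can be obtained after the fact with raw.normalize().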
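5. Frequency (freq): a frequency string is a base frequency with an optional integer multiple, such as '2D'. Base frequencies include:
- BM: business month end, the last business day of each month (spelled BME in pandas 2.2 and later)
- D: day, M: month end (spelled ME in pandas 2.2 and later), and others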
print(pd.date_range('2017/02/18', '2017/03/18', freq='2D'))
answer:
DatetimeIndex(['2017-02-18', '2017-02-20', '2017-02-22', '2017-02-24',
'2017-02-26', '2017-02-28', '2017-03-02', '2017-03-04',
'2017-03-06', '2017-03-08', '2017-03-10', '2017-03-12',
'2017-03-14', '2017-03-16', '2017-03-18'],
dtype='datetime64[ns]', freq='2D')
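The sketch below illustrates a multiplied frequency and a business-month-end frequency. Because the string alias for business month end differs between pandas versions, it passes the offset object pd.offsets.BMonthEnd() instead of a string; that substitution and the variable names are assumptions made for this example.

```python
import pandas as pd

# '2D' is the base frequency 'D' (day) with a multiple of 2
every_2_days = pd.date_range('2017/02/18', '2017/03/18', freq='2D')
n_points = len(every_2_days)

# Business month end: the last business day of each month.
# 2017-04-30 is a Sunday, so April ends on Friday 2017-04-28.
bm = pd.date_range('2017-01-01', periods=4, freq=pd.offsets.BMonthEnd())
bm_dates = bm.strftime('%Y-%m-%d').tolist()

print(n_points, every_2_days.freqstr)
print(bm_dates)
```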
6. Shifting data with shift()
- Moves the values forward or backward along the time axis while the index stays unchanged
ts = pd.Series(np.random.randn(5), index=pd.date_range('20170218', periods=5, freq='W-SAT'))
print(ts)
answer:
2017-02-18 0.400190
2017-02-25 1.495394
2017-03-04 -1.331107
2017-03-11 2.943859
2017-03-18 0.813070
Freq: W-SAT, dtype: float64
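print(ts.shift(1))   # values move one period later; the first value becomes NaN
print(ts.shift(-1))  # values move one period earlier; the last value becomes NaN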
answer:
2017-02-18 NaN
2017-02-25 0.400190
2017-03-04 1.495394
2017-03-11 -1.331107
2017-03-18 2.943859
Freq: W-SAT, dtype: float64
2017-02-18 1.495394
2017-02-25 -1.331107
2017-03-04 2.943859
2017-03-11 0.813070
2017-03-18 NaN
Freq: W-SAT, dtype: float64
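One common use of shift() is comparing each value with the previous period. The sketch below uses fixed, made-up values (1.0 to 5.0) so the result is easy to verify; the names lagged, led and change are assumptions for illustration.

```python
import pandas as pd

idx = pd.date_range('20170218', periods=5, freq='W-SAT')
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

lagged = ts.shift(1)    # each date now holds the previous week's value
led = ts.shift(-1)      # each date now holds the next week's value

# The index is untouched, so the shifted series aligns with the original
change = ts - lagged    # week-over-week difference; first entry is NaN

print(lagged.tolist())
print(led.tolist())
print(change.tolist())
```

Since only the values move, the vacated position at the start or the end is filled with NaN.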