Python time series, part 1

weixin_39914938

Python time series

1. Displaying the current date

import datetime

now=datetime.datetime.now()

print(now)

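2. Computing the difference between two dates

# Python 3 does not allow leading zeros in integer literals, so write 2 rather than 02

delta=datetime.datetime(2019,2,22)-datetime.datetime(2019,1,1)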

print(delta)
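As a small illustration of the subtraction above, the sketch below shows what the resulting object can do. The variable names (start, end, later) are made up for this example.

```python
import datetime

start = datetime.datetime(2019, 1, 1)
end = datetime.datetime(2019, 2, 22)

# Subtracting two datetime objects yields a datetime.timedelta
delta = end - start
print(delta.days)             # whole days between the two dates
print(delta.total_seconds())  # the same span expressed in seconds

# Adding a timedelta to a datetime moves it forward in time
later = start + datetime.timedelta(days=10)
print(later)
```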

3. Converting between strings and datetime objects

1. strptime

value='2019-02-22'

time_1=datetime.datetime.strptime(value,'%Y-%m-%d')

print(time_1)

print(type(time_1))

Output

2019-02-22 00:00:00

<class 'datetime.datetime'>

2. strftime

value=datetime.datetime(2019,2,22)

new_value=value.strftime('%Y-%m-%d')

print(new_value)

print(type(new_value))

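Output

2019-02-22

<class 'str'>

What is the difference between strptime and strftime?

First, the two names differ by a single letter: the p in strptime stands for parse, and the f in strftime stands for format.

Second, the arguments differ. strptime takes two arguments: the string to convert and the format that string is currently in. strftime takes one argument: the format to convert to.

Third, strptime converts a string into a datetime, while strftime converts a datetime into a string.

(How to remember it? p parses text into a time; f formats a time into text.)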
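The contrast between the two functions can be sketched as a round trip. This is only an illustration; the names fmt, text, parsed and formatted are invented for the example.

```python
import datetime

fmt = '%Y-%m-%d'
text = '2019-02-22'

parsed = datetime.datetime.strptime(text, fmt)  # str -> datetime
formatted = parsed.strftime(fmt)                # datetime -> str

print(type(parsed).__name__, type(formatted).__name__)
print(formatted == text)  # the round trip gives back the original string
```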

# datetime.strptime

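In the string-to-date conversions above we always had to spell out a format. datetime.strptime is a good way to parse a date when the format is known, but writing a format code every time can get tedious. In that case, use the parser.parse method from the third-party dateutil package.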

from dateutil.parser import parse

print(parse('2011-01-03'))

print(parse('Jan 31, 1997 10:45 PM'))

Output

2011-01-03 00:00:00

1997-01-31 22:45:00

One more note

# In international contexts the day often comes before the month, so you can pass dayfirst=True to indicate this

print(parse('16/12/2018',dayfirst=True))

Output

2018-12-16 00:00:00
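The effect of dayfirst is easiest to see on a date that is valid either way round. A minimal sketch, assuming dateutil is installed (it ships as a pandas dependency); the sample string is made up for the example.

```python
from dateutil.parser import parse

ambiguous = '6/12/2011'

month_first = parse(ambiguous)               # default: month before day
day_first = parse(ambiguous, dayfirst=True)  # day before month

print(month_first)
print(day_first)
```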

Indexing, selection, subsetting

1. Selecting data by index

import numpy as np

import pandas as pd

dates=[datetime.datetime(2011,1,2),datetime.datetime(2011,1,5),datetime.datetime(2011,1,7),datetime.datetime(2011,1,8),datetime.datetime(2011,1,10),datetime.datetime(2011,1,12)]

ts=pd.Series(np.random.randn(6),index=dates)

print(ts)

stamp=ts.index[2]

print(ts[stamp])

ts:

2011-01-02 -0.564252

2011-01-05 -1.092965

2011-01-07 1.794356

2011-01-08 0.765545

2011-01-10 0.511482

2011-01-12 -0.180311

dtype: float64

ts[stamp]:

1.7943558404193065

2. Selecting data with a string that can be interpreted as a date

print(ts['1/10/2011'])

3. For long time series with thousands of points, you can select a slice of the data by passing just a year, or a year and month

longer_ts=pd.Series(np.random.randn(1000),index=pd.date_range('1/1/2000',periods=1000))

print(longer_ts)

print(longer_ts['2001'])

print(longer_ts['2001-05'])

Of course, you can also slice with datetime objects

print(ts[datetime.datetime(2011,1,7):])

print(ts['1/6/2011':'1/11/2011'])

print(ts.truncate(after='1/9/2011'))
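To make the slicing behaviour concrete, here is a hedged sketch on a series with fixed values instead of random ones. The names days, index, ts_demo, window and head are invented for the example.

```python
import datetime
import pandas as pd

days = [2, 5, 7, 8, 10, 12]
index = pd.DatetimeIndex([datetime.datetime(2011, 1, d) for d in days])
ts_demo = pd.Series(range(6), index=index)

# The slice endpoints need not be in the index; both ends are inclusive
window = ts_demo.loc['2011-01-06':'2011-01-11']
print(window)

# truncate(after=...) drops every row later than the given date
head = ts_demo.truncate(after='2011-01-09')
print(head)
```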

Time series with duplicate indices

dates=pd.DatetimeIndex(['1/1/2000','1/2/2000','1/2/2000','1/2/2000','1/3/2000'])

dup_ts=pd.Series(np.arange(5),index=dates) # index contains duplicates

print(dup_ts)

print(dup_ts.index.is_unique)

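print(dup_ts['1/3/2000']) # not duplicated

print(dup_ts['1/2/2000']) # duplicated

# A question I ran into: how are unique and is_unique related?

print(dup_ts.unique()) # by default this deduplicates the values

print(dup_ts.index.unique()) # to deduplicate the index instead, go through .index

'''

Summary of the difference:

is_unique: checks the whole collection (values or index; the index in this example) and returns True if there are no duplicates, otherwise False

unique: looks at the whole collection (values or index; the index in this example) and returns only the deduplicated entries

'''

Output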

2000-01-01 0

2000-01-02 1

2000-01-02 2

2000-01-02 3

2000-01-03 4

dtype: int32

False

2000-01-02 1

2000-01-02 2

2000-01-02 3

dtype: int32
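The relationship between is_unique and unique can be sketched side by side. The series below is made up for the illustration and deliberately repeats one value as well as one index label.

```python
import pandas as pd

idx = pd.DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-02',
                        '2000-01-02', '2000-01-03'])
s = pd.Series([0, 1, 1, 3, 4], index=idx)

print(s.index.is_unique)      # boolean: is every index label distinct?
print(s.is_unique)            # the same question asked of the values
print(len(s.index.unique()))  # number of distinct index labels
print(s.unique())             # distinct values, as an array
```

In short, is_unique answers a yes/no question, while unique returns the deduplicated entries themselves.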

Suppose you want to aggregate data that has non-unique timestamps. One way is to use groupby and pass level=0:

# To group by a time index that contains duplicates, pass level=0 to groupby

grouped=dup_ts.groupby(level=0)

print(grouped.mean())

print(grouped.count())

Output:

2000-01-01 0

2000-01-02 2

2000-01-03 4

dtype: int32

2000-01-01 1

2000-01-02 3

2000-01-03 1

dtype: int64
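Several aggregations can be collected in one pass. The sketch below is an illustration on a freshly built series; the names idx, s and summary are invented here.

```python
import pandas as pd

idx = pd.DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-02',
                        '2000-01-02', '2000-01-03'])
s = pd.Series(range(5), index=idx)

# level=0 groups rows by the (only) index level, i.e. by timestamp
summary = s.groupby(level=0).agg(['mean', 'count', 'max'])
print(summary)

# After aggregation every timestamp appears exactly once
print(summary.index.is_unique)
```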

Date ranges, frequencies, and shifting

1. Generating date ranges

There are roughly three cases

# 1. The start and end dates are known

index=pd.date_range('2012-04-01','2012-06-01')

print(index)

# 2. The start date and the number of periods are known

index1=pd.date_range(start='2012-01-01',periods=20)

print(index1)

# 3. The end date and the number of periods are known

index2=pd.date_range(end='2012-06-1',periods=5)

print(index2)
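The three call styles describe the same kind of index. As a rough check, the sketch below builds one daily range three ways; the period count 62 is simply the number of days from 2012-04-01 to 2012-06-01 inclusive, and the variable names are made up.

```python
import pandas as pd

by_bounds = pd.date_range('2012-04-01', '2012-06-01')     # start and end
by_start = pd.date_range(start='2012-04-01', periods=62)  # start and count
by_end = pd.date_range(end='2012-06-01', periods=62)      # end and count

print(len(by_bounds))
print(by_bounds.equals(by_start) and by_bounds.equals(by_end))
```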

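If you need a time index containing the last business day of each month, pass the 'BM' frequency (business end of month; pandas 2.2 and later spell this alias 'BME'). Only dates that fall on or inside the date range are included.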

dd=pd.date_range('2000-01-01','2000-12-01',freq='BM')

print(dd)
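The same index can be requested with an offset object instead of a string alias, which sidesteps differences in alias spelling between pandas versions. A minimal sketch, using pd.offsets.BMonthEnd(); the variable name bm is made up.

```python
import pandas as pd

# BMonthEnd is the offset class for "last business day of the month"
bm = pd.date_range('2000-01-01', '2000-12-01', freq=pd.offsets.BMonthEnd())

print(len(bm))             # December's business month end lies outside the range
print(bm[-1])              # last date that falls inside the range
print(bm.dayofweek.max())  # Monday=0 ... Friday=4, so no weekend dates
```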

d1=pd.date_range('2012-05-02 12:56:31',periods=5)

print(d1)

# normalize sets the generated timestamps to midnight

d2=pd.date_range('2012-05-02',periods=20,normalize=True)

print(d2)
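normalize only makes a visible difference when the start or end carries a time of day. A small sketch with an assumed start timestamp; the names raw and clean are invented for the example.

```python
import pandas as pd

raw = pd.date_range('2012-05-02 12:56:31', periods=3)
clean = pd.date_range('2012-05-02 12:56:31', periods=3, normalize=True)

print(raw[0])    # keeps the time of day from the start argument
print(clean[0])  # time of day reset to midnight
```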

In fact, you can also do this

g1=pd.date_range('2000-01-01','2000-01-03 23:59',freq='4h')

print(g1)

g2=pd.date_range('2000-01-01',periods=10,freq='1h30min')

print(g2)
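Frequency strings such as '4h' map onto offset objects, and those objects can be combined. The sketch below is an illustration only; the names every_4h, step and every_90min are made up.

```python
import pandas as pd

every_4h = pd.date_range('2000-01-01', '2000-01-03 23:59', freq='4h')
print(len(every_4h))  # 6 stamps per day over 3 days
print(every_4h[-1])

# An offset object can replace the string; Hour(1) + Minute(30) is 90 minutes
step = pd.offsets.Hour(1) + pd.offsets.Minute(30)
every_90min = pd.date_range('2000-01-01', periods=10, freq=step)
print(every_90min[-1])
```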

Week-of-month dates

(week of month)

# The third Friday of each month

g3=pd.date_range('2019-01-01','2019-09-01',freq='WOM-3FRI')

print(list(g3))

Output

[Timestamp('2019-01-18 00:00:00', freq='WOM-3FRI'), Timestamp('2019-02-15 00:00:00', freq='WOM-3FRI'), Timestamp('2019-03-15 00:00:00', freq='WOM-3FRI'), Timestamp('2019-04-19 00:00:00', freq='WOM-3FRI'), Timestamp('2019-05-17 00:00:00', freq='WOM-3FRI'), Timestamp('2019-06-21 00:00:00', freq='WOM-3FRI'), Timestamp('2019-07-19 00:00:00', freq='WOM-3FRI'), Timestamp('2019-08-16 00:00:00', freq='WOM-3FRI')]
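One way to convince yourself that 'WOM-3FRI' really picks the third Friday is to inspect the weekday and day-of-month of the generated dates. A hedged sketch for the year 2012; the variable name third_fridays is made up.

```python
import pandas as pd

third_fridays = pd.date_range('2012-01-01', '2012-12-31', freq='WOM-3FRI')

print(len(third_fridays))            # one date per month
print(set(third_fridays.dayofweek))  # Monday=0, so Friday=4

# A third Friday can only fall on day 15 through 21 of its month
print(third_fridays.day.min(), third_fridays.day.max())
```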

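Shifting dates (forward and backward)

ts=pd.Series(np.random.randn(4),index=pd.date_range('2000-01-01',freq='M',periods=4))

print(ts)

# Shift the data toward later rows; the index stays the same

print(ts.shift(2))

# Shift the data toward earlier rows; the index stays the same

print(ts.shift(-2))

Output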

2000-01-31 0.120563

2000-02-29 1.195027

2000-03-31 -0.972080

2000-04-30 0.773595

Freq: M, dtype: float64

2000-01-31 NaN

2000-02-29 NaN

2000-03-31 0.120563

2000-04-30 1.195027

Freq: M, dtype: float64

2000-01-31 -0.972080

2000-02-29 0.773595

2000-03-31 NaN

2000-04-30 NaN

Freq: M, dtype: float64

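Use case:

shift is commonly used to compute the percent change in a time series, or in several time series held as DataFrame columns. The code looks like this

# In words: (period 2 / period 1) - 1, stored at period 2; this resembles differencing

print(ts/ts.shift(1)-1)

# Because a plain shift leaves the index unchanged, some data is discarded. So if the frequency is known, you can pass it to shift to advance the timestamps rather than just the data.

print(ts.shift(1,freq='M'))

# The output of this call is shown below: every value is kept and the whole time axis moves forward by one month, so the last row (2000-05-31) holds the last value of the original data.

Output: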

2000-02-29 -1.267211

2000-03-31 -0.492248

2000-04-30 -0.278425

2000-05-31 -0.131732

Freq: M, dtype: float64
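The two uses of shift can be compared on fixed numbers. This is a sketch under assumptions: the prices are invented, and pd.offsets.MonthEnd() is used in place of a string alias so the example does not depend on how a given pandas version spells month-end.

```python
import pandas as pd

month_end = pd.offsets.MonthEnd()
idx = pd.date_range('2000-01-31', periods=4, freq=month_end)
prices = pd.Series([100.0, 110.0, 99.0, 99.0], index=idx)

# Percent change computed by hand with shift, and with the built-in helper
manual = prices / prices.shift(1) - 1
builtin = prices.pct_change()
print(manual)

# Passing freq moves the timestamps and keeps every value
moved = prices.shift(1, freq=month_end)
print(moved)
```

The first row of manual is NaN because there is no earlier period to compare against, whereas moved has no missing values at all.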
