Pandas时间索引的骚操作

最新推荐文章于 2023-02-14 12:41:16 发布

songcream1

最新推荐文章于 2023-02-14 12:41:16 发布

阅读量350

点赞数 1

本文链接：https://blog.csdn.net/laigengsong/article/details/118917210

版权

当DataFrame的索引为时间对象时，可以很方便的进行数据的筛选。

时间范围生成

#生成时间范围
pd.date_range('2010-01-01','2010-01-08')
>>>DatetimeIndex(['2010-01-01', '2010-01-02', '2010-01-03', '2010-01-04',
               '2010-01-05', '2010-01-06', '2010-01-07', '2010-01-08'],
              dtype='datetime64[ns]', freq='D')

#指定时间长度生成时间范围
pd.date_range('2010-01-01',periods=10)
>>>DatetimeIndex(['2010-01-01', '2010-01-02', '2010-01-03', '2010-01-04',
               '2010-01-05', '2010-01-06', '2010-01-07', '2010-01-08',
               '2010-01-09', '2010-01-10'],
              dtype='datetime64[ns]', freq='D')

#指定时间长度(10)、间隔(隔一个月)生成时间范围
pd.date_range('2010-01-01',periods=10,freq='M')
>>>DatetimeIndex(['2010-01-31', '2010-02-28', '2010-03-31', '2010-04-30',
               '2010-05-31', '2010-06-30', '2010-07-31', '2010-08-31',
               '2010-09-30', '2010-10-31'],
              dtype='datetime64[ns]', freq='M')


#指定时间长度(10)、间隔(隔1个小时20分钟)生成时间范围
pd.date_range('2010-01-01',periods=10,freq='1h20min')
>>>DatetimeIndex(['2010-01-01 00:00:00', '2010-01-01 01:20:00',
               '2010-01-01 02:40:00', '2010-01-01 04:00:00',
               '2010-01-01 05:20:00', '2010-01-01 06:40:00',
               '2010-01-01 08:00:00', '2010-01-01 09:20:00',
               '2010-01-01 10:40:00', '2010-01-01 12:00:00'],
              dtype='datetime64[ns]', freq='80T')

#按工作日生成时间范围
pd.date_range('2010-01-01',periods=10,freq='B')
>>>DatetimeIndex(['2010-01-01', '2010-01-04', '2010-01-05', '2010-01-06',
               '2010-01-07', '2010-01-08', '2010-01-11', '2010-01-12',
               '2010-01-13', '2010-01-14'],
              dtype='datetime64[ns]', freq='B')

时间筛选

#创建一个索引为日期1000天，值从0到999的Series
a=pd.Series(np.arange(1000),index=pd.date_range('2010-01-05',periods=1000))
>>> 2010-01-05     0
	2010-01-06     1
	2010-01-07     2
	...
	2012-09-29    998
	2012-09-30    999

#筛选1月份的数据
a['2010-01']
>>> 2010-01-05     0
	2010-01-06     1
	2010-01-07     2
	...
	2010-01-30    25
	2010-01-31    26

#时间切片筛选2010到2011-02-03的数据
a['2010':'2011-02-02']
>>> 2010-01-05      0
	2010-01-06      1
	...
	2011-02-01    392
	2011-02-02    393

筛选数据运算

a=pd.Series(np.arange(1000),index=pd.date_range('2010-01-05',periods=1000))
>>> 2010-01-05     0
	2010-01-06     1
	2010-01-07     2
	...
	2012-09-29    998
	2012-09-30    999

#按每周求和，每7天进行相加求和输出
a.resample('W').sum()
>>> 2010-01-10      15
	2010-01-17      63
	2010-01-24     112
	2010-01-31     161
	...
	2012-09-09    6825
	2012-09-16    6874
	2012-09-23    6923
	2012-09-30    6972

#按每月求平均值
a.resample('M').mean()
>>> 2010-01-31     13.0
	2010-02-28     40.5
	2010-03-31     70.0
	...
	2012-07-31    923.0
	2012-08-31    954.0
	2012-09-30    984.5

读取excel时将字符串列转化为时间类型的列

pd.read_csv('demo.csv',parse_dates = ['col1'],infer_datetime_format=True)

songcream1

关注

1
点赞
踩
4

收藏

觉得还不错? 一键收藏
2
评论
Pandas时间索引的骚操作

当DataFrame的索引为时间对象时，可以很方便的进行数据的筛选。时间范围生成#生成时间范围pd.date_range('2010-01-01','2010-01-08')>>>DatetimeIndex(['2010-01-01', '2010-01-02', '2010-01-03', '2010-01-04', '2010-01-05', '2010-01-06', '2010-01-07', '2010-01-08'],
复制链接

扫一扫