Using pandas, the Python Data Analysis Library: Part 4

weiwen6933

Article link: https://blog.csdn.net/weiwen6933/article/details/104219600

Day 4

Basic pandas operations

Dataset for the time operations (password: z333)

Time operations

Basic operations

# Using the standard-library datetime module
import datetime
dt = datetime.datetime(year=2020,month = 2,day=8,hour=10,minute=42)
print(dt)

2020-02-08 10:42:00

# Using a pandas Timestamp
import pandas as pd
ts = pd.Timestamp('2017-11-24')
ts
ts.month
ts.day
ts + pd.Timedelta('5 days')

Timestamp('2017-11-24 00:00:00')
11
24
Timestamp('2017-11-29 00:00:00')
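
Subtracting two Timestamps gives a Timedelta back, which is handy for measuring durations. A minimal sketch; the variable names `start`, `end` and `delta` are only illustrative:

```python
import pandas as pd

start = pd.Timestamp('2017-11-24')
# A Timedelta can also be built from keyword arguments
end = start + pd.Timedelta(days=5, hours=6)

# Timestamp - Timestamp -> Timedelta
delta = end - start
print(delta)                  # 5 days 06:00:00
print(delta.days)             # 5
print(delta.total_seconds())  # 453600.0
```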

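# Convert strings to standard datetime objects
pd.to_datetime('2020-02-09')
# Day-first string; dayfirst=True states the intended order explicitly
pd.to_datetime('24/11/2020', dayfirst=True)

Timestamp('2020-11-24 00:00:00')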
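
When the layout of the strings is known, passing an explicit `format` avoids any guessing, and `errors='coerce'` turns unparseable entries into NaT instead of raising. A small sketch with made-up input values:

```python
import pandas as pd

# Explicit format: day/month/year
ts = pd.to_datetime('24/11/2020', format='%d/%m/%Y')
print(ts)

# Illustrative input containing one entry that is not a date
mixed = pd.Series(['2020-11-24', 'not a date'])
parsed = pd.to_datetime(mixed, errors='coerce')
print(parsed)  # the second entry becomes NaT
```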

s = pd.Series(['2020-11-24 01:00:00','2020-11-25 02:00:00','2020-11-26 03:00:00'])
s
'''
0    2020-11-24 01:00:00
1    2020-11-25 02:00:00
2    2020-11-26 03:00:00
dtype: object
'''
ts = pd.to_datetime(s)
ts
'''
0   2020-11-24 01:00:00
1   2020-11-25 02:00:00
2   2020-11-26 03:00:00
dtype: datetime64[ns]
'''
ts.dt.hour
'''
0    1
1    2
2    3
dtype: int64
'''
ts.dt.weekday
'''
0    1
1    2
2    3
dtype: int64
'''
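
The `.dt` accessor exposes more than numeric fields. As a rough sketch on the same three timestamps, `day_name()` returns readable weekday names and `strftime()` formats each value back to a string:

```python
import pandas as pd

s = pd.to_datetime(pd.Series(['2020-11-24 01:00:00',
                              '2020-11-25 02:00:00',
                              '2020-11-26 03:00:00']))

# Weekday names instead of the 0-6 codes returned by .dt.weekday
names = s.dt.day_name()
print(names.tolist())  # ['Tuesday', 'Wednesday', 'Thursday']

# Format each timestamp as a date-only string
dates = s.dt.strftime('%Y-%m-%d')
print(dates.tolist())  # ['2020-11-24', '2020-11-25', '2020-11-26']
```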

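# Generate 10 timestamps starting at 2020-02-07, one every 12 hours
# (periods is the number of points, so the range spans 4.5 days, not 10 days;
# the lowercase 'h' alias is used because recent pandas deprecates 'H')
pd.Series(pd.date_range(start='2020-02-07',periods = 10,freq='12h'))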
'''
0   2020-02-07 00:00:00
1   2020-02-07 12:00:00
2   2020-02-08 00:00:00
3   2020-02-08 12:00:00
4   2020-02-09 00:00:00
5   2020-02-09 12:00:00
6   2020-02-10 00:00:00
7   2020-02-10 12:00:00
8   2020-02-11 00:00:00
9   2020-02-11 12:00:00
dtype: datetime64[ns]
'''
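
Since `periods` counts timestamps rather than days, covering ten full days at 12-hour steps needs 20 points. A quick sketch to check the arithmetic (the names `idx` and `ten_days` are illustrative):

```python
import pandas as pd

idx = pd.date_range(start='2020-02-07', periods=10, freq='12h')
print(len(idx))          # 10
print(idx[-1] - idx[0])  # 4 days 12:00:00

# Ten days at two points per day -> 20 periods
ten_days = pd.date_range(start='2020-02-07', periods=20, freq='12h')
print(ten_days[-1])      # 2020-02-16 12:00:00
```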

Dataset demo

data = pd.read_csv('data/flowdata.csv')
data.head()


data['Time'] = pd.to_datetime(data['Time'])
data = data.set_index('Time')

# index_col selects the column to use as the index; parse_dates=True parses that index as dates
data = pd.read_csv('data/flowdata.csv',index_col=0,parse_dates=True)
data

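
If flowdata.csv is not at hand, the same loading pattern can be tried on an in-memory CSV. This is only a sketch: the column name `flow` and the values are made up and are not taken from the real dataset.

```python
import io
import pandas as pd

# Hypothetical stand-in for data/flowdata.csv
csv_text = """Time,flow
2012-01-01 00:00:00,0.31
2012-01-01 03:00:00,0.30
2012-01-01 06:00:00,0.28
"""

# Same arguments as above: first column becomes a parsed DatetimeIndex
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)
print(type(df.index).__name__)  # DatetimeIndex
print(df.index.name)            # Time
```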

Selecting time slices

# Slice rows by time
data[pd.Timestamp('2012-01-01 09:00'):pd.Timestamp('2012-01-01 19:00')]

data[('2012-01-01 09:00'):('2012-01-01 19:00')]


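# Select all rows for one year
# (use .loc: data['2013'] is treated as a column lookup in pandas 2.x and fails)
data.loc['2013']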


# Select a range of months
data['2012-01':'2012-03']


# Select the January rows from every year
data[data.index.month==1]

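
The three selections above can be checked on a synthetic daily series covering 2012 and 2013. A sketch under that assumption; the frame `df` and its `value` column are hypothetical:

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2012-01-01', '2013-12-31', freq='D')
df = pd.DataFrame({'value': np.arange(len(idx))}, index=idx)

year_2013 = df.loc['2013']                   # partial string: one year
first_quarter = df.loc['2012-01':'2012-03']  # partial string slice, end month included
januaries = df[df.index.month == 1]          # boolean mask across all years

print(len(year_2013), len(first_quarter), len(januaries))  # 365 91 62
```

Note that the string slice includes all of March, and 2012 is a leap year, hence 31 + 29 + 31 = 91 rows.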

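# Select rows within a time-of-day window
# Hours strictly between 8 and 12, i.e. 09:00 through 11:59
data[(data.index.hour>8) & (data.index.hour<12)]

# between_time includes both endpoints by default: 08:00 through 12:00
data.between_time('08:00','12:00')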
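
The two selections are not equivalent, which is easy to see on one day of hourly data. A minimal sketch with a made-up frame:

```python
import pandas as pd

idx = pd.date_range('2012-01-01 00:00', periods=24, freq='h')
df = pd.DataFrame({'value': range(24)}, index=idx)

# Strict comparisons drop both 08:00 and 12:00
by_mask = df[(df.index.hour > 8) & (df.index.hour < 12)]
# between_time keeps both endpoints by default
by_between = df.between_time('08:00', '12:00')

print(by_mask.index.hour.tolist())     # [9, 10, 11]
print(by_between.index.hour.tolist())  # [8, 9, 10, 11, 12]
```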

Time series resampling: summary statistics over a time period

# Daily mean of every column
data.resample('D').mean()
# data.resample('D').max()  # daily maximum


# Mean over 3-day intervals
data.resample('3D').mean()

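
To see what resampling does without the dataset, here is a sketch on two days of synthetic hourly values. The frame and its `value` column are hypothetical; `agg` is used to compute several statistics per bin at once.

```python
import pandas as pd

idx = pd.date_range('2012-01-01', periods=48, freq='h')
df = pd.DataFrame({'value': range(48)}, index=idx)

# One row per calendar day
daily = df.resample('D').mean()
print(daily['value'].tolist())  # [11.5, 35.5]

# Both days fall into a single 3-day bin
three_day = df.resample('3D').mean()
print(three_day['value'].tolist())  # [23.5]

# Several statistics per day; columns become (column, statistic) pairs
summary = df.resample('D').agg(['min', 'max', 'mean'])
print(summary)
```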

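# Plot the monthly means: month on the x-axis, mean on the y-axis
# (%matplotlib is a Jupyter magic; drop it outside a notebook)
%matplotlib notebook
# pandas 2.2+ deprecates 'M' in favor of 'ME' for month-end
data.resample('M').mean().plot()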
