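Problem statement
Given time series data, use its timestamp column to select the rows that fall within a fixed time window on each day.
1. Inspect the data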
data.head(2)
Out[9]:
TIMESTAMP RECORD Wind_Speed ... BP_80m RH_10m BattV
2 2022-06-25 00:00:01 9384675 3.86 ... 92 44.86 12.42
3 2022-06-25 00:00:02 9384676 3.86 ... 93 44.85 12.43
2. Check the data types. The TIMESTAMP column is stored as object rather than as a datetime type, so it has to be converted first
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 190680 entries, 2 to 190681
Data columns (total 26 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 TIMESTAMP 190680 non-null object
1 RECORD 190680 non-null object
2 Wind_Speed 190680 non-null object
3 Wb_aaam_x 190680 non-null object
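The info() output also reports the measurement columns as object, which usually means the values are stored as strings and numeric operations such as mean() will not work on them directly. A minimal sketch of one way to handle this, assuming the values are numeric strings; the `toy` frame and its values are made up for illustration and are not the real file:

```python
import pandas as pd

# Toy frame mimicking the dtypes reported by data.info(): values are stored as strings.
toy = pd.DataFrame({
    "TIMESTAMP": ["2022-06-25 00:00:01", "2022-06-25 00:00:02"],
    "Wind_Speed": ["3.86", "bad"],
})

# errors="coerce" turns entries that cannot be parsed into NaN instead of raising.
toy["Wind_Speed"] = pd.to_numeric(toy["Wind_Speed"], errors="coerce")

wind_dtype = str(toy["Wind_Speed"].dtype)
n_missing = int(toy["Wind_Speed"].isna().sum())
```

Counting the NaN values afterwards is a quick way to see how many entries failed to parse.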
3. Convert the column to datetime and add an hour column holding the hour of each row's TIMESTAMP
data['TIMESTAMP'] = pd.to_datetime(data['TIMESTAMP'])
data['hour'] = data['TIMESTAMP'].dt.hour  # vectorized equivalent of apply(lambda x: x.hour)
data.hour
Out[16]:
2 0 #2022-06-25 00:00:01
3 0 #2022-06-25 00:00:02
4 0
..
190679 4 #2022-06-26 04:32:14
190680 4 #2022-06-26 04:32:15
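4. Compute the mean of the 'Wind_Speed' column over the daytime hours (8:00-20:00) for a single day of data
data['Wind_Speed'] = pd.to_numeric(data['Wind_Speed'], errors='coerce')  # info() reports this column as object
ws_avg_day = data.loc[(data['hour'] >= 8) & (data['hour'] < 20), 'Wind_Speed'].mean()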
5. Compute the mean of the 'Wind_Speed' column over the nighttime hours (20:00-24:00 and 0:00-8:00) for a single day of data
ws_avg_night = data.loc[(data['hour'] >= 20) | (data['hour'] < 8), 'Wind_Speed'].mean()
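The masks in steps 4 and 5 pool every matching row in the frame, so if the frame covers more than one calendar day the result is an average over all of those days. A sketch of how per-day values could be obtained with groupby, on synthetic data; the `toy` frame and the `period` and `date` columns are illustrative names, not part of the original data:

```python
import numpy as np
import pandas as pd

# Synthetic hourly data covering two days; wind speed equals the hour of day for easy checking.
idx = pd.Timestamp("2022-06-25") + pd.to_timedelta(np.arange(48), unit="h")
toy = pd.DataFrame({"TIMESTAMP": idx, "Wind_Speed": idx.hour.astype(float)})

hour = toy["TIMESTAMP"].dt.hour
# Same boundaries as in the text: day is 8:00 <= t < 20:00, everything else is night.
toy["period"] = np.where((hour >= 8) & (hour < 20), "day", "night")
toy["date"] = toy["TIMESTAMP"].dt.date

# One mean per (date, period) pair instead of a single mean over all days.
per_day = toy.groupby(["date", "period"])["Wind_Speed"].mean().unstack("period")
```

Here `per_day` has one row per date and one column each for "day" and "night".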
6. Select the data between 7:30 and 9:30 on every day
data = data.set_index('TIMESTAMP')
data_part = data.between_time('7:30', '9:30')
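between_time requires a DatetimeIndex, which is why TIMESTAMP is set as the index first. A small sketch on synthetic half-hourly data (the `toy` frame is made up for illustration) showing two behaviours worth knowing: both endpoints are included by default, and a start time later than the end time selects a window that wraps around midnight:

```python
import numpy as np
import pandas as pd

# Synthetic 30-minute series over two days, indexed by timestamp.
idx = pd.Timestamp("2022-06-25") + pd.to_timedelta(np.arange(96) * 30, unit="min")
toy = pd.DataFrame({"Wind_Speed": np.arange(96, dtype=float)}, index=idx)

# Both endpoints are included by default, so each day contributes
# the rows at 7:30, 8:00, 8:30, 9:00 and 9:30.
morning = toy.between_time("7:30", "9:30")

# When start is later than end, the window wraps around midnight (20:00 to 8:00 here).
night = toy.between_time("20:00", "8:00")

morning_times = sorted(set(morning.index.strftime("%H:%M")))
```

Note that the wrapped window includes the row at exactly 8:00, unlike the `hour < 8` mask in step 5. Recent pandas versions expose an `inclusive` argument on between_time for controlling which endpoints are kept.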