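If the timestamps are ordered, we can use the itertools.groupby function to group the array elements by their corresponding day. The day can be obtained with np.datetime64.astype(..., dtype='datetime64[D]'), so we can write: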
from numpy import datetime64
from functools import partial
from itertools import groupby

for day, timestamps in groupby(data_array,
                               partial(datetime64.astype, dtype='datetime64[D]')):
    # process day and timestamps
    pass
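Here day is a datetime64[D] numpy object (so it contains only the day), and timestamps is an iterable (not a list, but we can convert it to one) of the corresponding timestamps. data_array is the array that contains the initial data.

For example: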
>>> for day, timestamps in groupby(data_array,
...                                partial(datetime64.astype, dtype='datetime64[D]')):
...     print((day, list(timestamps)))
...
(numpy.datetime64('2016-12-01'), [numpy.datetime64('2016-12-01T02:00:00.000000000'), numpy.datetime64('2016-12-01T04:00:00.000000000'), numpy.datetime64('2016-12-01T06:00:00.000000000'), numpy.datetime64('2016-12-01T08:00:00.000000000'), numpy.datetime64('2016-12-01T10:00:00.000000000'), numpy.datetime64('2016-12-01T12:00:00.000000000'), numpy.datetime64('2016-12-01T14:00:00.000000000'), numpy.datetime64('2016-12-01T16:00:00.000000000'), numpy.datetime64('2016-12-01T18:00:00.000000000'), numpy.datetime64('2016-12-01T20:00:00.000000000'), numpy.datetime64('2016-12-01T22:00:00.000000000')])
(numpy.datetime64('2016-12-02'), [numpy.datetime64('2016-12-02T00:00:00.000000000'), numpy.datetime64('2016-12-02T02:00:00.000000000'), numpy.datetime64('2016-12-02T04:00:00.000000000'), numpy.datetime64('2016-12-02T06:00:00.000000000'), numpy.datetime64('2016-12-02T08:00:00.000000000'), numpy.datetime64('2016-12-02T10:00:00.000000000'), numpy.datetime64('2016-12-02T12:00:00.000000000'), numpy.datetime64('2016-12-02T14:00:00.000000000'), numpy.datetime64('2016-12-02T16:00:00.000000000'), numpy.datetime64('2016-12-02T18:00:00.000000000'), numpy.datetime64('2016-12-02T20:00:00.000000000'), numpy.datetime64('2016-12-02T22:00:00.000000000')])
(numpy.datetime64('2016-12-03'), [numpy.datetime64('2016-12-03T00:00:00.000000000'), numpy.datetime64('2016-12-03T02:00:00.000000000'), numpy.datetime64('2016-12-03T04:00:00.000000000'), numpy.datetime64('2016-12-03T06:00:00.000000000'), numpy.datetime64('2016-12-03T08:00:00.000000000'), numpy.datetime64('2016-12-03T10:00:00.000000000'), numpy.datetime64('2016-12-03T12:00:00.000000000'), numpy.datetime64('2016-12-03T14:00:00.000000000'), numpy.datetime64('2016-12-03T16:00:00.000000000'), numpy.datetime64('2016-12-03T18:00:00.000000000'), numpy.datetime64('2016-12-03T20:00:00.000000000'), numpy.datetime64('2016-12-03T22:00:00.000000000')])
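So here we print a list of timestamps for each day, but this is of course only one of many options. As the example shows, not all slices have the same length (the last two slices have one extra element).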
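As one such option, here is a minimal self-contained sketch that collects each day's timestamps into a dictionary. The sample data_array (two-hourly timestamps starting at 2016-12-01T02:00) and the helper names to_day and group_by_day are assumptions made for illustration; the data is built to resemble the output above and is not the original data.

```python
from itertools import groupby

import numpy as np

# Assumed sample data: 35 timestamps, two hours apart, in ascending order.
data_array = np.datetime64('2016-12-01T02:00') + np.arange(35) * np.timedelta64(2, 'h')


def to_day(ts):
    """Truncate a datetime64 scalar to day precision."""
    return ts.astype('datetime64[D]')


def group_by_day(timestamps):
    """Map each day (as a 'YYYY-MM-DD' string) to the list of its timestamps.

    Assumes the input is already sorted in ascending order.
    """
    return {str(day): list(group) for day, group in groupby(timestamps, to_day)}


groups = group_by_day(data_array)
for day, stamps in groups.items():
    print(day, len(stamps))
# 2016-12-01 11
# 2016-12-02 12
# 2016-12-03 12
```

Converting each group to a list inside the comprehension means the result can be reused after the loop has finished.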
Note that timestamps is an iterator: if you do not convert it to a list, then after a single pass over it the iterator will be exhausted.
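The exhaustion can be seen in a small sketch like the following. The sample data (six timestamps, twelve hours apart) and the names to_day and counts are assumptions introduced only for this illustration.

```python
from itertools import groupby

import numpy as np

# Assumed sample data: two timestamps on each of three consecutive days.
data_array = np.datetime64('2016-12-01T00:00') + np.arange(6) * np.timedelta64(12, 'h')


def to_day(ts):
    """Truncate a datetime64 scalar to day precision."""
    return ts.astype('datetime64[D]')


counts = []
for day, timestamps in groupby(data_array, to_day):
    first_pass = list(timestamps)   # consumes the group iterator
    second_pass = list(timestamps)  # nothing is left to yield
    counts.append((len(first_pass), len(second_pass)))

print(counts)  # [(2, 0), (2, 0), (2, 0)]
```

If a group is needed more than once, keep the list produced by the first pass rather than the iterator itself.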
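This groupby approach runs in linear time, since for each element it only checks whether the "group key" is the same as that of the previous element. As said before, however, the constraint is that the data must be ordered.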
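To illustrate why the ordering matters, the sketch below runs groupby on deliberately unsorted data and then on a sorted copy. The sample array and the names to_day, fragmented, and merged are assumptions for illustration; sorting with np.sort adds an O(n log n) step, so the overall cost is no longer linear.

```python
from itertools import groupby

import numpy as np


def to_day(ts):
    """Truncate a datetime64 scalar to day precision."""
    return ts.astype('datetime64[D]')


# Assumed sample data, deliberately out of order.
unsorted = np.array(
    ['2016-12-02T08:00', '2016-12-01T10:00', '2016-12-02T06:00', '2016-12-01T04:00'],
    dtype='datetime64[m]',
)

# Without sorting, the same day shows up as several separate groups.
fragmented = [str(day) for day, _ in groupby(unsorted, to_day)]
print(fragmented)  # ['2016-12-02', '2016-12-01', '2016-12-02', '2016-12-01']

# Sorting first restores one group per day.
merged = [str(day) for day, _ in groupby(np.sort(unsorted), to_day)]
print(merged)  # ['2016-12-01', '2016-12-02']
```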