Basics of Time Series Operations

Import the required libraries

import numpy as np
import pandas as pd
from pandas import Series,DataFrame

Import the datetime class from the datetime module

from datetime import datetime

Creating datetime objects

Create a single datetime by passing the year, month, and day as arguments

t1=datetime(2009,10,20)
t1
datetime.datetime(2009, 10, 20, 0, 0)

Create several datetime objects and collect them in a list

date_list=[
    datetime(2016,9,1),
    datetime(2016,9,10),
    datetime(2017,9,1),
    datetime(2017,9,20),
    datetime(2017,10,1)
]
date_list
[datetime.datetime(2016, 9, 1, 0, 0),
 datetime.datetime(2016, 9, 10, 0, 0),
 datetime.datetime(2017, 9, 1, 0, 0),
 datetime.datetime(2017, 9, 20, 0, 0),
 datetime.datetime(2017, 10, 1, 0, 0)]

Create a Series that uses date_list as its index; pandas converts the list of datetime objects into a DatetimeIndex

s1=Series(np.random.rand(5),index=date_list)
s1
2016-09-01    0.886547
2016-09-10    0.642827
2017-09-01    0.926886
2017-09-20    0.187911
2017-10-01    0.277650
dtype: float64
s1.values
array([0.88654715, 0.64282712, 0.92688552, 0.18791143, 0.27764988])
s1.index
DatetimeIndex(['2016-09-01', '2016-09-10', '2017-09-01', '2017-09-20',
               '2017-10-01'],
              dtype='datetime64[ns]', freq=None)
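
The conversion from a plain list of datetime objects to a DatetimeIndex can be checked directly. The snippet below is a minimal sketch; the variable names (dates, index_type, first_label) and the seeded generator are illustrative assumptions, not part of the original notebook.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# A short list of plain Python datetime objects (illustrative values).
dates = [datetime(2016, 9, 1), datetime(2016, 9, 10), datetime(2017, 9, 1)]

# Seeded generator so the random values are reproducible between runs.
rng = np.random.default_rng(0)
s = pd.Series(rng.random(len(dates)), index=dates)

index_type = type(s.index).__name__   # class name of the resulting index
first_label = s.index[0]              # each label is a pandas Timestamp

print(index_type)
print(first_label, isinstance(first_label, datetime))
```

Because Timestamp subclasses datetime, the labels still compare equal to the datetime objects they were built from.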

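Accessing elements of a datetime-indexed Series

Access an element by integer position. Positions are zero-based, so 1 selects the second element, not the first. Newer pandas versions deprecate positional access through s1[1] on a non-integer index; s1.iloc[1] is the explicit form.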

s1[1]
0.6428271197631065

Pass a datetime object as the key

s1[datetime(2016,9,10)]
0.6428271197631065

Pass a date string directly

s1['2016-9-10']
0.6428271197631065

A compact date string without separators also works

s1['20160910']
0.6428271197631065
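
The four lookups above all resolve to the same element. The following is a small sketch with fixed values (10.0, 20.0, 30.0 are made up for illustration) that uses the explicit .iloc and .loc accessors, which should behave the same way across pandas versions.

```python
from datetime import datetime

import pandas as pd

idx = [datetime(2016, 9, 1), datetime(2016, 9, 10), datetime(2017, 9, 1)]
s = pd.Series([10.0, 20.0, 30.0], index=idx)

by_position = s.iloc[1]                     # zero-based position
by_datetime = s.loc[datetime(2016, 9, 10)]  # datetime object as label
by_string = s.loc["2016-9-10"]              # date string as label
by_compact = s.loc["20160910"]              # compact YYYYMMDD string

print(by_position, by_datetime, by_string, by_compact)
```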

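September 2016 contains two data points. Abbreviating the key to year and month with no separator ('201609') raises a KeyError in the pandas version used here: as the traceback shows, the date parser rejects the string with "month must be in 1..12".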

s1['201609']
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

I:\anaconda\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
   4410             try:
-> 4411                 return libindex.get_value_at(s, key)
   4412             except IndexError:


pandas\_libs\index.pyx in pandas._libs.index.get_value_at()


pandas\_libs\index.pyx in pandas._libs.index.get_value_at()


pandas\_libs\util.pxd in pandas._libs.util.get_value_at()


pandas\_libs\util.pxd in pandas._libs.util.validate_indexer()


TypeError: 'str' object cannot be interpreted as an integer


During handling of the above exception, another exception occurred:


KeyError                                  Traceback (most recent call last)

I:\anaconda\lib\site-packages\pandas\core\indexes\datetimes.py in get_value(self, series, key)
    650         try:
--> 651             value = Index.get_value(self, series, key)
    652         except KeyError:


I:\anaconda\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
   4418                 else:
-> 4419                     raise e1
   4420             except Exception:


I:\anaconda\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
   4404         try:
-> 4405             return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
   4406         except KeyError as e1:


pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()


pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_value()


pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine.get_loc()


pandas\_libs\index.pyx in pandas._libs.index.DatetimeEngine._date_check_type()


KeyError: '201609'


During handling of the above exception, another exception occurred:


ValueError                                Traceback (most recent call last)

I:\anaconda\lib\site-packages\dateutil\parser\_parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
    654         try:
--> 655             ret = self._build_naive(res, default)
    656         except ValueError as e:


I:\anaconda\lib\site-packages\dateutil\parser\_parser.py in _build_naive(self, res, default)
   1240 
-> 1241         naive = default.replace(**repl)
   1242 


ValueError: month must be in 1..12


The above exception was the direct cause of the following exception:


ParserError                               Traceback (most recent call last)

pandas\_libs\tslibs\conversion.pyx in pandas._libs.tslibs.conversion.convert_str_to_tsobject()


pandas\_libs\tslibs\parsing.pyx in pandas._libs.tslibs.parsing.parse_datetime_string()


I:\anaconda\lib\site-packages\dateutil\parser\_parser.py in parse(timestr, parserinfo, **kwargs)
   1373     else:
-> 1374         return DEFAULTPARSER.parse(timestr, **kwargs)
   1375 


I:\anaconda\lib\site-packages\dateutil\parser\_parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
    656         except ValueError as e:
--> 657             six.raise_from(ParserError(e.args[0] + ": %s", timestr), e)
    658 


I:\anaconda\lib\site-packages\six.py in raise_from(value, from_value)


ParserError: month must be in 1..12: 201609


During handling of the above exception, another exception occurred:


ValueError                                Traceback (most recent call last)

I:\anaconda\lib\site-packages\pandas\core\indexes\datetimes.py in get_value(self, series, key)
    659             try:
--> 660                 return self.get_value_maybe_box(series, key)
    661             except (TypeError, ValueError, KeyError):


I:\anaconda\lib\site-packages\pandas\core\indexes\datetimes.py in get_value_maybe_box(self, series, key)
    674         elif not isinstance(key, Timestamp):
--> 675             key = Timestamp(key)
    676         values = self._engine.get_value(com.values_from_object(series), key, tz=self.tz)


pandas\_libs\tslibs\timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.__new__()


pandas\_libs\tslibs\conversion.pyx in pandas._libs.tslibs.conversion.convert_to_tsobject()


pandas\_libs\tslibs\conversion.pyx in pandas._libs.tslibs.conversion.convert_str_to_tsobject()


ValueError: could not convert string to Timestamp


During handling of the above exception, another exception occurred:


KeyError                                  Traceback (most recent call last)

<ipython-input-16-190579de8e62> in <module>
----> 1 s1['201609']


I:\anaconda\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    869         key = com.apply_if_callable(key, self)
    870         try:
--> 871             result = self.index.get_value(self, key)
    872 
    873             if not is_scalar(result):


I:\anaconda\lib\site-packages\pandas\core\indexes\datetimes.py in get_value(self, series, key)
    660                 return self.get_value_maybe_box(series, key)
    661             except (TypeError, ValueError, KeyError):
--> 662                 raise KeyError(key)
    663         else:
    664             return com.maybe_box(self, value, series, key)


KeyError: '201609'

Writing the key as 'YYYY-MM' returns the data for September 2016

s1['2016-09']
2016-09-01    0.886547
2016-09-10    0.642827
dtype: float64
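
If the month keys arrive in the compact six-digit form, one option is to insert the separator before indexing. The helper month_key below is hypothetical (it is not a pandas function), and the sample values are invented for illustration.

```python
import pandas as pd


def month_key(compact: str) -> str:
    """Turn 'YYYYMM' into 'YYYY-MM' (hypothetical helper, not a pandas API)."""
    if len(compact) != 6 or not compact.isdigit():
        raise ValueError(f"expected six digits, got {compact!r}")
    return f"{compact[:4]}-{compact[4:]}"


idx = pd.to_datetime(["2016-09-01", "2016-09-10", "2017-09-01"])
s = pd.Series([1.0, 2.0, 3.0], index=idx)

# 'YYYY-MM' selects every row that falls inside that month.
september_2016 = s.loc[month_key("201609")]
print(september_2016)
```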

Return the data for September 2017

s1['2017-09']
2017-09-01    0.926886
2017-09-20    0.187911
dtype: float64

Return all data for 2016

s1['2016']
2016-09-01    0.886547
2016-09-10    0.642827
dtype: float64

Return all data for 2017

s1['2017']
2017-09-01    0.926886
2017-09-20    0.187911
2017-10-01    0.277650
dtype: float64
s1
2016-09-01    0.886547
2016-09-10    0.642827
2017-09-01    0.926886
2017-09-20    0.187911
2017-10-01    0.277650
dtype: float64
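
Partial date strings can also be used as slice bounds. The sketch below assumes a sorted index and made-up values; the end bound '2017-09' covers the whole of that month, so both September 2017 rows are included.

```python
import pandas as pd

idx = pd.to_datetime(
    ["2016-09-01", "2016-09-10", "2017-09-01", "2017-09-20", "2017-10-01"]
)
s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# From 10 September 2016 through the end of September 2017 (both inclusive).
window = s.loc["2016-09-10":"2017-09"]

# A bare year selects every row in that year.
year_2017 = s.loc["2017"]

print(window)
print(year_2017)
```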

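Use date_range to generate the datetimes within a time range

start and end set the first and last timestamps of the range, periods is the number of timestamps to generate, and freq is the step size, which defaults to 'D' (one day).
The call below produces 100 consecutive days starting at 2016-01-01.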

date_list_new=pd.date_range('2016-01-01', periods=100)
date_list_new

DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06', '2016-01-07', '2016-01-08',
               '2016-01-09', '2016-01-10', '2016-01-11', '2016-01-12',
               '2016-01-13', '2016-01-14', '2016-01-15', '2016-01-16',
               '2016-01-17', '2016-01-18', '2016-01-19', '2016-01-20',
               '2016-01-21', '2016-01-22', '2016-01-23', '2016-01-24',
               '2016-01-25', '2016-01-26', '2016-01-27', '2016-01-28',
               '2016-01-29', '2016-01-30', '2016-01-31', '2016-02-01',
               '2016-02-02', '2016-02-03', '2016-02-04', '2016-02-05',
               '2016-02-06', '2016-02-07', '2016-02-08', '2016-02-09',
               '2016-02-10', '2016-02-11', '2016-02-12', '2016-02-13',
               '2016-02-14', '2016-02-15', '2016-02-16', '2016-02-17',
               '2016-02-18', '2016-02-19', '2016-02-20', '2016-02-21',
               '2016-02-22', '2016-02-23', '2016-02-24', '2016-02-25',
               '2016-02-26', '2016-02-27', '2016-02-28', '2016-02-29',
               '2016-03-01', '2016-03-02', '2016-03-03', '2016-03-04',
               '2016-03-05', '2016-03-06', '2016-03-07', '2016-03-08',
               '2016-03-09', '2016-03-10', '2016-03-11', '2016-03-12',
               '2016-03-13', '2016-03-14', '2016-03-15', '2016-03-16',
               '2016-03-17', '2016-03-18', '2016-03-19', '2016-03-20',
               '2016-03-21', '2016-03-22', '2016-03-23', '2016-03-24',
               '2016-03-25', '2016-03-26', '2016-03-27', '2016-03-28',
               '2016-03-29', '2016-03-30', '2016-03-31', '2016-04-01',
               '2016-04-02', '2016-04-03', '2016-04-04', '2016-04-05',
               '2016-04-06', '2016-04-07', '2016-04-08', '2016-04-09'],
              dtype='datetime64[ns]', freq='D')
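
A range can be pinned down either by start plus periods or by start plus end. As a minimal sketch (the variable names are illustrative), the two calls below describe the same 100 days:

```python
import pandas as pd

# 100 daily timestamps counted from the start date.
by_periods = pd.date_range(start="2016-01-01", periods=100, freq="D")

# The same span described by its two endpoints instead.
by_end = pd.date_range(start="2016-01-01", end="2016-04-09", freq="D")

same = by_periods.equals(by_end)
print(len(by_periods), by_periods[-1], same)
```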

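If freq is changed to weekly, the range starts at 2016-01-03 rather than 2016-01-01. The weekly frequency is anchored on Sunday by default (the output reports it as W-SUN), and 2016-01-03 is the first Sunday on or after the start date.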

date_list_new=pd.date_range('2016-01-01', periods=100,freq='w')
date_list_new
DatetimeIndex(['2016-01-03', '2016-01-10', '2016-01-17', '2016-01-24',
               '2016-01-31', '2016-02-07', '2016-02-14', '2016-02-21',
               '2016-02-28', '2016-03-06', '2016-03-13', '2016-03-20',
               '2016-03-27', '2016-04-03', '2016-04-10', '2016-04-17',
               '2016-04-24', '2016-05-01', '2016-05-08', '2016-05-15',
               '2016-05-22', '2016-05-29', '2016-06-05', '2016-06-12',
               '2016-06-19', '2016-06-26', '2016-07-03', '2016-07-10',
               '2016-07-17', '2016-07-24', '2016-07-31', '2016-08-07',
               '2016-08-14', '2016-08-21', '2016-08-28', '2016-09-04',
               '2016-09-11', '2016-09-18', '2016-09-25', '2016-10-02',
               '2016-10-09', '2016-10-16', '2016-10-23', '2016-10-30',
               '2016-11-06', '2016-11-13', '2016-11-20', '2016-11-27',
               '2016-12-04', '2016-12-11', '2016-12-18', '2016-12-25',
               '2017-01-01', '2017-01-08', '2017-01-15', '2017-01-22',
               '2017-01-29', '2017-02-05', '2017-02-12', '2017-02-19',
               '2017-02-26', '2017-03-05', '2017-03-12', '2017-03-19',
               '2017-03-26', '2017-04-02', '2017-04-09', '2017-04-16',
               '2017-04-23', '2017-04-30', '2017-05-07', '2017-05-14',
               '2017-05-21', '2017-05-28', '2017-06-04', '2017-06-11',
               '2017-06-18', '2017-06-25', '2017-07-02', '2017-07-09',
               '2017-07-16', '2017-07-23', '2017-07-30', '2017-08-06',
               '2017-08-13', '2017-08-20', '2017-08-27', '2017-09-03',
               '2017-09-10', '2017-09-17', '2017-09-24', '2017-10-01',
               '2017-10-08', '2017-10-15', '2017-10-22', '2017-10-29',
               '2017-11-05', '2017-11-12', '2017-11-19', '2017-11-26'],
              dtype='datetime64[ns]', freq='W-SUN')

Setting freq to 'w-mon' anchors the weeks on Monday instead, so the range starts at 2016-01-04

date_list_new=pd.date_range('2016-01-01', periods=100,freq='w-mon')
date_list_new
DatetimeIndex(['2016-01-04', '2016-01-11', '2016-01-18', '2016-01-25',
               '2016-02-01', '2016-02-08', '2016-02-15', '2016-02-22',
               '2016-02-29', '2016-03-07', '2016-03-14', '2016-03-21',
               '2016-03-28', '2016-04-04', '2016-04-11', '2016-04-18',
               '2016-04-25', '2016-05-02', '2016-05-09', '2016-05-16',
               '2016-05-23', '2016-05-30', '2016-06-06', '2016-06-13',
               '2016-06-20', '2016-06-27', '2016-07-04', '2016-07-11',
               '2016-07-18', '2016-07-25', '2016-08-01', '2016-08-08',
               '2016-08-15', '2016-08-22', '2016-08-29', '2016-09-05',
               '2016-09-12', '2016-09-19', '2016-09-26', '2016-10-03',
               '2016-10-10', '2016-10-17', '2016-10-24', '2016-10-31',
               '2016-11-07', '2016-11-14', '2016-11-21', '2016-11-28',
               '2016-12-05', '2016-12-12', '2016-12-19', '2016-12-26',
               '2017-01-02', '2017-01-09', '2017-01-16', '2017-01-23',
               '2017-01-30', '2017-02-06', '2017-02-13', '2017-02-20',
               '2017-02-27', '2017-03-06', '2017-03-13', '2017-03-20',
               '2017-03-27', '2017-04-03', '2017-04-10', '2017-04-17',
               '2017-04-24', '2017-05-01', '2017-05-08', '2017-05-15',
               '2017-05-22', '2017-05-29', '2017-06-05', '2017-06-12',
               '2017-06-19', '2017-06-26', '2017-07-03', '2017-07-10',
               '2017-07-17', '2017-07-24', '2017-07-31', '2017-08-07',
               '2017-08-14', '2017-08-21', '2017-08-28', '2017-09-04',
               '2017-09-11', '2017-09-18', '2017-09-25', '2017-10-02',
               '2017-10-09', '2017-10-16', '2017-10-23', '2017-10-30',
               '2017-11-06', '2017-11-13', '2017-11-20', '2017-11-27'],
              dtype='datetime64[ns]', freq='W-MON')
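
The weekday each weekly alias lands on can be confirmed with day_name(). This is a short sketch using the uppercase aliases 'W' and 'W-MON'; only three periods are generated to keep the output small.

```python
import pandas as pd

sundays = pd.date_range("2016-01-01", periods=3, freq="W")      # same as W-SUN
mondays = pd.date_range("2016-01-01", periods=3, freq="W-MON")

# day_name() returns the weekday name of every timestamp in the index.
sunday_names = list(sundays.day_name())
monday_names = list(mondays.day_name())

print(sundays[0], sunday_names)
print(mondays[0], monday_names)
```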

Passing 'h' as freq gives a one-hour interval

date_list_new=pd.date_range('2016-01-01', periods=100,freq='h')
date_list_new
DatetimeIndex(['2016-01-01 00:00:00', '2016-01-01 01:00:00',
               '2016-01-01 02:00:00', '2016-01-01 03:00:00',
               '2016-01-01 04:00:00', '2016-01-01 05:00:00',
               '2016-01-01 06:00:00', '2016-01-01 07:00:00',
               '2016-01-01 08:00:00', '2016-01-01 09:00:00',
               '2016-01-01 10:00:00', '2016-01-01 11:00:00',
               '2016-01-01 12:00:00', '2016-01-01 13:00:00',
               '2016-01-01 14:00:00', '2016-01-01 15:00:00',
               '2016-01-01 16:00:00', '2016-01-01 17:00:00',
               '2016-01-01 18:00:00', '2016-01-01 19:00:00',
               '2016-01-01 20:00:00', '2016-01-01 21:00:00',
               '2016-01-01 22:00:00', '2016-01-01 23:00:00',
               '2016-01-02 00:00:00', '2016-01-02 01:00:00',
               '2016-01-02 02:00:00', '2016-01-02 03:00:00',
               '2016-01-02 04:00:00', '2016-01-02 05:00:00',
               '2016-01-02 06:00:00', '2016-01-02 07:00:00',
               '2016-01-02 08:00:00', '2016-01-02 09:00:00',
               '2016-01-02 10:00:00', '2016-01-02 11:00:00',
               '2016-01-02 12:00:00', '2016-01-02 13:00:00',
               '2016-01-02 14:00:00', '2016-01-02 15:00:00',
               '2016-01-02 16:00:00', '2016-01-02 17:00:00',
               '2016-01-02 18:00:00', '2016-01-02 19:00:00',
               '2016-01-02 20:00:00', '2016-01-02 21:00:00',
               '2016-01-02 22:00:00', '2016-01-02 23:00:00',
               '2016-01-03 00:00:00', '2016-01-03 01:00:00',
               '2016-01-03 02:00:00', '2016-01-03 03:00:00',
               '2016-01-03 04:00:00', '2016-01-03 05:00:00',
               '2016-01-03 06:00:00', '2016-01-03 07:00:00',
               '2016-01-03 08:00:00', '2016-01-03 09:00:00',
               '2016-01-03 10:00:00', '2016-01-03 11:00:00',
               '2016-01-03 12:00:00', '2016-01-03 13:00:00',
               '2016-01-03 14:00:00', '2016-01-03 15:00:00',
               '2016-01-03 16:00:00', '2016-01-03 17:00:00',
               '2016-01-03 18:00:00', '2016-01-03 19:00:00',
               '2016-01-03 20:00:00', '2016-01-03 21:00:00',
               '2016-01-03 22:00:00', '2016-01-03 23:00:00',
               '2016-01-04 00:00:00', '2016-01-04 01:00:00',
               '2016-01-04 02:00:00', '2016-01-04 03:00:00',
               '2016-01-04 04:00:00', '2016-01-04 05:00:00',
               '2016-01-04 06:00:00', '2016-01-04 07:00:00',
               '2016-01-04 08:00:00', '2016-01-04 09:00:00',
               '2016-01-04 10:00:00', '2016-01-04 11:00:00',
               '2016-01-04 12:00:00', '2016-01-04 13:00:00',
               '2016-01-04 14:00:00', '2016-01-04 15:00:00',
               '2016-01-04 16:00:00', '2016-01-04 17:00:00',
               '2016-01-04 18:00:00', '2016-01-04 19:00:00',
               '2016-01-04 20:00:00', '2016-01-04 21:00:00',
               '2016-01-04 22:00:00', '2016-01-04 23:00:00',
               '2016-01-05 00:00:00', '2016-01-05 01:00:00',
               '2016-01-05 02:00:00', '2016-01-05 03:00:00'],
              dtype='datetime64[ns]', freq='H')

Passing '5h' as freq gives a five-hour interval

date_list_new=pd.date_range('2016-01-01', periods=100,freq='5h')
date_list_new
DatetimeIndex(['2016-01-01 00:00:00', '2016-01-01 05:00:00',
               '2016-01-01 10:00:00', '2016-01-01 15:00:00',
               '2016-01-01 20:00:00', '2016-01-02 01:00:00',
               '2016-01-02 06:00:00', '2016-01-02 11:00:00',
               '2016-01-02 16:00:00', '2016-01-02 21:00:00',
               '2016-01-03 02:00:00', '2016-01-03 07:00:00',
               '2016-01-03 12:00:00', '2016-01-03 17:00:00',
               '2016-01-03 22:00:00', '2016-01-04 03:00:00',
               '2016-01-04 08:00:00', '2016-01-04 13:00:00',
               '2016-01-04 18:00:00', '2016-01-04 23:00:00',
               '2016-01-05 04:00:00', '2016-01-05 09:00:00',
               '2016-01-05 14:00:00', '2016-01-05 19:00:00',
               '2016-01-06 00:00:00', '2016-01-06 05:00:00',
               '2016-01-06 10:00:00', '2016-01-06 15:00:00',
               '2016-01-06 20:00:00', '2016-01-07 01:00:00',
               '2016-01-07 06:00:00', '2016-01-07 11:00:00',
               '2016-01-07 16:00:00', '2016-01-07 21:00:00',
               '2016-01-08 02:00:00', '2016-01-08 07:00:00',
               '2016-01-08 12:00:00', '2016-01-08 17:00:00',
               '2016-01-08 22:00:00', '2016-01-09 03:00:00',
               '2016-01-09 08:00:00', '2016-01-09 13:00:00',
               '2016-01-09 18:00:00', '2016-01-09 23:00:00',
               '2016-01-10 04:00:00', '2016-01-10 09:00:00',
               '2016-01-10 14:00:00', '2016-01-10 19:00:00',
               '2016-01-11 00:00:00', '2016-01-11 05:00:00',
               '2016-01-11 10:00:00', '2016-01-11 15:00:00',
               '2016-01-11 20:00:00', '2016-01-12 01:00:00',
               '2016-01-12 06:00:00', '2016-01-12 11:00:00',
               '2016-01-12 16:00:00', '2016-01-12 21:00:00',
               '2016-01-13 02:00:00', '2016-01-13 07:00:00',
               '2016-01-13 12:00:00', '2016-01-13 17:00:00',
               '2016-01-13 22:00:00', '2016-01-14 03:00:00',
               '2016-01-14 08:00:00', '2016-01-14 13:00:00',
               '2016-01-14 18:00:00', '2016-01-14 23:00:00',
               '2016-01-15 04:00:00', '2016-01-15 09:00:00',
               '2016-01-15 14:00:00', '2016-01-15 19:00:00',
               '2016-01-16 00:00:00', '2016-01-16 05:00:00',
               '2016-01-16 10:00:00', '2016-01-16 15:00:00',
               '2016-01-16 20:00:00', '2016-01-17 01:00:00',
               '2016-01-17 06:00:00', '2016-01-17 11:00:00',
               '2016-01-17 16:00:00', '2016-01-17 21:00:00',
               '2016-01-18 02:00:00', '2016-01-18 07:00:00',
               '2016-01-18 12:00:00', '2016-01-18 17:00:00',
               '2016-01-18 22:00:00', '2016-01-19 03:00:00',
               '2016-01-19 08:00:00', '2016-01-19 13:00:00',
               '2016-01-19 18:00:00', '2016-01-19 23:00:00',
               '2016-01-20 04:00:00', '2016-01-20 09:00:00',
               '2016-01-20 14:00:00', '2016-01-20 19:00:00',
               '2016-01-21 00:00:00', '2016-01-21 05:00:00',
               '2016-01-21 10:00:00', '2016-01-21 15:00:00'],
              dtype='datetime64[ns]', freq='5H')
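
A number in front of the alias multiplies the step. As a quick check (a sketch, with illustrative variable names), the gap between consecutive timestamps can be computed by subtracting the index from a shifted copy of itself:

```python
import pandas as pd

stamps = pd.date_range("2016-01-01", periods=100, freq="5h")

# Element-wise difference between neighbours gives a TimedeltaIndex.
steps = stamps[1:] - stamps[:-1]
uniform = bool((steps == pd.Timedelta(hours=5)).all())

print(steps[0], uniform, stamps[-1])
```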

Use date_list_new as the index of a new Series, which gives a Series indexed by a regular time series

s2=Series(np.random.rand(100),index=date_list_new)
s2
2016-01-01 00:00:00    0.895959
2016-01-01 05:00:00    0.392156
2016-01-01 10:00:00    0.650885
2016-01-01 15:00:00    0.504900
2016-01-01 20:00:00    0.484126
                         ...   
2016-01-20 19:00:00    0.133861
2016-01-21 00:00:00    0.135461
2016-01-21 05:00:00    0.338000
2016-01-21 10:00:00    0.813742
2016-01-21 15:00:00    0.588442
Freq: 5H, Length: 100, dtype: float64
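
With a regular time index in place, the partial string lookups shown earlier work at finer resolutions too, and the data can be aggregated per day. The sketch below rebuilds a comparable Series with a seeded generator (an assumption, so the values differ from the output above) and uses resample, which the original text does not cover.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", periods=100, freq="5h")
rng = np.random.default_rng(0)  # seeded for reproducibility
s = pd.Series(rng.random(100), index=idx)

# A day-level string selects every 5-hourly sample on that day.
first_day = s.loc["2016-01-01"]

# Group the samples into calendar days and average each group.
daily_mean = s.resample("D").mean()

print(first_day)
print(daily_mean.head())
```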