python 时间序列异常值,Python中的时间序列中缺少值

I have a time series dataframe, the dataframe is quite big and contain some missing values in the 2 columns('Humidity' and 'Pressure'). I would like to impute this missing values in a clever way, for example using the value of the nearest neighbor or the average of the previous and following timestamp.Is there an easy way to do it? I have tried with fancyimpute but the dataset contain around 180000 examples and give a memory error f521d62e0d09e1a9adccda284cbaab81.png

解决方案

Consider interpolate (documentation). This example shows how to fill gaps of any size with a straight line:

df = pd.DataFrame({'date': pd.date_range(start='2013-01-01', periods=10, freq='H'), 'value': range(10)})

df.loc[2:3, 'value'] = np.nan

df.loc[6, 'value'] = np.nan

df

date value

0 2013-01-01 00:00:00 0.0

1 2013-01-01 01:00:00 1.0

2 2013-01-01 02:00:00 NaN

3 2013-01-01 03:00:00 NaN

4 2013-01-01 04:00:00 4.0

5 2013-01-01 05:00:00 5.0

6 2013-01-01 06:00:00 NaN

7 2013-01-01 07:00:00 7.0

8 2013-01-01 08:00:00 8.0

9 2013-01-01 09:00:00 9.0

df['value'].interpolate(method='linear', inplace=True)

date value

0 2013-01-01 00:00:00 0.0

1 2013-01-01 01:00:00 1.0

2 2013-01-01 02:00:00 2.0

3 2013-01-01 03:00:00 3.0

4 2013-01-01 04:00:00 4.0

5 2013-01-01 05:00:00 5.0

6 2013-01-01 06:00:00 6.0

7 2013-01-01 07:00:00 7.0

8 2013-01-01 08:00:00 8.0

9 2013-01-01 09:00:00 9.0

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值