Cubic spline Python_Python: cubic spline regression for time series data

I have the data as shown below. I want to find a CUBIC SPLINE curve that fits the entire data set (link to sample data).

Things I've tried so far:

I've gone through scipy's cubic spline functions, but all of them seem to give results for a single time only, whereas I want a single curve for the entire time range.

I plotted a graph by averaging the spline coefficients generated by scipy.interpolate.splrep for 4 knots, but the results were not good and did not serve my purpose.

Things that can help me:

An idea about how to optimize the number and position of knots for a better fit

Failing that, help finding the exact polynomial coefficients of the cubic splines for a given number of knots.

If someone can suggest a complete way to solve this problem.
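The first two points can be sketched roughly as follows. This is only an illustration under stated assumptions, not a method taken from the question: the data are synthetic, the helper names `fit_with_knots` and `pick_knot_count` are hypothetical, interior knots are placed at quantiles of x, and the knot count is chosen by BIC (one common heuristic among several). The scipy calls `splrep` (with `task=-1` for a least-squares fit on given interior knots), `splev`, and `PPoly.from_spline` are used to fit, evaluate, and extract per-interval polynomial coefficients.

```python
import numpy as np
from scipy.interpolate import PPoly, splev, splrep


def fit_with_knots(x, y, n_interior):
    """Least-squares cubic spline with interior knots at quantiles of x."""
    q = np.linspace(0.0, 1.0, n_interior + 2)[1:-1]  # drop the two endpoints
    knots = np.quantile(x, q)
    # task=-1: fit by least squares using exactly these interior knots
    return splrep(x, y, k=3, task=-1, t=knots)


def pick_knot_count(x, y, candidates):
    """Return (n_interior, tck) with the lowest BIC among the candidates."""
    n = len(x)
    best = None
    for m in candidates:
        tck = fit_with_knots(x, y, m)
        rss = float(np.sum((y - splev(x, tck)) ** 2))
        # A cubic spline with m interior knots has m + 4 free parameters
        bic = n * np.log(rss / n) + (m + 4) * np.log(n)
        if best is None or bic < best[0]:
            best = (bic, m, tck)
    return best[1], best[2]


# Synthetic data standing in for the real series (assumption)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 200)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)

n_knots, tck = pick_knot_count(x, y, range(1, 15))

# Convert the B-spline to piecewise-polynomial form to read the coefficients
pp = PPoly.from_spline(tck)
nonempty = np.diff(pp.x) > 0          # skip zero-length intervals at the ends
breaks = pp.x[:-1][nonempty]          # left edge of each interval
coeffs = pp.c[:, nonempty].T          # one row [a3, a2, a1, a0] per interval
```

On the interval starting at `breaks[i]`, the curve is `a3*(x - b)**3 + a2*(x - b)**2 + a1*(x - b) + a0` with `b = breaks[i]` and the coefficients taken from `coeffs[i]`. The single `tck` object describes the curve over the whole range, so `splev(x_new, tck)` can be evaluated at any set of times.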

Solution

I made a 3D scatterplot of the data, converting the timestamps to "elapsed time in seconds" from the first timestamp; the image is below. It appears to me that the data has a sort of 3D equivalent of an outlier, here shown as an entire line of data that lies considerably below most of the other data. This will make creating a 3D surface fit of any kind difficult.
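The two preprocessing steps mentioned above (converting timestamps to elapsed seconds, and spotting points that sit far below the rest) might look something like this. It is a minimal sketch, not the code used for the plot: the column names `timestamp` and `value` and the sample rows are hypothetical, and the outlier rule (median absolute deviation with a 3.5 cutoff) is an assumption chosen for illustration.

```python
import pandas as pd

# Hypothetical sample; column names and values are illustrative only
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2020-01-01 00:00:00", "2020-01-01 00:00:30",
        "2020-01-01 00:01:00", "2020-01-01 00:01:30",
        "2020-01-01 00:02:00", "2020-01-01 00:02:30",
    ]),
    "value": [10.0, 10.4, 10.9, 2.0, 11.8, 12.1],
})

# Elapsed time in seconds since the first timestamp
df["elapsed_s"] = (df["timestamp"] - df["timestamp"].iloc[0]).dt.total_seconds()

# Robust outlier flag: distance from the median, scaled by the MAD
median = df["value"].median()
mad = (df["value"] - median).abs().median()
df["is_outlier"] = (df["value"] - median).abs() > 3.5 * 1.4826 * mad

# Rows kept for any subsequent spline or surface fit
clean = df.loc[~df["is_outlier"]]
```

A global median-based rule like this is crude for data with a strong trend; in that case the same test could be applied to residuals from a first rough fit instead of to the raw values.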

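Cubic spline interpolation is one common way to fill in missing values: an interpolating function fitted to the observed points is used to estimate the missing ones. In Python this can be done with the pycubicspline library; the example below uses scipy.interpolate.CubicSpline, which provides the same kind of interpolating cubic spline. The steps are:

1. Create a time series data frame in which missing values are represented as NaN.
2. Fit a cubic spline to the observed values and evaluate it at every date to fill the gaps.
3. Plot the filled series to check the result.

Example code:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.interpolate import CubicSpline

# Example daily series; missing observations are stored as NaN
values = [10, 15, np.nan, 20, 25, np.nan, 30, 35, 40, 45,
          np.nan, 50, 55, np.nan, 60, 65, 70, np.nan, 75, 80,
          np.nan, 85, 90, 95, 100, np.nan, 105, 110, 115, 120]
data = pd.DataFrame({
    'date': pd.date_range(start='2022-01-01', periods=len(values)),
    'value': values,
})

# The spline needs a numeric x-axis, so convert dates to elapsed days
t = (data['date'] - data['date'].iloc[0]).dt.days.to_numpy(dtype=float)

# Fit on the observed points only, then evaluate at every date
known = data['value'].notna().to_numpy()
spline = CubicSpline(t[known], data.loc[known, 'value'].to_numpy())
data['value_filled'] = spline(t)

# Plot the filled series against the original observations
plt.plot(data['date'], data['value_filled'], label='Filled Values')
plt.plot(data['date'], data['value'], 'o', label='Original Values')
plt.xlabel('Date')
plt.ylabel('Value')
plt.title('Time Series with Missing Values')
plt.legend()
plt.show()
```

Cubic spline interpolation estimates the missing values from the existing data and yields a filled time series that is easier to inspect for features and trends. It is only one of several ways to handle missing values; depending on the situation and the characteristics of the data, another method may be more appropriate.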