Happy Learning Starts with Translation_How to Convert a Time Series to a Supervised Learning Problem in Python_2

Latest recommended article published 2024-04-17 12:20:32

dreamscape9999

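So frustrating. I spent ages translating this section and had just finished it when my account turned out to be logged out. Nothing was saved, so it was all lost! I am not translating this section again for now; I will come back and translate the parts I cannot follow when I reread it.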

The series_to_supervised() Function

We can use the shift() function in Pandas to automatically create new framings of time series problems given the desired length of input and output sequences.

This would be a useful tool as it would allow us to explore different framings of a time series problem with machine learning algorithms to see which might result in better performing models.
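As a quick illustration of the shift() behaviour mentioned above, here is a minimal sketch on a toy series. The column names "t", "t-1" and "t+1" are chosen only for this example and are not part of the original article.

```python
from pandas import DataFrame

# A toy univariate series: the values 0..4 in a single column named "t".
df = DataFrame({"t": list(range(5))})

# shift(1) pushes values down one row, so each row sees the previous observation.
df["t-1"] = df["t"].shift(1)
# shift(-1) pulls values up one row, so each row sees the next observation.
df["t+1"] = df["t"].shift(-1)

print(df)
```

The first row of "t-1" and the last row of "t+1" are NaN because there is no earlier or later observation to fill them, which is why the function below offers an option to drop such rows.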

In this section, we will define a new Python function named series_to_supervised() that takes a univariate or multivariate time series and frames it as a supervised learning dataset.

The function takes four arguments:

data: Sequence of observations as a list or 2D NumPy array. Required.
n_in: Number of lag observations as input (X). Values may be between [1..len(data)]. Optional. Defaults to 1.
n_out: Number of observations as output (y). Values may be between [0..len(data)-1]. Optional. Defaults to 1.
dropnan: Boolean whether or not to drop rows with NaN values. Optional. Defaults to True.

The function returns a single value:

return: Pandas DataFrame of series framed for supervised learning.

The new dataset is constructed as a DataFrame, with each column named by both variable number and time step. This allows you to design a variety of forecasting problems, with different numbers of input and output time steps, from a given univariate or multivariate time series.

Once the DataFrame is returned, you can decide how to split its columns into X and y components for supervised learning in any way you wish.
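One possible way to do that split is sketched below. The small framed DataFrame is made up for illustration and simply follows the column naming described above; treating every lag column as input and everything else as output is an assumption of this sketch, not a rule from the article.

```python
from pandas import DataFrame

# Hypothetical framed dataset using the naming convention described above:
# two lag inputs and one output for a single variable.
framed = DataFrame({
    "var1(t-2)": [0, 1, 2],
    "var1(t-1)": [1, 2, 3],
    "var1(t)": [2, 3, 4],
})

# Treat every lag column (name contains "(t-") as input and the rest as output.
input_cols = [c for c in framed.columns if "(t-" in c]
output_cols = [c for c in framed.columns if c not in input_cols]

X = framed[input_cols].values
y = framed[output_cols].values

print(X.shape, y.shape)  # (3, 2) (3, 1)
```

Selecting by column name rather than by position keeps the split readable when the number of lags or variables changes.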

The function is defined with default parameters so that if you call it with just your data, it will construct a DataFrame with t-1 as X and t as y.

The function is confirmed to be compatible with Python 2 and Python 3.

The complete function is listed below, including function comments.

from pandas import DataFrame
from pandas import concat


def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    """
    Frame a time series as a supervised learning dataset.
    Arguments:
        data: Sequence of observations as a list or NumPy array.
        n_in: Number of lag observations as input (X).
        n_out: Number of observations as output (y).
        dropnan: Boolean whether or not to drop rows with NaN values.
    Returns:
        Pandas DataFrame of series framed for supervised learning.
    """
    # build the DataFrame first so the variable count is also correct
    # for nested lists and 1D arrays
    df = DataFrame(data)
    n_vars = df.shape[1]
    cols, names = list(), list()
    # input sequence (t-n, ... t-1)
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += [('var%d(t-%d)' % (j + 1, i)) for j in range(n_vars)]
    # forecast sequence (t, t+1, ... t+n)
    for i in range(0, n_out):
        cols.append(df.shift(-i))
        if i == 0:
            names += [('var%d(t)' % (j + 1)) for j in range(n_vars)]
        else:
            names += [('var%d(t+%d)' % (j + 1, i)) for j in range(n_vars)]
    # put it all together
    agg = concat(cols, axis=1)
    agg.columns = names
    # drop rows with NaN values
    if dropnan:
        agg.dropna(inplace=True)
    return agg
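
A short usage sketch follows, under a few assumptions: the function is repeated here in condensed form only so the snippet runs on its own, and the toy inputs (a list of the values 0..9 and a two-column array) are invented for illustration.

```python
from numpy import array
from pandas import DataFrame, concat


def series_to_supervised(data, n_in=1, n_out=1, dropnan=True):
    # Condensed copy of the listing above so this snippet runs on its own.
    # n_vars is read from the DataFrame, which gives the same result as the
    # listing for flat lists and 2D arrays.
    df = DataFrame(data)
    n_vars = df.shape[1]
    cols, names = [], []
    for i in range(n_in, 0, -1):
        cols.append(df.shift(i))
        names += ['var%d(t-%d)' % (j + 1, i) for j in range(n_vars)]
    for i in range(n_out):
        cols.append(df.shift(-i))
        suffix = '(t)' if i == 0 else '(t+%d)' % i
        names += ['var%d%s' % (j + 1, suffix) for j in range(n_vars)]
    agg = concat(cols, axis=1)
    agg.columns = names
    return agg.dropna() if dropnan else agg


# Univariate series with the defaults: t-1 as input, t as output.
uni = series_to_supervised([x for x in range(10)])
print(uni.head(3))

# Hypothetical two-variable series, framed with two lags and one output step.
raw = array([[x, x + 50] for x in range(10)])
multi = series_to_supervised(raw, n_in=2, n_out=1)
print(list(multi.columns))
print(multi.shape)
```

With the defaults, the first row is dropped because its t-1 value is NaN, so ten observations yield nine rows. With two lags the first two rows are dropped, leaving eight rows and six columns for the two-variable case.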