Time Series Prediction with Multilayer Perceptrons

Time series prediction is a difficult problem both to frame and to address with machine learning. In this lesson you will discover how to develop neural network models for time series prediction in Python using the Keras deep learning library. After reading this lesson you will know:

  • About the airline passengers univariate time series prediction problem.
  • How to phrase time series prediction as a regression problem and develop a neural network model for it.
  • How to frame time series prediction with a time lag and develop a neural network model for it.

1.1 Problem Description: Time Series Prediction

The problem we are going to look at in this lesson is the international airline passengers prediction problem. Given a year and a month, the task is to predict the number of international airline passengers in units of 1,000. The data ranges from January 1949 to December 1960, or 12 years, with 144 observations. The dataset is available for free from the DataMarket webpage as a CSV download with the filename international-airline-passengers.csv. Below is a sample of the first few lines of the file.

 

We can load this dataset easily using the Pandas library. We are not interested in the date, given that each observation is separated by the same interval of one month. Therefore, when we load the dataset we can exclude the first column. The downloaded dataset also has footer information that we can exclude by setting the skipfooter argument of pandas.read_csv() to 3 for the 3 footer lines. Once loaded, we can easily plot the whole dataset. The code to load and plot the dataset is listed below.

# Load and Plot the Time Series dataset
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.read_csv('international-airline-passengers.csv',usecols=[1],engine='python',skipfooter=3,delimiter='[,]')
plt.plot(dataset)
plt.show()

You can see an upward trend in the plot. You can also see some periodicity to the dataset that probably corresponds to the northern hemisphere summer holiday period.

We are going to keep things simple and work with the data as-is. Normally, it is a good idea to investigate various data preparation techniques to rescale the data and to make it stationary.
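Although this lesson works with the raw values, the two preparation steps mentioned above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the short series below is a toy stand-in for the passenger counts, and the variable names (`series`, `scaled`, `diffed`, `restored`) are hypothetical rather than part of the lesson's listings. It shows min-max rescaling to the range 0 to 1 and first-order differencing, which is one common way to remove a trend.

```python
import numpy as np

# Toy stand-in for the passenger series (illustrative values only).
series = np.array([112, 118, 132, 129, 121, 135, 148, 148], dtype="float32")

# Min-max rescaling to [0, 1]; keep lo and hi so values can be mapped back later.
lo, hi = series.min(), series.max()
scaled = (series - lo) / (hi - lo)

# First-order differencing: diffed[t-1] = series[t] - series[t-1].
diffed = np.diff(series)

# Invert the differencing by cumulative summation from the first observation.
restored = np.concatenate(([series[0]], series[0] + np.cumsum(diffed)))

print(scaled.min(), scaled.max())
print(diffed)
print(restored)
```

If you try this on the real data, one reasonable choice is to compute `lo` and `hi` from the training portion only, so that no information from the test period leaks into the inputs.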

1.2 Multilayer Perceptron Regression

We will phrase the time series prediction problem as a regression problem: given the number of passengers (in units of thousands) this month, what is the number of passengers next month? We can write a simple function to convert our single column of data into a two-column dataset, with the first column containing this month's (t) passenger count and the second column containing next month's (t+1) passenger count, to be predicted. Before we get started, let's first import all of the functions and classes we intend to use.

# Multilayer Perceptron to Predict International Airline Passengers (t+1, given t)
# Import Classes and Functions
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import math
from keras.models import Sequential
from keras.layers import Dense
# fix random seed for reproducibility
seed = 7
np.random.seed(seed)

# Load the Time Series Dataset
# load the dataset
dataframe = pd.read_csv('international-airline-passengers.csv',usecols=[1],engine='python',skipfooter=3)
dataset = dataframe.values
dataset = dataset.astype('float32')

# Split Dataset into Train and Test
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:],dataset[train_size:len(dataset),:]
print(len(train),len(test))

# Function to Prepare Dataset For Modeling
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [],[]
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back),0]
        dataX.append(a)
        dataY.append(dataset[i + look_back,0])
    return np.array(dataX),np.array(dataY)

# Call Function to Prepare Dataset For Modeling
# reshape into X=t and Y=t+1
look_back = 1
trainX,trainY = create_dataset(train,look_back)
testX,testY = create_dataset(test, look_back)

# create and fit Multilayer Perceptron model
model = Sequential()
model.add(Dense(8, input_dim=look_back,activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error',optimizer='adam')
model.fit(trainX,trainY,epochs=200,batch_size=2,verbose=2)

# Evaluate the Fit Model
# Estimate model performance
trainScore = model.evaluate(trainX,trainY,verbose=0)
print('Train Score: %.2f MSE (%.2f RMSE)' % (trainScore, math.sqrt(trainScore)))
testScore = model.evaluate(testX, testY, verbose=0)
print('Test Score: %.2f MSE (%.2f RMSE)' %(testScore, math.sqrt(testScore)))

# Generate and Plot Predictions
# generate predictions for the training and test sets
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

# shift train predictions for plotting
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[look_back:len(trainPredict) + look_back,:] = trainPredict

# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1,:] = testPredict

# plot baseline and predictions
plt.plot(dataset)
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()

Running the model produces the following output.

Taking the square root of the performance estimates (giving the RMSE), we can see that the model has an average error of about 23 thousand passengers on the training dataset and about 48 thousand passengers on the test dataset.
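The square-root step can be made concrete with a small NumPy sketch. The helper name `rmse` and the toy values are assumptions made for illustration; they are not the lesson's actual predictions. Flattening both arrays matters because `model.predict()` returns a column of shape (n, 1), while the targets have shape (n,); subtracting them directly would broadcast to an (n, n) matrix and give a misleading score.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    # Flatten both inputs so a (n, 1) prediction array lines up with (n,) targets.
    y_true = np.asarray(y_true, dtype="float64").ravel()
    y_pred = np.asarray(y_pred, dtype="float64").ravel()
    mse = np.mean((y_true - y_pred) ** 2)
    return float(np.sqrt(mse))

# Toy targets and predictions (hypothetical values, in thousands of passengers).
y_true = np.array([100.0, 110.0, 120.0])
y_pred = np.array([[103.0], [106.0], [120.0]])  # column shape, like model.predict()

print(rmse(y_true, y_pred))
```

Because the RMSE is expressed in the units of the target, a value of 23 here reads directly as roughly 23 thousand passengers.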

1.3 Multilayer Perceptron Using the Window Method

We can also phrase the problem so that multiple recent time steps can be used to make the prediction for the next time step. This is called the window method, and the size of the window is a parameter that can be tuned for each problem. For example, given the current time (t) we want to predict the value at the next time in the sequence (t+1), we can use the current time (t) as well as the two prior times (t-1 and t-2). When phrased as a regression problem the input variables are t-2, t-1, t and the output variable is t+1.

The create_dataset() function we wrote in the previous section allows us to create this formulation of the time series problem by increasing the look_back argument from 1 to 3. A sample of the dataset with this formulation looks as follows:

X1 X2 X3 Y
112 118 132 129
118 132 129 121
132 129 121 135
129 121 135 148
121 135 148 148

Sample Dataset of the Window Formulation of the Problem
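The sample above can be reproduced with a minimal sketch of the windowing step. The function name `make_windows` is hypothetical, and the input is just the eight values that appear in the sample table, not the full dataset. Each row of `X` holds the values at t-2, t-1 and t, and the matching entry of `y` holds the value at t+1.

```python
import numpy as np

def make_windows(values, look_back=3):
    """Turn a 1-D series into (X, y) pairs using a sliding window."""
    values = np.asarray(values, dtype="float32")
    X, y = [], []
    # One sample per position where a full window and its next value both exist.
    for i in range(len(values) - look_back):
        X.append(values[i:i + look_back])   # inputs: look_back consecutive values
        y.append(values[i + look_back])     # target: the value that follows them
    return np.array(X), np.array(y)

# The eight values that appear in the sample table.
values = [112, 118, 132, 129, 121, 135, 148, 148]
X, y = make_windows(values, look_back=3)

for row, target in zip(X, y):
    print(row.astype(int), int(target))
```

Note that this sketch uses every available window, so it yields one more row than create_dataset() as listed in this lesson, whose loop bound subtracts an extra 1 and therefore stops one window early.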

We can re-run the example in the previous section with the larger window size. The whole code listing with just the window size change is listed below for completeness.

# Multilayer Perceptron to Predict International Airline Passengers (t+1, given t, t-1, t-2)
# Import Classes and Functions
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import math
from keras.models import Sequential
from keras.layers import Dense

# Function to Prepare Dataset For Modeling
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [],[]
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back),0]
        dataX.append(a)
        dataY.append(dataset[i + look_back,0])
    return np.array(dataX),np.array(dataY)

# fix random seed for reproducibility
seed = 7
np.random.seed(seed)

# Load the Time Series Dataset
# load the dataset
dataframe = pd.read_csv('international-airline-passengers.csv',usecols=[1],engine='python',skipfooter=3)
dataset = dataframe.values
dataset = dataset.astype('float32')

# Split Dataset into Train and Test
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:],dataset[train_size:len(dataset),:]
print(len(train),len(test))

# reshape dataset into X=(t-2, t-1, t) and Y=t+1
look_back = 3
trainX,trainY = create_dataset(train,look_back)
testX,testY = create_dataset(test, look_back)

# create and fit Multilayer Perceptron model
model = Sequential()
model.add(Dense(8, input_dim=look_back,activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error',optimizer='adam')
model.fit(trainX,trainY,epochs=200,batch_size=2,verbose=2)

# Estimate model performance
trainScore = model.evaluate(trainX,trainY,verbose=0)
print('Train Score: %.2f MSE (%.2f RMSE)' % (trainScore, math.sqrt(trainScore)))
testScore = model.evaluate(testX, testY, verbose=0)
print('Test Score: %.2f MSE (%.2f RMSE)' %(testScore, math.sqrt(testScore)))

# generate predictions for the training and test sets
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)

# shift train predictions for plotting
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[look_back:len(trainPredict) + look_back,:] = trainPredict

# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1,:] = testPredict

# plot baseline and predictions
plt.plot(dataset)
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()

Running the example provides the following output.

We can see that the error was reduced slightly compared to that of the previous section. Again, the window size and the network architecture were not tuned; this is just a demonstration of how to frame a prediction problem. Taking the square root of the performance scores, we can see the average error on the training dataset was about 22 thousand passengers per month and the average error on the unseen test set was about 47 thousand passengers per month.

Prediction of the Number of Passengers using a Simple Multilayer Perceptron Model With Time Lag. Blue=Whole Dataset, Green=Training, Red=Predictions.

1.4 Summary

In this lesson you discovered how to develop a neural network model for a time series prediction problem using the Keras deep learning library. After working through this tutorial you now know:

  • About the international airline passenger prediction time series dataset.
  • How to frame time series prediction problems as regression problems and develop a neural network model.
  • How to use the window approach to frame a time series prediction problem and develop a neural network model.