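I have spent the last few days studying LSTMs and wanted to use one for electric load forecasting. I first tried to build the model in TensorFlow but could not get it debugged, which nearly drove me crazy. In the end I switched to Keras, where the overall structure of an LSTM model is very clear and far less code is needed. I am sharing the program here so we can learn from each other.
The model here is a univariate LSTM. The data is a simple m*1 array, where m is the number of samples, and the code runs in Jupyter Notebook.
1. Goal: use an LSTM to predict the next hour's electric load from the current hour's load
2. Reference: https://machinelearningmastery.com/multivariate-time-series-forecasting-lstms-keras/
3. Import packages (a few of them are not used)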
import numpy as np
from math import sqrt
from scipy.io import loadmat
from sklearn.metrics import mean_absolute_error,mean_squared_error
from sklearn.preprocessing import MinMaxScaler
from keras.models import load_model, Model
from keras.layers import Dense, Activation, Dropout, Input, LSTM, Reshape, Lambda, RepeatVector
from keras.initializers import glorot_uniform
from keras.utils import to_categorical
from keras.optimizers import Adam
from keras import backend as K
from keras.models import Sequential
import matplotlib.pyplot as plt
from numpy import concatenate
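4. Load the dataset (adjust this part to match how your own data is stored)
X = loadmat(r"file_location")  # the file is in .mat format
# read the data (stored as a struct)
data_all = X['data']
data = data_all[0,0]['SYSLoad']  # load values at hourly resolution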
5. Prepare the data
scaler_for_x=MinMaxScaler(feature_range=(0,1))
data_ = scaler_for_x.fit_transform(data)  # min-max normalization to [0, 1]
normalize_data = data_[0:50000,:]  # training set: 50000 hours
normalize_data_test = data_[50000:50500,:]  # test set: 500 hours
print(normalize_data_test.shape)
# The training and test labels are the load one hour after each input, i.e. the current hour is used to predict the next hour
train_x, train_y = normalize_data[:, :], data_[1:50001, -1]
test_x, test_y = normalize_data_test[:, :], data_[50001:50501, -1]
# reshape input to be 3D [samples, timesteps, features]
train_x = train_x.reshape((train_x.shape[0], 1, train_x.shape[1]))
test_x = test_x.reshape((test_x.shape[0], 1, test_x.shape[1]))
print(train_x.shape, train_y.shape, test_x.shape, test_y.shape)
Result: (50000, 1, 1) (50000,) (500, 1, 1) (500,)
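The preparation steps above can be sketched on a small synthetic series using NumPy only. The sine-shaped `load` array and the split index `n_train` are made up for illustration and are not the author's data. Unlike the code above, this sketch takes the scaling range from the training rows only, which is one common way to keep test data out of the scaler.

```python
import numpy as np

# Hypothetical hourly load series with shape (m, 1); not the author's data.
hours = np.arange(200)
load = (1000.0 + 200.0 * np.sin(2 * np.pi * hours / 24)).reshape(-1, 1)

n_train = 150  # assumed split point for this toy series

# Min-max scaling, with the range taken from the training rows only.
lo = load[:n_train].min()
hi = load[:n_train].max()
scaled = (load - lo) / (hi - lo)

# One-step-ahead framing: the input is hour t, the target is hour t+1.
x_all = scaled[:-1]    # hours 0 .. m-2
y_all = scaled[1:, 0]  # hours 1 .. m-1

x_train, y_train = x_all[:n_train], y_all[:n_train]
x_test, y_test = x_all[n_train:], y_all[n_train:]

# Keras LSTM layers expect input shaped [samples, timesteps, features].
x_train = x_train.reshape((x_train.shape[0], 1, x_train.shape[1]))
x_test = x_test.reshape((x_test.shape[0], 1, x_test.shape[1]))

print(x_train.shape, y_train.shape, x_test.shape, y_test.shape)
```

Building the inputs and targets from one shifted array makes it easy to check that each target really is the value that follows its input.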
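6. Build the model
def create_model():
    model = Sequential()
    # The input data has shape (n_samples, timesteps, features)
    # 50 hidden units; input_shape is (timesteps, features), which is (1, 1) here
    model.add(LSTM(units=50, input_shape=(train_x.shape[1], train_x.shape[2])))
    # A fully connected layer follows and outputs a single value, so units is 1
    model.add(Dense(units=1))
    model.add(Activation('linear'))  # linear activation
    # Mean squared error loss, Adam optimizer, learning rate 0.01
    # (newer Keras versions name this argument learning_rate instead of lr)
    model.compile(loss='mse', optimizer=Adam(lr=0.01))
    return model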
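To get a feel for the size of this model, the number of trainable parameters can be worked out by hand. The sketch below uses the standard formula for an LSTM layer with biases (four gates, each with an input kernel, a recurrent kernel, and a bias). The helper names are hypothetical, and the totals should agree with what `model.summary()` reports, though that is worth confirming in your own environment.

```python
def lstm_param_count(units, features):
    # Four gates (input, forget, cell candidate, output), each with an
    # input kernel, a recurrent kernel, and a bias vector.
    return 4 * (units * features + units * units + units)


def dense_param_count(inputs, units):
    # One weight per input-output pair plus one bias per output unit.
    return inputs * units + units


lstm_params = lstm_param_count(units=50, features=1)
dense_params = dense_param_count(inputs=50, units=1)
print(lstm_params, dense_params, lstm_params + dense_params)
# prints: 10400 51 10451
```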
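7. Train the model and evaluate it
model = create_model()
history = model.fit(train_x, train_y, epochs=500, batch_size=72, validation_data=(test_x, test_y))  # train, evaluating on the test set after every epoch
[Output: per-epoch training log with loss and val_loss]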
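As a rough guide to how much work this `fit` call does, the number of weight updates follows from the sample count, batch size, and epoch count used above. This is plain arithmetic and assumes the final, smaller batch of each epoch is kept, which is the default behaviour when fitting on NumPy arrays.

```python
import math

n_samples = 50000  # training rows, as in the data preparation step
batch_size = 72
epochs = 500

steps_per_epoch = math.ceil(n_samples / batch_size)
last_batch_size = n_samples - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
# prints: 695 32 347500
```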
8. Plot the loss curves
# plot history
plt.plot(history.history['loss'], label='train')
plt.plot(history.history['val_loss'], label='test')
plt.legend()
plt.show()
[Figure: training and test loss curves]
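Besides plotting, the same `history.history` dictionary can be scanned for the epoch with the lowest validation loss, which helps judge whether 500 epochs are really needed. The helper `best_epoch` and the numbers in `toy_history` are hypothetical and only illustrate the idea. They are not results from the model above.

```python
def best_epoch(history_dict, metric="val_loss"):
    """Return (1-based epoch number, value) of the lowest recorded metric."""
    values = history_dict[metric]
    idx = min(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]


# Made-up loss values in the same layout as history.history.
toy_history = {
    "loss": [0.020, 0.010, 0.006, 0.005, 0.004],
    "val_loss": [0.018, 0.011, 0.008, 0.009, 0.010],
}

epoch, value = best_epoch(toy_history)
print(epoch, value)
# prints: 3 0.008
```

With a real run, the call would be `best_epoch(history.history)`.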
9. Predict and print the RMSE
# make a prediction
yhat = model.predict(test_x)
test_x = test_x.reshape((test_x.shape[0], test_x.shape[2]))
# invert scaling for forecast
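# axis=1 joins the arrays column-wise; with a single feature, test_x[:, 1:] is empty and the result is just yhat
inv_yhat = concatenate((yhat, test_x[:, 1:]), axis=1)
inv_yhat = scaler_for_x.inverse_transform(inv_yhat)  # convert from the 0~1 range back to real load values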
inv_yhat = inv_yhat[:,0]
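# invert scaling for actual
# The actual values are the next-hour targets test_y, not the inputs test_x
test_y = test_y.reshape((len(test_y), 1))
inv_y = scaler_for_x.inverse_transform(test_y)
inv_y = inv_y[:,0]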
# calculate RMSE
rmse = sqrt(mean_squared_error(inv_y, inv_yhat))
print('Test RMSE: %.3f' % rmse)
Result (the exact value will vary with the data and the random initialization):
Test RMSE: 80.416
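The inverse scaling step matters because an RMSE computed on the 0~1 values is not in load units. For min-max scaling to the range (0, 1), the inverse is `x * (max - min) + min`, so the RMSE in original units is the scaled RMSE multiplied by `(max - min)`. The sketch below checks this with NumPy. The `data_min`, `data_max`, and array values are made up for illustration.

```python
import numpy as np


def minmax_inverse(scaled, data_min, data_max):
    # Inverse of min-max scaling to the range (0, 1).
    return scaled * (data_max - data_min) + data_min


def rmse(actual, predicted):
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# Hypothetical scaler range and scaled values.
data_min, data_max = 800.0, 1200.0
y_true_scaled = np.array([0.25, 0.50, 0.75, 1.00])
y_pred_scaled = np.array([0.30, 0.45, 0.80, 0.95])

rmse_scaled = rmse(y_true_scaled, y_pred_scaled)
rmse_units = rmse(
    minmax_inverse(y_true_scaled, data_min, data_max),
    minmax_inverse(y_pred_scaled, data_min, data_max),
)
print('%.3f %.3f' % (rmse_scaled, rmse_units))
# prints: 0.050 20.000
```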
10. Plot the actual and predicted curves
plt.figure(figsize=(24,8))
plt.plot(list(range(len(inv_y))), inv_y, color='b')
plt.plot(list(range(len(inv_yhat))), inv_yhat, color='r')
plt.show()
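[Figure: actual load (blue) and predicted load (red)]
Judging from the prediction curve and the RMSE, the one-step-ahead predictions follow the actual load quite well.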
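One way to put an RMSE like this in context is a persistence baseline, which simply predicts that the next hour's load equals the current hour's load. The sketch below computes that baseline with NumPy on a made-up series. The `toy_load` values are hypothetical, and whether the LSTM beats this baseline on the real data would have to be checked by running the comparison on that data.

```python
import numpy as np


def rmse(actual, predicted):
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def persistence_rmse(series):
    # The forecast for hour t+1 is the observed value at hour t.
    series = np.asarray(series, dtype=float)
    return rmse(series[1:], series[:-1])


# Hypothetical hourly loads, in the same units as the real data.
toy_load = np.array([1000.0, 1040.0, 1100.0, 1080.0, 1020.0])
print('Persistence RMSE: %.3f' % persistence_rmse(toy_load))
# prints: Persistence RMSE: 47.958
```

If the model's RMSE is not clearly below the persistence RMSE on the same test hours, the network may have learned little beyond copying its input.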