Building an LSTM and Three Variant Networks to Predict Daily New COVID-19 Cases

ZZM丶

Tags: tensorflow, python, deep learning

Article link: https://blog.csdn.net/qq_41833526/article/details/115383751

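Rewind to April 2020: the pandemic was raging, most people were taking classes or working from home, and I was stuck at home attending online lectures too. With time on my hands, I decided to try forecasting the epidemic. Italy's outbreak was escalating sharply that April, with confirmed cases surging, so I chose the Italian data. This post shares what I learned and built last year; it also serves as my own notes for later review.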

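Main text:
The goal is to predict Italy's daily new confirmed COVID-19 cases with an LSTM network and three of its variants. The difficulty is that the training set is small, so setting the target error too low or training for too many epochs easily causes overfitting. A good result is one where the network captures the trend of the training outputs. It does not need to fit every day's number precisely; otherwise generalization may suffer and the test error may actually grow.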
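
One common guard against training for too many epochs is early stopping: hold out a few points for validation and stop once the validation loss stops improving. The article itself does not do this (it trains for a fixed number of epochs), so the helper below is only a sketch of the idea. The function name, the patience value and the loss numbers are all made up for illustration. Keras offers the same behavior through the tf.keras.callbacks.EarlyStopping callback.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss  # new best loss: reset the patience counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran through every epoch


# Made-up validation losses: they improve, then drift upward (overfitting).
losses = [0.50, 0.30, 0.20, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20]
stop = early_stopping_epoch(losses, patience=3)
print("stop at epoch index", stop)
```

With so few data points, the validation split itself is costly, which is part of why a fixed, modest epoch count is a reasonable choice here.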

We start with TensorFlow 2. First, import the required packages.

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM, RNN, GRU
from tensorflow.keras.experimental import PeepholeLSTMCell
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import time
import math

All of the data used is shown below:

Figure: Italy epidemic data

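Dataset split: the data are Italy's cumulative confirmed cases for each day from February 25 to April 3, so each day's count must be subtracted from the next day's to obtain the daily new cases, 39 data points in total. Days 1 to 30, counting from February 25, form the training set, and the remaining data form the test set.
Normalization: the whole dataset is scaled to the [0, 1] interval. After training, the fitted and predicted results must be denormalized again for plotting and evaluation.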
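
The normalization round trip described above can be sketched as follows. The helper names and the numbers are made up for illustration; they are not the Italian data.

```python
import numpy as np


def minmax_fit(values):
    """Return (minimum, range) of a 1-D series for min-max scaling."""
    values = np.asarray(values, dtype=float)
    return values.min(), values.max() - values.min()


def minmax_scale(values, vmin, vrange):
    """Map values into [0, 1] using a previously fitted minimum and range."""
    return (np.asarray(values, dtype=float) - vmin) / vrange


def minmax_inverse(scaled, vmin, vrange):
    """Undo minmax_scale, returning values in the original units."""
    return np.asarray(scaled, dtype=float) * vrange + vmin


# Made-up cumulative counts, for illustration only.
cumulative = np.array([100, 150, 230, 400, 650, 1000])
daily_new = np.diff(cumulative)  # next day minus current day

vmin, vrange = minmax_fit(daily_new)
scaled = minmax_scale(daily_new, vmin, vrange)
restored = minmax_inverse(scaled, vmin, vrange)
print(scaled, restored)
```

Note that the article scales with the minimum and maximum of the whole series. Computing them from the training portion only is a common alternative that avoids peeking at test values.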

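You have probably read elsewhere how an LSTM works. An LSTM needs its data in sequence form: put simply, it predicts the next value from the previous n consecutive values. This n is called the time step in LSTM terminology, and choosing a suitable time step can improve prediction accuracy.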

I did not compare multiple settings here; the time step is set to 3 for now.

weishu_input = 3  # time step is set to 3

df = pd.read_excel('意大利确诊.xlsx')  # read the data from the spreadsheet
num = df.iloc[10:, 1].values  # extract the cumulative case counts
num_add = []
for i in range(len(num) - 1):  # compute the daily new cases
    num_add.append(num[i + 1] - num[i])

num_add_range = np.max(num_add) - np.min(num_add)  # normalize to 0~1
num_add_guiyi = (num_add - np.min(num_add)) / num_add_range

X, y = list(), list()
for i in range(len(num_add_guiyi)):  # build the network's input and output data
    end_ix = i + weishu_input
    if end_ix > len(num_add_guiyi) - 1:
        break
    X.append(num_add_guiyi[i:end_ix])
    y.append(num_add_guiyi[end_ix])

input, output = np.array(X), np.array(y)

# further processing of the training set
train_input = input[0:30 - weishu_input]
train_input = train_input.reshape(train_input.shape[0], 1, train_input.shape[1])
train_output = output[0:30 - weishu_input]
train_output = train_output.reshape(train_output.shape[0], 1, 1)

# further processing of the test set
test_input = input[30 - weishu_input:]
test_input = test_input.reshape(test_input.shape[0], 1, test_input.shape[1])
test_output = output[30 - weishu_input:]
test_output = test_output.reshape(test_output.shape[0], 1, 1)
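
To see what the windowing above produces, here is the same idea on a toy series. The function name make_windows is hypothetical, and the series is just the numbers 0 to 9.

```python
import numpy as np


def make_windows(series, time_step):
    """Split a 1-D series into (inputs, targets) with a sliding window."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + time_step] for i in range(len(series) - time_step)])
    y = series[time_step:]  # the value right after each window
    return X, y


series = np.arange(10, dtype=float)  # toy series 0..9
X, y = make_windows(series, time_step=3)
# X[0] = [0, 1, 2] is paired with y[0] = 3; X[-1] = [6, 7, 8] with y[-1] = 9

# Same layout as the article: (samples, 1, time_step)
X_keras = X.reshape(X.shape[0], 1, X.shape[1])
print(X.shape, y.shape, X_keras.shape)
```

Keras recurrent layers read their input as (batch, time steps, features), so the article's layout feeds the three past values as three features of a single step. Reshaping to (samples, 3, 1) would instead present them as three steps of one feature; which works better on this data is not something the article tests.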

Build an LSTM network; set the loss function, optimizer, learning rate and metric; set the batch size and number of epochs; and let the training begin!

NET = Sequential()  # create the model

# LSTM layer; input_shape is (time steps, features), matching the variant layers further down
NET.add(LSTM(units=8, input_shape=(None, input[0:30 - weishu_input].shape[1])))
NET.add(Dense(1, activation='linear'))  # fully connected output layer
NET.summary()  # print the model structure

# MSE loss, Adam optimizer, mean absolute error as the metric
NET.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), metrics=['mae'])

start = time.time()  # record the training start time

train_history = NET.fit(x=train_input, y=train_output, epochs=150, batch_size=1, verbose=2)  # train and log every epoch
print('training time = ' + str(time.time() - start))  # print the training time

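There isn't much data and the network has only two layers, so the happy training time is over quickly. Save either the network weights alone or the weights together with the network structure.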

NET.save('Model_TF_LSTM.h5')  # save the whole model (structure and weights)
NET.save_weights('Model_TF_LSTM_Weights.h5')  # save the weights only

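Next, plot the curves of the loss and the metric to see how training went. Matplotlib's built-in support for Chinese text is poor, so before plotting I define a few nicer fonts for the labels.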

font1 = {'family': 'Microsoft YaHei', 'weight': 'normal', 'size': 18}  # define three fonts for plotting
font2 = {'family': 'Microsoft YaHei', 'weight': 'normal', 'size': 16}
font3 = {'family': 'Microsoft YaHei', 'weight': 'normal', 'size': 13}

plt.figure(figsize=(12, 5))
plt.subplot(121)
plt.plot(train_history.history['mae'])  # plot the MAE during training
plt.title("Training MAE", font1)
plt.xlabel("Epoch", font2)
plt.ylabel("MAE", font2)
plt.subplot(122)
plt.plot(train_history.history['loss'])  # plot the MSE loss during training
plt.title("Training MSE", font1)
plt.xlabel("Epoch", font2)
plt.ylabel("MSE", font2)
plt.show()

Figure: loss and metric curves during training

Predict and plot.

train_day_add_show = list(range(input[0:30 - weishu_input].shape[0]))  # x-axis for the training set
train_num_add_show = train_output.reshape(len(train_output)) * num_add_range + np.min(num_add)  # actual new cases (training set), denormalized
train_pre_add_show = NET.predict(train_input).reshape(len(train_input)) * num_add_range + np.min(num_add)  # predicted new cases (training set), denormalized
test_day_add_show = list(range(input[30 - weishu_input:].shape[0]))  # x-axis for the test set
test_num_add_show = test_output.reshape(len(test_output)) * num_add_range + np.min(num_add)  # actual new cases (test set), denormalized
test_pre_add_show = NET.predict(test_input).reshape(len(test_input)) * num_add_range + np.min(num_add)  # predicted new cases (test set), denormalized

plt.figure(figsize=(12, 5))
plt.subplot(121)
plt.plot(train_day_add_show, train_num_add_show, 'o-')  # actual new cases, training set
plt.plot(train_day_add_show, train_pre_add_show, 'o-')  # predicted new cases, training set
plt.ylim((0, 7000))
plt.title("Training set", font1)
plt.xlabel("Day", font2)
plt.ylabel("Cases", font2)
plt.legend(['Actual', 'Predicted'], loc='lower right', prop=font3)
plt.subplot(122)
plt.plot(test_day_add_show, test_num_add_show, 'o-')  # actual new cases, test set
plt.plot(test_day_add_show, test_pre_add_show, 'o-')  # predicted new cases, test set
plt.ylim((0, 7000))
plt.title("Test set", font1)
plt.xlabel("Day", font2)
plt.ylabel("Cases", font2)
plt.legend(['Actual', 'Predicted'], loc='lower right', prop=font3)
plt.show()

print(sum(abs(train_pre_add_show - train_num_add_show)) / len(train_pre_add_show))  # training-set MAE, in people

print(sum(abs(test_pre_add_show - test_num_add_show)) / len(test_pre_add_show))  # test-set MAE, in people
print(math.sqrt(np.sum(np.array(test_pre_add_show - test_num_add_show) ** 2) / len(test_pre_add_show)))  # test-set RMSE, in people

MaeMse_add = NET.evaluate(test_input, test_output)  # loss (MSE) and MAE on the normalized test set
print(MaeMse_add)

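Results:
① MSE at the end of training: 0.0085
② MAE at the end of training: 0.0756
③ Root-mean-square error between the predictions and the actual test data: 1104 (people)
④ Mean absolute error between the predictions and the actual test data: 1349 (people)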
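
For reference, the two error measures reported in people can be written as small helpers. This is a sketch with made-up numbers, not the article's data; the function names are mine.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.mean(np.abs(diff)))


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    diff = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Made-up daily new cases and predictions, for illustration only.
actual = np.array([4000.0, 4500.0, 5000.0, 4800.0])
predicted = np.array([4200.0, 4300.0, 5600.0, 4700.0])

print("MAE :", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

A useful sanity check when reading such figures: on the same data, RMSE is never smaller than MAE, and the gap widens when a few days have much larger errors than the rest.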

That completes the LSTM prediction of Italy's daily new cases.

TensorFlow 2 also provides APIs for two other LSTM variants. One is the PeepholeLSTM:

NET.add(RNN(PeepholeLSTMCell(8), input_shape=(None, input[0:30-weishu_input].shape[1])))  # add the first LSTM variant (peephole LSTM)
NET.add(Dense(1, activation='linear'))  # fully connected output layer

The other is the GRU. Both are worth a try.

NET.add(GRU(8, input_shape=(None, input[0:30-weishu_input].shape[1])))  # add the GRU layer
NET.add(Dense(1, activation='linear'))  # fully connected output layer
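
To get a feel for how these layers differ in size, the arithmetic below counts trainable parameters for the textbook formulations with 8 units and 3 input features, as used above. It assumes each gate or candidate block has one weight matrix over the concatenated input and hidden state plus one bias vector, and that a peephole LSTM adds one weight vector per gate for its three gates. The function name is hypothetical, and framework implementations can differ slightly (for example by keeping an extra bias vector), so the numbers printed by a model summary may not match exactly.

```python
def gated_rnn_params(input_dim, units, n_blocks, peephole_gates=0):
    """Count weights and biases for n_blocks gate/candidate blocks.

    Each block has a (units x (input_dim + units)) weight matrix and a bias
    of length units. Each peephole gate adds a weight vector of length units.
    """
    per_block = units * (input_dim + units) + units
    return n_blocks * per_block + peephole_gates * units


input_dim, units = 3, 8
counts = {
    "LSTM": gated_rnn_params(input_dim, units, n_blocks=4),
    "Peephole LSTM": gated_rnn_params(input_dim, units, n_blocks=4, peephole_gates=3),
    "GRU": gated_rnn_params(input_dim, units, n_blocks=3),
}
for name, n in counts.items():
    print(name, n)
```

All three are tiny networks, which fits the article's point that a small model is enough for a few dozen data points.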

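Besides predicting the new cases directly as above, you can first predict the cumulative cases and then difference them to obtain the new cases indirectly. In that case the preprocessing skips the differencing and splits the dataset directly.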

weishu_input = 3  # time step is set to 3

df = pd.read_excel('意大利确诊.xlsx')  # read the data from the spreadsheet
num = df.iloc[10:, 1].values  # extract the cumulative case counts
num_add = num  # use the cumulative counts directly (no differencing)

num_add_range = np.max(num_add) - np.min(num_add)  # normalize to 0~1
num_add_guiyi = (num_add - np.min(num_add)) / num_add_range

X, y = list(), list()
for i in range(len(num_add_guiyi)):  # build the network's input and output data
    end_ix = i + weishu_input
    if end_ix > len(num_add_guiyi) - 1:
        break
    X.append(num_add_guiyi[i:end_ix])
    y.append(num_add_guiyi[end_ix])

input, output = np.array(X), np.array(y)

# further processing of the training set
train_input = input[0:30 - weishu_input]
train_input = train_input.reshape(train_input.shape[0], 1, train_input.shape[1])
train_output = output[0:30 - weishu_input]
train_output = train_output.reshape(train_output.shape[0], 1, 1)

# further processing of the test set
test_input = input[30 - weishu_input:]
test_input = test_input.reshape(test_input.shape[0], 1, test_input.shape[1])
test_output = output[30 - weishu_input:]
test_output = test_output.reshape(test_output.shape[0], 1, 1)

What is predicted here is the cumulative number of confirmed cases.

# Variable names follow the preprocessing above, where num_add holds the cumulative counts
train_day_show = list(range(len(train_output)))  # x-axis for the training set
train_num_show = train_output.reshape(len(train_output)) * num_add_range + np.min(num_add)  # actual cumulative cases (training set), denormalized
train_pre_show = NET.predict(train_input).reshape(len(train_input)) * num_add_range + np.min(num_add)  # predicted cumulative cases (training set), denormalized
test_day_show = list(range(len(train_output), len(train_output) + len(test_output)))  # x-axis for the test set
test_num_show = test_output.reshape(len(test_output)) * num_add_range + np.min(num_add)  # actual cumulative cases (test set), denormalized
test_pre_show = NET.predict(test_input).reshape(len(test_input)) * num_add_range + np.min(num_add)  # predicted cumulative cases (test set), denormalized

plt.figure(figsize=(12, 5))  # plot the cumulative cases
plt.subplot(121)
plt.plot(train_day_show, train_num_show, 'o-')  # actual cumulative cases, training set
plt.plot(train_day_show, train_pre_show, 'o-')  # predicted cumulative cases, training set
plt.ylim((0, 130000))
plt.title("Training set: actual vs. predicted", font1)
plt.xlabel("Day", font2)
plt.ylabel("Cumulative cases", font2)
plt.legend(['Actual', 'Predicted'], loc='upper left', prop=font3)
plt.subplot(122)
plt.plot(test_day_show, test_num_show, 'o-')  # actual cumulative cases, test set
plt.plot(test_day_show, test_pre_show, 'o-')  # predicted cumulative cases, test set
plt.ylim((0, 130000))
plt.title("Test set: actual vs. predicted", font1)
plt.xlabel("Day", font2)
plt.ylabel("Cumulative cases", font2)
plt.legend(['Actual', 'Predicted'], loc='upper left', prop=font3)
plt.show()

Figure: predicted cumulative cases

Then difference the predictions to obtain the new cases indirectly.

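train_pre_QD_show = []
for i in range(len(train_pre_show) - 1):  # difference the predicted cumulative cases = predicted new cases (training set)
    train_pre_QD_show.append(train_pre_show[i + 1] - train_pre_show[i])

test_pre_QD_show = []
test_pre_QD_show.append(test_pre_show[0] - train_pre_show[-1])  # first test value bridges from the last training prediction
for i in range(len(test_pre_show) - 1):
    test_pre_QD_show.append(test_pre_show[i + 1] - test_pre_show[i])

num_true_add = np.diff(num)  # actual daily new cases from the cumulative series num

# The first prediction targets index weishu_input of num, so predicted difference i
# lines up with num_true_add[i + weishu_input]; slice the actual values accordingly.
train_start = weishu_input
test_start = train_start + len(train_pre_QD_show)

train_day_add_show = list(range(train_start, test_start))  # x-axis for the training set
test_day_add_show = list(range(test_start, test_start + len(test_pre_QD_show)))  # x-axis for the test set

train_num_add_show = num_true_add[train_start:test_start]  # actual new cases, training set
test_num_add_show = num_true_add[test_start:test_start + len(test_pre_QD_show)]  # actual new cases, test set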

plt.figure(figsize=(12, 5))  # plot the new cases
plt.subplot(121)
plt.plot(train_day_add_show, train_num_add_show, 'o-')  # actual new cases, training set
plt.plot(train_day_add_show, train_pre_QD_show, 'o-')  # predicted new cases, training set
plt.ylim((0, 10000))
plt.title("Training set", font1)
plt.xlabel("Day", font2)
plt.ylabel("New cases", font2)
plt.legend(['Actual', 'Predicted'], loc='upper left', prop=font3)
plt.subplot(122)
plt.plot(test_day_add_show, test_num_add_show, 'o-')  # actual new cases, test set
plt.plot(test_day_add_show, test_pre_QD_show, 'o-')  # predicted new cases, test set
plt.ylim((0, 10000))
plt.title("Test set", font1)
plt.xlabel("Day", font2)
plt.ylabel("New cases", font2)
plt.legend(['Actual', 'Predicted'], loc='upper left', prop=font3)
plt.show()

print(sum(abs(train_pre_show - train_num_show)) / len(train_pre_show))  # training-set MAE of cumulative cases
print(sum(abs(test_pre_show - test_num_show)) / len(test_pre_show))  # test-set MAE of cumulative cases

print(sum(abs(np.array(train_pre_QD_show) - np.array(train_num_add_show))) / len(train_pre_QD_show))  # training-set MAE of new cases
print(sum(abs(np.array(test_pre_QD_show) - np.array(test_num_add_show))) / len(test_pre_QD_show))  # test-set MAE of new cases
print(math.sqrt(np.sum((np.array(test_pre_QD_show) - np.array(test_num_add_show)) ** 2) / len(test_pre_QD_show)))  # test-set RMSE of new cases
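
The differencing step used above can be checked on toy numbers. The values below are made up; last_known stands for the cumulative count on the day before the first prediction.

```python
import numpy as np

# Made-up predicted cumulative counts, for illustration only.
cumulative_pred = np.array([1000.0, 1300.0, 1700.0, 2200.0])
last_known = 800.0

# Daily new cases are the day-to-day differences of the cumulative curve.
daily_pred = np.diff(cumulative_pred, prepend=last_known)

# Summing the daily values back up recovers the cumulative curve.
rebuilt = last_known + np.cumsum(daily_pred)
print(daily_pred, rebuilt)
```

One thing to keep in mind with this indirect route: nothing forces a predicted cumulative curve to be non-decreasing, so the differences can come out negative even though real daily new cases cannot.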

The new cases predicted this way are:

Figure: new cases obtained by differencing the predicted cumulative cases

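There is one more LSTM variant, the CoupledInputForgetLSTM. It was removed from the standard API in TensorFlow 2.0, so I implement it with TensorFlow 1. Its distinguishing feature is that, because the forget gate and the input gate turn out to be coupled to some degree, the two are merged into one, which reduces computation and speeds up training. Since this variant differs little from the standard LSTM, its training and test results should in principle not differ much either, and with data this small and simple the differences between the variants are even more negligible. The full script is pasted below. TensorFlow 1 and TensorFlow 2 build networks somewhat differently, but don't worry: apart from the network construction, everything is much the same as above.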

# -*- coding: utf-8 -*-
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import tensorflow as tf
import numpy as np
import pandas as pd
from tensorflow.contrib import rnn
import matplotlib.pyplot as plt
import math

font1 = {'family': 'Microsoft YaHei', 'weight': 'normal', 'size': 18}  # define three fonts for plotting
font2 = {'family': 'Microsoft YaHei', 'weight': 'normal', 'size': 16}
font3 = {'family': 'Microsoft YaHei', 'weight': 'normal', 'size': 13}

weishu_input = 3

df = pd.read_excel('意大利确诊.xlsx')  # read the data from the spreadsheet
num = df.iloc[10:, 1].values  # extract the cumulative case counts
num_add = []
for i in range(len(num) - 1):  # compute the daily new cases
    num_add.append(num[i + 1] - num[i])

num_add_range = np.max(num_add) - np.min(num_add)  # normalize to 0~1
num_add_guiyi = (num_add - np.min(num_add)) / num_add_range

X, y = list(), list()
for i in range(len(num_add_guiyi)):  # build the network's input and output data
    end_ix = i + weishu_input
    if end_ix > len(num_add_guiyi) - 1:
        break
    X.append(num_add_guiyi[i:end_ix])
    y.append(num_add_guiyi[end_ix])

input, output = np.array(X), np.array(y)

# further processing of the training set
train_input = input[0:30 - weishu_input]
train_input = train_input.reshape(train_input.shape[0], 1, train_input.shape[1])
train_output = output[0:30 - weishu_input]
train_output = train_output.reshape(train_output.shape[0], 1, 1)

# further processing of the test set
test_input = input[30 - weishu_input:]
test_input = test_input.reshape(test_input.shape[0], 1, test_input.shape[1])
test_output = output[30 - weishu_input:]
test_output = test_output.reshape(test_output.shape[0], 1, 1)

tf.reset_default_graph()
x = tf.placeholder("float", [None, 1, train_input.shape[2]])  # network input: (batch, 1, time step)
y = tf.placeholder("float", [None, 1, 1])  # target output

cplstm_cell = rnn.CoupledInputForgetGateLSTMCell(8)  # LSTM cell with coupled input and forget gates, 8 units
outputs, states = tf.nn.dynamic_rnn(cplstm_cell, x, dtype=tf.float32)
pred = tf.contrib.layers.fully_connected(outputs, 1)  # fully connected output layer (ReLU activation by default)
cost = tf.reduce_mean(tf.square(pred - y))  # mean squared error
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
correct_pred = abs(pred - y)
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))  # mean absolute error, despite the variable name

loss_show = []
acc_show = []

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(2000):
        _, loss, acc, prediction = sess.run([optimizer, cost, accuracy, pred],
                                            feed_dict={x: train_input, y: train_output})
        loss_show.append(loss)
        acc_show.append(acc)
        print("epoch %d, loss %g, acc %g" % (epoch + 1, loss, acc))
    test_prediction, test_loss, test_acc = sess.run([pred, cost, accuracy], feed_dict={x: test_input, y: test_output})
    print(test_loss, test_acc)

plt.figure(figsize=(12, 5))
plt.subplot(121)
plt.plot(acc_show)  # acc_show holds the mean absolute error per epoch
plt.title("Training MAE", font1)
plt.xlabel("Epoch", font2)
plt.ylabel("MAE", font2)
plt.subplot(122)
plt.plot(loss_show)  # loss_show holds the mean squared error per epoch
plt.title("Training MSE loss", font1)
plt.xlabel("Epoch", font2)
plt.ylabel("MSE", font2)
plt.show()

train_day_add_show = list(range(input[0:30 - weishu_input].shape[0]))  # x-axis for the training set
train_num_add_show = train_output.reshape(len(train_output)) * num_add_range + np.min(num_add)  # actual new cases (training set), denormalized
train_pre_add_show = prediction.reshape(len(prediction)) * num_add_range + np.min(num_add)  # predicted new cases (training set), denormalized
test_day_add_show = list(range(input[30 - weishu_input:].shape[0]))  # x-axis for the test set
test_num_add_show = test_output.reshape(len(test_output)) * num_add_range + np.min(num_add)  # actual new cases (test set), denormalized
test_pre_add_show = np.array(test_prediction).reshape(len(test_output)) * num_add_range + np.min(
    num_add)  # predicted new cases (test set), denormalized

plt.figure(figsize=(12, 5))
plt.subplot(121)
plt.plot(train_day_add_show, train_num_add_show, 'o-')  # actual new cases, training set
plt.plot(train_day_add_show, train_pre_add_show, 'o-')  # predicted new cases, training set
plt.ylim((0, 7000))
plt.title("Training set", font1)
plt.xlabel("Day", font2)
plt.ylabel("Cases", font2)
plt.legend(['Actual', 'Predicted'], loc='lower right', prop=font3)
plt.subplot(122)
plt.plot(test_day_add_show, test_num_add_show, 'o-')  # actual new cases, test set
plt.plot(test_day_add_show, test_pre_add_show, 'o-')  # predicted new cases, test set
plt.ylim((0, 7000))
plt.title("Test set", font1)
plt.xlabel("Day", font2)
plt.ylabel("Cases", font2)
plt.legend(['Actual', 'Predicted'], loc='lower right', prop=font3)
plt.show()

print(sum(abs(test_pre_add_show - test_num_add_show)) / len(test_pre_add_show))  # test-set MAE, in people
print(math.sqrt(np.sum(np.array(test_pre_add_show - test_num_add_show) ** 2) / len(test_pre_add_show)))  # test-set RMSE, in people

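Results:
① MSE at the end of training: 0.0075
② MAE at the end of training: 0.0657
③ Root-mean-square error between the predictions and the actual test data: 1204 (people)
④ Mean absolute error between the predictions and the actual test data: 1578 (people)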
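
To make the coupling of the forget and input gates concrete, here is one time step of such a cell in plain NumPy. This is a sketch of the textbook formulation in which the forget gate is fixed to one minus the input gate. It is not the TensorFlow implementation, which may differ in details, and the function names, weight initialization and random inputs are all made up for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cifg_lstm_step(x, h_prev, c_prev, params):
    """One time step of an LSTM whose forget gate is tied to the input gate."""
    z = np.concatenate([x, h_prev])                 # shape (input_dim + units,)
    i = sigmoid(params["W_i"] @ z + params["b_i"])  # input gate
    f = 1.0 - i                                     # coupled forget gate: no weights of its own
    g = np.tanh(params["W_c"] @ z + params["b_c"])  # candidate cell state
    o = sigmoid(params["W_o"] @ z + params["b_o"])  # output gate
    c = f * c_prev + i * g                          # blend old state and candidate
    h = o * np.tanh(c)
    return h, c


def init_params(input_dim, units, seed=0):
    """Small random weights and zero biases for the three remaining blocks."""
    rng = np.random.default_rng(seed)
    shape = (units, input_dim + units)
    params = {}
    for name in ("i", "c", "o"):
        params["W_" + name] = rng.normal(scale=0.1, size=shape)
        params["b_" + name] = np.zeros(units)
    return params


units, input_dim = 8, 3
params = init_params(input_dim, units)
h, c = np.zeros(units), np.zeros(units)
for x in np.random.default_rng(1).random((5, input_dim)):  # five made-up input vectors
    h, c = cifg_lstm_step(x, h, c, params)

n_params = sum(p.size for p in params.values())
print(h.shape, n_params)
```

Because the forget gate has no weights of its own, the cell carries three weight blocks instead of the four in a standard LSTM, which is where the saving in computation comes from.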
