EA-LSTM: Evolutionary attention-based LSTM for time series prediction

EntropyPlus

Category: Literature reading

Article link: https://blog.csdn.net/u012759262/article/details/102542054

1. Introduction

Approach: Based on the idea of evolutionary computation [21], we propose a competitive random search (CRS) instead of the gradient-based method to solve the attention layer weights.

Genetic Algorithms and the Optimal Allocation of Trials

Why introduce CRS: to change the search direction and avoid falling into a local optimum.

[22] X. Zhang, J. Clune, K.O. Stanley, On the relationship between the OpenAI evolution strategy and stochastic gradient descent, 2017, arXiv:1712.06564.
[23] E. Conti, V. Madhavan, F. Petroski Such, J. Lehman, K.O. Stanley, J. Clune, Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents, 2017, arXiv:1712.06560.
[24] J. Lehman, J. Chen, J. Clune, K.O. Stanley, Safe mutations for deep and recurrent neural networks through output gradients, 2017, arXiv:1712.06563.

Modifications to the GA operators:

Crossover: in particular, the improved crossover operator has integrated more stochastic mechanisms to maintain the differences between the progeny individuals.
Purpose: avoiding premature convergence of the algorithm and being trapped in a local optimum.
Mutation: use the basic bit mutation operator, which performs the mutation by randomly inverting one or several gene values at the locus according to the mutation rate on a single encoded string.
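The bit mutation operator described above can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `bit_mutation`, the string representation of an individual, and the per-bit independent flip are all assumptions made for the example.

```python
import random

def bit_mutation(bits, mutation_rate, rng=None):
    """Flip each bit of an encoded string independently with probability
    mutation_rate (hypothetical helper, assumed per-bit mutation)."""
    rng = rng or random.Random()
    flipped = []
    for b in bits:
        if rng.random() < mutation_rate:
            flipped.append('1' if b == '0' else '0')  # invert this gene
        else:
            flipped.append(b)                          # keep this gene
    return ''.join(flipped)

rng = random.Random(0)          # fixed seed so the run is repeatable
parent = "0110100101"
child = bit_mutation(parent, mutation_rate=0.2, rng=rng)
print(parent, "->", child)
```

A mutation rate of 0 returns the parent unchanged and a rate of 1 inverts every bit, which is a quick way to sanity-check the operator.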

Key concepts from the introduction:

Genetic algorithms

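2. Preliminaries

Time series data are used for classification and regression.

The dataset is $X=(X_1, X_2, ..., X_T)$, where each $X_t=(x_t^1, x_t^2, ..., x_t^L)$ represents $L$ timestamps. The output value corresponding to each time step is denoted $y$.

Whether the task is classification or regression depends on whether the values of $y$ are discrete or continuous.

Goal: from the historical inputs and outputs, find the mapping function $f$ that produces the prediction $\widetilde{y}_T$. Mathematically:
$\widetilde{y}_T=f(X, y)$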
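One common way to set up such a prediction problem is a sliding window over the series. The sketch below assumes a univariate series, a window of length $L$, and a one-step-ahead target; `make_windows` is a hypothetical helper and the paper may build its samples differently.

```python
def make_windows(series, targets, L):
    """Build (window, target) pairs: each window holds L consecutive inputs,
    and the target is the output at the step right after the window.
    (Assumed one-step-ahead setup, for illustration only.)"""
    X, y = [], []
    for end in range(L, len(series)):
        X.append(series[end - L:end])  # the L most recent inputs
        y.append(targets[end])         # the value to predict
    return X, y

series = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
X, y = make_windows(series, series, L=3)
for window, target in zip(X, y):
    print(window, "->", target)
```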

3. Methodology

3.1 Overview

Structure of this section:

We first give an overview of the proposed model.
We then detail the evolutionary attention-based LSTM.
We finally present the competitive random search and a collaborative training mechanism.

The workflow is as follows:
[Figure: workflow of the model]

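3.2 Overall algorithm flow

The attention layer weights are defined as:
$W=(W^1, W^2, ..., W^L)$
Here $L$ is the number of timestamps. The input of the LSTM layer is weighted according to the attention layer weights:
$\widetilde{X}_t=(x_t^1W^1, x_t^2W^2, ..., x_t^LW^L)$
Then $\widetilde{X}_t$ is fed into the LSTM layer, which is computed with the LSTM update equations.
The authors take $h^{t-1}$ as the output $\widetilde{y}_t$ and then stack the outputs into a matrix: $\widetilde{y}_T=(\widetilde{y}^1, \widetilde{y}^2, ..., \widetilde{y}^T)$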
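To make the weighting step concrete, here is a small sketch of the element-wise attention product followed by one step of a standard LSTM cell. The single hidden unit, the parameter layout in `params`, and the helper names are assumptions chosen to keep the example short; the gate equations are the textbook LSTM ones, not necessarily the exact variant used in the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def apply_attention(x_t, W):
    """Element-wise weighting: (x_t^1 W^1, ..., x_t^L W^L)."""
    if len(x_t) != len(W):
        raise ValueError("x_t and W must both have length L")
    return [x * w for x, w in zip(x_t, W)]

def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard LSTM cell with a single hidden unit.
    params maps each gate name ('i', 'f', 'o', 'g') to (w_x, u_h, b)."""
    def pre(gate):
        w_x, u_h, b = params[gate]
        return sum(w * xi for w, xi in zip(w_x, x)) + u_h * h_prev + b
    i = sigmoid(pre('i'))        # input gate
    f = sigmoid(pre('f'))        # forget gate
    o = sigmoid(pre('o'))        # output gate
    g = math.tanh(pre('g'))      # candidate cell state
    c = f * c_prev + i * g       # new cell state
    h = o * math.tanh(c)         # new hidden state
    return h, c

W = [1.0, 0.0, 0.5]              # illustrative attention weights, L = 3
x_t = [2.0, 3.0, 4.0]
x_tilde = apply_attention(x_t, W)
params = {gate: ([0.1, 0.1, 0.1], 0.0, 0.0) for gate in 'ifog'}
h, c = lstm_step(x_tilde, 0.0, 0.0, params)
print(x_tilde, h, c)
```

A weight of 0 removes the corresponding timestamp from the LSTM input entirely, which is why choosing the weights amounts to choosing which timestamps the network attends to.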

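3.3 Competitive random search

[Figure: competitive random search]

The weights in part a are binary-encoded. The weights corresponding to each individual $W_i$ are passed to part b, and the genetic algorithm selects the most suitable combination of weights.

Not all of the weights are used here; instead, the most suitable ones are picked out. Hmm, this is not quite what I expected: I originally thought the genetic algorithm trained the attention weights, but it only finds which weights are suitable, so it is really a selection operation. The selected weights are then passed to the LSTM network, which is trained on the error.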
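The binary encoding of the weights could look like the sketch below. The notes do not give the bit width or the value range, so the 6 bits per weight, the $[0, 1]$ range, and the function names are all assumptions for illustration.

```python
def encode_weight(w, n_bits=6):
    """Map a weight in [0, 1] to a fixed-width binary string (assumed scheme)."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    level = round(w * (2 ** n_bits - 1))   # quantise to an integer level
    return format(level, f'0{n_bits}b')

def decode_weight(bits):
    """Inverse of encode_weight, up to quantisation error."""
    return int(bits, 2) / (2 ** len(bits) - 1)

def encode_individual(weights, n_bits=6):
    """Concatenate the codes of all L weights into one gene string."""
    return ''.join(encode_weight(w, n_bits) for w in weights)

def decode_individual(bits, n_bits=6):
    """Split a gene string back into L decoded weights."""
    return [decode_weight(bits[i:i + n_bits])
            for i in range(0, len(bits), n_bits)]

individual = encode_individual([0.0, 1.0, 0.5])
decoded = decode_individual(individual)
print(individual, decoded)
```

With this kind of encoding, crossover and mutation operate on the gene string, and decoding turns each candidate back into a weight vector that can be evaluated.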

[Figure]
2. Then repeat step c.
3. Finally, construct the new population.
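The select, recombine, and rebuild cycle in the steps above can be sketched as a generic genetic-algorithm loop. This is not the paper's exact CRS procedure: the uniform crossover, the elitist selection of `n_winners`, and the toy fitness function (which stands in for the LSTM's prediction error) are assumptions made so the example runs on its own.

```python
import random

def evolve(fitness, n_bits, pop_size=20, n_winners=5,
           mutation_rate=0.05, generations=30, seed=0):
    """Generic GA loop: rank, keep winners, breed children, rebuild population."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        winners = ranked[:n_winners]               # survivors of the competition
        children = []
        while len(winners) + len(children) < pop_size:
            a, b = rng.sample(winners, 2)          # pick two distinct parents
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]               # bit mutation
            children.append(child)
        population = winners + children            # construct the new population
    return max(population, key=fitness)

# Toy fitness: number of bits that match a hidden target string.
target = [1, 0, 1, 1, 0, 0, 1, 0]

def toy_fitness(individual):
    return sum(1 for g, t in zip(individual, target) if g == t)

best = evolve(toy_fitness, n_bits=len(target))
print(best, toy_fitness(best))
```

In the setting of these notes, the fitness of an individual would instead come from decoding it into attention weights and measuring how well the LSTM predicts with them. Because the winners are carried over unchanged, the best fitness in the population never decreases from one generation to the next.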

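Summary: an evolution-based LSTM. Although the attention mechanism can capture features at different window sizes, that alone is still not enough.
Goal of the evolutionary attention-based LSTM: multivariate time series prediction.
Advantage: it avoids falling into local optima as far as possible by introducing competitive random search in the attention layer.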