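1. Introduction to RNNs
2. Introduction to LSTM
Long short-term memory (LSTM) is a special kind of RNN, designed mainly to mitigate the vanishing-gradient and exploding-gradient problems that arise when training on long sequences. Put simply, LSTM performs better than a plain RNN on longer sequences.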
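To make the gradient problem concrete, here is a toy sketch. It assumes a scalar linear recurrence h_t = w * h_{t-1} (a simplification chosen for illustration, not a model from the original text), for which the gradient of the last state with respect to the first is simply w raised to the number of steps.

```python
# Toy illustration: in the scalar linear recurrence h_t = w * h_{t-1},
# the gradient dh_T/dh_0 equals w**T, so it shrinks or grows geometrically.
def gradient_through_time(w, steps):
    grad = 1.0
    for _ in range(steps):
        grad *= w  # chain rule contributes one factor of w per time step
    return grad

vanishing = gradient_through_time(0.9, 100)  # |w| < 1: gradient shrinks toward 0
exploding = gradient_through_time(1.1, 100)  # |w| > 1: gradient blows up
print(vanishing, exploding)
```

With 100 steps the two values land on the order of 1e-5 and 1e4 respectively, which is why long sequences are hard for a plain RNN to train on.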
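The main differences in inputs and outputs between the LSTM structure (right side of the figure) and a plain RNN are shown below.
Next, let us take the internal structure of LSTM apart in detail.
Now we go a step further and look at how these four states are used inside the LSTM. (Pay attention, this is the key part.)
Inside the LSTM there are three main stages:
A more commonly seen LSTM diagram is shown below: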
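As a rough companion to the description above, one step of the standard LSTM formulation might be sketched in NumPy as follows. The names `z_f`, `z_i`, `z`, `z_o`, the helper `lstm_step`, and the layout of a single weight matrix `W` acting on the concatenation of the previous hidden state and the current input are all assumptions made for illustration, not taken from the original.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell (illustrative sketch).

    x: (input_size,), h_prev and c_prev: (hidden,)
    W: (4 * hidden, hidden + input_size), b: (4 * hidden,)
    """
    hidden = h_prev.shape[0]
    concat = np.concatenate([h_prev, x])
    gates = W @ concat + b
    z_f = sigmoid(gates[0 * hidden:1 * hidden])  # forget gate
    z_i = sigmoid(gates[1 * hidden:2 * hidden])  # input (memory-selection) gate
    z = np.tanh(gates[2 * hidden:3 * hidden])    # candidate cell content
    z_o = sigmoid(gates[3 * hidden:4 * hidden])  # output gate
    c = z_f * c_prev + z_i * z                   # forget old memory, add new
    h = z_o * np.tanh(c)                         # expose part of the cell state
    return h, c

# Run the cell over a short random sequence (sizes are arbitrary).
rng = np.random.default_rng(0)
input_size, hidden_size = 2, 3
W = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size + input_size))
b = np.zeros(4 * hidden_size)
h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x in rng.normal(size=(5, input_size)):
    h, c = lstm_step(x, h, c, W, b)
print(h.shape, c.shape)
```

The additive update of the cell state `c` is the part usually credited with letting gradients survive over more steps than in a plain RNN.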
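3. LSTM in practice with Python
3.1 & 3.1.1 Data preprocessing
Data preprocessing is a critical step: the data must be converted into the format the LSTM model expects, i.e. supervised-learning samples, roughly of the form: the past n steps as input plus 1 step to predict.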
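The "past n steps + 1 predicted step" layout can be illustrated with a small sliding-window sketch. The series and `n_steps = 3` are illustrative values, and using NumPy's `sliding_window_view` is one possible approach, not necessarily how the preprocessing code in this article does it.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

series = np.array([1, 2, 3, 4, 5, 6, 7], dtype="float32")
n_steps = 3  # number of past values used as input

# Each window holds n_steps inputs followed by the 1 value to predict.
windows = sliding_window_view(series, n_steps + 1)
X = windows[:, :n_steps]
y = windows[:, n_steps]

# Reshape the inputs to (samples, timesteps, features); features = 1 here.
X = X.reshape(X.shape[0], n_steps, 1)
print(X[:, :, 0])
print(y)
```

Here the first sample is the input [1, 2, 3] with target 4, and the last is [4, 5, 6] with target 7, giving four samples in total.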
Preprocessing code
import numpy as np
import pandas as pd

# Assumption: v_TYPE is the float dtype used throughout; it is not defined in
# this excerpt, and "float32" is a guess.
v_TYPE = "float32"

def series_to_supervised(data, n_steps=3, split_radio=0.7):
    """Turn a series into supervised samples: n_steps past rows -> next row.

    Returns X_train, y_train, X_test, y_test, where X has shape
    (samples, n_steps, features) and y has shape (samples, features).
    """
    # Coerce the input (list or 2-D array) to shape (n_rows, n_features).
    data = pd.DataFrame(data, dtype=v_TYPE).values
    n_samples = len(data) - n_steps

    # Each sample uses n_steps consecutive rows as input and the row that
    # follows them as the target. Building regular arrays directly avoids
    # "ValueError: setting an array element with a sequence".
    X = np.array([data[i:i + n_steps] for i in range(n_samples)], dtype=v_TYPE)
    y = np.array([data[i + n_steps] for i in range(n_samples)], dtype=v_TYPE)

    # Chronological split: the first split_radio share of samples is used
    # for training, the remainder for testing.
    split = int(n_samples * split_radio)
    X_train, y_train = X[:split], y[:split]
    X_test, y_test = X[split:], y[split:]
    return X_train, y_train, X_test, y_test
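For example, take the univariate series [1,2,3,4,5,6,7]. The time step between observations here is 1; setting the look-back window timesteps=3 means that three consecutive values are used to predict the next one (the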