# How to Forecast Time Series with TensorFlow: A Detailed Tutorial on the TFTS Library

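### Preface

This tutorial covers:

• Reading time series data (two ways: from a NumPy array and from a CSV file)
• Forecasting a time series with an AR model
• Forecasting a time series with an LSTM model (both univariate and multivariate)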

### Reading time series data from a NumPy array

    # coding: utf-8
    from __future__ import print_function
    import numpy as np
    import matplotlib
    matplotlib.use('agg')
    import matplotlib.pyplot as plt
    import tensorflow as tf
    from tensorflow.contrib.timeseries.python.timeseries import NumpyReader

    x = np.array(range(1000))
    noise = np.random.uniform(-0.2, 0.2, 1000)
    y = np.sin(np.pi * x / 100) + x / 200. + noise
    plt.plot(x, y)
    plt.savefig('timeseries_y.jpg')

    data = {
        tf.contrib.timeseries.TrainEvalFeatures.TIMES: x,
        tf.contrib.timeseries.TrainEvalFeatures.VALUES: y,
    }

    reader = NumpyReader(data)

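    with tf.Session() as sess:
        # read_full() returns the whole series as a dictionary of tensors
        full_data = reader.read_full()
        # The queue runners must be started with tf.train.start_queue_runners before reading works
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        print(sess.run(full_data))
        coord.request_stop()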

    train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
        reader, batch_size=2, window_size=10)

    with tf.Session() as sess:
        batch_data = train_input_fn.create_batch()
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        one_batch = sess.run(batch_data[0])
        coord.request_stop()

    print('one_batch_data:', one_batch)
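
To make the batch shape concrete, here is a minimal pure-Python sketch of what random window sampling amounts to: pick `batch_size` random start positions and slice `window_size` consecutive points from each. The helper `sample_random_windows` is hypothetical and is not the TFTS implementation; it only mirrors the `batch_size=2, window_size=10` setting used above.

```python
import random

def sample_random_windows(times, values, batch_size, window_size, seed=None):
    """Hypothetical helper: draw `batch_size` contiguous windows at random."""
    rng = random.Random(seed)
    batch = {"times": [], "values": []}
    for _ in range(batch_size):
        # Any start index that leaves room for a full window is valid.
        start = rng.randint(0, len(times) - window_size)
        batch["times"].append(times[start:start + window_size])
        batch["values"].append(values[start:start + window_size])
    return batch

times = list(range(1000))
values = [t / 200.0 for t in times]  # stand-in series without noise
batch = sample_random_windows(times, values, batch_size=2, window_size=10, seed=0)
print(len(batch["times"]), len(batch["times"][0]))  # 2 10
```

Each batch therefore holds 2 sequences of 10 consecutive time points, together with the values observed at those points.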

### Reading time series data from a CSV file

    1,-0.6656603714
    2,-0.1164380359
    3,0.7398626488
    4,0.7368633029
    5,0.2289480898
    6,2.257073255
    7,3.023457405
    8,2.481161007
    9,3.773638612
    10,5.059257738
    11,3.553186083

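The first column of the CSV file is the time point, and the second column is the value observed at that time point. It can be read as follows: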

    # coding: utf-8
    from __future__ import print_function
    import tensorflow as tf

    csv_file_name = './data/period_trend.csv'
    reader = tf.contrib.timeseries.CSVReader(csv_file_name)
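
For readers who want to inspect the file without TensorFlow, the same two-column layout can be parsed with the standard library. This is only a sketch of the file format, using a few rows from the sample above held in an in-memory string; it is not how `CSVReader` works internally.

```python
import csv
import io

# A few rows in the same "time,value" layout as the sample file.
sample = "1,-0.6656603714\n2,-0.1164380359\n3,0.7398626488\n"

times, values = [], []
for row in csv.reader(io.StringIO(sample)):
    times.append(int(row[0]))     # first column: time point
    values.append(float(row[1]))  # second column: observed value

print(times)   # [1, 2, 3]
print(values)
```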

### Forecasting a time series with an AR model

    x = np.array(range(1000))
    noise = np.random.uniform(-0.2, 0.2, 1000)
    y = np.sin(np.pi * x / 100) + x / 200. + noise
    plt.plot(x, y)
    plt.savefig('timeseries_y.jpg')

    data = {
        tf.contrib.timeseries.TrainEvalFeatures.TIMES: x,
        tf.contrib.timeseries.TrainEvalFeatures.VALUES: y,
    }

    # Wrap the arrays in a reader (NumpyReader is imported in the first listing)
    reader = NumpyReader(data)

    train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
        reader, batch_size=16, window_size=40)

    ar = tf.contrib.timeseries.ARRegressor(
        periodicities=200, input_window_size=30, output_window_size=10,
        num_features=1,
        loss=tf.contrib.timeseries.ARModel.NORMAL_LIKELIHOOD_LOSS)

The num_features parameter is the dimensionality of the value observed at each time point. Here every step is a single scalar, so num_features=1.
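
As a sketch of how the window sizes fit together: with `input_window_size=30` and `output_window_size=10`, each training window of 40 points can be read as 30 inputs followed by 10 targets. We assume here that `window_size` has to equal the sum of the two, which matches the values used above. `split_window` is a hypothetical helper, not a TFTS function.

```python
def split_window(window, input_window_size, output_window_size):
    """Hypothetical helper: split one window into model inputs and targets."""
    if len(window) != input_window_size + output_window_size:
        raise ValueError("window length must equal input + output window sizes")
    return window[:input_window_size], window[input_window_size:]

window = list(range(40))  # one window of 40 consecutive values
inputs, targets = split_window(window, input_window_size=30, output_window_size=10)
print(len(inputs), len(targets))  # 30 10
```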

    ar.train(input_fn=train_input_fn, steps=6000)

In TFTS, evaluation means running the trained model over the original training set, which lets us see how well the model fits the data. The corresponding code is:

    evaluation_input_fn = tf.contrib.timeseries.WholeDatasetInputFn(reader)
    evaluation = ar.evaluate(input_fn=evaluation_input_fn, steps=1)

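evaluation['start_tuple'] is used in the prediction that follows; it corresponds to the output values of the last 30 steps and their time points. Starting from there, we can predict values beyond step 1000. The corresponding code is: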

    (predictions,) = tuple(ar.predict(
        input_fn=tf.contrib.timeseries.predict_continuation_input_fn(
            evaluation, steps=250)))
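
The idea of continuing from the last observed window can be sketched in plain Python: keep the most recent `input_window_size` values, predict the next one, append it, and repeat. Both `rollout` and `toy_model` are hypothetical names, and `toy_model` (which simply extends the average slope of the window) is purely illustrative; it is not what `ARRegressor` learns.

```python
def rollout(history, predict_next, input_window_size, steps):
    """Hypothetical helper: extend `history` by `steps` predicted values."""
    series = list(history)
    for _ in range(steps):
        window = series[-input_window_size:]  # most recent observations
        series.append(predict_next(window))   # feed the prediction back in
    return series[len(history):]

def toy_model(window):
    # Illustrative stand-in: continue the average slope seen in the window.
    slope = (window[-1] - window[0]) / (len(window) - 1)
    return window[-1] + slope

history = [t / 200.0 for t in range(1000)]    # a pure linear trend
predictions = rollout(history, toy_model, input_window_size=30, steps=250)
print(len(predictions))  # 250
```

Because each prediction is fed back as input, errors can accumulate over long horizons, which is worth keeping in mind when reading a 250-step forecast.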

    plt.figure(figsize=(15, 5))
    plt.plot(data['times'].reshape(-1), data['values'].reshape(-1), label='origin')
    plt.plot(evaluation['times'].reshape(-1), evaluation['mean'].reshape(-1), label='evaluation')
    plt.plot(predictions['times'].reshape(-1), predictions['mean'].reshape(-1), label='prediction')
    plt.xlabel('time_step')
    plt.ylabel('values')
    plt.legend(loc=4)
    plt.savefig('predict_result.jpg')

### Forecasting a univariate time series with an LSTM

    x = np.array(range(1000))
    noise = np.random.uniform(-0.2, 0.2, 1000)
    y = np.sin(np.pi * x / 50) + np.cos(np.pi * x / 50) + np.sin(np.pi * x / 25) + noise

    data = {
        tf.contrib.timeseries.TrainEvalFeatures.TIMES: x,
        tf.contrib.timeseries.TrainEvalFeatures.VALUES: y,
    }

    # Wrap the arrays in a reader (NumpyReader is imported in the first listing)
    reader = NumpyReader(data)

    train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
        reader, batch_size=4, window_size=100)

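    from tensorflow.contrib.timeseries.python.timeseries import estimators as ts_estimators

    # _LSTMModel is the custom LSTM model class used by this tutorial; its definition is not shown in this section
    estimator = ts_estimators.TimeSeriesRegressor(
        model=_LSTMModel(num_features=1, num_units=128),
        optimizer=tf.train.AdamOptimizer(0.001))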

num_features=1 means a univariate time series: the quantity observed at each time point is a single number. num_units=128 means the LSTM uses a hidden layer of size 128.
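
To get a feel for what these two numbers imply, the sketch below counts the trainable parameters of a standard LSTM cell (four gates, no peepholes, no projection). It assumes the cell input has dimension `num_features`, and it counts only the cell itself, not any extra layers a model such as `_LSTMModel` may add around it. `lstm_cell_param_count` is a hypothetical helper.

```python
def lstm_cell_param_count(input_dim, num_units):
    """Parameters of a standard LSTM cell (no peepholes, no projection)."""
    per_gate = (input_dim + num_units) * num_units + num_units  # weights + bias
    return 4 * per_gate  # input, forget and output gates plus the candidate update

print(lstm_cell_param_count(input_dim=1, num_units=128))  # 66560
```

Under these assumptions, the count is dominated by the recurrent weights, so moving from 1 to 5 input features changes it only slightly.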

    estimator.train(input_fn=train_input_fn, steps=2000)
    # Evaluate on the whole training series
    evaluation_input_fn = tf.contrib.timeseries.WholeDatasetInputFn(reader)
    evaluation = estimator.evaluate(input_fn=evaluation_input_fn, steps=1)
    # Predict starting after the evaluation
    (predictions,) = tuple(estimator.predict(
        input_fn=tf.contrib.timeseries.predict_continuation_input_fn(
            evaluation, steps=200)))

### Forecasting a multivariate time series with an LSTM

    0,0.926906299771,1.99107237682,2.56546245685,3.07914768197,4.04839057867
    1,0.108010001864,1.41645361423,2.1686839775,2.94963962176,4.1263503303
    2,-0.800567600028,1.0172132907,1.96434754116,2.99885333086,4.04300485864
    3,0.0607042871898,0.719540073421,1.9765012584,2.89265588817,4.0951014426
    4,0.933712200629,0.28052120776,1.41018552514,2.69232603996,4.06481164223
    5,-0.171730652974,0.260054421028,1.48770816369,2.62199129293,4.44572807842
    6,-1.00180162933,0.333045158863,1.50006392277,2.88888309683,4.24755865606
    7,0.0580061875336,0.688929398826,1.56543458772,2.99840358953,4.52726873347

    from os import path

    csv_file_name = path.join("./data/multivariate_periods.csv")
    # One time column followed by five value columns
    reader = tf.contrib.timeseries.CSVReader(
        csv_file_name,
        column_names=((tf.contrib.timeseries.TrainEvalFeatures.TIMES,)
                      + (tf.contrib.timeseries.TrainEvalFeatures.VALUES,) * 5))
    train_input_fn = tf.contrib.timeseries.RandomWindowInputFn(
        reader, batch_size=4, window_size=32)
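
The `column_names` expression is ordinary tuple arithmetic, and the sketch below shows what it evaluates to and how one row of the file maps onto it. The string keys `'times'` and `'values'` are taken from the plotting code earlier in this tutorial, which indexes `data['times']` and `data['values']`; the row parsing is only an illustration of the layout, not the reader's implementation.

```python
TIMES, VALUES = "times", "values"  # string keys, as in data['times'] / data['values']

# One time column followed by five value columns.
column_names = (TIMES,) + (VALUES,) * 5
print(column_names)

row = "0,0.926906299771,1.99107237682,2.56546245685,3.07914768197,4.04839057867"
fields = row.split(",")
time_point = int(fields[0])
observation = [float(v) for v in fields[1:]]  # the 5 values at one time point
print(time_point, len(observation))  # 0 5
```

The five value columns are what make `num_features=5` the matching setting for the model.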

    estimator = ts_estimators.TimeSeriesRegressor(
        model=_LSTMModel(num_features=5, num_units=128),
        optimizer=tf.train.AdamOptimizer(0.001))

### Summary
