Predicting US Stock Prices with a Transformer (Hands-On Python)

Latest recommended article published 2024-08-13 11:46:25

老余捞鱼

Article link: https://blog.csdn.net/weixin_70955880/article/details/140860467

Author: 老余捞鱼

This is original work; if you repost it, please credit the source and the original author.

Foreword:

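The Transformer is a deep learning architecture widely used in natural language processing and other fields. Unlike a traditional recurrent neural network (RNN), which steps through a sequence one position at a time, a Transformer processes all positions of the input sequence in parallel, which greatly improves computational efficiency. Stacking multiple layers also lets it learn more complex and abstract feature representations. This article uses Python code to simulate price prediction for a US stock, with daily IBM data as the example.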
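
To make the "all positions in parallel" point concrete, here is a minimal NumPy sketch of scaled dot-product self-attention, the core operation of a Transformer layer. It is an illustration only, not the model built in this article: the function name `scaled_dot_product_attention` and the toy input are assumptions made for the example.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Attend over every position of a sequence with a few matrix products.

    q, k: arrays of shape (seq_len, d_k); v: array of shape (seq_len, d_v).
    Returns (output, weights) with shapes (seq_len, d_v) and (seq_len, seq_len).
    """
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)                 # similarity of every pair of positions
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability for the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row now sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))  # toy sequence: 5 time steps, 4 features each
attn_out, attn_weights = scaled_dot_product_attention(x, x, x)  # self-attention
print(attn_out.shape, attn_weights.shape)  # (5, 4) (5, 5)
```

No loop over time steps appears anywhere: the whole sequence is handled by two matrix multiplications, which is what allows a Transformer to be parallelized where an RNN cannot.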

The full walkthrough code follows:

import numpy as np
import pandas as pd
import os, datetime
import tensorflow as tf
from tensorflow.keras.models import *
from tensorflow.keras.layers import *
print('Tensorflow version: {}'.format(tf.__version__))

import matplotlib.pyplot as plt
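# The 'seaborn' style was renamed 'seaborn-v0_8' in Matplotlib 3.6; use whichever is available
plt.style.use('seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn')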

import warnings
warnings.filterwarnings('ignore')

physical_devices = tf.config.list_physical_devices()
print('Physical devices: {}'.format(physical_devices))

# Filter out the CPUs and keep only the GPUs
gpus = [device for device in physical_devices if 'GPU' in device.device_type]

# If GPUs are available, set memory growth to True
if len(gpus) > 0:
    tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
    tf.config.experimental.set_memory_growth(gpus[0], True)
    print('GPU memory growth: True')

Tensorflow version: 2.9.1
Physical devices: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

Hyperparameters

batch_size = 32
seq_len = 128

d_k = 256
d_v = 256
n_heads = 12
ff_dim = 256
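
As a rough guide to what `seq_len` controls, the sketch below cuts a feature matrix into overlapping windows of `seq_len` consecutive rows, each paired with the value that follows it. This section of the article only defines the hyperparameters, so the helper `make_windows`, the choice of column 3 as the target, and the toy array are all assumptions for illustration.

```python
import numpy as np

def make_windows(data, seq_len):
    """Return (X, y): X[i] holds seq_len consecutive rows, y[i] is the target in the next row."""
    X, y = [], []
    for i in range(seq_len, len(data)):
        X.append(data[i - seq_len:i])  # the seq_len rows immediately before step i
        y.append(data[i, 3])           # assumed target: column 3 (e.g. Close)
    return np.array(X), np.array(y)

toy = np.arange(50, dtype=float).reshape(10, 5)  # 10 time steps, 5 features
X_toy, y_toy = make_windows(toy, seq_len=4)
print(X_toy.shape, y_toy.shape)  # (6, 4, 5) (6,)
```

With `seq_len = 128`, each training sample would then cover 128 trading days, and `batch_size = 32` such windows would be fed to the model per gradient step.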

Load IBM data

IBM_path = 'IBM.csv'

df = pd.read_csv(IBM_path, delimiter=',', usecols=['Date', 'Open', 'High', 'Low', 'Close', 'Volume'])

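df.sort_values('Date', inplace=True)
df.reset_index(drop=True, inplace=True)  # keep the index in chronological order

# Forward-fill zero volumes to avoid dividing by 0 in pct_change later on
df['Volume'] = df['Volume'].replace(0, np.nan).ffill()
df.tail()
df.head()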

# print the shape of the dataset
print('Shape of the dataframe: {}'.format(df.shape))
Shape of the dataframe: (14588, 6)
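
Why replace zero volumes at all? A percentage change divides by the previous value, so a zero volume would produce an infinite return on the following day. The toy frame below (invented data, not taken from IBM.csv) sketches the sort-then-forward-fill idea:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'Date': ['2020-01-03', '2020-01-01', '2020-01-02', '2020-01-04'],
    'Volume': [0, 100, 120, 90],
})

# Sort chronologically first, so the fill uses the previous trading day's value
toy = toy.sort_values('Date').reset_index(drop=True)

# Treat 0 as missing, then carry the last valid volume forward
toy['Volume'] = toy['Volume'].replace(0, np.nan).ffill()
print(toy['Volume'].tolist())  # [100.0, 120.0, 120.0, 90.0]
```

A zero in the very first row would stay NaN, because there is nothing earlier to fill from; the later `dropna` call would then drop that row.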

Plot daily IBM closing prices and volume

fig = plt.figure(figsize=(15,10))
st = fig.suptitle("IBM Close Price and Volume", fontsize=20)
st.set_y(0.92)

ax1 = fig.add_subplot(211)
ax1.plot(df['Close'], label='IBM Close Price')
ax1.set_xticks(range(0, df.shape[0], 1464))
ax1.set_xticklabels(df['Date'].loc[::1464])
ax1.set_ylabel('Close Price', fontsize=18)
ax1.legend(loc="upper left", fontsize=12)

ax2 = fig.add_subplot(212)
ax2.plot(df['Volume'], label='IBM Volume')
ax2.set_xticks(range(0, df.shape[0], 1464))
ax2.set_xticklabels(df['Date'].loc[::1464])
ax2.set_ylabel('Volume', fontsize=18)
ax2.legend(loc="upper left", fontsize=12)

Calculate normalized percentage change of all columns
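
Before running this on the IBM data, it may help to see the same three steps on a toy series: convert prices to percentage changes, mark the last 20% of rows as held-out data, and min-max scale using only the minimum and maximum of the training rows. The prices below are invented, and the variable names `close`, `train`, and `scaled` are assumptions for the sketch.

```python
import pandas as pd

close = pd.Series([100, 102, 101, 105, 103, 104, 110, 108, 109, 120, 90], dtype=float)
returns = close.pct_change().dropna()  # r_t = p_t / p_{t-1} - 1; the first row is NaN and is dropped

# Chronological split: everything before last_20pct is treated as training data
times = sorted(returns.index.values)
last_20pct = times[-int(0.2 * len(times))]
train = returns[returns.index < last_20pct]

# Min-max scale with training statistics only, so held-out rows do not leak into the scaling
lo, hi = train.min(), train.max()
scaled = (returns - lo) / (hi - lo)
print(scaled.round(3).tolist())
```

The training rows land in the 0-1 range by construction, but the two held-out rows here fall outside it (one above 1, one below 0). That is expected whenever later data move more sharply than anything seen in the training period.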

'''Calculate percentage change'''

df['Open'] = df['Open'].pct_change() # Create arithmetic returns column
df['High'] = df['High'].pct_change() # Create arithmetic returns column
df['Low'] = df['Low'].pct_change() # Create arithmetic returns column
df['Close'] = df['Close'].pct_change() # Create arithmetic returns column
df['Volume'] = df['Volume'].pct_change()

df.dropna(how='any', axis=0, inplace=True) # Drop all rows with NaN values

###############################################################################
'''Create indexes to split dataset'''

times = sorted(df.index.values)
last_10pct = sorted(df.index.values)[-int(0.1*len(times))] # Last 10% of series
last_20pct = sorted(df.index.values)[-int(0.2*len(times))] # Last 20% of series

###############################################################################
'''Normalize price columns'''
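# Use only the training rows (index < last_20pct) so validation/test data do not leak into the scaling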
min_return = min(df[(df.index < last_20pct)][['Open', 'High', 'Low', 'Close']].min(axis=0))
max_return = max(df[(df.index < last_20pct)][['Open', 'High', 'Low', 'Close']].max(axis=0))

# Min-max normalize price columns (0-1 range)
df['Open'] = (df['Open'] - min_return) / (max_return - min_return)
df['High'] = (df['High'] - min_return) / (max_return - min_return)
df['Low'] = (df['Low'] - min_return) / (max_return - min_return)
df['Close'] = (df['Close'] - min_return) / (max_return - min_return)

###############################################################################
'''Normalize volume column'''

min_volume = df[(df.index < last_20pct)]['V