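Predicting US Stock Prices with a Transformer (Hands-On Python)

Author: 老余捞鱼

Original content. If you repost it, please credit the source and the original author.

A few words up front:

The Transformer is a deep learning architecture widely used in natural language processing and other fields. Unlike a traditional recurrent neural network (RNN), which steps through a sequence one position at a time, a Transformer processes all positions of the input sequence in parallel, which greatly improves computational efficiency. By stacking many layers, it can also learn more complex and abstract feature representations. This article uses Python code to simulate predicting US stock prices.

Without further ado, here is the code: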

import numpy as np
import pandas as pd
import os, datetime
import tensorflow as tf
from tensorflow.keras.models import *
from tensorflow.keras.layers import *
print('Tensorflow version: {}'.format(tf.__version__))

import matplotlib.pyplot as plt
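# Matplotlib 3.6+ renamed this style to 'seaborn-v0_8'; fall back to the old name on earlier versions
plt.style.use('seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn')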

import warnings
warnings.filterwarnings('ignore')

physical_devices = tf.config.list_physical_devices()
print('Physical devices: {}'.format(physical_devices))

# Filter out the CPUs and keep only the GPUs
gpus = [device for device in physical_devices if 'GPU' in device.device_type]

# If GPUs are available, set memory growth to True
if len(gpus) > 0:
    tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
    tf.config.experimental.set_memory_growth(gpus[0], True)
    print('GPU memory growth: True')

Tensorflow version: 2.9.1
Physical devices: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]

    Hyperparameters

    batch_size = 32
    seq_len = 128
    
    d_k = 256
    d_v = 256
    n_heads = 12
    ff_dim = 256
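
For orientation, the sketch below shows how hyperparameters of this kind typically map onto tensor shapes inside one attention head, and why every position in the sequence can be handled in a single matrix product. It is a minimal NumPy illustration, not the model built in this article: the feature count of 5 (one per OHLCV column), the random weights, and the function name `single_head_attention` are assumptions made for the example.

```python
import numpy as np

def single_head_attention(x, w_q, w_k, w_v):
    """Scaled dot-product attention for one head (illustrative only)."""
    q = x @ w_q                                    # (seq_len, d_k)
    k = x @ w_k                                    # (seq_len, d_k)
    v = x @ w_v                                    # (seq_len, d_v)
    scores = q @ k.T / np.sqrt(k.shape[-1])        # (seq_len, seq_len)
    scores -= scores.max(axis=-1, keepdims=True)   # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_k, d_v = 128, 256, 256   # values from the hyperparameter list
n_features = 5                      # assumption: Open, High, Low, Close, Volume

x = rng.normal(size=(seq_len, n_features))
w_q = rng.normal(size=(n_features, d_k)) * 0.1
w_k = rng.normal(size=(n_features, d_k)) * 0.1
w_v = rng.normal(size=(n_features, d_v)) * 0.1

attn_out, attn_weights = single_head_attention(x, w_q, w_k, w_v)
print(attn_out.shape)      # (128, 256)
print(attn_weights.shape)  # (128, 128)
```

All 128 time steps are attended to at once through the `q @ k.T` product. A multi-head layer would run `n_heads` such projections side by side and combine their outputs.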

    Load IBM data

    IBM_path = 'IBM.csv'
    
    df = pd.read_csv(IBM_path, delimiter=',', usecols=['Date', 'Open', 'High', 'Low', 'Close', 'Volume'])
    
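    # Sort chronologically first so the forward fill uses each row's previous trading day
    df.sort_values('Date', inplace=True)
    df.reset_index(drop=True, inplace=True)  # keep index order aligned with date order
    # Replace 0 volumes with the previous value to avoid dividing by 0 later on
    df['Volume'] = df['Volume'].replace(0, np.nan).ffill()
    df.tail()
    df.head()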

    # print the shape of the dataset
    print('Shape of the dataframe: {}'.format(df.shape))
    Shape of the dataframe: (14588, 6)

    Plot daily IBM closing prices and volume

    fig = plt.figure(figsize=(15,10))
    st = fig.suptitle("IBM Close Price and Volume", fontsize=20)
    st.set_y(0.92)
    
    ax1 = fig.add_subplot(211)
    ax1.plot(df['Close'], label='IBM Close Price')
    ax1.set_xticks(range(0, df.shape[0], 1464))
    ax1.set_xticklabels(df['Date'].loc[::1464])
    ax1.set_ylabel('Close Price', fontsize=18)
    ax1.legend(loc="upper left", fontsize=12)
    
    ax2 = fig.add_subplot(212)
    ax2.plot(df['Volume'], label='IBM Volume')
    ax2.set_xticks(range(0, df.shape[0], 1464))
    ax2.set_xticklabels(df['Date'].loc[::1464])
    ax2.set_ylabel('Volume', fontsize=18)
    ax2.legend(loc="upper left", fontsize=12)

    Calculate normalized percentage change of all columns

    '''Calculate percentage change'''
    
    df['Open'] = df['Open'].pct_change() # Create arithmetic returns column
    df['High'] = df['High'].pct_change() # Create arithmetic returns column
    df['Low'] = df['Low'].pct_change() # Create arithmetic returns column
    df['Close'] = df['Close'].pct_change() # Create arithmetic returns column
    df['Volume'] = df['Volume'].pct_change()
    
    df.dropna(how='any', axis=0, inplace=True) # Drop all rows with NaN values
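
To make the transformation concrete, here is a small sketch of what `pct_change()` followed by `dropna()` does. The three-row `toy` frame and its values are invented for illustration and are not taken from the IBM data.

```python
import pandas as pd

# Hypothetical three-day sample
toy = pd.DataFrame({
    "Close": [100.0, 110.0, 99.0],
    "Volume": [1000.0, 1500.0, 1500.0],
})

# Each value becomes (today - yesterday) / yesterday
returns = toy.pct_change()

# The first row has no previous day, so it is NaN and gets dropped
returns = returns.dropna(how="any", axis=0)
print(returns)
```

The remaining two rows hold Close returns of 0.10 and -0.10 and Volume changes of 0.5 and 0.0. Because the previous day's value is the denominator, a zero volume would produce an infinite change on the following day, which is why zeros were replaced when the data was loaded.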
    
    ###############################################################################
    '''Create indexes to split dataset'''
    
    times = sorted(df.index.values)
    last_10pct = sorted(df.index.values)[-int(0.1*len(times))] # Last 10% of series
    last_20pct = sorted(df.index.values)[-int(0.2*len(times))] # Last 20% of series
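
The two cut-off indexes can be read as boundaries for a chronological split. The sketch below applies the same arithmetic to a 20-row toy frame. The three-way train/validation/test slicing is an assumption about how such boundaries are usually used, and the names `train`, `val`, and `test` are hypothetical.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({"Close": np.linspace(0.0, 1.0, 20)})  # index 0..19

times = sorted(toy.index.values)
last_10pct = times[-int(0.1 * len(times))]  # label where the final 10% begins
last_20pct = times[-int(0.2 * len(times))]  # label where the final 20% begins

# Oldest 80% for training, next 10% for validation, newest 10% for testing
train = toy[toy.index < last_20pct]
val = toy[(toy.index >= last_20pct) & (toy.index < last_10pct)]
test = toy[toy.index >= last_10pct]

print(len(train), len(val), len(test))  # 16 2 2
```

Splitting by position in time rather than at random keeps future observations out of the training portion.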
    
    ###############################################################################
    '''Normalize price columns'''
    #
    min_return = min(df[(df.index < last_20pct)][['Open', 'High', 'Low', 'Close']].min(axis=0))
    max_return = max(df[(df.index < last_20pct)][['Open', 'High', 'Low', 'Close']].max(axis=0))
    
    # Min-max normalize price columns (0-1 range)
    df['Open'] = (df['Open'] - min_return) / (max_return - min_return)
    df['High'] = (df['High'] - min_return) / (max_return - min_return)
    df['Low'] = (df['Low'] - min_return) / (max_return - min_return)
    df['Close'] = (df['Close'] - min_return) / (max_return - min_return)
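
Since the minimum and maximum come only from rows before `last_20pct`, later rows are scaled with statistics they did not contribute to. The sketch below illustrates the consequence on invented numbers. The six-value series and the choice of the first four values as the "training" part are assumptions for the example.

```python
import pandas as pd

returns = pd.Series([-0.02, 0.01, 0.03, -0.01, 0.05, -0.04])

# Assumed training portion: the first four observations
train_part = returns.iloc[:4]
lo, hi = train_part.min(), train_part.max()

# Same min-max formula as above, applied to the whole series
scaled = (returns - lo) / (hi - lo)
print(scaled.round(2).tolist())
```

The first four values land inside the 0-1 range, while the last two fall outside it (above 1 and below 0). That is expected: the scaler never saw those returns, so held-out data is not guaranteed to stay within the training range.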
    
    ###############################################################################
    '''Normalize volume column'''
    
    min_volume = df[(df.index < last_20pct)]['V