Quantitative backtesting framework: backtrader, reinforcement learning, chainerrl

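This article shows how to install and use the backtrader quantitative backtesting framework and combine it with chainerrl to implement a reinforcement-learning strategy. It covers setting up and running the backtest in a Jupyter Notebook environment, including how it runs in the cloud on the Azure Notebooks platform.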
  1. Install the frameworks: backtrader and chainerrl
pip install -U backtrader
pip install -U chainerrl
  2. Import the reinforcement-learning packages and their dependencies
    The following code runs in Jupyter Notebook, here hosted in the cloud on Azure Notebooks.
import chainer
import chainerrl
# Strategy

from datetime import datetime
import backtrader
import random

# Integrate Model
import sys
import warnings
import numpy
import pandas
warnings.filterwarnings('ignore')
# Build Instance and draw single plot
# Display matplotlib figures inline in the Jupyter notebook
%matplotlib inline
from mpl_toolkits import mplot3d
from mpl_toolkits.mplot3d import Axes3D
import matplotlib
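# Note: this selects the non-interactive Agg backend after %matplotlib inline,
# which may prevent figures from rendering inline; remove it if plots do not appear.
matplotlib.use('Agg')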
import matplotlib.pyplot
obs_size = 1500
n_actions = 3

# Q-function: a fully connected network that maps an observation to one Q-value per discrete action
I_Am_Q_Function = chainerrl.q_functions.FCStateQFunctionWithDiscreteAction(
    obs_size, n_actions, n_hidden_layers=7, n_hidden_channels=512)
# Adam optimizer from chainer.optimizers
optimizer = chainer.optimizers.Adam(eps=1e-2)
# Attach the optimizer to the Q-function's parameters
optimizer.setup(I_Am_Q_Function)
# Set the discount factor that discounts future rewards.
gamma = 0.95
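
To make the role of the discount factor concrete, here is a minimal sketch in plain Python. The function name `discounted_return` and the reward list are hypothetical and only for illustration; they are not part of chainerrl.

```python
def discounted_return(rewards, gamma):
    """Sum rewards so that a reward k steps ahead is weighted by gamma ** k."""
    total = 0.0
    # Walk backwards so each step applies one more factor of gamma
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical rewards for three consecutive steps
rewards = [1.0, 1.0, 1.0]
g = discounted_return(rewards, gamma=0.95)  # 1 + 0.95 + 0.95**2 = 2.8525
```

With gamma = 0.95, a reward two steps ahead counts for about 90% of an immediate one, so the agent still values future profit but prefers it sooner.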

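# Use epsilon-greedy for exploration.
# The helper returns a random action index. random.randint includes both
# endpoints, so the default range 0..2 covers all n_actions = 3 actions.
def Im_RandomInterger_Function(Interger_Range_Start=0, Interger_Range_End=2):
    return random.randint(Interger_Range_Start, Interger_Range_End)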

explorer = chainerrl.explorers.ConstantEpsilonGreedy(epsilon=0.3, random_action_func=Im_RandomInterger_Function)
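
The idea behind constant epsilon-greedy exploration can be sketched in plain Python as follows. This is an illustration of the general rule, not chainerrl's internal code; `epsilon_greedy` and the Q-values are hypothetical.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the best one."""
    if rng.random() < epsilon:
        # Inclusive bounds, matching the random-action helper in the article
        return rng.randint(0, len(q_values) - 1)
    return max(range(len(q_values)), key=lambda a: q_values[a])


rng = random.Random(0)
q_values = [0.1, 0.7, 0.2]  # hypothetical Q-values for the 3 actions

# epsilon = 0 never explores, so the greedy action is always returned
greedy_action = epsilon_greedy(q_values, epsilon=0.0, rng=rng)

# With epsilon = 0.3 the greedy action still dominates, but the others appear too
counts = [0, 0, 0]
for _ in range(10_000):
    counts[epsilon_greedy(q_values, epsilon=0.3, rng=rng)] += 1
```

With epsilon = 0.3 and three actions, the greedy action is expected about 80% of the time (70% exploitation plus a third of the 30% random picks).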

# DQN uses Experience Replay.
# Specify a replay buffer and its capacity.
replay_buffer = chainerrl.replay_buffer.ReplayBuffer(capacity=10 ** 6)
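
As a rough picture of what an experience replay buffer does, the sketch below uses a bounded deque. `TinyReplayBuffer` is a hypothetical stand-in written for this illustration and is not how chainerrl implements `ReplayBuffer`.

```python
import random
from collections import deque


class TinyReplayBuffer:
    """Hypothetical minimal buffer; not chainerrl's ReplayBuffer."""

    def __init__(self, capacity):
        # deque(maxlen=...) drops the oldest transition once the buffer is full
        self.memory = deque(maxlen=capacity)

    def append(self, state, action, reward, next_state, done):
        self.memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement breaks up temporal correlation
        return rng.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


buffer = TinyReplayBuffer(capacity=3)
for step in range(5):
    buffer.append(state=step, action=0, reward=1.0, next_state=step + 1, done=False)

batch = buffer.sample(2, rng=random.Random(0))
```

After five appends into a buffer of capacity 3, only the three most recent transitions remain, and training batches are drawn at random from those.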

# Observations may arrive as numpy.float64, while Chainer only accepts
# numpy.float32 by default, so specify a converter as the feature
# extractor function phi.
phi = lambda x: x.astype(numpy.float32, copy=False)

# Now create an agent that will interact with the environment.
I_am_DQN_Agent = chainerrl.agents.DoubleDQN(
    I_Am_Q_Function, optimizer, replay_buffer, gamma, explorer,
    replay_start_size=500, update_interval=1,
    target_update_interval=100, phi=phi)
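
For readers unfamiliar with Double DQN, the update target it learns toward can be sketched in plain Python. This shows the general Double DQN rule (the online network chooses the next action, the target network evaluates it) under assumed inputs; `double_dqn_target` and the Q-value lists are hypothetical and this is not chainerrl's internal code.

```python
def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Return the regression target for one transition under Double DQN."""
    if done:
        # No future value after a terminal step
        return reward
    # The online network selects the action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and the target network supplies its value
    return reward + gamma * q_target_next[best_action]


# Hypothetical next-state Q-values from the two networks (3 actions)
q_online_next = [0.2, 1.0, 0.5]
q_target_next = [0.9, 0.4, 0.3]

y = double_dqn_target(1.0, False, 0.95, q_online_next, q_target_next)  # 1 + 0.95 * 0.4
y_terminal = double_dqn_target(1.0, True, 0.95, q_online_next, q_target_next)
```

Note that a plain DQN target would use the target network's own maximum (0.9 here); decoupling selection from evaluation is what reduces over-estimation. In the agent above, `target_update_interval=100` controls how often the target network is refreshed from the online one.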