Deep Learning for Stock Prediction: Deep Reinforcement Learning for Automated Stock Trading

This article explores how deep reinforcement learning can be used for automated stock trading. Deep learning models can parse complex market trends to make more accurate stock predictions. The approach combines machine learning, deep learning, and Python programming, using tools such as TensorFlow to implement intelligent trading strategies.

Note from Towards Data Science’s editors: While we allow independent authors to publish articles in accordance with our rules and guidelines, we do not endorse each author’s contribution. You should not rely on an author’s works without seeking professional advice. See our Reader Terms for details.

This blog is based on our paper: Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy, presented at ICAIF 2020: ACM International Conference on AI in Finance.

Our code is available on GitHub.

Our paper will be available on arXiv soon.

If you want to cite our paper, the reference format is as follows:

Hongyang Yang, Xiao-Yang Liu, Shan Zhong, and Anwar Walid. 2020. Deep Reinforcement Learning for Automated Stock Trading: An Ensemble Strategy. In ICAIF ’20: ACM International Conference on AI in Finance, Oct. 15–16, 2020, Manhattan, NY. ACM, New York, NY, USA.

Overview

One can hardly overestimate the crucial role stock trading strategies play in investment.

Profitable automated stock trading strategy is vital to investment companies and hedge funds. It is applied to optimize capital allocation and maximize investment performance, such as expected return. Return maximization can be based on the estimates of potential return and risk. However, it is challenging to design a profitable strategy in a complex and dynamic stock market.

Every player wants a winning strategy. Needless to say, a profitable strategy in such a complex and dynamic stock market is not easy to design.

Yet, we are to reveal a deep reinforcement learning scheme that automatically learns a stock trading strategy by maximizing investment return.

Photo by Suhyeon on Unsplash

Our Solution: Ensemble Deep Reinforcement Learning Trading Strategy. This strategy includes three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). It combines the best features of the three algorithms, thereby robustly adjusting to different market conditions.

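To make the ensemble idea concrete, here is a minimal, hypothetical sketch (not the paper's actual implementation): after each validation window, the strategy keeps whichever agent achieved the highest Sharpe ratio on that window and lets it trade the next one. The agent names are real algorithm names, but the return series and the `pick_agent` helper are illustrative placeholders.

```python
# Sketch of the ensemble selection step: every validation window, pick the
# agent (PPO, A2C, or DDPG) with the highest validation Sharpe ratio.
# Agents here are stand-ins for trained DRL models, not real ones.
import statistics

def sharpe(returns):
    # Unannualized Sharpe ratio: mean return over its standard deviation.
    return statistics.mean(returns) / statistics.stdev(returns)

def pick_agent(validation_returns):
    """validation_returns: dict mapping agent name -> list of daily returns
    the agent produced on the validation window."""
    return max(validation_returns, key=lambda name: sharpe(validation_returns[name]))

validation_returns = {
    "PPO":  [0.002, -0.001, 0.003, 0.001],
    "A2C":  [0.001, -0.004, 0.002, 0.000],
    "DDPG": [0.004, -0.003, 0.001, 0.002],
}
print(pick_agent(validation_returns))  # the agent that trades the next window
```

The design choice is simple model selection rather than averaging: only one agent is active per trading window, which lets the ensemble switch styles as market conditions change.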
The performance of the trading agent with different reinforcement learning algorithms is evaluated using Sharpe ratio and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy.

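For reference, the annualized Sharpe ratio used for evaluation can be computed from a series of daily returns; a minimal sketch, assuming a zero risk-free rate and 252 trading days per year (the return series below is illustrative):

```python
import math

def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean daily excess return over its
    sample standard deviation, scaled by sqrt(periods_per_year)."""
    excess = [r - risk_free_rate / periods_per_year for r in daily_returns]
    mean = sum(excess) / len(excess)
    var = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return (mean / math.sqrt(var)) * math.sqrt(periods_per_year)

returns = [0.001, -0.002, 0.003, 0.0005, -0.001, 0.002]
print(sharpe_ratio(returns))
```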
Copyright by AI4Finance LLC

Part 1. Why do you want to use Deep Reinforcement Learning (DRL) for stock trading?

Existing approaches are not satisfactory, while the deep reinforcement learning approach has many advantages.

1.1 DRL and Modern Portfolio Theory (MPT)

  1. MPT does not perform well on out-of-sample data.

  2. MPT is very sensitive to outliers.

  3. MPT is calculated based only on stock returns. If we want to take other relevant factors into account, such as technical indicators like the Moving Average Convergence Divergence (MACD) and the Relative Strength Index (RSI), MPT may not be able to combine this information well.

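Indicators such as the MACD mentioned above are simple functions of the closing-price series. A minimal sketch of the MACD line, assuming the standard 12/26-period exponential moving averages (the price series is illustrative):

```python
def ema(prices, span):
    """Exponential moving average with smoothing factor 2/(span+1)."""
    alpha = 2 / (span + 1)
    out = [prices[0]]
    for p in prices[1:]:
        out.append(alpha * p + (1 - alpha) * out[-1])
    return out

def macd(prices, fast=12, slow=26):
    """MACD line: fast EMA minus slow EMA of the closing prices."""
    return [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]

prices = [100 + 0.5 * i for i in range(40)]  # steadily rising series
print(macd(prices)[-1] > 0)  # in an uptrend the MACD line turns positive
```

Unlike MPT, a DRL agent can simply take such indicator values as extra dimensions of its state, alongside returns.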
1.2 DRL and supervised machine learning prediction models

  1. DRL doesn’t need large labeled training datasets. This is a significant advantage: since the amount of data grows exponentially today, labeling a large dataset becomes very time- and labor-consuming.

  2. DRL uses a reward function to optimize future rewards, in contrast to an ML regression/classification model that predicts the probability of future outcomes.

1.3 The rationale of using DRL for stock trading

  1. The goal of stock trading is to maximize returns, while avoiding risks. DRL solves this optimization problem by maximizing the expected total reward from future actions over a time period.

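The "expected total reward from future actions over a time period" is conventionally written as a discounted sum, G = r_0 + γ·r_1 + γ²·r_2 + …, where γ is a discount factor; a minimal sketch:

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of future rewards, each discounted by gamma per step:
    G = r_0 + gamma * r_1 + gamma^2 * r_2 + ...
    Computed backwards for numerical simplicity."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# e.g., daily changes in portfolio value used as per-step rewards
print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
```

The DRL agent is trained to choose actions that maximize the expectation of this quantity rather than to predict individual price moves.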
  2. Stock trading is a continuous process of testing new ideas, getting feedback from the market, and trying to optimize the trading strategies over time. We can model stock trading process as Markov decision process which is the very foundation of Reinforcement Learning.

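The Markov decision process framing can be illustrated with a toy single-stock environment (a deliberately simplified sketch, not the paper's actual environment): the state is (cash, shares, price), the action is buy/hold/sell one share, and the reward is the resulting change in total portfolio value.

```python
class ToyTradingEnv:
    """Toy single-stock MDP: state = (cash, shares, price),
    action in {-1: sell one share, 0: hold, 1: buy one share},
    reward = change in total portfolio value after the price moves."""
    def __init__(self, prices, cash=1000.0):
        self.prices = prices
        self.t = 0
        self.cash = cash
        self.shares = 0

    def _value(self):
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action):
        price = self.prices[self.t]
        if action == 1 and self.cash >= price:      # buy one share
            self.cash -= price
            self.shares += 1
        elif action == -1 and self.shares > 0:      # sell one share
            self.cash += price
            self.shares -= 1
        before = self._value()
        self.t += 1                                  # market moves to next price
        reward = self._value() - before
        done = self.t == len(self.prices) - 1
        return (self.cash, self.shares, self.prices[self.t]), reward, done

env = ToyTradingEnv([10.0, 11.0, 12.0])
state, reward, done = env.step(1)  # buy at 10, price rises to 11
print(reward)                      # one held share gained a dollar
```

Because the next state depends only on the current state and action, the trading process satisfies the Markov property that reinforcement learning builds on.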
1.4 The advantages of deep reinforcement learning

  1. Deep reinforcement learning algorithms can outperform human players in many challenging games. For example, in March 2016, DeepMind’s AlphaGo program, a deep reinforcement learning algorithm, beat the world champion Lee Sedol at the game of Go.
