cc和毫升换算_毫升模型可以读取库存图表并预测价格

最新推荐文章于 2024-10-04 08:54:00 发布

weixin_26750481

最新推荐文章于 2024-10-04 08:54:00 发布

阅读量269

点赞数

文章标签：机器学习 python tensorflow nlp 人工智能

原文链接：https://towardsdatascience.com/can-an-ml-model-read-stock-charts-and-predict-prices-fb73c551c7a4

版权

cc和毫升换算

Note from Towards Data Science’s editors: While we allow independent authors to publish articles in accordance with our rules and guidelines, we do not endorse each author’s contribution. You should not rely on an author’s works without seeking professional advice. See our Reader Terms for details.

Towards Data Science编辑的注意事项： 尽管我们允许独立作者按照我们的 规则和指南 发表文章 ，但我们不认可每位作者的贡献。 您不应在未征求专业意见的情况下依赖作者的作品。 有关 详细信息， 请参见我们的 阅读器条款 。

Most of the ML models out there are trying to predict stock prices (or the changes in stock prices) using historical price data and other technical indicators — i.e., NUMERIC INPUTS. But, I asked myself, why can’t an ML model replicate precisely how a human trades in the stock markets?

那里的大多数ML模型都试图使用历史价格数据和其他技术指标(例如NUMERIC INPUTS)来预测股价(或股价变化)。但是，我问自己，为什么机器学习模型不能精确地复制人类 在股票市场的交易方式？

Any average person would open up the stock price charts (candlestick charts), and try to find some patterns in the price data. Maybe next, apply some technical indicators, peek a look at the company fundamentals, and finally make a decision on the next movement of the stock price as a combined output of all the charted and numeric inputs. What if an ML model could also make these conclusions from visual information?

任何普通人都会打开股票价格图表( 烛台图表 )，并尝试在价格数据中找到一些模式。也许接下来，应用一些技术指标，窥探公司基本面，最后决定股价的下一个走势，作为所有图表和数值输入的组合输出。 如果ML模型也可以从视觉信息中得出这些结论怎么办？

So let’s pass the price history time series as visual information to a CNN model as input. Will the model recognize the patterns in the visual data like a human does? Will this model perform better than a model that takes numeric data as input? Well, let’s find out!

因此，让我们将价格历史记录时间序列作为可视信息传递给CNN模型作为输入。模型会像人类一样识别视觉数据中的模式吗？该模型的性能是否会比将数字数据作为输入的模型更好？好吧，让我们找出答案！

Before we start, in my last article, we explore why using LSTMs blindly to predict stock prices is going to land you in trouble. You can read that here. Already read that? Good, now we can proceed.

在开始之前，我们在上一篇文章中探讨了为什么盲目地使用LSTM来预测股价会给您带来麻烦。您可以在这里阅读。已经看过了吗？好，现在我们可以继续。

And next, a disclaimer. None of this is financial advice, nor should you implement it directly in practice. Consider this as an experiment and nothing more. Cool, now this aside, we can get on with our experimenting!

接下来是免责声明。 这些都不是财务建议，也不应该在实践中直接实施。 将此视为实验，仅此而已。 太酷了，现在撇开这个，我们可以继续进行实验！

Instead of predicting stock prices, we shall predict the direction of price movement for the next day, i.e., whether the stock price will go up or down or sideways. Let’s get started!

代替预测股价，我们将预测第二天的价格走势 ，即股价是上涨还是下跌还是横盘整理。让我们开始吧！

The flow of this article is as follows:

本文的流程如下：

Get historical stock data in python.
在python中获取历史股票数据。
Convert the price data to a visual representation
将价格数据转换为可视化表示
Build and train a model with Tensorflow Keras.
使用Tensorflow Keras构建和训练模型。
Predict and interpret the results.
预测并解释结果。

步骤1：获取库存数据 (Step 1: Get Stock Data)

There are multiple options to get access to historical stock prices in python, but one of the most straightforward libraries is yfinance. Quite convenient and free, it gets the job done by scraping data from yahoo finance.

使用python可以查看历史股票价格的方法有多种，但是最简单的库之一是yfinance 。它非常方便和免费，它可以通过从yahoo Finance抓取数据来完成工作。

!pip install yfinance## Import the required libraries
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as pltfrom pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

For this article, I will take up the stock prices of ‘Reliance,’ the largest corporation in the Indian stock markets, but feel free to try this out for any other instrument as well.

在本文中，我将介绍印度股票市场上最大的公司“ Reliance”的股价，但也可以将其用于其他任何工具。

The following piece of code downloads stock price data for Reliance a period of around 18 years (I am intentionally avoiding the COVID period) with a resolution of 1 day and stores it in a pandas dataframe. You can vary these parameters as you deem fit for your experiments.

以下代码下载分辨率为1天的Reliance大约18年(我有意避免使用COVID期间)的股票价格数据，并将其存储在pandas数据框中。您可以根据自己的实验来更改这些参数。

Printing out the head of the pandas dataframe, you can see the various parameters like ‘Open’, ‘Close’, ‘High’, ‘Low’, ‘Volume’ available for the stock data.

打印出熊猫数据框的头部，您可以看到适用于股票数据的各种参数，例如“ Open”，“ Close”，“ High”，“ Low”，“ Volume”。

Image for post — Pandas dataframe top for Reliance stock data

Plotting out the ‘Close’ price to visualize the data, see how nicely the stock has risen in the last few years.

绘制“收盘价”以可视化数据，看看过去几年中该股的上涨情况如何。

Now we shall add some additional features to our data. First, calculate the percentage change from the previous day close. Next based this change, generate a signal which is:

现在，我们将向数据添加一些其他功能。首先，计算前一天收盘价的百分比变化。接下来根据此更改，生成一个信号：

0 (if the change is lesser than -2.5%)
0(如果变化小于-2.5％)
1 (if the change is between -2.5% and -1%)
1(如果变化介于-2.5％和-1％之间)
2 (if the change is between -1% and 1%)
2(如果变化在-1％和1％之间)
3 (if the change is between 1% and 2.5%)
3(如果变化介于1％和2.5％之间)
4 (if the change is greater than 2.5%)
4(如果更改大于2.5％)

The dataframe now looks like:

数据框现在看起来像：

Now, this is completely arbitrary, you can vary it as you please. You can break the change into an even more/lesser number of parts (adjust the last layer of the model accordingly).

现在，这完全是任意的，您可以随意更改。 您可以将更改分为更多/更少的零件(相应地调整模型的最后一层)。

步骤2：股票价格的直观表示 (Step 2: Visual Representation of Stock Prices)

Now, this step is very important, since it will decide what the model sees. For this article, I want to keep the images as close as possible to the candlestick charts a human trader sees. A constraint is that every input image has to be of the same dimensions and must contain sufficient information for the model to make conclusions from.

现在，此步骤非常重要，因为它将决定模型的外观。对于本文，我想使图像尽可能地接近交易员看到的烛台图。一个约束是每个输入图像必须具有相同的尺寸，并且必须包含足够的信息以供模型做出结论。

We shall have a lookback period, for example, the past 50 days (10 trading weeks), which shall be represented. For simplicity, for each of these 50 days, the open and close, and the direction of the price movement will be encoded.

我们将有一个回溯期 ，例如过去的50天 (10个交易周)。为简单起见，将对这50天的开盘价和收盘 价以及价格走势进行编码。

Since images of the same dimension will be fed to the neural network, it might not understand where this image stands in the entirety of the 15 years. For example, in 2002, the prices are at ~80–90 whereas they slowly rise with time to ~1800. Although price patterns may (or may not) be independent of the absolute stock prices, I wanted to encode some of this information as well in the images. So, to give the image some context to the past prices and their absolute price levels, the 50-day price representations will be scaled in a bigger time window which considers additional previous price values.

由于相同尺寸的图像将被馈送到神经网络，因此可能无法理解该图像在整个15年中的位置。例如，在2002年，价格大约为80-90，而随着时间的流逝，价格缓慢上升至大约1800。尽管价格模式可能(或可能不)独立于绝对股票价格，但我还是希望在图像中对某些信息进行编码。因此，为了使图像具有过去价格及其绝对价格水平的背景信息，将在更大的时间范围内缩放50天价格表示形式，该时间范围考虑了其他先前的价格值。

The code for generating this visual representation is:

用于生成此视觉表示的代码为：

Code to generate an image representation from stock price data

从股票价格数据生成图像表示的代码

Note: This is a visual representation very close to what we see on the candlestick charts (that was the whole point). However, you can use your imagination (and some discretion) to create a completely different visual representation with other encoded parameters as well. Because the way a CNN model sees and learns from an image might (or might not) be very different from how we do!

注意：这是非常接近我们在烛台图表上看到的直观表示(这是重点)。 但是，您也可以运用自己的想象力(和某些判断力)来创建具有其他编码参数的完全不同的视觉表示。 因为CNN模型查看图像和从图像中学习的方式可能(也可能不会)与我们的做法大不相同！

Next, let’s create a data generator that will iterate through the 15 years of data and create pairs of images and the corresponding next-day-predictions.

接下来，让我们创建一个数据生成器 ，该数据生成器将迭代15年的数据，并创建图像对和相应的次日预测。

This data generates image-prediction batches like

此数据生成图像预测批次，例如

步骤3：建立和训练ML模型 (Step 3: Building and Training an ML Model)

Since we have inputs as images and require outputs as one of three classes (up, down, no movement), our model will have a few convolutional layers, followed by a few dense layers and finally a softmax function.

由于我们将输入作为图像并且将输出作为三类(向上，向下，无运动)之一，因此我们的模型将具有一些卷积层，然后是一些密集层，最后是softmax函数。

Let’s initiate the generator objects and start the training.

让我们启动生成器对象并开始训练。

Initialize all the data generator objects and start training the model

初始化所有数据生成器对象并开始训练模型

步骤4：可视化结果 (Step 4: Visualizing the Results)

Post-training, the losses can be plotted as:

训练后，损失可以绘制为：

plt.plot(history.history['loss'], label='train')
plt.plot(history.history['val_loss'], label='val')
plt.legend()
plt.show()

Let’s now see the accuracy on the test data.

现在让我们看一下测试数据的准确性。

## Evaluating the performance of the model
print(model.evaluate(test_gen,steps = (len(data)-split_test)//batch_size))>>> loss: 1.3830 - accuracy: 0.4375

Well, given 5 possible outcomes, the accuracy of randomly guessing one would be 0.2. So this model seems to have performed quite well comparatively. Or has it?

好吧，给定5个可能的结果，随机猜测一个的准确性为0.2。因此，该模型相对而言似乎表现良好。还是有？

Plotting out the predicted outputs, it’s clear what has happened. The model has not learned anything useful yet, so it ends up predicting “2” as every output resulting in relatively high accuracy.

绘制出预测的输出，很清楚发生了什么。该模型尚未学到任何有用的东西，因此最终预测每个输出为“ 2”，从而导致相对较高的准确性。

Let’s train the model some more then. But now, the validation loss increases greatly. Plus, testing gives an accuracy of less than 0.3. Not very encouraging, the model has probably overfitted on the training data. A very clear reason for this is the small size of training data compared to the model size.

让我们再训练一下模型。但是现在，验证损失大大增加了。另外，测试得出的准确度小于0.3。不太令人鼓舞，该模型可能过度拟合了训练数据。一个很明显的原因是训练数据与模型大小相比较小 。

结论词... (Concluding Words…)

Although the performance was not extremely encouraging, this was quite an innovative approach of looking at the age-old problem of predicting stock prices. A few more ideas on these lines are giving multi-timeframe inputs (say, weekly data as inputs as well), or additionally providing technical indicators as numeric data (in parallel with visual charts), or transform the price data into other visual domains (doesn’t have to be the charts as we see them), etc. What else do you think we can do to improve this model? Do you have any other interesting ideas?

尽管表现并不十分令人鼓舞，但这是一种创新的方法，可以解决预测股票价格的古老问题。这些方面的其他一些想法是提供多时间框架输入(例如，每周数据也作为输入)，或者另外提供技术指标作为数字数据(与可视化图表并行)，或者将价格数据转换为其他可视域(不必一定是我们看到的图表)，等等。您认为我们还可以做些什么来改进此模型？您还有其他有趣的想法吗？

After trying out all these ideas, the question remains — is feeding visual data providing any edge that numeric data cannot? What do you think?

在尝试了所有这些想法之后，问题仍然存在–馈送视觉数据是否提供了数字数据无法提供的优势？ 你怎么看？

You can find the entire code in my GitHub repository.

您可以在我的GitHub存储库中找到整个代码。

Finally, I am not saying I am an expert in these fields — I am just putting forward my explorations on this topic, so feel free to point out my errors or add anything I missed. I would love to hear your feedback.

最后，我并不是说我是这些领域的专家—我只是在提出关于该主题的探索，因此可以随时指出我的错误或添加我遗漏的任何内容。 我希望听到您的反馈。

Do you want to be rich overnight using ML in stocks? This article is (NOT) for you!

您想在一夜之间使用ML股票来致富吗？ 这篇文章不适合您！

Or check out some of my other machine learning articles. I’m sure you’ll find them useful…

或查看其他一些机器学习文章。 我相信您会发现它们很有用……