Deep Learning Course 1, Week 4: My First Deep Learning Project

Tags: deep learning, artificial intelligence, python, machine learning, tensorflow

Original article: https://medium.com/@mateusnobrests/my-first-deep-learning-project-3f231b47c74c

Looking for patterns in the stock market using sentiment analysis

TL;DR:

It’s pretty difficult to spot patterns in the stock market. Correlating market data with sentiment analysis of financial news can give us a glimpse of those patterns.
Transfer learning is an extremely useful technique when you have limited time and resources.
Interpreting the model’s results and translating them into a real-life context is a very important, and hard, part of the process.
Working alone is great, but on a team you can do bigger things and pick up a bit of knowledge from each person.

Before anything else...

Huge SHOUTOUT to my teammates:

João Sarmento (https://www.linkedin.com/in/joaolrsarmento/)

Kenji Yamane (https://github.com/kenji-yamane)

Huge THANKS to my professor:

Marcos Ricardo Omena de Albuquerque Máximo (Escavador)

Problem:

The original idea of the project changed a lot along the way, but the main goal was to identify patterns in the stock market and correlate them with sentiment analysis of news.

The success criteria were to generate valuable insights and to get a proof of concept in real life (a $0.10 profit, for example).

Solution:

We used an Amazon Reviews dataset to train a BERT-based model for sentiment analysis, applying transfer learning to our problem.

Financial news about Netflix, Amazon, Facebook and other companies was collected using the Stock News API, and the historical market data was taken from the Nasdaq website.

Using a “degree of positivity” metric, created so that we could see the patterns, we merged the sentiment analysis of the financial news with the Nasdaq data.

The graphs used to analyze the results are time series of our metric and of the market’s fluctuations.
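The post does not spell out how the “degree of positivity” was computed, so what follows is only a minimal sketch under an assumed definition: the share of headlines labelled positive on a given day. The sample records, the field names and the `daily_positivity` and `merge_by_date` helpers are all hypothetical; only the idea of joining a daily sentiment score with Nasdaq historical data by date comes from the text.

```python
from collections import defaultdict

# Hypothetical model output: (date, predicted label) per headline.
news_predictions = [
    ("2020-07-01", "positive"),
    ("2020-07-01", "negative"),
    ("2020-07-01", "positive"),
    ("2020-07-02", "negative"),
    ("2020-07-03", "positive"),
]

# Made-up Nasdaq-style rows: date -> (closing price, traded volume).
nasdaq_history = {
    "2020-07-01": (100.0, 9_000_000),
    "2020-07-02": (102.5, 11_000_000),
    "2020-07-06": (101.0, 8_000_000),
}


def daily_positivity(predictions):
    """Assumed metric: fraction of headlines labelled positive on each day."""
    counts = defaultdict(lambda: [0, 0])  # date -> [positive, total]
    for day, label in predictions:
        counts[day][1] += 1
        if label == "positive":
            counts[day][0] += 1
    return {day: pos / total for day, (pos, total) in counts.items()}


def merge_by_date(positivity, history):
    """Inner join: keep only days that have both news and market data."""
    return [
        {
            "date": day,
            "positivity": positivity[day],
            "close": history[day][0],
            "volume": history[day][1],
        }
        for day in sorted(positivity.keys() & history.keys())
    ]


merged = merge_by_date(daily_positivity(news_predictions), nasdaq_history)
for row in merged:
    print(row)
```

An inner join silently drops days with news but no trading (weekends, holidays); whether such news should be carried over to the next trading day is a design choice the post does not discuss.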

Model:

BERT (Bidirectional Encoder Representations from Transformers) is designed to pre-train deep bidirectional representations by considering both left and right context. We applied transfer learning to our problem, starting from the BERT base model (109M parameters).
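The 109M figure can be reproduced with simple arithmetic. The sketch below assumes the configuration of the uncased English BERT-base checkpoint (12 layers, hidden size 768, feed-forward size 3072, a vocabulary of 30,522 tokens, 512 positions, 2 segment types) and counts the encoder plus pooler, without any task-specific classification head. The post does not say which checkpoint the team used, so treat the vocabulary size in particular as an assumption.

```python
def bert_param_count(vocab=30522, hidden=768, layers=12, ffn=3072,
                     max_pos=512, type_vocab=2):
    """Count weights and biases of a BERT encoder with pooler."""
    layer_norm = 2 * hidden  # scale and shift vectors

    # Token, position and segment embeddings, followed by a LayerNorm.
    embeddings = (vocab + max_pos + type_vocab) * hidden + layer_norm

    # Self-attention: query, key, value and output projections (with biases),
    # followed by a LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Feed-forward block: hidden -> ffn -> hidden (with biases), plus LayerNorm.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden) + layer_norm

    # Pooler: one dense layer applied to the [CLS] token.
    pooler = hidden * hidden + hidden

    return embeddings + layers * (attention + feed_forward) + pooler


total = bert_param_count()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")  # 109,482,240 parameters (~109M)
```

About 78% of the parameters sit in the 12 encoder layers, which is why fine-tuning only a small head on top of frozen layers is so much cheaper than training from scratch.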

Results:

[Figure: fine-tuning result after 2 epochs]

The model improved a lot after 2 epochs, so our fear of getting stuck and not getting results with transfer learning was unfounded. We left everything to the last minute, so we had very little time to train the model.

These 2 graphs show some of our results. The most promising result was for the daily traded volume, but even that doesn’t tell us a clear story about the market.

On the volume graph, we can see that bigger fluctuations in the ‘degree of positivity’ are sometimes related to bigger traded volumes. But the metric is not able to account for the big booms in traded volume.
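One way to put a number on “sometimes related” is to correlate the size of the day-to-day change in the metric with the traded volume. The sketch below does that with a hand-written Pearson coefficient. The two short series are made-up placeholders, not the project's data, and `pearson` is a helper written for this illustration.

```python
from math import sqrt

# Made-up placeholder series, one value per trading day.
positivity = [0.50, 0.55, 0.30, 0.35, 0.80, 0.78]
volume = [8.0, 8.5, 12.0, 9.0, 15.0, 8.2]  # millions of shares


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Absolute change of the metric from the previous day (the first day has none).
swings = [abs(b - a) for a, b in zip(positivity, positivity[1:])]
# Align each swing with the volume of the day on which it happened.
r = pearson(swings, volume[1:])
print(f"correlation between |change in positivity| and volume: {r:.2f}")
```

On real data, a weak coefficient would be consistent with the observation above that the metric follows some volume changes but misses the largest spikes.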

What I Did:

I was in charge of:

Getting the model’s results on stock news.
Interpreting the results.
Crossing the Nasdaq data with the sentiment analysis in a meaningful way and providing insights.
Getting real-time predictions from our model using Gradio.
Organizing everything into a project, so anyone can use our work and reproduce our results (that part really takes time).

I got the trained model.h5 from Kenji and the news from João, and did my job on top of their work.
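Serving a classifier through Gradio can be sketched roughly as follows. This is not the project's code: `score_headline` is a hypothetical stand-in for the fine-tuned model (loading model.h5 and tokenizing the text are omitted), and the word lists exist only so the example runs without the trained weights. The Gradio import is kept inside `build_demo` so the scoring function can be tried without Gradio installed.

```python
def score_headline(text):
    """Hypothetical stand-in for the fine-tuned BERT model.

    Returns a label -> confidence mapping, the format that Gradio's
    "label" output component accepts.
    """
    positive_words = {"beats", "surges", "record", "growth"}
    negative_words = {"misses", "falls", "lawsuit", "loss"}
    words = text.lower().split()
    pos = sum(w in positive_words for w in words)
    neg = sum(w in negative_words for w in words)
    if pos + neg == 0:
        return {"positive": 0.5, "negative": 0.5}
    return {"positive": pos / (pos + neg), "negative": neg / (pos + neg)}


def build_demo():
    """Wrap the scoring function in a simple text-in, label-out web UI."""
    import gradio as gr  # third-party package: pip install gradio

    return gr.Interface(
        fn=score_headline,
        inputs="text",
        outputs="label",
        title="Financial news sentiment",
    )


print(score_headline("Netflix beats estimates with record growth"))
```

Calling `build_demo().launch()` starts a local web page with a text box and a label output; swapping `score_headline` for a function that runs the real model would be the only change needed.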

What we could’ve done better:

Run a decent statistical analysis of the stock market data, trying to understand days with large variation, minimum and maximum values, and seasonality.
Use a faster version of the model, such as TinyBERT, to be able to process more data.
Fine-tune on financial news data to get better ‘transfer learning’ from BERT.
Analyze the results more carefully, trying different visualizations and different ways to build that ‘degree of positivity’ metric.
Train more.
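As an illustration of the first item, a basic statistical pass over closing prices could look like the sketch below. The prices are made-up placeholders, and both the 1.5-standard-deviation threshold for a “large variation” day and the per-weekday average as a seasonality check are assumptions chosen for this example, not something the project used.

```python
from datetime import date
from statistics import mean, pstdev

# Made-up closing prices for consecutive trading days (placeholders only).
closes = [
    (date(2020, 7, 6), 100.0),
    (date(2020, 7, 7), 101.0),
    (date(2020, 7, 8), 100.5),
    (date(2020, 7, 9), 108.0),
    (date(2020, 7, 10), 107.0),
    (date(2020, 7, 13), 106.5),
]

# Daily return: relative change of the close versus the previous trading day.
returns = [
    (day, price / prev_price - 1.0)
    for (_, prev_price), (day, price) in zip(closes, closes[1:])
]

values = [r for _, r in returns]
mu, sigma = mean(values), pstdev(values)

# Days with large variation: more than 1.5 standard deviations from the mean.
large_moves = [day for day, r in returns if abs(r - mu) > 1.5 * sigma]

# Minimum and maximum closing price, each with its date.
low = min(closes, key=lambda row: row[1])
high = max(closes, key=lambda row: row[1])

# Crude seasonality check: average return per weekday (0 = Monday).
by_weekday = {}
for day, r in returns:
    by_weekday.setdefault(day.weekday(), []).append(r)
weekday_means = {wd: mean(rs) for wd, rs in sorted(by_weekday.items())}

print("large moves:", large_moves)
print("low:", low, "high:", high)
print("mean return by weekday:", weekday_means)
```

With only a handful of days the weekday averages mean nothing; a real seasonality check would need months or years of history.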

Context about the project:

I enrolled in CT-213 (Artificial Intelligence in Mobile Robotics) that semester. With no experience in Python, but with a lot of willpower and help from colleagues (such as the ones who developed this project with me), I struggled and learned a lot!

From gradient descent to genetic algorithms and reinforcement learning, I learned about the possibilities and limitations of neural networks, the math behind them, and how to apply them to toy problems.