Stock Market Screening and Analysis: Using Web Scraping, Neural Networks, and Regression Analysis in Investing...

Trading with AI

Stock markets react quickly to factors such as news and earnings reports. While it may be prudent to build trading strategies on fundamental data, rapid market changes are very hard to predict, and such strategies may not suit the goals of short-term traders. This study uses data science both to identify high-potential stocks and to forecast future prices and price movements, with the aim of improving an investor’s chances of success.

In the first half of this analysis, I will introduce a stock-screening strategy that identifies the highest-ranked stocks by trading volume during the trading day. I will also include Twitter and sentiment analysis data to give an idea of which stocks are most likely to go up in the near future. The second half of the project will apply forecasting techniques to the chosen stock(s). I will apply deep learning via a long short-term memory (LSTM) neural network, a form of recurrent neural network (RNN), to predict close prices. Finally, I will demonstrate how simple linear regression could aid in forecasting.

Part 1: Stock screening

Let’s begin by web-scraping data on the most active stocks in a given time period, in this case one day. Higher trading volume tends to come with greater price volatility, which could result in larger gains. The main Python packages that help with this task are the yahoo_fin, alpha_vantage, and pandas libraries.

# Import relevant packages
import yahoo_fin.stock_info as ya
from alpha_vantage.timeseries import TimeSeries
from alpha_vantage.techindicators import TechIndicators
from alpha_vantage.sectorperformance import SectorPerformances
import pandas as pd
import pandas_datareader as web
import matplotlib.pyplot as plt
from bs4 import BeautifulSoup
import requests
import numpy as np

# Get the 100 most traded stocks for the trading day
movers = ya.get_day_most_active()
movers.head()

The yahoo_fin package is able to provide the top 100 stocks with the largest trading volume. We are interested in stocks with a positive change in price so let’s filter based on that.

movers = movers[movers['% Change'] >= 0]
movers.head()
Stocks with the largest trading volume for the trading day, filtered by the price change
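
The filter above assumes the '% Change' column is already numeric. If the scraped column instead arrives as text such as '+1.25%', a small coercion step could run before the comparison. The following is a minimal sketch on made-up rows; the tickers, the values, and the helper name to_percent are hypothetical and not part of yahoo_fin.

```python
import pandas as pd

# Hypothetical sample rows standing in for the scraped most-active table
sample = pd.DataFrame({
    'Symbol': ['AAA', 'BBB', 'CCC'],
    '% Change': ['+1.25%', '-0.40%', '0.00%'],
})

def to_percent(series):
    """Convert values like '+1.25%' to floats; unparseable values become NaN."""
    cleaned = (
        series.astype(str)
        .str.replace('%', '', regex=False)
        .str.replace(',', '', regex=False)
    )
    return pd.to_numeric(cleaned, errors='coerce')

sample['% Change'] = to_percent(sample['% Change'])

# Keep only stocks whose price did not fall during the day
gainers = sample[sample['% Change'] >= 0]
print(gainers)
```

Because the helper casts to string first, it also leaves a column that is already numeric unchanged in value.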

Excellent! We have successfully scraped the data using the yahoo_fin Python module. It is often a good idea to check whether those stocks are also generating attention, and what kind of attention it is, to avoid getting into false rallies. We will scrape some sentiment data courtesy of sentdex. Sentiment can lag depending on the source (for example, a news article published an hour after the event), so we will also use tradefollowers for their Twitter sentiment data. We will process both lists independently and then combine them. For both the sentdex and tradefollowers data we use a 30-day time period. A single day might be great for day trading, but it increases the probability of jumping on false rallies. NOTE: sentdex only covers stocks that belong to the S&P 500. The BeautifulSoup library makes this process fairly simple.

res = requests.get('http://www.sentdex.com/financial-analysis/?tf=30d')
# Name the parser explicitly so BeautifulSoup does not have to guess one
soup = BeautifulSoup(res.text, 'html.parser')
table = soup.find_all('tr')

# Initialize empty lists to store stock symbol, sentiment and mentions
stock = []
sentiment = []
mentions = []
sentiment_trend = []

# Use try and except blocks to mitigate missing data
for ticker in table:
    ticker_info = ticker.find_all('td')

    # A row with too few cells raises IndexError; record None in that case
    try:
        stock.append(ticker_info[0].get_text())
    except IndexError:
        stock.append(None)
    try:
        sentiment.append(ticker_info[3].get_text())
    except IndexError:
        sentiment.append(None)
    try:
        mentions.append(ticker_info[2].get_text())
    except IndexError:
        mentions.append(None)
    try:
        # An upward chevron icon marks a rising sentiment trend
        if ticker_info[4].find('span', {"class": "glyphicon glyphicon-chevron-up"}):
            sentiment_trend.append('up')
        else:
            sentiment_trend.append('down')
    except IndexError:
        sentiment_trend.append(None)
Sentiment data obtained from sentdex
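
The merge step later in the article expects a data frame named company_info with a 'Symbol' column, but the listing that builds it is not shown. One way the four lists could be assembled is sketched below. The column names other than 'Symbol', the sample values, and the cleanup steps are assumptions for illustration, not the author's confirmed code.

```python
import pandas as pd

# Hypothetical scraped values; in the article these lists are filled by the loop.
# A row without <td> cells (for example a header row) leaves None in every list.
stock = [None, 'AAA', 'BBB']
sentiment = [None, 'very good', 'bad']
mentions = [None, '1,204', '87']
sentiment_trend = [None, 'up', 'down']

# 'Symbol' matches the merge key used later; the other names are assumed
company_info = pd.DataFrame({
    'Symbol': stock,
    'Sentiment': sentiment,
    'direction': sentiment_trend,
    'Mentions': mentions,
})

# Drop rows without a ticker and turn mention counts into numbers
company_info = company_info.dropna(subset=['Symbol']).reset_index(drop=True)
company_info['Mentions'] = pd.to_numeric(
    company_info['Mentions'].str.replace(',', '', regex=False),
    errors='coerce',
)
print(company_info)
```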

We then combine these results with our earlier list of the most traded stocks with positive price changes on a given day. This is done with a left join of the sentiment data frame onto the original movers data frame, so every mover is kept even when no sentiment data is available for it.

top_stocks = movers.merge(company_info, on='Symbol', how='left')
top_stocks.drop(['Market Cap','PE Ratio (TTM)'], axis=1, inplace=True)
top_stocks
A merged data frame containing the biggest movers and their sentiment information (if available)
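
To see what the left join does to stocks that have no sentiment entry, here is a small sketch on toy data. The tickers and values are hypothetical; the indicator=True argument of pandas merge adds a _merge column that records whether each row found a match.

```python
import pandas as pd

# Hypothetical movers and sentiment tables
movers = pd.DataFrame({
    'Symbol': ['AAA', 'BBB', 'CCC'],
    '% Change': [1.2, 0.4, 3.1],
})
company_info = pd.DataFrame({
    'Symbol': ['AAA', 'CCC'],
    'Sentiment': ['good', 'very good'],
})

# A left join keeps every mover; unmatched rows get NaN in the sentiment columns
merged = movers.merge(company_info, on='Symbol', how='left', indicator=True)

# True where sentiment data was found for the mover
has_sentiment = merged['_merge'] == 'both'
print(merged)
```

In this toy case BBB stays in the result with a missing Sentiment value, which corresponds to the "if available" note in the caption above.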

The movers data frame had a total of 45 stocks, but for brevity only 10 are shown here. A few stocks show both very good sentiment and an upward trend in favourability. ZNGA, TWTR, and AES (not shown), for instance, stood out as potentially good picks. Note that the mentions here refer to the number of times the stock was referenced according to the internal metrics used by sentdex. Let’s supplement this information with some data based on Twitter. We get stocks that showed the strongest Twitter sentiment over a one-month period and were considered bullish.
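
Picking out stocks with good sentiment and an upward trend can also be done programmatically instead of by eye. The sketch below assumes the merged table has a 'Sentiment' text column and a 'direction' column holding 'up' or 'down'. Those column names, the label vocabulary, and the sample rows are all hypothetical.

```python
import pandas as pd

# Hypothetical merged table of movers and their sentiment
top_stocks = pd.DataFrame({
    'Symbol': ['AAA', 'BBB', 'CCC', 'DDD'],
    'Sentiment': ['very good', 'good', None, 'bad'],
    'direction': ['up', 'down', None, 'up'],
})

# Assumed set of labels that count as favourable sentiment
GOOD_LABELS = {'good', 'very good'}

# Keep stocks that are both well regarded and trending upward;
# rows with missing sentiment fail both tests and are dropped
shortlist = top_stocks[
    top_stocks['Sentiment'].isin(GOOD_LABELS)
    & (top_stocks['direction'] == 'up')
]
print(shortlist)
```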
