Scraping Web Page Keywords with Python
Financial data is abundant today, and there are seemingly even more places to source it from. There are many ways to gather that data, and a lot of them rely on third-party API libraries that must be installed on your system before you can make the necessary API calls. In this quick notebook walkthrough, we demonstrate how to perform a simple JSON web scrape to fetch the data and then organize it into a pandas DataFrame. We then use the Python library Plotly to visualize the indicators.
Importing Libraries
import pandas as pd
import requests
import json
import plotly.graph_objects as go
Next, we write our web scrape function. We will be using the third-party API from DB-nomics (db.nomics.world). The API serves our data over HTTP at a URL, so we first need to turn the response into a JSON object that Python can work with. We use the requests library to fetch the response and parse it. At this point, the parsed JSON is a nested dictionary, which means we need to index into it to grab the actual data. We pull out three variables: one for the time-series index (periods), one for the actual data values (values), and one for the overarching dataset (dataset), which is used to label the final DataFrame. We then return the DataFrame “indicators”.
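The steps described above can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact function used in the walkthrough: the function names `indicator_frame` and `scrape_indicator`, the nesting of the response (`series` → `docs` → first entry, with `period`, `value`, and `dataset_name` keys), and the sample payload are all assumptions made for illustration. Check them against a real response from the API before relying on them.

```python
import pandas as pd
import requests


def indicator_frame(payload):
    """Build a one-column DataFrame from an already-parsed JSON payload.

    ASSUMPTION: the payload is nested as
    payload["series"]["docs"][0] with "period", "value" and "dataset_name" keys.
    """
    doc = payload["series"]["docs"][0]
    periods = doc["period"]        # time-series index
    values = doc["value"]          # actual data values
    dataset = doc["dataset_name"]  # dataset name, used as the column label
    return pd.DataFrame({dataset: values}, index=periods)


def scrape_indicator(url):
    """Fetch a series URL, parse the JSON body, and return the DataFrame."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return indicator_frame(response.json())


# Made-up payload with the assumed shape, so the sketch runs without network access.
sample_payload = {
    "series": {
        "docs": [
            {
                "period": ["2020-01", "2020-02", "2020-03"],
                "value": [1.0, 1.5, 2.0],
                "dataset_name": "Example dataset",
            }
        ]
    }
}

indicators = indicator_frame(sample_payload)
print(indicators)
```

Splitting the parsing step from the HTTP request keeps the indexing logic easy to check against a saved response. With a real series URL, `scrape_indicator(url)` would return the same kind of DataFrame.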