Time series modeling to predict the returns of investment funds

A time series analysis covering ARIMA, auto ARIMA, autocorrelation (ACF), partial autocorrelation (PACF), stationarity, and differencing.

Where is the data?

Most financial time series examples use built-in data sets. Sometimes these data do not meet your needs, and you hit a roadblock if you don't know how to get the exact data you need.

In this article I will demonstrate how to extract data directly from the internet. Some packages already do this, but if you need other related data, you will have to do it yourself. I will show you how to extract fund information directly from a website. For the sake of demonstration, I have picked Brazilian funds listed on the website of the CVM (Comissão de Valores Mobiliários), the government agency that regulates the financial sector in Brazil.

Most countries probably have similar institutions that store financial data and give the public free access to it, so you can target those instead.

Downloading the data from a website

To download the data from a website we can use the getURL function from the RCurl package. The package can be installed from CRAN by running install.packages("RCurl") in the console.

The data is downloaded from http://dados.cvm.gov.br/dados/FI/DOC/INF_DIARIO/DADOS/

library(tidyverse) # packages for handling and organizing messy data
library(RCurl) # package used to download a spreadsheet from a website
library(forecast) # package for time-series modeling and forecasting
library(PerformanceAnalytics) # package for analyzing financial time-series data
library(readr) # package that provides the read_delim() function

#creating an object with the spreadsheet url
url <- "http://dados.cvm.gov.br/dados/FI/DOC/INF_DIARIO/DADOS/inf_diario_fi_202006.csv"

#downloading the data and storing it in an R object
text_data <- getURL(url, connecttimeout = 60)

#creating a data frame with the downloaded file. I use read_delim function to fit the delim pattern of the file. Take a look at it!
df <- read_delim(text_data, delim = ";")

#The first six lines of the data
head(df)
## # A tibble: 6 x 8
## CNPJ_FUNDO DT_COMPTC VL_TOTAL VL_QUOTA VL_PATRIM_LIQ CAPTC_DIA RESG_DIA
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 00.017.02~ 2020-06-01 1123668. 27.5 1118314. 0 0
## 2 00.017.02~ 2020-06-02 1123797. 27.5 1118380. 0 0
## 3 00.017.02~ 2020-06-03 1123923. 27.5 1118445. 0 0
## 4 00.017.02~ 2020-06-04 1124052. 27.5 1118508. 0 0
## 5 00.017.02~ 2020-06-05 1123871. 27.5 1118574. 0 0
## 6 00.017.02~ 2020-06-08 1123999. 27.5 1118639. 0 0
## # ... with 1 more variable: NR_COTST <dbl>

Handling the messy data

This data set contains a lot of information about all the funds registered with the CVM. First of all, we must choose one of them to apply our time-series analysis to.

There are a lot of funds in the Brazilian market. To count how many, we can run the following code:

#get the unique identification code for each fund
x <- unique(df$CNPJ_FUNDO)

length(x) # Number of funds registered in Brazil.
## [1] 17897

I selected the Alaska Black FICFI Em Ações - Bdr Nível I fund, with identification code (CNPJ) 12.987.743/0001-86, to perform the analysis.
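
Picking out a single fund amounts to filtering the rows on the CNPJ_FUNDO column. Below is a minimal sketch on a toy data frame, using base R so it runs without extra packages. The values are made up for illustration and are not CVM data, and the object names `toy`, `cnpj`, and `fund` are assumptions rather than names from the original code.

```r
# Toy stand-in for the downloaded data; the values are illustrative only.
toy <- data.frame(
  CNPJ_FUNDO = c("00.000.000/0001-00", "12.987.743/0001-86", "12.987.743/0001-86"),
  DT_COMPTC  = as.Date(c("2020-06-01", "2020-06-02", "2020-06-01")),
  VL_QUOTA   = c(27.5, 2.15, 2.10),
  stringsAsFactors = FALSE
)

# Identification code of the fund we want to keep.
cnpj <- "12.987.743/0001-86"

# Keep only the rows of that fund, then sort them by date.
fund <- toy[toy$CNPJ_FUNDO == cnpj, ]
fund <- fund[order(fund$DT_COMPTC), ]

print(fund)
```

The same comparison works on the real `df`, since CNPJ_FUNDO is read as a character column.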

Before we start, we need more observations to do a good analysis. To cover a wider time window, we need to download more data from the CVM website. We can do this by adding other months to the data.

For this we must take some steps:

First, we must generate a sequence of paths to loop over when downloading the data. With the commands below, we will take data from January 2018 to July 2020.

# With this command we generate a list of urls for the years of 2020, 2019, and 2018 respectively.

url20 <- c(paste0("http://dados.cvm.gov.br/dados/FI/DOC/INF_DIARIO/DADOS/inf_diario_fi_", 202001:202007, ".csv"))
url19 <- c(paste0("http://dados.cvm.gov.br/dados/FI/DOC/INF_DIARIO/DADOS/inf_diario_fi_", 201901:201912, ".csv"))
url18 <- c(paste0("http://dados.cvm.gov.br/dados/FI/DOC/INF_DIARIO/DADOS/inf_diario_fi_", 201801:201812, ".csv"))
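
The integer ranges above work because each one stays inside a single year. A range that crosses a year boundary, such as 201811:201902, would also produce codes for months that do not exist, such as 201813. As an alternative sketch (not the author's code), the same file names can be built from a sequence of dates. The names `base_url`, `months`, and `urls` are assumptions introduced for this example.

```r
# Common prefix of the monthly files, taken from the URLs above.
base_url <- "http://dados.cvm.gov.br/dados/FI/DOC/INF_DIARIO/DADOS/inf_diario_fi_"

# One date per month from January 2018 to July 2020.
months <- seq(as.Date("2018-01-01"), as.Date("2020-07-01"), by = "month")

# Format each date as YYYYMM and build the full URL.
urls <- paste0(base_url, format(months, "%Y%m"), ".csv")

length(urls) # number of monthly files
head(urls, 2)
```

Unlike url20, url19, and url18, this vector is in chronological order, which keeps the combined data sorted by month.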

After getting the paths, we have to loop through this vector of paths and store the data.
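
The original loop is not shown here, so the following is only a sketch of one way to do it: apply a reader function to every path and stack the results into one data frame. `read_all` and `fake_reader` are hypothetical helpers introduced for this example, and the fake reader stands in for the real download so that the sketch runs without network access.

```r
# Hypothetical helper: apply `reader` to each path and stack the results by row.
read_all <- function(paths, reader) {
  pieces <- lapply(paths, reader)
  do.call(rbind, pieces)
}

# Offline stand-in for the download step; returns one illustrative row per path.
fake_reader <- function(p) {
  data.frame(source = p, VL_QUOTA = 1, stringsAsFactors = FALSE)
}

demo_paths <- c("inf_diario_fi_201801.csv", "inf_diario_fi_201802.csv")
all_data <- read_all(demo_paths, fake_reader)

print(all_data)
```

With the real files, the reader could mirror the single-file code used earlier, for example function(p) read_delim(getURL(p, connecttimeout = 60), delim = ";"), assuming every monthly file has the same columns.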