ARIMA Model in Python for Temperature Forecasting

Time Series forecasting is one of the most in-demand techniques of data science, be it in stock trading, predicting business sales or weather forecasting. It is clearly a very handy skill to have and I am gonna equip you with just that by the end of this article.

In this tutorial, we are gonna build an ARIMA model (don’t worry if you do not know exactly how it works yet) to predict the future temperature values of a particular city using Python. The GitHub link for the code and data set can be found at the end of this blog. I have also attached my YouTube video at the end, in case you are interested in a video explanation. So without wasting any time, let’s get started.

Reading Your Data

The first step in any time series analysis is to read your data and see what it looks like. The following code snippet demonstrates how to do that.

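import pandas as pd

# Read the CSV, using the DATE column as a datetime index
df = pd.read_csv('/content/MaunaLoaDailyTemps.csv', index_col='DATE', parse_dates=True)
# Drop rows that contain missing values
df = df.dropna()
print('Shape of data', df.shape)
# Show the first 5 rows
df.head()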

The code is pretty straightforward. We read the data using pd.read_csv, and passing parse_dates=True makes sure that pandas treats the index as date values and not as strings.
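
To see what parse_dates actually changes, here is a minimal sketch that reads the same small CSV twice, once without and once with the flag. The three-row sample and the AvgTemp column name are made up for illustration and are not taken from the real data set.

```python
import io
import pandas as pd

# Hypothetical three-row sample standing in for the real CSV file.
csv_text = "DATE,AvgTemp\n2014-01-01,40\n2014-01-02,42\n2014-01-03,45\n"

# Without parse_dates, the index is left as plain strings.
raw = pd.read_csv(io.StringIO(csv_text), index_col="DATE")

# With parse_dates=True, pandas converts the index to datetimes.
parsed = pd.read_csv(io.StringIO(csv_text), index_col="DATE", parse_dates=True)

print(type(raw.index).__name__, raw.index.dtype)
print(type(parsed.index).__name__, parsed.index.dtype)
```

A datetime index is what lets you later slice by date or resample, which is why the flag is worth setting right at load time.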

Next we drop any missing values and print the shape of the data. df.head() prints the first 5 rows of the dataset. Here is the output you should see for this:

Plot Your Data

The next step is to plot your data. This gives you an idea of whether the data is stationary or not. For those who don’t know what stationarity means, let me give you the gist of it. Although I have made several videos on this topic, it all boils down to this:

Any time series data that has to be modeled needs to be stationary. Stationary means that its statistical properties are more or less constant with time. Makes sense, right? How else are you supposed to make predictions if the statistical properties are varying with time? These are the properties that any stationary series will have:

  1. Constant Mean
  2. Constant Variance (There can be variations, but the variations shouldn’t be irregular)
  3. No seasonality (No repeating patterns in the data set)
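
As a rough illustration of the first two properties, the sketch below compares the mean and variance of the first and second half of two synthetic series: plain random noise, and the same noise with an upward trend added. Both series and the half_stats helper are invented for this example. It is only a quick sanity check, not a formal test.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 400
noise = pd.Series(rng.normal(0, 1, n))    # roughly stationary
trend = noise + np.linspace(0, 10, n)     # mean drifts upward over time

def half_stats(series):
    """Return (mean, variance) for the first and second half of a series."""
    mid = len(series) // 2
    first, second = series.iloc[:mid], series.iloc[mid:]
    return (first.mean(), first.var()), (second.mean(), second.var())

for name, s in [("noise", noise), ("trend", trend)]:
    (m1, v1), (m2, v2) = half_stats(s)
    print(f"{name}: mean {m1:.2f} -> {m2:.2f}, variance {v1:.2f} -> {v2:.2f}")
```

For the noise series the two halves should look alike, while the trending series shows a clearly shifted mean, which is the kind of behavior that breaks stationarity.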

So the first step is to check for stationarity. If your data set is not stationary, you’ll have to convert it to a stationary series. Now before you start worrying about all of this, relax! We have a standard, easy test to check for stationarity called the ADF (Augmented Dickey-Fuller) test. But before showing that, let’s plot the data first.

Since I am only interested in predicting the average temperature, that is the only column I will be plotting.
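
Plotting a single column could look something like the sketch below. The column names AvgTemp and MinTemp, the synthetic values, and the plot_column helper are all placeholders I made up for the example, so swap in the actual column name from your data set.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display

# Hypothetical stand-in for the real data: one year of daily values.
dates = pd.date_range("2014-01-01", periods=365, freq="D")
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "AvgTemp": 45 + rng.normal(0, 3, len(dates)),  # assumed column name
        "MinTemp": 35 + rng.normal(0, 3, len(dates)),  # assumed column name
    },
    index=dates,
)

def plot_column(frame, column):
    """Plot one column of a date-indexed DataFrame and return the Axes."""
    ax = frame[column].plot(figsize=(12, 5), title=column)
    ax.set_xlabel("Date")
    ax.set_ylabel(column)
    return ax

ax = plot_column(df, "AvgTemp")
ax.figure.savefig("avg_temp.png")  # in a notebook the plot shows up inline instead
```

In a notebook you can drop the backend line and the savefig call, since the figure is displayed automatically.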
