Coursera Sequences, Time Series and Prediction, Week 1

These are my notes on the Coursera / deeplearning.ai course "TensorFlow in Practice" (Sequences, Time Series and Prediction). Original video link

Time series examples

A time series is typically defined as an ordered sequence of values that are usually equally spaced over time.

Univariate

e.g. stock prices
e.g. weather forecasts
e.g. historical trends, such as Moore's law
e.g. arcade revenue over time (not sure what this "arcade revenue" dataset refers to)

Multivariate

Multivariate time series charts can be a useful way of understanding the impact of related data.
e.g. two related series plotted together: combined, the correlation is easy to see
e.g. the path of a car as it travels

Machine learning applied to time series

Predict / Forecast


Imputation (projecting back into the past, or filling holes in the data)


Anomaly detection


Spotting patterns

e.g. in speech recognition

Common patterns in time series

Trend

e.g. an upward-facing trend

Seasonality

e.g. active users at a website for software developers
Activity forms small plateaus on weekdays and dips on weekends; other sites, such as shopping sites, peak on weekends instead.

Combination of seasonality & trend

An overall upward trend, but with local peaks and troughs.

Cases where time series forecasting does not help: random (white) noise / random values, i.e. data that cannot be predicted.
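As a minimal sketch, a synthetic series combining the trend, seasonality, and white noise described above can be generated like this (close in spirit to the course's companion notebook; all constants here are arbitrary):

```python
import numpy as np

def trend(time, slope=0.0):
    # Linear trend component.
    return slope * time

def seasonal_pattern(season_time):
    # Arbitrary pattern that repeats within each period.
    return np.where(season_time < 0.4,
                    np.cos(season_time * 2 * np.pi),
                    1 / np.exp(3 * season_time))

def seasonality(time, period, amplitude=1):
    # Repeat the same pattern every `period` time steps.
    season_time = (time % period) / period
    return amplitude * seasonal_pattern(season_time)

def white_noise(time, noise_level=1, seed=None):
    # Unpredictable component: i.i.d. Gaussian noise.
    rnd = np.random.RandomState(seed)
    return rnd.randn(len(time)) * noise_level

time = np.arange(4 * 365)   # four "years" of daily steps
series = (10
          + trend(time, 0.05)
          + seasonality(time, period=365, amplitude=40)
          + white_noise(time, noise_level=5, seed=42))
```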

Autocorrelation

The spikes appear at random time steps. You can't predict when the next one will happen or how strong it will be. But clearly, the entire series isn't random: between the spikes there's a very deterministic type of decay.

We can see here that the value at each time step is 99 percent of the value at the previous time step, plus an occasional spike. This is an autocorrelated time series: namely, it correlates with a delayed copy of itself, often called a lag.
A time series like this is often described as having memory, as steps depend on previous ones. The spikes, which are unpredictable, are often called innovations; in other words, they cannot be predicted from past values.
Another example has multiple autocorrelations, in this case at lags 1 and 50. The lag-1 autocorrelation gives these very quick short-term exponential decays, and the lag-50 autocorrelation gives the small bounce after each spike.
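A sketch of the lag-1 autocorrelated series described above; the 0.99 decay factor comes from the lecture, while the spike probability and amplitude here are arbitrary:

```python
import numpy as np

def autocorrelated_series(n_steps, phi=0.99, spike_prob=0.02,
                          spike_amplitude=10, seed=42):
    # Each step keeps 99% of the previous value; occasionally a random
    # spike ("innovation") is added, which then decays away exponentially.
    rnd = np.random.RandomState(seed)
    series = np.zeros(n_steps)
    for t in range(1, n_steps):
        series[t] = phi * series[t - 1]
        if rnd.rand() < spike_prob:
            series[t] += rnd.rand() * spike_amplitude
    return series

series_ar = autocorrelated_series(1000)
```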

Code for the above

ipynb

As we've learned, a machine learning model is designed to spot patterns, and when we spot patterns we can make predictions. For the most part this also works with time series, except for the noise, which is unpredictable. But we should recognize that this assumes the patterns that existed in the past will continue into the future.

Real-life data: a combination of the above, sometimes disrupted by big events.
Stationary & non-stationary time series

If this were a stock price, then maybe it was a big financial crisis, a big scandal, or perhaps a disruptive technological breakthrough causing a massive change (for example the current COVID-19 outbreak, or SARS before it).
After that the time series started to trend downward without any clear seasonality. We’ll typically call this a non-stationary time series.
To predict on this we could just train on a limited period of time, for example taking only the last 100 steps. You'll probably get better performance than if you had trained on the entire time series. But that breaks the mold of typical machine learning, where we always assume that more data is better. For time series forecasting it really depends on the series: if it's stationary, meaning its behavior does not change over time, then great, the more data you have the better. But if it's not stationary, the optimal time window to use for training will vary.
Ideally, we would like to be able to take the whole series into account and generate a prediction for what might happen next. As you can see, this isn't always as simple as you might think given a drastic change like the one we see here. So that's some of what you're going to be looking at in this course. But let's start by going through a workbook that generates sequences like those you saw in this video. After that we'll try to predict some of these synthesized sequences as practice, before moving on to real-world data later.
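A trivial sketch of the "train on a limited window" idea, continuing with the synthetic `time`/`series` arrays from the sketch above and assuming the change point is roughly known:

```python
# Keep only the most recent window (e.g. the last 100 steps) for a
# non-stationary series whose behavior changed after a big event.
recent_time   = time[-100:]
recent_series = series[-100:]
# train / evaluate only on this window instead of the full history
```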

Train, Val, Test sets
Fixed Partitioning

Unlike the random splits used for other kinds of data, here we need to make sure that each of the train, validation, and test periods contains whole seasonal patterns.
To measure the performance of our forecasting model, we typically want to split the time series into a training period, a validation period, and a test period. This is called fixed partitioning. If the time series has some seasonality, you generally want to ensure that each period contains a whole number of seasons, for example one year, two years, or three years if the time series has yearly seasonality. You generally don't want one year and a half, or else some months will be represented more than others. While this might look a little different from the train/validation/test split you may be familiar with from non-time-series datasets, where you just pick random values out of the corpus to make all three, you should see that the impact is effectively the same.

Once hyperparameter tuning on the validation period is done, retrain using both train & val; if the model also does well on the test period, you can even retrain again using the test data.

Next you'll train your model on the training period and evaluate it on the validation period. Here's where you can experiment to find the right architecture, and work on it and your hyperparameters until you get the desired performance, measured using the validation set. Often, once you've done that, you can retrain using both the training and validation data, and then test on the test period to see if your model performs just as well. If it does, you could take the unusual step of retraining again, using also the test data. Why would you do that? Because the test data is the closest data you have to the current point in time, and as such it's often the strongest signal in determining future values. If your model is not trained using that data too, it may not be optimal. Due to this, it's actually quite common to forgo a test set altogether and just train using a training period and a validation period, with the test set being the future.
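A minimal sketch of a fixed partition over the synthetic series generated earlier, with boundaries placed at whole-year marks (the exact split points are arbitrary):

```python
# Continuing with the synthetic `time` / `series` arrays from above
# (4 * 365 daily steps): 2 years train, 1 year validation, 1 year test.
split_train = 2 * 365    # end of the training period
split_valid = 3 * 365    # end of the validation period

time_train, x_train = time[:split_train],            series[:split_train]
time_valid, x_valid = time[split_train:split_valid], series[split_train:split_valid]
time_test,  x_test  = time[split_valid:],            series[split_valid:]
```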

Roll-Forward Partitioning

Start with a short training period and gradually roll it forward, say by one day or one week at a time; at each iteration, train on the current training period and forecast the following day (or week) in the validation period. This mimics how the model would be used in production.
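A rough sketch of that loop, assuming the synthetic `series` from above and leaving the actual model training as a placeholder comment:

```python
initial_size = 100                     # arbitrary initial training period

for end in range(initial_size, len(series) - 1):
    train_period = series[:end]        # training period grows each iteration
    target = series[end]               # forecast the value right after it
    # ... train (or update) a model on train_period and predict `target` here
```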

Moving average and differencing

(I don't fully understand this part yet; it feels like Adam and similar optimizers are also built on moving averages plus exponential corrections.)
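From the lectures: a moving average forecast predicts the mean of the last few values, which smooths noise but does not anticipate trend or seasonality; differencing first removes trend and seasonality by subtracting the value from one period earlier, forecasts the differenced series, and then adds the removed value back. A rough sketch over the synthetic series from before (window sizes are arbitrary):

```python
import numpy as np

def moving_average_forecast(series, window_size):
    # Forecast each value as the mean of the previous `window_size` values:
    # smooths out noise, ignores trend and seasonality.
    forecasts = [series[t : t + window_size].mean()
                 for t in range(len(series) - window_size)]
    return np.array(forecasts)

# Differencing: remove yearly trend/seasonality by looking at the change
# relative to one period (365 steps) earlier ...
period = 365
diff_series = series[period:] - series[:-period]

# ... forecast the differenced series with a moving average ...
window = 30
diff_forecast = moving_average_forecast(diff_series, window)

# ... then add back the value from one period earlier to restore trend and
# seasonality. diff_forecast[i] predicts
# series[i + window + period] - series[i + window], so:
restored_forecast = diff_forecast + series[window : window + len(diff_forecast)]
```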

Code for the above (metrics)

ipynb
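The evaluation metrics from the lectures, written out as a small numpy sketch (the course notebook uses the equivalent tf.keras.metrics functions for mse and mae):

```python
import numpy as np

def forecast_metrics(forecast, actual):
    errors = forecast - actual
    mse  = np.square(errors).mean()          # penalizes large errors heavily
    rmse = np.sqrt(mse)                      # same units as the series
    mae  = np.abs(errors).mean()             # treats all errors linearly
    mape = np.abs(errors / actual).mean()    # relative error; actual must be non-zero
    return {"mse": mse, "rmse": rmse, "mae": mae, "mape": mape}
```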

wk1 exercise

wk1 ans
