Long Term Time Series Prediction and Stock Value Prediction

Last update 17.9.1999

A time series is a finite data set, for example f_0, f_1, f_2, ..., f_{n-1}, that has been measured at different points t_0, t_1, t_2, ..., t_{n-1} in time. Standard time series prediction methods try to estimate f_n at time t_n. Usually it is assumed that all time steps are equal and therefore need not be stored explicitly. Moreover, it is assumed that a good estimate of f_n may be obtained from the m previous values f_{n-1}, f_{n-2}, f_{n-3}, ..., f_{n-m}. Therefore, a formula or mechanism of the form f'_i = F(f_{i-1}, f_{i-2}, f_{i-3}, ..., f_{i-m}) is sought such that f'_i = f_i + e_i, where e_i is the prediction error. Since f_i is known for all i < n, the error e_i is known for i < n, and the formula F can be applied for i > m-1. Now F is chosen in such a way that e_i is minimized for all m-1 < i < n. When a formula is found that leads to small errors e_i, one can assume that the prediction f'_n = F(f_{n-1}, f_{n-2}, f_{n-3}, ..., f_{n-m}) is accurate. Note that the formula F is not unique, i.e., one can find various formulae F_j that result in different predictions f'_n^j, i.e., the predictions are uncertain. When a long-term prediction f'_k with k > n is desired, one can iteratively apply f'_i = F(f_{i-1}, f_{i-2}, f_{i-3}, ..., f_{i-m}), starting with i = n and continuing with i = n+1, i = n+2, ..., i = k. In the first step, the input values f_{i-1}, f_{i-2}, f_{i-3}, ..., f_{i-m} of F are known (measured) data. In the successive steps, these values are replaced by estimates obtained in the previous steps, which drastically increases the uncertainty and the error of the estimate f'_k. Therefore, this procedure rarely leads to good long-term predictions.
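The iterated one-step scheme described above can be sketched as follows, using a linear F fitted by least squares. This is only an illustration of the standard procedure the paragraph criticizes, not GGP's method; all function names are made up for this sketch.

```python
import numpy as np

def fit_ar(f, m):
    """Least-squares fit of f'_i = F(f_{i-1}, ..., f_{i-m}) with a linear F.
    (A linear F is just one illustrative choice; F could be any mechanism.)"""
    X = np.array([f[i - m:i][::-1] for i in range(m, len(f))])
    y = f[m:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def iterate_forecast(f, coeffs, steps):
    """Long-term prediction by feeding estimates back in as inputs.
    Each step replaces measured data by earlier estimates, so errors compound."""
    m = len(coeffs)
    hist = list(f[-m:])
    out = []
    for _ in range(steps):
        nxt = sum(c * v for c, v in zip(coeffs, hist[::-1]))
        out.append(nxt)
        hist = hist[1:] + [nxt]
    return out

# Noise-free toy series that exactly satisfies an order-2 recurrence,
# so the fit recovers the coefficients and the forecast is exact.
f = [1.0, 1.0]
for _ in range(30):
    f.append(0.6 * f[-1] + 0.3 * f[-2])
coeffs = fit_ar(np.array(f), 2)
pred = iterate_forecast(f[:20], coeffs, 3)
```

On noisy real data the feedback of estimates into F makes `pred` degrade rapidly with the horizon, which is the failure mode the text describes.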

GGP offers a new method for long-term prediction that seems to be much better in most cases. Here, the time series is considered as a function f(t), sampled at the points t_0, t_1, t_2, ..., t_{n-1} with values f_0, f_1, f_2, ..., f_{n-1}. Now, a generalized series expansion of the form f(t) = Sum(k=1,2,...,K) A_k F_k(p_k, t) is sought, where the A_k are linear parameters, the p_k non-linear parameters, and the F_k basis functions. Usually, only a small set of basis functions (typically K=3) is constructed by an improved Genetic Programming technique, and the parameters A_k and p_k are optimized in such a way that a good extrapolation is obtained when f(t) = Sum(k=1,2,...,K) A_k F_k(p_k, t) is evaluated for t > t_{n-1}. To improve the prediction, GGP offers several features such as subdivision of the known data set into an approximation and an extrapolation range, data pre-processing, and a large number of system parameters. An optimal setting of these parameters is not known because the author (Christian Hafner) had no time for extensive tests, but the default values seem to be quite good in most situations when the given data set is properly scaled.
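A minimal sketch of such an expansion fit, assuming two hand-picked basis function forms. GGP evolves the F_k themselves and uses its own optimizer; the crude grid search below merely stands in for the non-linear parameter optimization, while the A_k are solved exactly by least squares at each grid point.

```python
import numpy as np

def eval_expansion(A, p, t):
    """f(t) = sum_k A_k * F_k(p_k, t) with two fixed illustrative basis
    forms (an exponential and a sine); GGP would construct the F_k itself."""
    return A[0] * np.exp(p[0] * t) + A[1] * np.sin(p[1] * t)

def fit_linear(p, t, f):
    """For fixed non-linear parameters p, the linear A_k are an ordinary
    least-squares problem over the basis matrix."""
    B = np.column_stack([np.exp(p[0] * t), np.sin(p[1] * t)])
    A, *_ = np.linalg.lstsq(B, f, rcond=None)
    return A

# Synthetic target with known parameters, so the fit can be checked.
t = np.linspace(0.0, 5.0, 200)
f = 2.0 * np.exp(0.3 * t) + 0.5 * np.sin(2.0 * t)

# Grid search over the non-linear parameters (a stand-in for GGP's
# optimization), keeping the best combination found.
best = None
for p0 in np.linspace(0.1, 0.5, 21):
    for p1 in np.linspace(1.0, 3.0, 21):
        A = fit_linear((p0, p1), t, f)
        err = np.sum((eval_expansion(A, (p0, p1), t) - f) ** 2)
        if best is None or err < best[0]:
            best = (err, A, (p0, p1))
err, A, p = best
```

Splitting the parameters this way (linear A_k solved exactly, non-linear p_k searched) is a common design for such expansions, since it keeps the search space small.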

More information on GGP and the GGP time series prediction philosophy.

The following picture shows an excellent GGP long-term prediction of the properly scaled and weighted Dow Jones index over almost 100 years, obtained with standard GGP system parameters and K=3 basis functions. The time scale starts in 1900 and ends in 1998. The data in the approximation range are used to find the linear and non-linear parameters. The basis functions are searched in such a way that a good fit is obtained in the approximation range, and the extrapolation quality of the solutions is checked in the extrapolation range. The data in the prediction range are not known to the GGP algorithm. The best 10 solutions are plotted. Their quality is indicated by colors (dark for relatively bad, bright for relatively good, the best solution in green). Excellent predictions have been found for a range of approximately 25 years! It seems that GGP found three different sets of solutions. Two of them overestimate the performance.
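The bookkeeping behind this approximation/extrapolation split can be sketched as follows, with toy polynomial candidates standing in for GGP's evolved solutions (all names here are hypothetical, not GGP's API):

```python
import numpy as np

class PolyModel:
    """Toy candidate 'solution': a polynomial of fixed degree. GGP's real
    candidates are evolved basis-function expansions; this class only
    illustrates the fit-then-check bookkeeping."""
    def __init__(self, degree):
        self.degree = degree
    def fit(self, t, f):
        return np.polyfit(t, f, self.degree)
    def predict(self, params, t):
        return np.polyval(params, t)

def rank_by_extrapolation(candidates, t, f, split):
    """Fit each candidate on the approximation range [0, split) and rank
    it by its error on the held-back extrapolation range [split, end)."""
    scored = []
    for model in candidates:
        params = model.fit(t[:split], f[:split])
        err = np.mean((model.predict(params, t[split:]) - f[split:]) ** 2)
        scored.append((err, model.degree, params))
    scored.sort(key=lambda s: s[0])
    return scored

t = np.linspace(0, 10, 101)
f = 1.0 + 2.0 * t + 0.5 * t ** 2          # truly quadratic data
ranking = rank_by_extrapolation([PolyModel(d) for d in (1, 2, 3)], t, f, 70)
```

A candidate that only fits the approximation range well (here the degree-1 model) lands at the bottom of the ranking, which is exactly why the held-back extrapolation range is worth the sacrificed data.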

Since the time range is very long, one can assume that the data in the approximation range that are far away from the prediction range are less important for the prediction than those that are near the prediction range. To take this into account, the error function to be minimized by GGP was exponentially weighted. If this weighting is omitted, the GGP prediction becomes wrong:
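An exponentially weighted error of this kind could look as follows. The decay rate here is a made-up value for illustration; the page does not state GGP's actual weighting constant.

```python
import numpy as np

def weighted_fit_error(residuals, t, rate=0.05):
    """Exponentially weighted squared error: samples near the prediction
    border (large t) count fully, old samples are discounted.
    (rate=0.05 is an assumed value, not GGP's actual setting.)"""
    w = np.exp(rate * (t - t[-1]))        # w = 1 at the newest sample
    return np.sum(w * residuals ** 2)

t = np.arange(100.0)
res = np.ones_like(t)                     # equal residuals everywhere
uniform = np.sum(res ** 2)                # unweighted error = 100
weighted = weighted_fit_error(res, t)     # same residuals, discounted past
```

With equal residuals everywhere, the weighted error is much smaller than the unweighted one, i.e., a misfit far in the past costs the optimizer far less than the same misfit near the prediction border.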

This reflects the uncertainty of the prediction. Note that the prediction starts in an area with a relatively high uncertainty. When the border of the prediction range is moved, the difference between weighted and unweighted GGP results becomes less pronounced.

The uncertainty of the prediction depends on various properties of the given data set. Often the data set is noisy, and noise can drastically reduce the quality of a prediction. Financial data can be very noisy. It is known that such data also have some fractal aspects that should not be mixed up with noise. Theoretically, it would be possible to take the fractal aspects into account and to correctly predict such data. With the standard settings, GGP can also create discontinuous functions, but it turns out that such solutions die out quite quickly and GGP focuses on smooth, continuous functions. Note that the green solution in the figure above seems to be noisy. In fact, GGP has created a function with a small non-smooth portion that looks like noise. This portion is caused by a purely deterministic basis function. Obviously, it does not contribute much to the quality of the solution. Therefore, GGP would replace the corresponding basis function by a better one if one ran GGP for a longer time. However, noise can considerably disturb the prediction, and it seems that the noise content in financial data increases as the observation interval gets shorter. Moreover, indices like the Dow Jones are less noisy than the stock values of individual companies - especially small ones. A typical example is shown in the following figure:

GGP has a noise estimation feature that indicates that the AMD data over a period of 8 years are noisy. As one can see, most of the GGP predictions are completely wrong. If one ran GGP for a longer time, it might find better solutions, but since GGP offers many solutions with a similar quality in the approximation and in the extrapolation range, yet with completely different behavior in the prediction range, it is impossible to gain confidence in any of the predictions.
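The page does not describe how GGP's noise estimation feature works. For illustration only, a common simple noise-level estimate uses the standard deviation of first differences, which for a smooth signal is small and for added white noise is sqrt(2) times the noise sigma:

```python
import numpy as np

def noise_estimate(f):
    """Rough noise-level estimate from first differences. For white noise
    of standard deviation s, std(diff) = s * sqrt(2), hence the division.
    (Illustrative stand-in, not GGP's actual estimator.)"""
    d = np.diff(f)
    return np.std(d) / np.sqrt(2.0)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 2000)
smooth = np.sin(2 * np.pi * t)                      # near-zero estimate
noisy = smooth + rng.normal(0.0, 0.2, t.shape)      # estimate near 0.2
```

Such an estimate distinguishes the two cases the text contrasts: a smooth index-like signal versus a noisy single-stock signal.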

Microsoft is a much bigger company with less noisy data, as one can see in the following figure.

As one can see, all GGP predictions show the same trend and are almost identical in the approximation and extrapolation ranges. Because of the more or less exponential behavior, it would be reasonable to analyze the logarithm of the stock value, i.e., to do some pre-processing first. GGP could easily do this, but since the author (Christian Hafner) had no access to precise data (the data were extracted from a bitmap file of a chart found on the WWW), this would cause inaccuracies for low values, which would correspond to some noise.

As one can see in the following figure (General Electric - a big company with low noise and behavior similar to the Dow Jones), GGP first creates solutions that are obviously wrong. These solutions do not even approximate the data in the approximation range. After a while, good approximations are found, though with a low quality in the extrapolation range.

When the given data set is simple enough and not too noisy, GGP can find several solutions with a good quality in the approximation and extrapolation ranges. When these solutions remain within a limited area and look reasonable in the prediction range, one can obtain some confidence in the prediction. To obtain more (or less) confidence, one should analyze the same data in several GGP runs with slightly different system parameters, different weighting, different sizes of the approximation/extrapolation range, etc.
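Such a multi-run confidence check could be based on the pointwise spread of the predictions across runs, e.g. (a hypothetical helper, not part of GGP):

```python
import numpy as np

def prediction_spread(predictions):
    """Given predictions f'(t) from several runs (one row per run), the
    pointwise spread (max - min) over the prediction range is a crude
    confidence measure: a small spread across runs supports the
    prediction, a large one does not."""
    P = np.asarray(predictions)
    return np.max(P, axis=0) - np.min(P, axis=0)

# Three runs that agree closely versus three that diverge.
agree = [[1.0, 1.10, 1.20], [1.0, 1.12, 1.21], [0.99, 1.09, 1.18]]
disagree = [[1.0, 1.5, 2.0], [1.0, 0.8, 0.4], [1.0, 1.1, 1.2]]
```

In the divergent case the spread grows with the prediction horizon, mirroring the AMD example above where similar fits lead to completely different predictions.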
