Innovations recursive forecasting for the transformed ARMA process: notes on Python indexing pitfalls

小果一粒沙

Categories: python, statistics; tags: algorithms

Article link: https://blog.csdn.net/qq_35167821/article/details/111573220

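When I type up formulas from a textbook, the indices in my program often end up disagreeing with the indices in the book. This time I was stuck on exactly that for two days, which was infuriating! Fortunately the problem is finally solved, and it was not my mistake but the textbook's. I still want to write down how I worked through it, in the hope of spending less time on indexing in the future.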

Common ways of indexing in Python

Built-in lists:

Indexing starts at 0. If you write li[a:b], the elements you get are li[a], li[a+1], ..., li[b-1].
There is also a very nasty pitfall.

li = [1, 2, 3]
li_2to5 = li[2:5]
print(li_2to5)
# out: [3]
# The resulting list has length 1, not 3!! Because 5 > len(li) = 3, the slice simply stops at the end of the list

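Pitfalls everywhere! If two lists sliced in exactly the same way come out with different lengths, don't panic: check whether the original, unsliced lists have the same length.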
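A small sketch of the pitfall above, using only built-in lists. The helper name `strict_slice` is hypothetical (it is not part of Python); it just refuses a slice that would run past the end instead of silently truncating it.

```python
def strict_slice(seq, start, stop):
    """Return seq[start:stop], but raise if the slice would be silently truncated."""
    if not (0 <= start <= stop <= len(seq)):
        raise IndexError(
            f"slice {start}:{stop} is out of range for length {len(seq)}"
        )
    return seq[start:stop]


li = [1, 2, 3]
truncated = li[2:5]        # the plain slice clamps the stop index to len(li)
print(truncated)           # [3]
print(len(truncated))      # 1

try:
    strict_slice(li, 2, 5)
except IndexError as err:
    print("caught:", err)
```

Using a check like this while debugging makes a length mismatch fail loudly at the slice itself rather than several lines later in a dot product.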

numpy

arr[i, j]
This works much like lists: indices start at 0, and a slice i:j covers i, i+1, ..., j-1.
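A minimal illustration of the numpy case, assuming a small 3x4 array made up for this example. It shows single-element indexing, a half-open slice on both axes, and the same silent clamping of an oversized slice that lists have.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)   # rows 0..2, columns 0..3, values 0..11

single = arr[1, 2]                  # row 1, column 2
block = arr[0:2, 1:3]               # rows 0 and 1, columns 1 and 2
clamped = arr[:, 2:10]              # the stop index 10 is clamped to 4

print(single)                       # 6
print(block.shape)                  # (2, 2)
print(clamped.shape)                # (3, 2)
```

Note that only slices are clamped; a single integer index that is out of range, such as arr[0, 10], raises an IndexError.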

pandas

For data of type pd.DataFrame,
there are two ways of indexing:

1. Label-based indexing: loc

import pandas as pd

arr = [[100, 80, 95], [1, 3, 2]]
df = pd.DataFrame(arr, index=['score', 'rank'], columns=['year1', 'year2', 'year3'])
print(df)
print(df.loc[:, 'year1':'year2'])
#       year1  year2  year3
# score    100     80     95
# rank       1      3      2
#        year1  year2
# score    100     80
# rank       1      3

This is a closed interval: it includes both year1 and year2!!!

2. Position-based indexing: iloc

print(df.iloc[:, 0:1])
#       year1
# score    100
# rank       1

When you fetch data from a DataFrame this way, slicing behaves like it does for lists: i:j is closed on the left and open on the right, i.e. i, i+1, ..., j-1.

My! Always-wrong! Indices!

The formulas of the innovations recursive algorithm:

When $\cdots, N$ (N is the number of given original data points),
$P_{n}X_{n+h}=\hat{X}_{n+1} =\left\{ \begin{aligned} &0, &n=0 \\ &\sum\limits_{j=1}^{n}\theta_{n, j}(X_{n+1-j}-\hat{X}_{n+1-j}) & n \ge 1 \end{aligned} \right.$
When $\cdots, N+q$,
$P_{n}X_{n+h} = \sum\limits_{i=1}^{p}\phi_{i}P_{n}X_{n+h-i} + \sum\limits_{j=h}^{q}\theta_{n+h-1, j}(X_{n+h-j}-\hat{X}_{n+h-j})$
When $\cdots$,
$P_{n}X_{n+h} = \sum\limits_{i=1}^{p}\phi_{i}P_{n}X_{n+h-i}$
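The last formula is a pure autoregressive recursion, so it can be sketched on its own. This is only an illustration under my own assumptions: the function name `ar_only_forecast` is hypothetical, the coefficients are made up, `phi` is stored 0-based without a dummy first entry, and for targets at or before time n the "prediction" is taken to be the stored value itself.

```python
def ar_only_forecast(values, phi, steps):
    """Extend `values` by `steps` terms with x_new = sum_{i=1..p} phi_i * x_{new-i}.

    values: observations and any earlier forecasts, oldest first.
    phi:    [phi_1, ..., phi_p], so phi_i is phi[i - 1] (no dummy entry).
    """
    out = list(values)
    p = len(phi)
    if len(out) < p:
        raise ValueError("need at least p starting values")
    for _ in range(steps):
        # out[-i] is the value i steps back, which pairs with phi_i
        out.append(sum(phi[i - 1] * out[-i] for i in range(1, p + 1)))
    return out


result = ar_only_forecast([1.0, 2.0], phi=[0.5, 0.25], steps=2)
print(result)   # [1.0, 2.0, 1.25, 1.125]
```

Negative indices avoid the n+h-i bookkeeping altogether here, which is one way to sidestep the off-by-one errors this post is about.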

My $\theta$ follows the indexing of the formula: the elements in the first row and the first column are 0, and every other element can be fetched with the ordinary (formula) index.
My X, however, follows Python's indexing, so, based on the Python summary above, I wrote up the following index table.

Original formula (N=n)	$\sum\limits_{i=1}^{p}\phi_{i}$	$\sum\limits_{i=1}^{p}P_{n}X_{n+h-i}$	$\sum\limits_{j=h}^{q}\theta_{n+h-1, j}$	$\sum\limits_{j=h}^{q}(X_{n+h-j}-\hat{X}_{n+h-j})$
After substitution (n=N+h)	$\sum\limits_{i=1}^{p}\phi_{i}$	$\sum\limits_{i=1}^{p}P_{N}X_{n-i}$	$\sum\limits_{j=h}^{q}\theta_{n-1, j}$	$\sum\limits_{j=h}^{q}(X_{n-j}-\hat{X}_{n-j})$
Math (both ends included)	1:p	$n - p : n - 1$	$h : q$	$n - q : n - (h + 1)$
Code (left end included, right end excluded)	1:	$n - p : n$	$h + 1 : q + 1$	$n - q : n - h$
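The conversion between the "math" row and the "code" row of the table can be sketched as a tiny helper. The name `inclusive_to_slice` and the sample values are my own and purely illustrative; the `offset` argument distinguishes an array that keeps a dummy entry at index 0 (like $\theta$ here) from one that stores a 1-based symbol in a 0-based list (like X).

```python
def inclusive_to_slice(first, last, offset=0):
    """Map the inclusive math range first..last to a half-open Python slice.

    offset=0: the array has a dummy entry at index 0, math index == code index.
    offset=1: the math symbol is 1-based but stored in a 0-based sequence.
    """
    return slice(first - offset, last - offset + 1)


q, h = 3, 2
theta_row = [0.0, 0.4, 0.2, 0.1]     # theta_row[j] is theta_j, index 0 is a dummy
theta_part = theta_row[inclusive_to_slice(h, q)]          # j = h..q
print(theta_part)                    # [0.2, 0.1]

x = [10, 11, 12, 13, 14]             # x[k - 1] stores X_k
x_part = x[inclusive_to_slice(2, 4, offset=1)]            # X_2..X_4
print(x_part)                        # [11, 12, 13]
```

Either way the right end always gains a +1, because the inclusive math bound has to become an exclusive Python bound.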

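Explanation:
When h=0 in the code, forecasting has in effect already started; this corresponds to $h = 1$ in the mathematical formula.

$N + h = n$, where n is $\cdots$
$h = n - N$, where $N$ is the length of the original data; here h takes the values 0, 1, 2, 3, and so on.
The h in the formula is $h(in\quad code)+1$, because in the formula h is 1, 2, 3, … whereas in the code h is 0, 1, 2, …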
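To make the shift between the code's h and the formula's h concrete, the loop below lists, for each forecast step, which $\theta$ columns and which 0-based X positions get paired. The sizes N=10, q=3, H=3 are an assumption chosen to resemble the run shown in this post, and the variable names are mine.

```python
N, q, H = 10, 3, 3    # illustrative sizes (assumed, not read from real data)

rows = []
for h_code in range(H):
    n = N + h_code                            # 0-based position being forecast
    h_formula = h_code + 1                    # the h that appears in the formula
    theta_cols = list(range(h_code + 1, q + 1))    # columns j = h_formula..q
    x_positions = list(range(n - q, n - h_code))   # 0-based residual positions
    # both slices must have the same length, otherwise the product fails
    assert len(theta_cols) == len(x_positions) == q - h_code
    rows.append((h_code, h_formula, theta_cols, x_positions))

for row in rows:
    print(row)
# (0, 1, [1, 2, 3], [7, 8, 9])
# (1, 2, [2, 3], [8, 9])
# (2, 3, [3], [9])
```

Since n - h_code always equals N, the residual slice never reaches past the last observation at position N-1, which is a quick sanity check for the slice bounds.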

Finally, the code

## version--2
import numpy as np

# X, H, P, Q_, PHI, THETA and calc_theta_nj() are assumed to be defined
# elsewhere in the script; they are not shown in this post.
def predict_X(X=X, h=H, p=P, q=Q_, phi=PHI, theta=THETA):
    m = max(p, q)
    N = len(X)  # number of original observations
    bar_X = []
    theta_nj, _ = calc_theta_nj()
    theta_nj = theta_nj.values
    # for programming convenience, n starts from 0
    X = np.asarray(X)
    for n in range(N + h):
        if n == 0:
            bar_X.append(0)
        elif n < m:
            # one-step prediction from innovations only; [::-1] pairs theta_{n,j} with X_{n+1-j}
            tmp_reduce = np.array(X[:n]) - np.array(bar_X[:n])
            tmp = np.dot(theta_nj[n, 1:n+1], tmp_reduce[::-1])
            bar_X.append(tmp)
        elif n < N:
            # note: here n runs over [m, N), i.e. up to 9 for this data
            tmp_reduce = np.array(X[n-q:n]) - np.array(bar_X[n-q:n])
            tmp = np.dot(phi[1:], X[n-p:n][::-1])\
                + np.dot(theta_nj[n, 1:q+1],\
                    tmp_reduce[::-1])
            bar_X.append(tmp)
        elif n < N + q:
            # forecast steps that still use theta terms (formula h = h_ + 1 <= q)
            h_ = n-N # 0, 1, 2
            print(h_)
            afore = np.sum(np.array(phi[1:]) * bar_X[n-p:n][::-1])

            print('afore:')
            print(np.array(phi[1:]) , bar_X[n-p:n][::-1])

            tmp_reduce = np.array(X[n-q:n-h_]) - np.array(bar_X[n-q:n-h_])
            after = np.sum(theta_nj[n-1, h_+1:q+1] * tmp_reduce[::-1])
            bar_X.append(afore+after)

            print('after:')
            print('1--',X[n-q:n-h_][::-1], bar_X[n-q:n-h_][::-1])
            print(theta_nj[n-1, h_+1:q+1] , tmp_reduce[::-1])
            print(bar_X[-1])
            print('--------------')
        else:
            # forecast steps beyond q: only the AR part remains
            afore = np.sum(np.array(phi[1:]) * bar_X[n-p:n][::-1])
            bar_X.append(afore)  # was X_bar, an undefined name
    return bar_X

predict_X()

Out:

0
afore:
[ 1.   -0.24] [-0.8731551320646224, 0.3192062269207901]
after:
1-- [ 0.525 -0.638  0.01 ] [-0.8731551320646224, 0.3192062269207901, -0.16903562798495692]
[0.4 0.2 0.1] [ 1.3982 -0.9572  0.179 ]
-0.564042490604746
--------------
1
afore:
[ 1.   -0.24] [-0.564042490604746, -0.8731551320646224]
after:
1-- [ 0.525 -0.638] [-0.8731551320646224, 0.3192062269207901]
[0.2 0.1] [ 1.3982 -0.9572]
-0.1705753515688519
--------------
2
afore:
[ 1.   -0.24] [-0.1705753515688519, -0.564042490604746]
after:
1-- [0.525] [-0.8731551320646224]
[0.1] [1.3982]
0.10460977374375188
--------------
[0,
 1.5296687463605358,
 -0.16737218041388724,
 1.2370419967204334,
 0.7433286045352023,
 0.31319736866125164,
 -1.7282341498026976,
 -0.16903562798495692,
 0.3192062269207901,
 -0.8731551320646224,
 -0.564042490604746,
 -0.1705753515688519,
 0.10460977374375188]
