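Preface
When computing the mean and variance of a real-time data stream, a recursive (incremental) update is commonly used. It avoids storing past samples, so memory drops from O(n) to O(1), and the cost of each update drops from O(n) to O(1). The recursive formulas and their derivations are given below.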
I. Definitions
Mean
Definition: given a set of n samples X = {x_1, …, x_n}, the mean is the sum of all elements divided by n.
Formula:
$$
\mu = \frac{1}{n}\sum_{i=1}^{n} x_{i}
$$
Variance
Definition: the variance is the average of the squared deviations of the data points from their arithmetic mean.
Formula:
$$
\sigma^{2} = \frac{1}{n}\sum_{i=1}^{n}(x_{i} - \mu)^{2}
$$
Note: the formula above is the population variance. The sample variance uses n-1 as the denominator instead of n.
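As a quick illustration of the two definitions above, the following sketch computes the mean and the variance directly from their formulas and can be cross-checked against Python's standard `statistics` module. The helper names `batch_mean` and `batch_variance`, the `ddof` parameter, and the sample data are chosen here for illustration only and do not come from the original article.

```python
import statistics

def batch_mean(xs):
    """Mean of all samples: (1/n) * sum(x_i)."""
    return sum(xs) / len(xs)

def batch_variance(xs, ddof=0):
    """Variance with denominator n - ddof.

    ddof=0 gives the population variance (denominator n),
    ddof=1 gives the sample variance (denominator n - 1).
    """
    mu = batch_mean(xs)
    return sum((x - mu) ** 2 for x in xs) / (len(xs) - ddof)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative data

print(batch_mean(data))               # 5.0
print(batch_variance(data))           # 4.0 (population variance)
print(batch_variance(data, ddof=1))   # sample variance, i.e. 32 / 7
print(statistics.pvariance(data), statistics.variance(data))  # reference values
```

Both functions need the whole data set in memory and walk over it on every call, which is the O(n) cost that the recursive formulas avoid.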
II. Recursive Formulas
1. Mean
Let the mean of the first n samples be:
$$
\mu_{n} = \frac{1}{n}\sum_{i=1}^{n} x_{i} \tag{2.1}
$$
Then the recursion relating it to the mean of the first n-1 samples is:
$$
\begin{aligned}
\mu_{n} &= \frac{1}{n}\sum_{i=1}^{n} x_{i} \\
&= \frac{1}{n}\left( \sum_{i=1}^{n-1} x_{i} + x_{n} \right) \\
&= \frac{1}{n}\left[ (n-1)\mu_{n-1} + x_{n} \right] \\
&= \mu_{n-1} + \frac{1}{n}(x_{n} - \mu_{n-1})
\end{aligned} \tag{2.2}
$$
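The recursion (2.2) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the class name `RunningMean` and the input values are assumptions made for the example.

```python
class RunningMean:
    """Tracks the mean of a stream while storing only n and the current mean."""

    def __init__(self):
        self.n = 0        # number of samples seen so far
        self.mean = 0.0   # mu_n; unused placeholder value until the first update

    def update(self, x):
        # Eq. (2.2): mu_n = mu_{n-1} + (x_n - mu_{n-1}) / n
        self.n += 1
        self.mean += (x - self.mean) / self.n
        return self.mean

rm = RunningMean()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:  # illustrative stream
    rm.update(x)

# Agrees with the batch mean of the same values up to floating-point rounding.
print(rm.n, rm.mean)
```

Each call to `update` does a constant amount of work and no past sample is kept, which is the O(1) behaviour mentioned in the preface.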
2. Variance
Let the variance of the first n samples be:
$$
\sigma_{n}^{2} = \frac{1}{n}\sum_{i=1}^{n}(x_{i} - \mu_{n})^{2} \tag{2.3}
$$
Substituting (2.2) into (2.3) gives the recursion relating it to the variance of the first n-1 samples:
$$
\begin{aligned}
\sigma_{n}^{2} &= \frac{1}{n}\sum_{i=1}^{n}\left[ (x_{i} - \mu_{n-1}) - \frac{1}{n}(x_{n} - \mu_{n-1}) \right]^{2} \\
&= \frac{1}{n}\sum_{i=1}^{n}\left[ (x_{i} - \mu_{n-1})^{2} + \frac{1}{n^{2}}(x_{n} - \mu_{n-1})^{2} - \frac{2}{n}(x_{i} - \mu_{n-1})(x_{n} - \mu_{n-1}) \right] \\
&= \frac{1}{n}\sum_{i=1}^{n}(x_{i} - \mu_{n-1})^{2} + \frac{1}{n^{2}}(x_{n} - \mu_{n-1})^{2} - \frac{2}{n^{2}}(x_{n} - \mu_{n-1})\sum_{i=1}^{n}(x_{i} - \mu_{n-1}) \\
&= \frac{n-1}{n}\sigma_{n-1}^{2} + \frac{n+1}{n^{2}}(x_{n} - \mu_{n-1})^{2} - \frac{2}{n^{2}}(x_{n} - \mu_{n-1})(x_{n} - \mu_{n-1}) \\
&= \frac{n-1}{n}\sigma_{n-1}^{2} + \frac{n-1}{n^{2}}(x_{n} - \mu_{n-1})^{2}
\end{aligned} \tag{2.4}
$$
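A minimal Python sketch of the combined update, using (2.2) for the mean and (2.4) for the variance. The class name `RunningStats`, the `sample_variance` helper (which rescales by n/(n-1), the standard relation between the two variance definitions), and the input data are assumptions made for this example, not part of the original article.

```python
import statistics

class RunningStats:
    """Tracks mean and population variance of a stream in O(1) memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0   # mu_n
        self.var = 0.0    # sigma_n^2 with denominator n

    def update(self, x):
        self.n += 1
        n = self.n
        delta = x - self.mean  # x_n - mu_{n-1}, taken before the mean changes
        # Eq. (2.4): sigma_n^2 = (n-1)/n * sigma_{n-1}^2 + (n-1)/n^2 * delta^2
        self.var = (n - 1) / n * self.var + (n - 1) / n**2 * delta**2
        # Eq. (2.2): mu_n = mu_{n-1} + delta / n
        self.mean += delta / n
        return self.mean, self.var

    def sample_variance(self):
        # Rescale the population variance to the n-1 denominator.
        if self.n < 2:
            raise ValueError("sample variance needs at least two samples")
        return self.var * self.n / (self.n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative stream
rs = RunningStats()
for x in data:
    rs.update(x)

print(rs.mean, rs.var)                    # running results
print(statistics.pvariance(data))         # batch population variance for comparison
print(rs.sample_variance(), statistics.variance(data))
```

The order inside `update` matters: (2.4) is written in terms of the previous mean, so `delta` is computed before the mean is overwritten.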
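In the derivation of (2.4), the first term of line 3 splits as (1/n)Σ_{i=1}^{n}(x_i − μ_{n-1})² = ((n-1)/n)σ²_{n-1} + (1/n)(x_n − μ_{n-1})², and adding the 1/n² term gives the coefficient (n+1)/n² in line 4. The step from the third term of line 3 to the third term of line 4 uses identity (2.5):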
$$
\begin{aligned}
\sum_{i=1}^{n}(x_{i} - \mu_{n-1}) &= \sum_{i=1}^{n} x_{i} - n\mu_{n-1} \\
&= x_{n} + \sum_{i=1}^{n-1} x_{i} - n\mu_{n-1} \\
&= x_{n} - \mu_{n-1} + \sum_{i=1}^{n-1} x_{i} - (n-1)\mu_{n-1} \\
&= x_{n} - \mu_{n-1}
\end{aligned} \tag{2.5}
$$
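The identity above is easy to check numerically. The snippet below is only a sanity check on made-up values; the variable names are chosen for illustration.

```python
xs = [1.5, -2.0, 3.25, 0.5, 4.0]              # illustrative samples x_1 .. x_n
prev_mean = sum(xs[:-1]) / (len(xs) - 1)      # mu_{n-1}: mean of the first n-1 samples

lhs = sum(x - prev_mean for x in xs)          # sum over i = 1..n of (x_i - mu_{n-1})
rhs = xs[-1] - prev_mean                      # x_n - mu_{n-1}

print(abs(lhs - rhs) < 1e-12)                 # True
```

The first n-1 deviations cancel because they are measured from their own mean, so only the contribution of the newest sample remains.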
Summary
Related quantities include the standard deviation and the covariance.
See: https://blog.csdn.net/jisuanji111111/article/details/129183563