Online Incremental Computation of First- and Second-Order Statistics: A Derivation
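Statistical machine learning makes constant use of statistics such as the mean, variance, standard deviation, and covariance. Recomputing them from scratch each time new data arrives means keeping the whole dataset in memory and rescanning it, which is costly in both memory and time. A commonly used online (incremental) method avoids both costs when computing statistics over large volumes of data.
By working through the derivation, we can relate the current value of a statistic, $M_{,k}$, to its previous value, $M_{,k-1}$ (where $k$ indexes the $k$-th element of the sequence being processed, and the first subscript, left blank here, is the order of the statistic). This relation yields the online incremental algorithm.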
1. Mean
$$\bar x = \frac{1}{n}\sum_{i=1}^n x_i$$
2. Variance
$$s^2 = \frac{\sum_{i=1}^n (x_i-\bar x)^2}{n-1} = \frac{1}{n(n-1)}\Big[n\sum_{i=1}^n x_i^2 - \Big(\sum_{i=1}^n x_i\Big)^2\Big]$$
Derivation of formula 1:
$$
\begin{aligned}
s^2 &= \frac{\sum (x_i-\bar x)^2}{n-1}
     = \frac{\sum x_i^2 - 2\bar x \sum x_i + \sum \bar x^2}{n-1} \\
    &= \frac{\sum x_i^2 - 2\bar x \cdot n\bar x + n\bar x^2}{n-1}
     = \frac{\sum x_i^2 - n\bar x^2}{n-1} \\
    &= \frac{\sum_{i=1}^n x_i^2 - \frac{1}{n}\Big(\sum_{i=1}^n x_i\Big)^2}{n-1}
     = \frac{1}{n(n-1)}\Big[n\sum x_i^2 - \Big(\sum x_i\Big)^2\Big]
\end{aligned}
$$
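The equivalence of the two variance forms can be checked numerically. The sketch below is illustrative only: the function names and the sample data are assumptions made for this example, not part of the original derivation.

```python
def variance_two_pass(xs):
    """Sample variance from the definition: sum of squared deviations / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def variance_sum_of_squares(xs):
    """Sample variance from the expanded form: [n*sum(x^2) - (sum x)^2] / (n(n-1))."""
    n = len(xs)
    s1 = sum(xs)                 # sum of x_i
    s2 = sum(x * x for x in xs)  # sum of x_i^2
    return (n * s2 - s1 * s1) / (n * (n - 1))


# Hypothetical sample data chosen for the example.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
v_def = variance_two_pass(data)
v_exp = variance_sum_of_squares(data)
print(v_def, v_exp)  # both are 32/7 up to rounding
```

The expanded form needs only running sums, but it subtracts two large, nearly equal numbers when the mean is large relative to the spread, so it can lose precision in floating point.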
3. Standard deviation
$$s = \sqrt{\frac{\sum_{i=1}^n (x_i-\bar x)^2}{n-1}}$$
4. Covariance
$$\operatorname{cov}(x,y) = \frac{\sum_{i=1}^n (x_i-\bar x)(y_i-\bar y)}{n-1}$$
Derivation of formula 2:
$$
\begin{aligned}
(n-1)\operatorname{cov}(x,y) &= \sum (x_i y_i - \bar x y_i - x_i \bar y + \bar x \bar y) \\
&= \sum x_i y_i - \sum \bar x y_i - \sum x_i \bar y + \sum \bar x \bar y \\
&= \sum x_i y_i - n\bar x\bar y - n\bar x\bar y + n\bar x\bar y = \sum x_i y_i - n\bar x\bar y \\
&= \sum (x_i y_i - \bar x y_i) = \sum_{i=1}^n y_i (x_i - \bar x)
\end{aligned}
$$
The last line uses $n\bar x\bar y = \bar x\sum y_i$. Writing $n\bar x\bar y = \bar y\sum x_i$ instead gives the symmetric form, so
$$\operatorname{cov}(x,y) = \frac{1}{n-1}\sum_{i=1}^n x_i (y_i - \bar y) = \frac{1}{n-1}\sum_{i=1}^n y_i (x_i - \bar x)$$
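The covariance identity can be checked the same way. This is a minimal sketch under assumed names and made-up data; `cov_one_sided` centers only its first argument, so swapping the arguments exercises both forms of the identity.

```python
def cov_definition(xs, ys):
    """Sample covariance from the definition: both variables are centered."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def cov_one_sided(xs, ys):
    """Sample covariance centering only xs: sum of y_i * (x_i - mean(x)) / (n - 1)."""
    n = len(xs)
    mx = sum(xs) / n
    return sum(y * (x - mx) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical paired data chosen for the example.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 9.0]
c_def = cov_definition(xs, ys)
c_x = cov_one_sided(xs, ys)  # centers x
c_y = cov_one_sided(ys, xs)  # centers y
print(c_def, c_x, c_y)  # all three are 11/3 up to rounding
```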
$M_1$ is the first-order statistic, the running mean of the first $k$ elements: $M_{1,k} = \frac{1}{k}\sum_{i=1}^k x_i$
$M_2$ is the second-order cumulative statistic, the sum of squared deviations from that mean: $M_{2,k} = \sum_{i=1}^k (x_i - \bar x)^2$, where $\bar x = M_{1,k}$
5. Online incremental algorithm for first- and second-order statistics: result
$$M_{1,k} = M_{1,k-1} + (x_k - M_{1,k-1})/k$$
$$M_{2,k} = M_{2,k-1} + (x_k - M_{1,k-1})(x_k - M_{1,k}) = M_{2,k-1} + \Big(1-\frac{1}{k}\Big)(x_k - M_{1,k-1})^2$$
Initial conditions: $M_{1,1} = x_1$, $M_{2,1} = 0$
For $k \ge 2$, the sample variance of the first $k$ elements is then $s^2 = M_{2,k}/(k-1)$.
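The two update rules translate almost directly into code. The following is a minimal sketch, not a reference implementation: the class name `RunningStats` and its attribute names are assumptions made for this example. Starting from `k = 0` with both accumulators at zero reproduces the initial conditions after the first update.

```python
import math


class RunningStats:
    """Online mean and variance using the M1/M2 recurrences (hypothetical helper)."""

    def __init__(self):
        self.k = 0     # number of elements seen so far
        self.m1 = 0.0  # M_{1,k}: running mean
        self.m2 = 0.0  # M_{2,k}: running sum of squared deviations

    def update(self, x):
        self.k += 1
        delta = x - self.m1               # x_k - M_{1,k-1}
        self.m1 += delta / self.k         # M_{1,k}
        self.m2 += delta * (x - self.m1)  # adds (x_k - M_{1,k-1})(x_k - M_{1,k})

    def variance(self):
        if self.k < 2:
            raise ValueError("sample variance needs at least two elements")
        return self.m2 / (self.k - 1)

    def std(self):
        return math.sqrt(self.variance())


stats = RunningStats()
for value in [2.0, 4.0, 6.0]:  # illustrative data
    stats.update(value)
print(stats.m1, stats.m2, stats.variance(), stats.std())  # 4.0 8.0 4.0 2.0
```

Only three numbers are stored regardless of how many elements have been processed, which is where the memory saving comes from.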
Derivation of the $M_1$ update:
$$
\begin{aligned}
M_{1,k} &= \frac{1}{k}\sum_{i=1}^k x_i = \frac{1}{k}\Big(\sum_{i=1}^{k-1} x_i + x_k\Big) = \frac{1}{k}\Big(\frac{k-1}{k-1}\sum_{i=1}^{k-1} x_i + x_k\Big) \\
&= \frac{1}{k}\big((k-1)M_{1,k-1} + x_k\big) = \frac{k-1}{k}M_{1,k-1} + \frac{1}{k}x_k \\
&= M_{1,k-1} - \frac{1}{k}M_{1,k-1} + \frac{1}{k}x_k = M_{1,k-1} + (x_k - M_{1,k-1})/k
\end{aligned}
$$
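To see the mean recurrence in isolation, the sketch below records every intermediate $M_{1,k}$ for a short sequence. The function name and the data are assumptions made for illustration.

```python
def running_means(xs):
    """Return [M_{1,1}, ..., M_{1,n}] using M_{1,k} = M_{1,k-1} + (x_k - M_{1,k-1}) / k."""
    mean = 0.0
    means = []
    for k, x in enumerate(xs, start=1):
        mean += (x - mean) / k  # for k = 1 this sets mean = x_1
        means.append(mean)
    return means


means = running_means([2.0, 4.0, 6.0, 8.0])
print(means)  # [2.0, 3.0, 4.0, 5.0]
```

Unlike keeping a running sum and dividing by $k$, this form never stores a total that grows with the length of the sequence.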
Two identities are needed for the next derivation; both follow from rearranging the mean update:
$$x_k - M_{1,k-1} = k(M_{1,k} - M_{1,k-1})$$
$$M_{1,k-1} = \Big(M_{1,k} - \frac{1}{k}x_k\Big)\frac{1}{1-\frac{1}{k}}$$
Derivation of the $M_2$ update:
$$
\begin{aligned}
M_{2,k} &= \sum_{i=1}^k (x_i - M_{1,k})^2 = \sum_{i=1}^k \big(x_i - M_{1,k-1} - (x_k - M_{1,k-1})/k\big)^2 \\
&= \sum_{i=1}^k \Big[(x_i - M_{1,k-1})^2 + \frac{1}{k^2}(x_k - M_{1,k-1})^2 - 2(x_i - M_{1,k-1})\frac{1}{k}(x_k - M_{1,k-1})\Big] \\
&= \sum_{i=1}^k (x_i - M_{1,k-1})^2 + \sum_{i=1}^k \frac{1}{k^2}(x_k - M_{1,k-1})^2 - \frac{2}{k}\sum_{i=1}^k (x_i - M_{1,k-1})(x_k - M_{1,k-1}) \\
&= \sum_{i=1}^{k-1} (x_i - M_{1,k-1})^2 + (x_k - M_{1,k-1})^2 + \frac{1}{k}(x_k - M_{1,k-1})^2 - \frac{2}{k}(kM_{1,k} - kM_{1,k-1})(x_k - M_{1,k-1}) \\
&= M_{2,k-1} + (x_k - M_{1,k-1})^2 + \frac{1}{k}(x_k - M_{1,k-1})^2 - 2(M_{1,k} - M_{1,k-1})(x_k - M_{1,k-1}) \\
&= M_{2,k-1} + (x_k - M_{1,k-1})^2 + \frac{1}{k}(x_k - M_{1,k-1})^2 - \frac{2}{k}(x_k - M_{1,k-1})(x_k - M_{1,k-1}) \\
&= M_{2,k-1} + \Big(1 - \frac{1}{k}\Big)(x_k - M_{1,k-1})^2 \\
&= M_{2,k-1} + (x_k - M_{1,k-1})\Big(1 - \frac{1}{k}\Big)(x_k - M_{1,k-1}) \\
&= M_{2,k-1} + (x_k - M_{1,k-1})\Big[x_k - \frac{1}{k}x_k - \Big(1 - \frac{1}{k}\Big)M_{1,k-1}\Big] \\
&= M_{2,k-1} + (x_k - M_{1,k-1})\Big[x_k - \frac{1}{k}x_k - \Big(M_{1,k} - \frac{1}{k}x_k\Big)\Big] \\
&= M_{2,k-1} + (x_k - M_{1,k-1})(x_k - M_{1,k})
\end{aligned}
$$
The fourth line uses $\sum_{i=1}^k (x_i - M_{1,k-1}) = kM_{1,k} - kM_{1,k-1}$; the sixth line and the second-to-last line apply the first and second identities above, respectively.
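As a quick sanity check of both forms of the $M_2$ update, consider the sequence $x = (2, 4, 6)$. These numbers are chosen here purely for illustration and do not come from the original derivation.

$$
\begin{aligned}
M_{1,1} &= 2, & M_{2,1} &= 0 \\
M_{1,2} &= 2 + \tfrac{4-2}{2} = 3, & M_{2,2} &= 0 + (4-2)(4-3) = 2 = 0 + \tfrac{1}{2}(4-2)^2 \\
M_{1,3} &= 3 + \tfrac{6-3}{3} = 4, & M_{2,3} &= 2 + (6-3)(6-4) = 8 = 2 + \tfrac{2}{3}(6-3)^2
\end{aligned}
$$

The direct computation agrees: $(2-4)^2 + (4-4)^2 + (6-4)^2 = 8$, so $s^2 = M_{2,3}/(3-1) = 4$.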