Statistics functions
statistic | code | note |
---|---|---|
Minimum | `numpy.amin(a[, axis=None, out=None, keepdims=np._NoValue, initial=np._NoValue, where=np._NoValue])` | For a 2-D array: axis=None takes the minimum over all elements; axis=0 takes the minimum of each column; axis=1 takes the minimum of each row |
Maximum | `numpy.amax(a[, axis=None, out=None, keepdims=np._NoValue, initial=np._NoValue, where=np._NoValue])` | axis as above |
Range (max - min) | `numpy.ptp(a, axis=None, out=None, keepdims=np._NoValue)` | axis as above |
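Percentile | `numpy.percentile(a, q, axis=None, out=None, overwrite_input=False, interpolation='linear', keepdims=False)` | q: the percentile(s) to compute, a float between 0 and 100 inclusive, or a sequence such as [q1, q2]; axis as above; the `interpolation` keyword is deprecated since NumPy 1.22 in favor of `method` |
Median | `numpy.median(a, axis=None, out=None, overwrite_input=False, keepdims=False)` | axis as above |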
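Mean | `numpy.mean(a[, axis=None, dtype=None, out=None, keepdims=np._NoValue])` | axis as above |
Weighted average | `numpy.average(a[, axis=None, weights=None, returned=False])` | weights must either have the same shape as a, or be 1-D with length equal to the size of a along axis (in which case axis must be given) |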
Variance | `numpy.var(a[, axis=None, dtype=None, out=None, ddof=0, keepdims=np._NoValue])` | ddof=0: population variance (divide by N); ddof=1: sample variance (divide by N - 1) |
Standard deviation | `numpy.std(a[, axis=None, dtype=None, out=None, ddof=0, keepdims=np._NoValue])` | ddof and axis as above |
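Covariance | `numpy.cov(m, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None)` | With the defaults (bias=False) the result is normalized by N - 1, giving the sample covariance matrix whose diagonal holds the sample variances; with rowvar=True each row is a variable and each column an observation |
Correlation coefficient | `numpy.corrcoef(x, y=None, rowvar=True, bias=np._NoValue, ddof=np._NoValue)` | rowvar as above; bias and ddof are deprecated and have no effect |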
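Bin index | `numpy.digitize(x, bins, right=False)` | Returns the indices of the bins to which each value in the input array belongs. bins: a 1-D monotonic array, either ascending or descending; right: whether each interval includes its right edge instead of its left edge; return value: for each element of x, the index of its bin |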
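
The `axis`, `ddof`, and `weights` notes in the table can be illustrated with a minimal sketch. The arrays `x` and `a` and the weight values are made up for illustration; only the NumPy calls listed in the table are used.

```python
import numpy as np

# Illustrative 4x5 integer array (values chosen by hand, not random).
x = np.array([[10, 2, 1, 1, 16],
              [18, 11, 10, 14, 10],
              [11, 1, 9, 18, 8],
              [16, 2, 0, 15, 16]])

col_min = np.amin(x, axis=0)   # one minimum per column -> shape (5,)
row_max = np.amax(x, axis=1)   # one maximum per row    -> shape (4,)
row_ptp = np.ptp(x, axis=1)    # max - min of each row  -> shape (4,)
print(np.amin(x))              # 0 (axis=None: minimum over all elements)
print(col_min)                 # [10  1  0  1  8]
print(row_max)                 # [16 18 18 16]
print(row_ptp)                 # [15  8 17 16]

a = np.array([1.0, 2.0, 3.0, 4.0])
pop_var = np.var(a)              # ddof=0: sum of squared deviations / N
sample_var = np.var(a, ddof=1)   # ddof=1: sum of squared deviations / (N - 1)
print(pop_var)                   # 1.25

# Weights with the same shape as `a`: (1 + 2 + 3 + 3*4) / 6
weighted = np.average(a, weights=[1, 1, 1, 3])
print(weighted)                  # 3.0

# 1-D weights applied along axis=1: the length must match x.shape[1].
# Putting all the weight on the first column returns that column.
first_col = np.average(x, axis=1, weights=[1, 0, 0, 0, 0])
print(first_col)                 # [10. 18. 11. 16.]
```

Passing 1-D weights without `axis` for a 2-D array raises an error, which is why `axis=1` is given explicitly in the last call.
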
import numpy as np
np.random.seed(20200623)
x = np.random.randint(0, 20, size=[4, 5])
print(x)
# [[10 2 1 1 16]
# [18 11 10 14 10]
# [11 1 9 18 8]
# [16 2 0 15 16]]
print(np.percentile(x, [25, 50]))
# [ 2. 10.]
print(np.percentile(x, [25, 50], axis=0))
# [[10.75 1.75 0.75 10.75 9.5 ]
# [13.5 2. 5. 14.5 13. ]]
print(np.percentile(x, [25, 50], axis=1))
# [[ 1. 10. 8. 2.]
# [ 2. 11. 9. 15.]]
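
The covariance, correlation, and bin-index rows of the table can be sketched in the same way. The `data`, `bins`, and `values` arrays below are invented for illustration; the second row of `data` is exactly twice the first, so the correlation is 1.

```python
import numpy as np

# Two variables (rows), four observations each (columns); row 1 = 2 * row 0.
data = np.array([[1.0, 2.0, 3.0, 4.0],
                 [2.0, 4.0, 6.0, 8.0]])

c = np.cov(data)        # rowvar=True, bias=False: normalized by N - 1
r = np.corrcoef(data)   # correlation matrix of the two rows

# The diagonal of the covariance matrix holds the sample variances.
print(np.allclose(np.diag(c), np.var(data, axis=1, ddof=1)))  # True
print(np.allclose(r, 1.0))                                    # True

bins = np.array([0, 5, 10, 15])
values = np.array([-1, 1, 5, 12, 20])

# right=False: index i satisfies bins[i-1] <= v < bins[i]
left_closed = np.digitize(values, bins)
# right=True: index i satisfies bins[i-1] < v <= bins[i]
right_closed = np.digitize(values, bins, right=True)
print(left_closed)    # [0 1 2 3 4]
print(right_closed)   # [0 1 1 3 4]
```

Only the value 5, which sits exactly on a bin edge, changes its index between the two `right` settings. Values below the first edge get index 0, and values beyond the last edge get `len(bins)`.
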