Computing the median in Python_python - NumPy: computing the cumulative median

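Here is an approach that replicates the elements along rows to give us a 2D array. We then fill the upper triangular region with a big number, so that when we later sort the array along each row, each row effectively sorts only the elements up to its diagonal element, which simulates the cumulative window. Then, following the definition of the median as the middle element, or the mean of the two middle elements when the number of elements is even, we take the element at the first position, (0,0), for the first row; the mean of (1,0) & (1,1) for the second row; (2,1) for the third row; the mean of (3,1) & (3,2) for the fourth row; and so on. So we extract those elements from the sorted array and thus get our median values.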
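To make the indexing above concrete, here is a small worked sketch on a four-element array. The input values and the variable names (`tiled`, `medians`, `even`) are chosen only for illustration and are not from the original answer.

```python
import numpy as np

a = np.array([3, 1, 4, 2])
n = a.size

# Every row is a copy of a; the strict upper triangle is padded with a value
# larger than any element, so row i effectively holds only a[:i+1].
tiled = np.tile(a, (n, 1))
tiled[np.triu(np.ones((n, n), dtype=bool), 1)] = a.max() + 1
tiled.sort(axis=1)
print(tiled)
# [[3 5 5 5]
#  [1 3 5 5]
#  [1 3 4 5]
#  [1 2 3 4]]

rows = np.arange(n)
# Odd-length windows (rows 0, 2, ...): the median is the element at column i // 2.
medians = tiled[rows, rows // 2].astype(float)
# Even-length windows (rows 1, 3, ...): mean of the two middle elements.
even = rows[1::2]
medians[even] = (tiled[even, even // 2] + tiled[even, even // 2 + 1]) / 2.0
print(medians.tolist())  # [3.0, 2.0, 3.0, 2.5]
```

The picked positions are (0,0), then (1,0) & (1,1), then (2,1), then (3,1) & (3,2), matching the description.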

So the implementation would be:

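import numpy as np

def cummedian_sorted(a):
    n = a.size
    # Filler larger than every element, so padded slots sort to the end of each row
    maxn = a.max()+1
    # n x n array in which every row is a copy of a
    a_tiled_sorted = np.tile(a,n).reshape(-1,n)
    # Overwrite the strict upper triangle: row i keeps only a[:i+1]
    mask = np.triu(np.ones((n,n),dtype=bool),1)
    a_tiled_sorted[mask] = maxn
    a_tiled_sorted.sort(1)
    # Lower-middle element of each window (the median itself for odd-length windows)
    all_rows = a_tiled_sorted[np.arange(n), np.arange(n)//2].astype(float)
    # Even-length windows: add the upper-middle element, then halve
    idx = np.arange(1,n,2)
    even_rows = a_tiled_sorted[idx, np.arange(1,1+(n//2))]
    all_rows[idx] += even_rows
    all_rows[1::2] /= 2.0
    return all_rows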
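One caveat with a filler of `a.max()+1` is that it may not be representable in narrow integer dtypes (for example `uint8` data that contains 255). A possible variant, sketched below under the assumption that the input holds only finite values (no NaN or inf), works in float and pads with `np.inf` instead. The name `cummedian_sorted_inf` is hypothetical and not part of the original answer.

```python
import numpy as np

def cummedian_sorted_inf(a):
    # Hypothetical variant: convert to float and pad with +inf,
    # so the filler never depends on the input dtype's range.
    a = np.asarray(a, dtype=float)
    n = a.size
    tiled = np.tile(a, (n, 1))  # n x n, every row a copy of a
    tiled[np.triu(np.ones((n, n), dtype=bool), 1)] = np.inf
    tiled.sort(axis=1)
    rows = np.arange(n)
    # Row i holds i + 1 real values. Its two middle positions are i // 2 and
    # (i + 1) // 2, which coincide when the window length is odd.
    lo = tiled[rows, rows // 2]
    hi = tiled[rows, (rows + 1) // 2]
    return (lo + hi) / 2.0

x = np.array([255, 0, 255], dtype=np.uint8)
print(cummedian_sorted_inf(x).tolist())  # [255.0, 127.5, 255.0]
```

Averaging the two middle positions for every row removes the separate odd/even bookkeeping, at the cost of one extra indexing pass.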

Runtime test

Approaches:

# Loopy solution from @Uriel's soln
def cummedian_loopy(arr):
    return [np.median(arr[:i]) for i in range(1,len(arr)+1)]

# Nan-fill based solution from @Nickil Maveli's soln
def cummedian_nanfill(arr):
    a = np.tril(arr).astype(float)
    a[np.triu_indices(a.shape[0], k=1)] = np.nan
    return np.nanmedian(a, axis=1)
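As a rough illustration of the NaN-fill idea, the sketch below runs it on a four-element array. It builds the n x n copy explicitly with `np.tile` before calling `np.tril`; that step is an assumption made here for clarity, and the variable names are illustrative only.

```python
import numpy as np

a = np.array([3, 1, 4, 2])
n = a.size

# Lower triangle of the tiled array: row i keeps a[:i+1], the rest is zero.
m = np.tril(np.tile(a, (n, 1))).astype(float)
# Replace the strict upper triangle with NaN so nanmedian ignores those slots.
m[np.triu_indices(n, k=1)] = np.nan
result = np.nanmedian(m, axis=1)
print(result.tolist())  # [3.0, 2.0, 3.0, 2.5]
```

Each row's median is taken over its non-NaN entries only, which are exactly the prefix `a[:i+1]`.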

Timings:

Setup #1:

In [43]: a = np.random.randint(0,100,(100))

In [44]: print np.allclose(cummedian_loopy(a), cummedian_sorted(a))
    ...: print np.allclose(cummedian_loopy(a), cummedian_nanfill(a))
    ...:
True
True

In [45]: %timeit cummedian_loopy(a)
    ...: %timeit cummedian_nanfill(a)
    ...: %timeit cummedian_sorted(a)
    ...:
1000 loops, best of 3: 856 µs per loop
1000 loops, best of 3: 778 µs per loop
10000 loops, best of 3: 200 µs per loop

Setup #2:

In [46]: a = np.random.randint(0,100,(1000))

In [47]: print np.allclose(cummedian_loopy(a), cummedian_sorted(a))
    ...: print np.allclose(cummedian_loopy(a), cummedian_nanfill(a))
    ...:
True
True

In [48]: %timeit cummedian_loopy(a)
    ...: %timeit cummedian_nanfill(a)
    ...: %timeit cummedian_sorted(a)
    ...:
10 loops, best of 3: 118 ms per loop
10 loops, best of 3: 47.6 ms per loop
100 loops, best of 3: 18.8 ms per loop

Setup #3:

In [49]: a = np.random.randint(0,100,(5000))

In [50]: print np.allclose(cummedian_loopy(a), cummedian_sorted(a))
    ...: print np.allclose(cummedian_loopy(a), cummedian_nanfill(a))
True
True

In [54]: %timeit cummedian_loopy(a)
    ...: %timeit cummedian_nanfill(a)
    ...: %timeit cummedian_sorted(a)
    ...:
1 loops, best of 3: 3.36 s per loop
1 loops, best of 3: 583 ms per loop
1 loops, best of 3: 521 ms per loop
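The sessions above were recorded in IPython under Python 2. For readers who want to repeat the comparison in plain Python 3, a minimal sketch using the standard `timeit` module might look like the following. The two functions are repeated from above so the snippet runs on its own; the array size, seed, and repeat counts are arbitrary assumptions, and the timings will differ from the figures quoted here.

```python
import timeit
import numpy as np

def cummedian_loopy(arr):
    return [np.median(arr[:i]) for i in range(1, len(arr) + 1)]

def cummedian_sorted(a):
    n = a.size
    maxn = a.max() + 1
    a_tiled_sorted = np.tile(a, n).reshape(-1, n)
    a_tiled_sorted[np.triu(np.ones((n, n), dtype=bool), 1)] = maxn
    a_tiled_sorted.sort(1)
    all_rows = a_tiled_sorted[np.arange(n), np.arange(n) // 2].astype(float)
    idx = np.arange(1, n, 2)
    all_rows[idx] += a_tiled_sorted[idx, np.arange(1, 1 + (n // 2))]
    all_rows[1::2] /= 2.0
    return all_rows

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
a = rng.integers(0, 100, 200)    # small size so the check runs quickly

# Correctness check first, then timing.
assert np.allclose(cummedian_loopy(a), cummedian_sorted(a))

for fn in (cummedian_loopy, cummedian_sorted):
    # Best of 3 repeats, 5 calls each, reported per call.
    best = min(timeit.repeat(lambda: fn(a), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {best * 1e6:.0f} µs per call")
```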
