Python loop overlap: consecutive overlapping subsets of an array (NumPy, Python)

I have a NumPy array [1,2,3,4,5,6,7,8,9,10,11,12,13,14] and want to have an array structured like [[1,2,3,4], [2,3,4,5], [3,4,5,6], ..., [11,12,13,14]].

Sure, this is possible by looping over the large array and appending arrays of length four to the new array, but I'm curious whether there is some secret 'magic' Python method that does just this :)

Solution

The fastest way seems to be to preallocate the array, given as option 7 right at the bottom of this answer.

>>> import numpy as np
>>> A = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14])
>>> A
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
>>> # On Python 3, zip returns an iterator, so wrap it in list() first
>>> np.array(list(zip(A, A[1:], A[2:], A[3:])))
array([[ 1,  2,  3,  4],
       [ 2,  3,  4,  5],
       [ 3,  4,  5,  6],
       [ 4,  5,  6,  7],
       [ 5,  6,  7,  8],
       [ 6,  7,  8,  9],
       [ 7,  8,  9, 10],
       [ 8,  9, 10, 11],
       [ 9, 10, 11, 12],
       [10, 11, 12, 13],
       [11, 12, 13, 14]])

You can easily adapt this to a variable chunk size.

>>> n = 5
>>> np.array(list(zip(*(A[i:] for i in range(n)))))
array([[ 1,  2,  3,  4,  5],
       [ 2,  3,  4,  5,  6],
       [ 3,  4,  5,  6,  7],
       [ 4,  5,  6,  7,  8],
       [ 5,  6,  7,  8,  9],
       [ 6,  7,  8,  9, 10],
       [ 7,  8,  9, 10, 11],
       [ 8,  9, 10, 11, 12],
       [ 9, 10, 11, 12, 13],
       [10, 11, 12, 13, 14]])
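The variable-size version can be wrapped in a small helper. This is only a minimal sketch, assuming Python 3 (where zip has to be materialised with list() before np.array sees it); the name windows_zip is made up for illustration and is not part of NumPy.

```python
import numpy as np

def windows_zip(a, n):
    """Return all overlapping length-n windows of the 1-D array a (hypothetical helper)."""
    # a[i:] is the array shifted left by i positions; zip stops at the
    # shortest slice, so exactly len(a) - n + 1 rows are produced.
    return np.array(list(zip(*(a[i:] for i in range(n)))))

A = np.arange(1, 15)        # same values as the array A above
W = windows_zip(A, 5)
print(W.shape)              # (10, 5)
```

Because every element passes through Python-level tuples, this stays in the slower group of the timings further down.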

You may wish to compare performance between this and using itertools.islice.

>>> from itertools import islice
>>> n = 4
>>> np.array(list(zip(*[islice(A, i, None) for i in range(n)])))
array([[ 1,  2,  3,  4],
       [ 2,  3,  4,  5],
       [ 3,  4,  5,  6],
       [ 4,  5,  6,  7],
       [ 5,  6,  7,  8],
       [ 6,  7,  8,  9],
       [ 7,  8,  9, 10],
       [ 8,  9, 10, 11],
       [ 9, 10, 11, 12],
       [10, 11, 12, 13],
       [11, 12, 13, 14]])

My timing results (the zip-based expressions are shown as they were timed; under Python 3 they need list(zip(...)); chunks and byN are helper functions that are not defined in this answer):

1. timeit np.array(zip(A,A[1:],A[2:],A[3:]))
   10000 loops, best of 3: 92.9 us per loop

2. timeit np.array(zip(*(A[i:] for i in range(4))))
   10000 loops, best of 3: 101 us per loop

3. timeit np.array(zip(*[islice(A,i,None) for i in range(4)]))
   10000 loops, best of 3: 101 us per loop

4. timeit np.array([ A[i:i+4] for i in range(len(A)-3) ])
   10000 loops, best of 3: 37.8 us per loop

5. timeit np.array(list(chunks(A, 4)))
   10000 loops, best of 3: 43.2 us per loop

6. timeit np.array(byN(A, 4))
   10000 loops, best of 3: 100 us per loop

7. Does preallocation of the array help? (11 comes from len(A)+1-4)
   timeit B=np.zeros(shape=(11, 4),dtype=np.int32)
   100000 loops, best of 3: 2.19 us per loop
   timeit for i in range(4):B[:,i]=A[i:11+i]
   10000 loops, best of 3: 20.9 us per loop
   total 23.1 us per loop
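Option 7 hard-codes the shape (11, 4). A rough generalisation to any window length could look like the sketch below. The function name windows_prealloc is hypothetical, and two details differ from the timed snippet: it uses np.empty instead of np.zeros (every cell is overwritten anyway) and takes the dtype from the input rather than fixing np.int32.

```python
import numpy as np

def windows_prealloc(a, n):
    """Overlapping length-n windows of the 1-D array a, filled column by column."""
    a = np.asarray(a)
    rows = len(a) - n + 1              # 11 for len(a) == 14 and n == 4
    if rows <= 0:
        raise ValueError("window length n exceeds len(a)")
    out = np.empty((rows, n), dtype=a.dtype)   # preallocate the result once
    for i in range(n):                 # only n iterations, each a vectorised copy
        out[:, i] = a[i:i + rows]
    return out

A = np.arange(1, 15)
B = windows_prealloc(A, 4)
print(B.shape)                         # (11, 4)
```

The Python-level loop runs n times rather than len(a) times, which is presumably why this approach scales so much better than the row-by-row variants.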

As len(A) increases (to 20000), options 4 and 5 converge to the same speed (44 ms). Options 1, 2, 3 and 6 all remain about 3 times slower (135 ms). Option 7 is much faster (1.36 ms).
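On NumPy 1.20 or later there is also numpy.lib.stride_tricks.sliding_window_view, which is probably the closest thing to the 'magic' method the question asks about. It is a different technique from the seven options above and was not part of the timings, so treat the following as an untimed sketch rather than a measured recommendation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20

A = np.arange(1, 15)
V = sliding_window_view(A, 4)   # read-only view onto A; no data is copied
print(V.shape)                  # (11, 4)

W = V.copy()                    # take a copy if a writable, independent array is needed
```

The result is a view that shares memory with A, so creating it is cheap, but it is read-only by default and any later operation that needs a contiguous array will pay for the copy at that point.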
