Approach #1: A vectorized one-liner solution -
np.diff(np.r_[0,np.flatnonzero(np.diff(np.sign(X))!=0)+1, len(X)])
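To see what each piece of the one-liner contributes, here is a step-by-step breakdown. The 1D array `X` and the intermediate names (`signs`, `change_idx`, `edges`) are assumptions chosen for illustration:

```python
import numpy as np

X = np.array([-3, 1, -3, -2, 2, 3, -3, 1, 1, -1])  # illustrative input

signs = np.sign(X)                                    # -1, 0 or 1 per element
change_idx = np.flatnonzero(np.diff(signs) != 0) + 1  # index where each new run starts
edges = np.r_[0, change_idx, len(X)]                  # run boundaries, both ends included
out = np.diff(edges)                                  # length of each run
print(out)  # [1 1 2 2 1 2 1]
```

The lengths always sum to `len(X)`, since the boundaries cover the whole array.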
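Approach #2: Alternatively, for better performance, we can compare slices instead of differentiating the sign values, and use the faster np.concatenate in place of np.r_ for the concatenation step, like so: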
s = np.sign(X)
out = np.diff(np.concatenate(( [0], np.flatnonzero(s[1:]!=s[:-1])+1, [len(X)] )))
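Approach #3: Also, if the number of sign changes is large relative to the length of the input array, you might want to do the concatenation on the sign-change mask array instead. A mask/boolean array is more memory efficient than an int or float array, which might give a further performance boost.
Thus, one more method would be -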
s = np.sign(X)
mask = np.concatenate(( [True], s[1:]!=s[:-1], [True] ))
out = np.diff(np.flatnonzero(mask))
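As a quick sanity check, the three approaches can be wrapped in functions and compared on random data. This is only a sketch: the function names (`runs_v1`, `runs_v2`, `runs_v3`), the seed and the value range are hypothetical choices, not part of the original answer.

```python
import numpy as np

def runs_v1(X):
    # Approach #1: diff on the signs, np.r_ for concatenation
    return np.diff(np.r_[0, np.flatnonzero(np.diff(np.sign(X)) != 0) + 1, len(X)])

def runs_v2(X):
    # Approach #2: slice comparison, np.concatenate on the index array
    s = np.sign(X)
    return np.diff(np.concatenate(([0], np.flatnonzero(s[1:] != s[:-1]) + 1, [len(X)])))

def runs_v3(X):
    # Approach #3: concatenate on the boolean mask, then take the indices
    s = np.sign(X)
    mask = np.concatenate(([True], s[1:] != s[:-1], [True]))
    return np.diff(np.flatnonzero(mask))

rng = np.random.default_rng(0)        # arbitrary seed
X = rng.integers(-3, 4, size=1000)    # random ints in [-3, 3]

a, b, c = runs_v1(X), runs_v2(X), runs_v3(X)
print(np.array_equal(a, b), np.array_equal(b, c))
```

Which variant is fastest likely depends on the array length and on how often the sign changes, so it is worth timing them (for example with `timeit`) on data that resembles your own.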
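Extending to the 2D case
We can extend approach #3 to the 2D array case with a little more work, which is explained in the code comments. The good thing is that the concatenation part lets us keep the code vectorized during the extension. Thus, on a 2D array for which we need row-wise sign persistence, the implementation would look like this: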
# Get signs. Get one-off shifted mask for each row.
# Concatenate at either ends of each row with True values, getting us 2D mask
s = np.sign(X)
T = np.ones((X.shape[0],1),dtype=bool)
mask2D = np.column_stack(( T, s[:,1:]!=s[:,:-1], T ))
# Get the indices of the True values in the flattened 2D mask and
# differentiate them to get the interval lengths.
all_intervals = np.diff(np.flatnonzero(mask2D.ravel()))
# Each row boundary contributes one spurious interval (from the trailing True
# of one row to the leading True of the next). Get those positions and delete them.
rm_idx = (mask2D[:-1].sum(1)-1).cumsum()
all_intervals1 = np.delete(all_intervals, rm_idx + np.arange(X.shape[0]-1))
# Finally, split the indices into a list of arrays, with each array giving us
# the counts of sign persistences
out = np.split(all_intervals1, rm_idx )
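The removal step is the least obvious part, so here is the same code traced on a tiny 2x3 input. The input array and the name `cleaned` are assumptions made for illustration; the values in the comments were worked out by hand for this input.

```python
import numpy as np

X = np.array([[ 1,  2, -1],
              [-1, -1, -1]])   # illustrative 2x3 input

s = np.sign(X)
T = np.ones((X.shape[0], 1), dtype=bool)
mask2D = np.column_stack((T, s[:, 1:] != s[:, :-1], T))
# mask2D rows: [T, F, T, T] and [T, F, F, T]

all_intervals = np.diff(np.flatnonzero(mask2D.ravel()))
# True positions in the flat mask: 0, 2, 3, 4, 7 -> intervals 2, 1, 1, 3
# The third interval (3 -> 4) straddles the row boundary and is spurious.

rm_idx = (mask2D[:-1].sum(1) - 1).cumsum()   # cumulative run counts per row: [2]
cleaned = np.delete(all_intervals, rm_idx + np.arange(X.shape[0] - 1))
out = np.split(cleaned, rm_idx)
print(out)  # [array([2, 1]), array([3])]
```

The `np.arange` offset accounts for the spurious entries that precede each row boundary in `all_intervals`, while `np.split` uses the unshifted `rm_idx` because those entries have already been deleted by then.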
Sample input, output -
In [212]: X
Out[212]:
array([[-3,  1, -3, -2,  2,  3, -3,  1,  1, -1],
       [-2, -3,  0, -2, -2,  0,  3, -1, -2,  2],
       [ 0, -1, -3, -2, -2,  3, -3, -2,  1,  1],
       [ 1, -3,  0, -1, -2,  1, -1,  1,  3,  2],
       [-1,  1,  0, -2,  0, -1, -1, -3,  0,  1]])
In [213]: out
Out[213]:
[array([1, 1, 2, 2, 1, 2, 1]),
array([2, 1, 2, 1, 1, 2, 1]),
array([1, 4, 1, 2, 2]),
array([1, 1, 1, 2, 1, 1, 3]),
array([1, 1, 1, 1, 1, 3, 1, 1])]
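One way to cross-check the output above is to apply the 1D version of approach #3 to each row in a plain Python loop. This is a minimal sketch; the helper name `sign_runs_1d` is hypothetical, and the loop is meant as a reference for testing, not as a replacement for the vectorized code.

```python
import numpy as np

def sign_runs_1d(x):
    # Approach #3 applied to a single row
    s = np.sign(x)
    mask = np.concatenate(([True], s[1:] != s[:-1], [True]))
    return np.diff(np.flatnonzero(mask))

X = np.array([[-3,  1, -3, -2,  2,  3, -3,  1,  1, -1],
              [-2, -3,  0, -2, -2,  0,  3, -1, -2,  2],
              [ 0, -1, -3, -2, -2,  3, -3, -2,  1,  1],
              [ 1, -3,  0, -1, -2,  1, -1,  1,  3,  2],
              [-1,  1,  0, -2,  0, -1, -1, -3,  0,  1]])

ref = [sign_runs_1d(row) for row in X]   # one array of run lengths per row
for r in ref:
    print(r)
```

Note that `np.sign` returns 0 for zeros, so a zero counts as its own sign: in the second row, for example, each `0` breaks the surrounding run of negatives.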