python – Fast algorithm to find indices where multiple arrays have the same value...

Latest recommended article published on 2023-10-11 15:37:39

weixin_39847437


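Finally cracked it with a vectorized solution! It was an interesting problem. We have to tag each pair (tuple) of values taken from the corresponding elements of the arrays in the list, and then label each such pair according to its uniqueness among the other pairs. So we can use np.unique, abusing all of its optional arguments, and finally do a little extra work to preserve the order of the final output. The implementation is basically done in three stages: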

import numpy as np

# `values` is the input list of equal-length arrays of non-negative integers.

# Stage 1: stack as a 2D array with each pair from values as a column, then
# convert each column to its linear index equivalent, treating the column
# as an indexing tuple.
arr = np.vstack(values)
idx = np.ravel_multi_index(arr, arr.max(1) + 1)

# Stage 2: do the heavy work with np.unique, which gives us:
# 1. Starting indices of unique elements,
# 2. Array that has unique IDs for each element in idx, and
# 3. Group ID counts
_, unq_start_idx, unqID, count = np.unique(idx, return_index=True,
                                           return_inverse=True, return_counts=True)

# Stage 3 (the best part): use a mask to ignore the repeated elements and
# re-tag each unqID using argsort() of the masked elements from idx.
mask = ~np.isin(unqID, np.where(count > 1)[0])
mask[unq_start_idx] = True
out = idx[mask].argsort()[unqID]
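To make the three stages easier to follow, here is a small trace on a made-up input. The two tiny arrays are assumptions chosen for illustration (they are not from the original post), and `np.isin` is used for the membership test. The input is picked so that the sorted order of the linear indices differs from their order of first appearance, which is exactly the case the final argsort step has to handle.

```python
import numpy as np

# Hypothetical toy input: the columns are the pairs (2,5), (1,6), (2,5), (1,5).
values = [np.array([2, 1, 2, 1]), np.array([5, 6, 5, 5])]

# Stage 1: one linear index per column; the dimensions here are (3, 7).
arr = np.vstack(values)
idx = np.ravel_multi_index(arr, arr.max(1) + 1)
print("idx          :", idx)

# Stage 2: first-occurrence positions, per-element unique IDs, and counts.
_, unq_start_idx, unqID, count = np.unique(
    idx, return_index=True, return_inverse=True, return_counts=True
)
print("unq_start_idx:", unq_start_idx)
print("unqID        :", unqID)
print("count        :", count)

# Stage 3: keep one representative per distinct value (its first occurrence),
# then map the sorted-order IDs back to order-of-first-appearance IDs.
mask = ~np.isin(unqID, np.where(count > 1)[0])
mask[unq_start_idx] = True
out = idx[mask].argsort()[unqID]
print("mask         :", mask)
print("out          :", out)
```

In this trace the first and third columns are identical, so they end up with the same label, and labels are handed out in the order in which each distinct column first appears.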

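Runtime test

Let's compare the proposed vectorized approach against the original code. Since the proposed code only gives us the group IDs, for a fair benchmark we simply trim from the original code the parts that are not needed to produce them. So, here are the function definitions: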

def groupify(values): # Original code
    group = np.zeros((len(values[0]),), dtype=np.int64) - 1
    next_hash = 0
    matching = np.ones((len(values[0]),), dtype=bool)
    while any(group == -1):
        # Start from every element that has no group yet
        matching[:] = (group == -1)
        first_ungrouped_idx = np.where(matching)[0][0]
        # Narrow the candidates down, one array at a time
        for curr_id, value_array in enumerate(values):
            needed_value = value_array[first_ungrouped_idx]
            matching[matching] = value_array[matching] == needed_value
        # Assign all of the found elements to a new group
        group[matching] = next_hash
        next_hash += 1
    return group

def groupify_vectorized(values): # Proposed code
    arr = np.vstack(values)
    idx = np.ravel_multi_index(arr, arr.max(1) + 1)
    _, unq_start_idx, unqID, count = np.unique(idx, return_index=True,
                                               return_inverse=True, return_counts=True)
    mask = ~np.isin(unqID, np.where(count > 1)[0])
    mask[unq_start_idx] = True
    return idx[mask].argsort()[unqID]
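As a side note, the linear-index step relies on np.ravel_multi_index, which expects non-negative integers and a product of dimensions that fits in an integer index. For inputs that break those assumptions, one possible variation (a sketch, not the method proposed in this post) is to let np.unique compare whole columns via its `axis` argument. The function name `groupify_by_columns` and the sample input below are hypothetical.

```python
import numpy as np

def groupify_by_columns(values):
    """Label each column by the order in which its value tuple first appears."""
    arr = np.vstack(values)
    # Unique columns (sorted lexicographically), the position of each one's
    # first occurrence, and the unique-column index for every input column.
    _, first_idx, inverse = np.unique(
        arr, axis=1, return_index=True, return_inverse=True
    )
    # Flatten defensively: some NumPy releases return a 2D inverse with axis set.
    inverse = inverse.ravel()
    # rank[k] = order of first appearance of the k-th sorted unique column.
    rank = np.empty(len(first_idx), dtype=np.intp)
    rank[np.argsort(first_idx)] = np.arange(len(first_idx))
    return rank[inverse]

# Hypothetical input that includes negative values.
values = [np.array([2, 1, 2, 1]), np.array([-5, 6, -5, 5])]
labels = groupify_by_columns(values)
print(labels)
```

Comparing whole columns is likely to be slower than sorting a single integer array, so this variation trades speed for fewer restrictions on the input.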

Runtime results on a list containing large arrays:

In [345]: # Input list with random elements
     ...: values = [item for item in np.random.randint(10,40,(10,10000))]

In [346]: np.allclose(groupify(values),groupify_vectorized(values))
Out[346]: True

In [347]: %timeit groupify(values)
1 loops, best of 3: 4.02 s per loop

In [348]: %timeit groupify_vectorized(values)
100 loops, best of 3: 3.74 ms per loop
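The timings above come from an IPython session. For anyone who wants to rerun the measurement as a plain script, a minimal sketch with the standard `timeit` module could look like the following. The seed and the number of repetitions are arbitrary assumptions, and the numbers it prints will depend on the machine and the NumPy version.

```python
import timeit

import numpy as np

def groupify_vectorized(values):
    # Same steps as the proposed code, using np.isin for the membership test.
    arr = np.vstack(values)
    idx = np.ravel_multi_index(arr, arr.max(1) + 1)
    _, unq_start_idx, unqID, count = np.unique(
        idx, return_index=True, return_inverse=True, return_counts=True
    )
    mask = ~np.isin(unqID, np.where(count > 1)[0])
    mask[unq_start_idx] = True
    return idx[mask].argsort()[unqID]

# Assumption: a fixed seed (0) so that the input is reproducible between runs.
rng = np.random.default_rng(0)
values = list(rng.integers(10, 40, size=(10, 10000)))

labels = groupify_vectorized(values)

# Assumption: 20 repetitions is an arbitrary choice; adjust as needed.
n_runs = 20
seconds = timeit.timeit(lambda: groupify_vectorized(values), number=n_runs)
print(f"groupify_vectorized: {seconds / n_runs * 1e3:.2f} ms per call")
```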
