Python array operation tips: python - the fastest way to do a boolean matrix operation (performance)

Only a few small changes to compute are needed. This variant applies n to m as a mask up front instead of scaling by n.sum() at the end (np and _compute_nb are as defined in the full listing below):

def compute(m, n):
    m = np.asarray(m)
    n = np.asarray(n)
    # Apply mask N in advance
    m2 = m & n
    # Pack booleans into uint8 for more efficient bitwise operations
    # Also transpose for better caching (maybe?)
    mb = np.packbits(m2.T, axis=1)
    # Table with number of ones in each uint8
    num_bits = (np.arange(256)[:, np.newaxis] & (1 << np.arange(8))).astype(bool).sum(1)
    # Allocate output array
    out = np.zeros((m2.shape[1], m2.shape[1]), np.int32)
    # Do the counting with Numba
    _compute_nb(mb, num_bits, out)
    # Make output symmetric
    out = out + out.T
    # Add values in diagonal
    out[np.diag_indices_from(out)] = m2.sum(0)
    # No final scaling by n.sum(): n was already applied as a mask above
    return out
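The packing and lookup-table steps are easier to follow in isolation. The following is a minimal NumPy-only sketch, not part of the original answer: the names `POPCOUNT` and `count_common` are hypothetical, and it handles a single pair of boolean vectors rather than every pair of columns.

```python
import numpy as np

# Lookup table: POPCOUNT[b] is the number of set bits in the byte value b.
# It holds the same values as the num_bits table built in compute.
POPCOUNT = np.array([bin(b).count("1") for b in range(256)], dtype=np.int64)

def count_common(a, b):
    """Count the positions where boolean vectors a and b are both True."""
    pa = np.packbits(a)  # 8 booleans per uint8, zero-padded at the end
    pb = np.packbits(b)
    # One & now compares 8 positions at once; the table counts the hits
    return int(POPCOUNT[pa & pb].sum())

a = np.array([True, True, False, True, False, True, True, False, True, True])
b = np.array([True, False, False, True, True, True, False, False, True, False])
print(count_common(a, b))  # 4
```

Because `np.packbits` pads the last byte with zeros, the padding bits never contribute to the count.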

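I would use a few Numba tricks. First, you only need to do half of the column-pair operations, because the other half is a repeat (the result is symmetric). Second, you can pack the booleans into uint8 bytes so that each & processes eight values at once. Third, you can run the loop in parallel across threads with Numba's prange. Overall, you could do it like this:

import numpy as np
import numba as nb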

def compute(m, n):
    m = np.asarray(m)
    n = np.asarray(n)
    # Pack booleans into uint8 for more efficient bitwise operations
    # Also transpose for better caching (maybe?)
    mb = np.packbits(m.T, axis=1)
    # Table with number of ones in each uint8
    num_bits = (np.arange(256)[:, np.newaxis] & (1 << np.arange(8))).astype(bool).sum(1)
    # Allocate output array
    out = np.zeros((m.shape[1], m.shape[1]), np.int32)
    # Do the counting with Numba
    _compute_nb(mb, num_bits, out)
    # Make output symmetric
    out = out + out.T
    # Add values in diagonal
    out[np.diag_indices_from(out)] = m.sum(0)
    # Scale by number of ones in n
    out *= n.sum()
    return out

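@nb.njit(parallel=True)
def _compute_nb(mb, num_bits, out):
    # Go through each pair of columns without repetitions (only j > i)
    for i in nb.prange(mb.shape[0] - 1):
        # Start at i + 1 so each pair is counted exactly once; otherwise
        # out + out.T would double the off-diagonal values. Numba only
        # parallelizes the outermost prange, so a plain range is used here.
        for j in range(i + 1, mb.shape[0]):
            # Count common bits
            v = 0
            for k in range(mb.shape[1]):
                v += num_bits[mb[i, k] & mb[j, k]]
            out[i, j] = v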

# Test
m = np.array([[ True,  True, False,  True],
              [False,  True,  True,  True],
              [False, False, False, False],
              [False,  True, False, False],
              [ True,  True, False, False]])
n = np.array([[ True],
              [False],
              [ True],
              [ True],
              [ True]])
out = compute(m, n)
print(out)
# [[ 8  8  0  4]
#  [ 8 16  4  8]
#  [ 0  4  4  4]
#  [ 4  8  4  8]]
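As a sanity check, the expected output above can be reproduced without Numba. This is a small sketch using a plain integer matrix product (the same idea as the NumPy-only methods in the benchmark); the name `reference` is an assumption made for this illustration.

```python
import numpy as np

m = np.array([[ True,  True, False,  True],
              [False,  True,  True,  True],
              [False, False, False, False],
              [False,  True, False, False],
              [ True,  True, False, False]])
n = np.array([[True], [False], [True], [True], [True]])

# Entry (i, j) of mi.T @ mi counts the rows where columns i and j are both True
mi = m.astype(np.int64)
reference = (mi.T @ mi) * int(n.sum())
print(reference)

# The result is symmetric, which is why only the pairs with j > i
# need to be computed explicitly
print(np.array_equal(reference, reference.T))  # True
```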

For a quick comparison, here is a small benchmark against the original loop and the NumPy-only methods:

import numpy as np

# Original loop
def compute_loop(m, n):
    out = np.zeros((m.shape[1], m.shape[1]), np.int32)
    for i in range(m.shape[1]):
        for j in range(m.shape[1]):
            result = m[:, i] & m[:, j]
            out[i, j] = np.sum(result & n)
    return out

# Divakar methods
def compute2(m, n):
    return np.einsum('ij,ik,lm->jk', m, m.astype(int), n)

def compute3(m, n):
    return np.einsum('ij,ik->jk', m, m.astype(int)) * n.sum()

def compute4(m, n):
    return np.tensordot(m, m.astype(int), axes=(0, 0)) * n.sum()

def compute5(m, n):
    return m.T.dot(m.astype(int)) * n.sum()

# Make random data
np.random.seed(0)
m = np.random.rand(1000, 100) > .5
n = np.random.rand(1000, 1) > .5
print(compute(m, n).shape)
# (100, 100)

%timeit compute(m, n)
# 768 µs ± 17.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit compute_loop(m, n)
# 11 s ± 1.23 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit compute2(m, n)
# 7.65 s ± 1.06 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit compute3(m, n)
# 23.5 ms ± 1.53 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit compute4(m, n)
# 8.96 ms ± 194 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit compute5(m, n)
# 8.35 ms ± 266 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
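It may help to see why the faster variants multiply by `n.sum()`. In the original loop, `result` has shape `(rows,)` while `n` has shape `(rows, 1)`, so `result & n` broadcasts to a square array. The sketch below uses small made-up arrays (the values are illustrative only) to show that behavior next to the per-row-mask reading of `n`.

```python
import numpy as np

result = np.array([True, False, True, True])     # like m[:, i] & m[:, j]
n = np.array([[True], [False], [True], [True]])  # column vector, shape (4, 1)

# Broadcasting (4,) against (4, 1) gives a (4, 4) array of all combinations
broadcast = result & n
print(broadcast.shape)                    # (4, 4)
print(broadcast.sum())                    # 9
print(int(result.sum()) * int(n.sum()))   # 9

# If n is instead meant as a per-row mask, flatten it first
masked = result & n[:, 0]
print(masked.sum())                       # 3
```

The first reading matches the versions that scale by `n.sum()`; the second matches the variant that applies `m & n` in advance.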

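Operating on the elements around a NumPy array element can be done in several ways:

1. Slicing: NumPy slicing can fetch the elements surrounding a given element. For a 2-D array arr, the elements around row i, column j are:

```python
arr[i-1:i+2, j-1:j+2]
```

For an interior element this returns a 3x3 sub-array whose center is arr[i, j] and whose other 8 entries are its neighbors. At i = 0 or j = 0 the start index becomes -1, which wraps around, so the slice no longer gives the intended window.

2. numpy.pad(): numpy.pad() adds one or more values around the edges of an array, enlarging it. You can add an extra row and column on every side and then index the neighborhood directly:

```python
padded_arr = np.pad(arr, ((1, 1), (1, 1)), mode='constant')
surrounding = padded_arr[i:i+3, j:j+3]
```

This pads one row and one column on each side with a constant value (zero by default), so the 3x3 slice is valid even for edge elements.

3. Wrap-around indexing (the effect of rolling with numpy.roll()): numpy.roll() shifts array elements along a given axis with wrap-around. The code below gets the same wrap-around effect with modular indices rather than calling numpy.roll() itself:

```python
rows, cols = arr.shape
row_indices = np.arange(i-1, i+2) % rows
col_indices = np.arange(j-1, j+2) % cols
surrounding = arr[row_indices][:, col_indices]
```

This selects the rows just above and below row i and the columns just left and right of column j. The modulo ensures that indices at the array edges wrap around to the opposite side.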
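The three approaches differ mainly at the array border. Here is a short sketch comparing them at the corner element (0, 0) of a small example array; the array and the variable names are chosen purely for illustration.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)
i, j = 0, 0  # a corner element

# 1. Plain slicing: the start index -1 wraps around, so the slice is empty
naive = arr[i-1:i+2, j-1:j+2]
print(naive.shape)  # (0, 0)

# 2. Zero padding: neighbors outside the array show up as 0
padded_arr = np.pad(arr, ((1, 1), (1, 1)), mode='constant')
padded = padded_arr[i:i+3, j:j+3]
print(padded)
# [[0 0 0]
#  [0 0 1]
#  [0 4 5]]

# 3. Modular indices: neighbors wrap around to the opposite edge
rows, cols = arr.shape
row_indices = np.arange(i-1, i+2) % rows
col_indices = np.arange(j-1, j+2) % cols
wrapped = arr[row_indices][:, col_indices]
print(wrapped)
# [[15 12 13]
#  [ 3  0  1]
#  [ 7  4  5]]
```

Which behavior is appropriate depends on whether out-of-range neighbors should be skipped, treated as a fill value, or treated as periodic.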