It seems scipy.stats.pearsonr follows this definition of the Pearson correlation coefficient formula, applied to column-wise pairs from A and B.
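
For reference, the computational form of the Pearson coefficient that the code in this answer appears to follow can be written as below. The symbols are assumptions for illustration: x is one column of A, y is one column of B, and N is the number of rows.

$$
r_{xy} = \frac{N\sum_i x_i y_i - \sum_i x_i \sum_i y_i}{\sqrt{\left[N\sum_i x_i^2 - \left(\sum_i x_i\right)^2\right]\left[N\sum_i y_i^2 - \left(\sum_i y_i\right)^2\right]}}
$$

In the code's naming, the two numerator terms correspond to p1 and p2, and the two bracketed factors under the square root correspond to p4 (columns of A) and p3 (columns of B).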
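Based on that formula, you can vectorize easily, because the pairwise computations on columns from A and B are independent of each other. Here is one vectorized solution using broadcasting: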
import numpy as np

# Get number of rows in either A or B
N = B.shape[0]
# Store column-wise sums of A and B, as they are used in a few places
sA = A.sum(0)
sB = B.sum(0)
# The formula has four parts; compute them one by one
p1 = N*np.einsum('ij,ik->kj',A,B)
p2 = sA*sB[:,None]
p3 = N*((B**2).sum(0)) - (sB**2)
p4 = N*((A**2).sum(0)) - (sA**2)
# Finally compute Pearson Correlation Coefficient as 2D array
pcorr = ((p1 - p2)/np.sqrt(p4*p3[:,None]))
# Get the element corresponding to absolute argmax along the columns
out = pcorr[np.nanargma
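
As a rough sanity check, the four-part computation can be wrapped in a function and compared against a plain double loop. This is only a sketch: the name pairwise_pearson, the random test data, and the use of np.corrcoef as the loop-based reference (instead of scipy.stats.pearsonr) are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def pairwise_pearson(A, B):
    """Hypothetical helper: Pearson r for every (column of B, column of A) pair.

    Returns an array of shape (B.shape[1], A.shape[1]).
    """
    N = B.shape[0]                           # number of rows (same in A and B)
    sA = A.sum(0)                            # column sums of A
    sB = B.sum(0)                            # column sums of B
    p1 = N * np.einsum('ij,ik->kj', A, B)    # N * sum(x*y) for all column pairs
    p2 = sA * sB[:, None]                    # sum(x) * sum(y), via broadcasting
    p3 = N * (B**2).sum(0) - sB**2           # N*sum(y^2) - sum(y)^2
    p4 = N * (A**2).sum(0) - sA**2           # N*sum(x^2) - sum(x)^2
    return (p1 - p2) / np.sqrt(p4 * p3[:, None])

# Made-up test data: 50 rows, 4 columns in A, 3 columns in B
rng = np.random.default_rng(0)
A = rng.random((50, 4))
B = rng.random((50, 3))

pcorr = pairwise_pearson(A, B)

# Loop-based reference using np.corrcoef on one column pair at a time
ref = np.array([[np.corrcoef(A[:, j], B[:, k])[0, 1]
                 for j in range(A.shape[1])]
                for k in range(B.shape[1])])

print(np.allclose(pcorr, ref))
```

Note the orientation of the result: rows index columns of B and columns index columns of A, which follows from the 'kj' output subscripts in the einsum call.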