Here is an approach that uses NumPy tools, built on a row-wise correlation function, corr2_coeff_rowwise:
pd.Series(corr2_coeff_rowwise(dfa.values, dfb.values))
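The definition of corr2_coeff_rowwise is not reproduced in this answer, so here is a minimal sketch of what such a function could look like, assuming it computes the Pearson correlation between matching rows of two 2D arrays and lets a single-row first argument broadcast against every row of the second. The body and the small dfa/dfb inputs are illustrative assumptions, not the original code or data.

```python
import numpy as np
import pandas as pd

def corr2_coeff_rowwise(A, B):
    # Assumed implementation: row-wise Pearson correlation of A and B.
    # A with shape (1, n) broadcasts against B with shape (m, n).
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)

    # Subtract each row's mean from that row
    A_mA = A - A.mean(axis=1, keepdims=True)
    B_mB = B - B.mean(axis=1, keepdims=True)

    # Sum of squared deviations per row
    ssA = (A_mA ** 2).sum(axis=1)
    ssB = (B_mB ** 2).sum(axis=1)

    # Covariance term divided by the product of the norms
    return (A_mA * B_mB).sum(axis=1) / np.sqrt(ssA * ssB)

# Hypothetical toy data: one reference row, three rows to compare against it
dfa = pd.DataFrame([[1, 2, 3, 4]])
dfb = pd.DataFrame([[2, 4, 6, 8], [4, 3, 2, 1], [1, 3, 2, 4]])

result = pd.Series(corr2_coeff_rowwise(dfa.values, dfb.values))
print(result)  # values are approximately 1.0, -1.0 and 0.8
```

Working on the underlying .values arrays skips the per-row index alignment that corrwith performs, which is most likely where the speedup in the timings comes from. Note that a constant row has zero variance, so this sketch would return NaN for it.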
Runtime test
Case 1: A large number of rows in dfb and 4 columns:
In [77]: dfa = pd.DataFrame(np.random.randint(1,100,(1,4)))
In [78]: dfb = pd.DataFrame(np.random.randint(1,100,(30000,4)))
# @sera's soln
In [79]: %timeit dfb.corrwith(dfa.iloc[0], axis=1)
1 loop, best of 3: 4.09 s per loop
In [80]: %timeit pd.Series(corr2_coeff_rowwise(dfa.values,dfb.values))
1000 loops, best of 3: 1.53 ms per loop
Case 2: A moderate number of rows in dfb and 400 columns:
In [83]: dfa = pd.DataFrame(np.random.randint(1,100,(1,400)))
In [85]: dfb = pd.DataFrame(np.random.randint(1,100,(300,400)))
In [86]: %timeit dfb.corrwith(dfa.iloc[0], axis=1)
10 loops, best of 3: 44.8 ms per loop
In [87]: %timeit pd.Series(corr2_coeff_rowwise(dfa.values,dfb.values))
1000 loops, best of 3: 635 µs per loop