I have to multiply very large 2D arrays in Python about 100 times. Each matrix consists of 32000x32000 elements.
I'm using np.dot(X, Y), but each multiplication takes a very long time. Below is an excerpt of my code:
import numpy as np

X = None
for i in range(100):
    if X is None:
        # First iteration: only generate the initial matrix.
        X = generate_large_2darray()
    else:
        # Later iterations: generate a new matrix and accumulate the product.
        Y = generate_large_2darray()
        X = np.dot(X, Y)
Is there a much faster method?
Update
Here is a screenshot showing the htop interface. My Python script is using only one core. Also, after 3h25m only 4 multiplications have been completed.
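To put these timings in perspective, here is a rough back-of-the-envelope estimate of what a single multiplication costs. It assumes 64-bit floats (the question does not state the dtype) and the usual 2·n³ operation count for a dense n x n product. The helper name `estimate_matmul_cost` is made up for illustration.

```python
def estimate_matmul_cost(n, bytes_per_element=8):
    """Rough cost of one dense n x n matrix product.

    bytes_per_element=8 assumes float64, which is an assumption: the
    question does not say which dtype the arrays use.
    """
    memory_bytes = n * n * bytes_per_element   # storage for ONE matrix
    flops = 2 * n ** 3                         # n^3 multiplies + ~n^3 adds
    return memory_bytes, flops


n = 32000
memory_bytes, flops = estimate_matmul_cost(n)
print("one matrix: %.2f GB" % (memory_bytes / 1e9))
print("one product: %.2e floating-point operations" % flops)

# Timing reported above: 4 multiplications in 3h25m.
elapsed_seconds = 3 * 3600 + 25 * 60
seconds_per_product = elapsed_seconds / 4
print("observed: %.0f s per product" % seconds_per_product)
print("implied rate: %.1f GFLOP/s" % (flops / seconds_per_product / 1e9))
```

Under the float64 assumption, X, Y and the result together need roughly 25 GB while np.dot runs, so the machine must have enough RAM to avoid swapping. Note that the measured time also includes whatever generate_large_2darray() costs.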
Update 2
I tried running:
import numpy.distutils.system_info as info
info.get_info('atlas')
but I got:
/home/francescof/.local/lib/python2.7/site-packages/numpy/distutils/system_info.py:564: UserWarning: Specified path /home/apy/atlas/lib is invalid.
  warnings.warn('Specified path %s is invalid.' % d)
{}
So I think ATLAS is not configured correctly.
For blas, on the other hand, I just get {}, with no warnings or errors.
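Another way to inspect this, without going through numpy.distutils (which is deprecated in newer NumPy releases), is np.show_config(), which prints the BLAS/LAPACK information NumPy was built with. A minimal Python 3 sketch follows. `blas_config_text` is a hypothetical helper name, and the exact output format differs between NumPy versions.

```python
import contextlib
import io

import numpy as np


def blas_config_text():
    """Return whatever np.show_config() prints, as a string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return buffer.getvalue()


config = blas_config_text()
print(config)
```

If no optimized BLAS (ATLAS, OpenBLAS, MKL, ...) shows up in that output, NumPy is probably falling back to its slow built-in routines.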
Solution
As suggested by ali_m, using an optimized BLAS library can speed up the operations. However, the problem on my system was a misconfigured NumPy installation. Here is the solution:
1) Make sure all the required libraries are installed (you can use ATLAS, OpenBLAS, etc.). I chose ATLAS because it is directly supported in Ubuntu.
sudo apt-get install libatlas3gf-base libatlas-base-dev libatlas-dev
2) Remove any previous NumPy installation, e.g., pypm uninstall numpy (if you installed it using ActivePython).
3) Reinstall NumPy using pip: pip install numpy
4) Make sure ATLAS is correctly linked:
import numpy.distutils.system_info as info
info.get_info('atlas')
ATLAS version 3.8.4 built by buildd on Sat Sep 10 23:12:12 UTC 2011:
UNAME : Linux crested 2.6.24-29-server #1 SMP Wed Aug 10 15:58:57 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
INSTFLG : -1 0 -a 1
ARCHDEFS : -DATL_OS_Linux -DATL_ARCH_HAMMER -DATL_CPUMHZ=1993 -DATL_USE64BITS -DATL_GAS_x8664
F2CDEFS : -DAdd_ -DF77_INTEGER=int -DStringSunStyle
CACHEEDGE: 393216
F77 : gfortran, version GNU Fortran (Ubuntu/Linaro 4.6.1-9ubuntu2) 4.6.1
F77FLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -Wa,--noexecstack -fPIC -m64
SMC : gcc, version gcc (Ubuntu/Linaro 4.6.1-9ubuntu2) 4.6.1
SMCFLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -Wa,--noexecstack -fPIC -m64
SKC : gcc, version gcc (Ubuntu/Linaro 4.6.1-9ubuntu2) 4.6.1
SKCFLAGS : -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -Wa,--noexecstack -fPIC -m64
{'libraries': ['lapack', 'f77blas', 'cblas', 'atlas'], 'library_dirs': ['/usr/lib/atlas-base/atlas', '/usr/lib/atlas-base'], 'define_macros': [('ATLAS_INFO', '"\\"3.8.4\\""')], 'language': 'f77', 'include_dirs': ['/usr/include/atlas']}
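Once the library is linked, a quick sanity check is to time a smaller product and watch htop while it runs. This is only a sketch: the function name `measure_gflops`, the matrix size and the repeat count are arbitrary choices for illustration, not part of the original setup.

```python
import time

import numpy as np


def measure_gflops(n=1000, repeats=3):
    """Time an n x n float64 product and return the best rate in GFLOP/s."""
    rng = np.random.RandomState(0)
    a = rng.rand(n, n)
    b = rng.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)
    # A dense n x n product needs about 2 * n^3 floating-point operations.
    return 2.0 * n ** 3 / best / 1e9


print("%.1f GFLOP/s" % measure_gflops())
```

Running it before and after reinstalling NumPy should show a clear difference if the optimized BLAS is actually being used.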