python while循环可以并行吗,python中的并行化(非对称)循环

The following code is written in python and it works, i.e. returns the expected result. However, it is very slow and I believe that can be optimized.

G_tensor = numpy.matlib.identity(N_particles*3,dtype=complex)

for i in range(N_particles):

for j in range(i, N_particles):

if i != j:

#Do lots of things, here is shown an example.

# However you should not be scared because

#it only fills the G_tensor

R = numpy.linalg.norm(numpy.array(positions[i])-numpy.array(positions[j]))

rx = numpy.array(positions[i][0])-numpy.array(positions[j][0])

ry = numpy.array(positions[i][1])-numpy.array(positions[j][1])

rz = numpy.array(positions[i][2])-numpy.array(positions[j][2])

krq = (k*R)**2

pf = -k**2*alpha*numpy.exp(1j*k*R)/(4*math.pi*R)

a = 1.+(1j*k*R-1.)/(krq)

b = (3.-3.*1j*k*R-krq)/(krq)

G_tensor[3*i+0,3*j+0] = pf*(a + b * (rx*rx)/(R**2)) #Gxx

G_tensor[3*i+1,3*j+1] = pf*(a + b * (ry*ry)/(R**2)) #Gyy

G_tensor[3*i+2,3*j+2] = pf*(a + b * (rz*rz)/(R**2)) #Gzz

G_tensor[3*i+0,3*j+1] = pf*(b * (rx*ry)/(R**2)) #Gxy

G_tensor[3*i+0,3*j+2] = pf*(b * (rx*rz)/(R**2)) #Gxz

G_tensor[3*i+1,3*j+0] = pf*(b * (ry*rx)/(R**2)) #Gyx

G_tensor[3*i+1,3*j+2] = pf*(b * (ry*rz)/(R**2)) #Gyz

G_tensor[3*i+2,3*j+0] = pf*(b * (rz*rx)/(R**2)) #Gzx

G_tensor[3*i+2,3*j+1] = pf*(b * (rz*ry)/(R**2)) #Gzy

G_tensor[3*j+0,3*i+0] = pf*(a + b * (rx*rx)/(R**2)) #Gxx

G_tensor[3*j+1,3*i+1] = pf*(a + b * (ry*ry)/(R**2)) #Gyy

G_tensor[3*j+2,3*i+2] = pf*(a + b * (rz*rz)/(R**2)) #Gzz

G_tensor[3*j+0,3*i+1] = pf*(b * (rx*ry)/(R**2)) #Gxy

G_tensor[3*j+0,3*i+2] = pf*(b * (rx*rz)/(R**2)) #Gxz

G_tensor[3*j+1,3*i+0] = pf*(b * (ry*rx)/(R**2)) #Gyx

G_tensor[3*j+1,3*i+2] = pf*(b * (ry*rz)/(R**2)) #Gyz

G_tensor[3*j+2,3*i+0] = pf*(b * (rz*rx)/(R**2)) #Gzx

G_tensor[3*j+2,3*i+1] = pf*(b * (rz*ry)/(R**2)) #Gzy

Do you know how can I parallelize it? You should note that the two loops are not symmetric.

Edit one: A numpythonic solution was presented above and I made a comparison between the c++ implementation, my loop version in python and thr numpythonic. Results are the following:

- c++ = 0.14seg

- numpythonic version = 1.39seg

- python loop version = 46.56seg

Probably results can get better if we use the intel version of numpy.

解决方案

Python is not a fast language. Number crunching with python should always use for time critical parts code written in a compiled language. With compilation down to the CPU level you can speed up the code by a factor up to 100 and then still go for parallelization. So I would not look down to using more cores doing inefficient stuff, but to work more efficient. I see the following ways to speed up the code:

1) Better use of numpy: Can you do your calculations instead on scalar level directly on vector/matrix level? eg. rx = positions[:,0]-positions[0,:] (not checked if that is correct) but something along those lines.

If that is not possible with your kind of calculations, than you can go for option 2 or 3

2) Use cython. Cython compiles Python code to C, which is then compiled to your CPU. By using static typing at the right places you can make your code much faster, see cython tutorials eg.: http://cython.readthedocs.io/en/latest/src/quickstart/cythonize.html

3) If you are familiar with FORTRAN, it might be a good idea to write just this part in FORTRAN and then call it from Python using f2py. In fact, your code looks a lot like FORTRAN anyway. For C and C++ SWIG is one great tool to make compiled code available in Python, but there are plenty of other techniques (cython, Boost::Python, ctypes, numba etc.)

When you have done this, and it is still to slow, using GPU power with pyCUDA or parallelization with mpi4py or multiprocessing might be an option.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值