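I have a 3D array (z, below) that represents, for example, a sequence of 2D arrays over time (a1 and a2, below). I want to select some values along the axes of all of these 2D arrays (using conditions on the two reference axes, x and y, below) and then perform some operation (e.g. mean, sum, ...) on the resulting "smaller" 2D arrays.
The code below proposes a couple of ways to do this. I find solution1 very inelegant, but it seems to be faster than solution2. Why is that? Is there a better way (more concise, and more efficient in both speed and memory)?
Regarding step 2, which option is best, and are there other, more efficient approaches? Why doesn't the computation of C2 work? Thanks!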
[Inspired by: Get mean of 2D slice of a 3D array in numpy]
import numpy
import time
# Control parameters (to be modified to make different tests)
xx=1000
yy=6000
# Some 2D arrays, z is a 3D array containing a succession of such arrays (2 here)
a1=numpy.arange(xx*yy).reshape((yy, xx))
a2=numpy.linspace(0,100, num=xx*yy).reshape((yy, xx))
z=numpy.array((a1, a2))
# Axes x and y along which conditioning for the 2D arrays is made
x=numpy.arange(xx)
y=numpy.arange(yy)
# Condition is on x and y, to be applied on a1 and a2 simultaneously
xmin, xmax = xx*0.4, xx*0.8
ymin, ymax = yy*0.2, yy*0.5
xcond = numpy.logical_and(x>=xmin, x<=xmax)
ycond = numpy.logical_and(y>=ymin, y<=ymax)
def solution1():
    xcond2D = numpy.tile(xcond, (yy, 1))
    ycond2D = numpy.tile(ycond[numpy.newaxis].transpose(), (1, xx))
    xymask = numpy.logical_not(numpy.logical_and(xcond2D, ycond2D))
    xymaskzdim = numpy.tile(xymask, (z.shape[0], 1, 1))
    return numpy.ma.MaskedArray(z, xymaskzdim)
def solution2():
    return z[:,:,xcond][:,ycond, :]
start=time.clock()
z1=solution1()
end=time.clock()
print "Solution1: %s sec" % (end-start)
start=time.clock()
z2=solution2()
end=time.clock()
print "Solution2: %s sec" % (end-start)
# Step 2
# Now compute some calculation on the resulting z1 or z2
print "A1: ", z2.reshape(z2.shape[0], z2.shape[1]*z2.shape[2]).mean(axis=1)
print "A2: ", z1.reshape(z1.shape[0], z1.shape[1]*z1.shape[2]).mean(axis=1)
print "B1: ", z2.mean(axis=2).mean(axis=1)
print "B2: ", z1.mean(axis=2).mean(axis=1)
print "Numpy version: ", numpy.version.version
print "C1: ", z2.mean(axis=(1, 2))
print "C2: ", z1.mean(axis=(1, 2))
Output:
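For comparison with solution1 and solution2, here is a minimal sketch of three other ways the same selection might be written. None of it is benchmarked. The names solution_ix, solution_slices and keep are introduced purely for illustration, and the array sizes are shrunk so it runs quickly. It assumes a NumPy version whose mean accepts a tuple for axis, and the slice variant additionally assumes that each condition selects one contiguous run of indices (true for the range conditions used here).

```python
import numpy

# Small stand-in sizes (assumption: the real test uses xx=1000, yy=6000)
xx, yy = 10, 20
a1 = numpy.arange(xx * yy).reshape((yy, xx))
a2 = numpy.linspace(0, 100, num=xx * yy).reshape((yy, xx))
z = numpy.array((a1, a2))

# Same kind of range conditions on the x and y axes as in the question
x = numpy.arange(xx)
y = numpy.arange(yy)
xcond = numpy.logical_and(x >= xx * 0.4, x <= xx * 0.8)
ycond = numpy.logical_and(y >= yy * 0.2, y <= yy * 0.5)

def solution_ix():
    # numpy.ix_ builds an open mesh from the index arrays (boolean masks
    # are converted to integer positions), so the selection is done in a
    # single indexing step instead of two chained ones.
    return z[numpy.ix_(numpy.arange(z.shape[0]), ycond, xcond)]

def solution_slices():
    # Hypothetical shortcut: valid only when each mask is one contiguous run.
    # Basic slicing returns a view on z, so no data is copied.
    yi = numpy.flatnonzero(ycond)
    xi = numpy.flatnonzero(xcond)
    return z[:, yi[0]:yi[-1] + 1, xi[0]:xi[-1] + 1]

# Reference result using the chained indexing of solution2
ref = z[:, :, xcond][:, ycond, :]

# Step 2 without an intermediate 3D result: a 2D boolean mask built by
# broadcasting flattens the selected cells of each 2D array into one row.
keep = ycond[:, numpy.newaxis] & xcond[numpy.newaxis, :]
means_flat = z[:, keep].mean(axis=1)

means_ix = solution_ix().mean(axis=(1, 2))
means_slices = solution_slices().mean(axis=(1, 2))
```

If the conditions are not contiguous ranges, the slice variant does not apply, while the numpy.ix_ and broadcast-mask variants should still select the same cells as solution2.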