I'm currently learning about broadcasting in NumPy, and in the book I'm reading (Python for Data Analysis by Wes McKinney) the author gives the following example to "demean" a two-dimensional array:
import numpy as np
arr = np.random.randn(4, 3)
print(arr.mean(0))
demeaned = arr - arr.mean(0)
print(demeaned)
print(demeaned.mean(0))
This effectively causes the array demeaned to have a mean of 0 along axis 0.
My idea was to apply this to an image-like three-dimensional array:
import numpy as np
arr = np.random.randint(0, 256, (400,400,3))
demeaned = arr - arr.mean(2)
This of course fails, because according to the broadcasting rules the trailing dimensions must match, which is not the case here:
print(arr.shape) # (400, 400, 3)
print(arr.mean(2).shape) # (400, 400)
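To make the mismatch concrete, here is a minimal sketch of how the two shapes line up when NumPy compares them from the right. The names `means` and `broadcast_ok` are only for illustration.

```python
import numpy as np

arr = np.random.randint(0, 256, (400, 400, 3))
means = arr.mean(2)  # shape (400, 400)

# Broadcasting aligns shapes starting from the trailing dimension:
#   arr:   400 x 400 x 3
#   means:       400 x 400
# The trailing 3 is compared with 400; neither is 1, so they are incompatible.
try:
    arr - means
    broadcast_ok = True
except ValueError as exc:
    broadcast_ok = False
    print("broadcast failed:", exc)
```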
Now, I have gotten it to mostly work by subtracting the mean from each index along the third dimension of the array:
means = arr.mean(2)  # mean over the last axis, shape (400, 400)
demeaned = np.ones(arr.shape)
for i in range(3):
    demeaned[..., i] = arr[..., i] - means
print(demeaned.mean(0))
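At this point the returned values are very close to zero, which I assume is a precision error. Am I right about this, or is there another caveat that I'm missing?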
Also, this doesn't seem to be the cleanest, most "NumPy" way to achieve what I want. Is there a function or a principle that I can make use of to improve the code?
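One possible loop-free approach, sketched on the assumption that the goal is to subtract each pixel's mean over the last axis: keep the reduced axis as a length-1 dimension so that the shapes become (400, 400, 3) and (400, 400, 1), which broadcast. The name `demeaned_alt` is only for illustration.

```python
import numpy as np

arr = np.random.randint(0, 256, (400, 400, 3))

# keepdims=True keeps the reduced axis with length 1: shape (400, 400, 1)
means = arr.mean(axis=2, keepdims=True)
demeaned = arr - means  # (400, 400, 3) - (400, 400, 1) broadcasts over axis 2

# Equivalent spelling: re-insert the axis by hand with np.newaxis
demeaned_alt = arr - arr.mean(2)[..., np.newaxis]

# The leftover mean along axis 2 should only be floating-point rounding error
print(np.abs(demeaned.mean(2)).max())
```

Note that the check here is taken along axis 2, the axis that was demeaned. As far as I can tell, `demeaned.mean(0)` averages over a different axis and is not expected to come out at zero.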