I have a function that does what I need, but it is not very efficient. It takes a color image represented as a numpy array and builds a new array containing every complete 3x3 slice of the image. At the end I reshape the result for my purpose, effectively to (columns*rows, 3*3*3). Here is the code:
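import numpy as np

def get_kernels(im, k_size):
    # Number of complete windows along the width (X) and the height (Y)
    X, Y = im.shape[1] + 1 - k_size, im.shape[0] + 1 - k_size
    new = np.zeros((Y, X, k_size, k_size, 3))
    for y in range(Y):
        for x in range(X):
            # Copy the k_size x k_size patch whose top-left corner is (y, x)
            new[y, x] = im[y:y + k_size, x:x + k_size]
    return new.reshape(X * Y, k_size ** 2 * 3)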
Solution
You can use scikit-image's view_as_windows to create the sliding windows, then permute the axes and reshape -
from skimage.util.shape import view_as_windows

def get_kernels_vectorized(im, k_size):
    X, Y = im.shape[1] + 1 - k_size, im.shape[0] + 1 - k_size
    # Drop the trailing singleton axis, then move the color channels last
    new = view_as_windows(im, (k_size, k_size, 1))[..., 0].transpose(0, 1, 3, 4, 2)
    return new.reshape(X * Y, k_size ** 2 * 3)
Explanation on usage of view_as_windows
The idea with view_as_windows is that window_shape is passed as a tuple with one entry per dimension of the input array. Axes along which we don't need to slide get a window size of 1. So window_shape here is (k_size, k_size, 1), because the last axis holds the color channels and we don't slide along it. The result has shape (Y, X, 3, k_size, k_size, 1); indexing with [..., 0] drops the trailing singleton axis, and transpose(0, 1, 3, 4, 2) moves the color channels last so that each window is laid out as (k_size, k_size, 3) before the final reshape.
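If scikit-image is not available, the same window layout can probably be obtained with NumPy alone. The sketch below is an alternative to view_as_windows, not the method used above: it assumes NumPy 1.20 or newer (for sliding_window_view), and the name get_kernels_numpy is made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def get_kernels_numpy(im, k_size):
    # Hypothetical helper: slide a (k_size, k_size) window over the two
    # spatial axes only. The window axes are appended at the end, so the
    # view has shape (Y, X, 3, k_size, k_size).
    windows = sliding_window_view(im, (k_size, k_size), axis=(0, 1))
    # Move the color channels last: (Y, X, k_size, k_size, 3)
    windows = windows.transpose(0, 1, 3, 4, 2)
    Y, X = windows.shape[:2]
    # The reshape has to copy, because the windows overlap in memory
    return windows.reshape(Y * X, k_size ** 2 * 3)


im = np.arange(6 * 8 * 3).reshape(6, 8, 3)
out = get_kernels_numpy(im, 3)
print(out.shape)  # (24, 27): 4 * 6 windows, each with 3 * 3 * 3 values
```

As with view_as_windows, the windowing step only creates a view; the memory cost is paid in the final reshape, which materializes every window.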
Sample run for verification -
In [186]: np.random.seed(0)
     ...: im = np.random.randint(0,9,(6,8,3))
In [189]: out1 = get_kernels(im, k_size=3)
In [190]: out2 = get_kernels_vectorized(im, k_size=3)
In [191]: np.array_equal(out1, out2)
Out[191]: True
Timings on a 3264x2448 image with kernel size = 3 -
In [177]: np.random.seed(0)
     ...: im = np.random.randint(0,9,(3264,2448,3))
In [178]: %timeit get_kernels(im, k_size=3)
1 loop, best of 3: 5.46 s per loop
In [179]: %timeit get_kernels_vectorized(im, k_size=3)
1 loop, best of 3: 327 ms per loop
That is a speedup of more than 16x.