Indexing
Numpy is by its design a SIMD structure, which is best examplified by the list indexing feature:
python - How to filter numpy array by list of indices? - Stack Overflow
filter_indices = [1,3,5] np.array([11,13,155,22,0xff,32,56,88])[filter_indices]
Together with ohther python structures, such as dictionary, the SIMD execution can be done, but with a more unintuitive grammatic style:
python - Fast replacement of values in a numpy array - Stack Overflow
python - How to use a dictionary to translate/replace elements of an array? - Stack Overflow
from numpy import copy newArray = copy(theArray) for k, v in d.iteritems(): newArray[theArray==k] = v
numpy.copy — NumPy v1.22 Manual
test code:
#!/usr/bin/env python2.7 from numpy import copy, random, arange random.seed(0) data = random.randint(30, size=10**5) d = {4: 0, 9: 5, 14: 10, 19: 15, 20: 0, 21: 1, 22: 2, 23: 3, 24: 0} dk = d.keys() dv = d.values() def f1(a, d): b = copy(a) for k, v in d.iteritems(): b[a==k] = v return b def f2(a, d): for i in xrange(len(a)): a[i] = d.get(a[i], a[i]) return a def f3(a, dk, dv): mp = arange(0, max(a)+1) mp[dk] = dv return mp[a] a = copy(data) res = f2(a, d) assert (f1(data, d) == res).all() assert (f3(data, dk, dv) == res).all()
Any All in Python - GeeksforGeeks
python - Use a.any() or a.all() - Stack Overflow
==> obviously direct indexing filter by numpy arrays is going to be faster than numpy + dict;
===> use numpy arrays when you can; python has weaker SIMD support in general.
the indices slicing support makes SIMD easier and more versatile in implementation, but depending on the hardware support and numpy low-level implementations, i.e. how the hardware organize physical storage and accesses and how numpy utilize such features, different slicing directions might have drastically different performances.
==> in general assume row-major or C-like array storage structures, meaning slicing lower indices will be more efficient.
==> some ASICs might have much stronger support for higher indices slicing.
Some Tips
Numpy
While python is not structured for SIMD style prarallel programming, since you don't even have accesses to pointers, Numpy is. And this package has proven to be quite efficient (, power of modular design I guess), so try to base your SIMD coding with Numpy.
Slicing
Numpy slicing is not as straightforward as manipulating the memory buffers directly with pointers in C/C++, but still many slicing options are supported; there are 2 versions:
1. basic:
array[row][col][level]
this is a successive access, reducing array dimensionality by 1 per dereference action, [], if a non-full range of indices is given.
slicing in this mode is restrictive:
array[:][:col][:level]
is actually
array[:col][:level]
==> use this mode for clear, structured expressions of dereferences.
2. multi-directional access, or full slicing support
array[row, col, level]
this mode does the same for specified accesses, but differs in slicing behavior from successive derefences;
array[:, :col, :level]
actually works as intended, i.e. "for all rows, take all cols till col; for each of the taken cols, take all levels till level."
!!!!
Slicing combined with filter/list indexing as introduced in Indexing section is a powerful feature, which strongly mirrors a basic gather/scatter action in most SIMD ISAs, but they can be extremely confusing to use.
e.g.
for a 3D array a:
array([[[0.87330218, 0.348806 , 0.98876196, 0.44153593, 0.35657919],
[0.0591688 , 0.01207211, 0.76808385, 0.5382626 , 0.74737973],
[0.61562341, 0.49494463, 0.99326787, 0.78333718, 0.18965861],
[0.10603183, 0.78535426, 0.54849272, 0.6651616 , 0.99013694]],
[[0.42220155, 0.65080645, 0.92558894, 0.11468048, 0.70492543],
[0.58528903, 0.71053382, 0.96009024, 0.84545703, 0.89357304],
[0.61943998, 0.99428317, 0.54617109, 0.62770748, 0.39451982],
[0.94771556, 0.56667405, 0.18225097, 0.75520699, 0.99649013]],
[[0.05937206, 0.71885611, 0.08577789, 0.82468742, 0.61361646],
[0.13556848, 0.05283339, 0.63987149, 0.91302604, 0.37158879],
[0.37965324, 0.71274351, 0.19897426, 0.48187764, 0.55820695],
[0.20501126, 0.44322089, 0.90804689, 0.55505773, 0.66719231]]])
a[0][np.array((1,2)), np.array((3,4))] or a[0, np.array((1,2)), np.array((3,4))]
will give:
array([0.5382626 , 0.18965861])
add in slicing:
a[:2, np.array((1,2)), np.array((3,4))]
==>
array([[0.5382626 , 0.18965861],
[0.84545703, 0.39451982]])
while
a[:2][np.array((1,2)), np.array((3,4))]
==>
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: index 2 is out of bounds for axis 0 with size 2
Maybe try:
==> try with a terminal and toy examples to help;
==> while slicing, except for only the lowest index, use the [,,,,,] mode/notation only, to avoid confusing yourself;
==> of course, you can always reshape whichever array you are dealing with into 1D and treat it as a C pointer.
Grammar
min vs. minima
minimum is the elementwise SIMD like comparison
numpy.minimum — NumPy v1.22 Manual
array.min/np.amin is a reduction operation, not elementwise
numpy.ndarray.min — NumPy v1.22 Manual
numpy.amin — NumPy v1.22 Manual
==> naturally the same applies to max/maximum
additionally, for the position of the extrema, see:
numpy.argmin — NumPy v1.22 Manual
Broadcast
for initialization use np.full():
python - NumPy array initialization (fill with identical values) - Stack Overflow
updates: array.fill()
numpy.ndarray.fill — NumPy v1.22 Manual
SIMD Alignment
pad an existing array
numpy.pad — NumPy v1.22 Manual
reshape and resize: reshape explicitly requires the new shape to be compatible with the old one
numpy.resize — NumPy v1.22 Manual
numpy.reshape — NumPy v1.22 Manual
==> by choosing order='C' / 'F' (row major or C-like vs. column major or Fortran-like) while casting arrays from nD to 1D or vice versa, you can achieve interleave/deinterleave effects.