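SSE (Streaming SIMD Extensions) is Intel's SIMD instruction set. SIMD means single instruction, multiple data: one instruction operates on several data elements in parallel. SSE registers are 128 bits wide, so a single instruction can process four 32-bit values at once.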
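As a minimal sketch of "four 32-bit values at once", the function below adds two arrays of four 32-bit integers with a single SSE2 add. The function name add4_i32 is hypothetical and only used for illustration. When the compiler does not define __SSE2__, it falls back to a plain loop that computes the same result.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics: __m128i, _mm_add_epi32, ... */
#endif

/* Hypothetical helper: out[i] = a[i] + b[i] for i = 0..3.
 * Inputs are assumed small enough that the sums do not overflow. */
void add4_i32(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
#if defined(__SSE2__)
    /* Unaligned loads: the arrays need not be 16-byte aligned. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* One instruction adds all four 32-bit lanes. */
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Scalar fallback with the same result, one lane per iteration. */
    for (size_t i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Calling add4_i32 with {1, 2, 3, 4} and {10, 20, 30, 40} fills out with {11, 22, 33, 44}.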
// Intel SSE2 (declared in <emmintrin.h>)
// Shift the entire 128-bit value right by 2 bytes. This is a logical shift:
// zeros are shifted in, with no sign extension.
__m128i val = _mm_srli_si128(vector_of_8_s16, 2);
// Insert the low 16 bits of "some_16_bit_val" (all of it in this case)
// into the 16-bit element of "val" selected by the index (element 7 here).
val = _mm_insert_epi16(val, some_16_bit_val, 7);
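To make the effect of these two intrinsics concrete, here is a portable scalar model of the same sequence on eight 16-bit elements (element 0 is the least significant). It is only a sketch for reasoning about the result, not a replacement for the intrinsics, and the name shift_in_s16 is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of
 *   _mm_srli_si128(v, 2) followed by _mm_insert_epi16(v, x, 7)
 * on a vector of eight 16-bit elements. */
void shift_in_s16(const int16_t in[8], int16_t x, int16_t out[8])
{
    int16_t tmp[8];

    /* A right shift by 2 bytes moves every element down by one position;
     * the top element is filled with zero. */
    for (int i = 0; i < 7; ++i)
        tmp[i] = in[i + 1];
    tmp[7] = 0;

    /* The insert then overwrites element 7 with the new value. */
    tmp[7] = x;

    /* Copy through a temporary so that in and out may be the same array. */
    memcpy(out, tmp, sizeof tmp);
}
```

With in = {0, 1, 2, 3, 4, 5, 6, 7} and x = 99, out becomes {1, 2, 3, 4, 5, 6, 7, 99}: the oldest element is dropped and the new one is appended at the top.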
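The ARM counterpart (for example on Android) is the NEON VEXT instruction, which does both steps at once: with an offset of 1, it returns elements 1 to 7 of the first vector followed by element 0 of the second vector, so another_vector_s16 must hold the new value in element 0.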
int16x8_t val = vextq_s16(vector_of_8_s16, another_vector_s16, 1);
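The element selection done by vextq_s16 can be modelled in portable C as follows. This is a sketch of the semantics only; the name vext_s16_model is hypothetical, and the offset n is assumed to be in the range 0 to 7, as the intrinsic requires.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of vextq_s16(a, b, n), assuming 0 <= n <= 7.
 * The result is elements n..7 of a followed by elements 0..n-1 of b. */
void vext_s16_model(const int16_t a[8], const int16_t b[8], int n,
                    int16_t out[8])
{
    int16_t tmp[8];

    for (int i = 0; i < 8; ++i) {
        int src = i + n; /* position in the 16-element sequence a, then b */
        tmp[i] = (src < 8) ? a[src] : b[src - 8];
    }

    /* Copy through a temporary so that out may alias a or b. */
    memcpy(out, tmp, sizeof tmp);
}
```

With n = 1 the result is a[1] to a[7] followed by b[0], which matches the SSE shift-and-insert sequence whenever b[0] holds the value to append. One way to arrange that, offered as a suggestion rather than as part of the original note, is to build the second vector with vdupq_n_s16(some_16_bit_val).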
http://stackoverflow.com/questions/7203231/neon-vs-intel-sse-equivalence-of-certain-operations