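I am using the AVX intrinsic
_mm256_extract_epi32().
I am not entirely sure I am using it correctly, because gcc rejects my code, whereas clang compiles and runs it without any problem.
I extract the lane based on the value of an integer variable rather than a constant.
When I compile the following snippet for avx2 with clang 3.8 (or clang 4), it generates code and uses the vpermd instruction.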
#include <stdint.h>
#include <immintrin.h>
uint32_t foo( int a, __m256i vec )
{
    uint32_t e = _mm256_extract_epi32( vec, a );
    return e*e;
}
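To make the intent concrete, here is a minimal sketch of the behaviour I expect from a runtime index, written without _mm256_extract_epi32(). It spills the vector to memory and indexes the resulting array. The names foo_spill and foo_spill_from_array are hypothetical and not part of my original code, and the target("avx") attribute assumes a gcc- or clang-compatible compiler on x86; compiling the whole file with -mavx2 instead would presumably work as well.

```c
#include <assert.h>
#include <stdint.h>
#include <immintrin.h>

/* Hypothetical variant of foo(): the lane is read from memory,
   so the index may be a runtime value instead of an immediate. */
__attribute__((target("avx")))
uint32_t foo_spill( int a, __m256i vec )
{
    uint32_t lanes[8];
    assert( a >= 0 && a < 8 );                      /* valid lanes are 0..7 */
    _mm256_storeu_si256( (__m256i *)lanes, vec );   /* store all eight 32-bit lanes */
    uint32_t e = lanes[a];
    return e*e;
}

/* Hypothetical convenience wrapper: loads the vector from plain memory
   so that callers do not need AVX enabled themselves. */
__attribute__((target("avx")))
uint32_t foo_spill_from_array( int a, const uint32_t src[8] )
{
    __m256i vec = _mm256_loadu_si256( (const __m256i *)src );
    return foo_spill( a, vec );
}
```

This is only meant to pin down the semantics I am after; it says nothing about which instructions a compiler should pick for the intrinsic itself.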
Now, if I switch to gcc, say gcc 7.2, the compiler fails to generate code and reports the following errors:
In file included from /opt/compiler-explorer/gcc-7.2.0/lib/gcc/x86_64-linux-gnu/7.2.0/include/immintrin.h:41:0,
                 from <source>:2:
/opt/compiler-explorer/gcc-7.2.0/lib/gcc/x86_64-linux-gnu/7.2.0/include/avxintrin.h: In function 'foo':
/opt/compiler-explorer/gcc-7.2.0/lib/gcc/x86_64-linux-gnu/7.2.0/include/avxintrin.h:524:20: error: the last argument must be a 1-bit immediate
return (__m128i) __builtin_ia32_vextractf128_si256 ((__v8si)__X, __N);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /opt/compiler-explorer/gcc-7.2.0/lib/gcc/x86_64-linux-gnu/7.2.0/include/immintrin.h:37:0,
                 from <source>:2:
/opt/compiler-explorer/gcc-7.2.0/lib/gcc/x86_64-linux-gnu/7.2.0/include/smmintrin.h:449:11: error: selector must be an integer constant in the range 0..3
return __builtin_ia32_vec_ext_v4si ((__v4si)__X, __N);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
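I have two questions:
1. Why is clang fine with a variable index, while gcc insists on a constant?
2. Why can't gcc make up its mind? First it demands a 1-bit immediate, and later it wants an integer constant in the range 0..3; those are different things.
Intel's Intrinsics Guide does not specify a constraint on the index value of _mm256_extract_epi32(), so who is right here, gcc or clang?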
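One possible reading of the two diagnostics, judging only from the builtins named in the error output: gcc's header seems to implement the extraction in two stages. First it selects a 128-bit half (__builtin_ia32_vextractf128_si256 in avxintrin.h), then it selects one of four 32-bit lanes inside that half (__builtin_ia32_vec_ext_v4si in smmintrin.h). The scalar model below sketches that decomposition. The type vec256_model and the function model_extract_epi32 are hypothetical names for illustration, and the exact split into n >> 2 and n % 4 is my assumption rather than something the error log confirms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a 256-bit vector:
   two 128-bit halves, each holding four 32-bit lanes. */
typedef struct {
    uint32_t half[2][4];
} vec256_model;

/* Models a two-stage extract of lane n (0..7). */
uint32_t model_extract_epi32( const vec256_model *v, int n )
{
    assert( n >= 0 && n < 8 );
    int half_sel = n >> 2;   /* stage 1: which 128-bit half, a 1-bit value (0 or 1) */
    int lane_sel = n % 4;    /* stage 2: which lane in that half, range 0..3 */
    return v->half[half_sel][lane_sel];
}
```

If that reading is correct, each stage would need its own compile-time selector, which would account for one message about a 1-bit immediate and another about a constant in the range 0..3.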