armv8, dup
Error
/tmp/ccPrZnMZ.s: Assembler messages:
/tmp/ccPrZnMZ.s:111: Error: operand mismatch -- `dup v0.4s,x3'
/tmp/ccPrZnMZ.s:111: Info: did you mean this?
/tmp/ccPrZnMZ.s:111: Info: dup v0.4s, w3
/tmp/ccPrZnMZ.s:111: Info: other valid variant(s):
/tmp/ccPrZnMZ.s:111: Info: dup v0.8b, w3
/tmp/ccPrZnMZ.s:111: Info: dup v0.16b, w3
/tmp/ccPrZnMZ.s:111: Info: dup v0.4h, w3
/tmp/ccPrZnMZ.s:111: Info: dup v0.8h, w3
/tmp/ccPrZnMZ.s:111: Info: dup v0.2s, w3
/tmp/ccPrZnMZ.s:111: Info: dup v0.2d, x3
Code
void batch_assembly(float* src, float* out, int count, float u, float std, float w, float b)
{
int i = 10;
asm volatile(
"dup v0.4s, %4 \n"
"dup v1.4s, %w4 \n"
"dup v2.4s, %w5 \n"
"dup v3.4s, %w6 \n"
"1: \n"
"prfm pldl1keep, [%1, #128] \n"
"ld1 {v0.4s}, [%1], #16 \n"
"fabs v0.4s, v0.4s \n"
"subs %2, %2, #4 \n"
"st1 {v0.4s}, [%0], #16 \n"
"bgt 1b \n "
:"=r"(out) // 0, x0
:"r"(src), // 1, x1
"0"(out), // 2, tied to operand 0
"r"(count), // 3, w2
"r"(u), // 4
"r"(std),
"r"(w),
"r"(b)
:"cc", "memory", "v0", "v1", "v2", "v3"
);
}
The failing line is "dup v0.4s, %4 \n".
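Explanation
- The input operand list declares u as "r"(u), which asks the compiler to place u in a general-purpose register. On AArch64, GCC prints such an operand by its 64-bit name by default, so %4 expands to an x register (x3 here, as the error message shows). However, dup vd.4s, rn fills 32-bit lanes, so its second operand cannot be a 64-bit x register; it has to be a 32-bit w register. Which w register holds u? The assembler's suggestion, dup v0.4s, w3, shows that it is w3.
- Change "dup v0.4s, %4 \n" to "dup v0.4s, w3 \n", or to "dup v0.4s, %w4 \n". Hardcoding w3 only works as long as the compiler happens to pick that register, so the second form is the safer one. In %w4, the 4 is the operand's position in the combined output + input operand list, counted from 0 (u is number 4). The w is an operand modifier that prints the register by its 32-bit w name; it presumably stands for "word". On ARMv7 the general-purpose registers are all 32-bit, so plain %4 works there.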
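As a minimal sketch of the %w form in isolation, the helper below broadcasts one float into four lanes. The name broadcast4 is hypothetical and not part of the original code. It also assumes the float's bit pattern is passed as a uint32_t, which keeps the operand type unambiguously 32-bit, and it falls back to a plain loop on non-AArch64 targets so that it still compiles there.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original post): write u to out[0..3]. */
static void broadcast4(float u, float out[4])
{
    assert(out != NULL);

#if defined(__aarch64__)
    /* Assumption: pass the raw 32-bit pattern of u in a general register. */
    uint32_t bits;
    memcpy(&bits, &u, sizeof bits);

    asm volatile(
        "dup v0.4s, %w1      \n"  /* %w1: operand 1 printed as a w register */
        "st1 {v0.4s}, [%0]   \n"  /* %0: operand 0, the 64-bit address      */
        :
        : "r"(out),               /* operand 0 */
          "r"(bits)               /* operand 1 */
        : "memory", "v0");
#else
    /* Portable fallback so the sketch builds on other architectures. */
    for (int i = 0; i < 4; ++i)
        out[i] = u;
#endif
}
```

Writing %1 instead of %w1 in the dup line would be expected to reproduce the operand mismatch error above, because the operand would then be printed as an x register.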