NEON floating-point arithmetic: dividing by a float using NEON intrinsics

I'm processing an image four pixels at a time, on an ARMv7 device for an Android application.

I want to divide a float32x4_t vector by another vector. The divisors vary from roughly 0.7 to 3.85, and as far as I can tell the only way to divide is a right shift, which only works for integer division by a power of two (2^n).

I'm also new to this, so any constructive help or comments are welcome.

Example:

How can I perform these operations with NEON intrinsics?

float32x4_t a = {25.3, 34.1, 11.0, 25.1};
float32x4_t b = {1.2, 3.5, 2.5, 2.0};
// something like this
float32x4_t resultado = a / b; // {21.08, 9.74, 4.4, 12.55}

Solution

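The NEON instruction set on ARMv7 does not have a floating-point divide. (AArch64 does provide one, exposed as vdivq_f32, but that is not available when targeting armv7.)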

If you know a priori that your values are not poorly scaled, and you do not require correct rounding (this is almost certainly the case if you're doing image processing), then you can use a reciprocal estimate, refinement step, and multiply instead of a divide:

// get an initial estimate of 1/b.
float32x4_t reciprocal = vrecpeq_f32(b);

// use a couple Newton-Raphson steps to refine the estimate. Depending on your
// application's accuracy requirements, you may be able to get away with only
// one refinement (instead of the two used here). Be sure to test!
reciprocal = vmulq_f32(vrecpsq_f32(b, reciprocal), reciprocal);
reciprocal = vmulq_f32(vrecpsq_f32(b, reciprocal), reciprocal);

// and finally, compute a/b = a*(1/b)
float32x4_t result = vmulq_f32(a, reciprocal);
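
To act on the "be sure to test" advice, the steps above can be wrapped in a small function and compared against ordinary scalar division. The following is a minimal sketch, not a definitive implementation: the names divide4 and max_rel_error4 are hypothetical helpers introduced here. The non-NEON branch is only a scalar model so the check also builds on a desktop machine. It relies on vrecpsq_f32(b, r) computing 2 - b*r per lane, and it uses a deliberately perturbed reciprocal as an assumed stand-in for the low-precision hardware estimate.

```c
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Lane-wise approximate division: out[i] ~= a[i] / b[i]. */
static void divide4(const float a[4], const float b[4], float out[4])
{
    float32x4_t va = vld1q_f32(a);
    float32x4_t vb = vld1q_f32(b);

    float32x4_t r = vrecpeq_f32(vb);       /* rough estimate of 1/b */
    r = vmulq_f32(vrecpsq_f32(vb, r), r);  /* r = r * (2 - b*r) */
    r = vmulq_f32(vrecpsq_f32(vb, r), r);  /* second refinement step */

    vst1q_f32(out, vmulq_f32(va, r));      /* a * (1/b) */
}
#else
/* Scalar model of the same arithmetic for hosts without NEON.
   ASSUMPTION: the 0.4% perturbation merely imitates a coarse
   initial estimate; it is not what vrecpeq_f32 actually returns. */
static void divide4(const float a[4], const float b[4], float out[4])
{
    for (int i = 0; i < 4; ++i) {
        float r = (1.0f / b[i]) * 1.004f;  /* stand-in rough estimate */
        r = r * (2.0f - b[i] * r);         /* Newton-Raphson step 1 */
        r = r * (2.0f - b[i] * r);         /* Newton-Raphson step 2 */
        out[i] = a[i] * r;
    }
}
#endif

/* Largest relative error of got[] against the exact quotient a[i]/b[i]. */
static float max_rel_error4(const float a[4], const float b[4],
                            const float got[4])
{
    float worst = 0.0f;
    for (int i = 0; i < 4; ++i) {
        float exact = a[i] / b[i];
        float err = (got[i] - exact) / exact;
        if (err < 0.0f) err = -err;        /* absolute value */
        if (err > worst) worst = err;
    }
    return worst;
}
```

Calling divide4 with the a and b values from the question and passing the output to max_rel_error4 gives a single number to compare against whatever tolerance the image pipeline can accept. Removing one refinement line and re-running the same check is a quick way to see whether a single step is enough for the 0.7 to 3.85 range mentioned in the question.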
