C# (float) cast is costly for speed if not used appropriately

andy_212

I'd like to share our findings about the C# (float) cast.

While converting our code from double to float, we ran into several slowdowns.
It turns out that the C# (float) cast can be costly if it is not used carefully.

------------------------------------------------------------
Slow cases
------------------------------------------------------------
(A)
private void someMath(float[] input, float[] output)
{
    int length = input.Length;
    for (int i = 0; i < length; i++)
    {
        output[i] = (float)Math.Log10(input[i]); // <--- inline (float) cast is slow!
    }
}

(B)
private void Copy(double[] input, float[] output)
{
    int length = input.Length;
    for (int i = 0; i < length; i++)
    {
        output[i] = (float)input[i]; // <--- inline (float) cast is slow!
    }
}

In these examples, the "inline" (float) cast is executed on the same line as
another operation, such as Math.Log10() or a simple read from the input array.

These are slow, even in a Release build.
(A): takes 3 to 6% longer than the double[] case. ;-)
(B): takes twice(!) as long as the double[] case. ;-)

As far as I understand, and according to articles on the net, the slowdown
comes from writing the intermediate value back to memory, as shown below.
The extra round trip is costly.

(A)  CPU/FPU   +--> fetch --> Math.Log10 --+      +--> (float) --+
               |                           |      |              |
               |                           |      |              |
               |                           V      |              V
     memory  input                   written back to heap     output
                                     Extra memory access!

(B)  CPU/FPU   +--> fetch --+      +--> (float) --+
               |            |      |              |
               |            |      |              |
               |            V      |              V
     memory  input    written back to heap     output
                      Extra memory access!

------------------------------------------------------------
Fast cases
------------------------------------------------------------

To avoid the extra memory access, we can store the intermediate value in a
temporary variable.
The temporary variable is kept in a CPU register, so the code stays fast.

(C)
private void someMath(float[] input, float[] output)
{
    int length = input.Length;
    for (int i = 0; i < length; i++)
    {
        double tmp = Math.Log10(input[i]); // <-- store in a temporary variable in CPU register
        output[i] = (float)tmp;            // <-- then (float) cast. Fast!
    }
}

(D)
private void Copy(double[] input, float[] output)
{
    int length = input.Length;
    for (int i = 0; i < length; i++)
    {
        double tmp = input[i];  // <-- store in a temporary variable in CPU register
        output[i] = (float)tmp; // <-- then (float) cast. Fast!
    }
}

In these improved versions, the intermediate value is not written back to
memory.
The improved versions are actually slightly faster than the double[] case.
(C): 1% faster than the double[] case.
(D): 3% faster than the double[] case.

(C)  CPU/FPU   +--> fetch --> Math.Log10 --> stays in -----> (float) --+
               |                           CPU register                |
               |                              Fast!                    |
               |                                                       V
     memory  input                                                  output

(D)  CPU/FPU   +--> fetch --> stays in -----> (float) --+
               |            CPU register                |
               |               Fast!                    |
               |                                        V
     memory  input                                   output

OK, this is what we found from benchmarking and googling.
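
For anyone who wants to reproduce the comparison between (B) and (D), here is a
minimal timing sketch. It is not the harness behind the numbers above. The
class name CastBenchmark, the method names, the array size and the repeat count
are all assumptions made for illustration, and the results will depend on the
runtime, the JIT and the CPU, so the gap may be smaller or absent on your setup.

```csharp
using System;
using System.Diagnostics;

static class CastBenchmark
{
    // Variant (B): the cast is on the same line as the array read.
    public static void CopyInline(double[] input, float[] output)
    {
        int length = input.Length;
        for (int i = 0; i < length; i++)
        {
            output[i] = (float)input[i];
        }
    }

    // Variant (D): read into a double local first, then cast.
    public static void CopyViaTemp(double[] input, float[] output)
    {
        int length = input.Length;
        for (int i = 0; i < length; i++)
        {
            double tmp = input[i];
            output[i] = (float)tmp;
        }
    }

    // Runs `copy` `repeats` times and returns the elapsed milliseconds.
    public static double TimeMs(Action<double[], float[]> copy,
                                double[] input, float[] output, int repeats)
    {
        copy(input, output); // warm-up call so JIT compilation is not timed
        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < repeats; r++)
        {
            copy(input, output);
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        // Assumed workload: one million doubles, copied 100 times per variant.
        double[] input = new double[1000000];
        for (int i = 0; i < input.Length; i++)
        {
            input[i] = i * 0.5;
        }
        float[] output = new float[input.Length];

        Console.WriteLine("inline cast: {0:F1} ms", TimeMs(CopyInline, input, output, 100));
        Console.WriteLine("via temp:    {0:F1} ms", TimeMs(CopyViaTemp, input, output, 100));
    }
}
```

Build it in Release mode and run it outside the debugger, otherwise the JIT
optimizations are disabled and the timings say little about production code.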

The same applies to ArraySegment<float> as well.
This is because the issue is with the float elements in the array, not with
the array itself.
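
As a rough sketch of what the temporary variable technique from (C) might look
like with ArraySegment<float>: the class name SegmentMath and the method name
Log10Segment are hypothetical, and we have not benchmarked this exact code.
It only uses the Array, Offset and Count properties of ArraySegment<T>.

```csharp
using System;

static class SegmentMath
{
    // Hypothetical helper: writes log10 of each input element into output,
    // using the same "double temporary, then cast" pattern as example (C).
    public static void Log10Segment(ArraySegment<float> input, ArraySegment<float> output)
    {
        if (input.Count != output.Count)
        {
            throw new ArgumentException("Segments must have the same length.");
        }

        float[] src = input.Array;
        float[] dst = output.Array;
        int srcOffset = input.Offset;
        int dstOffset = output.Offset;
        int length = input.Count;

        for (int i = 0; i < length; i++)
        {
            double tmp = Math.Log10(src[srcOffset + i]); // keep the intermediate as a double local
            dst[dstOffset + i] = (float)tmp;             // cast on its own line
        }
    }
}
```

A call might look like
SegmentMath.Log10Segment(new ArraySegment<float>(data, 1, 2), new ArraySegment<float>(result, 2, 2));
which reads data[1..2] and writes result[2..3].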

You could say this is a .NET compiler optimization issue.
If you know of any optimization flags or anything else that can fix this on
the compiler side, please let us know.
That would be a great help!
(By the way, a plain Release build does not help.)

Otherwise, we will need to optimize our code by hand using the temporary
variable technique shown in the examples.
Unfortunately, we have many instances of this kind of "inline" cast in our code.
