We have 2 cases with code blocks A, B and C. Which one is more efficient?

Latest recommended article published on 2022-08-08 16:33:03

穴工

Today I was asked this question. We have 2 cases with code blocks A, B and C. These code blocks don't share any resources except an iterator (int i).

Please give 3 possible reasons why case 1 could be faster than case 2, and 3 possible reasons why case 2 could be faster than case 1:

Case 1

for (i = 0; i < N; ++i) {
    A;
    B;
    C;
}

Case 2

for (i = 0; i < N; ++i) {
    A;
}
for (i = 0; i < N; ++i) {
    B;
}
for (i = 0; i < N; ++i) {
    C;
}
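
To make the two shapes concrete, here is a minimal C sketch. The bodies chosen for A, B and C (writes to three separate arrays) and the function names `run_case1` and `run_case2` are assumptions made for illustration; the original question leaves the blocks unspecified. Both functions should fill the arrays identically, since the blocks share nothing except the loop counter.

```c
#include <stddef.h>

/* Hypothetical stand-ins for blocks A, B and C: each writes to its own
   array, so the blocks share nothing except the loop counter i. */

/* Case 1: one loop runs A, B and C on every iteration. */
void run_case1(int *a, int *b, int *c, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        a[i] = (int)i * 2;   /* A */
        b[i] = (int)i + 1;   /* B */
        c[i] = (int)i - 3;   /* C */
    }
}

/* Case 2: three loops, each running a single block. */
void run_case2(int *a, int *b, int *c, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        a[i] = (int)i * 2;   /* A */
    for (size_t i = 0; i < n; ++i)
        b[i] = (int)i + 1;   /* B */
    for (size_t i = 0; i < n; ++i)
        c[i] = (int)i - 3;   /* C */
}
```

Which of the two runs faster is not something the source code alone decides; it depends on the real blocks, the compiler and the hardware.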

Depending on the circumstances, a program that is inefficient in one situation may perform well in another.

These are only reasons why one case could be faster; of course, it depends on what exactly A, B and C are.

Case 1

Only a single occurrence of the loop prologue/epilogue (less code to run).
Better scheduling of the code generated for A, B and C (more parallelism).
May factorize code (no dependency on outputs, but A, B and C may read the same inputs).
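
The last point can be sketched as follows, assuming hypothetical blocks A, B and C that all read the same input array `in` and write to separate outputs. The names `squares_split` and `squares_fused` and the arithmetic are invented for illustration. In the fused loop the load of `in[i]` and the shared product are written once, whereas the split version repeats them in every loop. An optimizing compiler may or may not perform the same factoring on its own.

```c
#include <stddef.h>

/* Case 2 shape: each loop reloads in[i] and recomputes in[i] * in[i]. */
void squares_split(const int *in, int *sq, int *cube, int *sq_plus1, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        sq[i] = in[i] * in[i];              /* A */
    for (size_t i = 0; i < n; ++i)
        cube[i] = in[i] * in[i] * in[i];    /* B */
    for (size_t i = 0; i < n; ++i)
        sq_plus1[i] = in[i] * in[i] + 1;    /* C */
}

/* Case 1 shape: one load and one multiplication are shared by A, B and C. */
void squares_fused(const int *in, int *sq, int *cube, int *sq_plus1, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int x  = in[i];       /* loaded once per iteration */
        int x2 = x * x;       /* common subexpression, computed once */
        sq[i]       = x2;     /* A */
        cube[i]     = x2 * x; /* B */
        sq_plus1[i] = x2 + 1; /* C */
    }
}
```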

Case 2

Lower register pressure in each loop (avoids spilling).
More likely to unroll the loop (when A, B or C is trivial).
More likely that the entire loop fits into the instruction cache (useful when N is big).
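
As an illustration of the "trivial block" point, the sketch below uses hypothetical blocks: A zero-fills an array, B copies one, and C sums one. When each sits in its own loop, the body is a single simple statement, which some optimizing compilers can unroll or vectorize, or (for the fill) replace with a library call such as `memset`. Whether that happens depends on the compiler and its flags; the function name `run_split_trivial` and the array roles are assumptions.

```c
#include <stddef.h>

/* Case 2 shape with trivial blocks. Returns the sum computed by C. */
long run_split_trivial(int *zeroed, int *dst, const int *src,
                       const int *values, size_t n)
{
    long sum = 0;

    for (size_t i = 0; i < n; ++i)
        zeroed[i] = 0;        /* A: a fill loop */
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];      /* B: a copy loop */
    for (size_t i = 0; i < n; ++i)
        sum += values[i];     /* C: a reduction */

    return sum;
}
```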

The second highly voted answer:

Assuming no dependencies between A, B and C, I would guess that Case 2 would normally be faster than Case 1 because of:

Data locality
Code locality
Branch prediction

However, if the code blocks are very short, then theoretically the extra loop overhead in Case 2 might dominate. Note also @James Kanze's answer, which gives another reason why Case 1 could be faster.
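
The overhead argument can be made concrete by counting how often the loop condition `i < N` is evaluated at the source level. In the sketch below the blocks are left empty and `count_case1` and `count_case2` are hypothetical helpers; a real compiler may unroll or restructure such loops, so the counts describe the written code, not the generated machine code.

```c
#include <stddef.h>

/* Case 1: one loop, so the condition is evaluated n + 1 times. */
size_t count_case1(size_t n)
{
    size_t checks = 0;
    /* The comma operator bumps the counter, then yields i < n. */
    for (size_t i = 0; (++checks, i < n); ++i) {
        /* A; B; C; would go here */
    }
    return checks;
}

/* Case 2: three loops, so the condition is evaluated 3 * (n + 1) times. */
size_t count_case2(size_t n)
{
    size_t checks = 0;
    for (size_t i = 0; (++checks, i < n); ++i) { /* A; */ }
    for (size_t i = 0; (++checks, i < n); ++i) { /* B; */ }
    for (size_t i = 0; (++checks, i < n); ++i) { /* C; */ }
    return checks;
}
```

For N = 10 this gives 11 condition checks for case 1 and 33 for case 2. The difference only matters when A, B and C are cheap enough that the loop bookkeeping is a noticeable share of the work.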

Of course, if there are truly no dependencies, then the compiler is free to transform Case 1 into Case 2, and vice versa.

Reference:

http://programmers.stackexchange.com/questions/64132/interesting-interview-question