Which is faster: while(1) or while(2)?

Latest recommended article published 2022-07-21 01:22:18

asdfgh0077

Tags: c performance while-loop

Original link: https://oldbug.net/q/1gGBz/Which-is-faster-while-1-or-while-2

Translated from: Which is faster: while(1) or while(2)?

This was an interview question asked by a senior manager.

Which is faster?

while(1) {
    // Some code
}

or

while(2) {
    //Some code
}

I said that both have the same execution speed, because the expression inside while must ultimately evaluate to true or false. In this case, both evaluate to true, and there are no extra conditional instructions inside the while condition. So both will have the same speed of execution, and I prefer while(1).

But the interviewer said confidently: "Check your basics. while(1) is faster than while(2)." (He was not testing my confidence.)

Is this true?

Reply #1

Reference: https://stackoom.com/question/1gGBz/哪个更快-while-或while

Reply #2

Your explanation is correct. This seems to be a question that tests your self-confidence in addition to your technical knowledge.

By the way, if you had answered

"Both pieces of code are equally fast, because both take infinite time to complete"

the interviewer would say

"But while (1) can do more iterations per second; can you explain why?" (this is nonsense; testing your confidence again)

So by answering as you did, you saved some time that you would otherwise have wasted discussing this bad question.

Here is example code generated by the compiler on my system (MS Visual Studio 2012), with optimizations turned off:

yyy:
    xor eax, eax
    cmp eax, 1     ; or 2, depending on your code
    je xxx
    jmp yyy
xxx:
    ...

With optimizations turned on:

xxx:
    jmp xxx

So the generated code is exactly the same, at least with an optimizing compiler.

Reply #3

Both loops are infinite, but we can see which one takes more instructions/resources per iteration.

Using gcc, I compiled the following two programs to assembly at varying levels of optimization:

int main(void) {
    while(1) {}
    return 0;
}

int main(void) {
    while(2) {}
    return 0;
}

Even with no optimizations (-O0), the generated assembly was identical for both programs. Therefore, there is no speed difference between the two loops.

For reference, here is the generated assembly (using gcc main.c -S -masm=intel with an optimization flag):

With -O0:

    .file   "main.c"
    .intel_syntax noprefix
    .def    __main; .scl    2;  .type   32; .endef
    .text
    .globl  main
    .def    main;   .scl    2;  .type   32; .endef
    .seh_proc   main
main:
    push    rbp
    .seh_pushreg    rbp
    mov rbp, rsp
    .seh_setframe   rbp, 0
    sub rsp, 32
    .seh_stackalloc 32
    .seh_endprologue
    call    __main
.L2:
    jmp .L2
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

With -O1:

    .file   "main.c"
    .intel_syntax noprefix
    .def    __main; .scl    2;  .type   32; .endef
    .text
    .globl  main
    .def    main;   .scl    2;  .type   32; .endef
    .seh_proc   main
main:
    sub rsp, 40
    .seh_stackalloc 40
    .seh_endprologue
    call    __main
.L2:
    jmp .L2
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

With -O2 and -O3 (same output):

    .file   "main.c"
    .intel_syntax noprefix
    .def    __main; .scl    2;  .type   32; .endef
    .section    .text.startup,"x"
    .p2align 4,,15
    .globl  main
    .def    main;   .scl    2;  .type   32; .endef
    .seh_proc   main
main:
    sub rsp, 40
    .seh_stackalloc 40
    .seh_endprologue
    call    __main
.L2:
    jmp .L2
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

In fact, the assembly generated for the loop is identical for every level of optimization:

 .L2:
    jmp .L2
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

The important bits being:

.L2:
    jmp .L2

I can't read assembly very well, but this is obviously an unconditional loop. The jmp instruction unconditionally sends the program back to the .L2 label without even comparing a value against true, and of course immediately does so again until the program is somehow ended. This directly corresponds to the C/C++ code:

L2:
    goto L2;
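
The correspondence can be checked with a small sketch that is not part of the original answer. The function names and the `limit` parameter are hypothetical, and a counter with an exit condition is added only so that the loops terminate and can be compared.

```c
#include <assert.h>

/* Hypothetical helpers: each counts up to `limit` using a different
   spelling of an "infinite" loop and returns the iteration count. */

unsigned long count_while_1(unsigned long limit) {
    unsigned long n = 0;
    while (1) {
        if (n == limit) break;
        ++n;
    }
    return n;
}

unsigned long count_while_2(unsigned long limit) {
    unsigned long n = 0;
    while (2) {
        if (n == limit) break;
        ++n;
    }
    return n;
}

unsigned long count_goto(unsigned long limit) {
    unsigned long n = 0;
top:
    if (n == limit) goto done;   /* stands in for `break` */
    ++n;
    goto top;                    /* unconditional jump back, like `jmp .L2` */
done:
    return n;
}

/* Returns 1 when all three spellings ran the same number of iterations. */
int loops_agree(unsigned long limit) {
    unsigned long a = count_while_1(limit);
    assert(a == limit);
    return a == count_while_2(limit) && a == count_goto(limit);
}
```

Calling `loops_agree(1000)` from a `main` of your own should return 1. Compiling the file with `gcc -S` lets you check on your own toolchain whether the first two functions get the same loop body; the output quoted above suggests they will, but that is something to verify rather than assume.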

Edit:

Interestingly enough, even with no optimizations, the following loops all produced the exact same output (an unconditional jmp) in assembly:

while(42) {}

while(1==1) {}

while(2==2) {}

while(4<7) {}

while(3==3 && 4==4) {}

while(8-9 < 0) {}

while(4.3 * 3e4 >= 2 << 6) {}

while(-0.1 + 02) {}
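
One way to see why these loops can collapse to the same jump is that each controlling expression has a value known before the program runs. The sketch below is an illustration, not the answerer's method: it uses C11 `_Static_assert` for the expressions that are integer constant expressions, and a hypothetical helper, `all_conditions_nonzero`, for the floating-point ones, which do not qualify as integer constant expressions and are therefore checked at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Compile-time checks: compilation fails if any of these is zero. */
_Static_assert(42, "42 is nonzero");
_Static_assert(1 == 1, "1==1 is nonzero");
_Static_assert(2 == 2, "2==2 is nonzero");
_Static_assert(4 < 7, "4<7 is nonzero");
_Static_assert(3 == 3 && 4 == 4, "3==3 && 4==4 is nonzero");
_Static_assert(8 - 9 < 0, "8-9 < 0 is nonzero");

/* Hypothetical helper: returns 1 if every controlling expression from
   the list above, including the floating-point ones, is nonzero. */
int all_conditions_nonzero(void) {
    const int integer_valued[] = {
        42, 1 == 1, 2 == 2, 4 < 7, 3 == 3 && 4 == 4, 8 - 9 < 0,
        4.3 * 3e4 >= (2 << 6)      /* a comparison, so its type is int */
    };
    const double floating_valued = -0.1 + 02;   /* 02 is octal 2 */
    size_t i;

    for (i = 0; i < sizeof integer_valued / sizeof integer_valued[0]; ++i) {
        if (integer_valued[i] == 0) return 0;
    }
    return floating_valued != 0.0;
}
```

Whether a given compiler folds the floating-point cases at `-O0` is an implementation detail; the answer above reports that gcc 4.8.1 did.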

And even, to my amazement:

#include<math.h>

while(sqrt(7)) {}

while(hypot(3,4)) {}

Things get a little more interesting with user-defined functions:

int x(void) {
    return 1;
}

while(x()) {}

#include<math.h>

double x(void) {
    return sqrt(7);
}

while(x()) {}

At -O0, these two examples actually call x and perform a comparison on each iteration.

First example (returning 1):

.L4:
    call    x
    testl   %eax, %eax
    jne .L4
    movl    $0, %eax
    addq    $32, %rsp
    popq    %rbp
    ret
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

Second example (returning sqrt(7)):

.L4:
    call    x
    xorpd   %xmm1, %xmm1
    ucomisd %xmm1, %xmm0
    jp  .L4
    xorpd   %xmm1, %xmm1
    ucomisd %xmm1, %xmm0
    jne .L4
    movl    $0, %eax
    addq    $32, %rsp
    popq    %rbp
    ret
    .seh_endproc
    .ident  "GCC: (tdm64-2) 4.8.1"

However, at -O1 and above, they both produce the same assembly as the previous examples (an unconditional jmp back to the preceding label).

TL;DR

Under GCC, the different loops are compiled to identical assembly. The compiler evaluates the constant values and doesn't bother performing any actual comparison.

The moral of the story is:

There exists a layer of translation between C++ source code and CPU instructions, and this layer has important implications for performance.
Therefore, performance cannot be evaluated by only looking at source code.
The compiler should be smart enough to optimize such trivial cases. Programmers should not waste their time thinking about them in the vast majority of cases.

Reply #4

You should have asked him how he reached that conclusion. Under any decent compiler out there, the two compile to the same asm instructions. So he should have told you which compiler he had in mind to begin with. And even then, you would have to know the compiler and platform very well to make even a theoretical educated guess. In the end, it doesn't really matter in practice, since other external factors such as memory fragmentation or system load will influence the loop more than this detail.

Reply #5

Here's a problem: if you actually write a program and measure its speed, the speed of the two loops could be different! For a reasonable comparison:

unsigned long i = 0;
while (1) { if (++i == 1000000000) break; }

unsigned long i = 0;
while (2) { if (++i == 1000000000) break; }

with some code added that prints the time, some random effect, such as how the loop is positioned within one or two cache lines, could make a difference. One loop might by pure chance be completely within one cache line, or at the start of a cache line, or it might straddle two cache lines. As a result, whatever the interviewer claims is fastest might actually be fastest, by coincidence.
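
A minimal sketch of such a measurement follows, using only the standard `clock()` function. The names `spin_while_1`, `spin_while_2` and `time_spin` are hypothetical, and the `volatile` qualifier is an assumption added here so that an optimizer does not replace the whole loop with a single assignment; it is not in the original snippets.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helpers. `limit` must be at least 1, otherwise the
   counter has to wrap around before the loop exits. */
unsigned long spin_while_1(unsigned long limit) {
    volatile unsigned long i = 0;   /* volatile: assumption, keeps the loop alive */
    while (1) { if (++i == limit) break; }
    return i;
}

unsigned long spin_while_2(unsigned long limit) {
    volatile unsigned long i = 0;
    while (2) { if (++i == limit) break; }
    return i;
}

/* Runs `spin` once and returns the CPU time it took, in seconds. */
double time_spin(unsigned long (*spin)(unsigned long), unsigned long limit) {
    clock_t start = clock();
    unsigned long result = spin(limit);
    clock_t end = clock();

    assert(result == limit);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

To compare, call `time_spin(spin_while_1, 1000000000UL)` and `time_spin(spin_while_2, 1000000000UL)` several times in alternating order and look at the spread. A single run in a fixed order says little, for exactly the reasons given above.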

Worst case scenario: an optimising compiler doesn't figure out what the loop does, but figures out that the values produced when the second loop is executed are the same ones as produced by the first one, and generates full code for the first loop but not for the second.

Reply #6

The existing answers showing the code generated by a particular compiler for a particular target with a particular set of options do not fully answer the question, unless the question was asked in that specific context ("Which is faster using gcc 4.7.2 for x86_64 with default options?", for example).

As far as the language definition is concerned, in the abstract machine while (1) evaluates the integer constant 1, and while (2) evaluates the integer constant 2; in both cases the result is compared for equality to zero. The language standard says absolutely nothing about the relative performance of the two constructs.

I can imagine that an extremely naive compiler might generate different machine code for the two forms, at least when compiled without requesting optimization.

On the other hand, C compilers absolutely must evaluate some constant expressions at compile time, when they appear in contexts that require a constant expression. For example, this:

int n = 4;
switch (n) {
    case 2+2: break;
    case 4:   break;
}

requires a diagnostic; a lazy compiler does not have the option of deferring the evaluation of 2+2 until execution time. Since a compiler has to have the ability to evaluate constant expressions at compile time, there's no good reason for it not to take advantage of that capability even when it's not required.
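
A few more contexts where C requires a constant expression are sketched below. The identifiers `FOUR`, `table`, `classify` and `table_length` are hypothetical names chosen for illustration. The duplicate label is left commented out so that the file compiles.

```c
#include <assert.h>

enum { FOUR = 2 + 2 };      /* enumerator value: must be a constant expression */
static int table[2 + 2];    /* file-scope array size: must be a constant expression */

/* Hypothetical helper: reports which case label matched. */
int classify(int n) {
    switch (n) {
        case 2 + 2:          /* evaluated at compile time, same label as `case 4:` */
            return 1;
        /* case 4: return 3;    uncommenting this duplicates the label above,
                                so the compiler must issue a diagnostic */
        case FOUR + 1:
            return 2;
        default:
            return 0;
    }
}

/* Number of elements in `table`, computed from its declared size. */
int table_length(void) {
    return (int)(sizeof table / sizeof table[0]);
}
```

Here `classify(4)` takes the `2 + 2` branch, and `table_length()` reports 4 elements, both decided before the program runs.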

The C standard (N1570 6.8.5p4) says that

"An iteration statement causes a statement called the loop body to be executed repeatedly until the controlling expression compares equal to 0."

So the relevant constant expressions are 1 == 0 and 2 == 0, both of which evaluate to the int value 0. (These comparisons are implicit in the semantics of the while loop; they don't exist as actual C expressions.)
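
To make the implicit comparison visible, here is a sketch that spells out `while (controlling) { ... }` by hand with labels and `goto`. It is an illustration of the quoted semantics, not a description of how any particular compiler lowers the loop. The name `run_loop` and the `limit` exit condition are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: behaves like
       while (controlling) { if (iterations == limit) break; ++iterations; }
   with the comparison against 0 written out explicitly. */
unsigned long run_loop(int controlling, unsigned long limit) {
    unsigned long iterations = 0;
test:
    if (controlling == 0) goto done;      /* the implicit "compares equal to 0" */
    if (iterations == limit) goto done;   /* stands in for `break` */
    ++iterations;
    goto test;
done:
    assert(iterations <= limit);
    return iterations;
}
```

With this spelling, `run_loop(1, 50)` and `run_loop(2, 50)` both perform 50 iterations, as does any other nonzero value, while `run_loop(0, 50)` performs none. The only property of the controlling value that matters is whether it equals zero.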

A perversely naive compiler could generate different code for the two constructs. For example, for the first it could generate an unconditional infinite loop (treating 1 as a special case), and for the second it could generate an explicit run-time comparison equivalent to 2 != 0. But I've never encountered a C compiler that would actually behave that way, and I seriously doubt that such a compiler exists.

Most compilers (I'm tempted to say all production-quality compilers) have options to request additional optimizations. Under such an option, it's even less likely that any compiler would generate different code for the two forms.

If your compiler generates different code for the two constructs, first check whether the differing code sequences actually have different performance. If they do, try compiling again with an optimization option (if available). If they still differ, submit a bug report to the compiler vendor. It's not (necessarily) a bug in the sense of a failure to conform to the C standard, but it's almost certainly a problem that should be corrected.

Bottom line: while (1) and while(2) almost certainly have the same performance. They have exactly the same semantics, and there's no good reason for any compiler not to generate identical code.

And though it's perfectly legal for a compiler to generate faster code for while(1) than for while(2), it's equally legal for a compiler to generate faster code for while(1) than for another occurrence of while(1) in the same program.

(There's another question implicit in the one you asked: how do you deal with an interviewer who insists on an incorrect technical point? That would probably be a good question for the Workplace site.)

See also: Is "for(;;)" faster than "while (TRUE)"? If not, why do people use it?
