likely and unlikely in the Linux kernel
Kernel version:2.6.14
CPU architecture:ARM920T
Author:ce123(http://blog.csdn.net/ce123)
GCC version:arm-linux-gcc-3.4.1
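When reading kernel code you often run into statements such as if (likely(...)) {} or if (unlikely(...)), whose purpose is not obvious at first. For example (from copy_process in kernel/fork.c):
    SET_LINKS(p);
    if (unlikely(p->ptrace & PT_PTRACED))
        __ptrace_link(p, current->parent);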
Let us look at this in detail.
likely() and unlikely() are two macros defined by the kernel in include/linux/compiler.h, as follows:
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
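The `!!(x)` in both macros matters because __builtin_expect compares its first argument with the constant given as the second argument. The following is a minimal sketch (the function names likely_value and pointer_is_set are made up for illustration and do not come from the kernel) showing that `!!` collapses any nonzero value, including a pointer, to 1 before that comparison:

```c
#include <stddef.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: returns what likely() evaluates to for an int.
 * 0 stays 0, any other value becomes 1. */
int likely_value(int x)
{
    return likely(x);
}

/* Hypothetical helper: a pointer can be passed directly, because
 * !!(p) is already an int (0 for NULL, 1 otherwise). */
int pointer_is_set(const char *p)
{
    return likely(p);
}
```

Without the `!!`, `__builtin_expect(x, 1)` with x == 42 would still take the if branch, but the hint would state that the value 1 exactly is expected, which is not what the programmer means.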
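__builtin_expect is a built-in function introduced during the GCC 2.96 development cycle. It tells the compiler which outcome of a conditional branch to expect, so that the compiler can arrange the code to avoid wasting time on jumps along the common path. The GCC manual (http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html) describes it as follows: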
-- Built-in Function: long __builtin_expect (long EXP, long C)
You may use `__builtin_expect' to provide the compiler with branch
prediction information. In general, you should prefer to use
actual profile feedback for this (`-fprofile-arcs'), as
programmers are notoriously bad at predicting how their programs
actually perform. However, there are applications in which this
data is hard to collect.
The return value is the value of EXP, which should be an integral
expression. The value of C must be a compile-time constant. The
semantics of the built-in are that it is expected that EXP == C.
For example:
    if (__builtin_expect (x, 0))
        foo ();
would indicate that we do not expect to call `foo', since we
expect `x' to be zero. Since you are limited to integral
expressions for EXP, you should use constructions such as
    if (__builtin_expect (ptr != NULL, 1))
        error ();
when testing pointer or floating-point values.
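In short: programmers are generally poor at predicting how their programs actually behave, so the manual recommends real profile feedback where possible. Where that data is hard to collect, __builtin_expect lets you give the compiler branch prediction information by hand. The first argument EXP is an integral expression, and the built-in returns the value of EXP unchanged. C must be a compile-time constant. The semantics are that you expect EXP == C, and GCC uses this to place the expected branch where it runs most cheaply. Because EXP must be an integral expression, a pointer or floating-point value should first be turned into one with a comparison such as ptr != NULL.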
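A small sketch of two points from the manual text: the built-in returns EXP unchanged, and a pointer is tested through an integral comparison. The function names passthrough and read_or_default are illustrative assumptions, not part of GCC or the kernel.

```c
#include <stddef.h>

/* The hint does not alter the value: this returns v itself,
 * whether or not v equals the expected constant 0. */
long passthrough(long v)
{
    return __builtin_expect(v, 0);
}

/* ptr != NULL is an integral expression, so it can be passed as EXP.
 * The hint says that a valid pointer is the common case. */
int read_or_default(const int *ptr, int fallback)
{
    if (__builtin_expect(ptr != NULL, 1))
        return *ptr;
    return fallback;
}
```

Calling read_or_default with a NULL pointer still returns the fallback correctly; the hint only says that this case is expected to be rare.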
With older GCC versions (__GNUC_MINOR__ < 96 under GCC 2.x), __builtin_expect is defined to simply return EXP. The following code is taken from include/linux/compiler-gcc2.h.
/* These definitions are for GCC v2.x. */
/* Somewhere in the middle of the GCC 2.96 development cycle, we implemented
a mechanism by which the user can annotate likely branch directions and
expect the blocks to be reordered appropriately. Define __builtin_expect
to nothing for earlier compilers. */
#include <linux/compiler-gcc.h>
#if __GNUC_MINOR__ < 96
# define __builtin_expect(x, expected_value) (x)
#endif
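The same fallback idea can be applied in user code that has to build with compilers lacking the built-in. The sketch below is only an assumption about how such a fallback might be written; the MY_LIKELY and MY_UNLIKELY names and the clamp_length function are hypothetical, and this is not the kernel's header.

```c
/* Use the hint where the compiler provides it (GCC 3 and later, and
 * compilers that define __GNUC__ compatibly); otherwise fall back to
 * the plain truth value, much as compiler-gcc2.h does for old GCC. */
#if defined(__GNUC__) && (__GNUC__ >= 3)
# define MY_LIKELY(x)   __builtin_expect(!!(x), 1)
# define MY_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
# define MY_LIKELY(x)   (!!(x))
# define MY_UNLIKELY(x) (!!(x))
#endif

/* Hypothetical example: a negative length is treated as the rare case. */
int clamp_length(int len)
{
    if (MY_UNLIKELY(len < 0))
        return 0;
    return len;
}
```

Either way the macros evaluate to 0 or 1, so code using them behaves the same whether or not the hint is available.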
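To summarize: write if() exactly as before. When you believe the condition is very likely to be true, wrap the expression in likely(); when it is very unlikely to be true (for example, a rare error case), wrap it in unlikely(). Let us look at an example.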
//test_builtin_expect.c
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
int test_likely(int x)
{
    if(likely(x))
        x = 5;
    else
        x = 6;
    return x;
}
int test_unlikely(int x)
{
    if(unlikely(x))
        x = 5;
    else
        x = 6;
    return x;
}
root@czu:~/桌面/socket# arm-linux-gcc -fprofile-arcs -O2 -c test.c
root@czu:~/桌面/socket# arm-linux-gcc -fprofile-arcs -O2 -o test test.c
root@czu:~/桌面/socket# arm-linux-objdump -D test > test.dis
000088cc <test_likely>:
88cc: e3500000 cmp r0, #0 ; 0x0
88d0: e92d4010 stmdb sp!, {r4, lr}
88d4: e59fc044 ldr ip, [pc, #68] ; 8920 <.text+0x148>
88d8: e59fe044 ldr lr, [pc, #68] ; 8924 <.text+0x14c>
88dc: e3a00005 mov r0, #5 ; 0x5
88e0: 0a000006 beq 8900 <test_likely+0x34> // the cmp above compared r0 with 0; x is expected to be nonzero, so the nonzero path falls through and only x == 0 takes the branch
88e4: e89c0018 ldmia ip, {r3, r4}
88e8: e3a02000 mov r2, #0 ; 0x0
88ec: e3a01001 mov r1, #1 ; 0x1
88f0: e0933001 adds r3, r3, r1
88f4: e0a44002 adc r4, r4, r2
88f8: e88c0018 stmia ip, {r3, r4}
88fc: e8bd8010 ldmia sp!, {r4, pc}
8900: e89e0006 ldmia lr, {r1, r2}
8904: e3a04000 mov r4, #0 ; 0x0
8908: e3a03001 mov r3, #1 ; 0x1
890c: e0911003 adds r1, r1, r3
8910: e0a22004 adc r2, r2, r4
8914: e3a00006 mov r0, #6 ; 0x6
8918: e88e0006 stmia lr, {r1, r2}
891c: e8bd8010 ldmia sp!, {r4, pc}
8920: 000121e0 andeq r2, r1, r0, ror #3
8924: 000121e8 andeq r2, r1, r8, ror #3
00008928 <test_unlikely>:
8928: e3500000 cmp r0, #0 ; 0x0
892c: e92d4010 stmdb sp!, {r4, lr}
8930: e59fc044 ldr ip, [pc, #68] ; 897c <.text+0x1a4>
8934: e59fe044 ldr lr, [pc, #68] ; 8980 <.text+0x1a8>
8938: e3a00005 mov r0, #5 ; 0x5
893c: 1a000007 bne 8960 <test_unlikely+0x38> // the cmp above compared r0 with 0; x is expected to be 0, so the x == 0 path falls through and only nonzero x takes the branch
8940: e89c0018 ldmia ip, {r3, r4}
8944: e3a02000 mov r2, #0 ; 0x0
8948: e3a01001 mov r1, #1 ; 0x1
894c: e0933001 adds r3, r3, r1
8950: e0a44002 adc r4, r4, r2
8954: e3a00006 mov r0, #6 ; 0x6
8958: e88c0018 stmia ip, {r3, r4}
895c: e8bd8010 ldmia sp!, {r4, pc}
8960: e89e0006 ldmia lr, {r1, r2}
8964: e3a04000 mov r4, #0 ; 0x0
8968: e3a03001 mov r3, #1 ; 0x1
896c: e0911003 adds r1, r1, r3
8970: e0a22004 adc r2, r2, r4
8974: e88e0006 stmia lr, {r1, r2}
8978: e8bd8010 ldmia sp!, {r4, pc}
897c: 000121f8 streqd r2, [r1], -r8
8980: 000121f0 streqd r2, [r1], -r0
What happens if we change the code so that it does not use the two macros?
//test_builtin_expect.c
int test_likely(int x)
{
    if(x)
        x = 5;
    else
        x = 6;
    return x;
}
int test_unlikely(int x)
{
    if(x)
        x = 5;
    else
        x = 6;
    return x;
}
The disassembly is as follows:
00008460 <test_likely>:
8460: e3500000 cmp r0, #0 ; 0x0
8464: 03a00006 moveq r0, #6 ; 0x6
8468: 13a00005 movne r0, #5 ; 0x5
846c: e1a0f00e mov pc, lr
00008470 <test_unlikely>:
8470: e3500000 cmp r0, #0 ; 0x0
8474: 03a00006 moveq r0, #6 ; 0x6
8478: 13a00005 movne r0, #5 ; 0x5
847c: e1a0f00e mov pc, lr
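As the macro version of the example shows, the two functions compile to different branch instructions (beq in test_likely, bne in test_unlikely). Looking closely, __builtin_expect arranges the code so that in the common case no jump is taken. It is only an optimization hint to the compiler and does not change how the condition is evaluated. The sole effect of likely and unlikely is to choose whether the if branch or the else branch is placed directly after the conditional jump. likely(EXP) says that EXP is probably true, and unlikely(EXP) says that EXP is probably false. When the programmer knows that EXP is true (or false) most of the time, using likely (or unlikely) makes the if branch (or the else branch) follow the jump instruction immediately, so most executions fall through without jumping and avoid the cost of a taken branch.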
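That the hint affects only code layout and not the result can be checked with a short sketch. The function names below are made up for illustration; the three variants mirror the test functions from this article, with and without the macros.

```c
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Same logic three times: no hint, likely hint, unlikely hint. */
int plain(int x)           { if (x) return 5; return 6; }
int hinted_likely(int x)   { if (likely(x)) return 5; return 6; }
int hinted_unlikely(int x) { if (unlikely(x)) return 5; return 6; }

/* Returns 1 if all three variants agree for every x in [lo, hi],
 * 0 otherwise. hi is assumed to be smaller than INT_MAX. */
int hints_agree(int lo, int hi)
{
    int x;

    for (x = lo; x <= hi; x++) {
        if (plain(x) != hinted_likely(x) ||
            plain(x) != hinted_unlikely(x))
            return 0;
    }
    return 1;
}
```

Even a hint that is wrong for most inputs leaves the return values unchanged; it can only make the common path slower.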
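One more point to note: the assembly for the macro version was generated with arm-linux-gcc -fprofile-arcs -O2 -c test_builtin_expect.c, not with arm-linux-gcc -O2 -c test_builtin_expect.c. The ldmia/adds/adc/stmia sequences in that listing are the arc counters added by -fprofile-arcs. Without them the function bodies are presumably small enough that GCC emits conditional moves (moveq/movne), as in the listing for the version without macros, leaving no branch for the hint to influence.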