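First, wherever you come across the likely and unlikely macros, keep one thing in mind:
if(likely(value)) is equivalent to if(value)
if(unlikely(value)) is also equivalent to if(value)
In other words, the if branch runs when value is true and the else branch runs when it is false. For the purpose of reading and understanding the code, all three forms mean the same thing!
The two macros are usually defined as follows: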
#define likely(x) __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
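A minimal sketch of the equivalence claimed above, assuming a GCC-compatible compiler. The `!!(x)` in the macros collapses any non-zero value (including a non-NULL pointer) to 1, so the macro always yields 0 or 1 and takes the same branch as the bare condition would. The function names branch_plain, branch_likely and branch_unlikely are made up for this illustration.

```c
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Each function returns 1 if the if branch ran and 0 if the else branch ran. */
int branch_plain(const void *p)
{
    if (p)
        return 1;
    else
        return 0;
}

int branch_likely(const void *p)
{
    if (likely(p))      /* hint: p is usually non-NULL */
        return 1;
    else
        return 0;
}

int branch_unlikely(const void *p)
{
    if (unlikely(p))    /* hint: p is usually NULL */
        return 1;
    else
        return 0;
}
```

Whatever pointer is passed in, the three functions return the same value; only the hint given to the compiler differs.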
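__builtin_expect() is provided by GCC (version >= 2.96) so that programmers can pass branch-likelihood information to the compiler. The compiler can then optimize the code to reduce the performance cost of jump instructions. The official GCC documentation describes __builtin_expect() as follows (feel free to skip it):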
-- Built-in Function: long __builtin_expect (long EXP, long C)
You may use `__builtin_expect' to provide the compiler with branch
prediction information. In general, you should prefer to use
actual profile feedback for this (`-fprofile-arcs'), as
programmers are notoriously bad at predicting how their programs
actually perform. However, there are applications in which this
data is hard to collect.
The return value is the value of EXP, which should be an integral
expression. The value of C must be a compile-time constant. The
semantics of the built-in are that it is expected that EXP == C.
For example:
if (__builtin_expect (x, 0))
foo ();
would indicate that we do not expect to call `foo', since we
expect `x' to be zero. Since you are limited to integral
expressions for EXP, you should use constructions such as
if (__builtin_expect (ptr != NULL, 1))
error ();
when testing pointer or floating-point values.
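In other words, __builtin_expect tells the compiler that EXP == C is the expected outcome; it does not change which branch is taken at run time. With the macros above, likely (C is 1) typically makes GCC place the statements of the if branch immediately after the conditional jump, while unlikely (C is 0) makes it place the statements of the else branch there. A small test file, test.c, shows this in practice.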
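The listing of test.c is missing here. A reconstruction that is consistent with the disassembly further down (a comparison against 2, x++ on the if branch, x-- on the else branch) might look like the following. The function bodies are inferred from the assembly and are not the original file.

```c
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hint to the compiler: x == 2 is the common case. */
int test_likely(int x)
{
    if (likely(x == 2)) {
        x++;
    } else {
        x--;
    }
    return x;
}

/* Hint to the compiler: x == 2 is the rare case. */
int test_unlikely(int x)
{
    if (unlikely(x == 2)) {
        x++;
    } else {
        x--;
    }
    return x;
}
```

Both functions compute the same result for every input; the hint only affects how the compiler arranges the two branches.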
Compile the file and dump the disassembly of the object file:
gcc -fprofile-arcs -O2 -c test.c
objdump -d test.o
This gives the following disassembly:
test.o: file format elf32-i386
Disassembly of section .text:
00000000 <test_likely>:
0: 55 push %ebp
1: 83 05 00 00 00 00 01 addl $0x1,0x0
8: 89 e5 mov %esp,%ebp
a: 83 15 04 00 00 00 00 adcl $0x0,0x4
11: 83 7d 08 02 cmpl $0x2,0x8(%ebp) //note this: tests x == 2
15: 75 15 jne 2c <test_likely+0x2c> //jne: jumps only when x != 2
17: 83 05 08 00 00 00 01 addl $0x1,0x8 //if branch (x++) starts here
1e: b8 03 00 00 00 mov $0x3,%eax
23: 83 15 0c 00 00 00 00 adcl $0x0,0xc
2a: 5d pop %ebp
2b: c3 ret
2c: 8b 45 08 mov 0x8(%ebp),%eax //jump lands here: else branch (x--)
2f: 5d pop %ebp
30: 83 e8 01 sub $0x1,%eax
33: 83 05 10 00 00 00 01 addl $0x1,0x10
3a: 83 15 14 00 00 00 00 adcl $0x0,0x14
41: c3 ret
42: 8d b4 26 00 00 00 00 lea 0x0(%esi),%esi
49: 8d bc 27 00 00 00 00 lea 0x0(%edi),%edi
00000050 <test_unlikely>:
50: 55 push %ebp
51: 89 e5 mov %esp,%ebp
53: 8b 45 08 mov 0x8(%ebp),%eax
56: 83 05 18 00 00 00 01 addl $0x1,0x18
5d: 83 15 1c 00 00 00 00 adcl $0x0,0x1c
64: 83 f8 02 cmp $0x2,%eax //note this: tests x == 2
67: 74 13 je 7c <test_unlikely+0x2c> //je: jumps as soon as x == 2
69: 83 e8 01 sub $0x1,%eax //else-branch code: x--
6c: 83 05 28 00 00 00 01 addl $0x1,0x28
73: 83 15 2c 00 00 00 00 adcl $0x0,0x2c
7a: 5d pop %ebp
7b: c3 ret
7c: 83 05 20 00 00 00 01 addl $0x1,0x20 //jump lands here: if branch (x++) starts here
83: b0 03 mov $0x3,%al
85: 83 15 24 00 00 00 00 adcl $0x0,0x24
8c: 5d pop %ebp
8d: c3 ret
8e: 66 90 xchg %ax,%ax
00000090 <_GLOBAL__I_0_test_unlikely>:
90: 55 push %ebp
91: 89 e5 mov %esp,%ebp
93: 83 ec 08 sub $0x8,%esp
96: c7 04 24 00 00 00 00 movl $0x0,(%esp)
9d: e8 fc ff ff ff call 9e <_GLOBAL__I_0_test_unlikely+0xe>
a2: c9 leave
a3: c3 ret
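Note: likely and unlikely produce different jump instructions, jne and je respectively!
As the analysis of the example shows, the only job of likely and unlikely is to choose whether the if branch or the else branch is placed right after the jump instruction, which improves execution efficiency. likely(EXP) says that EXP will probably be true, and unlikely(EXP) says that EXP will probably be false. When the programmer knows that EXP is true (false) most of the time, likely (unlikely) can be used so that the if branch (else branch) immediately follows the conditional jump. In the common case the jump is then not taken and execution simply falls through, which avoids the cost of a taken jump.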
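As an illustration of where such a hint is usually applied, here is a minimal sketch that marks an argument check as the rare path, so the normal work can stay on the fall-through path. The function sum_ints and its error convention are hypothetical and only serve this example.

```c
#include <stddef.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: stores the sum of buf[0..n-1] in *out.
 * Returns 0 on success and -1 if either pointer is NULL. */
int sum_ints(const int *buf, size_t n, long *out)
{
    if (unlikely(buf == NULL || out == NULL)) {
        return -1;              /* rare error path */
    }

    long total = 0;             /* common path */
    for (size_t i = 0; i < n; i++) {
        total += buf[i];
    }
    *out = total;
    return 0;
}
```

The result is the same with or without the macro; whether the hint pays off depends on the compiler, the target and how often the check actually fails.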
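PS: There is one thing I cannot figure out, and I hope someone who knows the answer can help (details below).
Suppose one more function, test_normal, is added to the example above.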
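The source of test_normal is not shown. Judging by its name and by the comparison that follows, it is presumably the same test written with a plain if and no macro. The body below is a guess reconstructed from the disassembly, not the original code.

```c
/* Same logic as test_likely and test_unlikely, with no hint for the compiler. */
int test_normal(int x)
{
    if (x == 2) {
        x++;
    } else {
        x--;
    }
    return x;
}
```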
The resulting disassembly is as follows:
test.o: file format elf32-i386
Disassembly of section .text:
00000000 <test_normal>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 83 7d 08 02 cmpl $0x2,0x8(%ebp)
7: 75 15 jne 1e <test_normal+0x1e>
9: 83 05 30 00 00 00 01 addl $0x1,0x30
10: b8 03 00 00 00 mov $0x3,%eax
15: 83 15 34 00 00 00 00 adcl $0x0,0x34
1c: 5d pop %ebp
1d: c3 ret
1e: 8b 45 08 mov 0x8(%ebp),%eax
21: 5d pop %ebp
22: 83 e8 01 sub $0x1,%eax
25: 83 05 38 00 00 00 01 addl $0x1,0x38
2c: 83 15 3c 00 00 00 00 adcl $0x0,0x3c
33: c3 ret
34: 8d b6 00 00 00 00 lea 0x0(%esi),%esi
3a: 8d bf 00 00 00 00 lea 0x0(%edi),%edi
00000040 <test_likely>:
40: 55 push %ebp
41: 83 05 00 00 00 00 01 addl $0x1,0x0
48: 89 e5 mov %esp,%ebp
4a: 83 15 04 00 00 00 00 adcl $0x0,0x4
51: 83 7d 08 02 cmpl $0x2,0x8(%ebp)
55: 75 15 jne 6c <test_likely+0x2c>
57: 83 05 08 00 00 00 01 addl $0x1,0x8
5e: b8 03 00 00 00 mov $0x3,%eax
63: 83 15 0c 00 00 00 00 adcl $0x0,0xc
6a: 5d pop %ebp
6b: c3 ret
6c: 8b 45 08 mov 0x8(%ebp),%eax
6f: 5d pop %ebp
70: 83 e8 01 sub $0x1,%eax
73: 83 05 10 00 00 00 01 addl $0x1,0x10
7a: 83 15 14 00 00 00 00 adcl $0x0,0x14
81: c3 ret
82: 8d b4 26 00 00 00 00 lea 0x0(%esi),%esi
89: 8d bc 27 00 00 00 00 lea 0x0(%edi),%edi
00000090 <test_unlikely>:
90: 55 push %ebp
91: 89 e5 mov %esp,%ebp
93: 8b 45 08 mov 0x8(%ebp),%eax
96: 83 05 18 00 00 00 01 addl $0x1,0x18
9d: 83 15 1c 00 00 00 00 adcl $0x0,0x1c
a4: 83 f8 02 cmp $0x2,%eax
a7: 74 13 je bc <test_unlikely+0x2c>
a9: 83 e8 01 sub $0x1,%eax
ac: 83 05 28 00 00 00 01 addl $0x1,0x28
b3: 83 15 2c 00 00 00 00 adcl $0x0,0x2c
ba: 5d pop %ebp
bb: c3 ret
bc: 83 05 20 00 00 00 01 addl $0x1,0x20
c3: b0 03 mov $0x3,%al
c5: 83 15 24 00 00 00 00 adcl $0x0,0x24
cc: 5d pop %ebp
cd: c3 ret
ce: 66 90 xchg %ax,%ax
000000d0 <_GLOBAL__I_0_test_normal>:
d0: 55 push %ebp
d1: 89 e5 mov %esp,%ebp
d3: 83 ec 08 sub $0x8,%esp
d6: c7 04 24 00 00 00 00 movl $0x0,(%esp)
dd: e8 fc ff ff ff call de <_GLOBAL__I_0_test_normal+0xe>
e2: c9 leave
e3: c3 ret
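As shown above, comparing the disassembly of test_normal and test_likely, test_likely has two extra instructions, addl and adcl, before the jump, while the rest of the instruction flow is the same. What are these two instructions for? Likewise, test_unlikely has the same two instructions before its jump.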
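One observation that may help narrow this down, offered as a lead rather than a confirmed answer: on i386, an addl $0x1 on one 32-bit word followed by an adcl $0x0 on the word 4 bytes higher is the usual way to increment a 64-bit integer in memory. The build command above passes -fprofile-arcs, which asks GCC to instrument the code with execution counters, and test_normal contains the same kind of pair inside both of its branches. The sketch below imitates that increment in plain C; the names arc_counter, bump and counter_value are made up for illustration and are not GCC's.

```c
#include <stdint.h>

/* Hypothetical stand-in for one 64-bit execution counter, split into
 * the two 32-bit words that a 32-bit x86 build updates separately. */
struct arc_counter {
    uint32_t lo;
    uint32_t hi;
};

/* Increment the counter by one, propagating the carry by hand. */
void bump(struct arc_counter *c)
{
    uint32_t old = c->lo;
    c->lo = old + 1u;           /* corresponds to addl $0x1, lo */
    c->hi += (c->lo < old);     /* corresponds to adcl $0x0, hi */
}

/* Reassemble the two halves into a single 64-bit value. */
uint64_t counter_value(const struct arc_counter *c)
{
    return ((uint64_t)c->hi << 32) | c->lo;
}
```

If this reading is right, rebuilding the file without -fprofile-arcs and diffing the disassembly would be a direct way to check it.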
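Moreover, two extra instructions mean the CPU spends cycles executing them. That raises a question: if the programmer knows that the condition EXP (here x == 2) is very likely to hold, a plain if(EXP) would do, so is there any need for if(likely(EXP))? Similarly, if EXP is very unlikely to hold, the programmer could write if(!EXP){... //original false branch} else{... //original true branch}, so is if(unlikely(EXP)) really necessary? On top of that, likely and unlikely both appear to execute the extra addl and adcl instructions for no obvious reason.
I hope an expert can shed some light on this.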