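Most of the books we read describe inline functions roughly as follows (the wording here is my own):
What is an inline function? Put bluntly, it is a function declared with the inline keyword. Whether it is a global function or a member function, once it carries the inline keyword it is an inline function. (A member function defined inside its class body is implicitly inline as well.)
How does an inline function differ from an ordinary one? When the program is compiled, the body of an inline function is embedded directly at each call site, much like a macro. Whether it is actually embedded, though, is up to the compiler.
The main purpose of an inline function is to improve performance by reducing function-call overhead. In short: because the code is embedded at the call site, the call itself disappears, along with setting up the stack frame, saving registers and so on, which saves run-time cost.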
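To make "embedded at the call site" concrete, here is a minimal sketch. The names `square`, `sum_of_squares_call`, `sum_of_squares_expanded` and `check_equivalence` are hypothetical, made up for illustration. The second function is written by hand to show roughly what a compiler may produce if it decides to expand `square`; it is not actual compiler output.

```cpp
#include <cassert>

// A small function marked inline: a typical candidate for expansion.
inline int square(int x) {
    return x * x;
}

// Written with ordinary calls; the compiler may or may not expand them.
int sum_of_squares_call(int a, int b) {
    return square(a) + square(b);
}

// Hand-expanded equivalent: the body of square() pasted at each call site,
// so no call instruction, argument passing or return is needed.
int sum_of_squares_expanded(int a, int b) {
    return a * a + b * b;
}

// Both versions must compute the same value; only the generated code differs.
void check_equivalence() {
    assert(sum_of_squares_call(3, 4) == sum_of_squares_expanded(3, 4));
}
```

Either way the program's behavior is the same; expansion only changes the generated machine code.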
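All of the above is correct, but most people stop right there and never dig any deeper. This article sets out, with a basketful of curiosity, to find out what really happens.
Is an inline function really embedded at the call site?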
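In VS2010, we wrote the following small program:

```cpp
#include <iostream>
using namespace std;

// Declared inline; whether calls are actually expanded is up to the compiler.
inline void test() {
    cout << "This is a inline function." << endl;
}

int main() {
    test();
    test();

    return 0;
}
```

test() is an inline function, but is it actually embedded? Set a breakpoint on the first test() call in main, press F5 to start debugging, then press Ctrl+Alt+D to bring up the disassembly (Alt+8 in VS2008).
In the disassembly we see that test() is invoked with a call instruction, just like any other function. In other words, the inline function test was not embedded at the call site here; it behaves exactly like an ordinary function. (Debugging this way presumably means a Debug build, where optimization is normally turned off.)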
Now let's try it on Linux. The environment is CentOS with g++ 4.4.5.
We generate the assembly with the following command:

```
g++ -S source.cpp
```
The assembly is as follows:

```
    .file "cpp.cpp"
    .local _ZStL8__ioinit
    .comm _ZStL8__ioinit,1,1
    .section .rodata
.LC0:
    .string "This is a inline function."
    .section .text._Z4testv,"axG",@progbits,_Z4testv,comdat
    .weak _Z4testv
    .type _Z4testv, @function
_Z4testv:
.LFB957:
    .cfi_startproc
    .cfi_personality 0x0,__gxx_personality_v0
    pushl %ebp
    .cfi_def_cfa_offset 8
    movl %esp, %ebp
    .cfi_offset 5, -8
    .cfi_def_cfa_register 5
    subl $24, %esp
    movl $.LC0, 4(%esp)
    movl $_ZSt4cout, (%esp)
    call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    movl $_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_, 4(%esp)
    movl %eax, (%esp)
    call _ZNSolsEPFRSoS_E
    leave
    ret
    .cfi_endproc
.LFE957:
    .size _Z4testv, .-_Z4testv
    .text
.globl main
    .type main, @function
main:
.LFB958:
    .cfi_startproc
    .cfi_personality 0x0,__gxx_personality_v0
    pushl %ebp
    .cfi_def_cfa_offset 8
    movl %esp, %ebp
    .cfi_offset 5, -8
    .cfi_def_cfa_register 5
    andl $-16, %esp
    call _Z4testv
    call _Z4testv
    movl $0, %eax
    movl %ebp, %esp
    popl %ebp
    ret
    .cfi_endproc
.LFE958:
    .size main, .-main
    .type _Z41__static_initialization_and_destruction_0ii, @function
_Z41__static_initialization_and_destruction_0ii:
.LFB967:
    .cfi_startproc
    .cfi_personality 0x0,__gxx_personality_v0
    pushl %ebp
    .cfi_def_cfa_offset 8
    movl %esp, %ebp
    .cfi_offset 5, -8
    .cfi_def_cfa_register 5
    subl $24, %esp
    cmpl $1, 8(%ebp)
    jne .L7
    cmpl $65535, 12(%ebp)
    jne .L7
    movl $_ZStL8__ioinit, (%esp)
    call _ZNSt8ios_base4InitC1Ev
    movl $_ZNSt8ios_base4InitD1Ev, %eax
    movl $__dso_handle, 8(%esp)
    movl $_ZStL8__ioinit, 4(%esp)
    movl %eax, (%esp)
    call __cxa_atexit
.L7:
    leave
    ret
    .cfi_endproc
.LFE967:
    .size _Z41__static_initialization_and_destruction_0ii, .-_Z41__static_initialization_and_destruction_0ii
    .type _GLOBAL__I_main, @function
_GLOBAL__I_main:
.LFB968:
    .cfi_startproc
    .cfi_personality 0x0,__gxx_personality_v0
    pushl %ebp
    .cfi_def_cfa_offset 8
    movl %esp, %ebp
    .cfi_offset 5, -8
    .cfi_def_cfa_register 5
    subl $24, %esp
    movl $65535, 4(%esp)
    movl $1, (%esp)
    call _Z41__static_initialization_and_destruction_0ii
    leave
    ret
    .cfi_endproc
.LFE968:
    .size _GLOBAL__I_main, .-_GLOBAL__I_main
    .section .ctors,"aw",@progbits
    .align 4
    .long _GLOBAL__I_main
    .weakref _ZL20__gthrw_pthread_oncePiPFvvE,pthread_once
    .weakref _ZL27__gthrw_pthread_getspecificj,pthread_getspecific
    .weakref _ZL27__gthrw_pthread_setspecificjPKv,pthread_setspecific
    .weakref _ZL22__gthrw_pthread_createPmPK14pthread_attr_tPFPvS3_ES3_,pthread_create
    .weakref _ZL20__gthrw_pthread_joinmPPv,pthread_join
    .weakref _ZL21__gthrw_pthread_equalmm,pthread_equal
    .weakref _ZL20__gthrw_pthread_selfv,pthread_self
    .weakref _ZL22__gthrw_pthread_detachm,pthread_detach
    .weakref _ZL22__gthrw_pthread_cancelm,pthread_cancel
    .weakref _ZL19__gthrw_sched_yieldv,sched_yield
    .weakref _ZL26__gthrw_pthread_mutex_lockP15pthread_mutex_t,pthread_mutex_lock
    .weakref _ZL29__gthrw_pthread_mutex_trylockP15pthread_mutex_t,pthread_mutex_trylock
    .weakref _ZL31__gthrw_pthread_mutex_timedlockP15pthread_mutex_tPK8timespec,pthread_mutex_timedlock
    .weakref _ZL28__gthrw_pthread_mutex_unlockP15pthread_mutex_t,pthread_mutex_unlock
    .weakref _ZL26__gthrw_pthread_mutex_initP15pthread_mutex_tPK19pthread_mutexattr_t,pthread_mutex_init
    .weakref _ZL29__gthrw_pthread_mutex_destroyP15pthread_mutex_t,pthread_mutex_destroy
    .weakref _ZL30__gthrw_pthread_cond_broadcastP14pthread_cond_t,pthread_cond_broadcast
    .weakref _ZL27__gthrw_pthread_cond_signalP14pthread_cond_t,pthread_cond_signal
    .weakref _ZL25__gthrw_pthread_cond_waitP14pthread_cond_tP15pthread_mutex_t,pthread_cond_wait
    .weakref _ZL30__gthrw_pthread_cond_timedwaitP14pthread_cond_tP15pthread_mutex_tPK8timespec,pthread_cond_timedwait
    .weakref _ZL28__gthrw_pthread_cond_destroyP14pthread_cond_t,pthread_cond_destroy
    .weakref _ZL26__gthrw_pthread_key_createPjPFvPvE,pthread_key_create
    .weakref _ZL26__gthrw_pthread_key_deletej,pthread_key_delete
    .weakref _ZL30__gthrw_pthread_mutexattr_initP19pthread_mutexattr_t,pthread_mutexattr_init
    .weakref _ZL33__gthrw_pthread_mutexattr_settypeP19pthread_mutexattr_ti,pthread_mutexattr_settype
    .weakref _ZL33__gthrw_pthread_mutexattr_destroyP19pthread_mutexattr_t,pthread_mutexattr_destroy
    .ident "GCC: (GNU) 4.4.5 20101001 (Vine Linux 4.4.5-6vl6)"
    .section .note.GNU-stack,"",@progbits
```

In main you can see two consecutive call instructions: the inline function is still invoked through an ordinary call.

```
    call _Z4testv
    call _Z4testv
```
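You may be surprised: what on earth is the point of an inline function if it doesn't even get embedded?
That's right, it is not embedded automatically. This depends heavily on the compiler implementation; in other words, the compiler's authors decide when an inline function gets embedded.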
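g++ does, however, give us an option that produces assembly with the inline function embedded. The command is:

```
g++ -S -O source.cpp
```

Note that both letters are uppercase. -O turns on optimization. Compiled this way, the assembly now looks like this: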
```
    .file "cpp.cpp"
    .text
    .type _GLOBAL__I_main, @function
_GLOBAL__I_main:
.LFB972:
    .cfi_startproc
    .cfi_personality 0x0,__gxx_personality_v0
    pushl %ebp
    .cfi_def_cfa_offset 8
    movl %esp, %ebp
    .cfi_offset 5, -8
    .cfi_def_cfa_register 5
    subl $24, %esp
    movl $_ZStL8__ioinit, (%esp)
    call _ZNSt8ios_base4InitC1Ev
    movl $__dso_handle, 8(%esp)
    movl $_ZStL8__ioinit, 4(%esp)
    movl $_ZNSt8ios_base4InitD1Ev, (%esp)
    call __cxa_atexit
    leave
    ret
    .cfi_endproc
.LFE972:
    .size _GLOBAL__I_main, .-_GLOBAL__I_main
    .section .ctors,"aw",@progbits
    .align 4
    .long _GLOBAL__I_main
    .section .rodata.str1.1,"aMS",@progbits,1
.LC0:
    .string "This is a inline function."
    .text
.globl main
    .type main, @function
main:
.LFB962:
    .cfi_startproc
    .cfi_personality 0x0,__gxx_personality_v0
    pushl %ebp
    .cfi_def_cfa_offset 8
    movl %esp, %ebp
    .cfi_offset 5, -8
    .cfi_def_cfa_register 5
    andl $-16, %esp
    pushl %esi
    pushl %ebx
    subl $24, %esp
    movl $26, 8(%esp)
    movl $.LC0, 4(%esp)
    movl $_ZSt4cout, (%esp)
    .cfi_escape 0x10,0x3,0x7,0x55,0x9,0xf0,0x1a,0x9,0xf8,0x22
    .cfi_escape 0x10,0x6,0x7,0x55,0x9,0xf0,0x1a,0x9,0xfc,0x22
    call _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_i
    movl $_ZSt4cout, %esi
    movl _ZSt4cout, %eax
    movl -12(%eax), %eax
    movl 124(%esi,%eax), %ebx
    testl %ebx, %ebx
    jne .L4
    call _ZSt16__throw_bad_castv
.L4:
    cmpb $0, 28(%ebx)
    je .L5
    movzbl 39(%ebx), %eax
    .p2align 4,,4
    jmp .L6
.L5:
    movl %ebx, (%esp)
    .p2align 4,,5
    call _ZNKSt5ctypeIcE13_M_widen_initEv
    movl (%ebx), %eax
    movl $10, 4(%esp)
    movl %ebx, (%esp)
    call *24(%eax)
.L6:
    movsbl %al,%eax
    movl %eax, 4(%esp)
    movl $_ZSt4cout, (%esp)
    call _ZNSo3putEc
    movl %eax, (%esp)
    call _ZNSo5flushEv
    movl $26, 8(%esp)
    movl $.LC0, 4(%esp)
    movl $_ZSt4cout, (%esp)
    call _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_i
    movl _ZSt4cout, %eax
    movl -12(%eax), %eax
    movl 124(%esi,%eax), %ebx
    testl %ebx, %ebx
    jne .L7
    call _ZSt16__throw_bad_castv
.L7:
    cmpb $0, 28(%ebx)
    je .L8
    movzbl 39(%ebx), %eax
    .p2align 4,,4
    jmp .L9
.L8:
    movl %ebx, (%esp)
    .p2align 4,,5
    call _ZNKSt5ctypeIcE13_M_widen_initEv
    movl (%ebx), %eax
    movl $10, 4(%esp)
    movl %ebx, (%esp)
    call *24(%eax)
.L9:
    movsbl %al,%eax
    movl %eax, 4(%esp)
    movl $_ZSt4cout, (%esp)
    call _ZNSo3putEc
    movl %eax, (%esp)
    call _ZNSo5flushEv
    movl $0, %eax
    addl $24, %esp
    popl %ebx
    popl %esi
    movl %ebp, %esp
    popl %ebp
    ret
    .cfi_endproc
.LFE962:
    .size main, .-main
    .local _ZStL8__ioinit
    .comm _ZStL8__ioinit,1,1
    .weakref _ZL20__gthrw_pthread_oncePiPFvvE,pthread_once
    .weakref _ZL27__gthrw_pthread_getspecificj,pthread_getspecific
    .weakref _ZL27__gthrw_pthread_setspecificjPKv,pthread_setspecific
    .weakref _ZL22__gthrw_pthread_createPmPK14pthread_attr_tPFPvS3_ES3_,pthread_create
    .weakref _ZL20__gthrw_pthread_joinmPPv,pthread_join
    .weakref _ZL21__gthrw_pthread_equalmm,pthread_equal
    .weakref _ZL20__gthrw_pthread_selfv,pthread_self
    .weakref _ZL22__gthrw_pthread_detachm,pthread_detach
    .weakref _ZL22__gthrw_pthread_cancelm,pthread_cancel
    .weakref _ZL19__gthrw_sched_yieldv,sched_yield
    .weakref _ZL26__gthrw_pthread_mutex_lockP15pthread_mutex_t,pthread_mutex_lock
    .weakref _ZL29__gthrw_pthread_mutex_trylockP15pthread_mutex_t,pthread_mutex_trylock
    .weakref _ZL31__gthrw_pthread_mutex_timedlockP15pthread_mutex_tPK8timespec,pthread_mutex_timedlock
    .weakref _ZL28__gthrw_pthread_mutex_unlockP15pthread_mutex_t,pthread_mutex_unlock
    .weakref _ZL26__gthrw_pthread_mutex_initP15pthread_mutex_tPK19pthread_mutexattr_t,pthread_mutex_init
    .weakref _ZL29__gthrw_pthread_mutex_destroyP15pthread_mutex_t,pthread_mutex_destroy
    .weakref _ZL30__gthrw_pthread_cond_broadcastP14pthread_cond_t,pthread_cond_broadcast
    .weakref _ZL27__gthrw_pthread_cond_signalP14pthread_cond_t,pthread_cond_signal
    .weakref _ZL25__gthrw_pthread_cond_waitP14pthread_cond_tP15pthread_mutex_t,pthread_cond_wait
    .weakref _ZL30__gthrw_pthread_cond_timedwaitP14pthread_cond_tP15pthread_mutex_tPK8timespec,pthread_cond_timedwait
    .weakref _ZL28__gthrw_pthread_cond_destroyP14pthread_cond_t,pthread_cond_destroy
    .weakref _ZL26__gthrw_pthread_key_createPjPFvPvE,pthread_key_create
    .weakref _ZL26__gthrw_pthread_key_deletej,pthread_key_delete
    .weakref _ZL30__gthrw_pthread_mutexattr_initP19pthread_mutexattr_t,pthread_mutexattr_init
    .weakref _ZL33__gthrw_pthread_mutexattr_settypeP19pthread_mutexattr_ti,pthread_mutexattr_settype
    .weakref _ZL33__gthrw_pthread_mutexattr_destroyP19pthread_mutexattr_t,pthread_mutexattr_destroy
    .ident "GCC: (GNU) 4.4.5 20101001 (Vine Linux 4.4.5-6vl6)"
    .section .note.GNU-stack,"",@progbits
```
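As you can see, main (which starts at .LFB962) no longer contains any call to _Z4testv, and no separate _Z4testv function is emitted at all. Instead, the body of our inline function test appears directly inside main, once per call: each copy loads the string .LC0 and passes it to the stream insertion routine, then writes the newline and flushes (the code running up to and just past .L6 for the first call, and up to and just past .L9 for the second). We can tell it is test because it uses the .LC0 string. So with this compiler, the inline function is embedded into the calling code only when optimization is enabled.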
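A macro is different: whether or not optimization is enabled, the body of a macro is always expanded at the place where the macro is used, because the preprocessor substitutes it before the code is compiled. This is the biggest difference between macros and inline functions.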
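A related difference is easy to observe from the program itself. The following is a minimal sketch with hypothetical names (`SQUARE_MACRO`, `square_inline`, `next_value`, and the two counting helpers are all made up for illustration). Because a macro is pure text substitution, its argument expression is pasted in wherever the parameter appears, while an inline function still evaluates its argument exactly once, whether or not the compiler expands it.

```cpp
#include <cassert>

// Hypothetical macro and function, for illustration only.
#define SQUARE_MACRO(x) ((x) * (x))

inline int square_inline(int x) {
    return x * x;
}

static int g_calls = 0;  // counts how many times next_value() runs

int next_value() {
    return ++g_calls;
}

// The macro expands to ((next_value()) * (next_value())),
// so next_value() is called twice.
int calls_made_by_macro() {
    g_calls = 0;
    int r = SQUARE_MACRO(next_value());
    (void)r;  // the product itself is not needed here
    return g_calls;
}

// A function argument is evaluated once, inlined or not.
int calls_made_by_inline() {
    g_calls = 0;
    int r = square_inline(next_value());
    (void)r;
    return g_calls;
}
```

Here `calls_made_by_macro()` returns 2 and `calls_made_by_inline()` returns 1, regardless of the optimization level.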
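All of this matches a sentence that C++ Primer states in black and white:
The inline specification is only a request to the compiler. The compiler may choose to ignore this request.
Pretty ambiguous, isn't it? Haha!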
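If that ambiguity matters in practice, some compilers offer non-standard ways to be more insistent. The sketch below assumes GCC or Clang. The `always_inline` and `noinline` attributes are compiler extensions rather than standard C++, and the function names are hypothetical. It was not part of the experiment above.

```cpp
#include <cassert>

// GCC/Clang extension: asks the compiler to expand this function at its
// call sites even when optimization is turned off.
inline __attribute__((always_inline)) int add_forced(int a, int b) {
    return a + b;
}

// GCC/Clang extension: asks the compiler never to expand this function,
// so each use stays a real call.
__attribute__((noinline)) int add_never(int a, int b) {
    return a + b;
}

// The result is the same either way; only the generated code differs.
int use_both(int a, int b) {
    return add_forced(a, b) + add_never(a, b);
}
```

Generating assembly with `g++ -S` for a file like this is one way to check whether a `call` to each function remains.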
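As the old poem goes: what you learn on paper always feels shallow; to truly understand something you must do it yourself. Keep at it.
- End -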