Introduction
Let's start with a short piece of code:
#include <stdio.h>

int main()
{
    int sum = 0;
    for (int i = 0; i < 100; i++) {
        sum += i;
    }
    printf("sum = %d\n", sum);
    return 0;
}
After compiling the code with the -O2 option, inspect the assembly instructions in the resulting binary:
gcc test.c -O2
objdump -d a.out
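In the disassembly of main, the second instruction moves the immediate value 0x1356 (4950 in decimal) into the esi register. In other words, the program no longer runs the accumulation loop as written: the compiler worked out the result at compile time, and the binary simply hands it to printf. Compared with the assembly generated without -O2, many operations have been trimmed away. If overlooked, subtle differences like this can go against the developer's intent and may even change the program's expected results.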
So which compiler options does -O2 actually enable?
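# Explanation of the gcc options
# -Q: makes the compiler print each function name as it is compiled, and print some statistics when each pass finishes.
# When it appears before the --help option, the output of --help changes:
# instead of the general description of each option, it shows whether the option is enabled in the current compile command.
# For options that take a specific value, the value that has been set is shown.
# --help=optimizers: lists all optimization options
# Note: gcc's output on this machine is localized to Chinese; "启用" means "enabled", so grep for "enabled" in an English locale.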
[root@localhost ~]# gcc -Q -O2 --help=optimizers | grep "启用" | wc -l
129
[root@localhost ~]# gcc -Q -O2 --help=optimizers | grep "启用"
-faggressive-loop-optimizations [启用]
-falign-labels [启用]
-fasynchronous-unwind-tables [启用]
-fauto-inc-dec [启用]
-fbranch-count-reg [启用]
-fcaller-saves [启用]
-fcode-hoisting [启用]
-fcombine-stack-adjustments [启用]
-fcompare-elim [启用]
-fcprop-registers [启用]
-fcrossjumping [启用]
-fcse-follow-jumps [启用]
-fdce [启用]
-fdefer-pop [启用]
-fdelete-null-pointer-checks [启用]
-fdevirtualize [启用]
-fdevirtualize-speculatively [启用]
-fdse [启用]
-fearly-inlining [启用]
-fexpensive-optimizations [启用]
-fforward-propagate [启用]
-ffp-int-builtin-inexact [启用]
-ffunction-cse [启用]
-fgcse [启用]
-fgcse-lm [启用]
-fguess-branch-probability [启用]
-fhoist-adjacent-loads [启用]
-fif-conversion [启用]
-fif-conversion2 [启用]
-findirect-inlining [启用]
-finline [启用]
-finline-atomics [启用]
-finline-functions-called-once [启用]
-finline-small-functions [启用]
-fipa-bit-cp [启用]
-fipa-cp [启用]
-fipa-icf [启用]
-fipa-icf-functions [启用]
-fipa-icf-variables [启用]
-fipa-profile [启用]
-fipa-pure-const [启用]
-fipa-ra [启用]
-fipa-reference [启用]
-fipa-sra [启用]
-fipa-vrp [启用]
-fira-hoist-pressure [启用]
-fira-share-save-slots [启用]
-fira-share-spill-slots [启用]
-fisolate-erroneous-paths-dereference [启用]
-fivopts [启用]
-fjump-tables [启用]
-flifetime-dse [启用]
-flra-remat [启用]
-fmath-errno [启用]
-fmove-loop-invariants [启用]
-fomit-frame-pointer [启用]
-foptimize-sibling-calls [启用]
-foptimize-strlen [启用]
-fpartial-inlining [启用]
-fpeephole [启用]
-fpeephole2 [启用]
-fplt [启用]
-fprefetch-loop-arrays [启用]
-fprintf-return-value [启用]
-freg-struct-return [启用]
-frename-registers [启用]
-freorder-blocks [启用]
-freorder-blocks-and-partition [启用]
-freorder-functions [启用]
-frerun-cse-after-loop [启用]
-frtti [启用]
-fsched-critical-path-heuristic [启用]
-fsched-dep-count-heuristic [启用]
-fsched-group-heuristic [启用]
-fsched-interblock [启用]
-fsched-last-insn-heuristic [启用]
-fsched-rank-heuristic [启用]
-fsched-spec [启用]
-fsched-spec-insn-heuristic [启用]
-fsched-stalled-insns-dep [启用]
-fschedule-fusion [启用]
-fschedule-insns2 [启用]
-fshort-enums [启用]
-fshrink-wrap [启用]
-fshrink-wrap-separate [启用]
-fsigned-zeros [启用]
-fsplit-ivs-in-unroller [启用]
-fsplit-wide-types [启用]
-fssa-backprop [启用]
-fssa-phiopt [启用]
-fstdarg-opt [启用]
-fstore-merging [启用]
-fstrict-aliasing [启用]
-fstrict-volatile-bitfields [启用]
-fthread-jumps [启用]
-fno-threadsafe-statics [启用]
-ftrapping-math [启用]
-ftree-bit-ccp [启用]
-ftree-builtin-call-dce [启用]
-ftree-ccp [启用]
-ftree-ch [启用]
-ftree-coalesce-vars [启用]
-ftree-copy-prop [启用]
-ftree-cselim [启用]
-ftree-dce [启用]
-ftree-dominator-opts [启用]
-ftree-dse [启用]
-ftree-forwprop [启用]
-ftree-fre [启用]
-ftree-loop-if-convert [启用]
-ftree-loop-im [启用]
-ftree-loop-ivcanon [启用]
-ftree-loop-optimize [启用]
-ftree-phiprop [启用]
-ftree-pre [启用]
-ftree-pta [启用]
-ftree-reassoc [启用]
-ftree-scev-cprop [启用]
-ftree-sink [启用]
-ftree-slsr [启用]
-ftree-sra [启用]
-ftree-switch-conversion [启用]
-ftree-tail-merge [启用]
-ftree-ter [启用]
-ftree-vrp [启用]
-funwind-tables [启用]
-fvar-tracking [启用]
-fvar-tracking-assignments [启用]
-fweb [启用]
The same command can also list the options enabled by -O1 and -O3 (plain -O is equivalent to -O1). For example:
gcc -Q -O1 --help=optimizers | grep "启用"
gcc -Q -O3 --help=optimizers | grep "启用"
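Explaining each individual option would require too much background to cover here. You can look the options up in the man page, or consult the GNU online documentation: Option Summary (Using the GNU Compiler Collection (GCC))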
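How do you disable optimization for a single function?
One approach is to add an empty inline assembly statement inside the function body (not recommended, since it only inserts a side effect that the compiler must preserve rather than actually lowering the optimization level):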
void func()
{
    // ...
    asm volatile("");
    // ...
}
Another is to use a #pragma directive, which applies to every function defined after it in the file:
#pragma GCC optimize ("O0")
void func()
{
    // ...
}
That said, I prefer to mark the function as unoptimized right at its declaration:
void func() __attribute__((optimize("O0")));

void func()
{
    // ...
}