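Clang 9.0 merged a very useful feature: -ftime-trace. It writes time-trace profiling data in a friendly format, which helps developers see where the compiler spends most of its time and which other areas need improvement. It has already been used in compile-time optimization work at Unity and WeChat.
Before that, gcc and clang both supported -ftime-report, which prints a summary of the time spent in each compilation phase, but its output is not user friendly.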
1 Setting up the clang/llvm compiler
Select clang as the compiler:
export CC=/usr/bin/clang
export CXX=/usr/bin/clang++
Enable the clang -ftime-trace flag (CMake):
set(CMAKE_C_COMPILER "/usr/bin/clang")
set(CMAKE_CXX_COMPILER "/usr/bin/clang++")
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -ftime-trace")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -ftime-trace")
2 Example: analyzing compile time
Example project layout:
├── build
├── CMakeLists.txt
└── demo.cc
// demo.cc
#include <vector>
#include <string>
#include <unordered_map>
#include <regex>

int main()
{
    std::vector<int> v(10);
    v.push_back(7);
    std::unordered_map<std::string, double> m;
    m.insert(std::make_pair("foo", 1.0));
    std::regex re("^asd.*$");
    return 0;
}
# CMakeLists.txt
cmake_minimum_required(VERSION 3.5)
# Select the compiler before project(), which is where CMake detects it
set(CMAKE_C_COMPILER "/usr/bin/clang")
set(CMAKE_CXX_COMPILER "/usr/bin/clang++")
# Set the project name
project(demo)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -ftime-trace")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -ftime-trace")
# Add an executable
add_executable(demo demo.cc)
Build:
export CC=/usr/bin/clang
export CXX=/usr/bin/clang++
cd build
cmake ..
make
~/project/compile-time-analysis/time-trace/build$ find ./ -name '*.json'
./CMakeFiles/demo.dir/demo.cc.json
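3 Viewing the compile-time statistics
Visualizing the trace with chrome://tracing or speedscope shows clearly which parts of compiling a file take the most time, so the optimization effort can be targeted. The JSON file uses the Trace Event Format.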
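Before opening a viewer, a quick text summary of a single trace can already point at the hot spots. The following is a minimal sketch, not part of clang: the function name slowest_events and the sample data are made up for illustration, and it assumes the trace has the shape the tools in this article rely on (a top-level "traceEvents" list whose complete events have "ph" equal to "X", a "dur" in microseconds, a "name", and an optional "args"/"detail").

```python
import json


def slowest_events(trace, top=10):
    """Return (duration_us, name, detail) for the longest complete events."""
    rows = []
    for event in trace.get("traceEvents", []):
        # "X" marks a complete event that carries its own duration.
        if event.get("ph") != "X" or "dur" not in event:
            continue
        name = event.get("name", "")
        # "Total ..." entries are per-category sums, not individual slices.
        if name.startswith("Total"):
            continue
        detail = event.get("args", {}).get("detail", "")
        rows.append((event["dur"], name, detail))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:top]


# Hypothetical in-memory sample; a real trace would be read with
# json.load(open("CMakeFiles/demo.dir/demo.cc.json")).
sample = json.loads("""
{"traceEvents": [
  {"ph": "X", "ts": 0, "dur": 900, "name": "Frontend", "args": {"detail": ""}},
  {"ph": "X", "ts": 10, "dur": 100, "name": "Source", "args": {"detail": "a.h"}},
  {"ph": "X", "ts": 0, "dur": 5000, "name": "Total Frontend"},
  {"ph": "M", "name": "process_name"}
]}
""")

for dur, name, detail in slowest_events(sample):
    print(f"{dur / 1000:8.3f} ms  {name}  {detail}")
```

Events are nested, so an outer event such as the front end includes the time of everything inside it; the list is a starting point for the flame chart rather than a replacement for it.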
3.1 chrome://tracing/
3.2 https://www.speedscope.app/
- Time Order
- Left Heavy
- Sandwich
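4 Compile-time analysis across multiple source files
With Clang 9+, -ftime-trace makes it easy to see where the time goes when compiling a single file. When a project has many files to build, finding the most expensive operations across the whole build matters even more: which file takes longest to compile, which header is most expensive to include, which C++ template instantiation is most expensive, and which file and which function take longest in code generation.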
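The questions above can be approximated by summing event durations over all trace files. The sketch below is only an illustration: aggregate and top are hypothetical helper names, and the event names used in the example ("ExecuteCompiler", "Source", "InstantiateClass") are an assumption based on what Clang 9/10 traces typically contain, so check them against your own JSON.

```python
from collections import defaultdict


def aggregate(traces):
    """Sum event durations (microseconds) across several parsed traces.

    `traces` maps a label (for example the trace file path) to parsed JSON.
    Returns (per_file, per_kind): the ExecuteCompiler time per label, and
    summed durations keyed by (event name, detail).
    """
    per_file = {}
    per_kind = defaultdict(int)
    for label, trace in traces.items():
        for event in trace.get("traceEvents", []):
            if event.get("ph") != "X" or "dur" not in event:
                continue  # keep only complete events
            name = event.get("name", "")
            if name.startswith("Total"):
                continue  # skip the per-category summary entries
            if name == "ExecuteCompiler":
                per_file[label] = event["dur"]
                continue
            detail = event.get("args", {}).get("detail", "")
            per_kind[(name, detail)] += event["dur"]
    return per_file, dict(per_kind)


def top(per_kind, name, n=5):
    """Return the n largest (detail, total_us) pairs for one event name."""
    rows = [(detail, us) for (kind, detail), us in per_kind.items() if kind == name]
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Hypothetical data standing in for two loaded trace files.
traces = {
    "a.cc.json": {"traceEvents": [
        {"ph": "X", "dur": 9000, "name": "ExecuteCompiler", "args": {"detail": ""}},
        {"ph": "X", "dur": 3000, "name": "Source", "args": {"detail": "regex"}},
    ]},
    "b.cc.json": {"traceEvents": [
        {"ph": "X", "dur": 4000, "name": "ExecuteCompiler", "args": {"detail": ""}},
        {"ph": "X", "dur": 2500, "name": "Source", "args": {"detail": "regex"}},
    ]},
}
per_file, per_kind = aggregate(traces)
print(sorted(per_file.items(), key=lambda item: item[1], reverse=True))
print(top(per_kind, "Source"))
```

Because a header's event includes the headers it pulls in, the summed numbers are inclusive times and nested includes are counted more than once; treat the ranking as a hint about where to look.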
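4.1 Combining the compile statistics of all source files
chrome://tracing can only import the compile statistics of one source file at a time. The Python script below merges the statistics of all source files into one trace, so the whole project can be inspected in speedscope or chrome://tracing.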
#!/usr/bin/env python3
"""Combine JSON from multiple -ftime-traces into one.

Run with (e.g.): python combine_traces.py foo.json bar.json.
"""
import json
import sys

if __name__ == '__main__':
    start_time = 0
    combined_data = []
    for filename in sys.argv[1:]:
        with open(filename, 'r') as f:
            # Stays 0 if the trace has no ExecuteCompiler event
            file_time = 0
            for event in json.load(f)['traceEvents']:
                # Skip metadata events
                # Skip total events
                if event['ph'] == 'M' or event['name'].startswith('Total'):
                    continue
                if event['name'] == 'ExecuteCompiler':
                    # Find how long this compilation takes
                    file_time = event['dur']
                    # Set the file name in ExecuteCompiler
                    event.setdefault('args', {})['detail'] = filename
                # Filter out shorter events (< 5000 us) to reduce data size,
                # but always keep ExecuteCompiler
                elif event.get('dur', 0) < 5000:
                    continue
                # Offset start time to make compiles sequential
                event['ts'] += start_time
                # Add data to combined
                combined_data.append(event)
            # Increase the start time for the next file
            # Add 1 to avoid issues with simultaneous events
            start_time += file_time + 1
    with open('combined.json', 'w') as f:
        json.dump({'traceEvents': sorted(combined_data, key=lambda k: k['ts'])}, f)
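The combine script expects the trace files on its command line. A build directory can also contain other JSON files (for example compile_commands.json), so it may help to select only files that look like traces. This is a small sketch under that assumption; find_trace_files is a hypothetical helper, not part of any of the tools mentioned here.

```python
import json
from pathlib import Path


def find_trace_files(build_dir):
    """Return sorted paths of JSON files under build_dir that look like traces."""
    found = []
    for path in sorted(Path(build_dir).rglob("*.json")):
        try:
            with open(path, "r", encoding="utf-8") as f:
                data = json.load(f)
        except (OSError, ValueError):
            continue  # unreadable file or not valid JSON
        # A trace is a JSON object with a "traceEvents" list at the top level.
        if isinstance(data, dict) and isinstance(data.get("traceEvents"), list):
            found.append(path)
    return found


if __name__ == "__main__":
    for trace_path in find_trace_files("."):
        print(trace_path)
```

Note that a previously written combined.json also matches this check, so keep it outside the directory being scanned or delete it before rerunning.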
4.2 Clang Build Analyzer
https://github.com/aras-p/ClangBuildAnalyzer
ClangBuildAnalyzer aggregates the -ftime-trace output of an entire build into one report.
5 How clang -ftime-trace is implemented
5.1 clang/llvm compilation phases
The Architecture of Open Source Applications: LLVM
Print the clang/llvm compilation phases:
// min.cc
int min(int lhs, int rhs)
{
    if (lhs > rhs) {
        return rhs;
    }
    return lhs;
}
~/project/compile-time-analysis/time-trace$ clang -ccc-print-phases min.cc
+- 0: input, "min.cc", c++
+- 1: preprocessor, {0}, c++-cpp-output
+- 2: compiler, {1}, ir
+- 3: backend, {2}, assembler
+- 4: assembler, {3}, object
5: linker, {4}, image
- Preprocessing (preprocessor):
This phase mainly handles header inclusion, macro expansion and substitution, preprocessing directives, and comment removal.
clang -E min.cc
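- Compilation (compiler):
This phase does the most work, mainly:
a. Lexical analysis: converts the source into a stream of tokens, such as parentheses, square brackets and braces (l_paren/r_paren, l_square/r_square, l_brace/r_brace), identifiers (identifier), string literals (string_literal), and numeric constants (numeric_constant);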
clang -fsyntax-only -Xclang -dump-tokens min.cc
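b. Syntax analysis and semantic analysis
Two modules cooperate in this step: Parser (the syntax analyzer) and Sema (semantic analysis):
Parser: walks the tokens, checks them against the grammar of the current language, and produces a node for each construct together with its related information.
Sema: once lexing and parsing have established that the words and sentences are well formed, Sema checks things such as return values, size boundaries, and uninitialized variables, and reports any semantic errors. If there are none, it combines the tokens according to the grammar into Clang semantic nodes and arranges those nodes hierarchically into the abstract syntax tree (AST).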
clang -fsyntax-only -Xclang -ast-dump min.cc
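c. Static analysis: checks the code for errors, for example mismatched argument types or calls to methods that have no implementation (the standalone Clang Static Analyzer runs only when requested, for example with clang --analyze);
d. Intermediate code generation (Code Generation): walks the AST top-down and translates it step by step into LLVM IR.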
clang -S -emit-llvm min.cc -o min.ll
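- Assembly generation (backend):
LLVM turns the LLVM IR into assembly for the current platform. Along the way it optimizes according to the configured optimization level: for example, -O0, typical for Debug builds, performs no optimization, while -Os, often used for Release builds, optimizes the code while keeping its size small.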
clang -S min.cc -o min.s
- Object file generation:
The assembler converts the assembly into machine code and writes an object file, which ends in .o.
clang -c min.cc -o min.o
- Linking:
The linker combines one or more object files into an executable.
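The per-phase commands listed above can be collected into a small driver script to get a rough feel for how long each step takes on a given file. This is only a sketch: stage_commands and time_stages are hypothetical helper names, the flags are exactly the ones shown above, and clang is assumed to be on PATH when time_stages is actually called.

```python
import os
import subprocess
import time


def stage_commands(source, clang="clang"):
    """Return (label, argv) pairs for the commands shown in this section."""
    stem = os.path.splitext(source)[0]
    return [
        ("preprocess", [clang, "-E", source]),
        ("tokens", [clang, "-fsyntax-only", "-Xclang", "-dump-tokens", source]),
        ("ast", [clang, "-fsyntax-only", "-Xclang", "-ast-dump", source]),
        ("llvm-ir", [clang, "-S", "-emit-llvm", source, "-o", stem + ".ll"]),
        ("assembly", [clang, "-S", source, "-o", stem + ".s"]),
        ("object", [clang, "-c", source, "-o", stem + ".o"]),
    ]


def time_stages(source, clang="clang"):
    """Run every command and return (label, seconds, returncode) tuples."""
    results = []
    for label, argv in stage_commands(source, clang):
        start = time.perf_counter()
        # Output is discarded; only the wall-clock time and exit code are kept.
        proc = subprocess.run(argv, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL)
        results.append((label, time.perf_counter() - start, proc.returncode))
    return results


for label, argv in stage_commands("min.cc"):
    print(f"{label:10s} {' '.join(argv)}")
```

Each command reruns all earlier phases, so the timings are cumulative wall-clock times rather than the cost of one phase in isolation; -ftime-trace gives the finer breakdown.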
5.2 clang/llvm source tree
clang
├── lib
│ ├── Analysis
│ ├── ARCMigrate
│ ├── AST
│ ├── ASTMatchers
│ ├── Basic
│ ├── CMakeLists.txt
│ ├── CodeGen
│ ├── CrossTU
│ ├── DirectoryWatcher
│ ├── Driver
│ ├── Edit
│ ├── Format
│ ├── Frontend
│ ├── FrontendTool
│ ├── Headers
│ ├── Index
│ ├── Lex
│ ├── Parse
│ ├── Rewrite
│ ├── Sema
│ ├── Serialization
│ ├── StaticAnalyzer
│ ├── Testing
│ └── Tooling
5.3 -ftime-trace
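clang/llvm treats each source file as one translation unit, compiles the files one by one, and finally links them into an executable. -ftime-trace uses llvm::TimeTraceScope, defined in llvm/trunk/include/llvm/Support/TimeProfiler.h and llvm/trunk/lib/Support/TimeProfiler.cpp, to instrument each compilation stage (RAII) and report how long each stage takes.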
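The RAII idea can be illustrated outside of LLVM. The sketch below is a Python analogue, not LLVM's implementation: the class TimeProfiler and its methods are invented for this example, a context manager plays the role of the scope object whose constructor and destructor mark the start and end of a stage, and the output mimics the Trace Event shape ("ph": "X", "ts" and "dur" in microseconds).

```python
import json
import time
from contextlib import contextmanager


class TimeProfiler:
    """Collects one complete event per instrumented scope."""

    def __init__(self):
        self.events = []
        self._origin = time.perf_counter()

    @contextmanager
    def scope(self, name, detail=""):
        # Entering the block corresponds to constructing the scope object.
        start = time.perf_counter()
        try:
            yield
        finally:
            # Leaving the block (even via an exception) records the event,
            # like a destructor running at the end of a C++ scope.
            end = time.perf_counter()
            self.events.append({
                "pid": 1, "tid": 0, "ph": "X",
                "ts": int((start - self._origin) * 1_000_000),
                "dur": int((end - start) * 1_000_000),
                "name": name,
                "args": {"detail": detail},
            })

    def to_json(self):
        ordered = sorted(self.events, key=lambda event: event["ts"])
        return json.dumps({"traceEvents": ordered})


profiler = TimeProfiler()
with profiler.scope("Frontend", "demo.cc"):
    with profiler.scope("Source", "vector"):
        sum(range(100_000))  # stand-in for real work
print(profiler.to_json())
```

Nested scopes finish from the inside out, so the inner event is recorded first and its interval lies within the outer one, which is what lets the viewers draw the stages as a flame chart.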
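-ftime-trace source: Time trace profiler output support (-ftime-trace)
Further reading on the clang driver source:
- What is the real front end inside Clang? (@蓝色)
- About LLVM (https://github.com/yejinlei/about-compiler)
- LLVM Code Study (2): the LLVM Front End, a Dissection of Clang
- How to learn clang and LLVM (source code reading): what background is needed?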
main->ExecuteCC1Tool->cc1_main
~/project/compile-time-analysis/time-trace$ clang -### min.cc
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
"/usr/lib/llvm-10/bin/clang" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-emit-obj" "-mrelax-all" "-disable-free" "-disable-llvm-verifier" "-discard-value-names" "-main-file-name" "min.cc" "-mrelocation-model" "static" "-mthread-model" "posix" "-mframe-pointer=all" "-fmath-errno" "-fno-rounding-math" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-target-cpu" "x86-64" "-dwarf-column-info" "-fno-split-dwarf-inlining" "-debugger-tuning=gdb" "-resource-dir" "/usr/lib/llvm-10/lib/clang/10.0.0" "-internal-isystem" "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9" "-internal-isystem" "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9" "-internal-isystem" "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/x86_64-linux-gnu/c++/9" "-internal-isystem" "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/backward" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/usr/lib/llvm-10/lib/clang/10.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdeprecated-macro" "-fdebug-compilation-dir" "/home/wwchao/project/compile-time-analysis/time-trace" "-ferror-limit" "19" "-fmessage-length" "0" "-fgnuc-version=4.2.1" "-fobjc-runtime=gcc" "-fcxx-exceptions" "-fexceptions" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-faddrsig" "-o" "/tmp/min-aa1470.o" "-x" "c++" "min.cc"
"/usr/bin/ld" "-z" "relro" "--hash-style=gnu" "--build-id" "--eh-frame-hdr" "-m" "elf_x86_64" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "a.out" "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crt1.o" "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crti.o" "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/crtbegin.o" "-L/usr/bin/../lib/gcc/x86_64-linux-gnu/9" "-L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu" "-L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../../lib64" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/usr/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu/../../lib64" "-L/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../.." "-L/usr/lib/llvm-10/bin/../lib" "-L/lib" "-L/usr/lib" "/tmp/min-aa1470.o" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/crtend.o" "/usr/bin/../lib/gcc/x86_64-linux-gnu/9/../../../x86_64-linux-gnu/crtn.o"
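Note: to study the clang source, build a debug version and step through it with gdb, or add print statements.
TODO: add the key points of the clang driver compilation flow.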
6 References
[1] https://github.com/aras-p/ClangBuildAnalyzer/tree/v1.2.0
[2] https://www.snsystems.com/technology/tech-blog/clang-time-trace-feature
[3] https://aras-p.info/blog/2019/01/16/time-trace-timeline-flame-chart-profiler-for-Clang/
[4] https://firmwaresecurity.com/2019/11/01/clang-build-analyzer-clang-build-analysis-tool/
[5] https://aras-p.info/blog/2019/09/28/Clang-Build-Analyzer/
[6] https://aras-p.info/blog/2019/01/12/Investigating-compile-times-and-Clang-ftime-report/
[7] https://github.com/jlfwong/speedscope
[8] https://reviews.llvm.org/D58675
[9] https://github.com/aras-p/llvm-project-20170507/pull/2/commits
[10] https://github.com/aras-p/llvm-project-20170507/pull/2
[11] https://blog.csdn.net/imtech4713/article/details/103622767
[12] https://www.jianshu.com/p/96058bf1ecc2
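[13] Introduction to Clang/LLVM and the compilation process of Objective-C programs
[14] A first look at Clang
[15] Hades: a static analysis framework for mobile
[16] Clang 10 documentation
[17] LLVM Clang 9.0 adds "-ftime-trace" to generate useful time-trace profiling data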
[18] Clang 9.0.0 Release Notes
[19] GCC 10 Release Series
[20] What is the real front end inside Clang? (@蓝色)
[21] LLVM Code Study (2): the LLVM Front End, a Dissection of Clang
[22] About LLVM (https://github.com/yejinlei/about-compiler)
[23] The Architecture of Open Source Applications: LLVM
[24] How to learn clang and LLVM (source code reading): what background is needed?
[25] WeChat team: lessons from making the iOS WeChat build 3x faster