Getting Started with LLVM Core Libraries
Contents
Build and Install LLVM
External Projects
- Clang extra tools
- Compiler-RT
- DragonEgg
- LLVM test suite
- LLDB
- libc++
Tools and Design
Frontend
- Using libclang
- Understanding Clang diagnostics
- the frontend phases:
- Lexical analysis
- Syntactic analysis(AST)
- TranslationUnitDecl
- TypedefDecl
- FunctionDecl
- CFG
- TranslationUnitDecl
- Semantic analysis
LLVM IR
- syntax
- Module
- Function
- BasicBlock
- Instruction
- use-def and def-use chains
- Value
- User
- generator
- IR-level optimization: passes
- -Ox
- -print-stats
- Understanding dependencies between passes:
- e.g. Loop Info and Dominator Tree
- Understanding the pass API
- Writing a custom pass
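The use-def and def-use chains listed above can be pictured with a small standalone model. This is a sketch only: the Value and User structs below are hypothetical toy classes that borrow the names of llvm::Value and llvm::User, and they do not use any LLVM headers. Each value records who uses it (def-use), each user records its operands (use-def), and replaceAllUsesWith rewires both directions.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy classes: they only mirror the idea behind llvm::Value / llvm::User.
struct User;

struct Value {
    std::string name;
    std::vector<User*> users;  // def-use chain: one entry per use of this value

    explicit Value(std::string n) : name(std::move(n)) {}
    virtual ~Value() = default;

    // Redirect every use of this value to `other`.
    void replaceAllUsesWith(Value* other);
};

struct User : Value {
    std::vector<Value*> operands;  // use-def chain: the values this user reads

    User(std::string n, std::vector<Value*> ops)
        : Value(std::move(n)), operands(std::move(ops)) {
        // Register this user with each operand so both chains stay in sync.
        for (Value* v : operands) v->users.push_back(this);
    }
};

void Value::replaceAllUsesWith(Value* other) {
    assert(other != this && "cannot replace a value with itself");
    for (User* u : users) {
        for (Value*& op : u->operands)
            if (op == this) op = other;
        other->users.push_back(u);
    }
    users.clear();
}
```

Because a User is itself a Value, the result of one instruction can be the operand of another, which is what lets an optimization walk the graph in either direction.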
Backend
- TableGen
- SelectionDAG
- Scheduler
- MachineInstr
- Register allocation
- Prologue and epilogue
- MCInst
JIT
- ExecutionEngine
- llvm::JIT / llvm::MCJIT
- RTDyldMemoryManager
- allocateCodeSection() allocateDataSection()
- getSymbolAddress(): address of a symbol in an external library
- finalizeMemory()
- JITCodeEmitter (derives from MachineCodeEmitter)
- JITMemoryManager
- JITResolver: handles calls to target functions that have not been compiled yet (by generating a stub?)
- TargetJITInfo
- replaceMachineCodeForFunction
- relocate
- emitFunctionStub
- <Target>CodeEmitter
- OwningPtr<ExecutionEngine> EE(EngineBuilder(M).create());
- Function *SumFn = M->getFunction("sum");
- int (*Sum)(int, int) = (int (*)(int, int))EE->getPointerToFunction(SumFn);
- $ clang++ sum-jit.cpp -g -O3 -rdynamic -fno-rtti $(llvm-config --cppflags --ldflags --libs jit native irreader) -o sum-jit
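- Passing arbitrary arguments: GenericValue // not sure how the stack gets adjusted automatically here?
- The new MCJIT
- MCJIT::finalizeObject() // this seems to emphasize object lifetime management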
- ObjectBuffer ObjectCache ObjectImage
- RuntimeDyldImpl::loadObject()
- RuntimeDyld::getSymbolLoadAddress()
- MCJIT::emitObject()
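The note on JITResolver asks whether it handles not-yet-compiled functions by generating a stub. The idea of compiling on first call can be sketched without LLVM. Everything below is a hypothetical toy (LazyResolver, declare, and call are invented names, not LLVM API): a call goes through a lookup that plays the role of the stub, compiles the target the first time, and reuses the result afterwards.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical toy, not LLVM API: models a stub that compiles its target on first call.
class LazyResolver {
public:
    using Fn = std::function<int(int, int)>;
    using Compiler = std::function<Fn()>;  // produces the "machine code" on demand

    // Register a function that is known but not compiled yet.
    void declare(const std::string& name, Compiler compile) {
        pending_[name] = std::move(compile);
    }

    // Plays the role of the stub: compile on first use, then reuse the result.
    int call(const std::string& name, int a, int b) {
        auto it = compiled_.find(name);
        if (it == compiled_.end()) {
            auto p = pending_.find(name);
            assert(p != pending_.end() && "unknown function");
            it = compiled_.emplace(name, p->second()).first;
            pending_.erase(p);
            ++compilations_;
        }
        return it->second(a, b);
    }

    int compilations() const { return compilations_; }

private:
    std::map<std::string, Compiler> pending_;
    std::map<std::string, Fn> compiled_;
    int compilations_ = 0;
};
```

A real JIT patches the stub or the call site with the address of the generated code; the map lookup here only stands in for that step.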
Cross-compilation
- build,host,target
- Multilib
- Cross Linux from Scratch http://trac.cross-lfs.org
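- There are complete development boards emulated by QEMU. (Fully emulated development boards? Nice.)
Clang static analysis
- Performance optimization (in backend instruction generation) and static analysis (on the frontend AST): the two great assets of compiler technology!
- Competitors: HP Fortify and Synopsys Coverity
- Exponential-time complexity; no support for inter-module analysis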
- e.g. forward dataflow analysis
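- Associate properties with variable symbols, then check at later uses whether the constraints hold
- False positives: they tend to make programmers ignore all warnings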
- symbolic execution engine
- When execution passes through a branch, take the if condition as an assumption and keep reasoning from there (wow)
- clang -cc1 -analyze -analyzer-checker=<package> and clang --analyze -Xanalyzer ...
- Available checkers:
- alpha.core.BoolAssignment, alpha.security.MallocOverflow, alpha.unix.cstring.NotNullTerminated
- core.NullDereference, core.DivideZero, core.StackAddressEscape
- cplusplus.NewDelete
- debug.DumpCFG, debug.DumpDominators, debug.ViewExplodedGraph
- llvm.Conventions
- security.FloatLoopCounter, security.insecureAPI.UncheckedReturn, ...
- unix.API, unix.Malloc, unix.MallocSizeof, unix.MismatchedDeallocator
- -fsyntax-only
- scan-build
- $ scan-build gcc -c joe.c -o joe.o
- $ scan-view <output_dir>
- A real-world example:
- $ scan-build ../httpd-2.4.9/configure -prefix=$(pwd)/../install
- $ scan-build make
- Extending the analyzer
- ProgramState, ProgramPoint, ExplodedGraph
- Is the program state immutable?
- To save space, this graph is folded: llvm::FoldingSetNode (reference counting?)
- Challenge: how to model memory behavior (the aliasing problem)
- lvalue -> memory region -> binding -> symbolic value
- class ReactorChecker : public Checker<check::PostCall> {
- ...
- void checkPostCall(const CallEvent /* the most recent function call */ &Call, CheckerContext &C) const;
- if (Call.getCalleeIdentifier() == ...
- p241 The mutable keyword should only be used for mutexes or such caching scenarios.
- OK, a new checker has to be registered using TableGen syntax?? lib/StaticAnalyzer/Checkers/Checkers.td
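The notes above describe taking the if condition as an assumption at each branch, and ask whether the program state is immutable. A minimal sketch of both ideas follows. It is a hypothetical toy, not the analyzer's real ProgramState: State, Zeroness, assume, and mustBeZero are invented for illustration. assume() never modifies its input; it returns a new state for the branch, or nothing when the assumption contradicts what the path already knows.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical toy model, not the analyzer's real classes.
enum class Zeroness { Zero, NonZero };

// Treated as immutable: functions below take it by const reference and return copies.
struct State {
    std::map<std::string, Zeroness> constraints;  // what this path knows per symbol
};

// Add the assumption "sym is zero / non-zero" to a path.
// Returns std::nullopt when the assumption contradicts the existing constraints.
std::optional<State> assume(const State& s, const std::string& sym, Zeroness z) {
    auto it = s.constraints.find(sym);
    if (it != s.constraints.end()) {
        if (it->second == z) return s;  // already known, nothing new to record
        return std::nullopt;            // infeasible path
    }
    State next = s;                     // copy, leaving the original state untouched
    next.constraints[sym] = z;
    return next;
}

// True only when the path forces sym to be zero: zero is feasible, non-zero is not.
bool mustBeZero(const State& s, const std::string& sym) {
    return assume(s, sym, Zeroness::Zero).has_value() &&
           !assume(s, sym, Zeroness::NonZero).has_value();
}
```

For `if (x == 0) y = 1 / x;`, splitting the entry state with assume() yields one state per branch, and only the state for the taken branch makes mustBeZero("x") true. Flagging just that case, rather than every unconstrained divisor, is one way to keep the false positives mentioned above down.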
LibTooling
- Generating a compile commands database
- $ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON ../ (hm?)
- Understanding/using LibTooling-based tools: Clang Tidy/Modernizer/Apply Replacements/Format/Query, Modularize, PPTrace (Clang Extra Tools)
- clang-tidy
- $ clang-tidy -checks="llvm-*" file.cpp
- Refactoring tools:
- Clang Modernizer (revolutionary?): generates source-to-source change suggestions for migrating to C++11
- Clang Apply Replacements
- Data structure: clang::tooling::Replacement (YAML format? avoids having to parse patch files)
- Example: $ clang-modernize -loop-convert -serialize-replacements test.cpp --serialize-dir=./
- => $ clang-apply-replacements ./
- Functionality from LibFormat: -style=<LLVM|Google|Chromium|Mozilla|WebKit>
- ClangFormat: reformats IOCCC entries, enforces coding conventions (why is there no command-line tool like this for Java?)
- Modularize
- $ readelf -s screen.o
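- Requires type information for symbols: C header files?
- p261 Conditional macros burden the compiler with unnecessary repeated parsing of header files
- Modules and the import keyword? (reminds me of the D language)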
- Module Map Checker
- $ module-map-checker module.modulemap (makes sure all .h header files are covered)
- PPTrace (trace output of the preprocessor)
- Clang Query: queries the AST and also serves to test AST matchers
- clang-query> match callExpr()
- Clang Check (only a few hundred lines of code?)
- $ clang-check program.c -ast-dump -- (uses the compile commands database, or the compile flags given after --)
- Remove c_str() calls
- clang-tidy
- Building a code refactoring tool: libclang / plugins / LibTooling
- Dissecting tooling boilerplate code:ParseCommandLineOptions
- Using AST matchers
- clang-query> match methodDecl(hasName("walk"))
- clang-query> match recordDecl(isSameOrDerivedFrom(hasName("Animal")))
- clang-query> match recordDecl(hasMethod(methodDecl(hasName("walk"))))
- Matching by name? Are regular expressions supported?
- clang-query> match memberCallExpr(callee(memberExpr(member(hasName("walk")))))
- clang-query> match memberCallExpr(callee(memberExpr(member(hasName("walk")))), thisPointerType(recordDecl(isSameOrDerivedFrom(hasName("Animal")))))
- This syntax works, but it is very clumsy!
- Putting the AST matcher predicates in C++ code
- Writing the callbacks
- tooling::Replacements *Replace;
- const CXXMethodDecl *method = matchResult.Nodes.getNodeAs<CXXMethodDecl>("methodDecl");
- What the match yields is an AST node?
- Replace->insert(Replacement(*matchResult.SourceManager, CharSourceRange::getTokenRange(SourceRange(method->getLocation())), NewMethodName));
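The callback above collects replacements instead of editing the file directly. How such a collection might then be applied to the source text can be sketched as follows. This is a standalone toy under stated assumptions: Edit is a hypothetical stand-in for clang::tooling::Replacement that keeps only an offset, a length, and the new text, applyEdits is an invented helper, and the edits are assumed not to overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for clang::tooling::Replacement (file path omitted).
struct Edit {
    std::size_t offset;  // where the replaced range starts in the original text
    std::size_t length;  // how many characters to replace
    std::string text;    // the new text
};

// Apply non-overlapping edits whose offsets all refer to the original text.
std::string applyEdits(std::string source, std::vector<Edit> edits) {
    // Work from the highest offset down so earlier offsets stay valid
    // even when an edit changes the length of the text.
    std::sort(edits.begin(), edits.end(),
              [](const Edit& a, const Edit& b) { return a.offset > b.offset; });
    for (const Edit& e : edits) {
        assert(e.offset + e.length <= source.size() && "edit out of range");
        source.replace(e.offset, e.length, e.text);
    }
    return source;
}
```

Renaming both calls in `a.walk(); b.walk();` then takes two edits, {2, 4, "run"} and {12, 4, "run"}, both expressed against the original offsets.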