0. Building LLVM and Clang
cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=host -DLLVM_ENABLE_PROJECTS=clang ../llvm/
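To enable more components, such as the libc++ runtimes, specify LLVM_ENABLE_RUNTIMES. This is a CMake cache variable passed with -D, not an environment variable:
-DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi"
An older-style invocation lists the runtimes under LLVM_ENABLE_PROJECTS instead (newer LLVM releases expect them in LLVM_ENABLE_RUNTIMES):
cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=host -DLLVM_ENABLE_PROJECTS="clang;compiler-rt;lld;libcxx;libcxxabi" …/llvm/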
The command used directly for this build:
cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=host -DLLVM_ENABLE_PROJECTS="clang;compiler-rt" …/llvm/
Further reference:
[1]: https://llvm.org/docs/CMake.html
[2]: https://llvm.org/docs/CMake.html#llvm-related-variables
[3]: https://github.com/vusec/fuzzbench-snappy/blob/0f2cab6dc1cf8335035f9d5f0eed8b0c58189821/fuzzers/snappy/builder.Dockerfile
1. About Transformation Passes
A transformation pass normally inherits from PassInfoMixin, e.g.,
struct InjectFuncCall : public llvm::PassInfoMixin<InjectFuncCall> {
  llvm::PreservedAnalyses run(llvm::Module &M,
                              llvm::ModuleAnalysisManager &);
  bool runOnModule(llvm::Module &M);
};
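The mixin relies on the curiously recurring template pattern (CRTP): the base class is parameterized on the derived pass, so it can provide helpers without virtual functions. The sketch below is a toy stand-in written for these notes, not LLVM's code. ToyPassInfoMixin, ToyModule, ToyInjectFuncCall and passName() are all hypothetical names, and the real PassInfoMixin derives the pass name differently.

```cpp
#include <string>

// Toy stand-in for the CRTP idea behind llvm::PassInfoMixin (not LLVM's code).
template <typename DerivedT>
struct ToyPassInfoMixin {
  // The base reaches into the derived type at compile time, no vtable needed.
  static std::string name() { return DerivedT::passName(); }
};

// Hypothetical module type: only tracks how many functions were "instrumented".
struct ToyModule {
  int functions = 0;
  int instrumented = 0;
};

struct ToyInjectFuncCall : ToyPassInfoMixin<ToyInjectFuncCall> {
  static std::string passName() { return "ToyInjectFuncCall"; }

  // Loosely mirrors the shape of run(Module &, ModuleAnalysisManager &):
  // returns true when the module was changed.
  bool run(ToyModule &M) {
    M.instrumented = M.functions;
    return M.functions > 0;
  }
};
```

In this toy, ToyInjectFuncCall::name() resolves through the base class to the derived passName(), which is the same trick that lets a pass manager query a pass without a common virtual base.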
- Gain a better understanding of the sanitizer implementation
  ref: https://cyruscyliu.github.io/posts/2021-11-02-libFuzzer-cov-control/
    std::make_pair(CoverageFunc, "-fsanitize-coverage-type=1"),
    std::make_pair(CoverageBB, "-fsanitize-coverage-type=2"),
    std::make_pair(CoverageEdge, "-fsanitize-coverage-type=3"),
    std::make_pair(CoverageIndirCall, "-fsanitize-coverage-indirect-calls"),
    std::make_pair(CoverageTraceBB, "-fsanitize-coverage-trace-bb"),
    std::make_pair(CoverageTraceCmp, "-fsanitize-coverage-trace-cmp"),
    std::make_pair(CoverageTraceDiv, "-fsanitize-coverage-trace-div"),
    std::make_pair(CoverageTraceGep, "-fsanitize-coverage-trace-gep"),
    std::make_pair(Coverage8bitCounters, "-fsanitize-coverage-8bit-counters"),
    std::make_pair(CoverageTracePC, "-fsanitize-coverage-trace-pc"),
2. Learning to use sanitizer instrumentation
- Source reference, in Transforms/Instrumentation/SanitizerCoverage.cpp:
    InjectCoverage(F, BlocksToInstrument, IsLeafFunc);
    InjectCoverageForIndirectCalls(F, IndirCalls);
    InjectTraceForCmp(F, CmpTraceTargets);
    InjectTraceForSwitch(F, SwitchTraceTargets);
    InjectTraceForDiv(F, DivTraceTargets);
    InjectTraceForGep(F, GepTraceTargets);
    InjectTraceForLoadsAndStores(F, Loads, Stores);
- Compile options
  - -fno-sanitize-coverage: used to disable this option if it's already provided or implied by another option, e.g., -fno-sanitize-coverage=trace-cmp
  - -fsanitize-coverage=trace-cmp: used to instruct the compiler to insert calls to the comparison-tracing callbacks (the __sanitizer_cov_trace_cmp* family)
    ref: https://learn.microsoft.com/en-us/cpp/build/reference/fsanitize-coverage?view=msvc-170
- Comparing __sanitizer_cov_trace_pc() and __sanitizer_cov_trace_pc_guard
  The intent of __sanitizer_cov_trace_pc_guard is to eventually replace all of {bool coverage, 8bit-counters, trace-pc} with just this one. The __sanitizer_cov_trace_pc callback is not implemented in the Sanitizer run-time and should be defined by the user.
- About using -fsanitize-coverage=trace-cmp
  Using this option on its own does not insert any instrumentation. It turns out that it only takes effect when combined with another option such as -fsanitize-coverage=trace-pc-guard, e.g.,
  clang++ test.cpp -fsanitize=address -fsanitize-coverage=trace-pc-guard -fsanitize-coverage=trace-cmp -o test
  The documentation explains why: "Currently, these flags do not work by themselves - they require one of -fsanitize-coverage={trace-pc,inline-8bit-counters,inline-bool} flags to work"
  Link: https://clang.llvm.org/docs/SanitizerCoverage.html
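To make the trace-cmp callbacks more concrete, here is a minimal sketch of a user-defined handler for 4-byte comparisons. The callback name and signature follow the SanitizerCoverage documentation; the bookkeeping variables and the LastComparisonMatched helper are hypothetical names introduced only for illustration. It assumes the handler lives in a translation unit that is not itself built with coverage instrumentation.

```cpp
#include <cstdint>

// Illustrative bookkeeping (hypothetical names, not part of any runtime).
static uint64_t g_cmp_calls = 0;
static uint32_t g_last_lhs = 0;
static uint32_t g_last_rhs = 0;

// Callback name and signature as listed in the SanitizerCoverage docs.
// When the program is built with
//   -fsanitize-coverage=trace-pc-guard,trace-cmp
// the compiler inserts a call to it before 4-byte integer comparisons.
extern "C" void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2) {
  ++g_cmp_calls;
  g_last_lhs = Arg1;
  g_last_rhs = Arg2;
}

// Hypothetical helper: did the most recent comparison see equal operands?
static bool LastComparisonMatched() {
  return g_cmp_calls > 0 && g_last_lhs == g_last_rhs;
}
```

A fuzzer-style consumer would typically store the operand pairs instead of only the last one, so that mismatching operands can guide later inputs.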
- To view the disassembly of the generated binary for a single function only:
  gdb ./test_bb -batch -ex 'disassemble main'
  ../../build_llvm2/bin/llvm-objdump ./test_cmp --disassemble-symbols=main
- Combining trace-pc-guard with other flags
  Sanitizer Coverage offers different levels of instrumentation.
    edge (default): edges are instrumented (see below).
    bb: basic blocks are instrumented.
    func: only the entry block of every function will be instrumented.
  Use these flags together with trace-pc-guard or trace-pc, like this: -fsanitize-coverage=func,trace-pc-guard.
- Understanding -fsanitize-coverage=inline-8bit-counters and -fsanitize-coverage=trace-pc-guard
  - 1. inline-8bit-counters modifies a tagged memory region directly, while trace-pc-guard modifies a tagged memory region in whatever way the user defines. The two manage different memory regions, and they manage them differently: guards go through a call to a user-defined function, whereas 8-bit counters are modified directly by inlined instructions.
      if (Options.TracePCGuard) {
        auto GuardPtr = IRB.CreateIntToPtr(
            IRB.CreateAdd(IRB.CreatePointerCast(FunctionGuardArray, IntptrTy),
                          ConstantInt::get(IntptrTy, Idx * 4)),
            Int32PtrTy);
        IRB.CreateCall(SanCovTracePCGuard, GuardPtr)->setCannotMerge();
      }
      if (Options.Inline8bitCounters) {
        auto CounterPtr = IRB.CreateGEP(
            Function8bitCounterArray->getValueType(), Function8bitCounterArray,
            {ConstantInt::get(IntptrTy, 0), ConstantInt::get(IntptrTy, Idx)});
        auto Load = IRB.CreateLoad(Int8Ty, CounterPtr);
        auto Inc = IRB.CreateAdd(Load, ConstantInt::get(Int8Ty, 1));
        auto Store = IRB.CreateStore(Inc, CounterPtr);
        SetNoSanitizeMetadata(Load);
        SetNoSanitizeMetadata(Store);
      }
  - 2. The two are initialized differently
    As the code below shows, pc_guard initialization gives each *guard a distinct value, so when __sanitizer_cov_trace_pc_guard is called at run time, checking the *guard value is enough to tell that a new basic block was reached; the guard value itself acts as an index.
      extern "C" void __sanitizer_cov_8bit_counters_init(char *start, char *end) {
        // [start,end) is the array of 8-bit counters created for the current DSO.
        // Capture this array in order to read/modify the counters.
      }
      extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
                                                          uint32_t *stop) {
        static uint64_t N;  // Counter for the guards.
        if (start == stop || *start) return;  // Initialize only once.
        printf("INIT: %p %p\n", start, stop);
        for (uint32_t *x = start; x < stop; x++)
          *x = ++N;  // Guards should start from 1.
      }
      extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
        if (!*guard) return;  // Duplicate the guard check.
        // If you set *guard to 0 this code will not be called again for this edge.
        // Now you can get the PC and do whatever you want:
        //   store it somewhere or symbolize it and print right away.
        // The values of `*guard` are as you set them in
        // __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
        // and use them to dereference an array or a bit vector.
        void *PC = __builtin_return_address(0);
        .......
      }
    8bit_counters, in contrast, correspond to a separate memory region, Function8bitCounterArray, whose initial values should all be 0. At run time the inlined instructions obtain the current index directly and then update that memory (i.e., add 1 to the value stored there).
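The contrast between the two schemes can be simulated in plain C++ without any compiler instrumentation. The sketch below is a toy model under stated assumptions: all Toy* names, kNumEdges and the global arrays are hypothetical, and a real build would have the compiler emit the guard call or the inline increment on each edge instead of the explicit ToyHitEdge call.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kNumEdges = 4;  // hypothetical number of instrumented edges

// trace-pc-guard style: one 32-bit guard per edge, touched only via a callback.
static uint32_t g_guards[kNumEdges];
static bool g_edge_seen[kNumEdges + 1];  // indexed by guard id (ids start at 1)

void ToyGuardInit(uint32_t *start, uint32_t *stop) {
  static uint32_t next_id = 0;
  if (start == stop || *start) return;  // initialize only once
  for (uint32_t *g = start; g < stop; ++g) *g = ++next_id;
}

void ToyTracePcGuard(uint32_t *guard) {
  if (!*guard) return;  // edge already reported
  if (*guard <= kNumEdges) g_edge_seen[*guard] = true;
  *guard = 0;  // stop reporting this edge
}

// inline-8bit-counters style: one byte per edge, bumped in place, no call.
static uint8_t g_counters[kNumEdges];

inline void ToyInlineCounter(std::size_t idx) { ++g_counters[idx]; }

// Simulate executing edge `idx` under both schemes at once.
void ToyHitEdge(std::size_t idx) {
  ToyTracePcGuard(&g_guards[idx]);
  ToyInlineCounter(idx);
}
```

Hitting the same edge three times leaves its guard at 0 after the first hit, while its counter reaches 3: the guard answers "was this edge reached", the counter answers "how often".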
- Understanding how the sanitize_* arguments work at compile time
  - Some important files: SanitizerArgs.cpp, BackendUtil.cpp, SanitizerCoverage.cpp
  - There appears to be a value propagation chain:
    1. In SanitizerArgs.cpp: std::make_pair(CoverageTraceCmp, "-fsanitize-coverage-trace-cmp"),
    2. In BackendUtil.cpp: Opts.TraceCmp = CGOpts.SanitizeCoverageTraceCmp;
    3. In SanitizerCoverage.cpp: Options.TraceCmp |= ClCMPTracing;
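The propagation chain can be pictured with a toy model. This is only a sketch of the data flow suggested by the three snippets above, assuming heavily simplified stand-in types; ToyCodeGenOptions, ToySanitizerCoverageOptions and the Toy* functions are hypothetical and do not exist in clang or LLVM.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins; the real types carry many more fields.
struct ToyCodeGenOptions {
  bool SanitizeCoverageTraceCmp = false;
};
struct ToySanitizerCoverageOptions {
  bool TraceCmp = false;
};

// Step 1 (cf. SanitizerArgs.cpp): the driver maps a requested feature to a
// cc1 flag string.
std::vector<std::string> ToyDriverToCC1(bool traceCmpRequested) {
  std::vector<std::string> cc1Args;
  if (traceCmpRequested)
    cc1Args.push_back("-fsanitize-coverage-trace-cmp");
  return cc1Args;
}

// Step 2 (cc1 option parsing): the flag sets a CodeGenOptions field.
ToyCodeGenOptions ToyParseCC1(const std::vector<std::string> &cc1Args) {
  ToyCodeGenOptions CGOpts;
  for (const std::string &arg : cc1Args)
    if (arg == "-fsanitize-coverage-trace-cmp")
      CGOpts.SanitizeCoverageTraceCmp = true;
  return CGOpts;
}

// Step 3 (cf. BackendUtil.cpp): the field is copied into the pass options.
ToySanitizerCoverageOptions ToyBackendOptions(const ToyCodeGenOptions &CGOpts) {
  ToySanitizerCoverageOptions Opts;
  Opts.TraceCmp = CGOpts.SanitizeCoverageTraceCmp;
  return Opts;
}

// Step 4 (cf. SanitizerCoverage.cpp): the pass ORs in its own command-line
// switch, so either source can turn the feature on.
ToySanitizerCoverageOptions ToyOverrideFromCL(ToySanitizerCoverageOptions Opts,
                                              bool clCmpTracing) {
  Opts.TraceCmp |= clCmpTracing;
  return Opts;
}
```

Seen this way, adding a new flag means adding one link at every stage; if any stage is missed, the value never reaches the pass.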
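- How should the LLVM source be modified to add a new flag such as trace-semantic-ret?
  - Core idea: follow an existing flag (i.e., trace-cmp) and mirror its changes. Keywords to search for: SanitizeCoverageTraceCmp, CoverageTraceCmp, TraceCmp, Options.TraceCmp, SanitizerCoverageOptions
  - The files involved are: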
    - llvm/include/llvm/Transforms/Instrumentation.h
      Search for SanitizerCoverageOptions and define in the struct: bool TraceSemanticFunc = false;
    - clang/lib/CodeGen/BackendUtil.cpp
      Search for SanitizerCoverageOptions and add: Opts.TraceSemanticFunc = CGOpts.SanitizeCoverageTraceSemanticFunc;
    - clang/include/clang/Basic/CodeGenOptions.def
      Search for SanitizeCoverageTraceCmp and add:
        CODEGENOPT(SanitizeCoverageTraceSemanticFunc, 1, 0) ///< Enable func instruction tracing
                                                            ///< in sanitizer semantic coverage.
    - clang/include/clang/Basic/CodeGenOptions.h
      Search for SanitizeCoverageTraceCmp and add SanitizeCoverageTraceSemanticFunc to the logic in hasSanitizeCoverage()
    - clang/include/clang/Driver/Options.td
      Search for SanitizeCoverageTraceCmp and add the following:
        def fsanitize_coverage_trace_semantic_func
            : Flag<["-"], "fsanitize-coverage-trace-semantic-func">,
              HelpText<"Enable func semantic tracing in sanitizer coverage">,
              MarshallingInfoFlag<CodeGenOpts<"SanitizeCoverageTraceSemanticFunc">>;
    - clang/lib/Driver/SanitizerArgs.cpp
      Search for CoverageTraceCmp and add the following:
        CoverageTraceSemanticFunc = 1 << 19,
        std::make_pair(CoverageTraceSemanticFunc, "-fsanitize-coverage-trace-semantic-func")
        .Case("trace-semantic-func", CoverageTraceSemanticFunc)
      Caution: do not enable the new features casually in the statement below, otherwise the instrumentation is applied across the board rather than selectively. That is, do not append CoverageTraceSemanticRet | CoverageTraceSemanticFunc to it, as in:
        CoverageFeatures |= CoverageInline8bitCounters | CoverageIndirCall | CoverageTraceCmp | CoveragePCTable | CoverageTraceSemanticRet | CoverageTraceSemanticFunc;
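The SanitizerArgs.cpp part of the change boils down to a bitmask plus a name-to-bit lookup. The following sketch models that idea in isolation. It is not clang's code: the Toy* names are hypothetical, the bit positions other than 1 << 19 are illustrative, and a plain if-chain stands in for the .Case(...) lookup used in the source.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical feature bits; only 1 << 19 is taken from the notes above.
enum ToyCoverageFeature : uint32_t {
  ToyCoverageTraceCmp = 1u << 5,
  ToyCoverageTracePCGuard = 1u << 10,
  ToyCoverageTraceSemanticFunc = 1u << 19,
};

// Maps one feature name to its bit; unknown names map to 0.
uint32_t ToyParseCoverageFeature(const std::string &value) {
  if (value == "trace-cmp") return ToyCoverageTraceCmp;
  if (value == "trace-pc-guard") return ToyCoverageTracePCGuard;
  if (value == "trace-semantic-func") return ToyCoverageTraceSemanticFunc;
  return 0;
}

// Parses a comma-separated list such as "trace-pc-guard,trace-semantic-func".
uint32_t ToyParseCoverageList(const std::string &list) {
  uint32_t features = 0;
  std::size_t begin = 0;
  while (begin <= list.size()) {
    std::size_t end = list.find(',', begin);
    if (end == std::string::npos) end = list.size();
    features |= ToyParseCoverageFeature(list.substr(begin, end - begin));
    begin = end + 1;
  }
  return features;
}

// Default feature set: the new bit is deliberately left out, so it is only
// active when requested explicitly.
uint32_t ToyDefaultFeatures() {
  return ToyCoverageTraceCmp | ToyCoverageTracePCGuard;
}
```

Keeping the new bit out of the default set is what makes the instrumentation selective: it is only present when the flag appears in the list.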
- To keep variable names in the compiled output, use the -fno-discard-value-names option
  - e.g., clang++ -emit-llvm -c inputs/input_for_cc.c -o input_for_cc.bc -fno-discard-value-names
  - e.g., ../../build_llvm2/bin/clang++ -fno-discard-value-names -O0 test.cpp -fsanitize-coverage=trace-pc-guard -fsanitize-coverage=trace-semantic-ret -o test lodes_intrumentation.cc
  - In newer LLVM versions, using -load LLVMHello.so hello directly reports an error; -enable-new-pm=0 has to be added to the command line.
3. About the role of compiler-rt
Instrumenting a target file with the self-built clang++ produced the following errors:
/usr/bin/ld: cannot find /home/ubuntu/llvm-14/build_llvm/lib/clang/14.0.0/lib/linux/libclang_rt.asan_static-x86_64.a: No such file or directory
/usr/bin/ld: cannot find /home/ubuntu/llvm-14/build_llvm/lib/clang/14.0.0/lib/linux/libclang_rt.asan-x86_64.a: No such file or directory
/usr/bin/ld: cannot find /home/ubuntu/llvm-14/build_llvm/lib/clang/14.0.0/lib/linux/libclang_rt.asan_cxx-x86_64.a: No such file or directory
The test file is:
void foo() { }
int main(int argc, char **argv) {
  int a = 0;
  if (argc > 1) foo();
  if (argc > 2) a += argc;
  return a;
}
The compile command is:
../bin/clang++ -g test.cpp -fsanitize=address -fsanitize-coverage=trace-pc-guard -o test
This appears to be because the libclang_rt.asan-x86_64.a library, which is part of compiler-rt, was never built in this custom build.
Installing into a specific directory with make install:
make install DESTDIR=debian/tmp
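After clang processes the code, library function names get a prefix or suffix added (C++ name mangling). To keep the original names, define the functions as extern "C" { void xxx() { xxx; } } and then bring them in at run time through LD_PRELOAD, e.g.,
LD_PRELOAD=./sanitizer.so ./test 1-zlib.z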
- https://groups.google.com/g/llvm-dev/c/31ImqM_DVIs
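A small sketch of the extern "C" point, assuming a hypothetical hook name (toy_cov_hook) and a hypothetical source file hooks.cpp; only sanitizer.so comes from the notes above. The first function is exported under its plain name, the second under a mangled C++ name.

```cpp
#include <cstdint>

static uint64_t g_hook_calls = 0;  // illustrative counter

extern "C" {
// With C linkage the exported symbol is exactly `toy_cov_hook` (a
// hypothetical name), so code that expects that plain name can bind to it.
void toy_cov_hook(uint32_t id) {
  (void)id;
  ++g_hook_calls;
}
}

// With C++ linkage the same signature is exported under a mangled name
// (on Linux x86-64 typically something like _Z16toy_cov_hook_cxxj).
void toy_cov_hook_cxx(uint32_t id) {
  (void)id;
  ++g_hook_calls;
}
```

Building it with something like clang++ -shared -fPIC hooks.cpp -o sanitizer.so and listing the symbols with nm -D sanitizer.so should show the difference between the two names; whether a preloaded definition actually wins at run time depends on how the target binary was linked.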
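Where is clang's default sanitizer implemented, and where should one start in order to modify it?
Answer: in libFuzzer, specifically in FuzzerTracePC.cpp.
After making changes, the library built from libFuzzer has to replace the default libclang_rt.fuzzer-x86_64.a library file.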
It then turned out that changing only that place is not enough, because linking fails. Two more files matter as well, which were found like this:
strings /home/ubuntu/llvm-14/build_llvm2/lib/clang/14.0.0/lib/linux/libclang_rt.asan-x86_64.a | grep sanitizer_cov
- sanitizer_coverage_libcdep_new.cpp
- sanitizer_coverage_fuchsia.cpp
Both files need changes such as:
SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_cov_trace_semantic, u32) {}
SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_cov_trace_error, u32) {}
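The reason such weak definitions fix the link error can be sketched as follows. This is a rough approximation of what a weak default hook looks like with GCC or Clang on Linux, not the macro's exact expansion; toy_cov_trace_semantic and ToyInstrumentedFunction are hypothetical names chosen so they do not collide with the real runtime symbols.

```cpp
#include <cstdint>

// Weak, exported, empty default: the linker always finds a definition, and a
// strong definition supplied elsewhere (for example by a fuzzer runtime)
// takes precedence over it.
extern "C" __attribute__((visibility("default"))) __attribute__((weak))
void toy_cov_trace_semantic(uint32_t id) {
  (void)id;  // the default does nothing
}

// Stand-in for instrumented code: it calls the hook unconditionally, which
// only links because some definition of the hook is guaranteed to exist.
int ToyInstrumentedFunction(int x) {
  toy_cov_trace_semantic(1);
  return x + 1;
}
```

With only the weak default present, the call is a no-op; supplying a strong toy_cov_trace_semantic in another object file replaces it without touching the instrumented code.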