LibFuzzer


老和山乔治


Category: LLVM

Article link: https://blog.csdn.net/FJDJFKDJFKDJFKD/article/details/93741934


Fuzzing

A software testing technique, often automated or semi-automated, that involves passing invalid, unexpected, or random input to a program and monitoring the results for crashes, failed assertions, races, leaks, etc.

Fuzzer types

Generation Based: Generate from scratch with no prior state
Mutation Based: Mutate existing state based on some rules
Evolutionary: Generation based, mutation based, or both, guided by code coverage feedback

Fuzzing in the past

Take browser fuzzing as an example:

Generate an HTML page
Write it to the disk
Launch browser
Open the page or serve it over HTTP
Check if the browser crashed
Close the browser

No coverage

Large search space
Cannot fuzz specific function
Hard to fuzz network protocols
Low speed of regular fuzzers (HTML, CSS, DOM, etc. mutators)

Features of LibFuzzer

LibFuzzer is an in-process, coverage-guided, evolutionary fuzzing engine that is part of the LLVM project. It is linked with the library under test and feeds fuzzed inputs to the library via a specific fuzzing entry point (aka "target function"); the fuzzer then tracks which areas of the code are reached and generates mutations on the corpus of input data in order to maximize code coverage. The code coverage information for libFuzzer is provided by LLVM's SanitizerCoverage instrumentation.

Coverage-guided fuzz testing
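The feedback loop can be illustrated with a toy model. The following is only a sketch under stated assumptions, not libFuzzer's real implementation: `ToyTargetCoverage`, `AddIfNewCoverage`, and `FuzzLoop` are hypothetical names, and "coverage" is simulated as a set of integer edge IDs instead of real SanitizerCoverage data.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <vector>

using Input = std::vector<uint8_t>;

// Hypothetical target: edge k (1..4) is reached when the first k bytes
// of the input match "FUZZ". Edge 0 stands for the function entry.
std::set<int> ToyTargetCoverage(const Input &in) {
  static const uint8_t kMagic[] = {'F', 'U', 'Z', 'Z'};
  std::set<int> edges = {0};
  for (size_t i = 0; i < 4 && i < in.size() && in[i] == kMagic[i]; ++i)
    edges.insert(static_cast<int>(i) + 1);
  return edges;
}

// Keeps `in` in the corpus only if it reaches an edge not seen before.
bool AddIfNewCoverage(const Input &in, std::vector<Input> &corpus,
                      std::set<int> &seen) {
  bool is_new = false;
  for (int e : ToyTargetCoverage(in))
    if (seen.insert(e).second) is_new = true;
  if (is_new) corpus.push_back(in);
  return is_new;
}

// Evolutionary loop: pick a corpus entry, mutate it, and keep the mutant
// only when it adds coverage. Returns the number of edges seen so far.
size_t FuzzLoop(std::vector<Input> &corpus, std::set<int> &seen,
                int iterations, unsigned seed) {
  std::mt19937 rng(seed);
  if (corpus.empty()) corpus.push_back(Input{});  // Start from an empty input.
  for (int i = 0; i < iterations; ++i) {
    Input in = corpus[rng() % corpus.size()];
    if (in.empty() || rng() % 4 == 0) in.push_back(0);  // Occasionally grow.
    in[rng() % in.size()] = static_cast<uint8_t>(rng() % 256);  // Flip a byte.
    AddIfNewCoverage(in, corpus, seen);
  }
  return seen.size();
}
```

Because every mutant that matches one more byte of the magic prefix is kept and mutated further, the search proceeds one byte at a time instead of having to guess all four bytes at once. That is the intuition behind coverage guidance.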

Companion memory tools

How to see the invisible

Sanitizers are not required for fuzz testing, but adding them uncovers many bugs that are otherwise hard to find.

AddressSanitizer is not the only dynamic testing tool that can be combined with fuzzing. For example, add -fsanitize=signed-integer-overflow -fno-sanitize-recover=all to the build flags.

In some cases you may want to run fuzzing without any additional tool (e.g. a sanitizer). This will allow you to find only the simplest bugs (null dereferences, assertion failures) but will run faster. Later you may run a sanitized build on the generated corpus to find more bugs. The downside is that you may miss some bugs this way.
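To make the trade-off concrete, here is a small sketch of a bug that usually stays invisible without a sanitizer. `SumPayload` and its length-prefixed format are hypothetical, invented only for illustration. On a malformed input the loop reads past the end of the buffer; a plain build will often just return a garbage sum, whereas a build with -fsanitize=address is expected to report the out-of-bounds read.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical parser: byte 0 holds a length N, followed by N payload bytes.
// Returns the sum of the payload bytes.
uint32_t SumPayload(const uint8_t *data, size_t size) {
  if (size == 0) return 0;
  size_t n = data[0];
  uint32_t sum = 0;
  // BUG (intentional): n is never checked against size - 1, so a malformed
  // input such as {200, 1} reads far past the end of the buffer.
  for (size_t i = 0; i < n; ++i) sum += data[1 + i];
  return sum;
}

// Fuzz target that exercises the parser with whatever the engine generates.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  SumPayload(Data, Size);
  return 0;
}
```

Well-formed inputs such as {2, 5, 7} behave correctly, which is why ordinary tests tend to miss this class of bug.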

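Using LibFuzzer¹²

To fuzz a program, the user writes an LLVMFuzzerTestOneInput function. libFuzzer generates the test data (Data) and its length (Size); the function only has to hand this data to the program under test, while the libFuzzer engine tries to trigger as much of the code's logic as possible. As a testing engine, libFuzzer is responsible for generating the test data and also provides a mechanism for detecting abnormal behavior.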

Nothing in these fuzz targets ties them to libFuzzer: there is just one function that takes an array of bytes as a parameter. So it is possible, and even desirable, to fuzz the same targets with other fuzzing engines.

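// fuzz_target.cc
#include <cstddef>  // size_t
#include <cstdint>  // uint8_t

// Entry point that libFuzzer calls once for every generated input.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  DoSomethingInterestingWithMyAPI(Data, Size);
  return 0;  // Non-zero return values are reserved for future use.
}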
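Since the target is just a function that takes bytes, it can also be driven without any fuzzing engine, for example to replay saved inputs as a regression test. The sketch below assumes a hypothetical helper `RunTargetOnFiles` and uses a stand-in target body so that it compiles on its own; a real build would link the actual fuzz target instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Stand-in fuzz target; a real one would call the API under test.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  (void)Data;
  (void)Size;
  return 0;
}

// Runs the target once per readable file and returns how many were executed.
size_t RunTargetOnFiles(const std::vector<std::string> &paths) {
  size_t executed = 0;
  for (const std::string &path : paths) {
    std::ifstream file(path, std::ios::binary);
    if (!file) continue;  // Skip files that cannot be opened.
    std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(file)),
                               std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(bytes.data(), bytes.size());
    ++executed;
  }
  return executed;
}
```

A small main() could pass its command-line arguments to RunTargetOnFiles, which gives a plain binary that replays a corpus without libFuzzer.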

Compile the fuzz target and link it with the libFuzzer.a library:
clang++ -g -std=c++11 -fsanitize=address -fsanitize-coverage=trace-pc-guard fourth_fuzzer.cc ../../libFuzzer/libFuzzer.a -o fourth_fuzzer
Note that the build flags enable the memory tools mentioned earlier; the sanitizer detects memory errors and other anomalies at run time.
Then simply run it:
mkdir corpus3 && ./fourth_fuzzer corpus3/ -max_len=1024

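Summary

In effect, libFuzzer already provides the core of a fuzzer (the sample generation engine and the anomaly detection system). What is left for us is to hand the data that libFuzzer generates to the target program according to the target's own logic, which matches the definition of fuzzing itself.

Advanced fuzzer usage

Seed corpus: adding seed samples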

mkdir corpus1
./Fuzzer -max_total_time=60 -max_len=1024 -print_final_stats=1 -dict=./dictionary/path corpus1

A fuzzer binary accepts multiple directories as arguments:

./woff2-2016-05-06-fsanitize_fuzzer MY_CORPUS/ seeds/

When a libFuzzer-based fuzzer is executed with one or more directories as arguments, it first reads files from every directory recursively and executes the target function on all of them. Then, any input that triggers interesting (new) code path(s) is written back into the first corpus directory (in this case, MY_CORPUS).

Using a dictionary

Another important way to improve fuzzing efficiency is to use a dictionary. This works well if the input format being fuzzed consists of tokens or has lots of magic values.
./libxml2-v2.9.2-fsanitize_fuzzer -dict=afl/dictionaries/xml.dict -jobs=8 -workers=8 CORPUS
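Why a dictionary helps can be seen from a toy mutation. This is an illustration only, not libFuzzer's actual mutator, and `InsertToken` is a hypothetical helper: it splices a whole dictionary token into an input in one step, something byte-level mutations would need many lucky guesses to achieve.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns a copy of `input` with `token` inserted at byte offset `pos`.
// Offsets past the end are clamped, so the token is appended.
std::vector<uint8_t> InsertToken(const std::vector<uint8_t> &input,
                                 const std::string &token, size_t pos) {
  if (pos > input.size()) pos = input.size();
  const std::ptrdiff_t split = static_cast<std::ptrdiff_t>(pos);
  std::vector<uint8_t> out(input.begin(), input.begin() + split);
  out.insert(out.end(), token.begin(), token.end());
  out.insert(out.end(), input.begin() + split, input.end());
  return out;
}
```

For an XML parser, for instance, a token such as "<!DOCTYPE" inserted this way immediately reaches code that random bytes would rarely hit.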

Parallel runs

Another way to increase the fuzzing efficiency is to use more CPUs. If you run the fuzzer with -jobs=N it will spawn N independent jobs but no more than half of the number of cores you have; use -workers=M to set the number of allowed parallel jobs.
./libxml2-v2.9.2-fsanitize_fuzzer -dict=afl/dictionaries/xml.dict -jobs=8 -workers=8 CORPUS
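The interaction between the two flags can be written down as a tiny function. This is a sketch of the rule as described above, assuming that an unset -workers is represented as 0; `EffectiveWorkers` is a hypothetical name, not a libFuzzer API.

```cpp
#include <algorithm>

// Number of jobs that run in parallel: an explicit -workers value wins,
// otherwise at most half of the CPU cores (and never more than -jobs).
unsigned EffectiveWorkers(unsigned jobs, unsigned workers, unsigned cores) {
  if (workers != 0) return workers;
  return std::min(jobs, cores / 2);
}
```

Under this rule, -jobs=8 alone on an 8-core machine would run 4 jobs at a time, which is presumably why the command above also passes -workers=8.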

Minimizing the corpus

-merge=1
mkdir corpus1_min
./Fuzzer -merge=1 corpus1_min corpus1
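Conceptually, minimization keeps only the inputs that contribute coverage nothing else already provides. The following is a simplified sketch of that idea, not libFuzzer's real merge algorithm; `Sample` and `MinimizeCorpus` are hypothetical, and coverage is modeled as a set of integer edge IDs.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct Sample {
  std::string name;     // File name of the input.
  size_t size;          // Size of the input in bytes.
  std::set<int> edges;  // Hypothetical IDs of the code edges it reaches.
};

// Greedy minimization: visit samples from smallest to largest and keep a
// sample only if it reaches at least one edge not yet covered.
std::vector<std::string> MinimizeCorpus(std::vector<Sample> samples) {
  std::stable_sort(
      samples.begin(), samples.end(),
      [](const Sample &a, const Sample &b) { return a.size < b.size; });
  std::set<int> covered;
  std::vector<std::string> kept;
  for (const Sample &s : samples) {
    bool adds_coverage = false;
    for (int e : s.edges)
      if (covered.insert(e).second) adds_coverage = true;
    if (adds_coverage) kept.push_back(s.name);
  }
  return kept;
}
```

Visiting small inputs first favors a corpus made of short files, which keeps later fuzzing runs fast.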

Code coverage

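-dump_coverage (recent libFuzzer releases list this flag as deprecated and document Clang's source-based coverage as the alternative)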
./Fuzzer corpus1_min -runs=0 -dump_coverage=1

¹ https://www.secpulse.com/archives/71898.html
² https://github.com/Dor1s/libfuzzer-workshop
