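This helper is written in Rust and consists mainly of main.rs and symcc.rs.
Setting up the project means adding packages and a few dependencies; see the official Cargo guide for details. When I read it last year it was fairly extensive, but this year for some reason only four chapters seem to be left. Back then you could edit the examples and run them online, which was nice. I will add other good tutorials when I find them.
For the APIs used here, go straight to the official Rust documentation.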
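How it works
The rough idea: test cases produced by symbolic execution are handed to the fuzzer, which keeps mutating and executing them. When the fuzzer finds no new paths, the helper goes back over the test cases the fuzzer has produced, picks the one most likely to lead to new paths, and feeds it to SymCC. Repeating this cycle speeds up the whole process.
Now for the code. The natural order is to start with main and work downward; where the code is contiguous I have pasted it in full.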
main.rs
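This is a derive attribute; with it, the struct can be used to obtain the command-line arguments.
#[derive(Debug, StructOpt)] // StructOpt: parse command-line arguments by defining a struct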
#[structopt(about = "Make SymCC collaborate with AFL.", no_version)]
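Next is the statistics struct. It tracks the time spent running SymCC, the time spent in the solver, and the number of successful and failed executions.
Its implementation is simple: it accumulates the numbers and writes them to a log.
struct Stats { // tracks timing and the number of successful and failed executions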
/// Number of successful executions.
total_count: u32,
/// Time spent in successful executions of SymCC.
total_time: Duration,
/// Time spent in the solver as part of successfully running SymCC.
solver_time: Option<Duration>,
/// Number of failed executions.
failed_count: u32,
/// Time spent in failed SymCC executions.
failed_time: Duration,
}
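The implementation of Stats is not pasted here, so the following is only a minimal sketch of how the accumulation might look. `RunResult` is a hypothetical stand-in for SymCCResult, and the rule that a killed run counts as a failure is an assumption, as is the `average_time` helper.

```rust
use std::time::Duration;

/// Hypothetical stand-in for SymCCResult, reduced to the fields used below.
struct RunResult {
    killed: bool,
    time: Duration,
    solver_time: Option<Duration>,
}

#[derive(Default)]
struct Stats {
    total_count: u32,
    total_time: Duration,
    solver_time: Option<Duration>,
    failed_count: u32,
    failed_time: Duration,
}

impl Stats {
    /// Record one execution. Assumption: a killed run is a failure,
    /// anything else is a success.
    fn add_execution(&mut self, result: &RunResult) {
        if result.killed {
            self.failed_count += 1;
            self.failed_time += result.time;
        } else {
            self.total_count += 1;
            self.total_time += result.time;
            // Accumulate solver time only when the backend reported one.
            if let Some(t) = result.solver_time {
                self.solver_time = Some(self.solver_time.unwrap_or_default() + t);
            }
        }
    }

    /// Average time per successful execution, if there was at least one.
    fn average_time(&self) -> Option<Duration> {
        if self.total_count == 0 {
            None
        } else {
            Some(self.total_time / self.total_count)
        }
    }
}
```

Keeping solver_time as an Option lets the log distinguish "the backend never reported a solver time" from "the solver took zero time".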
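Next, look at the implementation of the State struct.
The main piece is test_input(), which runs a single input through SymCC and processes the new test cases it generates. The test cases are sorted into different directories depending on the outcome; the function that handles each generated test case is process_new_testcase().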
impl State {
/// Initialize the run-time environment in the given output directory.
///
/// This involves creating the output directory and all required
/// subdirectories.
fn initialize(output_dir: impl AsRef<Path>) -> Result<Self> { // mirrors the fuzzer's directory layout: SymCC gets the same subdirectories
let symcc_dir = output_dir.as_ref();
// Create SymCC's directories and files, keeping the same structure as the fuzzer
fs::create_dir(&symcc_dir).with_context(|| {
format!("Failed to create SymCC's directory {}", symcc_dir.display())
})?;
let symcc_queue =
TestcaseDir::new(symcc_dir.join("queue")).context("Failed to create SymCC's queue")?;
let symcc_hangs = TestcaseDir::new(symcc_dir.join("hangs"))?;
let symcc_crashes = TestcaseDir::new(symcc_dir.join("crashes"))?;
let stats_file = File::create(symcc_dir.join("stats"))?;
Ok(State { // everything was created successfully; return the State
current_bitmap: AflMap::new(), // fresh, empty AflMap
processed_files: HashSet::new(),
queue: symcc_queue,
hangs: symcc_hangs,
crashes: symcc_crashes,
stats: Default::default(), // Is this bad style?
last_stats_output: Instant::now(),
stats_file,
})
}
/// Run a single input through SymCC and process the new test cases it
/// generates.
fn test_input(
&mut self,
input: impl AsRef<Path>,
symcc: &SymCC,
afl_config: &AflConfig,
) -> Result<()> {
log::info!("Running on input {}", input.as_ref().display());
let tmp_dir = tempdir() // create a temporary directory to hold the output
.context("Failed to create a temporary directory for this execution of SymCC")?;
let mut num_interesting = 0u64;
let mut num_total = 0u64;
// Run SymCC; this returns a SymCCResult
let symcc_result = symcc
.run(&input, tmp_dir.path().join("output"))
.context("Failed to run SymCC")?;
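for new_test in symcc_result.test_cases.iter() { // handle each generated test case and decide which directory it goes into
let res = process_new_testcase(&new_test, &input, &tmp_dir, &afl_config, self)?;
// Unique crashes do not seem to be distinguished here; one could run another merge within crashes to check whether the paths are the same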
num_total += 1;
if res == TestcaseResult::New {
log::debug!("Test case is interesting");
num_interesting += 1;
}
}
log::info!(
"Generated {} test cases ({} new)",
num_total,
num_interesting
); // this keeps printing to the screen, because main calls it in a loop
if symcc_result.killed {
log::info!(
"The target process was killed (probably timeout or out of memory); \
archiving to {}",
self.hangs.path.display()
);
symcc::copy_testcase(&input, &mut self.hangs, &input) // archive the input under hangs
.context("Failed to archive the test case")?;
}
self.processed_files.insert(input.as_ref().to_path_buf()); // add the processed input to the HashSet
self.stats.add_execution(&symcc_result); // update timing and counts from the symbolic execution result
Ok(())
}
}
Now for process_new_testcase().
There is nothing difficult about it.
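Since process_new_testcase() is not pasted here, the following is a rough sketch of the classification it presumably performs: replay the test case under afl-showmap, then file it under queue, hangs, or crashes. Everything is kept in memory for illustration. `MiniState`, `ShowmapOutcome`, the u64 coverage bit set, and every `TestcaseResult` variant other than `New` are assumptions, not the helper's actual types.

```rust
/// Outcome of replaying one test case under afl-showmap.
/// Simplification: coverage is a u64 bit set rather than a 65536-byte map.
enum ShowmapOutcome {
    Success(u64),
    Hang,
    Crash,
}

#[derive(Debug, PartialEq)]
enum TestcaseResult {
    Uninteresting,
    New,
    Hang,
    Crash,
}

/// In-memory stand-in for State: accumulated coverage plus three "directories".
#[derive(Default)]
struct MiniState {
    coverage: u64,
    queue: Vec<String>,
    hangs: Vec<String>,
    crashes: Vec<String>,
}

fn process_new_testcase(name: &str, outcome: ShowmapOutcome, state: &mut MiniState) -> TestcaseResult {
    match outcome {
        ShowmapOutcome::Success(map) => {
            // Keep the test case only if it adds coverage bits not seen before.
            let merged = state.coverage | map;
            if merged != state.coverage {
                state.coverage = merged;
                state.queue.push(name.to_string());
                TestcaseResult::New
            } else {
                TestcaseResult::Uninteresting
            }
        }
        ShowmapOutcome::Hang => {
            state.hangs.push(name.to_string());
            TestcaseResult::Hang
        }
        ShowmapOutcome::Crash => {
            state.crashes.push(name.to_string());
            TestcaseResult::Crash
        }
    }
}
```

In this sketch only test cases that contribute new coverage reach the queue, which is what keeps the fuzzer from being flooded with redundant inputs.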
After that, main prepares the arguments and makes the calls:
fn main() -> Result<()> {
let options = CLI::from_args(); // the parsed argument struct
env_logger::builder()
.filter_level(if options.verbose {
log::LevelFilter::Debug // log filter level
} else {
log::LevelFilter::Info
})
.init(); // initialize logging
if !options.output_dir.is_dir() { // check that the output directory exists and is a directory
log::error!(
"The directory {} does not exist!",
options.output_dir.display()
);
return Ok(());
}
let afl_queue = options.output_dir.join(&options.fuzzer_name).join("queue");
if !afl_queue.is_dir() {
log::error!("The AFL queue {} does not exist!", afl_queue.display());
return Ok(());
}
let symcc_dir = options.output_dir.join(&options.name);
if symcc_dir.is_dir() {
log::error!(
"{} already exists; we do not currently support resuming", // a backup could be made here when the directory already exists
symcc_dir.display()
);
return Ok(());
}
// Under AFL's output directory there is a folder named after the fuzzer, containing queue, .cur_input, hangs, crashes, fuzz_bitmap, fuzzer_stats, ...
let symcc = SymCC::new(symcc_dir.clone(), &options.command); // initialize SymCC
log::debug!("SymCC configuration: {:?}", &symcc);
let afl_config = AflConfig::load(options.output_dir.join(&options.fuzzer_name))?; // load the command line that was used to start the fuzzer
log::debug!("AFL configuration: {:?}", &afl_config);
let mut state = State::initialize(symcc_dir)?; // with the configuration loaded, create a similar directory structure
loop {
match afl_config
.best_new_testcase(&state.processed_files) // pick the highest-scoring test case that has not been processed yet
.context("Failed to check for new test cases")?
{
None => {
log::debug!("Waiting for new test cases...");
thread::sleep(Duration::from_secs(5));
}
Some(input) => state.test_input(&input, &symcc, &afl_config)?,
} // run the input under symbolic execution and sort the interesting test cases into the directories; the fuzzer then keeps mutating them
if state.last_stats_output.elapsed().as_secs() > STATS_INTERVAL_SEC {
if let Err(e) = state.stats.log(&mut state.stats_file) {
log::error!("Failed to log run-time statistics: {}", e);
}
state.last_stats_output = Instant::now(); // remember the current time as the time of the last output
}
}
}
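Next come the helper functions that main.rs uses; they live in symcc.rs.
So remember to include that file at the top:
mod symcc; // bring in symcc.rs as a module
Now for symcc.rs.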
symcc.rs
The insert_input_file function
/// Replace @@ with the input file: find the position of @@ and substitute input_file for it.
fn insert_input_file<S: AsRef<OsStr>, P: AsRef<Path>>(
command: &[S],
input_file: P,
) -> Vec<OsString> {
let mut fixed_command: Vec<OsString> = command.iter().map(|s| s.into()).collect();
if let Some(at_signs) = fixed_command.iter_mut().find(|s| *s == "@@") {
*at_signs = input_file.as_ref().as_os_str().to_os_string();
}
fixed_command // the return value, of type Vec<OsString>
}
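To make the substitution concrete, here is a String-based variant of the same function. The switch from OsStr/OsString to &str/String is a simplification for illustration only, and the paths in the usage note are made up.

```rust
/// Replace the first "@@" argument with the input file path.
/// Same logic as above, but on plain strings instead of OsString.
fn insert_input_file(command: &[&str], input_file: &str) -> Vec<String> {
    let mut fixed: Vec<String> = command.iter().map(|s| s.to_string()).collect();
    // find() stops at the first match, so only the first "@@" is replaced.
    if let Some(at_signs) = fixed.iter_mut().find(|s| s.as_str() == "@@") {
        *at_signs = input_file.to_string();
    }
    fixed
}
```

Called with `["./target", "--opt", "@@"]` and `/tmp/.cur_input`, this yields `["./target", "--opt", "/tmp/.cur_input"]`. A command without `@@` comes back unchanged, which is the case where the input is passed on standard input instead.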
Next is AflMap
pub struct AflMap {
data: [u8; 65536], // 65536 entries of type u8
}
impl AflMap { // new, load, merge
/// Create an empty map.
pub fn new() -> AflMap {
AflMap { data: [0; 65536] } // every byte starts at 0
}
/// Load a map from disk.
pub fn load(path: impl AsRef<Path>) -> Result<AflMap> {
let data = fs::read(&path).with_context(|| {
format!(
"Failed to read the AFL bitmap that \
afl-showmap should have generated at {}",
path.as_ref().display()
)
})?;
ensure!(
data.len() == 65536,
"The file to load the coverage map from has the wrong size ({})",
data.len()
);
let mut result = AflMap::new(); // start from an all-zero map of 65536 bytes
result.data.copy_from_slice(&data); // copy the bytes that were read into result
Ok(result) // success
}
/// Merge with another coverage map in place.
///
/// Return true if the map has changed, i.e., if the other map yielded new
/// coverage. In other words, return true if the other map produced new coverage (new paths).
pub fn merge(&mut self, other: &AflMap) -> bool { // merge with another AflMap
let mut interesting = false;
for (known, new) in self.data.iter_mut().zip(other.data.iter()) { // zip: an iterator that walks two other iterators in lockstep
if *known != (*known | new) {
*known |= new; // OR in the new bits; this branch is only reached when `new` has a bit set that `known` lacks
interesting = true;
}
}
/***iter(), which iterates over &T.
iter_mut(), which iterates over &mut T.
into_iter(), which iterates over T. */
interesting
}
}
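A small experiment makes the merge condition easier to see. The sketch below copies the merge logic onto a hypothetical 4-byte `TinyMap` (the real map has 65536 bytes), so the effect of each bit can be followed by hand.

```rust
/// A tiny coverage map with the same merge logic as AflMap, but only 4 bytes long.
struct TinyMap {
    data: [u8; 4],
}

impl TinyMap {
    /// Merge `other` into `self`; return true if any byte gained a new bit.
    fn merge(&mut self, other: &TinyMap) -> bool {
        let mut interesting = false;
        for (known, new) in self.data.iter_mut().zip(other.data.iter()) {
            // The OR differs from `known` only if `new` carries an unseen bit.
            if *known != (*known | new) {
                *known |= new;
                interesting = true;
            }
        }
        interesting
    }
}
```

Starting from `[0b0001, 0, 0, 0]`, merging `[0b0010, 0, 0, 0]` returns true and leaves `[0b0011, 0, 0, 0]`. Merging `[0b0001, 0, 0, 0]` afterwards returns false, because that bit is already known.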
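Next is the test case score. I did not fully understand how the score is computed at first.
It is used by max_by_key() inside afl_config.best_new_testcase(), but what gets passed as the key is a whole object. max_by_key() requires the key type to implement Ord, so TestcaseScore must define an ordering somewhere (presumably through a derive that is not shown in this excerpt).
impl TestcaseScore { // build a score from a path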
/// Score a test case.
///
/// If anything goes wrong, return the minimum score.
fn new(t: impl AsRef<Path>) -> Self {
let size = match fs::metadata(&t) { // follows symbolic links to query information about the target file
Err(e) => { // possible errors: the path does not exist, or permission is denied
// Has the file disappeared?
log::warn!(
"Warning: failed to score test case {}: {}",
t.as_ref().display(),
e
);
return TestcaseScore::minimum();
}
Ok(meta) => meta.len(),
};
let name: OsString = match t.as_ref().file_name() {
None => return TestcaseScore::minimum(),
Some(n) => n.to_os_string(), // converts the OsStr into an owned OsString; Some is the arm for a present value
};
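let name_string = name.to_string_lossy(); // lossy conversion to Unicode text
TestcaseScore { // build the score
new_coverage: name_string.ends_with("+cov"), // true if the name ends with this suffix
derived_from_seed: name_string.contains("orig:"), // true if the name contains this marker
file_size: -i128::from(size), // negated file size, so a smaller file gets a larger value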
base_name: name,
}
}
/// Return the smallest possible score.
fn minimum() -> TestcaseScore {
TestcaseScore {
new_coverage: false,
derived_from_seed: false,
file_size: std::i128::MIN, // smallest i128: -170141183460469231731687303715884105728 (the largest is 170141183460469231731687303715884105727)
base_name: OsString::from(""),
}
}
}
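On the question of how a whole struct can serve as a score: if the struct derives Ord, Rust compares the fields in declaration order, like a tuple. The sketch below assumes that TestcaseScore does this (the derive line is not shown in the excerpt above, so treat it as an assumption). `Score`, `score`, and `best` are hypothetical names, and the file size is passed in directly instead of being read from disk.

```rust
/// Field order matters: a derived Ord compares fields from top to bottom.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Score {
    new_coverage: bool,      // compared first; false < true
    derived_from_seed: bool, // compared second
    file_size: i128,         // negated size, so smaller files compare greater
    base_name: String,       // final tie-breaker
}

fn score(name: &str, size: u64) -> Score {
    Score {
        new_coverage: name.ends_with("+cov"),
        derived_from_seed: name.contains("orig:"),
        file_size: -i128::from(size),
        base_name: name.to_string(),
    }
}

/// Pick the candidate with the greatest score, as max_by_key does in best_new_testcase.
fn best<'a>(candidates: &[(&'a str, u64)]) -> Option<&'a str> {
    candidates
        .iter()
        .max_by_key(|(name, size)| score(name, *size))
        .map(|(name, _)| *name)
}
```

Under this ordering a 500-byte file whose name ends in `+cov` beats a 10-byte file without it, and between two files with equal flags the smaller one wins.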
Next, the copy_testcase() function
pub fn copy_testcase( // copies a test case into the target directory, naming it after the directory's next ID and the parent's ID
testcase: impl AsRef<Path>,
target_dir: &mut TestcaseDir,
parent: impl AsRef<Path>,
) -> Result<()> {
let orig_name = parent
.as_ref()
.file_name() // the final component of the path, if any: the file name for a file, the directory name for a directory
.expect("The input file does not have a name")
.to_string_lossy();
ensure!( // the name must start with "id:", e.g. id:000000
orig_name.starts_with("id:"), // AFL test case names look like id:000000,orig:dtd1
"The name of test case {} does not start with an ID",
parent.as_ref().display()
);
// the fuzzer names its files like: id:000000,79946,sig:11,src:000001,op:havoc,rep:4
if let Some(orig_id) = orig_name.get(3..9) { // take the six characters after "id:", e.g. 000000
let new_name = format!("id:{:06},src:{}", target_dir.current_id, &orig_id); // new name: id:<current_id>,src:<orig_id>
let target = target_dir.path.join(new_name); // full target path
log::debug!("Creating test case {}", target.display());
fs::copy(testcase.as_ref(), target).with_context(|| { // fs::copy(from, to) copies the contents of one file to another
format!(
"Failed to copy the test case {} to {}",
testcase.as_ref().display(),
target_dir.path.display()
)
})?;
target_dir.current_id += 1;
} else {
bail!(
"Test case {} does not contain a proper ID",
parent.as_ref().display()
);
}
Ok(())
}
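The naming rule in copy_testcase() can be isolated into a pure function, which makes it easy to try on a few names. `derived_name` is a hypothetical helper written for this note; it returns None where the original reports an error.

```rust
/// Build the name for a copied test case: "id:<next id>,src:<parent id>".
/// Returns None if the parent name does not start with "id:" followed by
/// at least six more bytes.
fn derived_name(parent_name: &str, next_id: u32) -> Option<String> {
    if !parent_name.starts_with("id:") {
        return None;
    }
    // Bytes 3..9 hold the six-digit ID that follows "id:".
    let parent_id = parent_name.get(3..9)?;
    Some(format!("id:{:06},src:{}", next_id, parent_id))
}
```

For the parent `id:000042,src:000001,op:havoc,rep:4` and a next ID of 7, the result is `id:000007,src:000042`, so each SymCC-generated file records which fuzzer input it came from.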
Next is AflConfig
impl AflConfig {
/// Read the AFL configuration from a fuzzer instance's output directory.
/// Once AFL is running, its command line (paths and other parameters) is read from the fuzzer_stats file.
pub fn load(fuzzer_output: impl AsRef<Path>) -> Result<Self> {
let afl_stats_file_path = fuzzer_output.as_ref().join("fuzzer_stats"); // the stats file
let mut afl_stats_file = File::open(&afl_stats_file_path).with_context(|| {
format!(
"Failed to open the fuzzer's stats at {}",
afl_stats_file_path.display()
)
})?;
let mut afl_stats = String::new();
afl_stats_file
.read_to_string(&mut afl_stats) // afl_stats is the buffer being read into
.with_context(|| {
format!(
"Failed to read the fuzzer's stats at {}",
afl_stats_file_path.display()
)
})?;
let afl_command: Vec<_> = afl_stats
.lines()
.find(|&l| l.starts_with("command_line")) // find the command line, e.g. command_line:afl-fuzz -M master -i in -o out -t 2000+ binary --valid --recover @@
.expect("The fuzzer stats don't contain the command line")
.splitn(2, ':') // split at the first colon
.nth(1) // take the second piece, i.e. everything after the colon
.expect("The fuzzer stats follow an unknown format")
.trim() // strip leading and trailing whitespace
.split_whitespace() // split the string on whitespace
.collect(); // a Vec; iterating afl_command yields each argument
let afl_target_command: Vec<_> = afl_command
.iter()
.skip_while(|s| **s != "--") // skip everything before the first "--" (the "--" itself is kept)
.map(OsString::from)
.collect();
let afl_binary_dir = Path::new( // from the command line above, element 0 is the path of afl-fuzz
afl_command
.get(0)
.expect("The AFL command is unexpectedly short"),
)
.parent() // its parent directory
.unwrap(); // panics if the path has no parent
Ok(AflConfig {
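show_map: afl_binary_dir.join("afl-showmap"), // path of afl-showmap, expected in the same directory as afl-fuzz
use_standard_input: !afl_target_command.contains(&"@@".into()), // true when the target command has no @@, i.e. the input goes through standard input
use_qemu_mode: afl_command.contains(&"-Q".into()), // -Q: QEMU mode (binary-only targets)
target_command: afl_target_command, // the target command and its arguments
queue: fuzzer_output.as_ref().join("queue"), // the fuzzer's queue directory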
})
}
/// Return the most promising unseen test case of this fuzzer.
/// Returns the most promising test case that has not been seen yet (compare AFL's save_if_interesting).
///
pub fn best_new_testcase(&self, seen: &HashSet<PathBuf>) -> Result<Option<PathBuf>> {
let best = fs::read_dir(&self.queue) // read the entries of the fuzzer's queue directory
.with_context(|| {
format!(
"Failed to open the fuzzer's queue at {}",
self.queue.display()
)
})?
.collect::<io::Result<Vec<_>>>() // collect into a Vec
.with_context(|| {
format!(
"Failed to read the fuzzer's queue at {}",
self.queue.display()
)
})?
.into_iter()
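.map(|entry| entry.path()) // map each directory entry to its path
.filter(|path| path.is_file() && !seen.contains(path)) // keep only regular files that have not been seen
.max_by_key(|path| TestcaseScore::new(path)); // the element with the largest key, i.e. the highest score (keys are compared through TestcaseScore's Ord implementation, not by memory layout)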
Ok(best)
}
pub fn run_showmap( // traces the execution path of a single test case and returns the outcome as an AflShowmapResult
&self,
testcase_bitmap: impl AsRef<Path>, // Path and PathBuf relate to each other like str and String
testcase: impl AsRef<Path>,
) -> Result<AflShowmapResult> {
let mut afl_show_map = Command::new(&self.show_map); // build the command that runs afl-showmap
if self.use_qemu_mode { // in QEMU mode, pass -Q
afl_show_map.arg("-Q");
}
afl_show_map
.args(&["-t", "5000", "-m", "none", "-b", "-o"])
.arg(testcase_bitmap.as_ref())
.args(insert_input_file(&self.target_command, &testcase)) // substitute the test case path for @@
.stdout(Stdio::null()) // discard the output; Stdio::null is safe here, whereas piped could block if too many bytes are written and never read
.stderr(Stdio::null())
.stdin(if self.use_standard_input { // in this case the test case is written to stdin
Stdio::piped()
} else {
Stdio::null()
});
log::debug!("Running afl-showmap as follows: {:?}", &afl_show_map);
let mut afl_show_map_child = afl_show_map.spawn().context("Failed to run afl-showmap")?; // spawn afl-showmap as a child process
// Command::spawn() returns a handle to the running child
if self.use_standard_input {
io::copy( // copy the entire contents of the reader (the test case file) into the writer (afl-showmap's stdin)
&mut File::open(&testcase)?,
afl_show_map_child
.stdin
.as_mut()
.expect("Failed to open the stardard input of afl-showmap"),
)
.context("Failed to pipe the test input to afl-showmap")?;
}
let afl_show_map_status = afl_show_map_child
.wait()
.context("Failed to wait for afl-showmap")?; // wait() returns a process::ExitStatus once the child finishes; code 0 means success
log::debug!("afl-showmap returned {}", &afl_show_map_status);
match afl_show_map_status // dispatch on the exit code, much like a switch
.code()
.expect("No exit code available for afl-showmap")
{
0 => {
let map = AflMap::load(&testcase_bitmap).with_context(|| { // load the bitmap from disk
format!(
"Failed to read the AFL bitmap that \
afl-showmap should have generated at {}",
testcase_bitmap.as_ref().display()
)
})?;
Ok(AflShowmapResult::Success(Box::new(map))) // the map was loaded successfully
}
1 => Ok(AflShowmapResult::Hang),
2 => Ok(AflShowmapResult::Crash),
unexpected => panic!("Unexpected return code {} from afl-showmap", unexpected), // other exit codes are not analyzed further
}
}
}
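The command-line parsing in AflConfig::load is easier to follow on a concrete fuzzer_stats snippet. The sketch below works on an in-memory string instead of a file and returns Option instead of panicking. `afl_command` and `target_command` are hypothetical helper names, and the sample paths in the usage note are made up.

```rust
/// Extract the whitespace-separated AFL command from the text of a fuzzer_stats file.
fn afl_command(stats: &str) -> Option<Vec<&str>> {
    let line = stats.lines().find(|l| l.starts_with("command_line"))?;
    // splitn(2, ':') yields the key and then everything after the first colon.
    let value = line.splitn(2, ':').nth(1)?;
    Some(value.split_whitespace().collect())
}

/// Everything from the first "--" onward; skip_while keeps the "--" itself.
fn target_command<'a>(command: &[&'a str]) -> Vec<&'a str> {
    command.iter().copied().skip_while(|s| *s != "--").collect()
}
```

For a stats line such as `command_line      : /opt/afl/afl-fuzz -M master -i in -o out -- ./binary --valid @@`, the target command comes out as `["--", "./binary", "--valid", "@@"]`, and the parent directory of element 0 gives `/opt/afl`, where afl-showmap is then looked up. If the fuzzer was started without a `--` separator, skip_while consumes every argument and the target command is empty, so this logic only works when `--` is present.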
Next is the wrapper around symbolic execution
impl SymCC {
/// Create a new SymCC configuration.
pub fn new(output_dir: PathBuf, command: &[String]) -> Self {
let input_file = output_dir.join(".cur_input");
SymCC {
use_standard_input: !command.contains(&String::from("@@")),
bitmap: output_dir.join("bitmap"),
command: insert_input_file(command, &input_file),
input_file,
}
} // sets up the bitmap path and the command line
/// Extract the solver time from the log output.
fn parse_solver_time(output: Vec<u8>) -> Option<Duration> {
let re = Regex::new(r#""solving_time": (\d+)"#).unwrap(); // regular expression for the solving_time field
output
// split into lines
.rsplit(|n| *n == b'\n')
// convert to string
.filter_map(|s| str::from_utf8(s).ok())
// check that it's an SMT log line
.filter(|s| s.trim_start().starts_with("[STAT] SMT:")) // the solver prints its statistics with this prefix
// find the solving_time element
.filter_map(|s| re.captures(s)) // captures(): the capture groups of a single match
// convert the time to an integer
.filter_map(|c| c[1].parse().ok())
// associate the integer with a unit
.map(Duration::from_micros) // the value is in microseconds
// get the first one
.next()
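} // returns the SMT solving time parsed from the output
pub fn run( // runs SymCC with the required parameters set up and returns a SymCCResult: total execution time, solver time, the newly generated test cases, and whether the process was killed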
&self,
input: impl AsRef<Path>,
output_dir: impl AsRef<Path>,
) -> Result<SymCCResult> {
fs::copy(&input, &self.input_file).with_context(|| { // copy the given input to input_file
format!(
"Failed to copy the test case {} to our workbench at {}",
input.as_ref().display(),
self.input_file.display()
)
})?;
fs::create_dir(&output_dir).with_context(|| { // create the output directory
format!(
"Failed to create the output directory {} for SymCC",
output_dir.as_ref().display()
)
})?;
let mut analysis_command = Command::new("timeout");
analysis_command
.args(&["-k", "5", &TIMEOUT.to_string()]) // TIMEOUT is 90
.args(&self.command)
.env("SYMCC_ENABLE_LINEARIZATION", "1") // enable linearization
.env("SYMCC_AFL_COVERAGE_MAP", &self.bitmap) // location of the coverage map file
.env("SYMCC_OUTPUT_DIR", output_dir.as_ref()) // where the generated test cases are written
.stdout(Stdio::null())
.stderr(Stdio::piped()); // capture SMT logs
if self.use_standard_input {
analysis_command.stdin(Stdio::piped());
} else {
analysis_command.stdin(Stdio::null());
analysis_command.env("SYMCC_INPUT_FILE", &self.input_file); // tell SymCC which file holds the input
}
log::debug!("Running SymCC as follows: {:?}", &analysis_command);
let start = Instant::now(); // Instant: a measurement of a monotonically increasing clock; opaque and useful only together with Duration
let mut child = analysis_command.spawn().context("Failed to run SymCC")?; // spawn SymCC as a child process
if self.use_standard_input {
io::copy( // stream the input into SymCC's stdin
&mut File::open(&self.input_file).with_context(|| {
format!(
"Failed to read the test input at {}",
self.input_file.display()
)
})?,
child
.stdin
.as_mut()
.expect("Failed to pipe to the child's standard input"),
)
.context("Failed to pipe the test input to SymCC")?;
}
let result = child
.wait_with_output() // wait for SymCC to finish; returns an Output with the exit status and the captured stdout and stderr
.context("Failed to wait for SymCC")?;
let total_time = start.elapsed(); // the time elapsed since `start` was created
let killed = match result.status.code() {
Some(code) => {
log::debug!("SymCC returned code {}", code);
(code == 124) || (code == -9) // as per the man-page of timeout
}
None => { // None: the process was terminated by a signal
let maybe_sig = result.status.signal(); // the signal that terminated the process, if any
if let Some(signal) = maybe_sig {
log::warn!("SymCC received signal {}", signal);
}
maybe_sig.is_some()
}
};
let new_tests = fs::read_dir(&output_dir) // read the test cases produced in the output directory
.with_context(|| {
format!(
"Failed to read the generated test cases at {}",
output_dir.as_ref().display()
)
})?
.collect::<io::Result<Vec<_>>>()
.with_context(|| {
format!(
"Failed to read all test cases from {}",
output_dir.as_ref().display()
)
})?
.iter()
.map(|entry| entry.path())
.collect(); // a Vec of test case paths
// Process configuration structure, for reference
/*
pub struct ProcessConfig<'a> {
program: &'a str,
args: &'a [~str],
env: Option<&'a [(~str, ~str)]>,
cwd: Option<&'a Path>,
stdin: StdioContainer,
stdout: StdioContainer,
stderr: StdioContainer,
extra_io: &'a [StdioContainer],
uid: Option<uint>,
gid: Option<uint>,
detach: bool,
}
*/
// The solver prints its time with LOG_STAT("SMT: { \"solving_time\": " + decstr(solving_time_) + " }\n");
let solver_time = SymCC::parse_solver_time(result.stderr); // parsed from the captured stderr, in the format shown above
if solver_time.is_some() && solver_time.unwrap() > total_time { // is_some(): true if the Option is a Some value
log::warn!("Backend reported inaccurate solver time!");
}
Ok(SymCCResult { // SymCC's result after the run
test_cases: new_tests,
killed,
time: total_time,
solver_time: solver_time.map(|t| cmp::min(t, total_time)),
})
}
}
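To see what parse_solver_time extracts, here is a sketch that uses only the standard library. Plain string search replaces the regex crate, which is a substitution made for this note and not what the helper does. Like the rsplit in the original, it walks the lines from the end, so it reports the last "[STAT] SMT:" line in the output.

```rust
use std::time::Duration;

/// Find the last "[STAT] SMT:" line and read the integer that follows
/// "solving_time": as a number of microseconds.
fn parse_solver_time(output: &str) -> Option<Duration> {
    const KEY: &str = "\"solving_time\": ";
    output
        .lines()
        .rev() // start from the end, like rsplit in the original
        .filter(|l| l.trim_start().starts_with("[STAT] SMT:"))
        .filter_map(|l| {
            // Slice off everything up to and including the key.
            let rest = &l[l.find(KEY)? + KEY.len()..];
            let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
            digits.parse::<u64>().ok()
        })
        .map(Duration::from_micros)
        .next()
}
```

Given two log lines with solving times 100 and 2500, the function returns 2500 microseconds, and it returns None when no SMT line is present.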
That is about it.