angr Symbolic Execution Example Walkthrough: 0ctf_trace

FunctionY

Article link: https://blog.csdn.net/doudoudouzoule/article/details/79550902


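In this challenge we are given a text file containing the trace of a program execution. The file has two columns: the address and the instruction executed. So we know every instruction that ran and which branches were taken, but we do not know the initial data.
Reversing shows that a buffer on the stack is first initialized with a known constant string, then an unknown string (the flag) is appended to it, and finally the whole buffer is sorted with some variant of quicksort. We need to recover the flag somehow.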

angr handles this easily: we only have to steer it to the correct address at every branch, and the solver then finds the flag right away.

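First, look at the two files that come with the challenge. The trace log is a text file, and data.bin is the binary assembled from the instructions in the trace; it is a raw blob with no entry point.
I could not read the instructions in the trace at first. After looking them up online, they turned out to be MIPS (clearly I need to catch up on reversing other architectures).
A quick search shows the mnemonics are fairly easy to follow: i stands for immediate and u for unsigned. Note that in addiu the u only means the addition does not trap on overflow; the 16-bit immediate is still sign-extended before it is added to the register.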
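To make the addiu semantics concrete, here is a minimal Python sketch of what the instruction computes. The helper names sign_extend_16 and addiu are made up for this illustration and are not part of angr or of the challenge files. The point it shows is that the 16-bit immediate is sign-extended, and the result simply wraps at 32 bits instead of trapping.

```python
def sign_extend_16(imm):
    """Interpret the low 16 bits of imm as a two's-complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def addiu(rs_value, imm16):
    """Model MIPS `addiu rt, rs, imm`: rt = rs + sign_extend(imm), wrapping at 32 bits."""
    return (rs_value + sign_extend_16(imm16)) & 0xFFFFFFFF


# A typical stack-frame setup, addiu sp, sp, -32, encodes -32 as 0xFFE0.
print(hex(addiu(0x7FFF0000, 0xFFE0)))  # 0x7ffeffe0

# Overflow does not trap; the result wraps around.
print(hex(addiu(0xFFFFFFFF, 1)))  # 0x0
```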

For more on the MIPS instruction set, see: http://blog.csdn.net/flyingqr/article/details/7072977

The code of solve.py is:

#!/usr/bin/env python2

"""
In this challenge we're given a text file with trace of a program execution. The file has
two columns, address and instruction executed. So we know all the instructions being executed,
and which branches were taken. But the initial data is not known.

Reversing reveals that a buffer on the stack is initialized with known constant string first,
then an unknown string is appended to it (the flag), and finally it's sorted with some
variant of quicksort. And we need to find the flag somehow.

angr easily solves this problem. We only have to direct it to the right direction
at every branch, and solver finds the flag at a glance.
"""

from __future__ import print_function

import struct

import angr

MAIN_START = 0x4009d4
MAIN_END = 0x00400c18

FLAG_LOCATION = 0x400D80
FLAG_PTR_LOCATION = 0x410EA0

def load_trace():
    res = []
    delay_slots = set()
    with open("./trace_8339a701aae26588966ad9efa0815a0a.log") as f:
        for line in f:
            if line.startswith('[INFO]'):
                addr = int(line[6:6+8], 16)

                res.append(addr)

                # every command like this is in delay slot
                # (in this particular binary)
                if ("move r1, r1" in line):
                    delay_slots.add(addr)

    return res, delay_slots

def main():
    trace_log, delay_slots = load_trace()

    # data.bin is simply the binary assembled from trace,
    # starting on 0x400770
    project = angr.Project("./data.bin", load_options={
        'main_opts': {
            'backend': 'blob',
            'custom_base_addr': 0x400770, 
            'custom_arch': 'mipsel',
        },
    })

    state = project.factory.blank_state(addr=MAIN_START)
    state.memory.store(FLAG_LOCATION, state.solver.BVS("flag", 8*32))
    state.memory.store(FLAG_PTR_LOCATION, struct.pack("<I", FLAG_LOCATION))

    # sm = project.factory.simulation_manager(state)  # never used below, so commented out
    choices = [state]

    print("Tracing...")
    for i, addr in enumerate(trace_log):
        if addr in delay_slots:
            continue

        for s in choices:       # find the candidate state located at this trace address
            if s.addr == addr:
                break

        else:
            raise ValueError("couldn't advance to %08x, line %d" % (addr, i+1))

        if s.addr == MAIN_END:
            break

        # if command is a jump, it's followed by a delay slot
        # we need to advance by two instructions
        # https://github.com/angr/angr/issues/71
        if s.addr + 4 in delay_slots:
            choices = project.factory.successors(s, num_inst=2).successors
        else:
            choices = project.factory.successors(s, num_inst=1).successors

    state = s

    print("Running solver...")

    # cast_to=bytes also works on Python 2 (where bytes is str) and is required on Python 3
    solution = state.solver.eval(state.memory.load(FLAG_LOCATION, 32), cast_to=bytes).rstrip(b'\0').decode('ascii')
    print("The flag is", solution)

    return solution

def test():
    assert main() == "0ctf{tr135m1k5l96551s9l5r}"

if __name__ == "__main__":
    main()

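Now let's analyze the code.
The load_trace function records every address that appears in the trace and, separately, the addresses of the move r1, r1 instructions. In this binary that instruction is a no-op that fills a branch delay slot. For the relationship between delay slots and jumps, see http://blog.csdn.net/babyfans/article/details/6336476 (in short: to speed up execution, the processor has already started on the next instruction before the previous one finishes, so jumps need special treatment, and the instruction right after a jump, the delay slot, is executed before the jump takes effect).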
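The parsing step can be tried in isolation with a few lines of Python. The sample trace lines below are invented for illustration; the only thing taken from solve.py is that the 8 hex digits of the address start right after the 6-character [INFO] prefix and that move r1, r1 marks a delay slot. The function name parse_trace is hypothetical.

```python
def parse_trace(lines):
    """Collect executed addresses and the subset that are delay slots."""
    addrs = []
    delay_slots = set()
    for line in lines:
        if not line.startswith("[INFO]"):
            continue  # ignore anything that is not a trace record
        addr = int(line[6:6 + 8], 16)  # 8 hex digits right after "[INFO]"
        addrs.append(addr)
        if "move r1, r1" in line:
            delay_slots.add(addr)
    return addrs, delay_slots


# Made-up sample lines; the real log may format the instruction column differently.
sample = [
    "[INFO]00400a00 beq r2, r0, 0x400a10",
    "[INFO]00400a04 move r1, r1",
    "[INFO]00400a10 addiu r2, r2, 1",
    "some unrelated log line",
]

addrs, slots = parse_trace(sample)
print([hex(a) for a in addrs])  # ['0x400a00', '0x400a04', '0x400a10']
print([hex(a) for a in slots])  # ['0x400a04']
```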
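Execution flow of main:
It first calls load_trace to record all the addresses, then loads data.bin into angr.Project. The load_options argument must be supplied to say that the file uses the blob backend, that the architecture is mipsel, and that the base address is 0x400770. (I am not sure how this address was determined; the comment in the script only says that data.bin was assembled from the trace starting at 0x400770.)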

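A blank_state is created, unlike the entry_state used in earlier examples, because angr cannot identify an entry point in a blob. The state's address is set to MAIN_START (0x4009d4), the address at which the trace begins.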

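Next, the state's memory is populated. A symbolic variable named flag, 8*32 bits wide (32 bytes), is stored at FLAG_LOCATION = 0x400D80. Then the value of FLAG_LOCATION is packed with struct.pack("<I", ...) as a little-endian unsigned 32-bit integer and stored at FLAG_PTR_LOCATION = 0x410EA0. (I have not worked out how these two addresses were determined.)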
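The effect of the "<I" format can be checked with the standard struct module alone. This small sketch only uses the FLAG_LOCATION constant from solve.py; the variable names little and big are just for illustration. It shows that the pointer is laid out least-significant byte first, which is what a mipsel (little-endian MIPS) program expects to read from memory.

```python
import struct

FLAG_LOCATION = 0x400D80

# "<I" = little-endian, unsigned 32-bit integer
little = struct.pack("<I", FLAG_LOCATION)
# ">I" would be big-endian, shown only for comparison
big = struct.pack(">I", FLAG_LOCATION)

print([hex(b) for b in bytearray(little)])  # ['0x80', '0xd', '0x40', '0x0']
print([hex(b) for b in bytearray(big)])     # ['0x0', '0x40', '0xd', '0x80']

# Unpacking with the same format gives the original address back.
print(hex(struct.unpack("<I", little)[0]))  # 0x400d80
```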

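The script then has a line that would create a simulation_manager (from my earlier look at angr, this is essentially what used to be called a PathGroup, which holds all the stashes reachable from this state). It is never used, though, and commenting it out has no effect, so it was probably left in by mistake.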

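Starting from the blank_state, the script steps through the trace. Trace addresses that are delay slots are skipped. For every other trace address, the inner loop searches the candidate states for one whose address matches and breaks out once it is found; if none matches, a ValueError is raised. The outer loop ends when execution reaches MAIN_END, the end of main.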

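Jumps need special handling while stepping: if the instruction at s.addr + 4 is a delay slot, the current instruction is a jump, so the script advances two instructions at once (the jump plus its delay slot); otherwise it advances one.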
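The control flow of this loop, including the for/else idiom and the two-instruction step, can be exercised without angr. The sketch below replaces states with plain addresses and replaces project.factory.successors(...) with a made-up lookup table, toy_successors, so everything in it other than the loop structure is an assumption for illustration.

```python
def follow_trace(trace, delay_slots, successors_of, start, end):
    """Walk the trace, keeping only the candidate that matches each logged address."""
    choices = [start]
    path = []
    for i, addr in enumerate(trace):
        if addr in delay_slots:
            continue  # already executed together with the preceding jump
        for current in choices:
            if current == addr:
                break
        else:
            # the inner loop finished without break: no candidate matched
            raise ValueError("couldn't advance to %08x, line %d" % (addr, i + 1))
        path.append(current)
        if current == end:
            break
        # a jump is followed by its delay slot, so step over both at once
        num_inst = 2 if current + 4 in delay_slots else 1
        choices = successors_of(current, num_inst)
    return path


def toy_successors(addr, num_inst):
    """Hypothetical stand-in for angr's successor computation on a 3-instruction toy."""
    table = {
        (0x100, 1): [0x104],         # plain instruction
        (0x104, 2): [0x10C, 0x120],  # branch + delay slot: fall through or taken
    }
    return table[(addr, num_inst)]


trace = [0x100, 0x104, 0x108, 0x120]  # 0x108 is the delay slot of the branch
path = follow_trace(trace, {0x108}, toy_successors, start=0x100, end=0x120)
print([hex(a) for a in path])  # ['0x100', '0x104', '0x120']
```

In the real script the candidates are angr states rather than addresses, so picking the matching one also keeps the branch condition it accumulated as a constraint.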

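When the loop ends, the final state holds the constraints collected along the trace, and the solver evaluates the 32 bytes at FLAG_LOCATION to obtain the flag.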