Getting Started with angr (Part 2): Analyzing Some CTF Challenges with angr

Original article, last modified 2023-02-17 23:01:01


License: CC 4.0 BY-SA

Tags: #other #security #program-analysis #symbolic-execution

First published 2021-08-17 16:56:10


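This post works through a series of CTF challenges to show how to use angr for binary analysis, from simply finding a target address to handling symbolic registers, stack, memory, and file input. It explains the role of the claripy module in symbolic execution and how emulating memory and the filesystem helps with more complex cases. It also shows how to avoid reverse-engineering the encryption algorithm by hand and recover the input automatically.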

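The previous post covered some basic angr usage. My end goal is to use angr to analyze PE files automatically, so I need to learn more advanced usage, and CTF challenges are a good way to do that. The challenges I practiced on come from: AngrCTF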

1. 00_angr_find

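First, look at the main function.

[Image: main function]
The program reads 8 characters, transforms them, and checks the result. The %8s format specifier limits the input to at most 8 characters, so it does not overflow.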

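Solving without angr

First we need to work out what complex_function does.

[Image: complex_function]
As the code shows, it first checks that the character is an uppercase letter and exits if it is not. It then applies a transformation: a position-dependent shift within the uppercase alphabet, 3 positions per index i, wrapping around modulo 26. The inverse is: (val - 3 * i - 65) % 26 + 65

The solution script: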

def func(val, i: int):
    return chr((val - 3 * i - 65) % 26 + 65)


if __name__ == '__main__':
    s = b'JACEJGCS'
    flag = ''.join([func(val, i) for i, val in enumerate(s)])
    print(flag)

The flag is: JXWVXRKX
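As a sanity check, the inverse formula can be verified by round-tripping. The sketch below assumes the forward transform is `(val - 65 + 3 * i) % 26 + 65`, which is what the inverse above implies; `encode_char` and the other helper names are made up for this illustration and do not come from the binary.

```python
A = ord("A")  # 65


def encode_char(ch: str, i: int) -> str:
    """Assumed forward transform: shift the letter by 3*i positions within A-Z."""
    return chr((ord(ch) - A + 3 * i) % 26 + A)


def decode_char(val: int, i: int) -> str:
    """Inverse transform, same formula as the solution script."""
    return chr((val - 3 * i - A) % 26 + A)


def encode(text: str) -> str:
    return "".join(encode_char(ch, i) for i, ch in enumerate(text))


def decode(data: bytes) -> str:
    return "".join(decode_char(val, i) for i, val in enumerate(data))


if __name__ == "__main__":
    flag = decode(b"JACEJGCS")
    print(flag)          # the recovered input
    print(encode(flag))  # should reproduce the reference string
```

If the assumed forward transform is right, encoding the recovered flag gives back the reference string JACEJGCS.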

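As you can see, this approach requires analyzing the encryption algorithm by hand and writing the decryption routine yourself, which gets painful when the algorithm is complicated. So how do we solve it with angr?

Solving with angr

First, the solution script from an expert's write-up: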

import angr
import sys


def Go():
    path_to_binary = "examples/00_angr_find"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    initial_state = project.factory.entry_state()
    simulation = project.factory.simgr(initial_state)

    print_good_address = 0x8048678
    simulation.explore(find=print_good_address)

    if simulation.found:
        solution_state = simulation.found[0]
        solution = solution_state.posix.dumps(sys.stdin.fileno())  # dump everything that was fed to stdin
        print("[+] Success! Solution is: {}".format(solution.decode("utf-8")))
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()

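print_good_address is the address of puts("Good Job.").
sys.stdin.fileno() is the file descriptor of standard input, which is 0.
For simulation.explore, see the API documentation. Quoting it:
- looking for condition "find", avoiding condition "avoid".
- Stores found states into "find_stash" and avoided states into "avoid_stash".
The find and avoid parameters can each be:
- An address to find
- A set or list of addresses to find
- A function that takes a state and returns whether or not it matches. In other words, you pass in a function that accepts a state and returns a Boolean.
state.posix.dumps(0) returns everything the program has read from standard input in that state, and state.posix.dumps(1) returns everything it has written to standard output.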

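Result:
[Image: script output]
The most common operation in symbolic execution is to find a state that reaches a given address while discarding the states that cannot reach it. Simulation managers provide the .explore() method for this pattern. When .explore() is called with a find argument, execution continues until a state matching the find condition is found.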
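To make the find/avoid behavior concrete, here is a toy model in plain Python. It is not angr's implementation: `ToyState`, `to_predicate`, `explore`, and the tiny control-flow graph are all invented for illustration. It only mimics the idea that a condition can be an address, a collection of addresses, or a function, and that matching states end up in separate lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyState:
    addr: int
    stdout: bytes = b""


def to_predicate(cond):
    """Turn an address, a collection of addresses, or a function into a predicate."""
    if cond is None:
        return lambda state: False
    if callable(cond):
        return cond
    if isinstance(cond, int):
        return lambda state: state.addr == cond
    addrs = set(cond)
    return lambda state: state.addr in addrs


def explore(start, successors, find=None, avoid=None):
    """Step active states until one matches `find`; set aside states matching `avoid`."""
    is_found, is_avoided = to_predicate(find), to_predicate(avoid)
    active, found, avoided = [start], [], []
    while active and not found:
        next_active = []
        for state in active:
            for succ in successors(state):
                if is_found(succ):
                    found.append(succ)
                elif is_avoided(succ):
                    avoided.append(succ)
                else:
                    next_active.append(succ)
        active = next_active
    return found, avoided


# Hypothetical control flow: 0x10 branches to 0x20 (success) or 0x30 (failure).
GRAPH = {0x10: [0x20, 0x30], 0x20: [], 0x30: []}
OUTPUT = {0x20: b"Good Job.\n", 0x30: b"Try again.\n"}


def successors(state):
    return [ToyState(a, state.stdout + OUTPUT.get(a, b"")) for a in GRAPH[state.addr]]


# Conditions given as an address and as a list of addresses.
found, avoided = explore(ToyState(0x10), successors, find=0x20, avoid=[0x30])
print([hex(s.addr) for s in found], [hex(s.addr) for s in avoided])

# The same search with conditions given as functions over the state.
found2, _ = explore(
    ToyState(0x10),
    successors,
    find=lambda s: b"Good Job." in s.stdout,
    avoid=lambda s: b"Try again." in s.stdout,
)
print(found2[0].stdout)
```

Both calls reach the same state. The function form is the one that scales when many blocks print the success or failure message.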

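Second approach:

Here the find argument is a function. In real binaries there can be many blocks that print "Good Job." or "Try again.", and recording each address by hand would be tedious.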

import angr
import sys


def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    if b'Good Job.' in stdout_output:
        return True
    else: 
        return False

def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    if b'Try again.' in  stdout_output:
        return True
    else: 
        return False


def Go():
    path_to_binary = "examples/00_angr_find" 
    project = angr.Project(path_to_binary, auto_load_libs=False)
    initial_state = project.factory.entry_state()
    simulation = project.factory.simgr(initial_state)

    simulation.explore(find=is_successful, avoid=should_abort)
  
    if simulation.found:
        solution_state = simulation.found[0]
        solution = solution_state.posix.dumps(sys.stdin.fileno())
        print("[+] Success! Solution is: {}".format(solution.decode("utf-8")))
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    Go()

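The script stores whatever has been written to standard output in the stdout_output variable. Standard output is a bytes object rather than a string, so we must check for b'Good Job.' rather than just "Good Job." to see whether the right message was printed. With this approach there is no need to hard-code a specific address.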
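The bytes-versus-str point is plain Python behavior and can be checked without angr. In this small sketch, `stdout_output` is a made-up stand-in for what `state.posix.dumps(1)` would return.

```python
# Hypothetical captured output, standing in for state.posix.dumps(1).
stdout_output = b"Enter the password: Good Job.\n"

# Matching bytes against bytes works as expected.
print(b"Good Job." in stdout_output)

# Matching a str against bytes raises an error instead of returning False.
try:
    "Good Job." in stdout_output
except TypeError as exc:
    print("TypeError:", exc)

# If a str is preferred, decode first (assuming the output is valid UTF-8).
print("Good Job." in stdout_output.decode("utf-8"))
```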

2. 03_angr_symbolic_registers

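The main function
[Image: main function]

The get_user_input function

[Image: get_user_input]
The three input values are placed in eax, ebx, and edx, in that order.

After get_user_input returns, main copies the values into memory.
[Image]

The complex_function_1 function

[Image: complex_function_1]
complex_function_2 and complex_function_3 are similar. The angr script used earlier also works for this challenge. However, angr does not handle scanf() with complex format strings very well, so we can inject symbolic values directly into the registers instead, which is also a good opportunity to learn about symbolic registers.

This uses the claripy module. claripy is mainly responsible for making variables symbolic, building constraints, and solving them, which is the core of symbolic execution. Within angr, constraints are solved mainly with the z3 library from Microsoft.

The expert's solution script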

import sys
import angr
import claripy

def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    if b'Good Job.\n' in stdout_output:
        return True
    else:
        return False


def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    if b'Try again.\n' in stdout_output:
        return True
    else:
        return False


def Go():
    path_to_binary = "examples/03_angr_symbolic_registers"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x08048980
    initial_state = project.factory.blank_state(addr=start_address)

    passwd_size_in_bits = 32
    passwd0 = claripy.BVS('passwd0', passwd_size_in_bits)
    passwd1 = claripy.BVS('passwd1', passwd_size_in_bits)
    passwd2 = claripy.BVS('passwd2', passwd_size_in_bits)

    initial_state.regs.eax = passwd0
    initial_state.regs.ebx = passwd1
    initial_state.regs.edx = passwd2

    simulation = project.factory.simgr(initial_state)

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = format(solution_state.solver.eval(passwd0), 'x')
            solution1 = format(solution_state.solver.eval(passwd1), 'x')
            solution2 = format(solution_state.solver.eval(passwd2), 'x')
            solution = solution0 + " " + solution1 + " " + solution2
            print("[+] Success! Solution is: {}".format(solution))
            # print(simgr.found[0].posix.dumps(0))
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    Go()

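Here we do not need to start from the beginning of main. We skip get_user_input() entirely and set eax, ebx, and edx directly at the instruction following call get_user_input.
The line initial_state = project.factory.blank_state(addr=start_address) creates initial_state as a blank state whose execution starts at start_address.
solution_state.solver.eval(passwd0) returns one solution for passwd0 as a Python integer, and format converts it to hexadecimal. The relevant solver methods are:
- solver.eval(expression): returns one possible solution for expression.
- solver.eval_one(expression): returns the solution for expression, and raises an exception if there is more than one.
- solver.eval_upto(expression, n): returns up to n solutions, or all of them if there are fewer than n.
- solver.eval_exact(expression, n): returns n solutions, and raises an exception if the number of solutions is not exactly n.
- solver.min(expression): returns the smallest possible solution.
- solver.max(expression): returns the largest possible solution.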
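The differences between these methods are easier to see on a tiny domain. The following is a brute-force toy over 8-bit values, not how angr or z3 actually solve constraints; the function names mirror the solver methods only for readability, and the two constraints are arbitrary examples.

```python
def solutions(constraint, bits=8):
    """All values of a `bits`-wide unsigned integer that satisfy `constraint`."""
    return [value for value in range(2 ** bits) if constraint(value)]


def eval_upto(constraint, n):
    """At most n solutions (fewer if fewer exist)."""
    return solutions(constraint)[:n]


def eval_one(constraint):
    """The single solution; error if there is not exactly one."""
    sols = solutions(constraint)
    if len(sols) != 1:
        raise ValueError(f"expected exactly one solution, got {len(sols)}")
    return sols[0]


def eval_exact(constraint, n):
    """Exactly n solutions; error if the count differs."""
    sols = solutions(constraint)
    if len(sols) != n:
        raise ValueError(f"expected {n} solutions, got {len(sols)}")
    return sols


def unique(x):
    # 3 is invertible modulo 256, so this has exactly one solution.
    return (x * 3) % 256 == 30


def high_nibble_is_4(x):
    # Matches the 16 values 0x40 through 0x4F.
    return x & 0xF0 == 0x40


print(eval_one(unique))
print(eval_upto(high_nibble_is_4, 4))
print(len(eval_exact(high_nibble_is_4, 16)))
print(min(solutions(high_nibble_is_4)), max(solutions(high_nibble_is_4)))

try:
    eval_one(high_nibble_is_4)
except ValueError as exc:
    print("eval_one failed:", exc)
```

In a CTF setting, eval_one is a handy way to confirm that the recovered password is the only one that satisfies the constraints.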

The correct input is: b9ffd04e ccf63fe8 8fd4d959
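The step from solver integers to the line typed into the program is ordinary formatting. A minimal sketch using the three values reported above (the variable names are illustrative):

```python
# The three 32-bit values reported above, as the integers the solver returns.
values = [0xB9FFD04E, 0xCCF63FE8, 0x8FD4D959]

# format(value, "x") gives lowercase hex without a "0x" prefix or zero padding.
line = " ".join(format(value, "x") for value in values)
print(line)

# Parsing the line back recovers the same integers.
parsed = [int(token, 16) for token in line.split()]
print(parsed == values)
```

Note that `format(value, "x")` does not pad. If a fixed width were needed, `format(value, "08x")` would keep leading zeros.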

3. 04_angr_symbolic_stack

Making the stack symbolic

[Image]

[Image]

Here v1 and v2 are both variables on the stack, stored at ebp - 0x10 and ebp - 0xC respectively.

The expert's solution script

import angr
import claripy


def is_successful(state):
    stdout_output = state.posix.dumps(1)
    if b'Good Job.\n' in stdout_output:
        return True
    else:
        return False


def should_abort(state):
    stdout_output = state.posix.dumps(1)
    if b'Try again.\n' in stdout_output:
        return True
    else:
        return False


def Go():
    path_to_binary = "../examples/04_angr_symbolic_stack"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x8048697
    initial_state = project.factory.blank_state(addr=start_address)

    initial_state.regs.ebp = initial_state.regs.esp

    passwd_size_in_bits = 32
    passwd0 = claripy.BVS('passwd0', passwd_size_in_bits)
    passwd1 = claripy.BVS('passwd1', passwd_size_in_bits)

    padding_length_in_bytes = 0x8
    initial_state.regs.esp -= padding_length_in_bytes

    initial_state.stack_push(passwd0)
    initial_state.stack_push(passwd1)

    simulation = project.factory.simgr(initial_state)


    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = (solution_state.solver.eval(passwd0))
            solution1 = (solution_state.solver.eval(passwd1))
            print("[+] Success! Solution is: {0} {1}".format(solution0, solution1))
            # print(solution0, solution1)
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()

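Here the symbolic values are injected into stack locations, so the start state is placed right after the input has been read, at 0x8048697.
The stack is made symbolic with push operations rather than by assigning to an indexed location directly, so the frame has to be set up first (initial_state.regs.ebp = initial_state.regs.esp). The stack grows from high addresses to low addresses, so esp ends up below ebp.
v1 occupies the 4 bytes from ebp - 0x10 to ebp - 0xD, and v2 occupies the 4 bytes from ebp - 0xC to ebp - 0x9. The 8 bytes from ebp - 0x8 to ebp - 0x1 have to be skipped as padding, hence the two lines padding_length_in_bytes = 0x8 and initial_state.regs.esp -= padding_length_in_bytes. Only then do the pushes happen.
The rest is essentially the same as the previous scripts.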
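The address arithmetic can be replayed with a toy stack in plain Python. `ToyStack` and the initial stack pointer are invented for illustration; the three operations mirror the script's ebp = esp, esp -= 0x8, and the two pushes.

```python
WORD = 4  # bytes per 32-bit stack slot


class ToyStack:
    def __init__(self, esp: int):
        self.esp = esp
        self.ebp = esp      # mirrors: regs.ebp = regs.esp
        self.slots = {}     # address -> name of the value stored there

    def pad(self, nbytes: int):
        self.esp -= nbytes  # mirrors: regs.esp -= padding_length_in_bytes

    def push(self, name: str):
        self.esp -= WORD                # a push first moves esp down ...
        self.slots[self.esp] = name     # ... then stores at the new esp


stack = ToyStack(esp=0x7FFF0000)  # hypothetical initial stack pointer
stack.pad(0x8)
stack.push("passwd0")
stack.push("passwd1")

for addr, name in sorted(stack.slots.items(), reverse=True):
    print(f"{name} at ebp - {hex(stack.ebp - addr)}")
```

With 8 bytes of padding, the first pushed value lands at ebp - 0xC and the second at ebp - 0x10, which are exactly the two slots the program reads.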

The correct input is: 1704280884 2382341151

4. 05_angr_symbolic_memory

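Making the .bss section symbolic

[Image]

[Image]
The program reads a 32-character string as four chunks, applies the shift transformation, and compares the result. The input string lives in the .bss section, which normally holds uninitialized global variables.

[Image]
We choose 0x08048601 as the starting point.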

import angr
import claripy


def is_successful(state):
    stdout_output = state.posix.dumps(1)
    if b'Good Job.\n' in stdout_output:
        return True
    else:
        return False


def should_abort(state):
    stdout_output = state.posix.dumps(1)
    if b'Try again.\n' in stdout_output:
        return True
    else:
        return False


def Go():
    path_to_binary = "../examples/05_angr_symbolic_memory"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x8048601
    initial_state = project.factory.blank_state(addr=start_address)

    passwd_size_in_bits = 64
    passwd0 = claripy.BVS('passwd0', passwd_size_in_bits)
    passwd1 = claripy.BVS('passwd1', passwd_size_in_bits)
    passwd2 = claripy.BVS('passwd2', passwd_size_in_bits)
    passwd3 = claripy.BVS('passwd3', passwd_size_in_bits)

    passwd0_address = 0xA1BA1C0 # user_input addr

    initial_state.memory.store(passwd0_address, passwd0)
    initial_state.memory.store(passwd0_address + 0x8, passwd1)
    initial_state.memory.store(passwd0_address + 0x10, passwd2)
    initial_state.memory.store(passwd0_address + 0x18, passwd3)

    simulation = project.factory.simgr(initial_state)



    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = solution_state.solver.eval(passwd0, cast_to=bytes)
            solution1 = solution_state.solver.eval(passwd1, cast_to=bytes)
            solution2 = solution_state.solver.eval(passwd2, cast_to=bytes)
            solution3 = solution_state.solver.eval(passwd3, cast_to=bytes)
            solution = solution0 + b" " + solution1 + b" " + solution2 + b" " + solution3
            print("[+] Success! Solution is: {}".format(solution.decode("utf-8")))
            # print(solution0, solution1, solution2, solution3)
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()

The access methods used here are state.memory.store and state.memory.load, which read and write a contiguous range of memory. The usual forms are .load(addr, size) and .store(addr, val).

The correct input is: NAXTHGNR JVSFTPWE LMGAUHWC XMDCPALU
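How four 64-bit values map onto a 32-byte buffer can be shown with plain integers. This sketch assumes the values are written in big-endian order (the default for a store without an explicit endness) and uses a dictionary as a toy byte-addressed memory; nothing here calls angr.

```python
# The four 8-character chunks from the solution above.
chunks = [b"NAXTHGNR", b"JVSFTPWE", b"LMGAUHWC", b"XMDCPALU"]
base = 0xA1BA1C0  # address of the input buffer used in the script

memory = {}  # toy byte-addressed memory: address -> byte value
for index, chunk in enumerate(chunks):
    value = int.from_bytes(chunk, "big")  # the 64-bit integer a solver would return
    addr = base + 8 * index               # offsets 0x0, 0x8, 0x10, 0x18
    for offset, byte in enumerate(value.to_bytes(8, "big")):
        memory[addr + offset] = byte

# Reading the 32 bytes back in address order gives the full input string.
text = bytes(memory[base + i] for i in range(32))
print(text.decode("ascii"))
```

Converting an integer back with `to_bytes(8, "big")` corresponds to what `cast_to=bytes` does for a 64-bit value in the script.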

5. 06_angr_symbolic_dynamic_memory

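This challenge makes heap memory symbolic.

[Image]
complex_function is again the shift transformation, so the code is not reproduced here.

[Image]
The instruction right after the input is read is at 0x08048699, which is also where the initial state starts.

The expert's solution script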

import angr
import claripy

def is_successful(state):
    stdout_output = state.posix.dumps(1)
    if b'Good Job.\n' in stdout_output:
        return True
    else:
        return False


def should_abort(state):
    stdout_output = state.posix.dumps(1)
    if b'Try again.\n' in stdout_output:
        return True
    else:
        return False


def Go():
    path_to_binary = "../examples/06_angr_symbolic_dynamic_memory"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x8048699
    initial_state = project.factory.blank_state(addr=start_address)

    passwd_size_in_bits = 64
    passwd0 = claripy.BVS('passwd0', passwd_size_in_bits)
    passwd1 = claripy.BVS('passwd1', passwd_size_in_bits)

    fake_heap_address0 = 0xffffc93c
    pointer_to_malloc_memory_address0 = 0xabcc8a4
    fake_heap_address1 = 0xffffc94c
    pointer_to_malloc_memory_address1 = 0xabcc8ac
    initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0,
                               endness=project.arch.memory_endness)
    initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1,
                               endness=project.arch.memory_endness)

    initial_state.memory.store(fake_heap_address0, passwd0)
    initial_state.memory.store(fake_heap_address1, passwd1)

    simulation = project.factory.simgr(initial_state)


    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = solution_state.solver.eval(passwd0, cast_to=bytes)
            solution1 = solution_state.solver.eval(passwd1, cast_to=bytes)
            print("[+] Success! Solution is: {0} {1}".format(solution0.decode('utf-8'), solution1.decode('utf-8')))
            # print(solution0, solution1)
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()

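buffer0 and buffer1 are global variables, so they live in the .bss section. IDA shows their addresses as 0xabcc8a4 and 0xabcc8ac.
buffer0 and buffer1 hold the addresses of the allocated heap memory. angr does not actually "run" the binary; it only simulates its execution, so it does not need memory to be really allocated on the heap, and any address can be faked. All we have to do is pick two addresses to stand in for the heap chunks and store them in buffer0 and buffer1. 0xffffc93c and 0xffffc94c are arbitrarily chosen fake addresses.
The endness parameter of .store sets the byte order. angr defaults to big-endian. The available values are:
- LE: little-endian
- BE: big-endian
- ME: middle-endian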
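Why the pointer stores need an explicit byte order while the string stores do not can be seen with the standard struct module. The sketch below uses a dictionary as toy memory and the addresses from the script; `load_u32_le` is a helper invented for this example.

```python
import struct

fake_heap_address0 = 0xFFFFC93C
pointer_to_malloc_memory_address0 = 0xABCC8A4

# The same 32-bit pointer in both byte orders.
little = struct.pack("<I", fake_heap_address0)
big = struct.pack(">I", fake_heap_address0)
print(little.hex(), big.hex())

memory = {}  # toy byte-addressed memory: address -> byte value

# An x86 program reads pointers as little-endian, so store it that way.
for offset, byte in enumerate(little):
    memory[pointer_to_malloc_memory_address0 + offset] = byte

# A string is a byte sequence in address order, so no swapping is needed.
for offset, byte in enumerate(b"UBDKLMBV"):
    memory[fake_heap_address0 + offset] = byte


def load_u32_le(addr: int) -> int:
    """Read a 32-bit little-endian value from the toy memory."""
    raw = bytes(memory[addr + i] for i in range(4))
    return struct.unpack("<I", raw)[0]


# Follow buffer0 -> fake address 0 -> string 0.
pointer = load_u32_le(pointer_to_malloc_memory_address0)
string0 = bytes(memory[pointer + i] for i in range(8))
print(hex(pointer), string0)
```

Had the pointer been stored big-endian, the little-endian load would have produced 0x3cc9ffff instead of the fake heap address, and the dereference would have missed the string.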

The real memory layout and the layout simulated in angr compare as follows:

BEFORE:
buffer0 -> malloc()ed address 0 -> string 0
buffer1 -> malloc()ed address 1 -> string 1

AFTER:
buffer0 -> fake address 0 -> symbolic bitvector 0
buffer1 -> fake address 1 -> symbolic bitvector 1

The resulting input is: UBDKLMBV UNOERNYS

6. 07_angr_symbolic_file

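[Image]

complex_function is the shift transformation.

The program uses fread to read a string from a file, then transforms it and compares the result. ignore_me mainly writes the data that was read first into OJKSQYDP.txt, so we do not have to create the file ourselves. The program then reads the data from OJKSQYDP.txt into buffer.

First, the expert's solution script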

import angr
import claripy


def is_successful(state):
    stdout_output = state.posix.dumps(1)
    if b'Good Job.\n' in stdout_output:
        return True
    else:
        return False


def should_abort(state):
    stdout_output = state.posix.dumps(1)
    if b'Try again.\n' in stdout_output:
        return True
    else:
        return False


def Go():
    path_to_binary = "../examples/07_angr_symbolic_file"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x80488EA
    initial_state = project.factory.blank_state(addr=start_address)

    filename = 'OJKSQYDP.txt'
    symbolic_file_size_bytes = 64
    passwd0 = claripy.BVS('password', symbolic_file_size_bytes * 8)
    passwd_file = angr.storage.SimFile(filename, content=passwd0, size=symbolic_file_size_bytes)

    initial_state.fs.insert(filename, passwd_file)

    simulation = project.factory.simgr(initial_state)

    
    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = solution_state.solver.eval(passwd0, cast_to=bytes)
            print("[+] Success! Solution is: {0}".format(solution0.decode('utf-8')))
            # print(solution0)
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()

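[Image]
We start after memset(buffer, 0, 0x40u);, so the first instruction is at 0x080488EA.

This uses angr's emulated filesystem. In angr, any interaction with the filesystem, sockets, pipes, or terminals is rooted in a SimFile object. You can read from a file at some position, write to it at some position, ask how many bytes it currently holds, and concretize it to generate a test case.

A symbolic file is created with SimFile in this form:

simgr_file = angr.storage.SimFile(filename, content=xxxxxx, size=file_size)

The files can then be preconfigured with the fs option, a dictionary mapping file names to SimFile objects. Alternatively, fs.insert inserts a file into the filesystem and takes the file name and the symbolic file.

initial_state.fs.insert(filename, simgr_file)

The file here holds 0x40 bytes, that is, 64 bytes.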

The solved file content is: AZOMMMZM followed by 56 unprintable characters.
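Since only the first 8 bytes matter, a small post-processing step can pull the readable password out of the 64-byte result. In this sketch the padding is assumed to be zero bytes purely for illustration, because the actual values of the unconstrained bytes are not given above; `solved` stands in for the bytes returned by the solver.

```python
import string

FILE_SIZE = 0x40             # bytes in the symbolic file
SYMBOL_BITS = FILE_SIZE * 8  # width of the bitvector backing the file
print(SYMBOL_BITS)

# Hypothetical solver output: the 8 meaningful bytes plus padding.
# The zero padding is an assumption; the real bytes are unconstrained.
solved = b"AZOMMMZM" + bytes(FILE_SIZE - 8)

# Keep the leading run of printable, non-whitespace ASCII characters.
printable = set(string.ascii_letters + string.digits + string.punctuation)
password = ""
for byte in solved:
    if chr(byte) not in printable:
        break
    password += chr(byte)

print(password)
```

An alternative would be to make the symbolic file only 8 bytes wide, but keeping the full 64 bytes matches the size the program passes to fread.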