Angr CTF 1
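Having covered some basic angr usage earlier, my end goal is to use angr to analyze PE files automatically, so I need to learn more advanced usage, and CTF challenges are a good way to do that. The challenges I practice on come from AngrCTF.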
1. 00_angr_find
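First, look at the main function.
It reads 8 characters, transforms them, and checks the result. The %8s format specifier limits the input to at most 8 characters, so it cannot overflow.
Solution without angr
First we need to analyze what complex_function does.
It first checks that the character is an uppercase letter and exits if it is not. It then applies a transformation: a cyclic shift within the alphabet that grows by 3 for each character index i. The inverse is: (val - 3 * i - 65) % 26 + 65
The solving script is: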
def func(val, i: int):
    return chr((val - 3 * i - 65) % 26 + 65)


if __name__ == '__main__':
    s = b'JACEJGCS'
    flag = ''.join([func(val, i) for i, val in enumerate(s)])
    print(flag)
The flag is: JXWVXRKX
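As a sanity check on the inverse formula, the forward transform can be modeled and round-tripped. This is a minimal sketch that assumes complex_function shifts the letter at index i forward by 3 * i positions within A-Z. That form is inferred from the inverse formula, not copied from the binary, and the names encrypt and decrypt are illustrative.

```python
def encrypt(plain: str) -> str:
    """Assumed forward transform: shift the i-th letter by 3 * i within A-Z."""
    return ''.join(chr((ord(c) - 65 + 3 * i) % 26 + 65) for i, c in enumerate(plain))


def decrypt(cipher: str) -> str:
    """Inverse transform, the same arithmetic as the solving script."""
    return ''.join(chr((ord(c) - 3 * i - 65) % 26 + 65) for i, c in enumerate(cipher))


if __name__ == '__main__':
    flag = decrypt('JACEJGCS')
    print(flag)
    # Re-encrypting the flag should give back the target string.
    print(encrypt(flag) == 'JACEJGCS')
```

If the round trip holds, the inverse is at least consistent with the assumed forward transform.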
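As shown, this approach requires analyzing the encryption algorithm by hand and writing the decryption by hand, which becomes painful when the algorithm is complex. So how do we solve it with angr?
Solution with angr
First, the expert's solution script: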
import angr
import sys


def Go():
    path_to_binary = "examples/00_angr_find"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    initial_state = project.factory.entry_state()
    simulation = project.factory.simgr(initial_state)
    print_good_address = 0x8048678
    simulation.explore(find=print_good_address)
    if simulation.found:
        solution_state = simulation.found[0]
        solution = solution_state.posix.dumps(sys.stdin.fileno())  # dump everything fed to stdin
        print("[+] Success! Solution is: {}".format(solution.decode("utf-8")))
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()
- print_good_address is the address of the puts("Good Job.") call.
- sys.stdin.fileno() is the file descriptor of standard input; its value is 0.
- simulation.explore is described in the API documentation, which summarizes it as:
    - looking for condition "find", avoiding condition "avoid".
    - Stores found states into "find_stash" and avoided states into "avoid_stash".
  Its find and avoid parameters can each be:
    - An address to find
    - A set or list of addresses to find
    - A function that takes a state and returns whether or not it matches. In other words, you pass in a function that accepts a state and returns a Boolean.
- state.posix.dumps(0) returns all input the program has consumed in that state, and state.posix.dumps(1) returns all output it has produced.
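The three accepted forms of find and avoid can all be viewed as "a function from state to Boolean". The sketch below illustrates that idea in plain Python. It is not angr's actual implementation: to_predicate is a hypothetical helper, and the SimpleNamespace object only stands in for a real state with an addr attribute.

```python
from types import SimpleNamespace


def to_predicate(condition):
    """Turn an address, a collection of addresses, or a callable into state -> bool."""
    if callable(condition):
        return lambda state: bool(condition(state))
    if isinstance(condition, int):
        return lambda state: state.addr == condition
    addresses = set(condition)
    return lambda state: state.addr in addresses


if __name__ == '__main__':
    state = SimpleNamespace(addr=0x8048678)  # stand-in for a real angr state
    print(to_predicate(0x8048678)(state))                # single address
    print(to_predicate([0x8048000, 0x8048678])(state))   # list of addresses
    print(to_predicate(lambda s: s.addr == 0)(state))    # arbitrary function
```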
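Result:
The most common operation in symbolic execution is to find a state that reaches a given address while discarding the states that cannot reach it. Simulation managers provide the .explore() method for this execution pattern. When .explore() is called with a find argument, execution continues until a state matching the find condition is found.
Second solution:
Pass a function as the find argument. In a real binary there may be many blocks that print "Good Job." or "Try again.", and recording every address by hand is tedious.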
import angr
import sys


def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Good Job.' in stdout_output


def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Try again.' in stdout_output


def Go():
    path_to_binary = "examples/00_angr_find"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    initial_state = project.factory.entry_state()
    simulation = project.factory.simgr(initial_state)
    simulation.explore(find=is_successful, avoid=should_abort)
    if simulation.found:
        solution_state = simulation.found[0]
        solution = solution_state.posix.dumps(sys.stdin.fileno())
        print("[+] Success! Solution is: {}".format(solution.decode("utf-8")))
    else:
        raise Exception('Could not find the solution')


if __name__ == '__main__':
    Go()
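The two predicates in this script differ only in the marker they search for, so they could be produced by one small factory. The sketch below is an illustration rather than part of the original solution: output_contains and fake_state are hypothetical names, and fake_state only mimics the posix.dumps(fd) call so that the example runs without angr or the binary.

```python
from types import SimpleNamespace


def output_contains(marker: bytes, fd: int = 1):
    """Build a predicate that checks whether `marker` was written to descriptor `fd`."""
    if not isinstance(marker, bytes):
        # posix.dumps returns bytes, and `str in bytes` raises TypeError in Python 3.
        raise TypeError("marker must be bytes, e.g. b'Good Job.'")

    def predicate(state) -> bool:
        return marker in state.posix.dumps(fd)

    return predicate


def fake_state(stdout: bytes):
    """Hypothetical stand-in that exposes only posix.dumps(fd)."""
    return SimpleNamespace(posix=SimpleNamespace(dumps=lambda fd: stdout))


if __name__ == '__main__':
    is_successful = output_contains(b'Good Job.')
    should_abort = output_contains(b'Try again.')
    state = fake_state(b'Enter the password: Good Job.\n')
    print(is_successful(state), should_abort(state))
```

With a real project, the two predicates would be passed as simulation.explore(find=is_successful, avoid=should_abort), exactly as in the script.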
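stdout_output holds everything that has been printed to standard output. It is a bytes object rather than a string, so the check must use b'Good Job.' and not just "Good Job.". With this approach no address has to be hard-coded.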
2. 03_angr_symbolic_registers
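The main function:
The get_user_input function:
The three input values are placed in eax, ebx, and edx, in that order.
After get_user_input returns, main copies these values to memory.
The complex_function_1 function:
complex_function_2 and complex_function_3 are similar. The angr script from the previous challenge would also work here, but angr does not handle scanf() input with complex format strings very well. Instead, symbolic values can be injected directly into the registers, which is also a good opportunity to learn how to symbolize registers.
This uses the claripy module. claripy creates symbolic variables, builds constraints, and solves them, which is the core of symbolic execution. In angr the constraints are solved mainly with the z3 library from Microsoft.
The expert's solution script: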
import sys
import angr
import claripy


def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Good Job.\n' in stdout_output


def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Try again.\n' in stdout_output


def Go():
    path_to_binary = "examples/03_angr_symbolic_registers"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x08048980
    initial_state = project.factory.blank_state(addr=start_address)

    passwd_size_in_bits = 32
    passwd0 = claripy.BVS('passwd0', passwd_size_in_bits)
    passwd1 = claripy.BVS('passwd1', passwd_size_in_bits)
    passwd2 = claripy.BVS('passwd2', passwd_size_in_bits)

    initial_state.regs.eax = passwd0
    initial_state.regs.ebx = passwd1
    initial_state.regs.edx = passwd2

    simulation = project.factory.simgr(initial_state)
    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = format(solution_state.solver.eval(passwd0), 'x')
            solution1 = format(solution_state.solver.eval(passwd1), 'x')
            solution2 = format(solution_state.solver.eval(passwd2), 'x')
            solution = solution0 + " " + solution1 + " " + solution2
            print("[+] Success! Solution is: {}".format(solution))
            # print(simgr.found[0].posix.dumps(0))
    else:
        raise Exception('Could not find the solution')


if __name__ == '__main__':
    Go()
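- There is no need to start from the beginning of main. We skip get_user_input() entirely and set eax, ebx, and edx directly at the instruction that follows call get_user_input.
- initial_state = project.factory.blank_state(addr=start_address) creates a state whose execution begins at start_address.
- solution_state.solver.eval(passwd0) returns one solution for passwd0 as an integer, and format(..., 'x') renders it in hexadecimal. The related solver methods are:
    - solver.eval(expression): returns one feasible solution for expression.
    - solver.eval_one(expression): returns the solution for expression and raises an exception if more than one solution exists.
    - solver.eval_upto(expression, n): returns at most n solutions, or all of them if there are fewer than n.
    - solver.eval_exact(expression, n): returns n solutions and raises an exception if the number of solutions is not exactly n.
    - solver.min(expression): returns the smallest feasible solution.
    - solver.max(expression): returns the largest feasible solution.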
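To make the difference between these methods concrete, here is a toy model of their semantics in plain Python. It brute-forces a small integer domain, which is not how angr or z3 solve constraints. The function names only mirror the solver methods for readability, and the predicate stands in for the constraints attached to a state.

```python
def eval_upto(predicate, n, domain=range(256)):
    """Return up to n values from `domain` that satisfy `predicate`."""
    solutions = []
    for value in domain:
        if len(solutions) == n:
            break
        if predicate(value):
            solutions.append(value)
    return solutions


def eval_one(predicate, domain=range(256)):
    """Return the only solution; raise if there are none or several."""
    solutions = eval_upto(predicate, 2, domain)
    if len(solutions) != 1:
        raise ValueError("expected exactly one solution")
    return solutions[0]


def eval_exact(predicate, n, domain=range(256)):
    """Return exactly n solutions; raise if the count differs."""
    solutions = eval_upto(predicate, n + 1, domain)
    if len(solutions) != n:
        raise ValueError("expected exactly %d solutions" % n)
    return solutions


def minimum(predicate, domain=range(256)):
    """Smallest value in `domain` that satisfies `predicate`."""
    return min(v for v in domain if predicate(v))


def maximum(predicate, domain=range(256)):
    """Largest value in `domain` that satisfies `predicate`."""
    return max(v for v in domain if predicate(v))


if __name__ == '__main__':
    multiple_of_100 = lambda v: v % 100 == 0   # three solutions in 0..255
    print(eval_upto(multiple_of_100, 2))
    print(eval_exact(multiple_of_100, 3))
    print(minimum(multiple_of_100), maximum(multiple_of_100))
```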
The correct input is: b9ffd04e ccf63fe8 8fd4d959
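The step from solver integers to the typed input is a plain hex conversion. The sketch below shows it in both directions, assuming the program reads the three values as hexadecimal words, which the hex formatting in the script implies. The helper names are illustrative.

```python
def to_hex_input(values):
    """Join integers as lowercase hex words separated by spaces."""
    return ' '.join(format(v, 'x') for v in values)


def parse_hex_input(line):
    """Parse space-separated hex words back into integers."""
    return [int(word, 16) for word in line.split()]


if __name__ == '__main__':
    solved = [0xb9ffd04e, 0xccf63fe8, 0x8fd4d959]  # values reported above
    line = to_hex_input(solved)
    print(line)
    print(parse_hex_input(line) == solved)
```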
3. 04_angr_symbolic_stack
Symbolizing the stack
Here v1 and v2 are both variables on the stack, stored at ebp - 0x10 and ebp - 0xC respectively.
The expert's solution script:
import angr
import claripy


def is_successful(state):
    stdout_output = state.posix.dumps(1)
    return b'Good Job.\n' in stdout_output


def should_abort(state):
    stdout_output = state.posix.dumps(1)
    return b'Try again.\n' in stdout_output


def Go():
    path_to_binary = "../examples/04_angr_symbolic_stack"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x8048697
    initial_state = project.factory.blank_state(addr=start_address)

    initial_state.regs.ebp = initial_state.regs.esp

    passwd_size_in_bits = 32
    passwd0 = claripy.BVS('passwd0', passwd_size_in_bits)
    passwd1 = claripy.BVS('passwd1', passwd_size_in_bits)

    padding_length_in_bytes = 0x8
    initial_state.regs.esp -= padding_length_in_bytes

    initial_state.stack_push(passwd0)
    initial_state.stack_push(passwd1)

    simulation = project.factory.simgr(initial_state)
    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = (solution_state.solver.eval(passwd0))
            solution1 = (solution_state.solver.eval(passwd1))
            print("[+] Success! Solution is: {0} {1}".format(solution0, solution1))
            # print(solution0, solution1)
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()
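- The values are injected into stack slots, so the start state is placed right after the input has been read, at 0x8048697.
- The stack is symbolized with push operations rather than by writing to indexed slots, so the frame has to be set up first (initial_state.regs.ebp = initial_state.regs.esp). On Linux the stack grows from high addresses to low addresses, so esp ends up below ebp.
- v1 occupies the 4 bytes from ebp - 0x10 to ebp - 0xD, and v2 occupies the 4 bytes from ebp - 0xC to ebp - 0x9. The 8 bytes from ebp - 0x8 up to ebp need padding, which is why the script has padding_length_in_bytes = 0x8 and initial_state.regs.esp -= padding_length_in_bytes. Only then come the push operations.
- The rest is basically the same as the earlier scripts.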
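The offset arithmetic above can be replayed without angr. This sketch models only the esp bookkeeping: simulate_layout is a hypothetical helper, the ebp value is an arbitrary placeholder, and a push is assumed to decrement esp by 4 and then write at the new esp, as on 32-bit x86.

```python
def simulate_layout(ebp=0x7fff0000, padding=0x8, names=("passwd0", "passwd1"), word=4):
    """Return {name: offset from ebp} after esp = ebp, esp -= padding, one push per name."""
    esp = ebp            # initial_state.regs.ebp = initial_state.regs.esp
    esp -= padding       # skip the 8 padding bytes below ebp
    layout = {}
    for name in names:
        esp -= word      # a push first moves esp down by one word...
        layout[name] = esp - ebp   # ...then stores the value at the new esp
    return layout


if __name__ == '__main__':
    for name, offset in simulate_layout().items():
        print(name, "at ebp - 0x%X" % -offset)
```

Under these assumptions the first pushed value lands at ebp - 0xC and the second at ebp - 0x10, which are the two slots described above.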
The correct input is: 1704280884 2382341151
4. 05_angr_symbolic_memory
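Symbolizing the .bss section
The program reads a 32-character string in four segments, applies a cyclic shift, and compares the result. The input string lives in the .bss section, which usually holds uninitialized global variables.
Here 0x08048601 is chosen as the starting point.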
import angr
import claripy


def is_successful(state):
    stdout_output = state.posix.dumps(1)
    return b'Good Job.\n' in stdout_output


def should_abort(state):
    stdout_output = state.posix.dumps(1)
    return b'Try again.\n' in stdout_output


def Go():
    path_to_binary = "../examples/05_angr_symbolic_memory"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x8048601
    initial_state = project.factory.blank_state(addr=start_address)

    passwd_size_in_bits = 64
    passwd0 = claripy.BVS('passwd0', passwd_size_in_bits)
    passwd1 = claripy.BVS('passwd1', passwd_size_in_bits)
    passwd2 = claripy.BVS('passwd2', passwd_size_in_bits)
    passwd3 = claripy.BVS('passwd3', passwd_size_in_bits)

    passwd0_address = 0xA1BA1C0  # user_input addr
    initial_state.memory.store(passwd0_address, passwd0)
    initial_state.memory.store(passwd0_address + 0x8, passwd1)
    initial_state.memory.store(passwd0_address + 0x10, passwd2)
    initial_state.memory.store(passwd0_address + 0x18, passwd3)

    simulation = project.factory.simgr(initial_state)
    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = solution_state.solver.eval(passwd0, cast_to=bytes)
            solution1 = solution_state.solver.eval(passwd1, cast_to=bytes)
            solution2 = solution_state.solver.eval(passwd2, cast_to=bytes)
            solution3 = solution_state.solver.eval(passwd3, cast_to=bytes)
            solution = solution0 + b" " + solution1 + b" " + solution2 + b" " + solution3
            print("[+] Success! Solution is: {}".format(solution.decode("utf-8")))
            # print(solution0, solution1, solution2, solution3)
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()
The access methods used here are state.memory.store and state.memory.load, which write and read a contiguous range of memory. The usual forms are .load(addr, size) and .store(addr, val).
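The address arithmetic in the script (four 8-byte values at offsets 0x0, 0x8, 0x10, and 0x18 from the buffer) can be illustrated with a toy flat memory. This is only a model of the store/load idea in plain Python, not angr's memory plugin. The store and load helpers are hypothetical, and the concrete bytes are the solution reported for this challenge.

```python
BASE = 0xA1BA1C0  # address of the input buffer used in the script


def store(memory: bytearray, addr: int, value: bytes) -> None:
    """Write `value` into the toy memory starting at `addr`."""
    start = addr - BASE
    memory[start:start + len(value)] = value


def load(memory: bytearray, addr: int, size: int) -> bytes:
    """Read `size` contiguous bytes from the toy memory starting at `addr`."""
    start = addr - BASE
    return bytes(memory[start:start + size])


if __name__ == '__main__':
    memory = bytearray(32)
    chunks = [b'NAXTHGNR', b'JVSFTPWE', b'LMGAUHWC', b'XMDCPALU']
    for index, chunk in enumerate(chunks):
        store(memory, BASE + 8 * index, chunk)   # offsets 0x0, 0x8, 0x10, 0x18
    print(load(memory, BASE, 32))
    print(load(memory, BASE + 0x10, 8))
```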
The correct input is: NAXTHGNR JVSFTPWE LMGAUHWC XMDCPALU
5. 06_angr_symbolic_dynamic_memory
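This challenge symbolizes heap memory.
complex_function is again a cyclic shift, so its code is not pasted here.
The instruction after the input is read is at 0x08048699, which is also where the initial state starts.
The expert's solution script: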
import angr
import claripy


def is_successful(state):
    stdout_output = state.posix.dumps(1)
    return b'Good Job.\n' in stdout_output


def should_abort(state):
    stdout_output = state.posix.dumps(1)
    return b'Try again.\n' in stdout_output


def Go():
    path_to_binary = "../examples/06_angr_symbolic_dynamic_memory"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x8048699
    initial_state = project.factory.blank_state(addr=start_address)

    passwd_size_in_bits = 64
    passwd0 = claripy.BVS('passwd0', passwd_size_in_bits)
    passwd1 = claripy.BVS('passwd1', passwd_size_in_bits)

    fake_heap_address0 = 0xffffc93c
    pointer_to_malloc_memory_address0 = 0xabcc8a4
    fake_heap_address1 = 0xffffc94c
    pointer_to_malloc_memory_address1 = 0xabcc8ac

    initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0,
                               endness=project.arch.memory_endness)
    initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1,
                               endness=project.arch.memory_endness)

    initial_state.memory.store(fake_heap_address0, passwd0)
    initial_state.memory.store(fake_heap_address1, passwd1)

    simulation = project.factory.simgr(initial_state)
    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = solution_state.solver.eval(passwd0, cast_to=bytes)
            solution1 = solution_state.solver.eval(passwd1, cast_to=bytes)
            print("[+] Success! Solution is: {0} {1}".format(solution0.decode('utf-8'), solution1.decode('utf-8')))
            # print(solution0, solution1)
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()
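buffer0 and buffer1 are global variables, so they live in the .bss section. IDA shows their addresses as 0xabcc8a4 and 0xabcc8ac. buffer0 and buffer1 hold the addresses of the allocated heap memory. angr does not actually "run" the binary. It only simulates execution, so nothing really has to be allocated on the heap and any address can be faked. All the user has to do is write two chosen addresses into the pointers buffer0 and buffer1 in place of the real heap addresses. 0xffffc93c and 0xffffc94c are arbitrary fake addresses.
The endness parameter of .store sets the byte order. angr defaults to big-endian, which is why the pointer values are stored with endness=project.arch.memory_endness. The available values are:
- LE: little-endian
- BE: big-endian
- ME: middle-endian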
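The effect of the byte order on a stored pointer can be seen with plain integers. The sketch below is an illustration using int.to_bytes and does not call angr. encode_pointer is a hypothetical helper, and the program is assumed to read the pointer back as a 32-bit little-endian value, as x86 does.

```python
def encode_pointer(value: int, endness: str = "BE", size: int = 4) -> bytes:
    """Encode `value` as `size` bytes in the given byte order ("BE" or "LE")."""
    byteorder = {"BE": "big", "LE": "little"}[endness]
    return value.to_bytes(size, byteorder)


if __name__ == '__main__':
    fake_heap_address = 0xffffc93c
    for endness in ("BE", "LE"):
        raw = encode_pointer(fake_heap_address, endness)
        # What a little-endian CPU would read back from these bytes.
        seen = int.from_bytes(raw, "little")
        print(endness, raw.hex(), hex(seen))
```

Only the little-endian encoding reads back as the intended address. Stored big-endian, the same bytes would be interpreted as a different pointer.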
The real memory layout and the layout simulated in angr compare as follows:
BEFORE:
buffer0 -> malloc()ed address 0 -> string 0
buffer1 -> malloc()ed address 1 -> string 1
AFTER:
buffer0 -> fake address 0 -> symbolic bitvector 0
buffer1 -> fake address 1 -> symbolic bitvector 1
The resulting input is: UBDKLMBV UNOERNYS
6. 07_angr_symbolic_file
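complex_function is a cyclic shift.
The program uses fread to read a string from a file, then encrypts and compares it. ignore_me mainly writes the content that was read first into OJKSQYDP.txt, so we do not have to create the file ourselves. The program then reads the data from OJKSQYDP.txt into buffer.
First, the expert's solution script: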
import angr
import claripy


def is_successful(state):
    stdout_output = state.posix.dumps(1)
    return b'Good Job.\n' in stdout_output


def should_abort(state):
    stdout_output = state.posix.dumps(1)
    return b'Try again.\n' in stdout_output


def Go():
    path_to_binary = "../examples/07_angr_symbolic_file"
    project = angr.Project(path_to_binary, auto_load_libs=False)
    start_address = 0x80488EA
    initial_state = project.factory.blank_state(addr=start_address)

    filename = 'OJKSQYDP.txt'
    symbolic_file_size_bytes = 64
    passwd0 = claripy.BVS('password', symbolic_file_size_bytes * 8)
    passwd_file = angr.storage.SimFile(filename, content=passwd0, size=symbolic_file_size_bytes)

    initial_state.fs.insert(filename, passwd_file)

    simulation = project.factory.simgr(initial_state)
    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        for i in simulation.found:
            solution_state = i
            solution0 = solution_state.solver.eval(passwd0, cast_to=bytes)
            print("[+] Success! Solution is: {0}".format(solution0.decode('utf-8')))
            # print(solution0)
    else:
        raise Exception('Could not find the solution')


if __name__ == "__main__":
    Go()
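Execution starts right after memset(buffer, 0, 0x40u);, at instruction 0x080488EA.
This uses the emulated filesystem (The Emulated Filesystem). In angr, the root of any interaction with the filesystem, sockets, pipes, or terminals is a SimFile object. You can read from a file at some position, write to it at some position, ask how many bytes it currently stores, and concretize it to generate test cases.
A symbolic file is created with SimFile in this form:
simgr_file = angr.storage.SimFile(filename, content=xxxxxx, size=file_size)
SimFile objects can then be preconfigured through the fs option, which takes a dictionary keyed by file name. Alternatively, fs.insert inserts a file into the filesystem and takes the file name and the symbolic file.
initial_state.fs.insert(filename, simgr_file)
The file here holds 0x40, that is 64, bytes.
The solved file content is AZOMMMZM followed by 56 non-printable characters.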
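Because the remaining 56 bytes are unconstrained, the solver may choose values that are not printable, and possibly not valid UTF-8, in which case solution0.decode('utf-8') could raise UnicodeDecodeError. One option is to keep only the leading printable part. The sketch below does that. printable_prefix is a hypothetical helper, and the sample bytes assume zero padding purely for illustration.

```python
import string

# Characters treated as printable for the purpose of showing the password.
PRINTABLE = set(string.ascii_letters + string.digits + string.punctuation + ' ')


def printable_prefix(data: bytes) -> str:
    """Return the leading run of printable ASCII characters in `data`."""
    chars = []
    for byte in data:
        ch = chr(byte)
        if ch not in PRINTABLE:
            break
        chars.append(ch)
    return ''.join(chars)


if __name__ == '__main__':
    file_content = b'AZOMMMZM' + bytes(56)   # 64 bytes, padding assumed to be zeros
    print(len(file_content), printable_prefix(file_content))
```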