Obfuscation Research - OLLVM Obfuscation - Control Flow Flattening (FLA)

Introduction

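Control flow flattening works by converting a program's conditional branches into an equivalent flat control flow. Typically, each branch of the original branching statements (such as if and switch statements) is extracted and placed in a series of consecutive basic blocks, and a state variable or flag then selects which basic block to execute. The originally nested branch structure is thereby unrolled into a flat sequence of basic blocks.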
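As a rough illustration of the idea, the sketch below shows a small function next to a hand-flattened equivalent. The function names and the state numbering are made up for this example; real OLLVM output works on compiled basic blocks and uses less readable state constants.

```python
def classify(n):
    """Original, structured control flow."""
    if n < 0:
        result = "negative"
    else:
        result = "non-negative"
    return result


def classify_flattened(n):
    """Same logic after flattening: one dispatcher loop plus a state variable."""
    state = 0        # control variable: selects the block that runs next
    result = None
    while True:      # the dispatcher loop
        if state == 0:       # block 0: evaluates the original condition
            state = 1 if n < 0 else 2
        elif state == 1:     # block 1: the original "then" branch
            result = "negative"
            state = 3
        elif state == 2:     # block 2: the original "else" branch
            result = "non-negative"
            state = 3
        elif state == 3:     # block 3: the return block
            return result


for value in (-2, 0, 5):
    print(value, classify(value), classify_flattened(value))
```

Every block ends by writing the state variable instead of jumping to its successor directly, which is what hides the original branch structure from a reader of the control-flow graph.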

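How It Works

Specifically, control flow flattening proceeds as follows:

  1. Split the original conditional statements (such as if statements) into independent basic blocks.
  2. Arrange these basic blocks in some order to form a new program flow.
  3. Introduce a control variable or flag that indicates which basic block should execute next.
  4. Insert conditional statements or jump instructions that select the basic block to execute according to the control variable or flag.
  5. At the end of each basic block, set the control variable or flag to determine the next basic block to execute.

After flattening, the blocks of a function can be classified by the following rules:

  • The function's start address is the address of the prologue
  • The prologue's successor is the main dispatcher
  • The block whose successor is the main dispatcher is the pre-dispatcher
  • Blocks whose successor is the pre-dispatcher are real blocks
  • The block with no successor is the retn block
  • The remaining blocks are useless blocks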
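The classification rules above can be tried out on a toy control-flow graph. The sketch below is only an illustration: the graph, the block names, and the classify_blocks function are invented for this example, and the graph is a plain dict of successor lists rather than a real CFG recovered from a binary.

```python
def classify_blocks(succ, entry):
    """Classify CFG nodes by the rules above.

    succ maps each block name to the list of its successors; entry is the
    block at the function's start address.
    """
    # Build the predecessor map from the successor map.
    pred = {node: [] for node in succ}
    for node, outs in succ.items():
        for out in outs:
            pred[out].append(node)

    prologue = entry                                  # rule 1
    main_dispatcher = succ[prologue][0]               # rule 2
    # Rule 3: the predecessor of the main dispatcher other than the prologue.
    pre_dispatcher = next(p for p in pred[main_dispatcher] if p != prologue)
    real = [n for n in succ if pre_dispatcher in succ[n]]   # rule 4
    retn = [n for n in succ if not succ[n]]                 # rule 5
    known = {prologue, main_dispatcher, pre_dispatcher, *real, *retn}
    useless = [n for n in succ if n not in known]           # rule 6
    return {
        "prologue": prologue,
        "main_dispatcher": main_dispatcher,
        "pre_dispatcher": pre_dispatcher,
        "real": real,
        "retn": retn,
        "useless": useless,
    }


# Hypothetical flattened function with two real blocks.
toy_cfg = {
    "prologue": ["main_dispatcher"],
    "main_dispatcher": ["sub_dispatcher", "real_a"],
    "sub_dispatcher": ["real_b", "ret"],
    "real_a": ["pre_dispatcher"],
    "real_b": ["pre_dispatcher"],
    "pre_dispatcher": ["main_dispatcher"],
    "ret": [],
}

roles = classify_blocks(toy_cfg, "prologue")
for role, blocks in roles.items():
    print(role, blocks)
```

In this toy graph only sub_dispatcher ends up as a useless block. A real pass usually needs extra filtering, for example by block size, to keep tiny jump-only blocks out of the real-block set.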

Deobfuscation Approach

  1. How to recognize generic FLA obfuscation
    [Figure omitted]
    Or, to borrow a figure from bird:
    [Figure omitted]

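Hands-on

  • Removing FLA control-flow obfuscation with angr
    1. Build all the blocks and work out the relationships between them. Goal: find the real blocks
    2. Symbolically execute and traverse all the blocks, mapping their relationships. Goal: find the execution order of the real blocks
    3. Patch the program with jump instructions, nops, and so on. Goal: restore the original control flow
  • Code reference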
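from collections import defaultdict

import angr
import claripy
import pyvex

# ARCH_X86, ARCH_ARM, ARCH_ARM64, fill_nop, patch_instruction and the
# ins_*_jmp_hex_* helpers are not defined in this excerpt; they are assumed
# to come from a separate utility module.


def symbolic_execution(project, relevant_block_addrs, start_addr, hook_addrs=None, modify_value=None, inspect=False):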

    def retn_procedure(state):
        ip = state.solver.eval(state.regs.ip)
        project.unhook(ip)
        return

    def statement_inspect(state):
        expressions = list(
            state.scratch.irsb.statements[state.inspect.statement].expressions)
        if len(expressions) != 0 and isinstance(expressions[0], pyvex.expr.ITE):
            state.scratch.temps[expressions[0].cond.tmp] = modify_value
            state.inspect._breakpoints['statement'] = []

    if hook_addrs is not None:
        skip_length = 4
        if project.arch.name in ARCH_X86:
            skip_length = 5

        for hook_addr in hook_addrs:
            project.hook(hook_addr, retn_procedure, length=skip_length)

    state = project.factory.blank_state(addr=start_addr, remove_options={
                                        angr.sim_options.LAZY_SOLVES})
    if inspect:
        state.inspect.b(
            'statement', when=angr.state_plugins.inspect.BP_BEFORE, action=statement_inspect)
    sm = project.factory.simulation_manager(state)
    sm.step()
    while len(sm.active) > 0:
        for active_state in sm.active:
            if active_state.addr in relevant_block_addrs:
                return active_state.addr
        sm.step()

    return None


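class FLAPass(object):
    """Removes control flow flattening (FLA) from one target function.

    This excerpt does not show where is_debug, origin_data, origin_data_len,
    so_base_address and target_function_real_start_address are set; they are
    assumed to be initialised elsewhere before the pass runs.
    """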
    def __init__(self, project):
        super(FLAPass, self).__init__()
        self.project = project
        self.target_function_supergraph = None

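    def __fla_build_blocks(self):
        """ Flow chart: https://security.tencent.com/uploadimg_dir/201701/b6d9662e5a216ac6e7a976dd8d814a79.png
            1. The function's start address is the address of the prologue
            2. The prologue's successor is the main dispatcher
            3. The block whose successor is the main dispatcher is the pre-dispatcher
            4. Blocks whose successor is the pre-dispatcher are real blocks
            5. The block with no successor is the retn block
            6. The remaining blocks are useless blocks
        """
        # prologue / return block (retn)
        prologue_node = None
        retn_node = None
        for node in self.target_function_supergraph.nodes():
            if self.target_function_supergraph.in_degree(node) == 0:
                prologue_node = node
            if self.target_function_supergraph.out_degree(node) == 0:
                retn_node = node
        assert prologue_node is not None,"prologue node is None."
        # Fail with a clear message instead of a NameError when no retn block exists.
        assert retn_node is not None,"retn node is None."
        assert prologue_node.addr == self.target_function_real_start_address,"[__fla_build_blocks] error:prologue node:0x{:08x}, fun start:0x{:08x}".format(prologue_node.addr,self.target_function_real_start_address)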
        
        # main dispatcher / pre-dispatcher
        pre_dispatcher_node = None
        main_dispatcher_node = list(self.target_function_supergraph.successors(prologue_node))[0]
        for node in self.target_function_supergraph.predecessors(main_dispatcher_node):
            if node.addr != prologue_node.addr:
                pre_dispatcher_node = node
                break
        assert pre_dispatcher_node is not None,"predispatcher node is None."

        # real blocks / blocks to be filled with nops
        relevant_nodes,self.nop_nodes = [],[]
        for node in self.target_function_supergraph.nodes():
            if self.target_function_supergraph.has_edge(node, pre_dispatcher_node) and node.size > 8:
                # XXX: using node.size is faster than creating a block
                relevant_nodes.append(node)
                continue
            if node.addr in (prologue_node.addr, retn_node.addr, pre_dispatcher_node.addr):
                continue
            self.nop_nodes.append(node)

        # Collect the addresses unconditionally: they are extended below and
        # used by the symbolic execution step even when debugging is off.
        self.relevant_block_addrs = [node.addr for node in relevant_nodes]
        if self.is_debug:
            print('*******************fla relevant blocks************************')
            print('prologue: %#x' % prologue_node.addr)
            print('main_dispatcher: %#x' % main_dispatcher_node.addr)
            print('pre_dispatcher: %#x' % pre_dispatcher_node.addr)
            print('retn: %#x' % retn_node.addr)
            print('relevant_blocks:', [hex(addr) for addr in self.relevant_block_addrs])

        self.relevants = relevant_nodes
        self.relevants.append(prologue_node)
        self.relevants_without_retn = list(self.relevants)
        self.relevants.append(retn_node)
        self.relevant_block_addrs.extend([prologue_node.addr, retn_node.addr])

    def __fla_symbolic_exec(self):
        self.flow = defaultdict(list)
        self.patch_instrs = {}
        for relevant in self.relevants_without_retn:
            print('-------------------dse %#x---------------------' % relevant.addr)
            block = self.project.factory.block(relevant.addr, size=relevant.size)
            has_branches = False
            hook_addrs = set()
            for ins in block.capstone.insns:
                if self.project.arch.name in ARCH_X86:
                    if ins.insn.mnemonic.startswith('cmov'):
                        # only record the first one
                        if relevant not in self.patch_instrs:
                            self.patch_instrs[relevant] = ins
                            has_branches = True
                    elif ins.insn.mnemonic.startswith('call'):
                        hook_addrs.add(ins.insn.address)
                elif self.project.arch.name in ARCH_ARM:
                    if ins.insn.mnemonic != 'mov' and ins.insn.mnemonic.startswith('mov'):
                        if relevant not in self.patch_instrs:
                            self.patch_instrs[relevant] = ins
                            has_branches = True
                    elif ins.insn.mnemonic in {'bl', 'blx'}:
                        hook_addrs.add(ins.insn.address)
                elif self.project.arch.name in ARCH_ARM64:
                    if ins.insn.mnemonic.startswith('cset'):
                        if relevant not in self.patch_instrs:
                            self.patch_instrs[relevant] = ins
                            has_branches = True
                    elif ins.insn.mnemonic in {'bl', 'blr'}:
                        hook_addrs.add(ins.insn.address)

            if has_branches:
                tmp_addr = symbolic_execution(self.project, self.relevant_block_addrs,
                                                         relevant.addr, hook_addrs, claripy.BVV(1, 1), True)
                if tmp_addr is not None:
                    self.flow[relevant].append(tmp_addr)
                tmp_addr = symbolic_execution(self.project, self.relevant_block_addrs,
                                                         relevant.addr, hook_addrs, claripy.BVV(0, 1), True)
                if tmp_addr is not None:
                    self.flow[relevant].append(tmp_addr)
            else:
                tmp_addr = symbolic_execution(self.project, self.relevant_block_addrs,
                                                         relevant.addr, hook_addrs)
                if tmp_addr is not None:
                    self.flow[relevant].append(tmp_addr)
        if self.is_debug:
            print('************************flow******************************')
            for k, v in self.flow.items():
                print('%#x: ' % k.addr, [hex(child) for child in v])

    def __fla_patch(self):
        print("[*] start fla patch...")
        # patch irrelevant blocks
        for nop_node in self.nop_nodes:
            fill_nop(self.origin_data, nop_node.addr-self.so_base_address,
                     nop_node.size, self.project.arch)

        # remove unnecessary control flows
        for parent, childs in self.flow.items():
            if len(childs) == 1:
                parent_block = self.project.factory.block(parent.addr, size=parent.size)
                last_instr = parent_block.capstone.insns[-1]
                file_offset = last_instr.address - self.so_base_address
                # patch the last instruction to jmp
                if self.project.arch.name in ARCH_X86:
                    fill_nop(self.origin_data, file_offset,
                             last_instr.size, self.project.arch)
                    patch_value = ins_j_jmp_hex_x86(last_instr.address, childs[0], 'jmp')
                elif self.project.arch.name in ARCH_ARM:
                    patch_value = ins_b_jmp_hex_arm(last_instr.address, childs[0], 'b')
                    if self.project.arch.memory_endness == "Iend_BE":
                        patch_value = patch_value[::-1]
                elif self.project.arch.name in ARCH_ARM64:
                    # FIXME: For aarch64/arm64, the last instruction of prologue seems useful in some cases, so patch the next instruction instead.
                    if parent.addr == self.target_function_real_start_address:
                        file_offset += 4
                        patch_value = ins_b_jmp_hex_arm64(last_instr.address+4, childs[0], 'b')
                    else:
                        patch_value = ins_b_jmp_hex_arm64(last_instr.address, childs[0], 'b')
                    if self.project.arch.memory_endness == "Iend_BE":
                        patch_value = patch_value[::-1]
                patch_instruction(self.origin_data, file_offset, patch_value)
            else:
                instr = self.patch_instrs[parent]
                file_offset = instr.address - self.so_base_address
                # patch instructions starting from `cmovx` to the end of block
                fill_nop(self.origin_data, file_offset, parent.addr +
                         parent.size - self.so_base_address - file_offset, self.project.arch)
                if self.project.arch.name in ARCH_X86:
                    # patch the cmovx instruction to jx instruction
                    patch_value = ins_j_jmp_hex_x86(instr.address, childs[0], instr.mnemonic[len('cmov'):])
                    patch_instruction(self.origin_data, file_offset, patch_value)

                    file_offset += 6
                    # patch the next instruction to a jmp instruction
                    patch_value = ins_j_jmp_hex_x86(instr.address+6, childs[1], 'jmp')
                    patch_instruction(self.origin_data, file_offset, patch_value)
                elif self.project.arch.name in ARCH_ARM:
                    # patch the movx instruction to bx instruction
                    bx_cond = 'b' + instr.mnemonic[len('mov'):]
                    patch_value = ins_b_jmp_hex_arm(instr.address, childs[0], bx_cond)
                    if self.project.arch.memory_endness == 'Iend_BE':
                        patch_value = patch_value[::-1]
                    patch_instruction(self.origin_data, file_offset, patch_value)

                    file_offset += 4
                    # patch the next instruction to a b instruction
                    patch_value = ins_b_jmp_hex_arm(instr.address+4, childs[1], 'b')
                    if self.project.arch.memory_endness == 'Iend_BE':
                        patch_value = patch_value[::-1]
                    patch_instruction(self.origin_data, file_offset, patch_value)
                elif self.project.arch.name in ARCH_ARM64:
                    # patch the cset.xx instruction to bx instruction
                    bx_cond = instr.op_str.split(',')[-1].strip()
                    patch_value = ins_b_jmp_hex_arm64(instr.address, childs[0], bx_cond)
                    if self.project.arch.memory_endness == 'Iend_BE':
                        patch_value = patch_value[::-1]
                    patch_instruction(self.origin_data, file_offset, patch_value)

                    file_offset += 4
                    # patch the next instruction to b instruction
                    patch_value = ins_b_jmp_hex_arm64(instr.address+4, childs[1], 'b')
                    if self.project.arch.memory_endness == 'Iend_BE':
                        patch_value = patch_value[::-1]
                    patch_instruction(self.origin_data, file_offset, patch_value)

        assert len(self.origin_data) == self.origin_data_len, "Error: size of data changed!!!"    
        print("[*] fla patch end...")
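
The patch step calls helpers such as ins_j_jmp_hex_x86 whose bodies are not shown in this listing. As a rough illustration of what such a helper has to compute on x86, the sketch below encodes a near jmp (E9 rel32) and a near conditional jump (0F 8x rel32). The name encode_x86_jump and the JCC_OPCODE table are hypothetical stand-ins, not the real helper, and the table covers only a few condition codes.

```python
import struct

# Second opcode byte of the two-byte near Jcc encoding (0F 8x), keyed by the
# condition suffix left after stripping "cmov" from a cmovcc mnemonic.
JCC_OPCODE = {
    "e": 0x84, "z": 0x84, "ne": 0x85, "nz": 0x85,
    "b": 0x82, "ae": 0x83, "be": 0x86, "a": 0x87,
    "l": 0x8C, "ge": 0x8D, "le": 0x8E, "g": 0x8F,
}


def encode_x86_jump(src_addr, dst_addr, cond="jmp"):
    """Return the bytes of a near jump placed at src_addr that targets dst_addr."""
    if cond == "jmp":
        opcode = b"\xE9"                          # jmp rel32, 5 bytes in total
    else:
        opcode = bytes([0x0F, JCC_OPCODE[cond]])  # jcc rel32, 6 bytes in total
    length = len(opcode) + 4
    # rel32 is relative to the address of the instruction that follows the jump.
    rel = dst_addr - (src_addr + length)
    return opcode + struct.pack("<i", rel)


print(encode_x86_jump(0x1000, 0x1010).hex())        # unconditional, forward
print(encode_x86_jump(0x1010, 0x1000).hex())        # unconditional, backward
print(encode_x86_jump(0x1000, 0x1020, "ne").hex())  # conditional (jne)
```

The conditional form is 6 bytes long, which is consistent with the listing advancing file_offset by 6 before it writes the trailing jmp on x86.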
