FirmUp: Precise Static Detection of Common Vulnerabilities in Firmware

桃子小迷妹

Article link: https://blog.csdn.net/weixin_43846270/article/details/114524532

Problem definition. Given a collection of executables $\mathcal{F} = \{T_1, ..., T_n\}$ (e.g., a firmware image), and a query executable $Q$ containing a (vulnerable) procedure $q_v$, our goal is to determine for each executable $T_i \in \mathcal{F}$ whether it contains a procedure similar to $q_v$. We define two procedures as similar if they originate from the same source code. Similarity is subject to changes in source code (patching, versions) and declines as procedures differ semantically. Each of the executables in $\mathcal{F}$ may be compiled by any compiler, and can also be stripped of all name information.

Efficiently matching procedures in the context of executables

Vulnerability detection requires not only pairwise similarity between procedures but also information about the relationships between procedures in the surrounding executable. This observation serves as the foundation for a novel technique that establishes a partial correspondence between procedures in the two binaries.

Furthermore, firmware is often customized and optimized to contain only parts of a library, to match the particular device on which it runs.

Example

Given a query procedure and the executable that contains it, our goal is to find a similar procedure in other stripped executables.

Pairwise Procedure Similarity

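[Figure 1: MIPS assembly of the first basic block of ftp_glob_retrieve(): (a) compiled from Wget 1.15, (b) taken from a stripped NETGEAR firmware image]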

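The syntactic gap. Consider the MIPS assembly snippets in Figure 1. Although they differ greatly in syntax and share no line of code, both snippets are the first basic block (BB) of the same ftp_glob_retrieve() procedure in Wget. The syntactic differences arise because the procedure was compiled in different settings and with different toolchains. Figure 1(a) was compiled from Wget 1.15 using gcc 5.2 at optimization level 2, while the compilation setup of Figure 1(b) is unknown, since it belongs to a stripped firmware image of a NETGEAR device, crawled from the vendor's public support site. Despite the syntactic changes, the snippets have much in common semantically: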

• Both snippets retrieve a value from the stack (line 5 in (a), line 1 in (b)).
• Both load the value 0x1F and use it in a jump comparison operation (lines 4 and 6 in (a), lines 5 and 6 in (b)).
• Both snippets call a procedure (line 1 in (a), line 3 in (b)).

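Although the two procedures are not semantically equivalent, they are highly similar. Finding this similarity is very difficult, however, because of differences in instruction selection, ordering and register usage, as well as different code and data layout (offsets).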

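Capturing semantic similarity. To find similarity across binaries that come from different compilations, we adopt a procedure representation and adapt it to our needs. We first decompose a procedure at the BB level (a BB is a node in the procedure's CFG), and then apply slicing at the BB level to produce focused fragments of computation. We then use a compiler optimizer to bring these fragments, which we call strands, to a canonical form, while also normalizing register names and base-address offsets. Representing a procedure as a set of strands captures semantic similarity, because the specifics of how a computation is carried out are abstracted away by the transformation to strands, which reflect only what is computed. We define the similarity of a pair of procedures $(q, t)$ as the number of strands they share, denoted $Sim(q, t)$.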

Efficient Partial Executable Matching
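Similarity in the scope of a single procedure is inaccurate
Although procedure-level similarity can achieve considerable precision, in some scenarios using additional information from the containing executable can significantly reduce the number of false matches.
Figure 2. Walking through an actual search: the query $q_v = ftp\_glob\_retrieve()$ in the executable $Q = Wget$ (Figure 1) is searched for in a firmware image $\mathcal{F}$ found in the wild (belonging to the NETGEAR device shown in Figure 1(b)).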

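First, in Figure 2 (a.0), we try to match the query procedure against the procedures of a target executable from the NETGEAR firmware image. With a naive approach based on pairwise similarity, the query procedure is matched with the procedure in the target executable $T$ with which it shares the highest similarity score $Sim$, namely $sub\_443ee2()$.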

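With a broader, executable-level similarity calculation, as in Figure 2 (a.1-2), this choice turns out to be a mismatch. It arises only because $sub\_443ee2()$ is larger, while $ftp\_glob\_retrieve()$ and its true positive $sub\_4ea884()$ underwent different compilations, share fewer strands, and therefore have a lower $Sim$ score. Note that simply normalizing the score by procedure size does not solve the problem, because procedure size is affected by many factors, most of which have little to do with the actual semantics. For example, a lower optimization level may produce a large amount of code, size-oriented optimization shrinks it, and a very high optimization level may inflate the code size again through loop unrolling. This mismatch illustrates the limitation of procedure-centric approaches; Gitz, for example, would wrongly match $ftp\_glob\_retrieve()$ with $sub\_443ee2()$.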

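This mismatch can be detected by performing a reverse search, that is, searching the query executable (a.1) for the procedure most similar to $sub\_443ee2()$. Doing so yields a different procedure, $get\_ftp()$, which shows that the original match is inconsistent: a consistent match would produce the same pairing regardless of the search direction.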

(a.1) $\rightarrow$ (a.2): $ftp\_glob\_retrieve() \rightarrow sub\_443ee2()$
(a.2) $\rightarrow$ (a.1): $sub\_443ee2() \rightarrow get\_ftp()$
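The reverse-search check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores 53 and 71 are the ones quoted later in the game walkthrough, while the two scores involving sub_4ea884 and the helper names `best_match` and `check_consistency` are hypothetical.

```python
# Pairwise similarity scores, keyed by an unordered pair of procedure names.
SCORES = {
    frozenset(("ftp_glob_retrieve", "sub_443ee2")): 53,  # quoted in the text
    frozenset(("get_ftp", "sub_443ee2")): 71,            # quoted in the text
    frozenset(("ftp_glob_retrieve", "sub_4ea884")): 40,  # hypothetical value
    frozenset(("get_ftp", "sub_4ea884")): 12,            # hypothetical value
}

Q = ["ftp_glob_retrieve", "get_ftp"]   # procedures of the query executable
T = ["sub_443ee2", "sub_4ea884"]       # procedures of the target executable


def best_match(proc, candidates):
    """Return the candidate sharing the highest score with proc."""
    return max(candidates, key=lambda c: SCORES.get(frozenset((proc, c)), 0))


def check_consistency(q):
    """Search forward (Q to T), then backward (T to Q), and compare."""
    forward = best_match(q, T)
    back = best_match(forward, Q)
    return forward, back, back == q


forward, back, consistent = check_consistency("ftp_glob_retrieve")
print(forward, back, consistent)
```

With these scores the forward search picks sub_443ee2, the backward search returns get_ftp instead of the query, and the pairing is flagged as inconsistent.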

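Another way to address this problem is to match all procedures, thereby establishing a full matching between the executables. Although this is a step in the right direction, the approach is severely limited because it assumes that the overall structure of the executables is similar, which is not always the case. Major differences in executable structure are usually caused by the chosen build configuration. In our example, the query executable was compiled with default settings, which caused $skey\_resp()$, a procedure that handles the OPIE authentication scheme for FTP, to be compiled into it, as shown in Figure 2 (a.1). For reasons unknown to us, in this particular case the vendor compiled Wget with the --disable-opie option, so this procedure is omitted from the target executable. Such a change can trigger a "domino effect" that causes multiple procedures to be mismatched. Our approach focuses on the query procedure and tries to avoid such inconsistent and inaccurate results.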

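Procedure similarity in the scope of an executable using back-and-forth games. Figure 2(b) illustrates the shift from a procedure-level similarity measure to an executable-level one, in which additional information from the query (b.1) and target (b.2) executables widens the scope. This result is obtained with an algorithm that implements a back-and-forth game, which builds and extends a more appropriate partial matching, constrained only by the requirement that it contain the query procedure.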

Outlining a matching from a two-player game. The matching shown in Figure 2(b) results from the moves of two participants in a back-and-forth game, a player and a rival.

$player$: first selects $t_1 = sub\_443ee2()$ from $T$ as the match for $q_v = ftp\_glob\_retrieve()$, giving $\{(q_v, t_1)\}$.
$rival$: tries to find a better match for $t_1$ and selects $q_1 = get\_ftp() \in Q$ as the preferred alternative, because $Sim(sub\_443ee2(), get\_ftp()) = 71 > Sim(sub\_443ee2(), ftp\_glob\_retrieve()) = 53$. The $player$ must choose again.
$player$:

Representing Firmware Binaries

1. Binary Lifting

From bits to intermediate representation (IR)
Using the procedure assembly, which can be extracted (relatively) easily using disassemblers, is problematic, as assembly instructions are made to be succinct and not expressive.

For instance, sub-parts of the same register (e.g., rax and eax) will appear as differently named variables in the assembly (mov rax, 0 vs. mov eax, 0).
Another example is the lack of side-effect expressiveness: in a comparison operation such as cmp rax, rbx, the flags register affected by the operation does not appear in the instruction (or in any of the instructions).

The VEX-IR contains a full representation of the machine state, including side-effects, for each of the translated instructions.
We used IDA Pro for the parsing and extraction of procedures and BBs from executables, as we noticed that overall it is more accurate when tasked with finding all procedures and blocks in the executable.

Translating embedded architectures
Handling four different target architectures (MIPS32, ARM32, PPC32 and Intel-x86), and specifically executables originating from real firmware images, comes with some caveats.

First, many of the executables either had a corrupt Executable and Linkable Format (ELF) header, or were distributed with the wrong ELFCLASS. Specifically, we found that MIPS64 executables (8-byte aligned instructions) with an ELFCLASS32 header are common in firmware.
Another caveat, in MIPS executables, is the use of a branch delay slot, which requires an additional instruction to follow any branch instruction. This additional instruction is executed while the branch destination is being resolved. As a result, the first instruction of the subsequent block is omitted from that block and placed in the preceding one, which leads to strand discrepancy.
Finally, as mentioned, binary lifting tools may still fail to identify several blocks in some procedures, or even omit entire procedures altogether. This is exacerbated in the stripped scenario.
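Because the declared ELFCLASS may not be trustworthy, a lifting pipeline might read it explicitly and treat it as a hint rather than a fact. The sketch below is an assumption about how such a check could start, not the tooling used in the paper: it only parses the class, byte order and machine fields from the first 20 bytes of an ELF header, and the function name `read_elf_ident` is hypothetical. The field offsets and constants follow the ELF specification.

```python
import struct

ELF_CLASSES = {1: "ELFCLASS32", 2: "ELFCLASS64"}
ELF_DATA = {1: "little", 2: "big"}
EM_MIPS = 8  # e_machine value for MIPS in the ELF specification


def read_elf_ident(header: bytes):
    """Return (declared class, byte order, e_machine) from an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    elf_class = ELF_CLASSES.get(header[4], "invalid")  # e_ident[EI_CLASS]
    byte_order = ELF_DATA.get(header[5])               # e_ident[EI_DATA]
    if byte_order is None:
        raise ValueError("invalid EI_DATA byte")
    fmt = "<H" if byte_order == "little" else ">H"
    # e_machine is a 16-bit field at offset 18, after e_ident and e_type.
    (machine,) = struct.unpack_from(fmt, header, 18)
    return elf_class, byte_order, machine


# Synthetic big-endian header that declares ELFCLASS32 and machine MIPS.
sample = b"\x7fELF" + bytes([1, 2, 1]) + bytes(9) + struct.pack(">HH", 2, EM_MIPS)
info = read_elf_ident(sample)
print(info)
```

A header that declares ELFCLASS32 for a MIPS binary would then be a candidate for the mismatch described above; deciding whether the code is really MIPS64 needs further evidence that this sketch does not attempt to gather.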

2. Procedure Decomposition

Procedures to strands based on CFG representation
A BB may contain instructions which relate to different executions but reside together due to compiler considerations.
Thus we further decompose BBs into independent units of execution, called strands, by applying slicing.

We assume the BB is in Static Single Assignment (SSA) form, a property of the VEX-IR lifting we use.
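The decomposition of a BB into strands can be illustrated with a small backward-slicing sketch. The instruction encoding used here, a `(dest, uses)` tuple per SSA instruction, and the function name `extract_strands` are assumptions made for the example; real VEX-IR statements carry far more detail.

```python
def extract_strands(block):
    """Split an SSA basic block into strands by backward slicing.

    block is a list of (dest, uses) tuples: dest is the variable the
    instruction defines (None for branches and stores), uses is a tuple of
    the variables it reads.
    """
    unused = set(range(len(block)))
    strands = []
    while unused:
        last = max(unused)              # start from the last uncovered instruction
        unused.discard(last)
        strand = [last]
        needed = set(block[last][1])    # variables this strand still depends on
        for i in range(last - 1, -1, -1):
            dest, uses = block[i]
            if dest is not None and dest in needed:
                strand.append(i)
                needed |= set(uses)
                unused.discard(i)
        strands.append([block[i] for i in sorted(strand)])
    return strands


block = [
    ("t1", ("sp",)),        # t1 = load from the stack
    ("t2", ()),             # t2 = 0x1F
    ("t3", ("t1", "t2")),   # t3 = compare t1 with t2
    ("t4", ("a0",)),        # t4 = a0 + 1, unrelated to the branch
    (None, ("t3",)),        # branch on t3
]
strands = extract_strands(block)
print(len(strands))
```

In this example the branch and the three instructions it depends on form one strand, and the unrelated computation of t4 forms a second one.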

2.1 Optimizing and Normalizing Strands

To overcome syntactic differences between different compilations of the same procedure, we further bring semantically equivalent strands to the same syntactic form.

Offset elimination: The first step towards canonical form is the removal of offset values that pertain to the concrete structure of the binary file. We do not remove offsets which pertain to stack and struct manipulation, as they are more relevant to the semantics of the procedure, serving as a descriptor of the type of data the procedure handles.
Register folding
Compiler optimization: a VEX-IR strand $\rightarrow$ an LLVM-IR function. Relevant optimizations include expression simplification, constant folding and propagation, instruction combining, common subexpression elimination and dead code elimination.
Variable name normalization: To further advance a strand towards canonical form, we rename variables appearing in the optimized strand according to the order in which they appear.
We denote $Strands(p)$ as the set of canonical strands (each strand is represented as a string of its instructions) for a procedure $p$.
Procedure $p$ $\rightarrow$ $Strands(p)$
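Variable name normalization is easy to illustrate on a textual strand. The sketch below assumes an LLVM-like syntax in which every variable starts with `%`; both that syntax and the function name `normalize_names` are assumptions for the example, not the paper's implementation.

```python
import re

VARIABLE = re.compile(r"%\w+")


def normalize_names(strand: str) -> str:
    """Rename variables to %v0, %v1, ... in order of first appearance."""
    mapping = {}

    def rename(match):
        name = match.group(0)
        if name not in mapping:
            mapping[name] = f"%v{len(mapping)}"
        return mapping[name]

    return VARIABLE.sub(rename, strand)


# Two strands that differ only in the names chosen for their variables.
a = "%r5 = add %r2, 1\n%r9 = mul %r5, %r2"
b = "%x = add %y, 1\n%z = mul %x, %y"
print(normalize_names(a) == normalize_names(b))
```

After renaming, the two strands become the same string, so they compare equal when procedures are represented as sets of strands.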


3. Pairwise Procedure Similarity

We denote a query procedure, i.e., the procedure being searched for, as $q_v \in Q$, where $Q$ is the set of all procedures in the containing query executable. The target procedure, i.e., the candidate procedure that $q$ is being compared to, is denoted by $t \in T$ (similarly, $T$ being the containing target executable).
Given a pair $(q, t)$, we define procedure similarity as follows:
$Sim(q, t) = |Strands(q) \cap Strands(t)|$
To calculate Sim faster, we keep the procedure representation as a set of hashed strands (without consideration for hash counts).
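A minimal sketch of this representation, assuming strands are already canonical strings: each procedure becomes a set of strand hashes, and $Sim$ is the size of the set intersection. The choice of SHA-256 and the helper names are assumptions; the text does not say which hash function is used.

```python
import hashlib


def hashed_strands(strands):
    """Represent a procedure as a set of strand hashes (duplicates collapse)."""
    return {hashlib.sha256(s.encode("utf-8")).hexdigest() for s in strands}


def sim(q, t):
    """Sim(q, t) = |Strands(q) & Strands(t)| over hashed strand sets."""
    return len(q & t)


# Toy canonical strands; "s1" appears twice in q but is counted once.
q = hashed_strands(["s1", "s2", "s3", "s1"])
t = hashed_strands(["s2", "s3", "s4"])
print(sim(q, t))
```

Using sets means a strand that occurs several times in a procedure contributes only once, which matches the remark that hash counts are not considered.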

Binary Similarity as a Back-and-Forth Game

1. Game Motivation

Procedure-centric matching is insufficient
The goal is to transition from a local match, which relies on a local maximum of the similarity score, to a global maximum.

Knowledge in surrounding executable leads to better matching

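The player first selects $t_1$ as the match for $q_1$. The rival then tries to prove the player wrong by selecting $q_2$ as the match for $t_1$, because $Sim(q_2, t_1) > Sim(q_1, t_1)$.
The player accepts $(q_2, t_1)$ and corrects its choice by selecting another procedure $t_2 \in T$ to match $q_1$.
The rival cannot pick a better match.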

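[Algorithm listing: the back-and-forth matching game. The line numbers below refer to this listing.]
Input: $T$: the target executable (a set of procedures)
$Q$: the query executable (a set of procedures), containing $q_v$
$q_v$: the vulnerable procedure in $Q$
Output: $Matches$: the resulting partial matching, containing (at least) a match for $q_v$
Line 1: create an empty dictionary $Matches$ to store all matched procedures.
Line 2: initialize a stack $ToMatch$ that holds all procedures we are trying to match, and push $q_v$ onto it, since this is the main procedure we need to match.
After initialization, the matching game starts. It is expressed as a while loop that terminates when the game ends.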

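In each loop iteration, or step of the game, we try to match the procedure $M$ at the top of the $ToMatch$ stack (Line 4). By checking whether $M$ belongs to $Q$ or to $T$, and setting $My$ and $Other$ accordingly, we decide the direction in which matching is performed. We perform a forward match, searching $Other$ for the best match for $M$ while ignoring all previously matched procedures. This is done by calling $GetBestMatch()$ (Line 9). Using $Forward$, the best match for $M$, we then perform a backward match and store the result in $Back$ (Line 10).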

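GameDidntEnd()
The game ends when one of the following holds:
A match for $q_v$ has been found.
$ToMatch$ has reached a fixed state. This happens when no new procedure is pushed onto $ToMatch$ by $PushIfNotExists()$ (Line 15). In this case the game could never conclude, which means the matching process does not succeed.
As a heuristic, the game can also be stopped if too many matches are found or if the matching contains too many procedures.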
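The game described above can be sketched as follows. This is a reconstruction from the prose, not the paper's listing: in particular, pushing $Back$ onto the stack when the backward match disagrees, the `max_matches` cut-off, and the toy scores other than 53 and 71 are assumptions made for the example.

```python
def get_best_match(proc, candidates, sim):
    """Best-scoring candidate for proc, or None if no candidate is left."""
    return max(candidates, key=lambda c: sim(proc, c), default=None)


def back_and_forth(q_v, Q, T, sim, max_matches=None):
    matches = {}        # query procedure -> target procedure
    matched = set()     # procedures already paired, from either executable
    to_match = [q_v]    # stack; q_v stays at the bottom until it is matched
    while q_v not in matches:
        if max_matches is not None and len(matches) >= max_matches:
            break                                   # heuristic stop
        m = to_match[-1]
        my, other = (Q, T) if m in Q else (T, Q)
        forward = get_best_match(m, [p for p in other if p not in matched], sim)
        if forward is None:
            break                                   # nothing left to match against
        back = get_best_match(forward, [p for p in my if p not in matched], sim)
        if back == m:                               # both directions agree
            q, t = (m, forward) if m in Q else (forward, m)
            matches[q] = t
            matched.update((q, t))
            to_match.pop()
        elif back not in to_match:                  # rival found a better candidate
            to_match.append(back)
        else:
            break                                   # fixed state: game fails
    return matches


scores = {
    frozenset(("ftp_glob_retrieve", "sub_443ee2")): 53,
    frozenset(("get_ftp", "sub_443ee2")): 71,
    frozenset(("ftp_glob_retrieve", "sub_4ea884")): 40,  # hypothetical value
    frozenset(("get_ftp", "sub_4ea884")): 12,            # hypothetical value
}


def toy_sim(a, b):
    return scores.get(frozenset((a, b)), 0)


result = back_and_forth(
    "ftp_glob_retrieve",
    ["ftp_glob_retrieve", "get_ftp"],
    ["sub_443ee2", "sub_4ea884"],
    toy_sim,
)
print(result)
```

On the toy scores the first round is rejected by the rival, get_ftp is matched with sub_443ee2 in the second round, and the query procedure is then matched with sub_4ea884, which mirrors the walkthrough of Figure 2(b).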