【ICSE 2021】ATVHUNTER: Reliable Version Detection of Third-Party Libraries for Vulnerability Identification in Android Applications
Affiliations: 1 The Hong Kong Polytechnic University, 2 Nankai University, 3 Tianjin University, 4 Nanyang Technological University, 5 Monash University
Venue: ICSE 2021
Paper link: ATVHUNTER: Reliable Version Detection of Third-Party Libraries for Vulnerability Identification in Android Applications
Source code: the paper's code is not open source, but an online detection tool is available: https://scantist.io
Reference: https://github.com/Anonymous-Phunter/PHunter
ABSTRACT
The paper proposes ATVHUNTER (Android in-app Third-party library Vulnerability Hunter). It detects the exact versions of the third-party libraries (TPLs) inside an Android app, collects vulnerability information about TPLs, and reports the TPLs found in the input app together with their known vulnerabilities. In essence it is a similarity-based library detection approach that relies on prior knowledge.
For app analysis, ATVHUNTER uses a two-phase detection approach to identify specific TPL versions: Control Flow Graphs (CFGs) serve as the coarse-grained feature, and the opcodes in each basic block of a CFG serve as the fine-grained feature.
For the reference databases, the TPL database contains 189,545 unique TPLs with 3,006,676 versions, and the TPL vulnerability database contains 1,180 CVEs and 224 security bugs found in TPLs.
The authors evaluate ATVHUNTER's effectiveness, efficiency, and obfuscation-resilient capability. They also run a large-scale analysis on 104,446 top apps and find 9,050 vulnerable apps, involving 53,337 known vulnerabilities and 7,480 security bugs across 10,616 vulnerable TPLs.
1. INTRODUCTION
1.1 Why TPL detection matters
- Attackers can exploit the vulnerabilities in TPLs
- Attackers can inject backdoors in TPLs
- TPLs are scattered in different apps
- Information about the TPL components in an app may not be transparent to app developers (due to many direct or transitive dependencies)
1.2 Existing TPL detection approaches
- Without prior knowledge:
  - clustering-based methods: LibRadar (ICSE 2016), LibD (ICSE 2017), LibExtractor (WiSec 2020)
- With prior knowledge:
  - similarity-based methods: LibScout (CCS 2016), LibID (ISSTA 2019)
Source: Research on Third-Party Libraries in Android Apps: A Taxonomy and Systematic Literature Review (TSE 2021)
1.3 Weaknesses of existing TPL detection approaches
- Clustering-based methods:
  - Require a considerable number of apps as input
  - Low recall: can only identify commonly used TPLs
  - Labor-intensive: verifying the clustering results takes manual effort
  - Imprecise: cannot identify the precise version
- Similarity-based methods:
  - Require a predefined TPL database as the reference database
  - Low recall: the published TPL databases are far smaller than the set of TPLs in the actual market
  - Imprecise: cannot identify the precise version
2. ARCHITECTURE
2.1 TPL Detection
Goal: use the data in the TPL database to identify which TPLs an app contains
2.1.1 Preprocessing
- Task 1: decompile the APK into bytecode and convert it to an IR (using APKTOOL)
- Task 2: remove the primary module from the APK
  - primary module: the code written by the app developer
  - non-primary module: the TPLs
  - Implementation:
    - Use AndroidManifest.xml to find the package that contains MainActivity, for example:
      <manifest …… package="com.cmic.sso.myapplication" …… >
    - Delete the files under that package's namespace
  - Side effects:
    - Side Effect 1: package flattening and package renaming obfuscation leave some host code undeleted
      - Before obfuscation:
        mycompany.myapplication.MyMainActivity
        mycompany.myapplication.Foo
        mycompany.myapplication.Bar
        mycompany.myapplication.extra.FirstExtra
        mycompany.myapplication.extra.SecondExtra
        mycompany.util.FirstUtil
        mycompany.util.SecondUtil
      - After ProGuard's default obfuscation:
        mycompany.myapplication.MyMainActivity
        mycompany.myapplication.a
        mycompany.myapplication.b
        mycompany.myapplication.a.a
        mycompany.myapplication.a.b
        mycompany.a.a
        mycompany.a.b
      - After obfuscation with -flattenpackagehierarchy 'myobfuscated':
        mycompany.myapplication.MyMainActivity
        mycompany.myapplication.a
        mycompany.myapplication.b
        myobfuscated.a.a
        myobfuscated.a.b
        myobfuscated.b.a
        myobfuscated.b.b
        myobfuscated.a replaces mycompany.myapplication.extra, so mycompany.myapplication.extra.FirstExtra and mycompany.myapplication.extra.SecondExtra can no longer be deleted
    - Side Effect 2: a special package name leaves host code undeleted
    - Side Effect 3: the host app and a TPL share the same package namespace, so the TPL is deleted by mistake
    - Side Effects 1 and 2 do not affect the accuracy of TPL identification
    - Side Effect 3 causes false negatives (FN)
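The prefix-based removal and Side Effect 1 can be sketched in a few lines of Python. This is only an illustration, not ATVHUNTER's implementation: the helper name `split_primary_module` is hypothetical, and the class names are the flattened ProGuard example from above.

```python
def split_primary_module(class_names, host_package):
    """Split classes into (host code, remaining TPL candidates) by package prefix."""
    prefix = host_package + "."
    host = [c for c in class_names if c.startswith(prefix)]
    rest = [c for c in class_names if not c.startswith(prefix)]
    return host, rest


# Class names after -flattenpackagehierarchy 'myobfuscated' (example above).
obfuscated = [
    "mycompany.myapplication.MyMainActivity",
    "mycompany.myapplication.a",
    "mycompany.myapplication.b",
    "myobfuscated.a.a",
    "myobfuscated.a.b",
    "myobfuscated.b.a",
    "myobfuscated.b.b",
]

host, rest = split_primary_module(obfuscated, "mycompany.myapplication")
# myobfuscated.a.* used to be mycompany.myapplication.extra.* (host code),
# but it no longer shares the host prefix, so it stays among the candidates.
print(host)
print(rest)
```

The leftover host classes simply become extra candidates that match nothing in the TPL database, which is consistent with the note that Side Effects 1 and 2 do not hurt identification accuracy.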
2.1.2 Module Decoupling
Goal: separate the individual TPLs from one another
Method: treat each Class Dependency Graph (CDG) as one TPL candidate (using Androguard)
The class dependency relationship includes:
① class inheritance
② method call relationship
③ field reference relationship
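One way to picture the decoupling step is as a connected-components computation over class dependency edges. The sketch below assumes that edge direction can be ignored when grouping, which is my simplification rather than something the notes state; `decouple_modules` and the `lib1.*` / `lib2.*` class names are hypothetical.

```python
from collections import defaultdict


def decouple_modules(classes, dependencies):
    """Group classes into connected components of the class dependency graph.

    `dependencies` holds (class_a, class_b) pairs for inheritance, method-call
    and field-reference relations. Direction is ignored (an assumption), and
    edges that touch classes outside `classes` are skipped.
    """
    known = set(classes)
    graph = defaultdict(set)
    for a, b in dependencies:
        if a in known and b in known:
            graph[a].add(b)
            graph[b].add(a)

    seen, candidates = set(), []
    for start in classes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in graph[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        candidates.append(component)
    return candidates


classes = ["lib1.A", "lib1.B", "lib2.X", "lib2.Y", "lib2.Z"]
dependencies = [
    ("lib1.B", "lib1.A"),  # inheritance
    ("lib2.X", "lib2.Y"),  # method call
    ("lib2.Y", "lib2.Z"),  # field reference
]
candidates = decouple_modules(classes, dependencies)
print(candidates)
```

Each returned set plays the role of one TPL candidate that is fingerprinted in the next step.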
2.1.3 Feature Generation
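Goal: extract a fingerprint for each TPL
Method:
(1) Coarse-grained feature:
① Extract the CFG of every method in the candidate TPLs (using Soot) and number each node (basic block, BB) of the CFG in execution order, from small to large
When numbering the children of a branch node n:
- the child with more outgoing edges gets number n+1
- if the outgoing edges are equal, the child with more statements gets number n+1
② Represent each node as nodeCount -> (child1,child2,…)
③ Represent each CFG (one per method) as an adjacency list
The adjacency list has the form [parent1 -> (child1,child2,…), parent2 -> …]
④ Compute a hash of the adjacency list (one adjacency list per method)
⑤ Sort the hash values of all methods in the TPL, hash the sorted sequence, and use that hash as the TPL's coarse-grained feature (T1)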
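Steps ① to ⑤ can be sketched as follows. Several choices here are assumptions made for illustration: blocks are numbered by a depth-first walk, SHA-256 stands in for whichever hash the tool uses, and the toy CFG and function names are hypothetical.

```python
import hashlib


def number_blocks(cfg, entry):
    """Number basic blocks by a depth-first walk from `entry`.

    `cfg` maps block name -> (successor names, statement count). Children of a
    branch are visited by out-degree, then statement count, both descending.
    """
    order = {}
    stack = [entry]
    while stack:
        block = stack.pop()
        if block in order:
            continue
        order[block] = len(order)
        children = sorted(
            cfg[block][0],
            key=lambda b: (len(cfg[b][0]), cfg[b][1]),
            reverse=True,
        )
        # Push in reverse so the highest-priority child is popped first.
        stack.extend(reversed(children))
    return order


def method_signature(cfg, entry):
    """Return (adjacency list string, hash of that string) for one method."""
    order = number_blocks(cfg, entry)
    rows = []
    for block, num in sorted(order.items(), key=lambda kv: kv[1]):
        children = sorted(order[c] for c in cfg[block][0])
        rows.append(f"{num}->({','.join(map(str, children))})")
    adjacency = "[" + ", ".join(rows) + "]"
    return adjacency, hashlib.sha256(adjacency.encode()).hexdigest()


def coarse_feature(method_hashes):
    """Hash the sorted method hashes to get the TPL-level feature (T1)."""
    return hashlib.sha256("".join(sorted(method_hashes)).encode()).hexdigest()


# Toy CFG: A branches to B and C; both have one successor, C has more statements.
cfg = {"A": (["B", "C"], 3), "B": (["D"], 2), "C": (["D"], 5), "D": ([], 1)}
adjacency, h1 = method_signature(cfg, "A")
_, h2 = method_signature({"E": ([], 1)}, "E")
print(adjacency)
print(coarse_feature([h1, h2]))
```

Sorting the method hashes before the final hash makes T1 independent of the order in which methods are enumerated, and nothing in the signature depends on class or method names.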
(2) Fine-grained feature:
① For each CFG, extract the opcodes of its BBs in adjacency-list order (using Soot)
② Compute a fuzzy hash of the opcode sequence (using ssdeep)
The advantage of a fuzzy hash: if one part of the feature changes due to code obfuscation, it does not cause a big difference in the final fingerprint.
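The locality property of a fuzzy hash can be illustrated with a toy piecewise hash. This is not ssdeep and not the tool's fingerprint: it is a simplified stand-in in which chunk boundaries are chosen from the content and each chunk is reduced to one character. The opcode list is only an illustrative sequence of Dalvik-style mnemonics, and all function names are hypothetical.

```python
import zlib

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def is_boundary(opcode):
    """Content-defined chunk boundary: depends only on the opcode itself."""
    return zlib.crc32(opcode.encode()) % 3 == 0


def chunk_char(chunk):
    """Reduce one chunk of opcodes to a single signature character."""
    return ALPHABET[zlib.crc32(" ".join(chunk).encode()) % 64]


def toy_fuzzy_hash(opcodes):
    """Piecewise signature: one character per content-defined chunk."""
    signature, chunk = [], []
    for op in opcodes:
        chunk.append(op)
        if is_boundary(op):
            signature.append(chunk_char(chunk))
            chunk = []
    if chunk:
        signature.append(chunk_char(chunk))
    return "".join(signature)


def shared_ends(a, b):
    """Length of the common prefix plus the non-overlapping common suffix."""
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return prefix + suffix


base = ["const/4", "invoke-virtual", "move-result", "if-eqz", "iget",
        "add-int", "iput", "goto", "new-instance", "return-void"] * 4
changed = list(base)
changed[17] = "nop"  # a single local modification

old, new = toy_fuzzy_hash(base), toy_fuzzy_hash(changed)
print(old)
print(new)
print(shared_ends(old, new), "of", len(old), "characters are shared at the ends")
```

Because a single changed opcode can only alter, split, or merge the chunk it sits in, at most two characters of the old signature are affected, whereas a cryptographic hash of the whole sequence would change completely.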
2.1.4 TPL Database Construction
- We crawled all Java TPLs from Maven Repository (189,545 unique TPLs with their 3,006,676 versions) to build our TPL database.
- We store both coarse-grained and fine-grained features in a MongoDB database.
- We spent more than one month to collect all the TPLs and another two months to generate the TPL feature database.
2.1.5 Library Identification
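Goal: for each TPL candidate in the app, find the matching TPL and its version
(1) Potential TPL Identification
- a) Search by package names
  Use the package name to filter out unrelated TPLs:
  - if the candidate's package name is not obfuscated, filter out unrelated TPLs
  - if the candidate's package name is obfuscated, skip this filter
- b) Search by the number of classes
  This filters out unrelated TPLs by class count: if one side has fewer than 40% of the other side's classes, the pair is not compared further
- c) Search by coarse-grained features
  - if the coarse-grained features (T1) are identical, the TPL is considered matched
  - if more than 70% of the coarse-grained features are the same, the TPL is considered a potential TPL
  Only potential TPLs go on to Version Identification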
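Filters b) and c) can be sketched as two small predicates. The notes do not say how the 70% is measured; the sketch assumes it is the share of the reference TPL's per-method hashes that also occur in the candidate. The function names and the `h0`, `h1`, … hash placeholders are hypothetical.

```python
def class_count_compatible(n_candidate, n_reference, ratio=0.4):
    """Filter b): keep the pair unless one side has under 40% of the other's classes."""
    small, large = sorted((n_candidate, n_reference))
    return large > 0 and small >= ratio * large


def coarse_overlap(candidate_hashes, reference_hashes):
    """Share of the reference TPL's method hashes found in the candidate (assumed metric)."""
    reference = set(reference_hashes)
    if not reference:
        return 0.0
    return len(reference & set(candidate_hashes)) / len(reference)


def classify(candidate_hashes, reference_hashes, threshold=0.7):
    """Filter c): 'exact' if all method hashes agree, 'potential' above the threshold."""
    if sorted(candidate_hashes) == sorted(reference_hashes):
        return "exact"  # identical method hashes imply an identical T1
    if coarse_overlap(candidate_hashes, reference_hashes) > threshold:
        return "potential"
    return "rejected"


reference = [f"h{i}" for i in range(10)]
print(class_count_compatible(30, 100))              # too small: skipped
print(classify(reference[:8] + ["x"], reference))   # 8 of 10 hashes shared
print(classify(reference[:5], reference))           # 5 of 10 hashes shared
```

Running the cheap filters first keeps the expensive fuzzy-hash comparison limited to a small set of potential TPLs.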
(2) Version Identification
- Similarity between two methods: Method Similarity Score (MSS)
  - In the MSS formula, $d[m_a,m_b]$ is the edit distance (computed with ssdeep) between the fingerprints of $m_a$ and $m_b$ (the hash values derived from the adjacency lists)
  - Edit distance: the minimum number of edit operations (insertion, deletion, substitution) required to turn one fingerprint into the other.
  - If MSS $\ge \theta$ ($\theta = 0.85$), the two methods are considered matched
- Similarity between two TPLs: TPL Similarity Score (TSS)
  - $t_1$: a TPL from the app
  - $t_2$: a TPL from the TPL DB
  - $M|t_2|$: the number of methods in $t_2$
  - $M|t_1 \cap t_2|$: the number of methods $m_j$ that satisfy the following conditions
    - $m_j$ is a method of $t_2$
    - there exists $m_i \in t_1$ with $MSS(m_i,m_j) \ge \theta$ ($\theta = 0.85$)
    - $t_1$ and $t_2$ contain at least one pair of methods whose MSS is 1 (a fully matched pair)
  - If TSS $\ge \delta$ ($\delta = 0.95$), the two TPLs are considered matched (when several TPLs match, the one with the highest TSS is taken as the final result)
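The MSS and TSS formulas themselves are not reproduced in these notes, so the sketch below fills them in with explicit assumptions: MSS is taken as one minus the edit distance normalized by the longer fingerprint, and TSS as $M|t_1 \cap t_2|$ divided by $M|t_2|$. The short strings stand in for real ssdeep fingerprints, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def mss(fp_a, fp_b):
    """ASSUMED form: 1 - d[m_a, m_b] / max(len(m_a), len(m_b))."""
    longest = max(len(fp_a), len(fp_b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(fp_a, fp_b) / longest


def tss(app_tpl, db_tpl, theta=0.85):
    """ASSUMED form: matched methods of t2 divided by all methods of t2."""
    if not db_tpl:
        return 0.0
    scores = [[mss(mi, mj) for mi in app_tpl] for mj in db_tpl]
    # Require at least one fully matched pair (MSS == 1) between t1 and t2.
    if not any(s == 1.0 for row in scores for s in row):
        return 0.0
    matched = sum(1 for row in scores if any(s >= theta for s in row))
    return matched / len(db_tpl)


app_tpl = ["aaaaaaaaaa", "bbbbbbbbbb"]               # t1: fingerprints from the app
db_tpl = ["aaaaaaaaaa", "bbbbbbbbbc", "zzzzzzzzzz"]  # t2: fingerprints from the DB
score = tss(app_tpl, db_tpl)
print(score, "matched" if score >= 0.95 else "not matched")
```

In this toy case two of the three reference methods are matched, so the score stays below $\delta = 0.95$ and the candidate is not reported as this version; picking a version then amounts to taking the reference with the highest TSS at or above $\delta$.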
2.2 Vulnerable TPL-V Identification
2.2.1 Database Construction
(1) Known TPL Vulnerability Collection
- Extract the CPE name of each TPL from the TPL database
  - CPE 2.3 (URI binding):
    cpe:/<part>:<vendor>:<product>:<version>:<update>:<edition>:<language>
- Use the cve-search tool to look up the vulnerabilities related to each TPL
- Finally, we collected 1,180 CVEs from 957 unique TPLs with 38,243 affected versions.
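To make the CPE naming scheme concrete, the following sketch splits a CPE URI into its named components. It covers only the simple colon-separated form shown above and ignores percent-encoding; `parse_cpe_uri` and the example name are made up for illustration and do not come from the paper or from cve-search.

```python
CPE_FIELDS = ("part", "vendor", "product", "version", "update", "edition", "language")


def parse_cpe_uri(cpe):
    """Split a 'cpe:/...' URI into a dict of its seven components."""
    if not cpe.startswith("cpe:/"):
        raise ValueError(f"not a CPE URI: {cpe!r}")
    values = cpe[len("cpe:/"):].split(":")
    if len(values) > len(CPE_FIELDS):
        raise ValueError(f"too many components: {cpe!r}")
    # Trailing components may be omitted; treat them as empty.
    values += [""] * (len(CPE_FIELDS) - len(values))
    return dict(zip(CPE_FIELDS, values))


# Hypothetical CPE name for an application ("a") library.
parsed = parse_cpe_uri("cpe:/a:example_vendor:example_lib:1.2.3")
print(parsed)
```

The vendor, product, and version fields are the parts that tie a CVE entry to a specific TPL version.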
(2) Security Bug Collection
- We also obtain 224 security bugs from Github and Bitbucket.
- These bugs come from 152 open-source TPLs with their corresponding 4,533 versions.
2.2.2 Vulnerable TPL-V Identification
Check whether each matched TPL version is vulnerable
3. EVALUATION
Measures the effectiveness and performance of ATVHUNTER
3.1 Preparation
3.1.1 Ground-truth Dataset Construction
- We first collect the latest versions of 500 open-source apps from F-Droid.
- For each app, we manually analyze it and get the in-app TPLs with their specific versions.
- We then download these TPLs with their versions from the Maven repository.
- We filter 144 apps out due to the incomplete versions of TPLs maintained in the Maven repository.
- We choose 356 apps and 189 unique TPLs with the complete 6,819 version files as the ground truth.
3.1.2 Threshold Selection
- We randomly select three groups (3 * 200) of apps except the aforementioned dataset to decide appropriate thresholds for MSS and TSS.
3.2 Effectiveness Evaluation
3.3 Efficiency Evaluation
3.4 Obfuscation-resilient Capability
4. LARGE-SCALE ANALYSIS
Uses ATVHUNTER to reveal the impact of TPL vulnerabilities in the real world
- We collected commercial Android apps from Google Play based on the number of installations.
- We finally collected 104,446 apps and found 72% of them (73,110/104,446) use TPLs.
- 9,050/73,110 of apps include vulnerable TPLs, involving 53,337 vulnerabilities and 7,480 security bugs.
- vulnerabilities are from 166 TPLs with 10,362 versions
- security bugs are from 27 TPLs with 284 versions
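The shares behind these counts can be recomputed from the figures quoted above; the variable names are only illustrative. Note that 73,110 / 104,446 works out to roughly 70%, slightly below the quoted 72%, so one of the quoted figures may have been rounded or transcribed differently.

```python
total_apps = 104_446
apps_with_tpls = 73_110
vulnerable_apps = 9_050

share_with_tpls = apps_with_tpls / total_apps
share_vulnerable_of_tpl_apps = vulnerable_apps / apps_with_tpls
share_vulnerable_of_all = vulnerable_apps / total_apps

print(f"apps using TPLs:                   {share_with_tpls:.1%}")
print(f"vulnerable among TPL-using apps:   {share_vulnerable_of_tpl_apps:.1%}")
print(f"vulnerable among all collected:    {share_vulnerable_of_all:.1%}")
```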
5. DISCUSSION
Limitations:
(1) Native libraries: the hash-based approach may not work
(2) App packing