Cyber Security Incident Prediction - Paper Notes (Part 1)

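Before we begin:

  • Questions, doubts, corrections, suggestions, and reports of dead links are all welcome in the comments.
  • To avoid misinterpretation, most of the content is taken directly from the original papers.
  • The "Key points" parts are not from the original papers; they are my own summary. If you spot a mistake, please point it out. Many thanks.
  • This post is also published here.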

1、Cloudy with a Chance of Breach: Forecasting Cyber Security Incidents

Authors: Yang Liu, Armin Sarabi, Jing Zhang, and Parinaz Naghizadeh, University of Michigan; Manish Karir, QuadMetrics, Inc.; Michael Bailey, University of Illinois at Urbana-Champaign; Mingyan Liu, University of Michigan and QuadMetrics, Inc.
Venue: USENIX Security, 2015
Paper link: https://www.usenix.org/system/files/conference/usenixsecurity15/sec15-paper-liu.pdf
Abstract: In this study we characterize the extent to which cyber security incidents, such as those referenced by Verizon in its annual Data Breach Investigations Reports (DBIR), can be predicted based on externally observable properties of an organization’s network. We seek to proactively forecast an organization’s breaches and to do so without cooperation of the organization itself. To accomplish this goal, we collect 258 externally measurable features about an organization’s network from two main categories: mismanagement symptoms, such as misconfigured DNS or BGP within a network, and malicious activity time series, which include spam, phishing, and scanning activity sourced from these organizations. Using these features we train and test a Random Forest (RF) classifier against more than 1,000 incident reports taken from the VERIS community database, Hackmageddon, and the Web Hacking Incidents Database that cover events from mid-2013 to the end of 2014. The resulting classifier is able to achieve a 90% True Positive (TP) rate, a 10% False Positive (FP) rate, and an overall 90% accuracy.
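The three figures quoted at the end of the abstract are all derived from the same confusion-matrix counts. A minimal sketch of how they relate, using made-up counts (not the paper's data) chosen only so that the numbers come out to 90% / 10% / 90%:

```python
def rates(tp, fp, tn, fn):
    """Return (TP rate, FP rate, accuracy) from confusion-matrix counts."""
    tpr = tp / (tp + fn)                    # share of real incidents that were flagged
    fpr = fp / (fp + tn)                    # share of non-incidents that were flagged
    acc = (tp + tn) / (tp + fp + tn + fn)   # share of all predictions that were right
    return tpr, fpr, acc

# Hypothetical counts for illustration; they are not taken from the paper.
tpr, fpr, acc = rates(tp=90, fp=10, tn=90, fn=10)
print(tpr, fpr, acc)
```

Note that whenever the TP rate equals one minus the FP rate, as with 90% and 10%, the accuracy takes that same value regardless of how many incident and non-incident samples there are.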

2、RiskTeller: Predicting the Risk of Cyber Incidents

Authors: Leyla Bilge, Yufei Han, and Matteo Dell’Amico
Venue: CCS, 2017
Paper link: https://dl.acm.org/doi/10.1145/3133956.3134022
Abstract: The current evolution of the cyber-threat ecosystem shows that no system can be considered invulnerable. It is therefore important to quantify the risk level within a system and devise risk prediction methods such that proactive measures can be taken to reduce the damage of cyber attacks. We present RiskTeller, a system that analyzes binary file appearance logs of machines to predict which machines are at risk of infection months in advance. Risk prediction models are built by creating, for each machine, a comprehensive profile capturing its usage patterns, and then associating each profile to a risk level through both fully and semi-supervised learning methods. We evaluate RiskTeller on a year-long dataset containing information about all the binaries appearing on machines of 18 enterprises. We show that RiskTeller can use the machine profile computed for a given machine to predict subsequent infections with the highest prediction precision achieved to date.
Contributions:

  • We propose RiskTeller, a system that leverages both supervised and semi-supervised learning methodologies to predict which machines are at risk with the highest accuracy achieved to date.
  • We design 89 features that are extracted from per-machine file appearance logs to produce machine profiles, which capture the machine’s patterns of usage and security awareness of enterprise users.
  • We design a semi-supervised machine learning algorithm, which leverages on profile similarity to infer fuzzy labels for unlabeled machines based on similar labeled machines to enrich the ground truth.
  • We perform a comprehensive evaluation that shows how RiskTeller can predict which machines will get infected with high precision, and that the former two steps are fundamental in getting high-quality results.
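To make the third contribution more concrete, here is a minimal sketch of inferring a fuzzy label for an unlabeled machine from similar labeled machines. It is a generic similarity-weighted average, not RiskTeller's actual algorithm: the cosine similarity, the parameter k, the two-dimensional profiles (the paper uses 89 features), and the names `cosine` and `fuzzy_label` are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two profile vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def fuzzy_label(profile, labeled, k=3):
    """Similarity-weighted average of the labels of the k most similar labeled profiles.

    Labels are 1.0 (risky) or 0.0 (clean); the result is a soft score in [0, 1].
    """
    scored = sorted(((max(0.0, cosine(profile, p)), y) for p, y in labeled),
                    reverse=True)[:k]
    total = sum(s for s, _ in scored)
    # With no similar labeled machine at all, fall back to "unknown" (0.5).
    return sum(s * y for s, y in scored) / total if total > 0 else 0.5

# Hypothetical toy profiles: (feature vector, label).
labeled = [([1.0, 0.0], 1.0), ([0.9, 0.1], 1.0), ([0.0, 1.0], 0.0)]
risky = fuzzy_label([0.8, 0.2], labeled)   # close to the two risky profiles
clean = fuzzy_label([0.0, 0.9], labeled)   # close to the clean profile
print(round(risky, 2), round(clean, 2))
```

The soft scores can then be added to the ground truth as extra, lower-confidence training labels, which is the enrichment the contribution describes.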

3、Tiresias: Predicting Security Events Through Deep Learning

Authors: Yun Shen, Enrico Mariconti, Pierre-Antoine Vervier, and Gianluca Stringhini
Venue: CCS, 2018
Paper link: https://seclab.bu.edu/people/gianluca/papers/tiresias-ccs2018.pdf
Abstract: With the increased complexity of modern computer attacks, there is a need for defenders not only to detect malicious activity as it happens, but also to predict the specific steps that will be taken by an adversary when performing an attack. However, this is still an open research problem, and previous research in predicting malicious events only looked at binary outcomes (e.g., whether an attack would happen or not), but not at the specific steps that an attacker would undertake. To fill this gap we present Tiresias, a system that leverages Recurrent Neural Networks (RNNs) to predict future events on a machine, based on previous observations. We test Tiresias on a dataset of 3.4 billion security events collected from a commercial intrusion prevention system, and show that our approach is effective in predicting the next event that will occur on a machine with a precision of up to 0.93. We also show that the models learned by Tiresias are reasonably stable over time, and provide a mechanism that can identify sudden drops in precision and trigger a retraining of the system. Finally, we show that the long-term memory typical of RNNs is key in performing event prediction, rendering simpler methods not up to the task.
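Key points: This paper does not merely predict whether a security event will occur (that is, it is not a binary classification task). Instead, it predicts the specific actions an attacker will take during an attack, such as the CVE the attacker will exploit in a multi-step attack, and it allows the potential severity of an attack to be assessed while the attack is still in its early stages.
The authors build sequences of security events in chronological order and use the known event sequences to predict the security events that will happen next.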
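The sequence-building step can be sketched as follows. The predictor shown is deliberately not an RNN: it is a first-order frequency (bigram) baseline, the kind of simpler method the abstract says is not up to the task, used here only to show what "predict the next event from the known sequence" means. The record layout `(machine, timestamp, event_id)`, the event names, and all function names are assumptions.

```python
from collections import Counter, defaultdict

def build_sequences(events):
    """Group (machine, timestamp, event_id) records into per-machine, time-ordered sequences."""
    per_machine = defaultdict(list)
    for machine, _ts, event_id in sorted(events, key=lambda e: (e[0], e[1])):
        per_machine[machine].append(event_id)
    return dict(per_machine)

def fit_bigram(sequences):
    """Count how often each event is immediately followed by each other event."""
    counts = defaultdict(Counter)
    for seq in sequences.values():
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, last_event):
    """Return the most frequent follower of last_event, or None if it was never followed."""
    followers = counts.get(last_event)
    return followers.most_common(1)[0][0] if followers else None

# Hypothetical, out-of-order event log for two machines.
events = [
    ("m1", 3, "exploit"), ("m1", 1, "scan"), ("m1", 2, "login_fail"),
    ("m2", 1, "scan"), ("m2", 2, "login_fail"),
]
sequences = build_sequences(events)
model = fit_bigram(sequences)
prediction = predict_next(model, sequences["m2"][-1])
print(sequences["m2"], "->", prediction)
```

A bigram model only looks one event back, whereas the paper credits the long-term memory of RNNs for its results, so this baseline should be read as a reference point rather than as the paper's method.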

4、Predicting Cyber Security Incidents Using Feature-Based Characterization of Network-Level Malicious Activities

Authors: Liu, Y., Zhang, J., Sarabi, A., Liu, M., Karir, M., Bailey, M.
Venue: IWSPA, 2015
Paper link: https://www.researchgate.net/publication/295351303_Predicting_Cyber_Security_Incidents_Using_Feature-Based_Characterization_of_Network-Level_Malicious_Activities
Abstract: This study offers a first step toward understanding the extent to which we may be able to predict cyber security incidents (which can be of one of many types) by applying machine learning techniques and using externally observed malicious activities associated with network entities, including spamming, phishing, and scanning, each of which may or may not have direct bearing on a specific attack mechanism or incident type. Our hypothesis is that when viewed collectively, malicious activities originating from a network are indicative of the general cleanness of a network and how well it is run, and that furthermore, collectively they exhibit fairly stable and thus predictive behavior over time. To test this hypothesis, we utilize two datasets in this study: (1) a collection of commonly used IP address-based/host reputation blacklists (RBLs) collected over more than a year, and (2) a set of security incident reports collected over roughly the same period. Specifically, we first aggregate the RBL data at a prefix level and then introduce a set of features that capture the dynamics of this aggregated temporal process. A comparison between the distribution of these feature values taken from the incident dataset and from the general population of prefixes shows distinct differences, suggesting their value in distinguishing between the two while also highlighting the importance of capturing dynamic behavior (second order statistics) in the malicious activities. These features are then used to train a support vector machine (SVM) for prediction. Our preliminary results show that we can achieve reasonably good prediction performance over a forecasting window of a few months.
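The prefix-level aggregation described in the abstract can be sketched like this: blacklist listings are mapped to prefixes, turned into a daily count per prefix, and summarized with a first-order and a second-order statistic. The listing format `(day, ip)`, the documentation-range addresses, and the choice of mean and variance as features are assumptions; the abstract does not list the paper's actual features here.

```python
import ipaddress
from collections import defaultdict
from statistics import mean, pvariance

def prefix_time_series(listings, prefixes, days):
    """Return, per prefix, the number of distinct blacklisted IPs on each day."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    seen = defaultdict(set)                  # (prefix, day) -> set of listed IPs
    for day, ip in listings:
        addr = ipaddress.ip_address(ip)
        for net in nets:
            if addr in net:
                seen[(str(net), day)].add(ip)
    return {
        str(net): [len(seen.get((str(net), d), ())) for d in range(days)]
        for net in nets
    }

def features(series):
    """Summarize a daily count series: average level plus how much it fluctuates."""
    return {"mean": mean(series), "variance": pvariance(series)}

# Hypothetical RBL listings as (day index, listed IP address).
listings = [(0, "192.0.2.1"), (0, "192.0.2.7"), (1, "192.0.2.1"), (2, "198.51.100.5")]
series = prefix_time_series(listings, ["192.0.2.0/24", "198.51.100.0/24"], days=3)
for prefix, counts in series.items():
    print(prefix, counts, features(counts))
```

Feature vectors of this kind, one per prefix, are what would then be fed to a classifier such as the SVM mentioned in the abstract.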
