Security Correlation Then and Now: A Sad Truth About SIEM

Latest recommended article published on 2024-09-20 15:47:43

weixin_26636643

Original article: https://medium.com/anton-on-security/security-correlation-then-and-now-a-sad-truth-about-siem-fc5a1afb1001

We all know David Bianco’s Pyramid of Pain, a classic from 2013. The focus of this famous visual is on the indicators that you “latch onto” in your detection activities. This post will reveal a related mystery connected to the evolution of SIEM detection and its current state. So, yeah, this is another way of saying that only a very small number of people are likely to be passionate about it …

But who am I kidding? I plan to present a dangerously long rant about the state of detection content today. So, yes, of course there will be jokes, but ultimately this is a serious thing that has been profoundly bothering me lately.

First, let’s travel to 1999 for a brief minute. Host IDS is very much a thing (but the phrase “something is a thing” has not yet been born), and the term “SIEM” is barely a twinkle in a Gartner analyst’s eye. However, some vendors are starting to develop and sell “SIM” and “SEM” appliances (It is 1999! Appliances are HOT!).

Some of the first soon-to-be-called-SIEM tools have very basic “correlation” rules (really, just aggregation and counting of a single attribute like username or source IP), such as “many connections to the same port across many destinations”, “Cisco PIX log message containing SYNflood, repeated 50 times” and “SSH login failure.” Most of these rules are very fragile, i.e. a tiny deviation in attacker activity will cause a rule not to trigger. They are also very device dependent (for example, you need to write such a rule separately for every firewall device). So the SIM / SEM vendor had to load up many hundreds of these rules. And customers had to suffer through enabling, disabling and tuning them. Yuck!
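
To make the fragility concrete, here is a minimal Python sketch of what such rules boil down to: count a substring in raw log lines, or count distinct destinations per source and port. The function names, the log line format and the thresholds are all invented for illustration; no real product's rule language looked exactly like this.

```python
from collections import defaultdict


def count_rule(lines, needle, threshold):
    """Fire when `needle` appears in at least `threshold` raw log lines."""
    return sum(needle in line for line in lines) >= threshold


def port_sweep_sources(connections, min_destinations):
    """Return sources that hit the same port on many distinct destinations.

    `connections` is an iterable of (src_ip, dst_ip, dst_port) tuples.
    """
    destinations = defaultdict(set)  # (src_ip, dst_port) -> {dst_ip, ...}
    for src_ip, dst_ip, dst_port in connections:
        destinations[(src_ip, dst_port)].add(dst_ip)
    return {src for (src, _port), dsts in destinations.items()
            if len(dsts) >= min_destinations}


# Invented sample data: 50 identical firewall messages, and a variant where
# the device (or a new firmware version) words the message slightly differently.
original = ["fw01 deny tcp SYNflood src=10.0.0.5"] * 50
variant = ["fw01 deny tcp SYN flood src=10.0.0.5"] * 50

print(count_rule(original, "SYNflood", 50))  # True
print(count_rule(variant, "SYNflood", 50))   # False: one space breaks the rule

sweep = [("10.0.0.5", f"192.168.1.{i}", 22) for i in range(1, 31)]
print(port_sweep_sources(sweep, min_destinations=20))  # {'10.0.0.5'}
```

Note how the second `count_rule` call stays silent even though the same activity happened 50 times. That is the "tiny deviation" problem in one line.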

While we are still in 1999, a host IDS like, say, Dragon Squire, a true wonder of 1990s security technology, scours logs for things like “FTP:NESSUS-PROBE” and “FTP:USER-NULL-REFUSED.” For this post, I reached deep into my log archives and actually reviewed some ancient (2002) Dragon HIDS logs to refresh my memory and get into the vibe of that period (no, I didn’t do it on a Blackberry or using Crystal Reports; I am not that dedicated).

Now fast forward to about 2003–2004 — and the revolution happened! SIEM products unleashed normalized events and event taxonomies. I spent some of that time categorizing device event IDs (where does Windows Event ID 1102 go?) into SIEM taxonomy event types, and then writing detection rules on them. SIEM detection content writing became substantially more fun!

This huge advance in SIEM gave us the famous correlation rules like “Several Events of The Exploit Category Followed By an Event of Remote Access Category to Same Destination” that delivered generic detection logic across devices. Life was becoming great! These rules were supposed to be a lot more resilient (such as “any Exploit” and “any Remote Access” vs a specific attack and, say, VNC access). They also worked across devices — write it once, was the promise, and then even if you change the type of the firewall you use, your correlation still detects badness.
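
As a rough sketch of why this felt like magic, the rule below is written purely against taxonomy categories, so it does not care which device produced an event. Everything here is an assumption made for illustration: the `Event` fields, the category strings, the threshold of three exploits and the ten-minute window are hypothetical, not any vendor's actual schema or defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float   # seconds since some epoch
    device: str        # which product logged it; the rule never looks at this
    category: str      # taxonomy category assigned during normalization
    destination: str


def exploit_then_remote_access(events, min_exploits=3, window=600.0):
    """Return destinations with several Exploit events followed by Remote Access.

    A destination is flagged when at least `min_exploits` events of category
    "Exploit" occurred within `window` seconds before a "Remote Access" event.
    """
    exploit_times = {}  # destination -> timestamps of Exploit events
    flagged = set()
    for ev in sorted(events, key=lambda e: e.timestamp):
        if ev.category == "Exploit":
            exploit_times.setdefault(ev.destination, []).append(ev.timestamp)
        elif ev.category == "Remote Access":
            recent = [t for t in exploit_times.get(ev.destination, [])
                      if ev.timestamp - t <= window]
            if len(recent) >= min_exploits:
                flagged.add(ev.destination)
    return flagged


# Invented sample: exploits seen by a network IDS, remote access seen by a firewall.
sample = [
    Event(0.0, "nids-vendor-a", "Exploit", "10.1.1.7"),
    Event(10.0, "nids-vendor-a", "Exploit", "10.1.1.7"),
    Event(20.0, "hids-vendor-b", "Exploit", "10.1.1.7"),
    Event(60.0, "fw-vendor-c", "Remote Access", "10.1.1.7"),
    Event(5.0, "nids-vendor-a", "Exploit", "10.1.1.8"),
    Event(30.0, "fw-vendor-c", "Remote Access", "10.1.1.8"),
]

print(exploit_then_remote_access(sample))  # {'10.1.1.7'}
```

Swap the firewall for another vendor's and the rule is untouched, provided the new device's events are mapped to the same categories. That proviso is where the rest of this post is heading.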

Wow, magic! Now you can live (presumably) with dozens of good rules without digging deep into regexes and substrings and device event IDs across 70 system and OS version types deployed. This was (then) perceived as essential progress of security products, like perhaps a horse-and-buggy to a car evolution.

Further, some of us became very hopeful that our Common Event Expression (CEE) initiative would take off. So, we worked hard to make a global log taxonomy and schema real and useful (circa 2005).

But you won’t believe what happened next!

Now, let’s fast forward to today; 2020 is almost here. Most of the detection content I see today is in fact written in the 1990s style of exact and narrow matching against raw logs. Look at all the sexy Sigma content, will you? A fellow Network Intelligence enVision SIM user from 1998 would recognize many of the detections! Sure, we have ATT&CK today, but it is about solving a different problem.

An extra bizarre angle here is that as machine learning and analytics rise, the need for clean, structured data rises too if we are to crack more security use cases, not just detection. Instead, we just get more data overall, but less data that you can feed your pet ML unicorn with. We need more clean, enriched data, not merely more data!

To me, this feels like the evolution got us from a horse and buggy to a car, then a better car, then a modern car — and then again a horse and buggy ...

So, my question is WHY? What happened?

I’ve been polling a lot of my industry peers about it, ranging from old ArcSight hands who did correlation magic 15 years ago (and who can take a good joke about kurtosis) to people who run detection teams today on modern tools. [I am happy to provide shout-outs; please ping me if I missed somebody, because I very likely did, due to some of you saying that you do NOT want to be mentioned.]

But first, before we get to the answer I finally arrived at, after much agonizing, let’s review some of the things I’ve heard during my data gathering efforts:

Products that either lack event normalization or do it poorly (or lazily rely on clients to do this work) won the market battle for unrelated reasons (such as overall volume of data collected), and a new generation of SOC analysts has never seen anything else. So they get by with what they have. Let’s call this line of reasoning “the raw search won.”
Threat hunters beat up the traditional detection guys because “hunting is cool” and chucked them out of the window. Now, they try to detect the same way they hunt: by searching for random bits of knowledge about the attacks they’ve heard of. Let’s call this line of thinking “the hunters won.”
Another thought was that tolerance for “false positives” (FP) has decreased (due to growing talent shortages), and so writing narrower detections with lower FP rates became more popular (‘“false negatives” be damned; we can just write more rules to cover them’). These narrow rules are also easier to test. Let’s call this “false positives won.”
Another hypothesis was related to the greater diversity of modern threats and also the greater variety of data being collected. This supposedly left the normalized and taxonomized events behind, since we needed to detect more things of more types. Let’s call this one “the data/threat diversity won.”

So, what do you think? Are you seeing the same in your detection work?

Now, to me all the above explanations left something to be desired — so I kept digging and agonizing. Frankly, they sort of make some sense, but my experience and intuition suggested that the magic was still missing…

What do I think really happened? I did arrive at a very sad thought, one I was definitely in denial about, but one that ultimately “clicked” and made many puzzle pieces slide into place!

The normalized and taxonomized approach in SIEM never actually worked! It didn’t work back in 2003 when it was invented, and it didn’t work in any year since then. And it still does not work now. It probably cannot work in today’s world unless some things change in a big way.

When I realized this, I cried a bit. Given how much I invested in building, evangelizing and improving it, then actually trying to globally standardize it (via CEE), it feels kinda sad…

Now, is this really true? Sadly, I think so! SIEM event taxonomization is …

always behind the times, and more behind now than ever
inconsistent across events and log sources, for every vendor today
seriously different between vendors, and hence cannot be learned once
full of an ever-increasing number of errors and omissions that accumulate over time
impossible to test effectively against the real threats people face today.

So, I cannot even say “SIEM event taxonomy is dead”, because it seems like it was never really alive. For example, the “Authentication Failure” event category from a SIEM vendor may miss events from a new version of software (such as a new event type introduced in a Windows update), miss events from an uncommon log source (SAP login failed), or miss events erroneously mapped to something else (say, to the “Other Authentication” category).
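
Here is a small Python sketch of those three failure modes, assuming a toy taxonomy table keyed by (log source, event ID). Windows Event ID 4625 is the real failed-logon event; every other source name, event ID and mapping below is made up for illustration, including the deliberately wrong `sshd` entry.

```python
# Toy vendor taxonomy: (log source, event id) -> category.
TAXONOMY = {
    ("windows", "4625"): "Authentication Failure",      # real failed-logon event ID
    ("sshd", "login_failed"): "Other Authentication",   # deliberate mis-mapping
}


def categorize(source, event_id):
    """Look up the taxonomy category; unmapped events fall through silently."""
    return TAXONOMY.get((source, event_id), "Uncategorized")


def auth_failure_rule(raw_events):
    """Return the events a category-based 'Authentication Failure' rule sees."""
    return [ev for ev in raw_events
            if categorize(*ev) == "Authentication Failure"]


# Four events that a human would all call a failed login.
failed_logins = [
    ("windows", "4625"),                 # mapped correctly
    ("windows", "hypothetical_new_id"),  # new event type, not in the table yet
    ("sap", "login_failed"),             # uncommon log source, never mapped
    ("sshd", "login_failed"),            # mapped, but to the wrong category
]

seen = auth_failure_rule(failed_logins)
missed = [ev for ev in failed_logins if ev not in seen]

print(seen)         # [('windows', '4625')]
print(len(missed))  # 3
```

The rule itself is correct and never raises an error; it just quietly sees one failed login out of four. A regex against the raw log would at least fail in a way its author can reason about.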

In essence, people write stupid string-matching and regex-based content because they trust it. They do not, en masse, trust the event taxonomies when their lives and breach detections depend on them. And they do.

What can we do? Well, I am organizing my thinking about it, so wait for another post, will you?
