Modeling with Bayesian Networks

weixin_26713521

Original article: https://towardsdatascience.com/modeling-with-bayesian-networks-c7ebf28a8b6b

I am feeling sick. Fever. Cough. Stuffy nose. And it’s wintertime. Do I have the flu? Likely. Plus I have muscle pain. More likely.

Bayesian networks are great for these types of inferences. We have variables, some of whose values are fixed because we have observed them. We are interested in the probabilities of some of the free variables given these fixed values.

In our example, we want the probability that we have the flu, given some symptoms we have observed, and the season we are in.
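
As a minimal sketch of this kind of query, the snippet below computes the probability of flu given the season and some observed symptoms by direct enumeration. All numbers are hypothetical and chosen only for illustration, and the assumption that symptoms are conditionally independent given flu status is a simplification introduced here, not something the article states.

```python
# Hypothetical probabilities, for illustration only.
P_FLU_GIVEN_SEASON = {"winter": 0.10, "summer": 0.01}

# P(symptom is present | flu status), keyed by flu=True/False.
P_SYMPTOM_GIVEN_FLU = {
    "fever": {True: 0.90, False: 0.05},
    "cough": {True: 0.80, False: 0.10},
    "muscle_pain": {True: 0.60, False: 0.05},
}


def posterior_flu(season, observed_symptoms):
    """Return P(flu | season, observed symptoms).

    Assumes symptoms are conditionally independent given flu status.
    `observed_symptoms` maps a symptom name to True (present) or False (absent).
    """
    weights = {}
    for flu in (True, False):
        p_flu = P_FLU_GIVEN_SEASON[season]
        prior = p_flu if flu else 1.0 - p_flu
        likelihood = 1.0
        for name, present in observed_symptoms.items():
            p = P_SYMPTOM_GIVEN_FLU[name][flu]
            likelihood *= p if present else 1.0 - p
        weights[flu] = prior * likelihood
    # Normalize over the two possible values of the free variable.
    return weights[True] / (weights[True] + weights[False])


without_pain = posterior_flu("winter", {"fever": True, "cough": True})
with_pain = posterior_flu(
    "winter", {"fever": True, "cough": True, "muscle_pain": True}
)
print(f"without muscle pain: {without_pain:.3f}, with: {with_pain:.3f}")
```

With these made-up numbers, adding muscle pain to the evidence raises the posterior, which mirrors the "Likely ... More likely" reasoning in the opening lines.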

So far it looks like reasoning with conditional probabilities. Is there more to it? Yes. A lot more. Let’s scale up this example and the rest will emerge.

Towards A Large-scale Bayes Network

Imagine that our network models every possible symptom, every possible disease, outcomes of every possible medical test, and every possible external factor that might affect the probability of some disease. External factors break down into behavioral ones (smoking, being a couch potato, eating too much), physiological ones (weight, gender, age), and others. For good measure, let’s also throw in treatments. And side-effects.

By now there is enough useful medical knowledge to capture tens of thousands of variables (at the very least) and their interactions. For any set of symptoms, together with the values of some of the behavioral, physiological, and other external factors, we could estimate the probabilities of various diseases. And more. For a given disease, we could ask the network for the most likely symptoms. And way more. For example: I have a cough and a high fever, but the flu has been ruled out; what other diseases are likely? For a given diagnosis, our particular symptoms, and possibly additional factors such as gender and age, we could ask it to recommend treatments.

Now we are getting somewhere. How does all this magic work? This is what we will explore here.

Connectivity

First question, where does the network come in? In modeling the interactions among the tens of thousands of variables.

Modeling all possible interactions among that many variables is nearly impossible. It is the network that gives us a mechanism to cut through this complexity, by letting us specify which interactions to model. The aim is a model that is rich enough but not overly complex.
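
To get a rough feel for the savings, the sketch below compares the number of free parameters in an unrestricted joint distribution with an upper bound for a network in which every node has a limited number of parents. It assumes all variables are binary, and the names `full_joint_parameters`, `network_parameters_upper_bound`, and `max_parents` are hypothetical, introduced only for this illustration.

```python
def full_joint_parameters(n):
    """Free parameters in an unrestricted joint over n binary variables."""
    # 2**n outcomes whose probabilities must sum to 1.
    return 2 ** n - 1


def network_parameters_upper_bound(n, max_parents):
    """Upper bound on free parameters when each of n binary nodes
    has at most `max_parents` parents."""
    # One probability per parent configuration, at most 2**max_parents of them.
    return n * 2 ** max_parents


n = 30
print(full_joint_parameters(n))              # grows exponentially in n
print(network_parameters_upper_bound(n, 3))  # grows linearly in n
```

Even for 30 binary variables the gap is enormous, and it only widens as the variable count approaches the tens of thousands discussed above.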

Speaking of interactions, how do we decide which ones to model? Typically via domain knowledge. In our case, leveraging the collective knowledge of the medical field acquired over millennia of clinical practice and research.

What would our Bayes net look like? Structurally, a giant directed graph with nodes for the various symptoms, diseases, medical tests, behavioral factors, physiological factors, and treatment options. With suitably chosen (or inferred) arcs to model significant interactions among them. Such as among specific symptoms and specific diseases.

Connectivity Refined

A Bayes network is structurally a directed graph, an acyclic one at that. Directed means that edges have a direction to them, which is why they are called arcs. Acyclic means there are no directed cycles. Here is an example of a directed cycle: A → B → C → A.
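
The acyclicity constraint can be checked mechanically. Below is a minimal sketch using Kahn's topological-sort algorithm; the function name `is_acyclic` and the representation of arcs as `(parent, child)` pairs are assumptions made for this example.

```python
from collections import deque


def is_acyclic(nodes, arcs):
    """Return True if the directed graph has no directed cycle.

    `nodes` is a list of node names; `arcs` is a list of (parent, child) pairs.
    Uses Kahn's algorithm: repeatedly remove nodes with no incoming arcs.
    """
    indegree = {node: 0 for node in nodes}
    children = {node: [] for node in nodes}
    for parent, child in arcs:
        children[parent].append(child)
        indegree[child] += 1

    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Nodes on a directed cycle never reach indegree 0, so they are never removed.
    return removed == len(nodes)


nodes = ["A", "B", "C"]
print(is_acyclic(nodes, [("A", "B"), ("B", "C"), ("C", "A")]))  # the cycle above
print(is_acyclic(nodes, [("A", "B"), ("B", "C"), ("A", "C")]))  # a valid DAG
```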

Apart from the acyclicity constraint, the modeler has full control over what nodes to connect with arcs and how to orient them. That said, in complex real-world use cases such as the one we are discussing here (medical diagnosis) there is an appealing guiding principle.

Choose arcs to model direct causes. Orient them in the direction of causality.

So if A is a direct cause of B, we would add the arc A → B. Such a network is called a causal Bayes network.

A causal network’s structure is only as accurate as its variables and the fidelity of the causal relationships. For instance, the truth might be that A causes B and B causes C. But we might not even know of B’s existence. So the best we would be able to do is to model this via the arc A → C.
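
One way to see why the arc A → C is still a reasonable model is to sum out the unknown variable B. The sketch below does this for binary variables; the probabilities are hypothetical, and it assumes the true structure is exactly the chain A → B → C.

```python
# Hypothetical conditional probabilities for the true chain A -> B -> C.
P_B_GIVEN_A = {True: 0.8, False: 0.1}  # P(B=true | A)
P_C_GIVEN_B = {True: 0.7, False: 0.2}  # P(C=true | B)


def p_c_given_a(a):
    """P(C=true | A=a), obtained by summing over the hidden variable B."""
    p_b = P_B_GIVEN_A[a]
    return p_b * P_C_GIVEN_B[True] + (1.0 - p_b) * P_C_GIVEN_B[False]


# These two numbers are what a direct arc A -> C would need to encode.
print(round(p_c_given_a(True), 2), round(p_c_given_a(False), 2))
```

A modeler who never observes B could still estimate these two probabilities from data on A and C alone, which is what the single arc A → C captures.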

Causal Modeling

Okay, so let’s think causally in the medical setting. This is what we come up with.

Variable Type A         causes     Variable Type B   Example
disease                 causes     symptom           flu causes you to cough
behavior                causes     disease           smoking causes lung cancer
physiological factor    causes     disease           aging "causes" various diseases
treatment               "causes"   disease           chemotherapy reduces cancer
treatment               causes     side-effect       chemotherapy causes hair-loss

Before closing this section, let’s note that we shouldn’t worry too much about getting a few causal arcs wrong. (Of course, we prefer not to.) The consequences are not severe. In fact, we’ll likely have quite a few non-causal arcs in the network anyhow, to model correlations whose links to causation are unclear or non-existent. In fact, the network can’t even distinguish between causal and non-causal arcs. Not in our use case.

Take this example. Say A and B are strongly correlated. Say you thought A causes B, so you modeled this with the arc A → B. But you were wrong. Adding this arc is still a good thing, as it models the correlation. The next section discusses non-causal arcs in more detail.

Non-causal Arcs

Causality is a compelling guiding principle in the network’s design. However, it is not sufficient. That is, adding non-causal arcs can improve the model further.

Consider correlations among variables. Such as among a set of symptoms or a set of diseases. Causal relationships within the set may not be known or even exist. We do want to model the correlations though. So we should add suitable “non-causal” arcs.

Here is a simple example. Say there is strong belief or evidence that dry cough and irritated throat are correlated. Say these are the only two variables in the network. Connecting them with an arc in either direction will capture this correlation. Leaving the arc out will treat them as independent. We don’t want that.
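
The two-variable example can be checked numerically. In the sketch below, the joint distribution over (dry cough, irritated throat) is hypothetical. With an arc in either direction, the product P(parent) · P(child | parent) reproduces the joint exactly; without an arc, the model is the product of the marginals, which does not.

```python
# Hypothetical joint distribution over (dry_cough, irritated_throat).
JOINT = {
    (True, True): 0.30,
    (True, False): 0.10,
    (False, True): 0.05,
    (False, False): 0.55,
}


def marginal(index, value):
    """Marginal probability that variable `index` (0 or 1) takes `value`."""
    return sum(p for outcome, p in JOINT.items() if outcome[index] == value)


def with_arc(parent_index, outcome):
    """Probability of `outcome` under the model P(parent) * P(child | parent)."""
    p_parent = marginal(parent_index, outcome[parent_index])
    p_child_given_parent = JOINT[outcome] / p_parent
    return p_parent * p_child_given_parent


def without_arc(outcome):
    """Probability of `outcome` when the two variables are modeled as independent."""
    return marginal(0, outcome[0]) * marginal(1, outcome[1])


both = (True, True)
print(JOINT[both], with_arc(0, both), with_arc(1, both), without_arc(both))
```

With these numbers the independent model badly underestimates how often the two symptoms occur together, while either arc orientation matches the joint.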

The Network’s Master Equation

At some juncture, just like a picture can reveal a vista, so can math. We are at that point. So here goes.

Formally, a Bayes Network is a directed acyclic graph on n nodes. The nodes, call them X1, X2, …, Xn, model random variables. The arcs model interactions among them.
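
For orientation, a directed acyclic graph over X1, X2, …, Xn is usually paired with the factorization below, in which each variable is conditioned only on its parents in the graph. The notation Pa(X_i) for the set of parents of X_i is an assumption introduced here, not taken from the text.

```latex
P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```

A node with no parents contributes its unconditional probability P(X_i) to the product.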
