Probability and Bayes' Nets

Probabilistic Inference

compute desired probabilities from other known probabilities
we usually compute conditional probabilities
each possible state of the world has its own probability

model:
joint distribution: a table that captures the probability of each outcome (complete assignment)

inference by enumeration (IBE):

  1. query variables: Q
  2. evidence variables: e
  3. hidden variables: h
    select the entries consistent with the evidence, then sum out the hidden variables, and finally normalize
    drawbacks:
    large storage cost, and the table is hard to estimate empirically
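A minimal sketch of the three-step procedure above, using a hypothetical two-variable joint distribution over temperature and weather (all numbers invented for illustration):

```python
# Hypothetical joint distribution P(T, W); assignments are tuples
# (temperature, weather). Numbers are invented for illustration.
joint = {
    ('hot', 'sun'): 0.4, ('hot', 'rain'): 0.1,
    ('cold', 'sun'): 0.2, ('cold', 'rain'): 0.3,
}

def infer_by_enumeration(joint, query_index, evidence):
    """P(Q | e): select entries consistent with the evidence,
    sum out the hidden variables, then normalize."""
    totals = {}
    for assignment, p in joint.items():
        if all(assignment[i] == v for i, v in evidence.items()):
            q = assignment[query_index]
            totals[q] = totals.get(q, 0.0) + p   # sum out hidden variables
    z = sum(totals.values())
    return {q: p / z for q, p in totals.items()}  # normalize

# P(W | T=cold): sun -> 0.4, rain -> 0.6
result = infer_by_enumeration(joint, 1, {0: 'cold'})
print(result)
```

Note that the full joint table must be held in memory, which is exactly the storage drawback mentioned above.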

General situation of Uncertainty

  1. observed variables (evidence)
    the agent knows certain things about the state of the world
  2. unobserved variables
    the agent needs to reason about other aspects
  3. model
    the agent knows something about how the known variables relate to the unknown variables
  4. a probabilistic model is a joint distribution over a set of random variables
    assignments are called outcomes
  5. events can be partial assignments or complete assignments
  6. conditional distribution: enumeration with normalization, i.e. select the joint entries matching the evidence and normalize
    e.g. $p(W=s \mid T=c)=\frac{p(W=s,T=c)}{p(T=c)}$
    dividing by $p(T=c)$ is the normalization
  7. sometimes we have a conditional distribution and want the joint distribution: Bayes' rule
  8. chain rule: $p(x_1,x_2,x_3)=p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_2,x_1)$
  9. independence: a kind of structure, as in CSPs
    unconditional independence is very rare; conditional independence is our basic and robust assumption
    if and only if
    $p(x \mid y,z)=p(x \mid z)$
    $p(x,y \mid z)=p(x \mid z)\,p(y \mid z)$
    the storage cost goes from a product of table sizes to a sum
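The conditional-independence definition above can be checked numerically. A sketch, using a hypothetical joint built to satisfy $x \perp y \mid z$ by construction (all numbers invented):

```python
import itertools

# Build a hypothetical joint P(X, Y, Z) that satisfies X ⊥ Y | Z by
# construction: P(x, y, z) = P(z) P(x|z) P(y|z). Numbers are invented.
pz = {0: 0.3, 1: 0.7}
px_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}   # P(x | z)
py_z = {0: {0: 0.2, 1: 0.8}, 1: {0: 0.5, 1: 0.5}}   # P(y | z)
joint = {(x, y, z): pz[z] * px_z[z][x] * py_z[z][y]
         for x, y, z in itertools.product([0, 1], repeat=3)}

def conditionally_independent(joint, tol=1e-9):
    """Check p(x, y | z) == p(x | z) p(y | z) for every value combination."""
    for z in (0, 1):
        p_z = sum(p for (_, _, zz), p in joint.items() if zz == z)
        for x in (0, 1):
            for y in (0, 1):
                p_xy_z = joint[(x, y, z)] / p_z
                p_x_z = sum(joint[(x, yy, z)] for yy in (0, 1)) / p_z
                p_y_z = sum(joint[(xx, y, z)] for xx in (0, 1)) / p_z
                if abs(p_xy_z - p_x_z * p_y_z) > tol:
                    return False
    return True

print(conditionally_independent(joint))  # True by construction
```

The storage saving is visible here too: the three small tables hold 2 + 4 + 4 entries instead of the 8-entry joint, and the gap widens rapidly with more variables.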

Bayes Nets (representation)

joint probability table: $O(d^n)$ entries; the storage cost is too large, and the table is hard to estimate

bayes' net (graphical model):
a directed acyclic graph with a local probability table at each node
each node stores a conditional probability table conditioned on its parents (one column per parent, plus one for the node's value and one for the probability)
node: can be assigned or unassigned
arc: interactions
each node is conditionally independent of all its ancestors in the graph, given its parents
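A sketch of how a node and its local CPT might be stored, using a hypothetical Rain → Traffic net (class layout and CPT numbers are invented for illustration):

```python
# A minimal Bayes-net node storing its local CPT conditioned on its
# parents. The Rain -> Traffic example and all numbers are hypothetical.
class Node:
    def __init__(self, name, parents, cpt):
        self.name = name
        self.parents = parents   # list of parent Node objects
        self.cpt = cpt           # {parent_value_tuple: {value: prob}}

    def p(self, value, parent_values):
        """P(self = value | parents = parent_values)."""
        return self.cpt[parent_values][value]

rain = Node('Rain', [], {(): {True: 0.1, False: 0.9}})
traffic = Node('Traffic', [rain],
               {(True,): {True: 0.8, False: 0.2},
                (False,): {True: 0.3, False: 0.7}})

# Joint probabilities follow from the chain rule over the graph:
# P(R=true, T=true) = P(R=true) * P(T=true | R=true) = 0.1 * 0.8
p = rain.p(True, ()) * traffic.p(True, (True,))
print(p)
```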

build a Bayes’ net

  1. number the nodes from 1 to N
  2. add the nodes to the graph in increasing order of their numbers
  3. add a directed link from existing nodes to the new one if there is an interaction between them (so no cycle can form)

causality

  1. a BN need not actually be causal: it only represents conditional independence and reflects correlation
  2. a causal BN is simpler and easier to build

complexity

space complexity: $O(N \cdot 2^{k+1})$ for binary variables with at most $k$ parents per node

Bayes Nets (inference)

case: evidence variables, query variables, hidden variables

eliminate variables one by one

to eliminate a variable x, we

  1. join all factors involving x
  2. sum out x
    factor: an unnormalized probability

inference by enumeration

steps:
  1. select the entries consistent with the evidence
  2. sum out the hidden variables to get the joint over Q and e
  3. normalize
drawback: large storage, and it is hard to estimate the probabilities empirically for many variables at a time (limited samples)
time complexity: $O(d^n)$ to sum out all the hidden variables
space complexity: $O(d^n)$ to store the joint distribution
inference: calculating some useful quantity from a joint distribution
enumeration
factor: an unnormalized probability
elimination

  1. join
  2. eliminate: marginalization
    interleave joining and elimination
  3. if there is evidence, start with factors that select that evidence
    eliminate variables one by one: pick a hidden variable H, join all factors mentioning H, eliminate H
  4. join all remaining factors together and normalize
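The join and sum-out operations above can be sketched with factors stored as tables, again on a hypothetical Rain → Traffic net (representation and numbers invented for illustration):

```python
from itertools import product

# A factor is (variables, table): table maps tuples of boolean values
# to unnormalized probabilities. The R -> T net and its numbers are
# hypothetical.
def join(f1, f2):
    """Pointwise product of two factors over the union of their variables."""
    (v1, t1), (v2, t2) = f1, f2
    vs = v1 + [v for v in v2 if v not in v1]
    table = {}
    for vals in product([True, False], repeat=len(vs)):
        a = dict(zip(vs, vals))
        table[vals] = t1[tuple(a[v] for v in v1)] * t2[tuple(a[v] for v in v2)]
    return (vs, table)

def sum_out(var, factor):
    """Marginalize one variable out of a factor."""
    vs, t = factor
    i = vs.index(var)
    out = {}
    for vals, p in t.items():
        key = vals[:i] + vals[i + 1:]
        out[key] = out.get(key, 0.0) + p
    return (vs[:i] + vs[i + 1:], out)

f_r = (['R'], {(True,): 0.1, (False,): 0.9})                    # P(R)
f_tr = (['T', 'R'], {(True, True): 0.8, (False, True): 0.2,
                     (True, False): 0.3, (False, False): 0.7})  # P(T | R)

# Eliminate R: join all factors mentioning R, then sum R out -> P(T).
p_t = sum_out('R', join(f_r, f_tr))
print(p_t)
```

Here eliminating R yields P(T=true) = 0.1·0.8 + 0.9·0.3 = 0.35; the intermediate joined factor over (R, T) is the "largest factor" that drives the cost of variable elimination.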

factor summary:

the time and space complexity of variable elimination are determined by the largest factor created during the process
no particular elimination order is required, but the order chosen determines how large the factors get

Bayes Nets (sampling)

generate samples from the distribution and compute approximate posterior probabilities; check for convergence
inference: exact computation costs more time than generating samples
learning: get samples from a distribution you don't know

every CPT participates principle

the distribution of the generated samples matches the joint probability distribution
prior sampling:
drawback: for unlikely events we must generate a huge number of samples, and most of them are wasted
the procedure is consistent
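A sketch of prior sampling on a hypothetical two-node net R → T (CPT numbers invented): sample every variable in topological order from its CPT.

```python
import random

# Prior sampling on a hypothetical net R -> T (numbers invented):
# sample each variable in topological order from P(X | parents).
def sample_prior(rng):
    r = rng.random() < 0.1                       # P(R=true) = 0.1
    t = rng.random() < (0.8 if r else 0.3)       # P(T=true | R)
    return r, t

rng = random.Random(0)
samples = [sample_prior(rng) for _ in range(100_000)]
# Counting is consistent: the estimate of P(T=true) approaches
# 0.1*0.8 + 0.9*0.3 = 0.35 as the number of samples grows.
est = sum(t for _, t in samples) / len(samples)
print(round(est, 2))
```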

rejection sampling:
if a partial sample is inconsistent with our evidence, we reject it and stop generating it early; this only saves the time spent generating samples
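Rejection sampling can be sketched as prior sampling plus a consistency check against the evidence (same hypothetical R → T net, numbers invented):

```python
import random

# Rejection sampling on a hypothetical net R -> T (numbers invented):
# generate prior samples, discard those inconsistent with the evidence
# T = true, and count among the survivors.
def rejection_sample(n, rng):
    kept = []
    for _ in range(n):
        r = rng.random() < 0.1                   # P(R=true)
        t = rng.random() < (0.8 if r else 0.3)   # P(T=true | R)
        if t:                                    # reject samples with T = false
            kept.append(r)
    return kept

rng = random.Random(1)
kept = rejection_sample(200_000, rng)
# Estimate P(R=true | T=true); the exact answer is 0.08 / 0.35 ≈ 0.23
est = sum(kept) / len(kept)
print(round(est, 2))
```

About 65% of the generated samples are thrown away here, which illustrates why rejection sampling wastes work when the evidence is unlikely.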

likelihood weighting: (most computationally efficient)
fix the variables to their evidence values instead of sampling them, but then the sampling distribution no longer matches the original distribution
our samples are only consistent with the product over the non-evidence variables
because this is equivalent to sampling from:
$P(Z_1 \ldots Z_p, E_1 \ldots E_m)=\prod_{i=1}^{p} P(Z_i \mid \mathrm{Parents}(Z_i))$

solution:
sample a value if the variable is not an evidence variable; otherwise, fix it to the evidence value and multiply the sample's weight by $P(e \mid \mathrm{Parents}(E))$
the evidence can only influence the sampling of variables that come after it; it has no effect on variables sampled before it
we would like every sampling step to take the evidence into account
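Likelihood weighting on the same hypothetical R → T net, with evidence T = true: the evidence variable is fixed rather than sampled, and each sample carries a weight (numbers invented):

```python
import random

# Likelihood weighting on a hypothetical net R -> T with evidence
# T = true (numbers invented): evidence variables are fixed, not
# sampled, and each sample is weighted by P(evidence | parents).
def weighted_sample(rng):
    r = rng.random() < 0.1       # sample the non-evidence variable R
    w = 0.8 if r else 0.3        # weight = P(T=true | R=r)
    return r, w

rng = random.Random(2)
pairs = [weighted_sample(rng) for _ in range(200_000)]
# P(R=true | T=true) = weighted fraction of samples with R = true;
# the exact answer is 0.08 / 0.35 ≈ 0.23
est = sum(w for r, w in pairs if r) / sum(w for _, w in pairs)
print(round(est, 2))
```

Unlike rejection sampling, no sample is discarded; the weights account for the evidence instead.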
Gibbs sampling:
start from a state consistent with the evidence
first assign all variables randomly (keeping the evidence fixed), then repeatedly pick one variable and resample it given all the other variables
this also converges
this way every variable is sampled with the evidence taken into account
when resampling, we only need the CPTs that mention the resampled variable
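A sketch of Gibbs sampling on a hypothetical chain R → T → L with evidence L = true, resampling each hidden variable from its conditional given all the others (CPT numbers invented):

```python
import random

# Gibbs sampling on a hypothetical chain R -> T -> L with evidence
# L = true (numbers invented). Start from an arbitrary assignment of
# the hidden variables, then repeatedly resample one hidden variable
# conditioned on all the others.
P_R = 0.1                        # P(R=true)
P_T = {True: 0.8, False: 0.3}    # P(T=true | R)
P_L = {True: 0.9, False: 0.2}    # P(L=true | T)

def gibbs(steps, rng):
    r, t = False, False          # arbitrary initial hidden values
    hits = 0
    for _ in range(steps):
        # Resample R given T: only the CPTs mentioning R matter.
        pr_true = P_R * (P_T[True] if t else 1 - P_T[True])
        pr_false = (1 - P_R) * (P_T[False] if t else 1 - P_T[False])
        r = rng.random() < pr_true / (pr_true + pr_false)
        # Resample T given R and the evidence L = true.
        pt_true = P_T[r] * P_L[True]
        pt_false = (1 - P_T[r]) * P_L[False]
        t = rng.random() < pt_true / (pt_true + pt_false)
        hits += t
    return hits / steps

est = gibbs(200_000, random.Random(3))  # estimates P(T=true | L=true)
print(round(est, 2))
```

Note that each resampling step conditions on the evidence, so even downstream evidence (L here) influences every hidden variable, unlike likelihood weighting.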

Bayes Nets (D-separation)

in BN, are two variables independent(given evidence)?

causal chains

$x \rightarrow y \rightarrow z$

  1. are $x, z$ independent? no
  2. $x \perp z \mid y$

common cause

$x \leftarrow y \rightarrow z$

  1. are $x, z$ independent? no
  2. $x \perp z \mid y$

common effect

$x \rightarrow y \leftarrow z$

  1. $x \perp z$
  2. $x \perp z \mid y$ does not hold
    observing a descendant of y has the same effect

general case and d-separation

d-separation: $z_1,\cdots,z_k$ d-separate $x$ and $y$, which means $x \perp y \mid z_1,\cdots,z_k$
Markov blanket: a node is conditionally independent of all other nodes in the network, given its parents, children, and children's parents

D-seperation(另一种形式)

path: any consecutive sequence of edges, disregarding their directions
collider: a head-to-head node on a path (where information is exchanged once observed)
unblocked path: a path containing no collider
rule 1: x and y are d-connected if there is an unblocked path between them
rule 2: if every unblocked path passes through an observed z, then x and y are d-separated given z
rule 3: if a collider (or one of its descendants) is observed, the path becomes d-connected again

in other words: a path with no collider is active when none of its nodes is observed, and inactive once a node on it is observed; a path containing a collider is active only when the collider or one of its descendants is observed
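The three canonical triples can be sketched as a small activity rule; a path is active iff every triple along it is active (function and argument names are invented for illustration):

```python
# Illustrative sketch of the triple-activity rule behind d-separation
# (names are invented for illustration).
def triple_active(kind, mid_observed, mid_descendant_observed=False):
    """Is the triple A-B-C active, given what is observed?

    kind: 'chain' (A -> B -> C), 'cause' (A <- B -> C),
          'effect' (A -> B <- C, i.e. B is a collider).
    """
    if kind in ('chain', 'cause'):
        return not mid_observed          # blocked once B is observed
    if kind == 'effect':                 # collider: active only if B or
        return mid_observed or mid_descendant_observed  # a descendant is observed
    raise ValueError(kind)

# x and y are d-separated given Z iff no active path connects them.
assert triple_active('chain', False)
assert not triple_active('chain', True)
assert not triple_active('effect', False)
assert triple_active('effect', False, mid_descendant_observed=True)
```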

topology limits distributions

given some graph topology, only certain joint distributions can be encoded
the graph structure guarantees certain independences; the true distribution may contain more; a fully connected graph can represent any distribution

Exercises

note that a Bayes net has no cycles
