Probabilistic Models of Cognition (ESSLLI Tutorial) - Translation of Part 3

3. Patterns of Inference: Screening Off and Explaining Away

3.1 From Causal Dependence to Statistical Dependence

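Our probabilistic programs encode knowledge as causal models, and thinking intuitively about causal relations helps us understand the relation between a program's structure and its function. Causal relations are local, modular, and directed. Most arbitrary pairs of events are unrelated, or independent; if they are related at all, the relation is weak enough for our minds to ignore. Many events are only indirectly related: they are connected through a causal chain of several links, a sequence of intermediate, more local dependencies. And dependence is fundamentally asymmetric: saying that A causes B is very different from saying that B causes A. Information flows both ways along a causal relation (we expect that learning about either event tells us something about the other), but causal influence flows only one way: we expect that changing A will change the state of B, and not the reverse.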

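Let us look more closely at the notion of causal dependence. What does it mean for A to depend causally on B? When we view cognition through probabilistic programs, the most basic notion of causal dependence lives in the structure of the program that expresses the causal model and in the flow of evaluation when it runs. We say that expression A depends causally on expression B if evaluating A requires evaluating B (more precisely, if the value of A cannot always be determined without first determining the value of B). Because evaluation imposes only a partial order on expressions, this kind of causal dependence is weaker than temporal order. (It is roughly the notion of control-flow dependence in functional programming languages.) For example, consider a simple variation on the medical diagnosis scenario: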

(define samples
  (mh-query 200 100
    (define smokes (flip 0.2))
    (define lung-disease (or (flip 0.001) (and smokes (flip 0.1))))
    (define cold (flip 0.02))
    (define cough (or (and cold (flip 0.5)) (and lung-disease (flip 0.5)) (flip 0.01)))
    (define fever (or (and cold (flip 0.3)) (flip 0.01)))
    (define chest-pain (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (define shortness-of-breath (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (list cold lung-disease)
    cough
  )
)
(hist (map first samples) "cold")
(hist (map second samples) "lung-disease")
(hist samples "cold, lung-disease")

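Here cough depends causally on both lung-disease and cold, while fever depends causally on cold but not on lung-disease. We can also see that cough depends on smokes only indirectly: smokes does not appear in the definition of cough, but to evaluate cough we must first evaluate lung-disease, which in turn evaluates smokes. (The notion of "direct" causal dependence is a little vague: should we say that cough depends directly on cold, or only on the subexpression (and cold (flip 0.5))? This can be resolved in several ways, all of which give similar intuitions.) The causal dependence structure is not always obvious from inspecting a program, especially when the function calls are complicated.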

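There are also trickier cases, in which whether expression A requires B depends on the value of a third expression C. The following program is a particular kind of noisy AND gate: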

(define C (flip))

(define B (flip))

(define A (if C (if B (flip 0.85) false) false))

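A always depends on C, but B is evaluated only when C returns true. Under the definition above A depends on both B and C, but we can say more precisely that A depends causally on B only in contexts where C is true.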
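The context-specific dependence described above can be checked by brute force. The following is a minimal Python sketch, not part of the original Church tutorial: it enumerates the eight joint outcomes of the noisy AND gate and reports the probability of A for each setting of B and C. The function names joint and prob_a are illustrative assumptions.

```python
from itertools import product

def joint():
    """Yield ((A, B, C), probability) for the noisy AND gate."""
    for a, b, c in product([False, True], repeat=3):
        # A can only be true when both C and B are true, and then with weight 0.85.
        p_a_true = 0.85 if (c and b) else 0.0
        p_a = p_a_true if a else 1.0 - p_a_true
        # (flip) with no argument is a fair coin, so B and C each contribute 0.5.
        yield (a, b, c), 0.5 * 0.5 * p_a

def prob_a(b, c):
    """Exact P(A = true | B = b, C = c)."""
    num = den = 0.0
    for (a, b_val, c_val), p in joint():
        if b_val == b and c_val == c:
            den += p
            if a:
                num += p
    return num / den

for c in (True, False):
    for b in (True, False):
        print(f"C={c!s:5} B={b!s:5} P(A) = {prob_a(b, c):.2f}")
```

When C is true the probability of A changes with B; when C is false it is the same for both values of B, which matches the claim that A depends on B only in contexts where C is true.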

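When an expression sits inside a function, it may be evaluated several times during one execution of the program. In such cases we should speak of causal dependence between specific evaluations of the two expressions. (Note, however, that if one specific evaluation of A depends on a specific evaluation of B, then any other evaluation of A will also depend on some specific evaluation of B.)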

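One often hears the warning that "correlation does not imply causation". Here "correlation" refers to a different kind of dependence between events or functions, sometimes called statistical dependence to distinguish it from causal dependence. The fact that we need to be warned against confusing the two suggests that they are related, and indeed they are. One might even say they are "causally related", in the sense that causal dependence gives rise to statistical dependence: in general, if A causes B, then A and B will be statistically dependent. When we observe or reason about statistically dependent events, information flows symmetrically: if A and B are statistically dependent, learning about A tells us something about B, and vice versa. In Church terms, using query to condition on A changes the distribution we expect for B, and conditioning on B changes it for A. You can use the program above to verify this symmetry, which causal dependence lacks. For example, conditioning on cold should change the conditional distribution of cough, and conditioning on cough should change that of cold. You can also verify that events connected only by an indirect causal chain, such as smokes and cough, are statistically dependent in the same symmetric way.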

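Correlation is not just a symmetric version of causation, however. Two events can be statistically dependent even when neither causes the other, as long as they share a common cause (direct or indirect).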

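In a Church model, two expressions will in general be statistically dependent if one calls the other, directly or indirectly, or if both refer to some shared third expression during their evaluation.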

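In the example above, cough and fever do not depend causally on each other, but they are statistically dependent because both refer to cold. For the same reason chest-pain and shortness-of-breath are statistically dependent: both refer to lung-disease. Here we can read these facts off the program definitions, but more generally all of these relations can be discovered by running inference with query.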
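The symmetry of statistical dependence, and the dependence induced by a common cause, can also be checked without sampling. The following is a minimal Python sketch, not part of the original Church tutorial: it enumerates every joint assignment of the relevant variables in the program above and computes exact conditional probabilities. The helper names (flip_prob, noisy_or, worlds, prob) and the shortened variable name lung are illustrative assumptions.

```python
from itertools import product

def flip_prob(p, value):
    """Probability that a coin with weight p comes out equal to `value`."""
    return p if value else 1.0 - p

def noisy_or(*weights):
    """Probability that at least one of several independent causes fires."""
    none_fire = 1.0
    for w in weights:
        none_fire *= 1.0 - w
    return 1.0 - none_fire

def worlds():
    """Yield (assignment, probability) for every joint setting of the variables."""
    for smokes, lung, cold, cough, fever in product([False, True], repeat=5):
        p = flip_prob(0.2, smokes)
        p *= flip_prob(noisy_or(0.001, 0.1 if smokes else 0.0), lung)
        p *= flip_prob(0.02, cold)
        p *= flip_prob(noisy_or(0.5 if cold else 0.0, 0.5 if lung else 0.0, 0.01), cough)
        p *= flip_prob(noisy_or(0.3 if cold else 0.0, 0.01), fever)
        yield {"smokes": smokes, "lung": lung, "cold": cold,
               "cough": cough, "fever": fever}, p

def prob(query, given=None):
    """Exact P(query is true | given is true); `given` may be omitted."""
    num = den = 0.0
    for world, p in worlds():
        if given is None or world[given]:
            den += p
            if world[query]:
                num += p
    return num / den

# Direct cause, indirect cause, and two effects of a common cause.
for a, b in [("cold", "cough"), ("smokes", "cough"), ("cough", "fever")]:
    print(f"P({a}) = {prob(a):.4f}   P({a} | {b}) = {prob(a, b):.4f}")
    print(f"P({b}) = {prob(b):.4f}   P({b} | {a}) = {prob(b, a):.4f}")
```

In each pair, conditioning on either variable moves the probability of the other away from its prior, including the cough and fever pair, where neither variable causes the other.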

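Successful learning and reasoning with causal models usually depends on exploiting the tight coupling between causation and correlation, because causal relations are typically unobservable while correlations can be read off observed data. Noticing patterns of correlation is often the starting point for causal learning, that is, for discovering what causes what. Once we have a causal model, the statistical dependencies it implies let us predict many aspects of the world beyond what we directly observe. But when we put that knowledge to use by acting, that is, by manipulating events, we must take care not to confuse statistical dependence with causal dependence.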

3.2 From A Priori Dependence to Conditional Dependence

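The relation between causal structure and statistical dependence becomes especially interesting and subtle when we look at the effects of observation. Events that are statistically dependent a priori (also called marginally dependent) may become independent once certain observations are made; this phenomenon is called screening off, or sometimes context-specific independence. Conversely, events that are statistically independent a priori (marginally independent) may become dependent given other observations; this is called explaining away. The dynamics of screening off and explaining away are essential for understanding patterns of inference, both reasoning and learning, in causal models.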

3.2.1 Screening Off

Screening off is a very common pattern of statistical inference in both scientific and intuitive reasoning.

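If two events A and B are statistically dependent only indirectly, with the dependence mediated strictly by one or more events C, then conditioning on (observing) C should render A and B statistically independent. This can happen when A and B are connected by one or more causal chains that all pass through the set of events C, or when C comprises all the common causes of A and B. Take our simple medical scenario: smoking is correlated with (statistically dependent on) several symptoms, namely coughing, chest pain, and shortness of breath, through a causal chain mediated by lung disease. We can see this by querying smokes conditioned on those symptoms: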

(define samples
  (mh-query 200 100
    (define smokes (flip 0.2))
    (define lung-disease (or (flip 0.001) (and smokes (flip 0.1))))
    (define cold (flip 0.02))
    (define cough (or (and cold (flip 0.5)) (and lung-disease (flip 0.5)) (flip 0.001)))
    (define fever (or (and cold (flip 0.3)) (flip 0.01)))
    (define chest-pain (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (define shortness-of-breath (or (and lung-disease (flip 0.2)) (flip 0.01)))
    smokes
    (and cough chest-pain shortness-of-breath)
  )
)
(hist samples "smokes")

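The conditional probability of smokes is far higher than its base rate of 0.2, because observing all of these symptoms gives strong evidence for smoking. Try removing some of the symptoms to see how much each contributes, for example by conditioning on only cough and chest-pain, or on cough alone; the fewer symptoms observed, the lower the conditional probability of smoking.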

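Now suppose we condition on knowledge of the intermediate link in the causal chain, lung-disease. If we know whether lung-disease is true or false, does smoking still depend on the symptoms? In the query below, try adding and removing the symptoms (cough, chest-pain, shortness-of-breath) while keeping lung-disease fixed: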

(define samples
(mh-query 500 100
    (define smokes (flip 0.2))
    (define lung-disease (or (flip 0.001) (and smokes (flip 0.1))))
    (define cold (flip 0.02))
    (define cough (or (and cold (flip 0.5)) (and lung-disease (flip 0.5)) (flip 0.001)))
    (define fever (or (and cold (flip 0.3)) (flip 0.01)))
    (define chest-pain (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (define shortness-of-breath (or (and lung-disease (flip 0.2)) (flip 0.01)))
     smokes
     (and lung-disease
          (and cough chest-pain shortness-of-breath)
      )
)
)
(hist samples "smokes")

Now hold (not lung-disease) fixed, and again try adding and removing the symptoms (cough, chest-pain, shortness-of-breath):

(define samples
(mh-query 500 100
    (define smokes (flip 0.2))
    (define lung-disease (or (flip 0.001) (and smokes (flip 0.1))))
    (define cold (flip 0.02))
    (define cough (or (and cold (flip 0.5)) (and lung-disease (flip 0.5)) (flip 0.001)))
    (define fever (or (and cold (flip 0.3)) (flip 0.01)))
    (define chest-pain (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (define shortness-of-breath (or (and lung-disease (flip 0.2)) (flip 0.01)))
     smokes
     (and (not lung-disease)
          (and cough chest-pain shortness-of-breath)
      )
)
)
(hist samples "smokes")

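You should see that whether the patient has lung disease affects the conditional inference about smoking: a person with lung disease is judged substantially more likely to be a smoker than a person without it. But once we know whether the patient has lung disease, the presence or absence of chest pain, shortness of breath, or cough provides no further evidence about smoking. We say that the intermediate variable lung-disease screens off the root cause, smoking, from its more distal effects: coughing, chest pain, and shortness of breath.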
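Because the model is small, both observations above can be confirmed exactly. The following is a minimal Python sketch, not part of the original Church tutorial: it enumerates the joint distribution and computes the probability of smoking under different sets of observations. The helper names (flip_prob, noisy_or, worlds, p_smokes) and the shortened variable names lung, pain, and breath are illustrative assumptions.

```python
from itertools import product

def flip_prob(p, value):
    """Probability that a coin with weight p comes out equal to `value`."""
    return p if value else 1.0 - p

def noisy_or(*weights):
    """Probability that at least one of several independent causes fires."""
    none_fire = 1.0
    for w in weights:
        none_fire *= 1.0 - w
    return 1.0 - none_fire

def worlds():
    """Yield (assignment, probability) for every joint setting of the variables."""
    for smokes, lung, cold, cough, pain, breath in product([False, True], repeat=6):
        p = flip_prob(0.2, smokes)
        p *= flip_prob(noisy_or(0.001, 0.1 if smokes else 0.0), lung)
        p *= flip_prob(0.02, cold)
        p *= flip_prob(noisy_or(0.5 if cold else 0.0, 0.5 if lung else 0.0, 0.001), cough)
        p *= flip_prob(noisy_or(0.2 if lung else 0.0, 0.01), pain)
        p *= flip_prob(noisy_or(0.2 if lung else 0.0, 0.01), breath)
        yield {"smokes": smokes, "lung": lung, "cold": cold,
               "cough": cough, "pain": pain, "breath": breath}, p

def p_smokes(**observed):
    """Exact P(smokes | observed), with observations passed as name=True/False."""
    num = den = 0.0
    for world, p in worlds():
        if all(world[name] == value for name, value in observed.items()):
            den += p
            if world["smokes"]:
                num += p
    return num / den

# Symptoms alone: each extra symptom raises the probability of smoking.
print("prior:                ", round(p_smokes(), 4))
print("cough:                ", round(p_smokes(cough=True), 4))
print("cough, pain:          ", round(p_smokes(cough=True, pain=True), 4))
print("cough, pain, breath:  ", round(p_smokes(cough=True, pain=True, breath=True), 4))

# With lung disease known, the symptoms no longer matter.
print("lung:                 ", round(p_smokes(lung=True), 4))
print("lung, all symptoms:   ", round(p_smokes(lung=True, cough=True, pain=True, breath=True), 4))
print("no lung:              ", round(p_smokes(lung=False), 4))
print("no lung, cough:       ", round(p_smokes(lung=False, cough=True), 4))
```

The last four lines come in identical pairs: once lung is fixed, adding or removing symptoms leaves the probability of smoking unchanged, which is exactly what screening off means.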

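Screening off as defined here is a purely statistical phenomenon. When we observe an event C that lies between A and B, A and B are still causally dependent in our model; it is only our states of knowledge about A and B that become uncorrelated. There is an analogous causal phenomenon: if we can manipulate or intervene on the causal system and set C to some known value, then A and B become both statistically and causally independent. You can explore this in the Church programs above by replacing the definition (define lung-disease ...) with a fixed value.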

3.2.2 Explaining Away

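Explaining away is a complementary pattern of statistical inference, more subtle than screening off. If two events A and B are statistically (and hence causally) independent, but both are causes of one or more other events C, then conditioning on (observing) C will generally make A and B statistically dependent. As with screening off, we are talking only about statistical dependence, not causal dependence: when we observe C, A and B remain causally independent in our model, but our states of knowledge about A and B become correlated.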

Here is a concrete example from our medical scenario.

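A priori, cold and lung-disease are independent, both statistically and causally. But both are causes of some symptoms, such as coughing. If we observe cough, then cold and lung-disease become statistically dependent: learning whether the patient has one of the two conditions that could have produced the cough now tells us something about the other. We say that cold and lung-disease are marginally (unconditionally) independent but conditionally dependent given cough.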

To illustrate, condition on cough and observe how the probabilities of cold and lung-disease change:

(define samples
  (mh-query 500 100
    (define smokes (flip 0.2))
    (define lung-disease (or (flip 0.001) (and smokes (flip 0.1))))
    (define cold (flip 0.02))
    (define cough (or (and cold (flip 0.5)) (and lung-disease (flip 0.5)) (flip 0.001)))
    (define fever (or (and cold (flip 0.3)) (flip 0.01)))
    (define chest-pain (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (define shortness-of-breath (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (list cold lung-disease)
    cough
  )
)
(hist (map first samples) "cold")
(hist (map second samples) "lung-disease")
(hist samples "cold, lung-disease")

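Both cold and lung-disease are now far more likely than their base rates. Working through the probabilities in the program exactly, the probability of a cold rises from 2% to about 47%, and the probability of lung disease rises from about 2% (0.1% among non-smokers, about 10% among smokers) to about 50%. With these parameters the two are roughly equally plausible explanations of a cough, because their prior probabilities are nearly equal.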

Now suppose we learn that the patient does not have a cold.

(define samples
  (mh-query 500 100
    (define smokes (flip 0.2))
    (define lung-disease (or (flip 0.001) (and smokes (flip 0.1))))
    (define cold (flip 0.02))
    (define cough (or (and cold (flip 0.5)) (and lung-disease (flip 0.5)) (flip 0.001)))
    (define fever (or (and cold (flip 0.3)) (flip 0.01)))
    (define chest-pain (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (define shortness-of-breath (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (list cold lung-disease)
    (and cough (not cold))
  )
)
(hist (map first samples) "cold")
(hist (map second samples) "lung-disease")
(hist samples "cold, lung-disease")

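The probability of lung disease now rises sharply (to about 91% in an exact calculation). If instead we observe that the patient does have a cold, the probability of lung disease falls back to about 3%, close to its base rate of roughly 2%.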

(define samples
  (mh-query 500 100
    (define smokes (flip 0.2))
    (define lung-disease (or (flip 0.001) (and smokes (flip 0.1))))
    (define cold (flip 0.02))
    (define cough (or (and cold (flip 0.5)) (and lung-disease (flip 0.5)) (flip 0.001)))
    (define fever (or (and cold (flip 0.3)) (flip 0.01)))
    (define chest-pain (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (define shortness-of-breath (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (list cold lung-disease)
    (and cough cold)
  )
)
(hist (map first samples) "cold")
(hist (map second samples) "lung-disease")
(hist samples "cold, lung-disease")

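This is the conditional dependence between lung-disease and cold given cough. Knowing that the patient has a cold "explains away" the cough, so the alternative explanation, lung disease, drops back to a low probability near its base rate. If on the other hand we know that the patient does not have a cold, lung disease becomes by far the most likely explanation of the cough, and its conditional probability rises sharply. As an exercise, check that if we remove the observation of cough, observing cold or (not cold) no longer influences our belief about lung-disease: the effect exists only conditional on observing a common effect of the two causes.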
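The explaining-away pattern in these three queries can be computed exactly rather than estimated from samples. The following is a minimal Python sketch, not part of the original Church tutorial: it enumerates the four variables that matter here and prints the exact conditional probabilities. The helper names (flip_prob, noisy_or, worlds, p_true) and the shortened variable name lung are illustrative assumptions.

```python
from itertools import product

def flip_prob(p, value):
    """Probability that a coin with weight p comes out equal to `value`."""
    return p if value else 1.0 - p

def noisy_or(*weights):
    """Probability that at least one of several independent causes fires."""
    none_fire = 1.0
    for w in weights:
        none_fire *= 1.0 - w
    return 1.0 - none_fire

def worlds():
    """Yield (assignment, probability) for every joint setting of the variables."""
    for smokes, lung, cold, cough in product([False, True], repeat=4):
        p = flip_prob(0.2, smokes)
        p *= flip_prob(noisy_or(0.001, 0.1 if smokes else 0.0), lung)
        p *= flip_prob(0.02, cold)
        p *= flip_prob(noisy_or(0.5 if cold else 0.0, 0.5 if lung else 0.0, 0.001), cough)
        yield {"smokes": smokes, "lung": lung, "cold": cold, "cough": cough}, p

def p_true(query, **observed):
    """Exact P(query is true | observed), with observations as name=True/False."""
    num = den = 0.0
    for world, p in worlds():
        if all(world[name] == value for name, value in observed.items()):
            den += p
            if world[query]:
                num += p
    return num / den

print("P(lung)                    =", round(p_true("lung"), 4))
print("P(lung | cold)             =", round(p_true("lung", cold=True), 4))
print("P(cold | cough)            =", round(p_true("cold", cough=True), 4))
print("P(lung | cough)            =", round(p_true("lung", cough=True), 4))
print("P(lung | cough, not cold)  =", round(p_true("lung", cough=True, cold=False), 4))
print("P(lung | cough, cold)      =", round(p_true("lung", cough=True, cold=True), 4))
```

The first two lines are equal, showing marginal independence; the last three differ, showing that cold and lung become dependent only once cough has been observed.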

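If we now consider a patient who coughs and also smokes, lung disease becomes the more likely explanation: smoking raises the prior probability of lung disease to about 10%, and an exact calculation gives roughly 84% for lung disease against 17% for a cold.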

(define samples
  (mh-query 500 100
    (define smokes (flip 0.2))
    (define lung-disease (or (flip 0.001) (and smokes (flip 0.1))))
    (define cold (flip 0.02))
    (define cough (or (and cold (flip 0.5)) (and lung-disease (flip 0.5)) (flip 0.001)))
    (define fever (or (and cold (flip 0.3)) (flip 0.01)))
    (define chest-pain (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (define shortness-of-breath (or (and lung-disease (flip 0.2)) (flip 0.01)))
    (list cold lung-disease)
    (and smokes cough)
  )
)
(hist (map first samples) "cold")
(hist (map second samples) "lung-disease")
(hist samples "cold, lung-disease")

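If we also add that the patient has chest pain, lung disease becomes more likely still compared with a cold. These settings make the explaining-away effect easier to see.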

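Modify the program above to observe whether the patient has a cold in addition to cough, smokes, and perhaps chest-pain, then compare these conditions: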

(and smokes cough) with (and smokes cough cold) or

                         (and smokes cough (not cold))

(and smokes chest-pain cough) with (and smokes chest-pain cough cold) or

                                   (and smokes chest-pain cough (not cold))

Notice how information about whether the patient has a cold changes our belief about whether they have lung disease.

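Explaining away can also work indirectly. Instead of observing cold, the alternative cause of the cough, we can observe fever, a symptom of cold. Compare the following conditions in the Church program to see the conditional dependence between fever and lung-disease that explaining away induces.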

(and smokes chest-pain cough) with (and smokes chest-pain cough fever) or

                                   (and smokes chest-pain cough (not fever))

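In this case, learning whether the patient has a fever matters a good deal for our belief about whether they have lung disease, even though fever is not itself a symptom of lung disease and there is no causal relation between the two.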
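The indirect effect of fever on lung disease can be quantified in the same exact way. The following is a minimal Python sketch, not part of the original Church tutorial: it computes the probability of lung disease for the three conditions listed above. The helper names (flip_prob, noisy_or, worlds, p_lung) and the shortened variable names lung and pain are illustrative assumptions.

```python
from itertools import product

def flip_prob(p, value):
    """Probability that a coin with weight p comes out equal to `value`."""
    return p if value else 1.0 - p

def noisy_or(*weights):
    """Probability that at least one of several independent causes fires."""
    none_fire = 1.0
    for w in weights:
        none_fire *= 1.0 - w
    return 1.0 - none_fire

def worlds():
    """Yield (assignment, probability) for every joint setting of the variables."""
    for smokes, lung, cold, cough, fever, pain in product([False, True], repeat=6):
        p = flip_prob(0.2, smokes)
        p *= flip_prob(noisy_or(0.001, 0.1 if smokes else 0.0), lung)
        p *= flip_prob(0.02, cold)
        p *= flip_prob(noisy_or(0.5 if cold else 0.0, 0.5 if lung else 0.0, 0.001), cough)
        p *= flip_prob(noisy_or(0.3 if cold else 0.0, 0.01), fever)
        p *= flip_prob(noisy_or(0.2 if lung else 0.0, 0.01), pain)
        yield {"smokes": smokes, "lung": lung, "cold": cold,
               "cough": cough, "fever": fever, "pain": pain}, p

def p_lung(**observed):
    """Exact P(lung disease | observed), with observations as name=True/False."""
    num = den = 0.0
    for world, p in worlds():
        if all(world[name] == value for name, value in observed.items()):
            den += p
            if world["lung"]:
                num += p
    return num / den

base = dict(smokes=True, pain=True, cough=True)
print("no information about fever:", round(p_lung(**base), 4))
print("fever observed:            ", round(p_lung(fever=True, **base), 4))
print("fever ruled out:           ", round(p_lung(fever=False, **base), 4))
```

Observing fever makes a cold more likely, a cold explains the cough, and so the probability of lung disease drops even though fever and lung disease share no causal link.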

The general explaining-away pattern can be expressed with the following schematic Church query:

(query
  (define a ...)
  (define b ...)
  ...
  (define data (... a... b...))
  b
  (and (equal? data some-value) (equal? a some-other-value)))

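Here the data are defined in terms of two independent variables a and b. If we condition on both data and a, the posterior distribution of b will depend on a: observing information about a changes our conclusions about b.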

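The most typical pattern of explaining away in causal reasoning is a kind of anti-correlation: observing an effect raises the probabilities of its two possible causes, but the causes are conditionally anti-correlated, so that observing additional evidence for one cause lowers the probability of the other.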

However, such pairs of causes are not always anti-correlated; it depends on the nature of the interaction between the causes.

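Explaining away takes the anti-correlated form when the causes interact in a roughly disjunctive or additive way: the effect happens if A or B does, or the effect happens when the continuous influences of A and B add up to more than some threshold. The following simple mathematical examples show some other possibilities. Suppose we observe the sum of two integers, each drawn uniformly at random from 0 to 9.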

(define take-sample (lambda () 
   (rejection-query
     (define A (uniform-draw (iota 10)))
     (define B (uniform-draw (iota 10)))
     (list A B)
     (equal? (+ A B) 9)
   )))

(define sample (repeat 500 take-sample))

(hist sample "A, B")

(define r-port (open-r-port))

(r r-port "png" "corrplot.png")

(r r-port "plot" (map first sample) (map second sample) "main=" "Sampled values for A and B, conditioned on A + B = 9" "xlab=" "A" "ylab=" "B"  "pch=" 19)

(close-r-port r-port)

This gives perfect anti-correlation between A and B in the conditional inference. But suppose instead that we observe that A and B are equal.

(define take-sample (lambda () 
   (rejection-query
     (define A (uniform-draw (iota 10)))
     (define B (uniform-draw (iota 10)))
     (list A B)
     (equal? A B)
   )))

(define sample (repeat 500 take-sample))

(hist sample "A, B")

(define r-port (open-r-port))

(r r-port "png" "corrplot.png")

(r r-port "plot" (map first sample) (map second sample) "main=" "Sampled values for A and B, conditioned on A = B" "xlab=" "A" "ylab=" "B"  "pch=" 19)

(close-r-port r-port)

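Now, of course, A and B go from being independent a priori to being perfectly correlated in the conditional distribution. Try the following conditions to see other possible patterns of conditional dependence between a priori independent variables.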

(< (abs (- A B)) 2)

(and (>= (+ A B) 9) (<= (+ A B) 11))

(equal? 3 (abs (- A B)))

(equal? 3 (modulo (- A B) 10))

(equal? (modulo A 2) (modulo B 2))

(equal? (modulo A 5) (modulo B 5))
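The conditions listed above can be compared without sampling or plotting, since there are only 100 possible pairs. The following is a minimal Python sketch, not part of the original Church tutorial: it keeps the pairs that satisfy a condition and computes the Pearson correlation of A and B over them. The function name conditional_correlation is an illustrative assumption.

```python
from itertools import product
from math import sqrt

def conditional_correlation(condition):
    """Pearson correlation of A and B (each uniform on 0..9) given condition(A, B)."""
    pairs = [(a, b) for a, b in product(range(10), repeat=2) if condition(a, b)]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    return cov / sqrt(var_a * var_b)

conditions = {
    "no condition":          lambda a, b: True,
    "A + B = 9":             lambda a, b: a + b == 9,
    "A = B":                 lambda a, b: a == b,
    "|A - B| < 2":           lambda a, b: abs(a - b) < 2,
    "9 <= A + B <= 11":      lambda a, b: 9 <= a + b <= 11,
    "|A - B| = 3":           lambda a, b: abs(a - b) == 3,
    "(A - B) mod 10 = 3":    lambda a, b: (a - b) % 10 == 3,
    "A mod 2 = B mod 2":     lambda a, b: a % 2 == b % 2,
    "A mod 5 = B mod 5":     lambda a, b: a % 5 == b % 5,
}

for label, condition in conditions.items():
    print(f"{label:22} correlation = {conditional_correlation(condition):+.3f}")
```

A single correlation coefficient is only a summary; some of these conditions produce dependence that a scatter plot shows more clearly than the coefficient does.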

3.3 Non-monotonic Reasoning

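Explaining away is an important phenomenon in probabilistic inference because it is an example of non-monotonic reasoning. In formal logic, a theory is said to be monotonic if adding an assumption (or formula) never reduces the set of conclusions that could be drawn from the previous assumptions. Most traditional logics (first-order logic, for example) are monotonic, but human reasoning does not seem to be. For instance, if I tell you that Tweety is a bird, you conclude that it can fly; if I then tell you that Tweety is an ostrich, you retract that conclusion. Over the years many non-monotonic logics have been introduced to model human reasoning. Probabilistic reasoning with Bayesian networks was first recognized as important for artificial intelligence in part because it captures these patterns of reasoning precisely.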

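Another way to think about monotonicity is to consider the trajectory of our belief in a particular proposition as we gain more relevant information. In traditional logic a proposition has only three states: true, false, and unknown (neither provably true nor provably false). As we learn more about the world, our beliefs must stay logically consistent, so a proposition can only move from unknown to true or to false. In other words, our "confidence" in any conclusion can only increase, and only by the single large jump from unknown to true or false.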

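In a probabilistic approach, by contrast, belief comes in degrees. We can think of confidence as a measure of how far our belief is from a uniform distribution, or how close it is to the extremes of 0 or 1. Unlike in traditional logic, our confidence in a proposition can both increase and decrease. As the next example shows, even fairly simple probabilistic models can induce complex explaining-away dynamics as the set of conditions grows, reversing the direction of our belief in a proposition several times.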

3.4 Example: Trait Attribution

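We often need to reason about entities of different types and the interactions among them. The many interdependencies among such entities give rise to very challenging explaining-away problems. Inference of this kind comes naturally to people but is computationally hard, which suggests that the mind has specialized machinery for these important problems.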

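A familiar example comes from reasoning about why students succeed or fail in class. Imagine yourself as an interested observer (a parent, another teacher, a guidance counselor, or a college admissions officer) making these conditional inferences. If a student fails an exam, what can you say about why? Maybe he doesn't do his homework, maybe the exam was unfair, or maybe he was just unlucky.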

(define num-samples 1000)
(define samples
(mh-query
    num-samples 10
    (define exam-fair (flip .8))
    (define does-homework (flip .8))
    (define pass? (flip (if exam-fair
                            (if does-homework 0.9 0.6)
                            (if does-homework 0.4 0.2))))
   (list does-homework exam-fair)
    (not pass?)
)
)
(hist samples "Joint: Student Does Homework?, Exam Fair?")
(hist (map first samples) "Student Does Homework")
(hist (map second samples) "Exam Fair")

Now suppose you know something about several students and several exams.

(define num-samples 1000)
(define samples
(mh-query
    num-samples 10
    (define exam-fair-prior .8)
    (define does-homework-prior .8)
    (define exam-fair? (mem (lambda (exam) (flip exam-fair-prior))))
    (define does-homework? (mem (lambda (student) (flip does-homework-prior))))
    (define (pass? student exam) (flip (if (exam-fair? exam)
                                           (if (does-homework? student) 0.9 0.6)
                                           (if (does-homework? student) 0.4 0.2))))
   (list (does-homework? 'bill) (exam-fair? 'exam1))
    (not (pass? 'bill 'exam1))
   )
)
(hist samples "Joint: Bill Does His Homework?, Exam 1 Fair?")
(hist (map first samples) "Bill Does His Homework?")
(hist (map second samples) "Exam 1 Fair?")

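We begin by observing that Bill failed exam 1. A priori we assume that most students do their homework and most exams are fair, but this one observation makes it plausible that either the student did not do his homework or the exam was unfair.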

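Now add further observations about Bill and exam 1, or about other students and exams, and see how the conditional inferences about Bill and exam 1 change.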

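Paste each of the following data sets in place of the current condition (not (pass? 'bill 'exam1)). Try to explain how the inferences change at each stage. How does each new, larger data set affect your intuitions about Bill and exam 1?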

(and (not (pass? 'bill 'exam1)) (not (pass? 'bill 'exam2)))

 (and (not (pass? 'bill 'exam1))

      (not (pass? 'mary 'exam1))

      (not (pass? 'tim 'exam1)))

(and (not (pass? 'bill 'exam1)) (not (pass? 'bill 'exam2))

      (not (pass? 'mary 'exam1))

      (not (pass? 'tim 'exam1)))

 (and (not (pass? 'bill 'exam1))

      (not (pass? 'mary 'exam1)) (pass? 'mary 'exam2) (pass? 'mary 'exam3) (pass? 'mary 'exam4) (pass? 'mary 'exam5)

      (not (pass? 'tim 'exam1)) (pass? 'tim 'exam2) (pass? 'tim 'exam3) (pass? 'tim 'exam4) (pass? 'tim 'exam5))

 (and (not (pass? 'bill 'exam1))

      (pass? 'mary 'exam1)

      (pass? 'tim 'exam1))

 (and (not (pass? 'bill 'exam1))

      (pass? 'mary 'exam1) (pass? 'mary 'exam2) (pass? 'mary 'exam3) (pass? 'mary 'exam4) (pass? 'mary 'exam5)

      (pass? 'tim 'exam1) (pass? 'tim 'exam2) (pass? 'tim 'exam3) (pass? 'tim 'exam4) (pass? 'tim 'exam5))

 (and (not (pass? 'bill 'exam1)) (not (pass? 'bill 'exam2))

      (pass? 'mary 'exam1) (pass? 'mary 'exam2) (pass? 'mary 'exam3) (pass? 'mary 'exam4) (pass? 'mary 'exam5)

      (pass? 'tim 'exam1) (pass? 'tim 'exam2) (pass? 'tim 'exam3) (pass? 'tim 'exam4) (pass? 'tim 'exam5))

 (and (not (pass? 'bill 'exam1)) (not (pass? 'bill 'exam2)) (pass? 'bill 'exam3) (pass? 'bill 'exam4) (pass? 'bill 'exam5)

      (not (pass? 'mary 'exam1)) (not (pass? 'mary 'exam2)) (not (pass? 'mary 'exam3)) (not (pass? 'mary 'exam4)) (not (pass? 'mary 'exam5))

      (not (pass? 'tim 'exam1)) (not (pass? 'tim 'exam2)) (not (pass? 'tim 'exam3)) (not (pass? 'tim 'exam4)) (not (pass? 'tim 'exam5)))
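To see how the two beliefs move as observations accumulate, the posterior can be computed exactly for small data sets. The following is a minimal Python sketch, not part of the original Church tutorial: it enumerates every combination of "does homework" per student and "is fair" per exam, weights each combination by the priors and by the likelihood of the observed results, and normalizes. The names PASS_PROB and infer, and the representation of observations as (student, exam, passed) triples, are illustrative assumptions.

```python
from itertools import product

# P(pass) indexed by (exam is fair, student does homework), as in the Church model.
PASS_PROB = {(True, True): 0.9, (True, False): 0.6,
             (False, True): 0.4, (False, False): 0.2}

def infer(observations, homework_prior=0.8, fair_prior=0.8):
    """Return exact (P(Bill does homework), P(exam1 is fair)) given the observations.

    observations is a list of (student, exam, passed) triples.
    """
    students = sorted({s for s, _, _ in observations} | {"bill"})
    exams = sorted({e for _, e, _ in observations} | {"exam1"})
    total = homework = fair = 0.0
    for values in product([False, True], repeat=len(students) + len(exams)):
        does_homework = dict(zip(students, values[:len(students)]))
        exam_fair = dict(zip(exams, values[len(students):]))
        weight = 1.0
        for v in does_homework.values():
            weight *= homework_prior if v else 1.0 - homework_prior
        for v in exam_fair.values():
            weight *= fair_prior if v else 1.0 - fair_prior
        for student, exam, passed in observations:
            p = PASS_PROB[(exam_fair[exam], does_homework[student])]
            weight *= p if passed else 1.0 - p
        total += weight
        if does_homework["bill"]:
            homework += weight
        if exam_fair["exam1"]:
            fair += weight
    return homework / total, fair / total

bill_fails = [("bill", "exam1", False)]
others_fail = bill_fails + [("mary", "exam1", False), ("tim", "exam1", False)]
others_pass = bill_fails + [("mary", "exam1", True), ("tim", "exam1", True)]

for label, data in [("Bill fails exam1", bill_fails),
                    ("... and Mary and Tim fail it too", others_fail),
                    ("... but Mary and Tim pass it", others_pass)]:
    hw, fair = infer(data)
    print(f"{label:34} P(Bill does homework) = {hw:.3f}   P(exam1 fair) = {fair:.3f}")
```

When classmates also fail, the exam takes the blame and belief in Bill's homework recovers; when they pass, the exam looks fair and the blame shifts back to Bill. The larger data sets above can be encoded the same way to follow the full trajectory.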

3.5 A Modularity Example: Visual Perception of Surface Lightness and Color

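Visual perception is full of rich conditional inference, including both screening off and explaining away. Researchers in mid-level vision have constructed many striking demonstrations of the perception of surface structure, including work by Dan Kersten, David Knill, Ted Adelson, Bart Anderson, Ken Nakayama, and others. Most strikingly, conditional inference sometimes appears to violate or cut across the modular structure of visual processing. Neuroscientists have come to understand the primary visual system as processing different kinds of visual stimuli (color, shape, motion, stereo) in different regions of the brain. This view agrees, at least roughly, with the early proposal of cognitive psychologists that different stimuli are processed in a modular rather than a unified way. Yet the core task of vision is to construct a unified and coherent three-dimensional scene from the patterns of light falling on the retina. That is, vision is causal inference on a grand scale: its output is a rich description of objects, their surface properties, and the relations among them. The brain has no direct access to these things, yet they are the true causes of the retinal stimulus. Solving this problem requires integrating many appearance features across an image, subject to a large number of potential screening-off and explaining-away effects.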

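In vision, the luminance of a surface depends on two factors: the illumination of the surface (how much light is shining on it) and its reflectance. The actual luminance is the product of these two factors, so luminance is inherently ambiguous. The visual system has to determine what proportion of the luminance is due to reflectance and what proportion is due to the illumination of the scene. The following is a well-known demonstration, the checker shadow illusion discovered by Ted Adelson.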

The point of the demonstration is that the squares labeled A and B in the image are in fact exactly the same shade of gray.

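The presence of the cylinder provides evidence that square B receives less light than square A. We therefore perceive square B as having higher reflectance, since its luminance is the same as that of square A even though it receives less light. The following program implements a simple version of this scenario.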

(define noisy=
   (lambda (target value variance)
     (= 0 (gaussian (- target value) variance))))
(define samples
   (mh-query
    100 100
    (define reflectance (gaussian 1 1))
    (define illumination (gaussian 3 0.5))
    (define luminance (* reflectance illumination))
     reflectance
     (noisy= 3.0 luminance 0.1)
   ))
(truehist samples "Reflectance")

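Here we introduce a third kind of primitive random procedure, gaussian. Where flip outputs binary values and beta outputs values in the interval [0,1], gaussian outputs real numbers; it implements the well-known Gaussian or normal distribution and takes two parameters, a mean and a variance.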

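Note also that we have used lambda abstraction to define the helper function noisy=; we will discuss lambda in more detail later. Now let us condition on the presence of the cylinder, that is, on the illumination being low.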

(define noisy=
   (lambda (target value variance)
     (= 0 (gaussian (- target value) variance))))
(define samples
   (mh-query
    100 100
    (define reflectance (gaussian 1 1))
    (define illumination (gaussian 3 0.5))
    (define luminance (* reflectance illumination))
    reflectance
    (and (noisy= 3.0 luminance 0.1) (noisy= 0.5 illumination 0.1))
   ))
(truehist samples "Reflectance")

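Conditional inference takes into account all the different paths through the generative process that could have produced the data. Two variables on different causal paths can therefore become dependent once we condition on the data they jointly produce. The key point is that the random variables reflectance and illumination are independent in the generative model, but become dependent once we condition on luminance: changing one affects the plausible values of the other. This phenomenon has important consequences for modeling in cognitive science. Although our models of world knowledge and of language embody implicit conditional independencies, as soon as we use a model for conditional inference on some data (as in language parsing or learning), variables that were previously isolated in separate modules can become dependent.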
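The dependence between reflectance and illumination given luminance can be illustrated numerically without a sampler. The following is a rough Python sketch, not part of the original Church tutorial: it approximates the posterior mean of reflectance on a grid, once with only luminance observed and once with low illumination observed as well. Several things here are assumptions: the function names gaussian_weight and posterior_mean_reflectance, the grid ranges and step sizes, reading noisy= as a Gaussian likelihood, and treating the second argument of gaussian as a standard deviation (if it is a variance the numbers change, though the direction of the effect should not).

```python
from math import exp

def gaussian_weight(x, mean, sd):
    """Unnormalized Gaussian density; the constants cancel when we normalize."""
    return exp(-0.5 * ((x - mean) / sd) ** 2)

def posterior_mean_reflectance(observed_illumination=None):
    """Grid approximation to E[reflectance | luminance near 3.0, optional illumination]."""
    total = weighted = 0.0
    for r_step in range(-300, 901, 2):          # reflectance from -3.00 to 9.00, step 0.02
        r = r_step / 100.0
        prior_r = gaussian_weight(r, 1.0, 1.0)  # (gaussian 1 1)
        # Illumination from 0.01 to 6.00, step 0.01; negative values are dropped
        # because the prior (gaussian 3 0.5) puts negligible mass there.
        for i_step in range(1, 601):
            i = i_step / 100.0
            w = prior_r * gaussian_weight(i, 3.0, 0.5)
            w *= gaussian_weight(3.0, r * i, 0.1)                    # noisy= on luminance
            if observed_illumination is not None:
                w *= gaussian_weight(observed_illumination, i, 0.1)  # noisy= on illumination
            total += w
            weighted += w * r
    return weighted / total

print("E[reflectance | luminance]               =",
      round(posterior_mean_reflectance(), 2))
print("E[reflectance | luminance, low lighting] =",
      round(posterior_mean_reflectance(0.5), 2))
```

Learning that the illumination is low pushes the inferred reflectance up, even though the two variables are independent in the generative model: the same luminance now has to be explained by a more reflective surface.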

3.6 Other Vision Examples (Unfinished)

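Kersten's "colored Mach card" demonstration is a beautiful example of both explaining away and screening off in visual perception, and of switching between the two patterns of inference depending on an auxiliary variable that is conditioned on.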

http://vision.psych.umn.edu/users/kersten/kersten-lab/Mutual_illumination/BlojKerstenHurlbertDemo99.pdf

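Depending on how we perceive the geometry of a surface folded down the middle, as concave or as convex, the perceived color of the surface changes, because the visual system either discounts (explains away) or ignores (screens off) the effect of inter-reflections between the surfaces.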

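Kersten's two-cylinder demonstration is another nice example of explaining away. The pattern of gray shading in the left and right images is identical, but on the left the shading is perceived as a difference in reflectance, while on the right (the two cylinders) the same shading is perceived as belonging to surfaces of uniform reflectance.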

Reposted from: https://www.cnblogs.com/limitplus/archive/2011/05/01/2034117.html