Gaussian Process: GPSS Summer School Notes Summary, Part 4 (English Version): Multiple-Output Gaussian Processes

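This article introduces multiple-output Gaussian processes, covering the working situation, the dependencies between processes, the intrinsic coregionalization model (ICM), the semiparametric latent factor model (SLFM), and the linear model of coregionalization (LMC). Through sample analysis and model explanations, it shows how to handle the relationships and dependencies among multiple outputs, and how to build and apply the different models to practical problems.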

Multiple-Output Gaussian Process

Working Situation

ZEgmxe.png

As the picture shows, we want to learn from the three sensors (with complete signal information) to recover the fourth one.

Dependencies between processes

Multiple-independent Output GP

ZEgMqA.png
$$
\begin{aligned}
f_{1}(\mathbf{x}) &\sim \mathcal{GP}\left(0, k_{1}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)\right), &\qquad f_{2}(\mathbf{x}) &\sim \mathcal{GP}\left(0, k_{2}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)\right), \\
\mathcal{D}_{1} &= \left\{\left(\mathbf{x}_{i,1}, y_{1}\left(\mathbf{x}_{i,1}\right)\right) \mid i=1, \ldots, N_{1}\right\}, &\qquad \mathcal{D}_{2} &= \left\{\left(\mathbf{x}_{i,2}, y_{2}\left(\mathbf{x}_{i,2}\right)\right) \mid i=1, \ldots, N_{2}\right\}, \\
\mathbf{y}_{1} &\sim \mathcal{N}\left(\mathbf{0}, \mathbf{K}_{1}+\sigma_{1}^{2} \mathbf{I}\right), &\qquad \mathbf{y}_{2} &\sim \mathcal{N}\left(\mathbf{0}, \mathbf{K}_{2}+\sigma_{2}^{2} \mathbf{I}\right).
\end{aligned}
$$

$$
\begin{bmatrix} \mathbf{y}_{1} \\ \mathbf{y}_{2} \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} \mathbf{K}_{1} & \mathbf{0} \\ \mathbf{0} & \mathbf{K}_{2} \end{bmatrix} + \begin{bmatrix} \sigma_{1}^{2} \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \sigma_{2}^{2} \mathbf{I} \end{bmatrix}\right)
$$
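As a minimal sketch of this block-diagonal structure, the snippet below builds the joint covariance of two independent outputs with NumPy. The squared-exponential helper `rbf`, the input grids, the lengthscales, and the noise levels are all illustrative assumptions, not values from the notes.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

x1 = np.linspace(0.0, 5.0, 4)   # inputs of output 1 (N1 = 4)
x2 = np.linspace(0.0, 5.0, 6)   # inputs of output 2 (N2 = 6)
K1 = rbf(x1, x1, lengthscale=1.0)
K2 = rbf(x2, x2, lengthscale=2.0)
s1, s2 = 0.1, 0.2               # noise standard deviations (assumed)

# Zero off-diagonal blocks encode the independence of the two outputs.
Z = np.zeros((len(x1), len(x2)))
K_joint = np.block([[K1, Z], [Z.T, K2]])
noise = np.diag(np.r_[np.full(len(x1), s1**2), np.full(len(x2), s2**2)])
C = K_joint + noise             # covariance of the stacked vector [y1; y2]
```

With zero cross-covariance blocks, observing one output carries no information about the other, which is why this model cannot recover a missing sensor from the remaining ones.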

How to encode the dependencies in the kernel design

$$
\mathbf{K}_{\mathbf{f}, \mathbf{f}} = \begin{bmatrix} \mathbf{K}_{1} & ? \\ ? & \mathbf{K}_{2} \end{bmatrix}
$$

Build a cross-covariance function $\operatorname{cov}[f_1(\mathbf{x}), f_2(\mathbf{x}^{\prime})]$ such that $\mathbf{K}_{\mathbf{f}, \mathbf{f}}$ is positive semi-definite.
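The positive semi-definiteness requirement is a real constraint: not every choice of off-diagonal block yields a valid covariance. The sketch below is an illustrative assumption rather than part of the notes. It sets both diagonal blocks to the same kernel matrix $\mathbf{K}$, uses $c\,\mathbf{K}$ as the cross-covariance block, and inspects the smallest eigenvalue for two values of $c$.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

x = np.linspace(0.0, 5.0, 8)
K = rbf(x, x)

def joint_cov(c):
    """Joint covariance with K on the diagonal blocks and c*K off the diagonal."""
    return np.block([[K, c * K], [c * K, K]])

def min_eig(M):
    """Smallest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(M).min()

ok = min_eig(joint_cov(0.5))    # |c| <= 1: no negative eigenvalue beyond round-off
bad = min_eig(joint_cov(1.5))   # |c| > 1: a clearly negative eigenvalue appears
```

In this toy setting the joint matrix is positive semi-definite exactly when $|c| \le 1$, so a cross-covariance cannot be picked freely.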

Different input configurations of data

ZEcz2F.md.png
$$
\begin{array}{ll}
\mathcal{D}_{1}=\left\{\left(\mathbf{x}_{i}, f_{1}\left(\mathbf{x}_{i}\right)\right)_{i=1}^{N}\right\} & \qquad \mathcal{D}_{1}=\left\{\left(\mathbf{x}_{i, 1}, f_{1}\left(\mathbf{x}_{i, 1}\right)\right)_{i=1}^{N_{1}}\right\} \\
\mathcal{D}_{2}=\left\{\left(\mathbf{x}_{i}, f_{2}\left(\mathbf{x}_{i}\right)\right)_{i=1}^{N}\right\} & \qquad \mathcal{D}_{2}=\left\{\left(\mathbf{x}_{i, 2}, f_{2}\left(\mathbf{x}_{i, 2}\right)\right)_{i=1}^{N_{2}}\right\}
\end{array}
$$

Intrinsic Coregionalization Model

Two outputs

Sample Once

Consider two outputs $f_1(\mathbf{x})$ and $f_2(\mathbf{x})$ with $\mathbf{x} \in \mathbb{R}^{p}$.

  1. Sample from a GP $u(\mathbf{x}) \sim \mathcal{GP}\left(0, k\left(\mathbf{x}, \mathbf{x}^{\prime}\right)\right)$ to obtain $u^{1}(\mathbf{x})$.
  2. Obtain $f_1(\mathbf{x})$ and $f_2(\mathbf{x})$ by linearly transforming $u^{1}(\mathbf{x})$:

$$
\begin{aligned}
f_{1}(\mathbf{x}) &= a_{1}^{1} u^{1}(\mathbf{x}) \\
f_{2}(\mathbf{x}) &= a_{2}^{1} u^{1}(\mathbf{x})
\end{aligned}
$$

For a fixed value of $\mathbf{x}$, we can group $f_1(\mathbf{x})$ and $f_2(\mathbf{x})$ in a vector:
$$
\mathbf{f}(\mathbf{x}) = \begin{bmatrix} f_{1}(\mathbf{x}) \\ f_{2}(\mathbf{x}) \end{bmatrix}
$$
This vector is referred to as a $\textbf{vector-valued function}$.

The covariance of $\mathbf{f}(\mathbf{x})$ is computed as:
$$
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right) = \mathbb{E}\left\{\mathbf{f}(\mathbf{x})\left[\mathbf{f}\left(\mathbf{x}^{\prime}\right)\right]^{\top}\right\} - \mathbb{E}\{\mathbf{f}(\mathbf{x})\}\left[\mathbb{E}\left\{\mathbf{f}\left(\mathbf{x}^{\prime}\right)\right\}\right]^{\top}
$$

The term $\mathbb{E}\left\{\mathbf{f}(\mathbf{x})\left[\mathbf{f}\left(\mathbf{x}^{\prime}\right)\right]^{\top}\right\}$ expands entry by entry:
$$
\mathbb{E}\left\{\begin{bmatrix} f_{1}(\mathbf{x}) \\ f_{2}(\mathbf{x}) \end{bmatrix} \begin{bmatrix} f_{1}\left(\mathbf{x}^{\prime}\right) & f_{2}\left(\mathbf{x}^{\prime}\right) \end{bmatrix}\right\} = \begin{bmatrix} \mathbb{E}\left\{f_{1}(\mathbf{x}) f_{1}\left(\mathbf{x}^{\prime}\right)\right\} & \mathbb{E}\left\{f_{1}(\mathbf{x}) f_{2}\left(\mathbf{x}^{\prime}\right)\right\} \\ \mathbb{E}\left\{f_{2}(\mathbf{x}) f_{1}\left(\mathbf{x}^{\prime}\right)\right\} & \mathbb{E}\left\{f_{2}(\mathbf{x}) f_{2}\left(\mathbf{x}^{\prime}\right)\right\} \end{bmatrix}
$$
$$
\begin{aligned}
\mathbb{E}\left\{f_{1}(\mathbf{x}) f_{1}\left(\mathbf{x}^{\prime}\right)\right\} &= \mathbb{E}\left\{a_{1}^{1} u^{1}(\mathbf{x})\, a_{1}^{1} u^{1}\left(\mathbf{x}^{\prime}\right)\right\} = \left(a_{1}^{1}\right)^{2} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\} \\
\mathbb{E}\left\{f_{1}(\mathbf{x}) f_{2}\left(\mathbf{x}^{\prime}\right)\right\} &= \mathbb{E}\left\{a_{1}^{1} u^{1}(\mathbf{x})\, a_{2}^{1} u^{1}\left(\mathbf{x}^{\prime}\right)\right\} = a_{1}^{1} a_{2}^{1} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\} \\
\mathbb{E}\left\{f_{2}(\mathbf{x}) f_{2}\left(\mathbf{x}^{\prime}\right)\right\} &= \mathbb{E}\left\{a_{2}^{1} u^{1}(\mathbf{x})\, a_{2}^{1} u^{1}\left(\mathbf{x}^{\prime}\right)\right\} = \left(a_{2}^{1}\right)^{2} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\}
\end{aligned}
$$

The term $\mathbb{E}\left\{\mathbf{f}(\mathbf{x})\left[\mathbf{f}\left(\mathbf{x}^{\prime}\right)\right]^{\top}\right\}$ can therefore be written as:
$$
\begin{aligned}
\mathbb{E}\left\{\mathbf{f}(\mathbf{x})\left[\mathbf{f}\left(\mathbf{x}^{\prime}\right)\right]^{\top}\right\} &= \begin{bmatrix} \left(a_{1}^{1}\right)^{2} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\} & a_{1}^{1} a_{2}^{1} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\} \\ a_{1}^{1} a_{2}^{1} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\} & \left(a_{2}^{1}\right)^{2} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\} \end{bmatrix} \\
&= \begin{bmatrix} \left(a_{1}^{1}\right)^{2} & a_{1}^{1} a_{2}^{1} \\ a_{1}^{1} a_{2}^{1} & \left(a_{2}^{1}\right)^{2} \end{bmatrix} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\}
\end{aligned}
$$
The term $\mathbb{E}\{\mathbf{f}(\mathbf{x})\}$ is computed as:
$$
\mathbb{E}\left\{\begin{bmatrix} f_{1}(\mathbf{x}) \\ f_{2}(\mathbf{x}) \end{bmatrix}\right\} = \begin{bmatrix} \mathbb{E}\left\{f_{1}(\mathbf{x})\right\} \\ \mathbb{E}\left\{f_{2}(\mathbf{x})\right\} \end{bmatrix} = \begin{bmatrix} \mathbb{E}\left\{a_{1}^{1} u^{1}(\mathbf{x})\right\} \\ \mathbb{E}\left\{a_{2}^{1} u^{1}(\mathbf{x})\right\} \end{bmatrix} = \begin{bmatrix} a_{1}^{1} \\ a_{2}^{1} \end{bmatrix} \mathbb{E}\left\{u^{1}(\mathbf{x})\right\}
$$
Putting them together, the covariance of $\mathbf{f}(\mathbf{x})$ follows as:
$$
\begin{bmatrix} \left(a_{1}^{1}\right)^{2} & a_{1}^{1} a_{2}^{1} \\ a_{1}^{1} a_{2}^{1} & \left(a_{2}^{1}\right)^{2} \end{bmatrix} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\} - \begin{bmatrix} a_{1}^{1} \\ a_{2}^{1} \end{bmatrix} \begin{bmatrix} a_{1}^{1} & a_{2}^{1} \end{bmatrix} \mathbb{E}\left\{u^{1}(\mathbf{x})\right\} \mathbb{E}\left\{u^{1}\left(\mathbf{x}^{\prime}\right)\right\}
$$
Defining $\mathbf{a}=\begin{bmatrix} a_{1}^{1} & a_{2}^{1} \end{bmatrix}^{\top}$,
$$
\begin{aligned}
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right) &= \mathbf{a} \mathbf{a}^{\top} \mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\} - \mathbf{a} \mathbf{a}^{\top} \mathbb{E}\left\{u^{1}(\mathbf{x})\right\} \mathbb{E}\left\{u^{1}\left(\mathbf{x}^{\prime}\right)\right\} \\
&= \mathbf{a} \mathbf{a}^{\top} \underbrace{\left[\mathbb{E}\left\{u^{1}(\mathbf{x}) u^{1}\left(\mathbf{x}^{\prime}\right)\right\} - \mathbb{E}\left\{u^{1}(\mathbf{x})\right\} \mathbb{E}\left\{u^{1}\left(\mathbf{x}^{\prime}\right)\right\}\right]}_{k\left(\mathbf{x}, \mathbf{x}^{\prime}\right)} \\
&= \mathbf{a} \mathbf{a}^{\top} k\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
\end{aligned}
$$
We define $\mathbf{B}=\mathbf{a} \mathbf{a}^{\top}$, leading to
$$
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right) = \mathbf{B}\, k\left(\mathbf{x}, \mathbf{x}^{\prime}\right) = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} k\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
$$
The matrix $\mathbf{B}$ has rank one, since it is the outer product of the column vector $\mathbf{a}$ with itself.
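The "sample once" construction can be sketched numerically as follows. The kernel helper `rbf`, the input grid, the jitter term, and the coefficients in `a` are illustrative assumptions, not values from the notes.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
K = rbf(x, x) + 1e-6 * np.eye(len(x))   # small jitter for a stable Cholesky factor

# Step 1: one draw u^1 from GP(0, k), evaluated on the grid x.
u1 = np.linalg.cholesky(K) @ rng.standard_normal(len(x))

# Step 2: both outputs are scaled copies of the same draw.
a = np.array([1.0, -0.5])               # [a_1^1, a_2^1], illustrative values
f1, f2 = a[0] * u1, a[1] * u1

B = np.outer(a, a)                      # coregionalization matrix B = a a^T
rank_B = np.linalg.matrix_rank(B)       # an outer product of a nonzero vector has rank one
```

Because both outputs are multiples of one latent draw, they are perfectly correlated (or anti-correlated), which is the strongest form of coupling a rank-one $\mathbf{B}$ can express.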

Sample Twice

Sample twice from a GP $u(\mathbf{x}) \sim \mathcal{GP}\left(0, k\left(\mathbf{x}, \mathbf{x}^{\prime}\right)\right)$ to obtain $u^{1}(\mathbf{x})$ and $u^{2}(\mathbf{x})$.

Add the scaled samples to obtain the outputs:
$$
\begin{aligned}
f_{1}(\mathbf{x}) &= a_{1}^{1} u^{1}(\mathbf{x}) + a_{1}^{2} u^{2}(\mathbf{x}) \\
f_{2}(\mathbf{x}) &= a_{2}^{1} u^{1}(\mathbf{x}) + a_{2}^{2} u^{2}(\mathbf{x})
\end{aligned}
$$
ZEcx8U.md.png

Notice that $u^{1}$ and $u^{2}$ are independent, although they share the same covariance function $k$. The vector-valued function can be written as
$$
\mathbf{f}(\mathbf{x}) = \begin{bmatrix} a_{1}^{1} & a_{1}^{2} \\ a_{2}^{1} & a_{2}^{2} \end{bmatrix} \begin{bmatrix} u^{1}(\mathbf{x}) \\ u^{2}(\mathbf{x}) \end{bmatrix} = \mathbf{a}^{1} u^{1}(\mathbf{x}) + \mathbf{a}^{2} u^{2}(\mathbf{x}),
$$
where $\mathbf{a}^{1}=\left[a_{1}^{1} \;\; a_{2}^{1}\right]^{\top}$ and $\mathbf{a}^{2}=\left[a_{1}^{2} \;\; a_{2}^{2}\right]^{\top}$.

The covariance of $\mathbf{f}(\mathbf{x})$ is computed as:
$$
\begin{aligned}
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right) &= \mathbf{a}^{1}\left(\mathbf{a}^{1}\right)^{\top} \operatorname{cov}\left(u^{1}(\mathbf{x}), u^{1}\left(\mathbf{x}^{\prime}\right)\right) + \mathbf{a}^{2}\left(\mathbf{a}^{2}\right)^{\top} \operatorname{cov}\left(u^{2}(\mathbf{x}), u^{2}\left(\mathbf{x}^{\prime}\right)\right) \\
&= \mathbf{a}^{1}\left(\mathbf{a}^{1}\right)^{\top} k\left(\mathbf{x}, \mathbf{x}^{\prime}\right) + \mathbf{a}^{2}\left(\mathbf{a}^{2}\right)^{\top} k\left(\mathbf{x}, \mathbf{x}^{\prime}\right) \\
&= \left[\mathbf{a}^{1}\left(\mathbf{a}^{1}\right)^{\top} + \mathbf{a}^{2}\left(\mathbf{a}^{2}\right)^{\top}\right] k\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
\end{aligned}
$$
Because $u^{1}$ and $u^{2}$ are independent, the cross terms vanish and their covariances can be added directly.

We define $\mathbf{B}=\mathbf{a}^{1}\left(\mathbf{a}^{1}\right)^{\top}+\mathbf{a}^{2}\left(\mathbf{a}^{2}\right)^{\top}$, leading to:
$$
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right) = \mathbf{B}\, k\left(\mathbf{x}, \mathbf{x}^{\prime}\right) = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} k\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
$$
Notice that $\mathbf{B}$ has rank two, provided $\mathbf{a}^{1}$ and $\mathbf{a}^{2}$ are linearly independent.
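A short numerical check of the rank claim, with illustrative (assumed) coefficient vectors. It also shows the degenerate case where the second vector is a multiple of the first, in which $\mathbf{B}$ falls back to rank one.

```python
import numpy as np

a1 = np.array([1.0, 0.5])     # [a_1^1, a_2^1], illustrative values
a2 = np.array([-0.3, 0.8])    # [a_1^2, a_2^2], illustrative values

A = np.column_stack([a1, a2])                 # columns are a^1 and a^2
B = np.outer(a1, a1) + np.outer(a2, a2)       # equals A A^T
rank_B = np.linalg.matrix_rank(B)

# Degenerate case: a^2 proportional to a^1 adds no new direction.
B_collinear = np.outer(a1, a1) + np.outer(2.0 * a1, 2.0 * a1)
rank_collinear = np.linalg.matrix_rank(B_collinear)
```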

Observed Data:

ZEg9KJ.md.png
$$
\begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix} = \begin{bmatrix} f_{1}\left(\mathbf{x}_{1}\right) \\ \vdots \\ f_{1}\left(\mathbf{x}_{N}\right) \\ f_{2}\left(\mathbf{x}_{1}\right) \\ \vdots \\ f_{2}\left(\mathbf{x}_{N}\right) \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} b_{11} \mathbf{K} & b_{12} \mathbf{K} \\ b_{21} \mathbf{K} & b_{22} \mathbf{K} \end{bmatrix}\right)
$$
The matrix $\mathbf{K} \in \mathbb{R}^{N \times N}$ has elements $k(\mathbf{x}_i, \mathbf{x}_j)$.

Using the Kronecker product, this can be written as:
$$
\begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix} = \begin{bmatrix} f_{1}\left(\mathbf{x}_{1}\right) \\ \vdots \\ f_{1}\left(\mathbf{x}_{N}\right) \\ f_{2}\left(\mathbf{x}_{1}\right) \\ \vdots \\ f_{2}\left(\mathbf{x}_{N}\right) \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \mathbf{B} \otimes \mathbf{K}\right)
$$
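The equivalence between the explicit block matrix and the Kronecker form can be checked with `numpy.kron`. The kernel helper `rbf` and the entries of `B` are illustrative assumptions.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

x = np.linspace(0.0, 5.0, 5)             # N = 5 shared inputs
K = rbf(x, x)
B = np.array([[1.0, 0.6],
              [0.6, 2.0]])               # illustrative coregionalization matrix

# Explicit block form [[b11 K, b12 K], [b21 K, b22 K]] ...
blocks = np.block([[B[0, 0] * K, B[0, 1] * K],
                   [B[1, 0] * K, B[1, 1] * K]])
# ... and the same matrix as a Kronecker product.
kron = np.kron(B, K)
same = np.allclose(blocks, kron)
```

The ordering matters: `np.kron(B, K)` matches the stacking used here (all values of $f_1$ first, then all values of $f_2$), whereas `np.kron(K, B)` would correspond to interleaving the outputs point by point.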

General Case

Consider a set of functions $\left\{f_{d}(\mathbf{x})\right\}_{d=1}^{D}$.

In the ICM,
$$
f_{d}(\mathbf{x}) = \sum_{i=1}^{R} a_{d}^{i} u^{i}(\mathbf{x})
$$
where the functions $u^{i}(\mathbf{x})$ are GPs sampled independently that share the same covariance function $k(\mathbf{x}, \mathbf{x}^{\prime})$.

For $\mathbf{f}(\mathbf{x})=\left[f_{1}(\mathbf{x}) \cdots f_{D}(\mathbf{x})\right]^{\top}$, the covariance is given as:
$$
\operatorname{cov}\left[\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right] = \mathbf{A} \mathbf{A}^{\top} k\left(\mathbf{x}, \mathbf{x}^{\prime}\right) = \mathbf{B}\, k\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
$$
where
$$
\mathbf{A} = \left[\mathbf{a}^{1} \;\; \mathbf{a}^{2} \;\cdots\; \mathbf{a}^{R}\right]
$$
and the rank of $\mathbf{B}$ is given by $R$, provided the columns of $\mathbf{A}$ are linearly independent.
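The general ICM can be sketched as a matrix product: stacking $R$ independent latent draws as the rows of a matrix `U` gives all outputs at once as `A @ U`. The dimensions, the kernel helper `rbf`, the jitter, and the random `A` below are illustrative assumptions. The check at the end confirms, without Monte Carlo, that the stacked outputs are a linear map of white noise whose covariance is $\mathbf{B} \otimes \mathbf{K}$.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(1)
D, R, N = 3, 2, 20                        # outputs, latent draws, inputs
x = np.linspace(0.0, 5.0, N)
K = rbf(x, x) + 1e-6 * np.eye(N)          # jitter for a stable Cholesky factor
L = np.linalg.cholesky(K)                 # K = L L^T

A = rng.standard_normal((D, R))           # columns are a^1, ..., a^R
B = A @ A.T                               # coregionalization matrix

Z = rng.standard_normal((R, N))           # white noise
U = Z @ L.T                               # row i is a draw u^i on the grid
F = A @ U                                 # row d is f_d on the grid

# Row-stacking F equals (A kron L) applied to row-stacked Z, so the
# covariance of the stacked outputs is (A kron L)(A kron L)^T = B kron K.
M = np.kron(A, L)
```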

ICM: autokrigeability

If the outputs are considered to be noise-free, prediction using the ICM under an isotopic data case is equivalent to independent prediction over each output. This circumstance is also known as autokrigeability.

Proof:

Assume that we only have two outputs, $f_1$ and $f_2$.

The predictive mean can be written as:
$$
\boldsymbol{\mu} = \mathbf{K}_{\mathbf{f}_{*}, \mathbf{f}} \left(\mathbf{K}_{\mathbf{f}, \mathbf{f}}\right)^{-1} \mathbf{f}, \qquad \mathbf{K}_{\mathbf{f}, \mathbf{f}} = \mathbf{B} \otimes \mathbf{K}, \qquad \mathbf{K}_{\mathbf{f}_{*}, \mathbf{f}} = \mathbf{B} \otimes \mathbf{K}_{*}
$$

Assuming $\mathbf{B}$ and $\mathbf{K}$ are invertible,
$$
\begin{aligned}
\boldsymbol{\mu} &= \left(\mathbf{B} \otimes \mathbf{K}_{*}\right)\left(\mathbf{B} \otimes \mathbf{K}\right)^{-1} \mathbf{f} \\
&= \left(\mathbf{B} \otimes \mathbf{K}_{*}\right)\left(\mathbf{B}^{-1} \otimes \mathbf{K}^{-1}\right) \mathbf{f} \\
&= \left(\mathbf{B} \mathbf{B}^{-1} \otimes \mathbf{K}_{*} \mathbf{K}^{-1}\right) \mathbf{f} \\
&= \left(\mathbf{I} \otimes \mathbf{K}_{*} \mathbf{K}^{-1}\right) \mathbf{f} \\
&= \begin{bmatrix} \mathbf{K}_{*} \mathbf{K}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{K}_{*} \mathbf{K}^{-1} \end{bmatrix} \begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix}
\end{aligned}
$$

This means that the prediction for $f_1$ depends only on the data set for $f_1$, and likewise for $f_2$.
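Autokrigeability can also be checked numerically. The sketch below compares the joint ICM predictive mean with two independent single-output predictions on shared, noise-free inputs. The kernel helper `rbf`, the training targets, the test points, and the full-rank `B` are illustrative assumptions.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

x = np.linspace(0.0, 5.0, 8)              # shared training inputs (isotopic case)
xs = np.array([0.7, 2.3, 4.1])            # test inputs
K = rbf(x, x)                             # noise-free training covariance
Ks = rbf(xs, x)                           # test/train cross-covariance K_*

B = np.array([[1.0, 0.6],
              [0.6, 2.0]])                # full rank, hence invertible
f1, f2 = np.sin(x), np.cos(x)             # toy observations of the two outputs
f = np.concatenate([f1, f2])

# Joint ICM prediction with covariance B kron K.
mu_joint = np.kron(B, Ks) @ np.linalg.solve(np.kron(B, K), f)

# Independent predictions, one GP per output.
mu_1 = Ks @ np.linalg.solve(K, f1)
mu_2 = Ks @ np.linalg.solve(K, f2)
agree = np.allclose(mu_joint, np.concatenate([mu_1, mu_2]))
```

The agreement relies on the assumptions stated in the text: shared inputs and no observation noise. Adding a noise term to the joint covariance, or observing the outputs at different inputs, breaks the Kronecker cancellation, and the outputs then inform each other.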

Semiparametric Latent Factor Model (SLFM)

The ICM uses $R$ samples $u^{i}(\mathbf{x})$ from $u(\mathbf{x})$ with the same covariance function. The SLFM uses $Q$ samples from processes $u_{q}(\mathbf{x})$ with different covariance functions.

Two Outputs

  1. Sample from a GP $\mathcal{GP}\left(0, k_{1}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)\right)$ to obtain $u_{1}(\mathbf{x})$.

  2. Sample from a GP $\mathcal{GP}\left(0, k_{2}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)\right)$ to obtain $u_{2}(\mathbf{x})$.

  3. Add the scaled versions:
    $$
    \begin{aligned}
    f_{1}(\mathbf{x}) &= a_{1,1} u_{1}(\mathbf{x}) + a_{1,2} u_{2}(\mathbf{x}) \\
    f_{2}(\mathbf{x}) &= a_{2,1} u_{1}(\mathbf{x}) + a_{2,2} u_{2}(\mathbf{x})
    \end{aligned}
    $$

ZEcvCT.md.png

Similarly, it can be written as:
$$
\mathbf{f}(\mathbf{x}) = \mathbf{a}_{1} u_{1}(\mathbf{x}) + \mathbf{a}_{2} u_{2}(\mathbf{x})
$$
with $\mathbf{a}_{1}=\left[a_{1,1} \;\; a_{2,1}\right]^{\top}$ and $\mathbf{a}_{2}=\left[a_{1,2} \;\; a_{2,2}\right]^{\top}$.

The covariance of $\mathbf{f}(\mathbf{x})$ is computed as:
$$
\begin{aligned}
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right) &= \mathbf{a}_{1}\left(\mathbf{a}_{1}\right)^{\top} \operatorname{cov}\left(u_{1}(\mathbf{x}), u_{1}\left(\mathbf{x}^{\prime}\right)\right) + \mathbf{a}_{2}\left(\mathbf{a}_{2}\right)^{\top} \operatorname{cov}\left(u_{2}(\mathbf{x}), u_{2}\left(\mathbf{x}^{\prime}\right)\right) \\
&= \mathbf{a}_{1}\left(\mathbf{a}_{1}\right)^{\top} k_{1}\left(\mathbf{x}, \mathbf{x}^{\prime}\right) + \mathbf{a}_{2}\left(\mathbf{a}_{2}\right)^{\top} k_{2}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
\end{aligned}
$$
We define $\mathbf{B}_{1}=\mathbf{a}_{1}\left(\mathbf{a}_{1}\right)^{\top}$ and $\mathbf{B}_{2}=\mathbf{a}_{2}\left(\mathbf{a}_{2}\right)^{\top}$, leading to:
$$
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right) = \mathbf{B}_{1} k_{1}\left(\mathbf{x}, \mathbf{x}^{\prime}\right) + \mathbf{B}_{2} k_{2}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
$$
Notice that $\mathbf{B}_{1}$ and $\mathbf{B}_{2}$ each have rank one.

ZEcX5V.png
$$
\begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix} = \begin{bmatrix} f_{1}\left(\mathbf{x}_{1}\right) \\ \vdots \\ f_{1}\left(\mathbf{x}_{N}\right) \\ f_{2}\left(\mathbf{x}_{1}\right) \\ \vdots \\ f_{2}\left(\mathbf{x}_{N}\right) \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \mathbf{B}_{1} \otimes \mathbf{K}_{1} + \mathbf{B}_{2} \otimes \mathbf{K}_{2}\right)
$$
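A minimal sketch of the two-output SLFM covariance, assuming two squared-exponential kernels with different lengthscales and illustrative coefficient vectors (none of these values come from the notes):

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

x = np.linspace(0.0, 5.0, 6)
N = len(x)
K1 = rbf(x, x, lengthscale=0.5)           # covariance of u_1 (short lengthscale)
K2 = rbf(x, x, lengthscale=2.0)           # covariance of u_2 (long lengthscale)

a1 = np.array([1.0, 0.5])                 # [a_{1,1}, a_{2,1}], illustrative
a2 = np.array([-0.3, 0.8])                # [a_{1,2}, a_{2,2}], illustrative
B1, B2 = np.outer(a1, a1), np.outer(a2, a2)   # each has rank one

C = np.kron(B1, K1) + np.kron(B2, K2)     # joint covariance of [f1; f2]
cross = C[:N, N:]                         # cov(f1, f2): a mix of both kernels
min_eig = np.linalg.eigvalsh(C).min()     # non-negative up to round-off
```

Unlike the ICM, the sum is no longer a single Kronecker product, so each output can blend short and long lengthscale behaviour in different proportions.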

General Case:

Consider a set of functions $\left\{f_{d}(\mathbf{x})\right\}_{d=1}^{D}$.

In the SLFM,
$$
f_{d}(\mathbf{x}) = \sum_{q=1}^{Q} a_{d, q} u_{q}(\mathbf{x})
$$
where the functions $u_{q}(\mathbf{x})$ are GPs with covariance functions $k_{q}(\mathbf{x}, \mathbf{x}^{\prime})$.

For $\mathbf{f}(\mathbf{x})=\left[f_{1}(\mathbf{x}) \cdots f_{D}(\mathbf{x})\right]^{\top}$, the covariance is given as:
$$
\operatorname{cov}\left[\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right] = \sum_{q=1}^{Q} \mathbf{A}_{q} \mathbf{A}_{q}^{\top} k_{q}\left(\mathbf{x}, \mathbf{x}^{\prime}\right) = \sum_{q=1}^{Q} \mathbf{B}_{q} k_{q}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
$$
where $\mathbf{A}_{q} = \mathbf{a}_{q}$.

The rank of each $\mathbf{B}_{q}$ is one.

Linear model of coregionalization (LMC)

The LMC generalizes the ICM and the SLFM by allowing several independent samples from GPs with different covariances.

Consider a set of functions $\left\{f_{d}(\mathbf{x})\right\}_{d=1}^{D}$ with
$$
f_{d}(\mathbf{x}) = \sum_{q=1}^{Q} \sum_{i=1}^{R_{q}} a_{d, q}^{i} u_{q}^{i}(\mathbf{x})
$$
where the functions $u_{q}^{i}(\mathbf{x})$ are GPs with zero mean and covariance functions
$$
\operatorname{cov}\left[u_{q}^{i}(\mathbf{x}), u_{q^{\prime}}^{i^{\prime}}\left(\mathbf{x}^{\prime}\right)\right] = k_{q}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
$$
if $i = i^{\prime}$ and $q = q^{\prime}$, and zero otherwise.

There are $Q$ groups of samples. For each group, there are $R_{q}$ samples obtained independently from the same GP with covariance $k_{q}(\mathbf{x}, \mathbf{x}^{\prime})$.

ZEgSv4.md.png

The LMC corresponds to the sum of Q ICMs.

Suppose we have $D = 2$, $Q = 2$, and $R_{q} = 2$. According to the LMC:
$$
\begin{aligned}
f_{1}(\mathbf{x}) &= a_{1,1}^{1} u_{1}^{1}(\mathbf{x}) + a_{1,1}^{2} u_{1}^{2}(\mathbf{x}) + a_{1,2}^{1} u_{2}^{1}(\mathbf{x}) + a_{1,2}^{2} u_{2}^{2}(\mathbf{x}) \\
f_{2}(\mathbf{x}) &= a_{2,1}^{1} u_{1}^{1}(\mathbf{x}) + a_{2,1}^{2} u_{1}^{2}(\mathbf{x}) + a_{2,2}^{1} u_{2}^{1}(\mathbf{x}) + a_{2,2}^{2} u_{2}^{2}(\mathbf{x})
\end{aligned}
$$
For $\mathbf{f}(\mathbf{x})=\left[f_{1}(\mathbf{x}) \cdots f_{D}(\mathbf{x})\right]^{\top}$, the covariance $\operatorname{cov}\left[\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right]$ is given as:
$$
\operatorname{cov}\left[\mathbf{f}(\mathbf{x}), \mathbf{f}\left(\mathbf{x}^{\prime}\right)\right] = \sum_{q=1}^{Q} \mathbf{A}_{q} \mathbf{A}_{q}^{\top} k_{q}\left(\mathbf{x}, \mathbf{x}^{\prime}\right) = \sum_{q=1}^{Q} \mathbf{B}_{q} k_{q}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)
$$
where $\mathbf{A}_{q}=\left[\mathbf{a}_{q}^{1} \;\; \mathbf{a}_{q}^{2} \;\cdots\; \mathbf{a}_{q}^{R_{q}}\right]$.

The rank of each $\mathbf{B}_{q}$ is $R_{q}$.

The matrices $\mathbf{B}_{q}$ are known as the coregionalization matrices.

ZEgCr9.md.png
$$
\begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix} = \begin{bmatrix} f_{1}\left(\mathbf{x}_{1}\right) \\ \vdots \\ f_{1}\left(\mathbf{x}_{N}\right) \\ f_{2}\left(\mathbf{x}_{1}\right) \\ \vdots \\ f_{2}\left(\mathbf{x}_{N}\right) \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \sum_{q=1}^{Q} \mathbf{B}_{q} \otimes \mathbf{K}_{q}\right)
$$
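The LMC covariance can be sketched as a sum of Kronecker products. The helper name `lmc_cov`, the kernel helper `rbf`, the lengthscales, and the random mixing matrices are hypothetical choices made for illustration. The last lines show how the ICM and the SLFM fall out as special cases of the same function.

```python
import numpy as np

def rbf(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel on 1-D inputs (assumed kernel choice)."""
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def lmc_cov(A_list, K_list):
    """Sum over q of (A_q A_q^T) kron K_q (hypothetical helper)."""
    return sum(np.kron(A @ A.T, K) for A, K in zip(A_list, K_list))

rng = np.random.default_rng(3)
x = np.linspace(0.0, 5.0, 6)
K_list = [rbf(x, x, 0.5), rbf(x, x, 2.0)]                # Q = 2 kernels
A_list = [rng.standard_normal((2, 2)) for _ in K_list]   # D = 2, R_q = 2

C = lmc_cov(A_list, K_list)               # full LMC covariance, shape (12, 12)

# Special cases of the same construction:
icm = lmc_cov(A_list[:1], K_list[:1])                    # Q = 1 gives the ICM
slfm = lmc_cov([A[:, :1] for A in A_list], K_list)       # R_q = 1 gives the SLFM
```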
