Multiple-Output Gaussian Process

Working Situation


As the picture shows, we want to learn from three sensors that have complete signal information in order to recover the signal of the fourth one.

Dependencies between processes

Multiple-independent Output GP

$$
\begin{aligned}
f_{1}(\mathbf{x}) &\sim \mathcal{GP}\left(0, k_{1}(\mathbf{x}, \mathbf{x}')\right) & f_{2}(\mathbf{x}) &\sim \mathcal{GP}\left(0, k_{2}(\mathbf{x}, \mathbf{x}')\right) \\
\mathcal{D}_{1} &= \left\{\left(\mathbf{x}_{i,1}, y_{1}(\mathbf{x}_{i,1})\right) \mid i=1,\ldots,N_{1}\right\} & \mathcal{D}_{2} &= \left\{\left(\mathbf{x}_{i,2}, y_{2}(\mathbf{x}_{i,2})\right) \mid i=1,\ldots,N_{2}\right\} \\
\mathbf{y}_{1} &\sim \mathcal{N}\left(\mathbf{0}, \mathbf{K}_{1}+\sigma_{1}^{2}\mathbf{I}\right) & \mathbf{y}_{2} &\sim \mathcal{N}\left(\mathbf{0}, \mathbf{K}_{2}+\sigma_{2}^{2}\mathbf{I}\right)
\end{aligned}
$$

$$
\begin{bmatrix} \mathbf{y}_{1} \\ \mathbf{y}_{2} \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} \mathbf{K}_{1} & \mathbf{0} \\ \mathbf{0} & \mathbf{K}_{2} \end{bmatrix} + \begin{bmatrix} \sigma_{1}^{2}\mathbf{I} & \mathbf{0} \\ \mathbf{0} & \sigma_{2}^{2}\mathbf{I} \end{bmatrix}\right)
$$
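As a minimal sketch of this independent case, the joint covariance can be assembled as a block-diagonal matrix with NumPy. The squared-exponential kernel, the lengthscales, the noise variances, and the input grids below are illustrative assumptions, not values from the text.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

# Each output has its own inputs, kernel, and noise variance (all hypothetical).
x1 = np.linspace(0.0, 1.0, 4)   # N1 = 4 inputs for output 1
x2 = np.linspace(0.0, 1.0, 3)   # N2 = 3 inputs for output 2
K1 = rbf_kernel(x1, x1, lengthscale=0.3)
K2 = rbf_kernel(x2, x2, lengthscale=0.7)
sigma1_sq, sigma2_sq = 0.1, 0.2

N1, N2 = len(x1), len(x2)
cov = np.zeros((N1 + N2, N1 + N2))
cov[:N1, :N1] = K1 + sigma1_sq * np.eye(N1)   # block for y_1
cov[N1:, N1:] = K2 + sigma2_sq * np.eye(N2)   # block for y_2
# The off-diagonal blocks stay zero: the two outputs share no information.
```

Because the off-diagonal blocks are zero, observing one output cannot change the prediction for the other, which is what the models in the following sections are designed to fix.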

How to encode the dependencies in the kernel design

$$
\mathbf{K}_{\mathbf{f},\mathbf{f}} = \begin{bmatrix} \mathbf{K}_{1} & ? \\ ? & \mathbf{K}_{2} \end{bmatrix}
$$

Build a cross-covariance function $\operatorname{cov}[f_1(\mathbf{x}), f_2(\mathbf{x}')]$ such that $\mathbf{K}_{\mathbf{f},\mathbf{f}}$ is positive semi-definite.
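The positive semi-definiteness requirement is easy to violate with an arbitrary guess for the off-diagonal blocks. The sketch below, which assumes a squared-exponential kernel and two hand-picked scalings of the cross block (both hypothetical), checks the smallest eigenvalue of the joint matrix.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def min_eigenvalue(M):
    """Smallest eigenvalue of a symmetric matrix."""
    return np.linalg.eigvalsh(M).min()

x = np.linspace(0.0, 4.0, 5)
K = rbf_kernel(x, x)

# Cross block too strong relative to the diagonal blocks: not a valid covariance.
K_bad = np.block([[K, 2.0 * K], [2.0 * K, K]])
# A weaker cross block keeps the joint matrix positive semi-definite.
K_ok = np.block([[K, 0.5 * K], [0.5 * K, K]])

bad_min = min_eigenvalue(K_bad)   # negative
ok_min = min_eigenvalue(K_ok)     # non-negative up to round-off
```

The coregionalization models below avoid this trial and error by constructing the cross-covariance from shared latent processes, which yields a valid covariance by design.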

Different input configurations of data
All outputs may be observed at the same inputs (left column, the isotopic case), or each output may have its own set of inputs (right column, the heterotopic case):

$$
\begin{array}{ll}
\mathcal{D}_{1}=\left\{\left(\mathbf{x}_{i}, f_{1}(\mathbf{x}_{i})\right)_{i=1}^{N}\right\} & \qquad \mathcal{D}_{1}=\left\{\left(\mathbf{x}_{i,1}, f_{1}(\mathbf{x}_{i,1})\right)_{i=1}^{N_{1}}\right\} \\
\mathcal{D}_{2}=\left\{\left(\mathbf{x}_{i}, f_{2}(\mathbf{x}_{i})\right)_{i=1}^{N}\right\} & \qquad \mathcal{D}_{2}=\left\{\left(\mathbf{x}_{i,2}, f_{2}(\mathbf{x}_{i,2})\right)_{i=1}^{N_{2}}\right\}
\end{array}
$$

Intrinsic Coregionalization Model

Two outputs

Sample Once

Consider two outputs $f_{1}(\mathbf{x})$ and $f_{2}(\mathbf{x})$ with $\mathbf{x} \in \mathbb{R}^{p}$.

  1. Sample from a GP $u(\mathbf{x}) \sim \mathcal{GP}\left(0, k(\mathbf{x}, \mathbf{x}')\right)$ to obtain $u^{1}(\mathbf{x})$.
  2. Obtain $f_{1}(\mathbf{x})$ and $f_{2}(\mathbf{x})$ by linearly transforming $u^{1}(\mathbf{x})$:

$$
\begin{aligned}
f_{1}(\mathbf{x}) &= a_{1}^{1} u^{1}(\mathbf{x}) \\
f_{2}(\mathbf{x}) &= a_{2}^{1} u^{1}(\mathbf{x})
\end{aligned}
$$

For a fixed value of $\mathbf{x}$, we can group $f_{1}(\mathbf{x})$ and $f_{2}(\mathbf{x})$ in a vector:
$$
\mathbf{f}(\mathbf{x}) = \begin{bmatrix} f_{1}(\mathbf{x}) \\ f_{2}(\mathbf{x}) \end{bmatrix}
$$
This vector is referred to as a **vector-valued function**.

The covariance for $\mathbf{f}(\mathbf{x})$ is computed as:
$$
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right) = \mathbb{E}\left\{\mathbf{f}(\mathbf{x})\left[\mathbf{f}(\mathbf{x}')\right]^{\top}\right\} - \mathbb{E}\{\mathbf{f}(\mathbf{x})\}\left[\mathbb{E}\{\mathbf{f}(\mathbf{x}')\}\right]^{\top}
$$

$$
\mathbb{E}\left\{\begin{bmatrix} f_{1}(\mathbf{x}) \\ f_{2}(\mathbf{x}) \end{bmatrix}\begin{bmatrix} f_{1}(\mathbf{x}') & f_{2}(\mathbf{x}') \end{bmatrix}\right\} = \begin{bmatrix} \mathbb{E}\{f_{1}(\mathbf{x}) f_{1}(\mathbf{x}')\} & \mathbb{E}\{f_{1}(\mathbf{x}) f_{2}(\mathbf{x}')\} \\ \mathbb{E}\{f_{2}(\mathbf{x}) f_{1}(\mathbf{x}')\} & \mathbb{E}\{f_{2}(\mathbf{x}) f_{2}(\mathbf{x}')\} \end{bmatrix}
$$
$$
\begin{aligned}
\mathbb{E}\{f_{1}(\mathbf{x}) f_{1}(\mathbf{x}')\} &= \mathbb{E}\{a_{1}^{1} u^{1}(\mathbf{x})\, a_{1}^{1} u^{1}(\mathbf{x}')\} = \left(a_{1}^{1}\right)^{2} \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\} \\
\mathbb{E}\{f_{1}(\mathbf{x}) f_{2}(\mathbf{x}')\} &= \mathbb{E}\{a_{1}^{1} u^{1}(\mathbf{x})\, a_{2}^{1} u^{1}(\mathbf{x}')\} = a_{1}^{1} a_{2}^{1}\, \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\} \\
\mathbb{E}\{f_{2}(\mathbf{x}) f_{2}(\mathbf{x}')\} &= \mathbb{E}\{a_{2}^{1} u^{1}(\mathbf{x})\, a_{2}^{1} u^{1}(\mathbf{x}')\} = \left(a_{2}^{1}\right)^{2} \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\}
\end{aligned}
$$

This term can therefore be written as:
$$
\begin{aligned}
\mathbb{E}\left\{\mathbf{f}(\mathbf{x})\left[\mathbf{f}(\mathbf{x}')\right]^{\top}\right\} &= \begin{bmatrix} \left(a_{1}^{1}\right)^{2} \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\} & a_{1}^{1} a_{2}^{1}\, \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\} \\ a_{1}^{1} a_{2}^{1}\, \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\} & \left(a_{2}^{1}\right)^{2} \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\} \end{bmatrix} \\
&= \begin{bmatrix} \left(a_{1}^{1}\right)^{2} & a_{1}^{1} a_{2}^{1} \\ a_{1}^{1} a_{2}^{1} & \left(a_{2}^{1}\right)^{2} \end{bmatrix} \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\}
\end{aligned}
$$
The term $\mathbb{E}\{\mathbf{f}(\mathbf{x})\}$ is computed as:
$$
\mathbb{E}\left\{\begin{bmatrix} f_{1}(\mathbf{x}) \\ f_{2}(\mathbf{x}) \end{bmatrix}\right\} = \begin{bmatrix} \mathbb{E}\{f_{1}(\mathbf{x})\} \\ \mathbb{E}\{f_{2}(\mathbf{x})\} \end{bmatrix} = \begin{bmatrix} \mathbb{E}\{a_{1}^{1} u^{1}(\mathbf{x})\} \\ \mathbb{E}\{a_{2}^{1} u^{1}(\mathbf{x})\} \end{bmatrix} = \begin{bmatrix} a_{1}^{1} \\ a_{2}^{1} \end{bmatrix} \mathbb{E}\{u^{1}(\mathbf{x})\}
$$
Putting them together, the covariance for $\mathbf{f}(\mathbf{x})$ follows as:
$$
\begin{bmatrix} \left(a_{1}^{1}\right)^{2} & a_{1}^{1} a_{2}^{1} \\ a_{1}^{1} a_{2}^{1} & \left(a_{2}^{1}\right)^{2} \end{bmatrix} \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\} - \begin{bmatrix} a_{1}^{1} \\ a_{2}^{1} \end{bmatrix} \begin{bmatrix} a_{1}^{1} & a_{2}^{1} \end{bmatrix} \mathbb{E}\{u^{1}(\mathbf{x})\}\, \mathbb{E}\{u^{1}(\mathbf{x}')\}
$$
Defining $\mathbf{a} = \begin{bmatrix} a_{1}^{1} & a_{2}^{1} \end{bmatrix}^{\top}$,
$$
\begin{aligned}
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right) &= \mathbf{a}\mathbf{a}^{\top}\, \mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\} - \mathbf{a}\mathbf{a}^{\top}\, \mathbb{E}\{u^{1}(\mathbf{x})\}\, \mathbb{E}\{u^{1}(\mathbf{x}')\} \\
&= \mathbf{a}\mathbf{a}^{\top} \underbrace{\left[\mathbb{E}\{u^{1}(\mathbf{x}) u^{1}(\mathbf{x}')\} - \mathbb{E}\{u^{1}(\mathbf{x})\}\, \mathbb{E}\{u^{1}(\mathbf{x}')\}\right]}_{k(\mathbf{x}, \mathbf{x}')} \\
&= \mathbf{a}\mathbf{a}^{\top} k(\mathbf{x}, \mathbf{x}')
\end{aligned}
$$
We define $\mathbf{B} = \mathbf{a}\mathbf{a}^{\top}$, leading to
$$
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right) = \mathbf{B}\, k(\mathbf{x}, \mathbf{x}') = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} k(\mathbf{x}, \mathbf{x}')
$$
The matrix $\mathbf{B}$ has rank one, since it is the outer product of the column vector $\mathbf{a}$ with itself.
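The single-sample construction can be sketched numerically as follows. The kernel, the input grid, the jitter term, and the coefficients in `a` are assumptions chosen for illustration only.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
N = len(x)
K = rbf_kernel(x, x)

# Draw one latent sample u^1 on the grid; the small jitter keeps Cholesky stable.
L = np.linalg.cholesky(K + 1e-6 * np.eye(N))
u1 = L @ rng.standard_normal(N)

a = np.array([1.5, -0.5])   # hypothetical coefficients a_1^1 and a_2^1
f1 = a[0] * u1              # both outputs are scaled copies of the same sample
f2 = a[1] * u1

B = np.outer(a, a)          # coregionalization matrix B = a a^T
rank_B = np.linalg.matrix_rank(B)
```

With a single latent sample, `f2` is just a rescaled `f1`, so the two outputs are perfectly correlated; this is the practical meaning of a rank-one `B`.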

Sample Twice

Sample twice from a GP $u(\mathbf{x}) \sim \mathcal{GP}\left(0, k(\mathbf{x}, \mathbf{x}')\right)$ to obtain $u^{1}(\mathbf{x})$ and $u^{2}(\mathbf{x})$.

Combine the two samples linearly:
$$
\begin{aligned}
f_{1}(\mathbf{x}) &= a_{1}^{1} u^{1}(\mathbf{x}) + a_{1}^{2} u^{2}(\mathbf{x}) \\
f_{2}(\mathbf{x}) &= a_{2}^{1} u^{1}(\mathbf{x}) + a_{2}^{2} u^{2}(\mathbf{x})
\end{aligned}
$$

Notice that $u^{1}$ and $u^{2}$ are independent, although they share the same covariance $k$. In matrix form,
$$
\mathbf{f}(\mathbf{x}) = \begin{bmatrix} a_{1}^{1} & a_{1}^{2} \\ a_{2}^{1} & a_{2}^{2} \end{bmatrix} \begin{bmatrix} u^{1}(\mathbf{x}) \\ u^{2}(\mathbf{x}) \end{bmatrix}
$$
Equivalently, the vector-valued function can be written as $\mathbf{f}(\mathbf{x}) = \mathbf{a}^{1} u^{1}(\mathbf{x}) + \mathbf{a}^{2} u^{2}(\mathbf{x})$, where $\mathbf{a}^{1} = \begin{bmatrix} a_{1}^{1} & a_{2}^{1} \end{bmatrix}^{\top}$ and $\mathbf{a}^{2} = \begin{bmatrix} a_{1}^{2} & a_{2}^{2} \end{bmatrix}^{\top}$.

The covariance for $\mathbf{f}(\mathbf{x})$ is computed as:
$$
\begin{aligned}
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right) &= \mathbf{a}^{1}\left(\mathbf{a}^{1}\right)^{\top} \operatorname{cov}\left(u^{1}(\mathbf{x}), u^{1}(\mathbf{x}')\right) + \mathbf{a}^{2}\left(\mathbf{a}^{2}\right)^{\top} \operatorname{cov}\left(u^{2}(\mathbf{x}), u^{2}(\mathbf{x}')\right) \\
&= \mathbf{a}^{1}\left(\mathbf{a}^{1}\right)^{\top} k(\mathbf{x}, \mathbf{x}') + \mathbf{a}^{2}\left(\mathbf{a}^{2}\right)^{\top} k(\mathbf{x}, \mathbf{x}') \\
&= \left[\mathbf{a}^{1}\left(\mathbf{a}^{1}\right)^{\top} + \mathbf{a}^{2}\left(\mathbf{a}^{2}\right)^{\top}\right] k(\mathbf{x}, \mathbf{x}')
\end{aligned}
$$
Because $u^{1}$ and $u^{2}$ are independent, the cross terms $\operatorname{cov}\left(u^{1}(\mathbf{x}), u^{2}(\mathbf{x}')\right)$ vanish, so the two contributions simply add.

We define $\mathbf{B} = \mathbf{a}^{1}\left(\mathbf{a}^{1}\right)^{\top} + \mathbf{a}^{2}\left(\mathbf{a}^{2}\right)^{\top}$, leading to:
$$
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right) = \mathbf{B}\, k(\mathbf{x}, \mathbf{x}') = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} k(\mathbf{x}, \mathbf{x}')
$$
Notice that $\mathbf{B}$ now has rank two, provided $\mathbf{a}^{1}$ and $\mathbf{a}^{2}$ are linearly independent.

Observed Data:
$$
\begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix} = \begin{bmatrix} f_{1}(\mathbf{x}_{1}) \\ \vdots \\ f_{1}(\mathbf{x}_{N}) \\ f_{2}(\mathbf{x}_{1}) \\ \vdots \\ f_{2}(\mathbf{x}_{N}) \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} b_{11}\mathbf{K} & b_{12}\mathbf{K} \\ b_{21}\mathbf{K} & b_{22}\mathbf{K} \end{bmatrix}\right)
$$
The matrix $\mathbf{K} \in \mathbb{R}^{N \times N}$ has elements $k(\mathbf{x}_{i}, \mathbf{x}_{j})$.

Using the Kronecker product, this can be written compactly as:
$$
\begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix} = \begin{bmatrix} f_{1}(\mathbf{x}_{1}) \\ \vdots \\ f_{1}(\mathbf{x}_{N}) \\ f_{2}(\mathbf{x}_{1}) \\ \vdots \\ f_{2}(\mathbf{x}_{N}) \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \mathbf{B} \otimes \mathbf{K}\right)
$$
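The Kronecker form maps directly onto `numpy.kron`. The following sketch, with an assumed squared-exponential kernel and hypothetical coefficient vectors `a1` and `a2`, builds the joint covariance and confirms that each block equals the matching entry of `B` times `K`.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

x = np.linspace(0.0, 4.0, 5)
N = len(x)
K = rbf_kernel(x, x)

# Hypothetical mixing vectors for the two latent samples.
a1 = np.array([1.0, 0.5])
a2 = np.array([-0.3, 2.0])
B = np.outer(a1, a1) + np.outer(a2, a2)

# Joint covariance of the stacked vector [f_1; f_2].
K_ff = np.kron(B, K)

# Block (d, d') of K_ff is B[d, d'] * K.
top_right = K_ff[:N, N:]
```

Note the argument order: `np.kron(B, K)` matches the stacking `[f_1; f_2]` used above, whereas `np.kron(K, B)` would correspond to interleaving the outputs point by point.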

General Case

Consider a set of functions $\left\{f_{d}(\mathbf{x})\right\}_{d=1}^{D}$.

In the ICM,
$$
f_{d}(\mathbf{x}) = \sum_{i=1}^{R} a_{d}^{i} u^{i}(\mathbf{x})
$$
where the functions $u^{i}(\mathbf{x})$ are GPs sampled independently that share the same covariance function $k(\mathbf{x}, \mathbf{x}')$.

For $\mathbf{f}(\mathbf{x}) = \left[f_{1}(\mathbf{x}) \cdots f_{D}(\mathbf{x})\right]^{\top}$, the covariance is given as:
$$
\operatorname{cov}\left[\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right] = \mathbf{A}\mathbf{A}^{\top} k(\mathbf{x}, \mathbf{x}') = \mathbf{B}\, k(\mathbf{x}, \mathbf{x}')
$$
$$
\mathbf{A} = \left[\mathbf{a}^{1}\ \mathbf{a}^{2}\ \cdots\ \mathbf{a}^{R}\right]
$$
The rank of $\mathbf{B}$ is at most $R$, and equals $R$ when the columns of $\mathbf{A}$ are linearly independent.
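A quick numerical check of the general case, assuming a randomly drawn mixing matrix `A` with hypothetical sizes `D = 4` and `R = 2`:

```python
import numpy as np

rng = np.random.default_rng(1)
D, R = 4, 2                        # hypothetical numbers of outputs and latent samples
A = rng.standard_normal((D, R))    # column i holds the vector a^i

B = A @ A.T                        # D x D coregionalization matrix
rank_B = np.linalg.matrix_rank(B)

# B is also the sum of the rank-one terms a^i (a^i)^T.
B_sum = sum(np.outer(A[:, i], A[:, i]) for i in range(R))
```

Here `B` is 4 x 4 but only has rank 2, so the four outputs are driven by just two independent latent functions.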

ICM: autokrigeability

If the outputs are considered to be noise-free, prediction using the ICM under an isotopic data case is equivalent to independent prediction over each output. This circumstance is also known as autokrigeability.

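Proof:

Assume that we only have two outputs, $f_{1}$ and $f_{2}$.

The predictive mean can be written as:
$$
\boldsymbol{\mu} = \mathbf{K}_{\mathbf{f}_{*}, \mathbf{f}} \left(\mathbf{K}_{\mathbf{f}, \mathbf{f}}\right)^{-1} \mathbf{f}, \qquad \mathbf{K}_{\mathbf{f}, \mathbf{f}} = \mathbf{B} \otimes \mathbf{K}, \qquad \mathbf{K}_{\mathbf{f}_{*}, \mathbf{f}} = \mathbf{B} \otimes \mathbf{K}_{*}
$$
where $\mathbf{K}_{*}$ holds the kernel values between the test inputs and the training inputs. Assuming $\mathbf{B}$ is invertible and using the mixed-product property of the Kronecker product:

$$
\begin{aligned}
\boldsymbol{\mu} &= (\mathbf{B} \otimes \mathbf{K}_{*}) (\mathbf{B} \otimes \mathbf{K})^{-1} \mathbf{f} \\
&= (\mathbf{B} \otimes \mathbf{K}_{*}) (\mathbf{B}^{-1} \otimes \mathbf{K}^{-1}) \mathbf{f} \\
&= (\mathbf{B}\mathbf{B}^{-1} \otimes \mathbf{K}_{*}\mathbf{K}^{-1}) \mathbf{f} \\
&= (\mathbf{I} \otimes \mathbf{K}_{*}\mathbf{K}^{-1}) \mathbf{f} \\
&= \begin{bmatrix} \mathbf{K}_{*}\mathbf{K}^{-1} & \mathbf{0} \\ \mathbf{0} & \mathbf{K}_{*}\mathbf{K}^{-1} \end{bmatrix} \begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix}
\end{aligned}
$$

This means that the prediction for $f_{1}$ depends only on the data observed for $f_{1}$, and likewise for $f_{2}$.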
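Autokrigeability can also be checked numerically. The sketch below assumes a squared-exponential kernel, a hand-picked full-rank `B`, noise-free observations taken from `sin` and `cos`, and two test inputs; all of these are illustrative choices.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

x = np.linspace(0.0, 5.0, 6)           # shared training inputs (isotopic case)
x_star = np.array([0.5, 2.5])          # test inputs
K = rbf_kernel(x, x)
K_star = rbf_kernel(x_star, x)

B = np.array([[2.0, 0.6], [0.6, 1.0]])  # hypothetical full-rank coregionalization matrix
f1, f2 = np.sin(x), np.cos(x)           # noise-free observations of the two outputs
f = np.concatenate([f1, f2])

# Joint ICM prediction using the full Kronecker-structured covariance.
mu_joint = np.kron(B, K_star) @ np.linalg.solve(np.kron(B, K), f)

# Independent single-output GP predictions.
mu1_indep = K_star @ np.linalg.solve(K, f1)
mu2_indep = K_star @ np.linalg.solve(K, f2)
```

The first half of `mu_joint` matches `mu1_indep` and the second half matches `mu2_indep`, whatever values are placed in `B`, as long as it stays invertible.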

Semiparametric Latent Factor Model (SLFM)

The ICM uses $R$ independent samples $u^{i}(\mathbf{x})$ drawn from a single GP $u(\mathbf{x})$, so all of them share one covariance function. The SLFM instead uses $Q$ latent processes $u_{q}(\mathbf{x})$, one sample from each, and each process has its own covariance function.

Two Outputs

  1. Sample from a GP $\mathcal{GP}\left(0, k_{1}(\mathbf{x}, \mathbf{x}')\right)$ to obtain $u_{1}(\mathbf{x})$.

  2. Sample from a GP $\mathcal{GP}\left(0, k_{2}(\mathbf{x}, \mathbf{x}')\right)$ to obtain $u_{2}(\mathbf{x})$.

  3. Add scaled versions of the two samples:
    $$\begin{aligned} f_{1}(\mathbf{x}) &= a_{1,1} u_{1}(\mathbf{x}) + a_{1,2} u_{2}(\mathbf{x}) \\ f_{2}(\mathbf{x}) &= a_{2,1} u_{1}(\mathbf{x}) + a_{2,2} u_{2}(\mathbf{x}) \end{aligned}$$

Similarly, this can be written as:
$$
\mathbf{f}(\mathbf{x}) = \mathbf{a}_{1} u_{1}(\mathbf{x}) + \mathbf{a}_{2} u_{2}(\mathbf{x})
$$
with $\mathbf{a}_{1} = \begin{bmatrix} a_{1,1} & a_{2,1} \end{bmatrix}^{\top}$ and $\mathbf{a}_{2} = \begin{bmatrix} a_{1,2} & a_{2,2} \end{bmatrix}^{\top}$.

The covariance for $\mathbf{f}(\mathbf{x})$ is computed as:
$$
\begin{aligned}
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right) &= \mathbf{a}_{1}\mathbf{a}_{1}^{\top} \operatorname{cov}\left(u_{1}(\mathbf{x}), u_{1}(\mathbf{x}')\right) + \mathbf{a}_{2}\mathbf{a}_{2}^{\top} \operatorname{cov}\left(u_{2}(\mathbf{x}), u_{2}(\mathbf{x}')\right) \\
&= \mathbf{a}_{1}\mathbf{a}_{1}^{\top} k_{1}(\mathbf{x}, \mathbf{x}') + \mathbf{a}_{2}\mathbf{a}_{2}^{\top} k_{2}(\mathbf{x}, \mathbf{x}')
\end{aligned}
$$
We define $\mathbf{B}_{1} = \mathbf{a}_{1}\mathbf{a}_{1}^{\top}$ and $\mathbf{B}_{2} = \mathbf{a}_{2}\mathbf{a}_{2}^{\top}$, leading to:
$$
\operatorname{cov}\left(\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right) = \mathbf{B}_{1} k_{1}(\mathbf{x}, \mathbf{x}') + \mathbf{B}_{2} k_{2}(\mathbf{x}, \mathbf{x}')
$$
Notice that $\mathbf{B}_{1}$ and $\mathbf{B}_{2}$ each have rank one.

$$
\begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix} = \begin{bmatrix} f_{1}(\mathbf{x}_{1}) \\ \vdots \\ f_{1}(\mathbf{x}_{N}) \\ f_{2}(\mathbf{x}_{1}) \\ \vdots \\ f_{2}(\mathbf{x}_{N}) \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \mathbf{B}_{1} \otimes \mathbf{K}_{1} + \mathbf{B}_{2} \otimes \mathbf{K}_{2}\right)
$$
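The three sampling steps and the resulting covariance can be sketched as below. The two lengthscales, the jitter, and the entries of the mixing matrix `A` are assumptions made for illustration; `A[d-1, q-1]` plays the role of $a_{d,q}$.

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 30)
N = len(x)
jitter = 1e-6 * np.eye(N)               # numerical stabilizer for Cholesky

# Steps 1 and 2: one sample from each latent GP, with different covariances.
K1 = rbf_kernel(x, x, lengthscale=0.5)  # fast-varying latent process
K2 = rbf_kernel(x, x, lengthscale=2.0)  # slow-varying latent process
u1 = np.linalg.cholesky(K1 + jitter) @ rng.standard_normal(N)
u2 = np.linalg.cholesky(K2 + jitter) @ rng.standard_normal(N)

# Step 3: mix the latent samples; row d-1 of F holds f_d on the grid.
A = np.array([[1.0, 0.5], [-0.3, 2.0]])
F = A @ np.vstack([u1, u2])

# Covariance of the stacked vector [f_1; f_2].
B1 = np.outer(A[:, 0], A[:, 0])
B2 = np.outer(A[:, 1], A[:, 1])
K_ff = np.kron(B1, K1) + np.kron(B2, K2)
```

Unlike the ICM, the two outputs here can mix fast and slow variation in different proportions, because each latent process contributes its own kernel.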

General Case:

Consider a set of functions $\left\{f_{d}(\mathbf{x})\right\}_{d=1}^{D}$.

In the SLFM,
$$
f_{d}(\mathbf{x}) = \sum_{q=1}^{Q} a_{d,q} u_{q}(\mathbf{x})
$$
where the functions $u_{q}(\mathbf{x})$ are GPs with covariance functions $k_{q}(\mathbf{x}, \mathbf{x}')$.

For $\mathbf{f}(\mathbf{x}) = \left[f_{1}(\mathbf{x}) \cdots f_{D}(\mathbf{x})\right]^{\top}$, the covariance is given as:
$$
\operatorname{cov}\left[\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right] = \sum_{q=1}^{Q} \mathbf{A}_{q}\mathbf{A}_{q}^{\top} k_{q}(\mathbf{x}, \mathbf{x}') = \sum_{q=1}^{Q} \mathbf{B}_{q} k_{q}(\mathbf{x}, \mathbf{x}')
$$
where $\mathbf{A}_{q} = \mathbf{a}_{q}$.

The rank of each $\mathbf{B}_{q}$ is one.

Linear model of coregionalization (LMC)

The LMC generalizes the ICM and the SLFM by allowing several independent samples from each of several GPs with different covariances.

Consider a set of functions $\left\{f_{d}(\mathbf{x})\right\}_{d=1}^{D}$:
$$
f_{d}(\mathbf{x}) = \sum_{q=1}^{Q} \sum_{i=1}^{R_{q}} a_{d,q}^{i} u_{q}^{i}(\mathbf{x})
$$
where the functions $u_{q}^{i}(\mathbf{x})$ are GPs with zero mean and covariance functions:
$$
\operatorname{cov}\left[u_{q}^{i}(\mathbf{x}), u_{q'}^{i'}(\mathbf{x}')\right] = k_{q}(\mathbf{x}, \mathbf{x}')
$$
if $i = i'$ and $q = q'$, and zero otherwise.

There are $Q$ groups of samples. For each group, there are $R_{q}$ samples obtained independently from the same GP with covariance $k_{q}(\mathbf{x}, \mathbf{x}')$.

The LMC corresponds to the sum of Q ICMs.

Suppose we have $D = 2$, $Q = 2$, and $R_{q} = 2$. According to the LMC:
$$
\begin{aligned}
f_{1}(\mathbf{x}) &= a_{1,1}^{1} u_{1}^{1}(\mathbf{x}) + a_{1,1}^{2} u_{1}^{2}(\mathbf{x}) + a_{1,2}^{1} u_{2}^{1}(\mathbf{x}) + a_{1,2}^{2} u_{2}^{2}(\mathbf{x}) \\
f_{2}(\mathbf{x}) &= a_{2,1}^{1} u_{1}^{1}(\mathbf{x}) + a_{2,1}^{2} u_{1}^{2}(\mathbf{x}) + a_{2,2}^{1} u_{2}^{1}(\mathbf{x}) + a_{2,2}^{2} u_{2}^{2}(\mathbf{x})
\end{aligned}
$$
For $\mathbf{f}(\mathbf{x}) = \left[f_{1}(\mathbf{x}) \cdots f_{D}(\mathbf{x})\right]^{\top}$, the covariance $\operatorname{cov}\left[\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right]$ is given as:
$$
\operatorname{cov}\left[\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{x}')\right] = \sum_{q=1}^{Q} \mathbf{A}_{q}\mathbf{A}_{q}^{\top} k_{q}(\mathbf{x}, \mathbf{x}') = \sum_{q=1}^{Q} \mathbf{B}_{q} k_{q}(\mathbf{x}, \mathbf{x}')
$$
where $\mathbf{A}_{q} = \left[\mathbf{a}_{q}^{1}\ \mathbf{a}_{q}^{2}\ \cdots\ \mathbf{a}_{q}^{R_{q}}\right]$.

The rank of each $\mathbf{B}_{q}$ is at most $R_{q}$, with equality when the columns of $\mathbf{A}_{q}$ are linearly independent.

The matrices $\mathbf{B}_{q}$ are known as the coregionalization matrices.
$$
\begin{bmatrix} \mathbf{f}_{1} \\ \mathbf{f}_{2} \end{bmatrix} = \begin{bmatrix} f_{1}(\mathbf{x}_{1}) \\ \vdots \\ f_{1}(\mathbf{x}_{N}) \\ f_{2}(\mathbf{x}_{1}) \\ \vdots \\ f_{2}(\mathbf{x}_{N}) \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \sum_{q=1}^{Q} \mathbf{B}_{q} \otimes \mathbf{K}_{q}\right)
$$





