02 Theoretical Derivation of the Cokriging Variable-Fidelity Surrogate Model
The Cokriging model is a geostatistical interpolation model developed in the 1970s. In geostatistics, to improve the prediction accuracy of a quantity that is difficult to sample, a Kriging variant was proposed that uses a more easily sampled auxiliary quantity to assist the prediction; this variant is called the Cokriging model. This article derives the relevant Cokriging theory.
2.1 Basic Description of the Surrogate Modeling Problem
For an optimization problem with $m$ design variables, both high- and low-fidelity analyses are sampled in the design space to build a so-called variable-fidelity model; at the same level of approximation accuracy, a variable-fidelity surrogate can significantly reduce the cost of building the model. Let the sampling sites of the high- and low-fidelity analysis codes be:

$$\begin {cases} \mathbf S_1=[\mathbf x_1^{(1)} \ \ \mathbf x_1^{(2)} \ \ \ldots \ \ \mathbf x_1^{(n_1)}]^T\in\R^{n_1\times m} \\ \mathbf S_2=[\mathbf x_2^{(1)} \ \ \mathbf x_2^{(2)} \ \ \ldots \ \ \mathbf x_2^{(n_2)}]^T\in \R^{n_2 \times m} \end {cases} \tag{1}$$
Here the subscripts "1" and "2" denote high and low fidelity, respectively; for example, $n_1$ and $n_2$ are the numbers of high- and low-fidelity sample points (it is reasonable to assume $n_2\gg n_1$). The corresponding objective- or constraint-function responses are:

$$\begin {cases} \mathbf y_1=[y_1^{(1)} \ \ y_1^{(2)} \ \ \ldots \ \ y_1^{(n_1)}]^T\in\R^{n_1} \\ \mathbf y_2=[y_2^{(1)} \ \ y_2^{(2)} \ \ \ldots \ \ y_2^{(n_2)}]^T\in\R^{n_2} \end {cases} \tag{2}$$
A Cokriging model is built by combining the high-fidelity data set $(\mathbf S_1, \mathbf y_1)$ with the low-fidelity data set $(\mathbf S_2,\mathbf y_2)$.
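As a concrete illustration (not part of the original derivation), the two data sets can be generated from an analytic benchmark pair; the one-dimensional Forrester functions below are a common, hypothetical stand-in for an expensive solver and its cheap approximation:

```python
import numpy as np

def f_hi(x):
    """High-fidelity response (Forrester benchmark), standing in for an expensive code."""
    return (6.0 * x - 2.0) ** 2 * np.sin(12.0 * x - 4.0)

def f_lo(x):
    """Cheap low-fidelity approximation of f_hi."""
    return 0.5 * f_hi(x) + 10.0 * (x - 0.5) - 5.0

# S1: n1 expensive samples; S2: n2 cheap samples, with n2 >> n1 (here m = 1).
S1 = np.array([0.0, 0.4, 0.6, 1.0])
S2 = np.linspace(0.0, 1.0, 11)
y1, y2 = f_hi(S1), f_lo(S2)
print(len(S1), len(S2))  # 4 11
```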
2.2 The Cokriging Model and Its Predictor
The Cokriging predictor is defined as:

$$\hat y_1 (\mathbf x)= \boldsymbol \lambda^T\mathbf y_s=\boldsymbol \lambda_1^T\mathbf y_1+\boldsymbol \lambda_2^T\mathbf y_2 \tag{3}$$
where $\boldsymbol \lambda_1$ and $\boldsymbol \lambda_2$ are the weight vectors applied to the high- and low-fidelity responses, respectively. Once the weights $\boldsymbol \lambda_1 = [\lambda_1^{(1)} \ \ \lambda_1^{(2)} \ \ \ldots \ \ \lambda_1^{(n_1)}]$ and $\boldsymbol\lambda_2 = [\lambda_2^{(1)} \ \ \lambda_2^{(2)} \ \ \ldots \ \ \lambda_2^{(n_2)}]$ are known, the response at any point of the design space follows. To compute these weights, a statistical assumption is introduced: there exist two stationary Gaussian random processes corresponding to $y_1$ and $y_2$, respectively:
$$\begin {cases} Y_1(\mathbf x) =\beta_1+ Z_1(\mathbf x) \\ Y_2(\mathbf x)=\beta_2+ Z_2(\mathbf x)\end {cases} \tag {4}$$
where $\beta_1$ and $\beta_2$ are unknown constants, also called the global trend terms; they equal the expectations of $Y_1(\mathbf x)$ and $Y_2(\mathbf x)$, respectively. $Z_1(\cdot)$ and $Z_2(\cdot)$ are stationary random processes with zero mean and variances $\sigma_1^2$ and $\sigma_2^2$. The covariances and cross-covariance between random variables at different sites of the design space are defined as:
$$\begin {cases} \mathrm{Cov}(Z_1(\mathbf x_1^{(i)}),Z_1(\mathbf x_1^{(j)}))=\sigma_1^2R^{(11)}(\mathbf x_1^{(i)},\mathbf x_1^{(j)}) \\ \mathrm{Cov}(Z_2(\mathbf x_2^{(i)}),Z_2(\mathbf x_2^{(j)}))=\sigma_2^2 R^{(22)} (\mathbf x_2^{(i)},\mathbf x_2^{(j)}) \\ \mathrm{Cov}(Z_1(\mathbf x_1^{(i)}),Z_2(\mathbf x_2^{(j)}))=\sigma_1\sigma_2R^{(12)}(\mathbf x_1^{(i)},\mathbf x_2^{(j)}) \end {cases} \tag{5}$$
where $\sigma_1^2$ and $\sigma_2^2$ are the process variances of $Y_1(\mathbf x)$ and $Y_2(\mathbf x)$, respectively.
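A minimal numerical sketch of these covariance blocks, assuming a shared one-dimensional Gaussian correlation function and illustrative values for $\sigma_1$, $\sigma_2$, and the correlation parameter $\theta$ (all hypothetical, not prescribed by the derivation):

```python
import numpy as np

def corr(xa, xb, theta=10.0):
    """Gaussian correlation R(x, x') = exp(-theta * (x - x')^2), 1-D case."""
    return np.exp(-theta * np.subtract.outer(xa, xb) ** 2)

sigma1, sigma2 = 2.0, 1.5                 # illustrative process standard deviations
S1 = np.array([0.1, 0.5, 0.9])            # high-fidelity sites (hypothetical)
S2 = np.array([0.0, 0.3, 0.7, 1.0])       # low-fidelity sites (hypothetical)

C11 = sigma1 ** 2 * corr(S1, S1)          # Cov(Z1, Z1), first line of eq. (5)
C22 = sigma2 ** 2 * corr(S2, S2)          # Cov(Z2, Z2)
C12 = sigma1 * sigma2 * corr(S1, S2)      # cross-covariance Cov(Z1, Z2)

# Full covariance of the stacked samples [S1; S2]: symmetric positive definite.
C = np.block([[C11, C12], [C12.T, C22]])
print(C.shape)  # (7, 7)
```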
Under these assumptions, the Cokriging model seeks the optimal weights $\boldsymbol \lambda_1$ and $\boldsymbol \lambda_2$ such that the mean squared error
$$MSE(\hat y_1(\mathbf x))=E[(\boldsymbol \lambda_1^T\mathbf Y_1+\boldsymbol \lambda_2^T \mathbf Y_2- Y_1(\mathbf x))^2] \tag{6}$$
is minimized, subject to the unbiasedness condition:

$$E[\boldsymbol \lambda_1^T\mathbf Y_1+\boldsymbol \lambda_2^T\mathbf Y_2]=E[Y_1(\mathbf x)] \tag{7}$$
The unbiasedness condition simplifies to:

$$\beta_1\left(\sum_{i=1}^{n_1}\lambda_1^{(i)}-1\right)+\beta_2\sum_{j=1}^{n_2}\lambda_2^{(j)}=0 \tag{8}$$
To make the unbiasedness condition independent of the values of $\beta_1$ and $\beta_2$, the following stronger condition is imposed:

$$\sum_{i=1}^{n_1}\lambda_1^{(i)}-1=0,\ \ \sum_{i=1}^{n_2}\lambda_2^{(i)}=0 \tag{9}$$
Using the method of Lagrange multipliers, define:

$$L(\boldsymbol\lambda_1,\boldsymbol \lambda_2,\mu_1,\mu_2)=E[(\boldsymbol \lambda_1^T\mathbf Y_1+\boldsymbol \lambda_2^T\mathbf Y_2-Y_1(\mathbf x))^2] + \mu_1\left(\sum_{i=1}^{n_1}\lambda_1^{(i)}-1\right)+\mu_2\sum_{j=1}^{n_2}\lambda_2^{(j)} \tag{10}$$
Setting the partial derivatives of $L$ to zero:

$$\begin {cases}\frac {\partial L(\boldsymbol \lambda_1,\boldsymbol \lambda_2,\mu_1,\mu_2)}{\partial \lambda_1^{(i)}}= E[2(\boldsymbol \lambda_1^T\mathbf Y_1+\boldsymbol \lambda_2^T\mathbf Y_2-Y_1(\mathbf x))\cdot Y_1(\mathbf x_1^{(i)})] +\mu_1= 0 \\ \frac {\partial L(\boldsymbol \lambda_1,\boldsymbol \lambda_2,\mu_1,\mu_2)}{\partial \lambda_2^{(i)}}=E[2(\boldsymbol \lambda_1^T\mathbf Y_1+\boldsymbol\lambda_2^T\mathbf Y_2-Y_1(\mathbf x))\cdot Y_2(\mathbf x_2^{(i)})]+\mu_2=0 \\ \frac{\partial L(\boldsymbol\lambda_1,\boldsymbol \lambda_2,\mu_1,\mu_2)}{\partial\mu_1}= \sum_{i=1}^{n_1}\lambda_1^{(i)}-1 =0 \\ \frac {\partial L(\boldsymbol \lambda_1,\boldsymbol \lambda_2,\mu_1,\mu_2)}{\partial \mu_2}=\sum_{i=1}^{n_2}\lambda_2^{(i)}=0 \end {cases} \tag{11}$$
Taking the first line of Eq. (11) as an example, the expansion proceeds as follows:
$$\begin {aligned} \frac {\partial {L(\boldsymbol \lambda_1,\boldsymbol \lambda_2,\mu_1,\mu_2)}}{\partial \lambda_1^{(i)}} &= 2E\Big[\Big(\sum_{j=1}^{n_1}\lambda_1^{(j)} Y_1(\mathbf x_1^{(j)})\Big)\cdot Y_1(\mathbf x_1^{(i)})\Big]+2E\Big[\Big(\sum_{j=1}^{n_2}\lambda_2^{(j)}Y_2(\mathbf x_2^{(j)})\Big) \cdot Y_1(\mathbf x_1^{(i)})\Big]-2E[Y_1(\mathbf x)\cdot Y_1(\mathbf x_1^{(i)})] + \mu_1 \\ &= 2\sum_{j=1}^{n_1}\lambda_1^{(j)}\big(\beta_1^2+\sigma_1^2R^{(11)}(\mathbf x_1^{(i)},\mathbf x_1^{(j)})\big)+2\sum_{j=1}^{n_2}\lambda_2^{(j)}\big(\beta_1\beta_2+\sigma_1\sigma_2R^{(12)}(\mathbf x_1^{(i)},\mathbf x_2^{(j)})\big)-2\big(\beta_1^2+\sigma_1^2R^{(11)}(\mathbf x,\mathbf x_1^{(i)})\big)+\mu_1 \\ &=2\Big(\sum_{j=1}^{n_1}\lambda_1^{(j)}-1\Big)\beta_1^2+2\beta_1\beta_2\sum_{j=1}^{n_2}\lambda_2^{(j)}+2\sigma_1^2\sum_{j=1}^{n_1}\lambda_1^{(j)}R^{(11)}(\mathbf x_1^{(i)},\mathbf x_1^{(j)})+2\sigma_1\sigma_2\sum_{j=1}^{n_2} \lambda_2^{(j)}R^{(12)}(\mathbf x_1^{(i)},\mathbf x_2^{(j)})-2\sigma_1^2R^{(11)}(\mathbf x,\mathbf x_1^{(i)})+\mu_1 \\ &= 2\sigma_1^2\sum_{j=1}^{n_1}\lambda_1^{(j)}R^{(11)}(\mathbf x_1^{(i)},\mathbf x_1^{(j)})+2\sigma_1\sigma_2\sum_{j=1}^{n_2}\lambda_2^{(j)}R^{(12)}(\mathbf x_1^{(i)},\mathbf x_2^{(j)})-2\sigma_1^2R^{(11)}(\mathbf x,\mathbf x_1^{(i)}) +\mu_1 \end {aligned} \tag{12}$$

where the trend terms vanish between the third and fourth lines because of the unbiasedness conditions in Eq. (9).
The same procedure applied to the second line of Eq. (11) gives:
$$\frac {\partial L(\boldsymbol\lambda_1,\boldsymbol\lambda_2,\mu_1,\mu_2)}{\partial \lambda_2^{(i)}}=2\sigma_1\sigma_2\sum_{j=1}^{n_1}\lambda_1^{(j)}R^{(12)}(\mathbf x_1^{(j)},\mathbf x_2^{(i)})+2\sigma_2^2\sum_{j=1}^{n_2}\lambda_2^{(j)}R^{(22)}(\mathbf x_2^{(i)},\mathbf x_2^{(j)})-2\sigma_1\sigma_2R^{(12)}(\mathbf x,\mathbf x_2^{(i)})+\mu_2 \tag{13}$$
Eq. (11) therefore simplifies to:

$$\begin {cases}2\sigma_1^2\sum_{j=1}^{n_1}\lambda_1^{(j)}R^{(11)}(\mathbf x_1^{(i)},\mathbf x_1^{(j)})+2\sigma_1\sigma_2\sum_{j=1}^{n_2}\lambda_2^{(j)}R^{(12)}(\mathbf x_1^{(i)},\mathbf x_2^{(j)})-2\sigma_1^2R^{(11)}(\mathbf x,\mathbf x_1^{(i)}) +\mu_1=0 \\ 2\sigma_1\sigma_2\sum_{j=1}^{n_1}\lambda_1^{(j)}R^{(12)}(\mathbf x_1^{(j)},\mathbf x_2^{(i)})+2\sigma_2^2\sum_{j=1}^{n_2}\lambda_2^{(j)}R^{(22)}(\mathbf x_2^{(i)},\mathbf x_2^{(j)})-2\sigma_1\sigma_2R^{(12)}(\mathbf x,\mathbf x_2^{(i)})+\mu_2=0 \\ \sum_{i=1}^{n_1}\lambda_1^{(i)}-1 =0 \\ \sum_{i=1}^{n_2} \lambda_2^{(i)}=0 \end{cases} \tag{14}$$
Rewriting in matrix form:

$$\begin {bmatrix} \mathbf C^{(11)} & \mathbf C^{(12)} & \mathbf F_1 & \mathbf 0 \\ \mathbf C^{(21)} & \mathbf C^{(22)} & \mathbf 0 & \mathbf F_2 \\ \mathbf F_1^T & \mathbf 0^T & 0 &0 \\ \mathbf 0^T & \mathbf F_2^T & 0 & 0\end {bmatrix} \begin{bmatrix} \boldsymbol\lambda_1 \\ \boldsymbol \lambda_2 \\ \mu_1/2 \\ \mu_2/2 \end {bmatrix} =\begin {bmatrix} \mathbf c_1 \\ \mathbf c_2 \\1 \\0 \end {bmatrix}$$
where:

$$\begin {aligned} &\mathbf C^{(11)}=(\sigma_1^2R^{(11)}(\mathbf x_1^{(i)},\mathbf x_1^{(j)}))_{i,j} \in\R^{n_1\times n_1} \\ & \mathbf C^{(12)}=(\sigma_1\sigma_2R^{(12)}(\mathbf x_1^{(i)},\mathbf x_2^{(j)}))_{i,j}=(\mathbf C^{(21)})^T \in \R^{n_1 \times n_2} \\ &\mathbf C^{(22)}=(\sigma_2^2R^{(22)}(\mathbf x_2^{(i)},\mathbf x_2^{(j)}))_{i,j} \in \R^{n_2\times n_2} \\ & \mathbf c_1=(\sigma_1^2R^{(11)}(\mathbf x,\mathbf x_1^{(i)}))_i \in\R^{n_1} \\ & \mathbf c_2=(\sigma_1\sigma_2R^{(12)}(\mathbf x,\mathbf x_2^{(i)}))_i \in \R^{n_2} \\ & \mathbf F_1=[1 \ \ 1 \ \ \ldots \ \ 1]^T\in\R^{n_1} \\ & \mathbf F_2=[1 \ \ 1 \ \ \ldots \ \ 1]^T\in\R^{n_2} \end {aligned}$$
To eliminate $\sigma_1$ and $\sigma_2$ from the system above, let:

$$\begin {aligned} &\widetilde {\boldsymbol \lambda}_1=\boldsymbol \lambda_1, \ \ \widetilde {\boldsymbol \lambda}_2=\frac {\sigma_2}{\sigma_1}\boldsymbol\lambda_2 \\ &\widetilde \mu_1=\mu_1/(2\sigma_1^2), \ \ \widetilde {\mu}_2=\mu_2/(2\sigma_1\sigma_2) \end {aligned} \tag{15}$$
The system then becomes:

$$\begin {bmatrix} \mathbf R^{(11)} & \mathbf R^{(12)} & \mathbf F_1 & \mathbf 0 \\ \mathbf R^{(21)} & \mathbf R^{(22)} & \mathbf 0 & \mathbf F_2 \\ \mathbf F_1^T & \mathbf 0^T & 0 & 0 \\ \mathbf 0^T & \mathbf F_2^T & 0 & 0 \end {bmatrix} \begin {bmatrix} \widetilde {\boldsymbol \lambda}_1 \\ \widetilde {\boldsymbol \lambda}_2 \\ \widetilde \mu_1 \\ \widetilde \mu_2 \end {bmatrix}=\begin {bmatrix} \mathbf r_1(\mathbf x) \\ \mathbf r_2(\mathbf x) \\ 1 \\ 0 \end {bmatrix} \tag{16}$$
where:

$$\begin {aligned} & \mathbf R^{(11)} = (R^{(11)}(\mathbf x_1^{(i)},\mathbf x_1^{(j)}))_{i,j}\in\R^{n_1 \times n_1} \\ & \mathbf R^{(12)} =(R^{(12)}(\mathbf x_1^{(i)},\mathbf x_2^{(j)}))_{i,j}=(\mathbf R^{(21)})^T \in\R^{n_1 \times n_2} \\ & \mathbf R^{(22)} =(R^{(22)}(\mathbf x_2^{(i)},\mathbf x_2^{(j)}))_{i,j}\in\R^{n_2 \times n_2} \\ & \mathbf r_1(\mathbf x)=(R^{(11)}(\mathbf x,\mathbf x_1^{(i)}))_i \in \R^{n_1} \\ & \mathbf r_2(\mathbf x)=(R^{(12)}(\mathbf x,\mathbf x_2^{(i)}))_i\in\R^{n_2} \end {aligned} \tag{17}$$
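Eq. (16) can be solved directly as one bordered linear system. The sketch below uses hypothetical sample sites and, purely to keep the example small, assumes a single shared Gaussian correlation model for $R^{(11)}$, $R^{(12)}$, and $R^{(22)}$; the solved weights satisfy the unbiasedness conditions of Eq. (9):

```python
import numpy as np

def corr(xa, xb, theta=10.0):
    """Shared Gaussian correlation model (a simplifying assumption for this sketch)."""
    return np.exp(-theta * np.subtract.outer(xa, xb) ** 2)

S1 = np.array([0.1, 0.5, 0.9])           # HF sites (hypothetical)
S2 = np.array([0.0, 0.3, 0.7, 1.0])      # LF sites, distinct so the system is regular
n1, n2 = len(S1), len(S2)

R11, R12, R22 = corr(S1, S1), corr(S1, S2), corr(S2, S2)
F1, F2 = np.ones((n1, 1)), np.ones((n2, 1))

# Left-hand side of eq. (16): correlation blocks bordered by the constraint rows.
A = np.block([
    [R11,               R12,               F1,                np.zeros((n1, 1))],
    [R12.T,             R22,               np.zeros((n2, 1)), F2],
    [F1.T,              np.zeros((1, n2)), np.zeros((1, 1)),  np.zeros((1, 1))],
    [np.zeros((1, n1)), F2.T,              np.zeros((1, 1)),  np.zeros((1, 1))],
])

x = 0.37                                 # an arbitrary prediction point
rhs = np.concatenate([corr(np.array([x]), S1).ravel(),
                      corr(np.array([x]), S2).ravel(),
                      [1.0, 0.0]])
sol = np.linalg.solve(A, rhs)
lam1_t, lam2_t = sol[:n1], sol[n1:n1 + n2]
print(lam1_t.sum(), lam2_t.sum())        # ~1 and ~0: the unbiasedness conditions
```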
The Cokriging model can then be written as:

$$\hat y_1(\mathbf x) = \widetilde {\boldsymbol \lambda}_1^T\mathbf y_1+\frac {\sigma_1}{\sigma_2}\widetilde {\boldsymbol \lambda}_2^T\mathbf y_2=\begin {bmatrix} \widetilde{\boldsymbol \lambda}_1^T & \widetilde {\boldsymbol \lambda}_2^T \end {bmatrix} \begin{bmatrix} \mathbf y_1 \\ \frac {\sigma_1}{\sigma_2}\mathbf y_2 \end {bmatrix} \tag{18}$$
Combining this with Eq. (16), the matrix form is:

$$\hat y_1(\mathbf x)=\begin {bmatrix} \mathbf r_1(\mathbf x) \\ \mathbf r_2(\mathbf x) \\ 1 \\0 \end {bmatrix}^T \begin {bmatrix}\mathbf R^{(11)} & \mathbf R^{(12)} & \mathbf F_1 & \mathbf 0 \\ \mathbf R^{(21)} & \mathbf R^{(22)} & \mathbf 0 & \mathbf F_2 \\ \mathbf F_1^T & \mathbf 0^T &0 &0 \\ \mathbf 0^T &\mathbf F_2^T & 0 & 0 \end {bmatrix}^{-1} \begin {bmatrix} \mathbf y_1 \\ \frac {\sigma_1}{\sigma_2}\mathbf y_2 \\ 0 \\ 0 \end {bmatrix} \tag{19}$$
Following the derivation of the ordinary Kriging model, the Cokriging predictor can be expressed as:

$$\hat y_1(\mathbf x)= \boldsymbol \varphi^T\widetilde {\boldsymbol\beta}+\mathbf r(\mathbf x)^T\mathbf R^{-1}(\widetilde {\mathbf y}_s-\mathbf F\widetilde{\boldsymbol \beta}) \tag {20}$$
where:

$$\begin {aligned} &\boldsymbol \varphi = \begin {bmatrix}1 \\ 0 \end {bmatrix}, \ \ \widetilde {\boldsymbol\beta}=\begin {bmatrix} \widetilde\beta_1 \\ \widetilde\beta_2 \end {bmatrix}=(\mathbf F^T\mathbf R^{-1}\mathbf F)^{-1}\mathbf F^T\mathbf R^{-1}\widetilde {\mathbf y}_s, \\ &\mathbf r(\mathbf x)=\begin {bmatrix} \mathbf r_1(\mathbf x) \\ \mathbf r_2(\mathbf x) \end {bmatrix} , \ \ \mathbf R=\begin {bmatrix} \mathbf R^{(11)} & \mathbf R^{(12)} \\ \mathbf R^{(21)} & \mathbf R^{(22)}\end {bmatrix}, \\ & \widetilde {\mathbf y}_s=\begin {bmatrix}\mathbf y_1 \\ \frac{\sigma_1}{\sigma_2} \mathbf y_2 \end {bmatrix}, \ \ \mathbf F= \begin {bmatrix} \mathbf F_1 & \mathbf 0 \\ \mathbf 0 & \mathbf F_2\end {bmatrix} \tag{21} \end {aligned}$$
In Eq. (20), let $\mathbf v_{cok}=\mathbf R^{-1}(\widetilde {\mathbf y}_s-\mathbf F\widetilde {\boldsymbol \beta})$. Both $\boldsymbol \varphi^T \widetilde {\boldsymbol\beta}$ and $\mathbf v_{cok}$ depend only on the known sample points, not on the prediction point, so they can be computed once and stored after training. Predicting the response at any $\mathbf x$ then only requires the dot product of $\mathbf r(\mathbf x)$ with $\mathbf v_{cok}$.
Similarly, the mean squared error of the predictor is obtained as:

$$MSE[\hat y_1(\mathbf x)]=\sigma_1^2[1.0-\mathbf r^T\mathbf R^{-1}\mathbf r+(\mathbf r^T\mathbf R^{-1}\mathbf F - \boldsymbol \varphi^T)(\mathbf F^T\mathbf R^{-1}\mathbf F)^{-1}(\mathbf r^T\mathbf R^{-1}\mathbf F-\boldsymbol\varphi^T)^T] \tag{22}$$
2.3 Correlation Functions
In the Cokriging model expression, constructing the correlation matrix $\mathbf R$ and the correlation vector $\mathbf r$ involves the choice and evaluation of a correlation function, in the same way as when building a Kriging model. Commonly used correlation functions are listed in the following table:
| Correlation function | $R(\mathbf x , \mathbf x')$ |
|---|---|
| Exponential | $\exp\{-\sum_{i=1}^m\theta_i \lvert x_i -x_i' \rvert \}$ |
| General exponential (power) | $\exp\{-\sum_{i=1}^m\theta_i \lvert x_i-x_i' \rvert^{p_i}\}$ |
| Gaussian | $\exp\{-\sum_{i=1}^m\theta_i(x_i-x_i')^2\}$ |
| Linear | $\max\{0,1-\sum_{i=1}^m\theta_i \lvert x_i-x_i' \rvert \}$ |
| Spherical (ball) | $1-1.5\zeta+0.5 \zeta ^3,\ \ \zeta=\min\{ 1,\sum_{i=1}^m\theta_i \lvert x_i-x_i' \rvert \}$ |
| Cubic spline | $\begin {cases}1-15\xi^2+30\xi^3 & 0\leq\xi\leq0.2 \\ 1.25(1-\xi)^3 & 0.2<\xi\leq 1 \\ 0 & \xi>1 \end {cases},\ \ \xi=\sum_{i=1}^m\theta_i \lvert x_i-x_i' \rvert$ |
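A few rows of the table, sketched in Python with illustrative per-dimension hyperparameters $\theta_i$ (values assumed, not prescribed):

```python
import numpy as np

theta = np.array([5.0, 2.0])   # illustrative hyperparameters for m = 2 dimensions

def gaussian(x, xp):
    """R = exp{-sum_i theta_i (x_i - x'_i)^2}."""
    d = np.asarray(x, float) - np.asarray(xp, float)
    return float(np.exp(-np.sum(theta * d ** 2)))

def linear(x, xp):
    """R = max{0, 1 - sum_i theta_i |x_i - x'_i|}."""
    d = np.abs(np.asarray(x, float) - np.asarray(xp, float))
    return max(0.0, 1.0 - float(np.sum(theta * d)))

def spherical(x, xp):
    """R = 1 - 1.5 z + 0.5 z^3 with z = min{1, sum_i theta_i |x_i - x'_i|}."""
    d = np.abs(np.asarray(x, float) - np.asarray(xp, float))
    z = min(1.0, float(np.sum(theta * d)))
    return 1.0 - 1.5 * z + 0.5 * z ** 3

p = [0.3, 0.7]
print(gaussian(p, p), linear(p, p), spherical(p, p))  # 1.0 1.0 1.0 at zero distance
```

All three equal 1 at zero separation and decay toward 0 as the weighted distance grows, which is what makes them valid correlation models.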
2.4 Training the Model Parameters
The hyperparameters of the Cokriging model are trained by maximum likelihood estimation. By the stationary Gaussian random process assumption introduced above, the response at any point $\mathbf x$ follows a normal distribution $N(\beta_0,\sigma^2)$, so the probability density at each sample point is:
$$P[Y(\mathbf x^{(i)})]= \frac {1}{\sqrt{2\pi}\sigma}\exp\left( -\frac{(y^{(i)}-\beta_0)^2}{2\sigma^2} \right),\ \ i=1, \ldots,n_1+n_2 \tag{23}$$
The joint density over all sample points, i.e. the likelihood function, is then:

$$L(\widetilde {\boldsymbol\beta},\sigma_1/\sigma_2,\boldsymbol \theta^{(11)},\boldsymbol \theta^{(12)},\boldsymbol \theta^{(22)})=\frac{1}{\sqrt {(2\pi\sigma_1^2)^{(n_1+n_2)}|\mathbf R|}}\cdot \exp\left(-\frac{1}{2}\frac {(\widetilde {\mathbf y}_s-\mathbf F\widetilde{\boldsymbol \beta})^T\mathbf R^{-1}(\widetilde {\mathbf y}_s-\mathbf F\widetilde{\boldsymbol \beta})}{\sigma_1^2}\right) \tag{24}$$
As in the derivation of the Kriging model, maximizing the likelihood yields:

$$\begin {aligned} &\widetilde {\boldsymbol \beta}=(\mathbf F^T\mathbf R^{-1}\mathbf F)^{-1}\mathbf F^T\mathbf R^{-1}\widetilde {\mathbf y}_s \\ & \frac{\sigma_1}{\sigma_2}=\left( \begin {bmatrix} \mathbf 0 \\ \mathbf y_2\end {bmatrix}^T\mathbf R^{-1}\begin {bmatrix} \mathbf 0 \\ \mathbf y_2 \end {bmatrix} \right)^{-1}\begin {bmatrix} \mathbf 0 \\ \mathbf y_2 \end {bmatrix}^T \mathbf R^{-1}\begin {bmatrix} -(\mathbf y_1-\mathbf F_1 \widetilde \beta_1) \\ -\mathbf F_2\widetilde \beta_2 \end {bmatrix} \\ & \sigma_1^2=\frac{(\widetilde {\mathbf y}_s-\mathbf F\widetilde {\boldsymbol\beta})^T\mathbf R^{-1}(\widetilde {\mathbf y}_s-\mathbf F\widetilde{\boldsymbol \beta})}{n_1+n_2} \end {aligned} \tag {25}$$
Substituting these back into the likelihood of Eq. (24) and taking the logarithm of both sides gives, up to a constant:

$$\ln[L(\boldsymbol\theta^{(11)},\boldsymbol \theta^{(12)},\boldsymbol \theta^{(22)})]=-\frac {1}{2}[(n_1+n_2)\cdot \ln(\sigma_1^2)+\ln(|\mathbf R|)] \tag{26}$$
The optimal values of $\boldsymbol \theta^{(11)}$, $\boldsymbol \theta^{(12)}$, and $\boldsymbol \theta^{(22)}$ cannot be obtained analytically; a numerical optimization algorithm, such as a gradient-based method, is required to maximize Eq. (26).
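A crude sketch of this training step, again under the simplifying assumptions of a single shared correlation parameter $\theta$, $\sigma_1/\sigma_2$ fixed to 1, and a hypothetical benchmark data pair; the concentrated log-likelihood of Eq. (26) is evaluated on a grid here rather than by a gradient optimizer:

```python
import numpy as np

def f_hi(x):  # hypothetical expensive response (Forrester benchmark)
    return (6.0 * x - 2.0) ** 2 * np.sin(12.0 * x - 4.0)

def f_lo(x):  # hypothetical cheap approximation
    return 0.5 * f_hi(x) + 10.0 * (x - 0.5) - 5.0

S1, S2 = np.array([0.1, 0.5, 0.9]), np.array([0.0, 0.3, 0.7, 1.0])
X = np.concatenate([S1, S2])
ys = np.concatenate([f_hi(S1), f_lo(S2)])     # sigma1/sigma2 fixed to 1 here
n1, n2 = len(S1), len(S2); n = n1 + n2
F = np.zeros((n, 2)); F[:n1, 0] = 1.0; F[n1:, 1] = 1.0

def neg_concentrated_loglik(theta):
    """-ln L of eq. (26); beta and sigma1^2 follow from eq. (25) given theta."""
    R = np.exp(-theta * np.subtract.outer(X, X) ** 2)   # shared Gaussian kernel
    beta = np.linalg.solve(F.T @ np.linalg.solve(R, F), F.T @ np.linalg.solve(R, ys))
    resid = ys - F @ beta
    sigma1_sq = resid @ np.linalg.solve(R, resid) / n
    _, logdet = np.linalg.slogdet(R)
    return 0.5 * (n * np.log(sigma1_sq) + logdet)

thetas = np.logspace(0.0, 2.0, 41)            # search theta in [1, 100]
theta_best = min(thetas, key=neg_concentrated_loglik)
print(theta_best)                             # grid minimizer of -ln L
```

In practice the grid search is replaced by a gradient-based or global optimizer over all the $\boldsymbol\theta$ vectors and the fidelity ratio.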
References
[1] Han, Z.-H., Zimmermann, R., Görtz, S. Alternative Cokriging Method for Variable-Fidelity Surrogate Modeling. AIAA Journal, 2012, 50(5): 1205-1210.