Local extreme learning machines and domain decomposition for solving linear and nonlinear partial differential equations
- Suchuan Dong, Zongwei Li
- Center for Computational and Applied Mathematics, Department of Mathematics, Purdue University, West Lafayette, IN, USA
- Comput. Methods Appl. Mech. Engrg. 2021
Abstract
- We present a neural network-based method for solving linear and nonlinear partial differential equations, by combining the ideas of extreme learning machines (ELM), domain decomposition and local neural networks.
- The field solution on each sub-domain is represented by a local feed-forward neural network, and $C^k$ continuity conditions are imposed on the sub-domain boundaries. Each local neural network consists of a small number of hidden layers, while its last hidden layer can be wide.
Motivation
- Several successful DNN-based PDE solvers have emerged in the past years, such as the deep Galerkin method (DGM), the physics-informed neural network (PINN), and related approaches. Neural network-based PDE solutions are smooth analytical functions, provided that smooth activation functions are used therein. The solution and its derivatives can then be computed exactly, by evaluation of the neural network or by auto-differentiation.
- While their computational performance is promising, DNN-based PDE solvers, in their current state, suffer from a number of limitations that make them numerically less than satisfactory and computationally uncompetitive.
- The first limitation is the solution accuracy of DNN-based methods. A survey of the related literature indicates that the absolute error of current DNN-based methods is generally on, and rarely goes below, the level of $10^{-3}$–$10^{-4}$.
- Another limitation concerns the computational cost: the computational cost of DNN-based PDE solvers is extremely high.
- In the current work we concentrate on the accuracy and the computational cost of neural network-based numerical methods.
Contribution
- Network architecture and training parameters. The current method is based on shallow feed-forward neural networks.
- Training method. The network is trained and the values for the training parameters are determined by a least squares computation, not by the back propagation (gradient descent-type) algorithm. For linear PDEs, training the neural network involves a linear least squares computation. For nonlinear PDEs, the network training involves a nonlinear least squares computation.
- Domain decomposition and local neural networks. We partition the overall domain into sub-domains, and represent the solution on each sub-domain locally by a shallow feed-forward neural network.
- Block time marching. For long-time simulations of time-dependent PDEs, the current method adopts a block time-marching strategy: the temporal dimension is divided into a number of windows, referred to as time blocks, and the PDE is solved on the spatio-temporal domain of each time block individually and successively (a sketch follows this list).
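A minimal sketch of the block time-marching loop, assuming a hypothetical per-block solver `solve_block` (not from the paper's code) that trains the local networks on one space-time block and returns the field at the block's final time:

```python
import numpy as np

def block_time_march(solve_block, u0, t_final, n_blocks):
    """Solve a time-dependent PDE block by block.

    solve_block(u_init, t0, t1) -> (solution object, field at t1);
    it trains the local networks on the space-time block [t0, t1].
    """
    t_edges = np.linspace(0.0, t_final, n_blocks + 1)
    u_init, solutions = u0, []
    for t0, t1 in zip(t_edges[:-1], t_edges[1:]):
        # The final state of one block supplies the initial condition of the next.
        sol, u_init = solve_block(u_init, t0, t1)
        solutions.append(sol)
    return solutions
```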
Main contribution:
(1) A main contribution of this work is the introduction of an ELM-like method for nonlinear differential equations, based on domain decomposition and local neural networks. In contrast, existing ELM-based methods for differential equations have been confined to linear problems, and the neural network therein is limited to a single hidden layer. For nonlinear problems, we have adopted two methods to solve the nonlinear system for the training parameters:
- a nonlinear least squares method with perturbations (referred to as NLSQ-perturb);
- a combined Newton/linear least squares method (referred to as Newton-LLSQ).
We find that the random perturbation in the NLSQ-perturb method is crucial to preventing the method from being trapped in local minima with cost values exceeding a given tolerance, especially in under-resolved cases and in long-time simulations.
(2) Another contribution of the current work is the aforementioned block time-marching scheme for long-time simulations of time-dependent linear/nonlinear PDEs.
Method
Local extreme learning machines (locELM) for representing functions
ELM
$$u_{i}^{s}(\mathbf{x})=\sum_{j=1}^{M} V_{j}^{s}(\mathbf{x})\, w_{ji}^{s}, \quad \mathbf{x} \in \Omega_{s}, \quad 1 \leqslant i \leqslant N$$
$$f_{s}(\mathbf{x})=\left(u_{1}^{s}, u_{2}^{s}, \ldots, u_{N}^{s}\right)$$
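Read concretely: the $V_j^s$ are the outputs of the last hidden layer, whose weights are assigned random values and frozen, and only the output weights $w_{ji}^s$ are trained. A minimal numpy sketch of this random-feature representation on one sub-domain (layer width, weight range, and the tanh activation are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
M = 100  # width of the last hidden layer

# Hidden-layer coefficients: set to random values and then fixed (not trained).
W = rng.uniform(-1.0, 1.0, size=(2, M))   # input dimension 2 for (x, y)
b = rng.uniform(-1.0, 1.0, size=M)

def V(points):
    """Outputs V_j^s of the last hidden layer at `points`, shape (n, M)."""
    return np.tanh(points @ W + b)

# u^s(x) = sum_j V_j^s(x) w_j^s is linear in the trainable output weights w,
# which is why "training" reduces to a least squares solve for w.
w = np.zeros(M)
u = lambda points: V(points) @ w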
Remark: Apart from the above logical operations, in the implementation we incorporate an additional normalization layer immediately behind the input layer in each of the local neural networks. For each sub-domain $\Omega_{e_{mn}}$, the normalization layer performs an affine mapping and normalizes the input data
$(x, y) \in \Omega_{e_{mn}} = [X_{m}, X_{m+1}] \times [Y_{n}, Y_{n+1}]$, such that the output data of the normalization layer fall into the domain $[-1, 1] \times [-1, 1]$. This extra normalization layer contains no adjustable (training) parameters.
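A minimal sketch of this normalization layer for sub-domain $\Omega_{e_{mn}}$ (a pure affine map, no trainable parameters):

```python
import numpy as np

def normalize(points, Xm, Xm1, Yn, Yn1):
    """Affine map from [Xm, Xm1] x [Yn, Yn1] onto [-1, 1] x [-1, 1]."""
    x, y = points[:, 0], points[:, 1]
    xn = 2.0 * (x - Xm) / (Xm1 - Xm) - 1.0
    yn = 2.0 * (y - Yn) / (Yn1 - Yn) - 1.0
    return np.column_stack([xn, yn])
```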
Time-independent linear PDE
$$\begin{aligned} & Lu = f(x, y), \\ & u(x, y) = g(x, y) \quad \text{on } \partial\Omega, \end{aligned}$$
$$u^{e_{mn}}(x, y)=\sum_{j=1}^{M} V_{j}^{e_{mn}}(x, y)\, w_{j}^{e_{mn}}, \quad (x, y) \in \Omega_{e_{mn}}, \quad 0 \leqslant m \leqslant N_{x}-1, \quad 0 \leqslant n \leqslant N_{y}-1,$$
$$\sum_{j=1}^{M}\left[L V_{j}^{e_{mn}}\left(x_{p}^{e_{mn}}, y_{q}^{e_{mn}}\right)\right] w_{j}^{e_{mn}}=f\left(x_{p}^{e_{mn}}, y_{q}^{e_{mn}}\right), \quad \text{for } 0 \leqslant m \leqslant N_{x}-1,\ 0 \leqslant n \leqslant N_{y}-1,\ 0 \leqslant p \leqslant Q_{x}-1,\ 0 \leqslant q \leqslant Q_{y}-1,$$
We enforce the boundary condition $u = g$ on the four boundaries of the domain $\Omega$:
$$\begin{aligned} & \sum_{j=1}^{M} V_{j}^{e_{0n}}\left(a_{1}, y_{q}^{e_{0n}}\right) w_{j}^{e_{0n}}=g\left(a_{1}, y_{q}^{e_{0n}}\right), \\ & \sum_{j=1}^{M} V_{j}^{e_{mn}}\left(b_{1}, y_{q}^{e_{mn}}\right) w_{j}^{e_{mn}}=g\left(b_{1}, y_{q}^{e_{mn}}\right), \\ & \sum_{j=1}^{M} V_{j}^{e_{m0}}\left(x_{p}^{e_{m0}}, a_{2}\right) w_{j}^{e_{m0}}=g\left(x_{p}^{e_{m0}}, a_{2}\right), \\ & \sum_{j=1}^{M} V_{j}^{e_{mn}}\left(x_{p}^{e_{mn}}, b_{2}\right) w_{j}^{e_{mn}}=g\left(x_{p}^{e_{mn}}, b_{2}\right). \end{aligned}$$
$C^1$ continuity conditions across the sub-domain boundaries:
$$\begin{aligned} & \sum_{j=1}^{M} V_{j}^{e_{mn}}\left(X_{m+1}, y_{q}^{e_{mn}}\right) w_{j}^{e_{mn}}-\sum_{j=1}^{M} V_{j}^{e_{m+1, n}}\left(X_{m+1}, y_{q}^{e_{m+1, n}}\right) w_{j}^{e_{m+1, n}}=0, \\ & \left.\sum_{j=1}^{M} \frac{\partial V_{j}^{e_{mn}}}{\partial x}\right|_{\left(X_{m+1},\, y_{q}^{e_{mn}}\right)} w_{j}^{e_{mn}}-\left.\sum_{j=1}^{M} \frac{\partial V_{j}^{e_{m+1, n}}}{\partial x}\right|_{\left(X_{m+1},\, y_{q}^{e_{m+1, n}}\right)} w_{j}^{e_{m+1, n}}=0, \\ & \text{for } 0 \leqslant m \leqslant N_{x}-2,\ 0 \leqslant n \leqslant N_{y}-1,\ 0 \leqslant q \leqslant Q_{y}-1, \end{aligned}$$
$$\begin{aligned} & \sum_{j=1}^{M} V_{j}^{e_{mn}}\left(x_{p}^{e_{mn}}, Y_{n+1}\right) w_{j}^{e_{mn}}-\sum_{j=1}^{M} V_{j}^{e_{m, n+1}}\left(x_{p}^{e_{m, n+1}}, Y_{n+1}\right) w_{j}^{e_{m, n+1}}=0, \\ & \left.\sum_{j=1}^{M} \frac{\partial V_{j}^{e_{mn}}}{\partial y}\right|_{\left(x_{p}^{e_{mn}},\, Y_{n+1}\right)} w_{j}^{e_{mn}}-\left.\sum_{j=1}^{M} \frac{\partial V_{j}^{e_{m, n+1}}}{\partial y}\right|_{\left(x_{p}^{e_{m, n+1}},\, Y_{n+1}\right)} w_{j}^{e_{m, n+1}}=0, \\ & \text{for } 0 \leqslant m \leqslant N_{x}-1,\ 0 \leqslant n \leqslant N_{y}-2,\ 0 \leqslant p \leqslant Q_{x}-1, \end{aligned}$$
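All of the above (PDE residual rows, boundary-condition rows, and $C^1$ continuity rows) are linear in the output weights, so they stack into one global linear least squares problem. Below is a compact 1D analogue of this assembly, not the paper's code: two sub-domains of $[0,1]$, a Helmholtz-type operator $Lu = u'' - \lambda u$ with a manufactured solution, tanh random features, and a single `numpy.linalg.lstsq` solve; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
lam, M, Q = 10.0, 60, 100                # lambda, features per sub-domain, collocation pts
subs = [(0.0, 0.5), (0.5, 1.0)]          # two sub-domains of [0, 1]
u_ex = lambda x: np.sin(3 * np.pi * x)   # manufactured solution
f = lambda x: -(9 * np.pi**2 + lam) * np.sin(3 * np.pi * x)  # f = u'' - lam*u

# Random, frozen hidden-layer parameters (one set per sub-domain)
params = [(rng.uniform(-3, 3, M), rng.uniform(-3, 3, M)) for _ in subs]

def feats(x, s):
    """Random features V, V', V'' on sub-domain s, evaluated at points x."""
    (xl, xr), (a, b) = subs[s], params[s]
    c = 2.0 / (xr - xl)                  # slope of the affine normalization map
    V = np.tanh(np.outer(c * (x - xl) - 1.0, a) + b)
    Vx = (1 - V**2) * (a * c)
    Vxx = -2 * V * (1 - V**2) * (a * c) ** 2
    return V, Vx, Vxx

rows, rhs = [], []
def block(s, mat):                       # place a sub-domain's rows in the global matrix
    out = np.zeros((mat.shape[0], 2 * M)); out[:, s*M:(s+1)*M] = mat; return out

for s, (xl, xr) in enumerate(subs):      # PDE residual rows: (V'' - lam*V) w = f
    x = np.linspace(xl, xr, Q)
    V, _, Vxx = feats(x, s)
    rows.append(block(s, Vxx - lam * V)); rhs.append(f(x))

for s, xb in [(0, 0.0), (1, 1.0)]:       # Dirichlet boundary rows: V(xb) w = u_ex(xb)
    rows.append(block(s, feats(np.array([xb]), s)[0])); rhs.append(u_ex(np.array([xb])))

xi = np.array([0.5])                     # interface: C^0 and C^1 continuity rows
F0, F1 = feats(xi, 0), feats(xi, 1)
for k in (0, 1):                         # k=0: values match, k=1: first derivatives match
    rows.append(np.hstack([F0[k], -F1[k]])); rhs.append(np.zeros(1))

A, bvec = np.vstack(rows), np.concatenate(rhs)
w = np.linalg.lstsq(A, bvec, rcond=None)[0]   # global linear least squares solve

xt = np.linspace(0, 0.5, 101)
print("max error on sub-domain 1:", np.abs(feats(xt, 0)[0] @ w[:M] - u_ex(xt)).max())
```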
Time-dependent linear differential equations
The treatment is analogous to the time-independent case, with the domain additionally partitioned in time; continuity of the local solutions is enforced across the time sub-domain boundaries:
$$\begin{aligned} & \sum_{j=1}^{M} V_{j}^{e_{mnl}}\left(x_{p}^{e_{mnl}}, y_{q}^{e_{mnl}}, T_{l+1}\right) w_{j}^{e_{mnl}}-\sum_{j=1}^{M} V_{j}^{e_{mn, l+1}}\left(x_{p}^{e_{mn, l+1}}, y_{q}^{e_{mn, l+1}}, T_{l+1}\right) w_{j}^{e_{mn, l+1}}=0, \\ & \quad 0 \leqslant m \leqslant N_{x}-1, \quad 0 \leqslant n \leqslant N_{y}-1, \quad 0 \leqslant l \leqslant N_{t}-2, \quad 0 \leqslant p \leqslant Q_{x}-1, \quad 0 \leqslant q \leqslant Q_{y}-1. \end{aligned}$$
Nonlinear differential equations
$$\begin{aligned} & Lu+F\left(u, u_{x}, u_{y}\right)=f(x, y), \qquad (22\text{a}) \\ & u(x, y)=g(x, y) \quad \text{on } \partial\Omega, \qquad (22\text{b}) \end{aligned}$$
$$u^{e_{mn}}(x, y)=\sum_{j=1}^{M} V_{j}^{e_{mn}}(x, y)\, w_{j}^{e_{mn}}, \quad \frac{\partial u^{e_{mn}}}{\partial x}=\sum_{j=1}^{M} \frac{\partial V_{j}^{e_{mn}}}{\partial x} w_{j}^{e_{mn}}, \quad \frac{\partial u^{e_{mn}}}{\partial y}=\sum_{j=1}^{M} \frac{\partial V_{j}^{e_{mn}}}{\partial y} w_{j}^{e_{mn}}, \quad \text{for } 0 \leqslant m \leqslant N_{x}-1,\ 0 \leqslant n \leqslant N_{y}-1,$$
$$\sum_{j=1}^{M}\left[L V_{j}^{e_{mn}}\left(x_{p}^{e_{mn}}, y_{q}^{e_{mn}}\right)\right] w_{j}^{e_{mn}}+\left.F\left(u^{e_{mn}}, u_{x}^{e_{mn}}, u_{y}^{e_{mn}}\right)\right|_{\left(x_{p}^{e_{mn}},\, y_{q}^{e_{mn}}\right)}-f\left(x_{p}^{e_{mn}}, y_{q}^{e_{mn}}\right)=0, \quad \text{for } 0 \leqslant m \leqslant N_{x}-1,\ 0 \leqslant n \leqslant N_{y}-1,\ 0 \leqslant p \leqslant Q_{x}-1,\ 0 \leqslant q \leqslant Q_{y}-1,$$
- NLSQ-perturb: the nonlinear algebraic system for the training parameters is solved directly by the nonlinear least squares method with random perturbations.
- Newton-LLSQ: we first linearize Eq. (22a) to arrive at a linear differential equation about the increment field, which is then solved by the linear least squares procedure within each Newton iteration.
We observe that the convergence behavior of the Newton-LLSQ method is not as regular as that of the NLSQ-perturb method, but it appears less likely to be trapped in local-minimum solutions.
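A minimal sketch of the NLSQ-perturb idea, assuming a user-supplied `residual(w)` that stacks the nonlinear collocation, boundary, and continuity residuals over all sub-domains; the helper name, perturbation magnitude, and restart logic are illustrative, not the paper's exact algorithm:

```python
import numpy as np
from scipy.optimize import least_squares

def nlsq_perturb(residual, n_params, tol=1e-12, max_restarts=10, delta=0.5, seed=0):
    """Nonlinear least squares with random perturbations of the initial guess.

    If the converged cost still exceeds `tol` (i.e. a poor local minimum),
    restart from a randomly perturbed point, keeping the best result found.
    """
    rng = np.random.default_rng(seed)
    best = least_squares(residual, np.zeros(n_params))
    for _ in range(max_restarts):
        if best.cost <= tol:              # converged to an acceptable minimum
            break
        w0 = best.x + delta * rng.uniform(-1.0, 1.0, n_params)  # perturb and retry
        trial = least_squares(residual, w0)
        if trial.cost < best.cost:
            best = trial
    return best.x, best.cost
```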
Thoughts
- Irregular domains