Gauss quadrature approximation by Lanczos algorithm
1. Gauss Quadrature
1.1 Gauss quadrature with weight function
Consider the integral
$$I(f)=\int_{a}^{b} \rho(x) f(x)\,\mathrm{d}x. \tag{1}$$
If the quadrature formula [2, p220]
$$\int_{a}^{b}\rho(x)f(x)\,\mathrm{d}x\approx \sum_{k=0}^{n}A_{k}f(x_{k}) \tag{2}$$
has algebraic degree of precision $2n+1$, that is, the two sides of equation (2) are exactly equal whenever $f(x)$ is a polynomial of degree at most $2n+1$, then it is called the Gauss-type quadrature formula with weight $\rho(x)$.
The nodes $x_{k}$ are called Gauss points (nodes), and the $A_{k}$ are called the quadrature coefficients (weights) associated with $\rho(x)$.
1.2 Calculation of the weights and nodes
(Note: This subsection explains how to calculate the Gauss points and coefficients; for the theoretical foundation, see [2, p223].)
- 1) Method of undetermined coefficients: The Gaussian quadrature formula in equation (2) is exact for every polynomial of degree at most $2n+1$; in particular, equality holds in (2) for each element of the monomial sequence $\{1,x,x^{2},\cdots,x^{2n+1}\}$.
Based on the above analysis, we can construct the following $(2n+2)$ equations for the quadrature nodes $\{x_{k}\}_{k=0}^{n}$ and quadrature coefficients $\{A_{k}\}_{k=0}^{n}$:
$$\left\{\begin{array}{l}\sum_{k=0}^{n} A_{k}=\int_{a}^{b} \rho(x) \,\mathrm{d} x \\ \sum_{k=0}^{n} x_{k} A_{k}=\int_{a}^{b} \rho(x)\, x \,\mathrm{d} x \\ \sum_{k=0}^{n} x_{k}^{2} A_{k}=\int_{a}^{b} \rho(x)\, x^{2} \,\mathrm{d} x \\ \vdots \\ \sum_{k=0}^{n} x_{k}^{2n+1} A_{k}=\int_{a}^{b} \rho(x)\, x^{2n+1} \,\mathrm{d} x\end{array}\right.$$
Note: when $n\geq 2$, it is difficult to solve this nonlinear system directly. A minimal numerical illustration for $n=1$ is sketched below.
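The following sketch (ours, not from [2]) solves the four moment equations for $n=1$ on $[-1,1]$ with $\rho(x)=1$ numerically via `scipy.optimize.fsolve`; all variable names are illustrative.

```python
# A minimal sketch, assuming rho(x) = 1 on [-1, 1] and n = 1 (two nodes);
# the equations below are the first four rows of the moment system above.
import numpy as np
from scipy.optimize import fsolve

def moment_equations(u):
    x0, x1, A0, A1 = u
    return [A0 + A1 - 2.0,                  # int_{-1}^{1} 1   dx = 2
            A0*x0 + A1*x1,                  # int_{-1}^{1} x   dx = 0
            A0*x0**2 + A1*x1**2 - 2.0/3.0,  # int_{-1}^{1} x^2 dx = 2/3
            A0*x0**3 + A1*x1**3]            # int_{-1}^{1} x^3 dx = 0

x0, x1, A0, A1 = fsolve(moment_equations, [-0.5, 0.5, 1.0, 1.0])
print(x0, x1, A0, A1)  # nodes -+1/sqrt(3) ~ -+0.57735, weights 1.0, 1.0
```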
- 2) Orthogonal polynomials method: The basic idea rests on the following two theorems.
[2, Theorem 5.5.1, p221] Let $I(f)=\int_{a}^{b}f(x)\,\mathrm{d}x$ with interpolatory quadrature formula $I_{n}(f)=\sum_{k=0}^{n}A_{k}f(x_{k})$, and denote $W_{n+1}(x)=(x-x_{0})(x-x_{1})\cdots(x-x_{n})$. Then $\{x_{k}\}_{k=0}^{n}$ are Gaussian nodes if and only if $W_{n+1}(x)$ is orthogonal to every polynomial $p(x)$ of degree at most $n$, that is,
$$\int_{a}^{b}p(x)W_{n+1}(x)\,\mathrm{d}x=0.$$
[2, Theorem 5.5.3, p223] Let $\{q_{n}(x)\}_{n=0}^{\infty}$ be an orthogonal polynomial sequence on $[a,b]$; then $q_{n}(x)$ has $n$ distinct zeros in $[a,b]$.
The orthogonal polynomial sequence $\{q_{k}(x)\}_{k=0}^{n+1}$ forms a basis of the space of polynomials of degree at most $n+1$, and $q_{n+1}(x)$ is orthogonal to every polynomial of degree at most $n$ on the interval $[a,b]$.
Thus, for a given weight function $\rho(x)$ (possibly $\rho(x)=1$), suppose we can find an orthogonal polynomial sequence $\{q_{0}(x),q_{1}(x),\cdots,q_{n}(x),q_{n+1}(x)\}$ with respect to $\rho(x)$, where $q_{n+1}(x)$ has exact degree $n+1$. If its zeros are $t_{0},\cdots,t_{n}$, then they are the Gauss points $\{x_{k}\}_{k=0}^{n}$, and the corresponding coefficients $\{A_{k}\}_{k=0}^{n}$ are given by
$$A_{k}=\int_{a}^{b}\rho(x)l_{k}(x)\,\mathrm{d}x, \tag{3}$$
where the $l_{k}$ are the Lagrange interpolation basis functions
$$l_{k}(t)=\prod_{j=0,\,j\neq k}^{n}\frac{t-t_{j}}{t_{k}-t_{j}}.$$
Note: using the zeros of orthogonal polynomials to construct the Gauss quadrature formula is effective only for special weight functions whose orthogonal polynomials are known, such as $1$ (Legendre), $\frac{1}{\sqrt{1-x^{2}}}$ (Chebyshev, first kind), $\sqrt{1-x^{2}}$ (Chebyshev, second kind), and $e^{-x^{2}}$ (Hermite), et al. For a general weight function, the undetermined coefficient method above is usually adopted. The Legendre case is sketched below.
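A minimal sketch of this construction for the Legendre case ($\rho(x)=1$ on $[-1,1]$), using NumPy's Legendre utilities; the variable names are ours.

```python
# Nodes = zeros of the degree-(n+1) Legendre polynomial; each weight comes
# from equation (3) by integrating the Lagrange basis polynomial l_k.
import numpy as np
from numpy.polynomial import legendre as L

n = 2                                    # n + 1 = 3 quadrature nodes
nodes = L.legroots([0] * (n + 1) + [1])  # zeros of P_{n+1}

weights = []
for k, tk in enumerate(nodes):
    # Build l_k(x) = prod_{j != k} (x - t_j)/(t_k - t_j) in power basis.
    lk = np.array([1.0])
    for j, tj in enumerate(nodes):
        if j != k:
            lk = np.polymul(lk, [1.0 / (tk - tj), -tj / (tk - tj)])
    antideriv = np.polyint(lk)           # A_k = int_{-1}^{1} l_k(x) dx
    weights.append(np.polyval(antideriv, 1.0) - np.polyval(antideriv, -1.0))

print(nodes)    # expect -sqrt(3/5), 0, sqrt(3/5)
print(weights)  # expect 5/9, 8/9, 5/9
```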
2. Lanczos algorithm
2.1 Krylov subspace
- Krylov subspace: Let $A$ be a real (possibly unsymmetric) matrix of order $n$, let $v$ be a given vector, and let
$$K_{k}=\left(v, \; A v, \; \cdots, \; A^{k-1} v\right)$$
be the Krylov matrix of dimension $n\times k$. The subspace spanned by the columns of $K_{k}$ is called a Krylov subspace, denoted $K_{k}(A,v)$ or $K(A,v)$.
Note: the natural basis of the Krylov subspace $K(A,v)$ given by the columns of the Krylov matrix $K_{k}$ is badly conditioned when $k$ is large.
2.2 Arnoldi algorithm
The Arnoldi algorithm constructs an orthonormal basis of the Krylov subspace $K(A,v)$ by applying a variant of the Gram-Schmidt orthogonalization process.
- Arnoldi algorithm: Set $v^{(j+1)}=Av^{(j)}$ with $v^{(1)}=v$; then $K(A,v)$ is spanned by the vectors $\{v^{(j)}\}_{j=1}^{k}$. To construct orthonormal basis vectors $v^{j}$, instead of orthogonalizing $A^{j}v$ against the previous vectors, one orthogonalizes $Av^{j}$.
Starting from $v^{1}=v$ (normalized, that is, $v^{1}=v/\|v\|$), the $(j+1)$st vector of the basis is computed using the previous vectors:
- Projection: $h_{i,j}=(Av^{j},v^{i}),\; i=1,\cdots,j$,
- Orthogonalization: $\tilde{v}^{j+1}=Av^{j}-\sum_{i=1}^{j}h_{i,j}v^{i}$,
- Length: $h_{j+1,j}=\|\tilde{v}^{j+1}\|$ (if $h_{j+1,j}=0$, stop),
- Normalization: $v^{j+1}=\frac{\tilde{v}^{j+1}}{h_{j+1,j}}$.
Since $h_{j+1,j}v^{j+1}=Av^{j}-\sum_{i=1}^{j}h_{i,j}v^{i}$, we have
$$Av^{j}=\sum_{i=1}^{j}h_{i,j}v^{i}+h_{j+1,j}v^{j+1}.$$
If we collect the vectors $v^{j},\, j=1,\cdots,k$, in a matrix $V_{k}=[v^{1},v^{2},\cdots,v^{k}]$ (so that $V_{k}$ has orthonormal columns), the relations defining the vectors $v^{j+1}$ can be written in matrix form as
$$A V_{k}=V_{k} H_{k}+h_{k+1, k}\, v^{k+1}\left(e^{k}\right)^{T}, \tag{4}$$
where $H_{k}$ is an upper Hessenberg matrix with entries $h_{i,j}$ satisfying $h_{i,j}=0$ for $j=1,\cdots,i-2,\; i>2$:
$$H_{k}=\left[ \begin{array}{ccccc} h_{1,1} & h_{1,2} & h_{1,3} & \cdots & h_{1,k}\\ h_{2,1} & h_{2,2} & h_{2,3} & \cdots & h_{2,k}\\ 0 & h_{3,2} & h_{3,3} & \cdots & h_{3,k}\\ 0 & 0 & h_{4,3} & \cdots & h_{4,k}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & h_{k,k-1} & h_{k,k} \end{array} \right]$$
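Below is a minimal NumPy sketch of the Arnoldi iteration just described, checking relation (4) on a random matrix; the function name and test setup are our own illustration.

```python
# A minimal Arnoldi sketch: builds an orthonormal basis V of K(A, v) and
# the Hessenberg matrix H, then verifies relation (4).
import numpy as np

def arnoldi(A, v, k):
    n = A.shape[0]
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ V[:, j]
        for i in range(j + 1):            # projection ...
            H[i, j] = w @ V[:, i]
            w = w - H[i, j] * V[:, i]     # ... and orthogonalization
        H[j + 1, j] = np.linalg.norm(w)   # length
        if H[j + 1, j] == 0.0:            # exact breakdown: invariant subspace
            raise RuntimeError("invariant subspace reached")
        V[:, j + 1] = w / H[j + 1, j]     # normalization
    return V, H

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))           # unsymmetric test matrix
v = rng.standard_normal(8)
k = 5
V, H = arnoldi(A, v, k)
# Relation (4): A V_k = V_k H_k + h_{k+1,k} v^{k+1} (e^k)^T.
rhs = V[:, :k] @ H[:k, :] + H[k, k - 1] * np.outer(V[:, k], np.eye(k)[:, k - 1])
print(np.allclose(A @ V[:, :k], rhs))     # expected: True
```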
2.3 Lanczos algorithm
Multiplying equation (4) on the left by $V_{k}^{T}$ and using orthogonality, we have
$$H_{k}=V_{k}^{T}A V_{k}.$$
If the matrix $A$ is symmetric, then $H_{k}=V_{k}^{T}AV_{k}$ is symmetric too, so the Hessenberg matrix is in fact tridiagonal; it is then denoted by $J_{k}$, that is, $J_{i,j}=0$ for $j=i+2,\cdots,k$ (equivalently, $h_{i,j}=0$ for $i=1,\cdots,j-2$ and $i=j+2,\cdots,k$). This implies that the new vector $v^{j+1}$ can be computed using only the two previous vectors $v^{j}$ and $v^{j-1}$:
$$h_{j+1,j}v^{j+1}=Av^{j}-\sum_{i=1}^{j}h_{i,j}v^{i}=Av^{j}-h_{j-1,j}v^{j-1}-h_{j,j}v^{j}.$$
In matrix form, letting $\eta_{i}$ denote the nonzero off-diagonal entries of $J_{k}$,
$$A V_{k}=V_{k} J_{k}+\eta_{k}\, v^{k+1}\left(e^{k}\right)^{T}. \tag{5}$$
Equation (5) describes in matrix form the elegant Lanczos algorithm.
To simplify notation, start from a nonzero vector $v^{1}=v/\|v\|$, $\alpha_{1}=(Av^{1},v^{1})$, $\tilde{v}^{2}=Av^{1}-\alpha_{1}v^{1}$, and then for $k=2,3,\cdots,$
$$\begin{array}{c} \eta_{k-1}=\|\tilde{v}^{k}\|,\\ v^{k}=\frac{\tilde{v}^{k}}{\eta_{k-1}},\\ \alpha_{k}=\left(v^{k}, A v^{k}\right)=\left(v^{k}\right)^{T} A v^{k},\\ \tilde{v}^{k+1}=A v^{k}-\alpha_{k} v^{k}-\eta_{k-1} v^{k-1}. \end{array}$$
$$J_{k}=\left[ \begin{array}{ccccc} \alpha_{1} & \eta_{1} & 0 & \cdots & 0\\ \eta_{1} & \alpha_{2} & \eta_{2} & \cdots & 0\\ 0 & \eta_{2} & \alpha_{3} & \cdots & 0\\ 0 & 0 & \eta_{3} & \cdots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \eta_{k-1} & \alpha_{k} \end{array} \right]$$
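A minimal NumPy sketch of the Lanczos recurrence above, returning the basis $V_{k}$ and the Jacobi matrix $J_{k}$; the function name and test setup are our own illustration.

```python
# A minimal Lanczos sketch for symmetric A, following the recurrence above.
import numpy as np

def lanczos(A, v, k):
    n = A.shape[0]
    V = np.zeros((n, k))
    alpha = np.zeros(k)
    eta = np.zeros(k - 1)
    V[:, 0] = v / np.linalg.norm(v)
    w = A @ V[:, 0]
    alpha[0] = w @ V[:, 0]
    vt = w - alpha[0] * V[:, 0]           # v~^2 = A v^1 - alpha_1 v^1
    for j in range(1, k):
        eta[j - 1] = np.linalg.norm(vt)   # eta_{k-1} = ||v~^k||
        V[:, j] = vt / eta[j - 1]         # normalize
        w = A @ V[:, j]
        alpha[j] = V[:, j] @ w            # alpha_k = (v^k)^T A v^k
        vt = w - alpha[j] * V[:, j] - eta[j - 1] * V[:, j - 1]
    J = np.diag(alpha) + np.diag(eta, 1) + np.diag(eta, -1)
    return V, J

rng = np.random.default_rng(1)
M = rng.standard_normal((10, 10))
A = (M + M.T) / 2                         # symmetric test matrix
v = rng.standard_normal(10)
V, J = lanczos(A, v, 5)
print(np.allclose(V.T @ A @ V, J))        # J_k = V_k^T A V_k; expected: True
```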
3. Two important theorems
3.1 Theorem 1
Theorem 1 [1, Th. 4.1]: Let $\chi_{k}(\lambda)$ be the determinant of $J_{k}-\lambda I$ (which is a monic polynomial); then
$$v^{k}=p_{k}(A) v^{1}, \quad p_{k}(\lambda)=(-1)^{k-1} \frac{\chi_{k-1}(\lambda)}{\eta_{1} \cdots \eta_{k-1}},\quad k>1, \quad p_{1} \equiv 1.$$
The polynomials $p_{k}$ of degree $k-1$ are called the normalized Lanczos polynomials.
This theorem describes the most important property of the Lanczos algorithm: each Lanczos vector $v^{k}$ is given as a polynomial in the matrix $A$ applied to the initial vector $v^{1}$.
Expanding along the last row of $J_{k+1}$, it is easy to verify that
$$\det(J_{k+1}) = \alpha_{k+1}\det(J_{k})-\eta_{k}^{2}\det(J_{k-1}), \tag{6}$$
and
$$\det(J_{k+1}-\lambda I) = (\alpha_{k+1}-\lambda)\det(J_{k}-\lambda I)-\eta_{k}^{2}\det(J_{k-1}-\lambda I). \tag{7}$$
From the expression for $p_{k}(\lambda)$ in Theorem 1 together with (7), the Lanczos polynomials satisfy a scalar three-term recurrence,
$$\eta_{k} p_{k+1}(\lambda)=\left(\lambda-\alpha_{k}\right) p_{k}(\lambda)-\eta_{k-1} p_{k-1}(\lambda), \quad k=1,2, \ldots, \tag{8}$$
with initial conditions $p_{0} \equiv 0$, $p_{1} \equiv 1$. A numerical check of Theorem 1 via this recurrence is sketched below.
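A hedged check of Theorem 1, reusing the `lanczos()` sketch above: we evaluate $p_{k}(A)v^{1}$ through recurrence (8) with matrix argument and compare it with the $k$-th Lanczos vector.

```python
# Evaluate p_k(A) v^1 via recurrence (8) and compare with the k-th Lanczos
# vector; reuses lanczos() from the sketch above.
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((10, 10))
A = (M + M.T) / 2
v = rng.standard_normal(10)
k = 5
V, J = lanczos(A, v, k)
alpha, eta = np.diag(J), np.diag(J, 1)    # alpha[m-1] = alpha_m, eta[m-1] = eta_m

p_prev = np.zeros(10)                     # p_0(A) v^1 = 0
p_curr = v / np.linalg.norm(v)            # p_1(A) v^1 = v^1
for m in range(1, k):                     # recurrence (8) with lambda -> A
    p_next = (A @ p_curr - alpha[m - 1] * p_curr) / eta[m - 1]
    if m > 1:
        p_next -= eta[m - 2] / eta[m - 1] * p_prev
    p_prev, p_curr = p_curr, p_next
print(np.allclose(p_curr, V[:, k - 1]))   # p_k(A) v^1 = v^k; expected: True
```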
3.2 Theorem 2
Theorem 2 [1, Th. 4.2]: Consider the Lanczos vectors $v^{k}$. There exists a measure $\alpha(\lambda)$ such that
$$\left(v^{k}, v^{l}\right)=\left\langle p_{k}, p_{l}\right\rangle=\int_{a}^{b} p_{k}(\lambda) p_{l}(\lambda) \,d \alpha(\lambda), \tag{9}$$
where $a \leq \lambda_{1}=\lambda_{min}$ and $b\geq \lambda_{n}=\lambda_{max}$, $\lambda_{min}$ and $\lambda_{max}$ being the smallest and largest eigenvalues of $A$, and the $p_{i}$ are the Lanczos polynomials associated with $A$ and $v^{1}$. The measure is
$$\alpha(\lambda)=\left\{\begin{array}{ll}0, & \text{if } \lambda<\lambda_{1}, \\ \sum_{j=1}^{i}\left[\hat{v}_{j}\right]^{2}, & \text{if } \lambda_{i} \leq \lambda<\lambda_{i+1}, \\ \sum_{j=1}^{n}\left[\hat{v}_{j}\right]^{2}, & \text{if } \lambda_{n} \leq \lambda,\end{array}\right. \tag{10}$$
where $\hat{v}=Q^{T}v^{1}$, $v^{1}=v/\|v\|$, and $A=Q\Lambda Q^{T}$ is the spectral decomposition of $A$.
That is, the normalized Lanczos polynomial sequence $\{p_{k}(\lambda)\}$ is orthonormal with respect to the measure $\alpha(\lambda)$.
Note: for the sake of simplicity, we suppose here that the eigenvalues of $A$ are distinct. A numerical check of the orthonormality (9) is sketched below.
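A hedged check of Theorem 2, again reusing `lanczos()`: evaluate the Lanczos polynomials at the eigenvalues of $A$ with recurrence (8) and verify $\langle p_{k},p_{l}\rangle=\delta_{kl}$ under the measure (10), whose jump at $\lambda_{j}$ is $[\hat{v}_{j}]^{2}$.

```python
# Verify <p_k, p_l> = delta_{kl} with respect to the measure alpha(lambda).
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((10, 10))
A = (M + M.T) / 2
v = rng.standard_normal(10)
k = 5
V, J = lanczos(A, v, k)
a_diag, eta = np.diag(J), np.diag(J, 1)

lam, Q = np.linalg.eigh(A)                # spectral decomposition A = Q Lam Q^T
v_hat = Q.T @ (v / np.linalg.norm(v))
jumps = v_hat**2                          # jumps of the measure alpha(lambda)

P = np.zeros((k, lam.size))               # P[m - 1, j] = p_m(lambda_j)
P[0] = 1.0                                # p_1 = 1
for m in range(1, k):                     # recurrence (8)
    P[m] = (lam - a_diag[m - 1]) * P[m - 1] / eta[m - 1]
    if m > 1:
        P[m] -= eta[m - 2] / eta[m - 1] * P[m - 2]
gram = (P * jumps) @ P.T                  # gram entries are <p_k, p_l>
print(np.allclose(gram, np.eye(k)))       # orthonormal; expected: True
```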
4. Gauss quadrature by Lanczos algorithm
For a symmetric matrix $A$ with eigenvalue decomposition $A=Q\Lambda Q^{T}$, a smooth function $f$, and a random unit vector $v$, we have
$$v^{T}f(A)v=v^{T}Qf(\Lambda)Q^{T}v.$$
Setting $\hat{v}=Q^{T}v$, this becomes
$$v^{T}f(A)v=\hat{v}^{T}f(\Lambda)\hat{v}=\sum_{j=1}^{m}f(\lambda_{j})[\hat{v}_{j}]^{2},$$
where $m$ is the order of $A$.
Consider the above sum as a Riemann-Stieltjes integral,
$$\sum_{j=1}^{m}f(\lambda_{j})[\hat{v}_{j}]^{2}=\int_{a}^{b}f(t)\,d\alpha(t), \tag{11}$$
where the measure $\alpha(t)$ is defined as
$$\alpha(t)=\left\{\begin{array}{ll}0, & \text{if } t<\lambda_{1}=a, \\ \sum_{j=1}^{i}\left[\hat{v}_{j}\right]^{2}, & \text{if } \lambda_{i} \leq t<\lambda_{i+1}, \\ \sum_{j=1}^{m}\left[\hat{v}_{j}\right]^{2}, & \text{if } b=\lambda_{m} \leq t.\end{array}\right.$$
The integral in equation (11) can be estimated using Gauss quadrature,
$$\int_{a}^{b}f(t)\,d\alpha(t)\approx \sum_{k=1}^{m}A_{k}f(x_{k}). \tag{12}$$
As stated in Theorem 2, $\{p_{k}\}_{k=1}^{m+1}$ is an orthonormal polynomial sequence, and its orthogonality measure is exactly this $\alpha(t)$. Thus, the Gauss quadrature nodes in (12) can be obtained by calculating the zeros of the polynomial $p_{m+1}(x)$, which has degree $m$.
We can rewrite the three-term recurrence in equation (8) in matrix form. Precisely, let $\mathbf{p}(\lambda)=[p_{1}(\lambda),p_{2}(\lambda),\cdots,p_{k}(\lambda)]^{T}$ and $p_{0}(\lambda)=0$; then
$$\lambda\, \mathbf{p}(\lambda)=J_{k}\,\mathbf{p}(\lambda)+\eta_{k}\,p_{k+1}(\lambda)\,e^{k}. \tag{13}$$
If $\lambda$ is a zero of $p_{k+1}$, equation (13) reduces to $J_{k}\,\mathbf{p}(\lambda)=\lambda\,\mathbf{p}(\lambda)$, so the zeros of $p_{k+1}(\lambda)$ are exactly the eigenvalues of $J_{k}$.
Thus, the Gauss quadrature nodes in (12) can be derived by calculating the eigenvalues of the tridiagonal Jacobi matrix $J_{m}$, where $J_{m}$ is produced by $m$ steps of the Lanczos algorithm applied to $A$ and $v$.
Note: $p_{k+1}(\lambda)$ is a polynomial of degree $k$.
In conclusion, the Gauss points for equation (12) are obtained as the eigenvalues of the matrix $J_{k}$ produced by the Lanczos algorithm applied to $A$ and $v$; an end-to-end sketch follows.
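A hedged end-to-end sketch (ours), reusing `lanczos()` from above. The nodes are the eigenvalues $\theta_{i}$ of $J_{k}$; for the weights we use the classical Golub-Welsch fact (see [1]) that $A_{i}$ equals the squared first component of the normalized $i$-th eigenvector of $J_{k}$, since here $\|v\|=1$.

```python
# Estimate v^T f(A) v by the Lanczos-based Gauss quadrature rule (12).
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((50, 50))
A = (M + M.T) / (2 * np.sqrt(50))         # symmetric, eigenvalues O(1)
v = rng.standard_normal(50)
v /= np.linalg.norm(v)

f, k = np.exp, 12
V, J = lanczos(A, v, k)
theta, U = np.linalg.eigh(J)              # quadrature nodes
weights = U[0, :]**2                      # quadrature weights (Golub-Welsch)

estimate = weights @ f(theta)             # Gauss estimate of (12)
lam, Q = np.linalg.eigh(A)
exact = (Q.T @ v)**2 @ f(lam)             # v^T f(A) v, computed directly
print(estimate, exact)                    # the two should agree closely
```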
References:
[1] Golub, G. H. and Meurant, G. Matrices, Moments and Quadrature with Applications, Princeton University Press, 2010.
[2] 孙志忠, 袁慰平, 闻震初. 数值分析 (第3版). 东南大学出版社. (Sun Zhizhong, Yuan Weiping, Wen Zhenchu. Numerical Analysis, 3rd ed. Southeast University Press.)