2. Nonlinear Equation f(x) = 0
2.1 Bisection Method
2.1.1 Root-Finding Problem
Root-Finding or Zero-Finding Problem
Given a function $f(x)$ in one variable $x$, find a root $x$ of an equation of the form $f(x)=0$.
The solution $x$ is called a root of the equation $f(x)=0$, or a zero of the function $f(x)$.
2.1.2 Bisection Algorithm
Prejudgment
By the Intermediate Value Theorem, if
$$f\in C[a,b] \quad \text{and} \quad f(a)f(b)<0,$$
then there exists at least one point $x^*\in (a,b)$ such that $f(x^*)=0$.
Algorithm Description

- INPUT: endpoints $a, b$; tolerance $TOL$; maximum number of iterations $N$.
- OUTPUT: approximate solution $c$ or message of failure.
- Step 1: Set $k = 1$, $FA=f(a)$.
- Step 2: While $k\leq N$, do Steps 3–6.
  - Step 3: Set $c=\frac{a+b}{2}$ and compute $FC=f(c)$.
  - Step 4: If $FC=0$ or $\frac{|b-a|}{2}<TOL$, then output $c$ (procedure completed successfully) and stop.
  - Step 5: If $FA\cdot FC<0$, then set $b=c$; else set $a=c$ and $FA=FC$.
  - Step 6: Set $k=k+1$.
- Step 7: OUTPUT "Method failed after $N$ iterations." STOP.
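The steps above can be sketched as a small Python function (the name `bisect` and the default tolerances are illustrative, not from the text):

```python
def bisect(f, a, b, tol=1e-8, max_iter=100):
    """Bisection method: find a root of f in [a, b] with f(a)*f(b) < 0."""
    fa = f(a)
    if fa * f(b) > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2                    # Step 3: midpoint
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:   # Step 4: stopping test
            return c
        if fa * fc < 0:                    # Step 5: root lies in [a, c]
            b = c
        else:                              # root lies in [c, b]
            a, fa = c, fc
    raise RuntimeError(f"Method failed after {max_iter} iterations")

# Root of f(x) = x**2 - x - 2 in [1, 4] is x* = 2.
print(bisect(lambda x: x * x - x - 2, 1, 4))
```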
Other Stopping Criteria
Other stopping criteria for iterative procedures, given a tolerance $\varepsilon >0$:
$$|p_n-p_{n-1}|<\varepsilon,\qquad \frac{|p_n-p_{n-1}|}{|p_n|} < \varepsilon,\qquad |f(p_n)|< \varepsilon.$$
2.1.3 Convergence Analysis
Theorem
Suppose that $f\in C[a,b]$ and $f(a)f(b)<0$. The Bisection method generates a sequence $\{p_n\}_{n=1}^\infty$ approximating a zero $p$ of $f$ with
$$|p_n-p|\leq \frac{b-a}{2^n},\quad n\geq1.$$
Proof
By the procedure, we know that
$$|b_1-a_1|=|b-a|,\quad |b_2-a_2|=\frac{|b_1-a_1|}{2}=\frac{|b-a|}{2},\quad \dots,\quad |b_n-a_n|=\frac{|b_{n-1}-a_{n-1}|}{2}=\frac{|b-a|}{2^{n-1}}.$$
Since $p_n=\frac{a_n+b_n}{2}$ and $p\in (a_n,p_n]$ or $p\in [p_n,b_n)$ for all $n\geq 1$, it follows that
$$|p_n-p|\leq \frac{|b_n-a_n|}{2}=\frac{b-a}{2^n}.$$
Convergence Rate
Since
$$|p_n-p|\leq \frac{|b_n-a_n|}{2}=\frac{|b-a|}{2^n},$$
the sequence $\{p_n\}_{n=1}^\infty$ converges to $p$ with rate of convergence $O\left(\frac{1}{2^n}\right)$, that is,
$$p_n=p+O\left(\frac{1}{2^n}\right).$$
Other Property
Bisection is certain to converge, but does so slowly.
Given a starting interval $[a,b]$, the length of the interval after $k$ iterations is $\frac{b-a}{2^k}$, so achieving an error tolerance of $\varepsilon$ (i.e. $\frac{b-a}{2^k}<\varepsilon$) requires $k\approx\left\lceil \log_2\frac{b-a}{\varepsilon}\right\rceil$ iterations, regardless of the function $f$ involved.
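This iteration-count estimate can be checked in a couple of lines (a small sketch; the function name is illustrative):

```python
import math

def bisection_iterations(a, b, eps):
    """Number of bisection steps needed so that (b - a) / 2**k < eps."""
    return math.ceil(math.log2((b - a) / eps))

# Halving [0, 1] down to a width below 1e-6 takes 20 steps, since 2**20 > 1e6.
print(bisection_iterations(0, 1, 1e-6))  # → 20
```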
2.2 Fixed Point Method
2.2.1 Introduction
- A fixed point of a given function $g:\mathbb{R}\rightarrow \mathbb{R}$ is a value $x^*$ such that $x^*=g(x^*)$.
- Many iterative methods for solving nonlinear equations use a fixed-point iteration scheme of the form
$$x_{k+1}=g(x_k),$$
where the fixed points of $g$ are solutions of $f(x)=0$.
Example

If $f(x)=x^2-x-2$, it has two roots, $x^*=2$ and $x^*=-1$. Then the fixed points of each of the functions
$$g(x)=x^2-2,\qquad g(x)=\sqrt{x+2},\qquad g(x)=1+\frac{2}{x},\qquad g(x)=\frac{x^2+2}{2x-1}$$
are solutions to the equation $f(x)=0$.
2.2.2 Algorithm
Definition

- To approximate the fixed point of a function $g(x)$, we choose an initial approximation $p_0$ and generate the sequence $\{p_n\}^\infty_{n=0}$ by letting
$$p_n = g(p_{n-1}),\quad n=1,2,\dots$$
- If the sequence $\{p_n\}^\infty_{n=0}$ converges to $p$ and $g(x)$ is continuous, then we have
$$p = \lim_{n\rightarrow \infty}p_n=\lim_{n\rightarrow \infty}g(p_{n-1})=g\left(\lim_{n\rightarrow \infty}p_{n-1}\right)=g(p),$$
and a solution to $x=g(x)$ is obtained.
- The technique described above is called fixed-point iteration (or functional iteration).
Example
Pseudo-Code

- INPUT: initial approximation $p_0$; tolerance $TOL$; maximum number of iterations $N$.
- OUTPUT: approximate solution $p$ or message of failure.
- Step 1: Set $n = 1$.
- Step 2: While $n\leq N$, do Steps 3–5.
  - Step 3: Set $p= g(p_0)$.
  - Step 4: If $|p-p_0|<TOL$, then output $p$ (procedure completed successfully) and stop.
  - Step 5: Set $n=n+1$, $p_0=p$.
- Step 6: OUTPUT "Method failed after $N$ iterations." STOP.
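A minimal Python sketch of this pseudo-code (the name `fixed_point` and its defaults are assumptions):

```python
def fixed_point(g, p0, tol=1e-8, max_iter=100):
    """Fixed-point iteration: solve x = g(x) starting from p0."""
    for _ in range(max_iter):
        p = g(p0)                 # Step 3: one iteration step
        if abs(p - p0) < tol:     # Step 4: successive-iterate test
            return p
        p0 = p                    # Step 5: prepare next iteration
    raise RuntimeError(f"Method failed after {max_iter} iterations")

# g(x) = sqrt(x + 2) has the fixed point x* = 2, a root of x**2 - x - 2.
print(fixed_point(lambda x: (x + 2) ** 0.5, 1.5))
```

Note that $|g'(2)|=\frac{1}{4}<1$ here, which is why this particular $g$ converges.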
2.2.3 Existence and Uniqueness
Theorem: Sufficient Conditions for the Existence and Uniqueness of a Fixed Point
- 【Existence】 If $g(x)\in C[a,b]$ and $g(x)\in [a,b]$ for all $x\in [a,b]$, then $g(x)$ has a fixed point in $[a,b]$. (Here $g(x)\in [a,b]$ means $\max\{g(x)\}\leq b$ and $\min\{g(x)\}\geq a$.)
- 【Uniqueness】 If, in addition, $g'(x)$ exists on $(a,b)$ and a positive constant $k<1$ exists with $|g'(x)|\leq k$ for all $x\in (a,b)$, then the fixed point in $[a,b]$ is unique.
Proof for Existence
- If $g(a)=a$ or $g(b)=b$, then $g(x)$ has a fixed point at an endpoint.
- Suppose not; then it must be true that $g(a)>a$ and $g(b)<b$.
- Thus the function $h(x)=g(x)-x$ is continuous on $[a,b]$, and we have
$$h(a)=g(a)-a>0,\qquad h(b)=g(b)-b<0.$$
- The Intermediate Value Theorem implies that there exists $p\in (a,b)$ with $h(p)=0$.
- Thus $g(p)-p=0$, and $p$ is a fixed point of $g(x)$.

The proof above rewrites $g(x)=x$ as $h(x)=g(x)-x$, turning the fixed-point problem into a zero-existence problem.
Proof for Uniqueness
The proof is by contradiction.

- Suppose, in addition, that $|g'(x)|\leq k< 1$ and that $p$ and $q$ are both fixed points in $[a,b]$ with $p\not=q$.
- Then by the Mean Value Theorem, a number $\xi$ exists between $p$ and $q$, hence in $[a,b]$, with
$$\frac{g(p)-g(q)}{p-q}=g'(\xi).$$
- Then
$$|p-q|=|g(p)-g(q)|=|g'(\xi)|\,|p-q|\leq k|p-q|<|p-q|,$$
which is a contradiction.
- So $p=q$, and the fixed point in $[a,b]$ is unique.
2.2.4 Convergence Analysis
Theorem
If the two conditions above are met, then for any number $p_0\in[a,b]$, the sequence $\{p_n\}_0^\infty$ defined by
$$p_n=g(p_{n-1}),\quad n\geq 1,$$
converges to the unique fixed point $p$ in $[a,b]$.
Proof
The existence and uniqueness of the fixed point $p$ follow from the Intermediate Value Theorem argument above.

Using the fact that $|g'(x)|\leq k$ and the Mean Value Theorem, we have
$$|p_n-p|=|g(p_{n-1})-g(p)|=|g'(\xi)|\,|p_{n-1}-p|\leq k|p_{n-1}-p|,$$
where $\xi \in (a,b)$.

Applying this inequality inductively shows
$$|p_n-p|\leq k|p_{n-1}-p|\leq k^2|p_{n-2}-p|\leq \dots\leq k^n|p_{0}-p|.$$

Since $k<1$,
$$\lim_{n\rightarrow\infty}|p_n-p|\leq \lim_{n\rightarrow\infty}k^n|p_0-p|=0,$$
and $\{p_n\}_0^\infty$ converges to $p$.
Convergence Rate
Since
$$|p_n-p|\leq k^n|p_0-p|\leq k^n|b-a|,$$
the sequence $\{p_n\}_{n=0}^\infty$ converges to $p$ with rate of convergence $O(k^n)$, $k<1$; that is,
$$p_n=p+O(k^n),\quad k<1.$$
2.2.5 Bounds for the Error
Corollary
If the solution of $g(x)=x$ exists and is unique, then the bounds for the error involved in using $p_n$ to approximate $p$ are given by
$$|p_n-p|\leq k^n \max\{p_0-a,\,b-p_0\}$$
and
$$|p_n-p|\leq \frac{k^n}{1-k}|p_1-p_0|,$$
for all $n\geq 1$.
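The second bound yields an a-priori iteration count: the smallest $n$ with $\frac{k^n}{1-k}|p_1-p_0|<\varepsilon$. A small illustrative sketch (the contraction constant `k` and the first-step size are made-up inputs, not from the text):

```python
import math

def iterations_needed(k, first_step, eps):
    """Smallest n with k**n / (1 - k) * first_step < eps (assumes 0 < k < 1)."""
    # Solve k**n < eps * (1 - k) / first_step for n.
    return math.ceil(math.log(eps * (1 - k) / first_step) / math.log(k))

print(iterations_needed(0.5, 1.0, 1e-6))  # → 21
```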
Proof
$$|p_n-p_{n-1}|= |g(p_{n-1})-g(p_{n-2})|\leq k|p_{n-1}-p_{n-2}|\leq \dots\leq k^{n-1}|p_1-p_0|.$$

Let $m>n$. Using the triangle inequality $|a+b|\leq |a|+|b|$, we have
$$|p_m-p_n|\leq |p_m-p_{m-1}|+|p_{m-1}-p_{m-2}|+\dots+|p_{n+1}-p_n|\leq (k^{m-1}+k^{m-2}+\dots+k^n)|p_1-p_0|,$$
so
$$|p_m-p_n|\leq k^n(1+k+\dots+k^{m-n-1})|p_1-p_0|= k^n\,\frac{1-k^{m-n}}{1-k}|p_1-p_0|.$$

Letting $m\rightarrow\infty$, we have
$$|p-p_n|=\lim_{m\rightarrow\infty}|p_m-p_n|\leq \frac{k^n}{1-k}|p_1-p_0|.$$
2.3 Newton’s Method
2.3.1 Introduction
Status
The Newton-Raphson (or simply Newton's) method is one of the most powerful and well-known numerical methods for solving a root-finding problem $f(x)=0$.
Rough Description
(1) Suppose that $f\in C^2[a,b]$, and $x^*$ is a solution of $f(x)=0$.
(2) Let $\hat{x}\in [a,b]$ be an approximation to $x^*$ such that $f'(\hat{x})\not=0$ and $|\hat{x}-x^*|$ is "small".

Consider the first Taylor polynomial for $f(x)$ expanded about $\hat{x}$:
$$f(x)=f(\hat{x})+(x-\hat{x})f'(\hat{x})+\frac{(x-\hat{x})^2}{2}f''(\xi(x)),$$
where $\xi(x)$ lies between $x$ and $\hat{x}$.

Setting $x=x^*$ and using $f(x^*)=0$ gives
$$0=f(x^*)\approx f(\hat{x})+(x^*-\hat{x})f'(\hat{x}).$$

Solving for $x^*$ gives
$$x^*\approx \hat{x}-\frac{f(\hat{x})}{f'(\hat{x})}.$$
2.3.2 Algorithm
Definition
- Start with an initial approximation $x_0$.
- Define the iteration scheme by
$$x_n=x_{n-1}-\frac{f(x_{n-1})}{f'(x_{n-1})},\quad \forall n\geq 1.$$
- This scheme generates the sequence $\{x_n\}_0^\infty$.
Example
Pseudo-Code
The function $f$ is differentiable and $p_0$ is an initial approximation.

- INPUT: initial approximation $p_0$; tolerance $TOL$; maximum number of iterations $N$.
- OUTPUT: approximate solution $p$ or message of failure.
- Step 1: Set $n = 1$.
- Step 2: While $n\leq N$, do Steps 3–5.
  - Step 3: Set $p= p_0-\frac{f(p_0)}{f'(p_0)}$.
  - Step 4: If $|p-p_0|<TOL$, then output $p$ (procedure completed successfully) and stop.
  - Step 5: Set $n=n+1$, $p_0=p$.
- Step 6: OUTPUT "Method failed after $N$ iterations." STOP.
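A runnable Python sketch of this pseudo-code (function name and defaults are illustrative):

```python
def newton(f, df, p0, tol=1e-10, max_iter=50):
    """Newton's method: solve f(x) = 0 given the derivative df."""
    for _ in range(max_iter):
        p = p0 - f(p0) / df(p0)   # Step 3: Newton update
        if abs(p - p0) < tol:     # Step 4: stopping test
            return p
        p0 = p                    # Step 5
    raise RuntimeError(f"Method failed after {max_iter} iterations")

# Root of f(x) = x**2 - x - 2 near 3 is x* = 2.
print(newton(lambda x: x * x - x - 2, lambda x: 2 * x - 1, 3.0))
```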
2.3.3 Convergence
Theorem
(1) $f\in C^2[a,b]$.
(2) $p\in [a,b]$ is such that $f(p)=0$ and $f'(p)\not=0$.

Then there exists a $\delta>0$ such that Newton's method generates a sequence $\{p_n\}_1^\infty$ converging to $p$ for any initial approximation $p_0\in[p-\delta,p+\delta]$.
Proof
Newton's method is fixed-point iteration $p_n=g(p_{n-1})$ with
$$g(x)=x-\frac{f(x)}{f'(x)}.$$

What needs to be proved: following the convergence proof for the fixed-point method, we need to find an interval $[p-\delta,p+\delta]$ that $g$ maps into itself and on which $|g'(x)|\leq k<1$ (existence and uniqueness).

Proving process:

- $\exists \delta_1>0$ such that $g(x)\in C[p-\delta_1,p+\delta_1]$.
  Since $f'(p)\not=0$ and $f'$ is continuous, there exists $\delta_1>0$ such that $f'(x)\not= 0$ for all $x\in [p-\delta_1,p+\delta_1]$, so $g$ is defined and continuous on $[p-\delta_1,p+\delta_1]$.
- $g'(x)\in C[p-\delta_1,p+\delta_1]$.
  $$g'(x)=\frac{f(x)f''(x)}{(f'(x))^2}$$
  for all $x\in [p-\delta_1,p+\delta_1]$. Since $f\in C^2[a,b]$, $g'$ is continuous on $[p-\delta_1,p+\delta_1]$.
- $|g'(x)|\leq k< 1$ near $p$.
  Since $f(p)=0$, we have $g'(p)=0$. Since $g'\in C[p-\delta_1,p+\delta_1]$, there exists a $\delta$ with $0<\delta<\delta_1$ such that
  $$|g'(x)|\leq k<1, \quad \forall x\in [p-\delta,p+\delta].$$
- $g:[p-\delta,p+\delta]\mapsto [p-\delta,p+\delta]$.
  By the Mean Value Theorem, if $x\in[p-\delta,p+\delta]$, there exists $\xi$ between $x$ and $p$ with $|g(x)-g(p)|=|g'(\xi)|\,|x-p|$, so
  $$|g(x)-p|=|g(x)-g(p)|=|g'(\xi)|\,|x-p|\leq k|x-p|<|x-p|\leq\delta.$$
  Thus $g$ maps $[p-\delta,p+\delta]$ into itself.
According to the proving process above, all the hypotheses of the Fixed-Point Theorem are satisfied for
$$g(x)=x-\frac{f(x)}{f'(x)}.$$
Therefore, the sequence $\{p_n\}_{n=1}^\infty$ defined by
$$p_n=g(p_{n-1}),\quad \forall n\geq 1,$$
converges to $p$ for any $p_0\in[p-\delta,p+\delta]$.
2.3.4 Secant Method
Background
For Newton's method, each iteration requires evaluating both the function $f(x_k)$ and its derivative $f'(x_k)$, which may be inconvenient or expensive.
Improvement
The derivative is approximated by a finite difference using two successive iterates, so the iteration becomes
$$x_{k+1}=x_k-f(x_k)\,\frac{x_k-x_{k-1}}{f(x_k)-f(x_{k-1})}.$$
This method is known as the secant method.
Example
Pseudo-Code
- INPUT: initial approximations $p_0, p_1$; tolerance $TOL$; maximum number of iterations $N$.
- OUTPUT: approximate solution $p$ or message of failure.
- Step 1: Set $n = 2$, $q_0=f(p_0)$, $q_1=f(p_1)$.
- Step 2: While $n\leq N$, do Steps 3–5.
  - Step 3: Set $p= p_1-q_1\,\frac{p_1-p_0}{q_1-q_0}$. (Compute $p_n$.)
  - Step 4: If $|p-p_1|<TOL$, then output $p$ (procedure completed successfully) and stop.
  - Step 5: Set $n=n+1$, $p_0=p_1$, $p_1=p$, $q_0=q_1$, $q_1=f(p)$.
- Step 6: OUTPUT "Method failed after $N$ iterations." STOP.
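The same pseudo-code in Python (names and defaults assumed):

```python
def secant(f, p0, p1, tol=1e-10, max_iter=50):
    """Secant method: Newton-like updates with a finite-difference slope."""
    q0, q1 = f(p0), f(p1)
    for _ in range(max_iter):
        p = p1 - q1 * (p1 - p0) / (q1 - q0)   # Step 3: secant update
        if abs(p - p1) < tol:                  # Step 4: stopping test
            return p
        p0, p1, q0, q1 = p1, p, q1, f(p)       # Step 5: shift iterates
    raise RuntimeError(f"Method failed after {max_iter} iterations")

# Root of f(x) = x**2 - x - 2 between 1 and 3 is x* = 2.
print(secant(lambda x: x * x - x - 2, 1.0, 3.0))
```

Note that only one new function evaluation, `f(p)`, is needed per iteration; the other value is reused.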
2.3.5 False Position Method
Definition
To find a solution of $f(x)=0$ for a given continuous function $f$ on the interval $[p_0,p_1]$, where $f(p_0)$ and $f(p_1)$ have opposite signs:
$$f(p_0)f(p_1)<0.$$

The approximation $p_2$ is chosen in the same manner as in the secant method, as the $x$-intercept of the line joining $(p_0,f(p_0))$ and $(p_1,f(p_1))$.

To decide which secant line to use to compute $p_3$, we check the sign of $f(p_2)f(p_1)$ or $f(p_2)f(p_0)$. If $f(p_2)f(p_1)$ is negative, we choose $p_3$ as the $x$-intercept of the line joining $(p_1,f(p_1))$ and $(p_2,f(p_2))$.

In a similar manner, we obtain a sequence $\{p_n\}_2^\infty$ that approximates the root.
Example
Pseudo-Code
- INPUT: initial approximations $p_0, p_1$; tolerance $TOL$; maximum number of iterations $N$.
- OUTPUT: approximate solution $p$ or message of failure.
- Step 1: Set $n = 2$, $q_0=f(p_0)$, $q_1=f(p_1)$.
- Step 2: While $n\leq N$, do Steps 3–6.
  - Step 3: Set $p= p_1-q_1\,\frac{p_1-p_0}{q_1-q_0}$. (Compute $p_n$.)
  - Step 4: If $|p-p_1|<TOL$, then output $p$ (procedure completed successfully) and stop.
  - Step 5: Set $n=n+1$, $q=f(p)$.
  - Step 6: If $q\cdot q_1<0$, set $p_0=p_1$, $q_0=q_1$. Then set $p_1=p$, $q_1=q$. (The newest point always replaces $p_1$, so the test in Step 4 compares successive computed points.)
- Step 7: OUTPUT "Method failed after $N$ iterations." STOP.
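A Python sketch of false position (name `false_position` assumed), following the common variant in which the newest point always replaces $p_1$ so that the Step 4 test compares successive computed points:

```python
def false_position(f, p0, p1, tol=1e-10, max_iter=100):
    """False position: secant-style updates that keep the root bracketed."""
    q0, q1 = f(p0), f(p1)
    if q0 * q1 > 0:
        raise ValueError("f(p0) and f(p1) must have opposite signs")
    for _ in range(max_iter):
        p = p1 - q1 * (p1 - p0) / (q1 - q0)   # x-intercept of the secant line
        if abs(p - p1) < tol:
            return p
        q = f(p)
        if q * q1 < 0:          # sign change between p and p1: drop p0
            p0, q0 = p1, q1
        p1, q1 = p, q           # the newest point always replaces p1
    raise RuntimeError(f"Method failed after {max_iter} iterations")

print(false_position(lambda x: x * x - x - 2, 1.0, 3.0))
```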
2.4 Error Analysis for Iteration Methods
2.4.1 The Rate of Sequence Convergence
Definition
Suppose $\{p_n\}_{n=0}^\infty$ is a sequence that converges to $p$, with $p_n\not=p$ for all $n$. If positive constants $\lambda$ and $\alpha$ exist with
$$\lim_{n\rightarrow\infty} \frac{|p_{n+1}-p|}{|p_n-p|^\alpha}=\lambda,$$
then $\{p_n\}_{n=0}^\infty$ converges to $p$ of order $\alpha$, with asymptotic error constant $\lambda$.
Properties
- A sequence with a high order of convergence converges more rapidly than a sequence with a lower order.
- The asymptotic constant affects the speed of convergence but is not as important as the order.
Example
- If $\alpha=1$, the sequence is linearly convergent.
- If $\alpha=2$, the sequence is quadratically convergent.
Summary
The Mean Value Theorem is used to prove linear convergence, and Taylor's Theorem to prove quadratic convergence when $g'(p)=0$.
2.4.2 Convergent Order of Fixed-Point Iteration (Improved)
Convergent Order of Fixed-Point Iteration
(1) $g\in C[a,b]$ and $g(x)\in[a,b]$ for all $x\in[a,b]$.
(2) $g'(x)$ is continuous on $(a,b)$ and a positive constant $k<1$ exists with $|g'(x)|\leq k$ for all $x\in(a,b)$.

If $g'(p)\not=0$, then for any number $p_0$ in $[a,b]$, the sequence $p_n=g(p_{n-1})$, for $n\geq 1$, converges only linearly to the unique fixed point $p$ in $[a,b]$.
Proof
$$p_{n+1}-p=g(p_n)-g(p)=g'(\xi_n)(p_n-p),$$
where $\xi_n$ is between $p_n$ and $p$.

Since $\{p_n\}_{n=0}^\infty$ converges to $p$, and $\xi_n$ is between $p_n$ and $p$, the sequence $\{\xi_n\}_{n=0}^\infty$ also converges to $p$. Thus,
$$\lim_{n\rightarrow\infty}\frac{|p_{n+1}-p|}{|p_n-p|}=\lim_{n\rightarrow\infty}|g'(\xi_n)|=|g'(p)|,$$
so fixed-point iteration exhibits linear convergence with asymptotic error constant $|g'(p)|$ whenever $g'(p)\not=0$. This also implies that higher-order convergence for fixed-point methods can occur only when $g'(p)=0$.
Quadratic Convergence

Let $p$ be a solution of the equation $x=g(x)$.
(1) $g'(p)=0$.
(2) $g''$ is continuous and strictly bounded by $M$ on an open interval $I$ containing $p$.

Then there exists a $\delta>0$ such that, for $p_0\in [p-\delta, p+\delta]$, the sequence defined by $p_n=g(p_{n-1})$, for $n\geq 1$, converges at least quadratically to $p$.
Moreover, for sufficiently large values of $n$,
$$|p_{n+1}-p|<\frac{M}{2}|p_n-p|^2.$$
Proof
Expanding $g$ in a Taylor polynomial about $p$ and using the two conditions above,
$$g(x)=g(p)+g'(p)(x-p)+\frac{g''(\xi)}{2}(x-p)^2=p+\frac{g''(\xi)}{2}(x-p)^2,$$
where $\xi$ lies between $x$ and $p$.
Thus,
$$p_{n+1}=g(p_n)=p+\frac{g''(\xi_n)}{2}(p_n-p)^2,\qquad p_{n+1}-p=\frac{g''(\xi_n)}{2}(p_n-p)^2,$$
where $\xi_n$ lies between $p_n$ and $p$. Since $\lim_{n\rightarrow\infty}g''(\xi_n)=g''(p)$,
$$\lim_{n\rightarrow\infty}\frac{|p_{n+1}-p|}{|p_n-p|^2}=\lim_{n\rightarrow\infty}\frac{|g''(\xi_n)|}{2}=\frac{|g''(p)|}{2}.$$
Since $g''$ is strictly bounded by $M$ on the interval $[p-\delta,p+\delta]$, for sufficiently large values of $n$,
$$|p_{n+1}-p|<\frac{M}{2}|p_n-p|^2$$
is also derived.
Construct a quadratically convergent fixed-point problem
Let
$$g(x)=x-\phi(x)f(x),\qquad g'(x)=1-\phi'(x)f(x)-\phi(x)f'(x).$$
The condition for quadratic convergence is $g'(p)=0$; since $f(p)=0$, this requires $\phi(p)=\frac{1}{f'(p)}$. A reasonable approach is to let $\phi(x)=\frac{1}{f'(x)}$, which gives Newton's method.
Remarks
- The convergence rate of fixed-point iteration is usually linear, with constant $C=|g'(p)|$.
- But if $g'(p)=0$, then the convergence rate is at least quadratic.
2.4.3 Zero of Multiplicity
Definition
A solution $p$ of $f(x)=0$ is a zero of multiplicity $m$ of $f(x)$ if for $x\not=p$ we can write
$$f(x)=(x-p)^m q(x),$$
where $\lim_{x\rightarrow p}q(x)\not=0$.
Theorem
- $f\in C^1[a,b]$ has a simple zero at $p$ in $(a,b)$ if and only if $f(p)=0$ but $f'(p)\not=0$.
- The function $f\in C^m[a,b]$ has a zero of multiplicity $m$ at $p$ if and only if
$$0=f(p)=f'(p)=f''(p)=\dots=f^{(m-1)}(p),\quad\text{but}\quad f^{(m)}(p)\not=0.$$
Proof
If $f$ has a simple zero at $p$, then $f(p)=0$ and
$$f(x)=(x-p)q(x),\qquad \lim_{x\rightarrow p}q(x)\not=0.$$
Since $f\in C^1[a,b]$,
$$f'(p)=\lim_{x\rightarrow p}f'(x)=\lim_{x\rightarrow p}\left[q(x)+(x-p)q'(x)\right]=\lim_{x\rightarrow p}q(x)\not=0.$$
2.4.4 Convergence of Newton’s Method
Property
Newton's method transforms the nonlinear equation $f(x)=0$ into the fixed-point problem $x=g(x)$ with $g(x)=x-\frac{f(x)}{f'(x)}$.
- If $p$ is a simple root, then $f(p)=0$, $f'(p)\not=0$, and $g'(p)=0$, so the convergence rate is quadratic. (Iterations must start close enough to the root.)
- If $p$ is a root of multiplicity $m>1$, then
$$f(x)=(x-p)^m q(x),\qquad g'(p)=1-\frac{1}{m}\not=0,$$
so the convergence rate is linear.
Method for Avoiding Multiple Roots

$$f(x)=(x-p)^m q(x),\qquad u(x)=\frac{f(x)}{f'(x)}=(x-p)\,\frac{q(x)}{m\,q(x)+(x-p)q'(x)},\qquad u'(p)=\frac{1}{m}\not=0.$$
Thus $p$ is a simple root of $u(x)$. We then apply Newton's method to $u(x)$ instead:
$$g(x)=x-\frac{u(x)}{u'(x)}=x-\frac{f(x)f'(x)}{f'(x)^2-f(x)f''(x)},$$
whose convergence rate is again quadratic.
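A small Python sketch of the modified iteration (names are illustrative). For $f(x)=(x-1)^2$, which has a double root at $x=1$, plain Newton converges only linearly, while here $g(x)$ simplifies to the constant $1$, so the root is found in one step:

```python
def modified_newton(f, df, ddf, p0, tol=1e-12, max_iter=100):
    """Newton's method applied to u(x) = f(x)/f'(x), for multiple roots."""
    for _ in range(max_iter):
        fp, dfp = f(p0), df(p0)
        if fp == 0:               # exact root found
            return p0
        # g(x) = x - f f' / (f'^2 - f f'')
        p = p0 - fp * dfp / (dfp ** 2 - fp * ddf(p0))
        if abs(p - p0) < tol:
            return p
        p0 = p
    raise RuntimeError(f"Method failed after {max_iter} iterations")

# f(x) = (x - 1)**2 has a double root at x = 1.
f = lambda x: (x - 1) ** 2
df = lambda x: 2 * (x - 1)
ddf = lambda x: 2.0
print(modified_newton(f, df, ddf, 3.0))  # → 1.0
```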
2.4.5 Convergence rate of Secant Method
- The convergence rate of the secant method is normally superlinear, with order $r\approx1.618$, which is lower than that of Newton's method.
- The secant method reuses the two most recent function values, so each iteration needs only one new function evaluation and no derivative evaluation.
- Its disadvantage is that it needs two starting guesses close enough to the solution in order to converge.
2.5 Accelerating Convergence
2.5.1 Aitken’s method
Background
The aim is to accelerate the convergence of a sequence that is linearly convergent, regardless of its origin or application. Suppose $\{p_n\}_{n=0}^\infty$ converges linearly to $p$, so that
$$\lim_{n\rightarrow \infty} \frac{p_{n+1}-p}{p_n-p}=\lambda,\quad \lambda\not=0.$$
Thus, when $n$ is sufficiently large,
$$\frac{p_{n+1}-p}{p_n-p}\approx \frac{p_{n+2}-p}{p_{n+1}-p}.$$
Solving this approximation for $p$ gives
$$p\approx\frac{p_n\,p_{n+2}-p_{n+1}^2}{p_{n+2}-2p_{n+1}+p_n}=p_n-\frac{(p_{n+1}-p_{n})^2}{p_{n+2}-2p_{n+1}+p_n}.$$
Aitken's $\Delta^2$ method defines a new sequence $\{\hat{p}_n\}_{n=0}^\infty$:
$$\hat{p}_n=p_n-\frac{(p_{n+1}-p_{n})^2}{p_{n+2}-2p_{n+1}+p_n},$$
whose convergence is faster than that of the original sequence $\{p_n\}_{n=0}^\infty$.
Definition
Given the sequence $\{p_n\}_{n=0}^\infty$, the forward difference $\Delta p_n$ is defined by
$$\Delta p_n=p_{n+1}-p_n,\quad n\geq 0.$$
Higher powers $\Delta^k p_n$ are defined recursively by
$$\Delta^k p_n=\Delta(\Delta^{k-1}p_{n}),\quad k\geq 2.$$
For example,
$$\Delta^2 p_n=\Delta(\Delta p_{n})=\Delta(p_{n+1}-p_n)=\Delta p_{n+1}-\Delta p_n=p_{n+2}-2p_{n+1}+p_n.$$
Thus,
$$\hat{p}_n=p_n-\frac{(\Delta p_n)^2}{\Delta^2 p_n}.$$
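The formula is easy to try numerically; a sketch (names assumed) applying it to the linearly convergent iterates $p_{n+1}=\sqrt{p_n+2}$ from the earlier example, whose fixed point is $p=2$:

```python
def aitken(p):
    """Apply Aitken's Delta^2 formula to a list of iterates."""
    return [
        p[n] - (p[n + 1] - p[n]) ** 2 / (p[n + 2] - 2 * p[n + 1] + p[n])
        for n in range(len(p) - 2)
    ]

# Linearly convergent fixed-point iterates of g(x) = sqrt(x + 2).
p = [1.5]
for _ in range(6):
    p.append((p[-1] + 2) ** 0.5)

print([abs(x - 2) for x in p[:5]])       # original errors
print([abs(x - 2) for x in aitken(p)])   # accelerated errors, much smaller
```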
Theorem
Condition:
$$\lim_{n\rightarrow \infty} \frac{p_{n+1}-p}{p_n-p}=\lambda,\quad \lambda\not=0,\qquad (p_n-p)(p_{n+1}-p)>0.$$

Result: the sequence $\{\hat{p}_n\}_{n=0}^\infty$ converges to $p$ faster than $\{p_n\}_{n=0}^\infty$ in the sense that
$$\lim_{n\rightarrow\infty}\frac{\hat{p}_n-p}{p_n-p}=0.$$
2.5.2 Steffensen’s method
Definition
The function is $p=g(p)$, the initial approximation is $p_0$, and $\hat{p}_0=p_0-\frac{(\Delta p_0)^2}{\Delta^2 p_0}$.
We assume that $\hat{p}_0$ is a better approximation than $p_2$, so we apply fixed-point iteration to $\hat{p}_0$ instead of $p_2$; the computing process is shown below.
Theorem
Suppose that $x=g(x)$ has the solution $p$ with $g'(p)\not=1$.
If there exists a $\delta>0$ such that $g\in C^3[p-\delta,p+\delta]$,
then Steffensen's method gives quadratic convergence for any $p_0\in[p-\delta,p+\delta]$.
Pseudo-Code
- INPUT: initial approximation $p_0$; tolerance $TOL$; maximum number of iterations $N$.
- OUTPUT: approximate solution $p$ or message of failure.
- Step 1: Set $n=1$.
- Step 2: While $n\leq N$, do Steps 3~5.
  - Step 3: Set $p_1=g(p_0)$, $p_2=g(p_1)$, $p=p_0-\displaystyle\frac{(p_1-p_0)^2}{p_2-2p_1+p_0}$.
  - Step 4: If $|p-p_0|<TOL$, then OUTPUT $p$; (Procedure completed successfully.) STOP!
  - Step 5: Set $n=n+1$, $p_0=p$.
- Step 6: OUTPUT "Method failed after $N$ iterations." STOP!
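The pseudo-code above translates directly into Python; this is a sketch, and the fixed-point map $g(x)=\cos x$ is an illustrative choice, not from the text:

```python
import math

def steffensen(g, p0, tol=1e-10, max_iter=100):
    """Steffensen's method: fixed-point iteration accelerated by Aitken's Delta^2."""
    for _ in range(max_iter):
        p1 = g(p0)
        p2 = g(p1)
        denom = p2 - 2 * p1 + p0
        if denom == 0:  # Delta^2 vanished: the iterates have (numerically) converged
            return p0
        p = p0 - (p1 - p0) ** 2 / denom  # Step 3: Aitken's Delta^2 formula
        if abs(p - p0) < tol:            # Step 4: stopping criterion
            return p
        p0 = p                           # Step 5: restart from the accelerated value
    raise RuntimeError("Method failed after max_iter iterations.")

root = steffensen(math.cos, 1.0)  # solves x = cos(x)
```

Note that each outer pass performs two evaluations of $g$ before restarting from the accelerated value, which is what yields the quadratic convergence stated in the theorem.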
2.6 Zeros of Polynomials and Muller’s Method
2.6.1 Polynomial Theorem
2.6.2 Horner’s Method
Background
A more efficient method to calculate $P(x_0)$ and $P'(x_0)$ for a polynomial $P(x)$.
Theorem
Let
$$P(x)=\sum\limits_{i=0}^{n}a_ix^i.$$
- Construction process for $P(x_0)$ (substitute the formulas one by one to verify)

If $b_n=a_n$ and
$$b_k=a_k+b_{k+1}x_0,\quad k\in [0,n-1],$$
then $b_0=P(x_0)$.
Moreover, if
$$Q(x)=\sum\limits_{i=1}^{n}b_ix^{i-1},$$
then
$$P(x)=(x-x_0)Q(x)+b_0.$$
- Construction process for $P'(x_0)$ (substitute the formulas one by one to verify)

$$P(x)=(x-x_0)Q(x)+b_0\\ P'(x)=Q(x)+(x-x_0)Q'(x)\\ P'(x_0)=Q(x_0)$$
Let
$$Q(x)=\sum\limits_{i=1}^{n}b_ix^{i-1}=(x-x_0)R(x)+c_1,$$
where
$$R(x)=\sum\limits_{i=2}^{n}c_ix^{i-2}.$$
Thus
$$c_n=b_n,\\ c_k=b_k+c_{k+1}x_0,\quad k\in[1,n-1],\\ Q(x_0)=c_1=P'(x_0).$$
Pseudo-Code
To compute the values $P(x_0)$ and $P'(x_0)$ for the polynomial $P(x)=\sum\limits_{i=0}^{n}a_ix^i$:

- INPUT: degree $n$; coefficients $a_0,a_1,...,a_n$ of polynomial $P(x)$; point $x_0$.
- OUTPUT: values of $P(x_0)$ and $P'(x_0)$.
- Step 1: Set $y=a_n$ ($b_n$ for $Q$), $z=0$ ($c_{n+1}$ for $R$).
- Step 2: For $j=n-1,n-2,...,0$, set
  - $z=y+z*x_0$ ($c_{j+1}$ for $R$),
  - $y=a_j+y*x_0$ ($b_j$ for $Q$).
- Step 3: OUTPUT $y=P(x_0)$ and $z=P'(x_0)$.
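The steps above can be sketched in Python as follows; the coefficient list stores $a_i$ as the coefficient of $x^i$, and the test polynomial is an illustrative choice, not from the text:

```python
def horner(a, x0):
    """Nested evaluation of P(x0) and P'(x0), where P(x) = sum_i a[i] * x**i."""
    n = len(a) - 1
    y = a[n]   # b_n
    z = 0.0    # c_{n+1}
    for j in range(n - 1, -1, -1):
        z = y + z * x0     # c_{j+1} for R
        y = a[j] + y * x0  # b_j for Q
    return y, z            # P(x0), P'(x0)

# P(x) = 2x^4 - 3x^2 + 3x - 4 evaluated at x0 = -2
val, deriv = horner([-4, 3, 0, -3, 0, 2][:5] if False else [-4, 3, -3, 0, 2], -2)
```

Here `val` is $P(-2)=10$ and `deriv` is $P'(-2)=-49$, which can be checked directly against $P'(x)=8x^3-6x+3$. One pass of the loop produces both values, which is the efficiency gain Horner's method offers over evaluating $P$ and $P'$ separately.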
2.6.3 Deflation Method
Newton's method combined with Horner's method
Deflation Method
Suppose that $x_N$, obtained in the $N$th iteration of the Newton-Raphson procedure, is an approximate zero of $P(x)$; then
$$P(x)=(x-x_N)Q(x)+b_0=(x-x_N)Q(x)+P(x_N)\approx (x-x_N)Q(x).$$
Let $\hat{x}_1=x_N$ be the approximate zero of $P$, and $Q_1(x)=Q(x)$ be the approximate factor; then
$$P(x)\approx (x-\hat{x}_1)Q_1(x).$$
To find the second approximate zero of $P(x)$, we can apply the same procedure to $Q_1(x)$, giving $Q_1(x)\approx(x-\hat{x}_2)Q_2(x)$, where $Q_2(x)$ is a polynomial of degree $n-2$. Thus $P(x)\approx (x-\hat{x}_1)Q_1(x)\approx (x-\hat{x}_1)(x-\hat{x}_2)Q_2(x)$.
Repeating this procedure until we reach $Q_{n-2}(x)$, which is a quadratic polynomial that can be solved by the quadratic formula, we obtain all the approximate zeros of $P(x)$. This method is called the deflation method.
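A minimal sketch of one deflation step, using the synthetic-division relation $P(x)=(x-x_0)Q(x)+b_0$ from Horner's method; the coefficients here are listed highest degree first, and the cubic is an illustrative example, not from the text:

```python
def deflate(a, x0):
    """One deflation step: divide P by (x - x0) via synthetic division.

    a = [a_n, ..., a_0] (highest degree first).
    Returns the coefficients of Q(x) and the remainder b0 = P(x0)."""
    b = [a[0]]
    for coeff in a[1:]:
        b.append(coeff + b[-1] * x0)
    return b[:-1], b[-1]

# P(x) = x^3 - 6x^2 + 11x - 6 with approximate zero 1:
q, rem = deflate([1, -6, 11, -6], 1.0)
# Q(x) = x^2 - 5x + 6 is quadratic, so the remaining zeros (2 and 3)
# follow from the quadratic formula.
```

In practice each $\hat{x}_k$ is only approximate, so the remainder is merely close to zero and the deflated coefficients accumulate error; this is why deflation is usually followed by polishing each root with Newton's method on the original $P$.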