Lecture 3: Trust-Region Methods


1. Introduction:

  • Trust-region methods choose the direction and length of the step simultaneously.
  • If a step is not acceptable, they reduce the size of the region and find a new minimizer.
  • The step direction changes whenever the size of the trust region is altered.
  • If the region is too small, the algorithm misses an opportunity to take a substantial step.
  • If too large, the minimizer of the model may be far from the minimizer of the objective function within the region.
  • Increase the trust region if the previous step was good; otherwise reduce its size.


Subproblem:
\[\min_{p}m_k(p)=f_k+\nabla f_k^Tp+\frac{1}{2}p^TB_kp, \quad \text{s.t.}\ ||p||\leq\Delta_k\]
If \(B_k\) is positive definite and \(||B_k^{-1}\nabla f_k||\leq\Delta_k\), then the full step \(p_k=-B_k^{-1}\nabla f_k\) solves the subproblem.
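The quadratic model and the full-step test above translate directly into code. Below is a minimal sketch assuming numpy; the names model_value and full_step_if_inside are illustrative, not from the text.

```python
import numpy as np

def model_value(f_k, g_k, B_k, p):
    """Quadratic model m_k(p) = f_k + g_k^T p + 0.5 p^T B_k p."""
    return f_k + g_k @ p + 0.5 * p @ (B_k @ p)

def full_step_if_inside(g_k, B_k, delta_k):
    """Return the full step -B_k^{-1} g_k when B_k is positive definite
    and the step lies inside the trust region; otherwise None."""
    try:
        np.linalg.cholesky(B_k)   # succeeds only for (numerically) positive definite B_k
    except np.linalg.LinAlgError:
        return None
    p = -np.linalg.solve(B_k, g_k)
    return p if np.linalg.norm(p) <= delta_k else None
```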


2. Outline of the algorithm:

Algorithm:
Given \(\bar{\Delta}>0, \Delta_0\in(0,\bar{\Delta})\), and \(\eta\in[0,\frac{1}{4})\)
for \(k=0,1,2,\cdots\)
  obtain \(p_k\) by (approximately) solving the above subproblem
  evaluate \(\rho_k=\frac{f(x_k)-f(x_k+p_k)}{m_k(0)-m_k(p_k)}\)
  if \(\rho_k<\frac{1}{4}\)
    \(\Delta_{k+1}=\frac{1}{4}||p_k||\)
  else
    if \(\rho_k>\frac{3}{4}\) and \(||p_k||=\Delta_k\)
      \(\Delta_{k+1}=\min(2\Delta_k,\bar{\Delta})\)
    else
      \(\Delta_{k+1}=\Delta_k\)
  if \(\rho_k>\eta\)
    \(x_{k+1}=x_k+p_k\)
  else
    \(x_{k+1}=x_k\)
end(for)
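A minimal sketch of this outer loop, assuming numpy; trust_region is an illustrative name, and solve_subproblem stands for any routine that approximately solves the subproblem (for example, the Cauchy-point or dogleg steps sketched later in these notes).

```python
import numpy as np

def trust_region(f, grad, hess, solve_subproblem, x0,
                 delta_bar=2.0, delta0=1.0, eta=0.15, max_iter=100, tol=1e-8):
    """Outer trust-region loop following the algorithm above
    (eta must lie in [0, 1/4))."""
    x, delta = np.asarray(x0, dtype=float), delta0
    for _ in range(max_iter):
        g, B = grad(x), hess(x)
        if np.linalg.norm(g) < tol:
            break
        p = solve_subproblem(g, B, delta)
        actual = f(x) - f(x + p)                    # actual reduction
        predicted = -(g @ p + 0.5 * p @ (B @ p))    # m_k(0) - m_k(p_k)
        rho = actual / predicted
        if rho < 0.25:
            delta = 0.25 * np.linalg.norm(p)        # poor agreement: shrink
        elif rho > 0.75 and np.isclose(np.linalg.norm(p), delta):
            delta = min(2.0 * delta, delta_bar)     # good step on the boundary: expand
        if rho > eta:
            x = x + p                               # accept the step
    return x
```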

Theorem:
The vector \(p^*\) is a global solution of the trust-region problem
\[\min_{p\in\mathbb{R}^n}m(p)=f+g^Tp+\frac{1}{2}p^TBp, \quad \text{s.t.}\ ||p||\leq \Delta\]
if and only if \(p^*\) is feasible and there is a scalar \(\lambda\geq 0\) such that the following conditions are satisfied:
\[(B+\lambda I)p^*=-g\]
\[\lambda(\Delta-||p^*||)=0\]
\[(B+\lambda I)\ \text{is positive semidefinite}\]
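The theorem suggests a practical solver: search for the \(\lambda\geq 0\) at which \(||p(\lambda)||=\Delta\), where \((B+\lambda I)p(\lambda)=-g\). Below is a minimal sketch of the standard Newton iteration on \(\lambda\) (cf. Algorithm 4.3 in Nocedal and Wright's Numerical Optimization), assuming numpy; the safeguards needed to keep \(B+\lambda I\) positive definite for indefinite \(B\) are omitted.

```python
import numpy as np

def solve_subproblem_exact(g, B, delta, lam=1.0, iters=20):
    """Find lam >= 0 with (B + lam I) p = -g and ||p|| = delta,
    via Newton's method on phi(lam) = 1/||p(lam)|| - 1/delta."""
    I = np.eye(len(g))
    try:
        np.linalg.cholesky(B)                 # is B positive definite?
        p = -np.linalg.solve(B, g)
        if np.linalg.norm(p) <= delta:
            return p                          # interior solution: lam = 0
    except np.linalg.LinAlgError:
        pass
    for _ in range(iters):
        L = np.linalg.cholesky(B + lam * I)   # B + lam I = L L^T
        p = -np.linalg.solve(B + lam * I, g)
        q = np.linalg.solve(L, p)             # q = L^{-1} p
        norm_p, norm_q = np.linalg.norm(p), np.linalg.norm(q)
        # Newton update for phi(lam) = 0.
        lam = max(lam + (norm_p / norm_q) ** 2 * (norm_p - delta) / delta, 0.0)
    return -np.linalg.solve(B + lam * I, g)
```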


3. Algorithms based on the Cauchy point:

Two strategies for finding approximate solutions that achieve at least as much reduction as the Cauchy point:

  • The dogleg method
  • The two-dimensional subspace minimization

Cauchy point calculation:
Find the vector \(p_k^s\) that solves a linear version of the subproblem, that is,
\[p_k^s=\operatorname{arg\,min}_{p\in\mathbb{R}^n}\ f_k+g_k^Tp, \quad \text{s.t.}\ ||p||\leq\Delta_k\]
Then calculate the scalar \(\tau_k>0\) that minimizes \(m_k(\tau_kp_k^s)\) subject to the trust-region bound:
\[\tau_k=\operatorname{arg\,min}_{\tau\geq 0}\ m_k(\tau p_k^s), \quad \text{s.t.}\ ||\tau p_k^s||\leq \Delta_k\]
Set \(p_k^c=\tau_kp_k^s\).
The closed-form solution is:
\[p_k^c=-\tau_k\frac{\Delta_k}{||g_k||}g_k\]
\[\tau_k=\begin{cases} 1, & \text{if}\ g_k^TB_kg_k\leq 0 \\ \min\left(||g_k||^3/(\Delta_kg_k^TB_kg_k),\ 1\right), & \text{otherwise} \end{cases}\]
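These closed-form expressions are easy to implement. A minimal sketch assuming numpy; cauchy_point is an illustrative name.

```python
import numpy as np

def cauchy_point(g, B, delta):
    """Cauchy point p_c = -tau * (delta / ||g||) * g, with tau as above."""
    g_norm = np.linalg.norm(g)
    gBg = g @ (B @ g)
    if gBg <= 0:
        tau = 1.0                                  # model decreases all the way to the boundary
    else:
        tau = min(g_norm**3 / (delta * gBg), 1.0)  # minimizer along -g, capped at the boundary
    return -tau * (delta / g_norm) * g
```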

The Dogleg method:
[Figure: the dogleg path from \(p^U\) to \(p^B\) inside the trust region]

\[\tilde{p}(\tau)=\begin{cases} \tau p^U, & 0\leq \tau \leq 1, \\ p^U+(\tau-1)(p^B-p^U), & 1 \leq \tau \leq 2, \end{cases}\]
where \(p^B=-B^{-1}g\) is the full (Newton) step and \(p^U=-\frac{g^Tg}{g^TBg}g\) is the minimizer of the model along the steepest-descent direction.

\(\tilde{p}(\tau)\) intersects the trust-region boundary \(||p||=\Delta\) at exactly one point if \(||p^B||\geq \Delta\), and nowhere otherwise. Since \(m\) is decreasing along the path, the chosen value of \(p\) is \(p^B\) if \(||p^B||\leq \Delta\); otherwise it is the point of intersection of the dogleg and the trust-region boundary. In the latter case, we compute the appropriate value of \(\tau\) by solving the scalar quadratic equation:
\[||p^U+(\tau-1)(p^B-p^U)||^2=\Delta^2\]

The Newton-dogleg method is most appropriate when the objective function is convex with \(\nabla^2f(x_k)\) positive definite, so that the full step \(p^B\) is well defined.
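A minimal sketch of the dogleg step under these definitions, assuming numpy and a positive definite \(B\); dogleg_step is an illustrative name, and this routine can serve as solve_subproblem in the outer loop sketched in section 2.

```python
import numpy as np

def dogleg_step(g, B, delta):
    """Follow the dogleg path (first leg tau*p_U, second leg p_U -> p_B),
    truncated at the trust-region boundary ||p|| = delta."""
    p_b = -np.linalg.solve(B, g)            # full (Newton) step
    if np.linalg.norm(p_b) <= delta:
        return p_b                          # full step lies inside the region
    p_u = -(g @ g) / (g @ (B @ g)) * g      # minimizer along steepest descent
    if np.linalg.norm(p_u) >= delta:
        return delta * p_u / np.linalg.norm(p_u)   # boundary hit on the first leg
    # Second leg: solve ||p_u + s (p_b - p_u)||^2 = delta^2 for s = tau - 1 in [0, 1].
    d = p_b - p_u
    a, b, c = d @ d, 2 * (p_u @ d), p_u @ p_u - delta**2
    s = (-b + np.sqrt(b**2 - 4 * a * c)) / (2 * a)  # positive root of the quadratic
    return p_u + s * d
```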

Two-dimensional subspace minimization:
The subproblem is replaced by:
\[\min_p m(p)=f+g^Tp+\frac{1}{2}p^TBp, \quad \text{s.t.}\ ||p||\leq\Delta, \ p\in \operatorname{span}[g,B^{-1}g]\]
When \(B\) has negative eigenvalues, this method can be modified to handle the case by changing the subspace to:
\[\operatorname{span}[g, (B+\alpha I)^{-1}g], \quad \text{for some}\ \alpha\in(-\lambda_1, -2\lambda_1],\]
where \(\lambda_1\) denotes the most negative eigenvalue of \(B\); this choice of \(\alpha\) ensures that \(B+\alpha I\) is positive definite.
When \(||(B+\alpha I)^{-1}g||\leq \Delta\), we discard the subspace search and instead define the step to be
\[p=-(B+\alpha I)^{-1}g+v\]
where \(v\) is a vector that satisfies \(v^T(B+\alpha I)^{-1}g\leq 0\) which ensures that \(||p||\geq ||(B+\alpha I)^{-1}g||\). When \(B\) has zero eigenvalues but no negative eigenvalues, we define the step
to be the Cauchy point \(p=p^c\).
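A minimal sketch of the subspace step for the positive definite case, assuming numpy; two_d_subspace_step is an illustrative name, and the boundary search is done by brute force, which is enough to show the reduction to two variables.

```python
import numpy as np

def two_d_subspace_step(g, B, delta, n_grid=360):
    """Minimize the model over span[g, B^{-1} g] inside ||p|| <= delta
    by reducing to a 2-variable trust-region problem."""
    S = np.column_stack([g, np.linalg.solve(B, g)])
    Q, _ = np.linalg.qr(S)                  # orthonormal basis for the subspace
    g_r, B_r = Q.T @ g, Q.T @ B @ Q         # reduced gradient and Hessian
    z = np.linalg.solve(B_r, -g_r)          # interior candidate
    if np.linalg.norm(z) > delta:
        # Search the boundary z = delta * (cos t, sin t).
        ts = np.linspace(0.0, 2.0 * np.pi, n_grid, endpoint=False)
        zs = delta * np.stack([np.cos(ts), np.sin(ts)], axis=1)
        vals = zs @ g_r + 0.5 * np.einsum('ij,jk,ik->i', zs, B_r, zs)
        z = zs[np.argmin(vals)]
    return Q @ z                            # Q is orthonormal, so ||Q z|| = ||z||
```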


4. Global convergence:

It can be proved that the sequence of gradients \(\{g_k\}\) generated by the algorithm in section 2 has an accumulation point at zero (that is, \(\liminf_{k\to\infty}||g_k||=0\)), and in fact converges to zero when \(\eta\) is strictly positive.

Reposted from: https://www.cnblogs.com/cihui/p/6402904.html
