Convex Optimization Reading Notes (5)

Chapter 6: Approximation and fitting

6.1 Norm approximation

6.1.1 Basic norm approximation problem

The simplest norm approximation problem is an unconstrained problem of the form
$$
{\rm minimize} \quad \|Ax - b\|
$$
where $A \in \mathbf{R}^{m\times n}$, $b \in \mathbf{R}^m$, $x \in \mathbf{R}^n$, and $\|\cdot\|$ is a norm on $\mathbf{R}^m$. A solution of the norm approximation problem is sometimes called an approximate solution of $Ax \approx b$, in the norm $\|\cdot\|$. The vector
$$
r = Ax - b
$$
is called the residual for the problem; its components are sometimes called the individual residuals associated with $x$.
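
To see how the choice of norm affects the approximate solution, the following sketch (not from the book) solves the problem for the $\ell_1$, $\ell_2$, and $\ell_\infty$ norms with CVXPY; the sizes of $A$ and $b$ and the random seed are arbitrary.

```python
# Minimal sketch: norm approximation min ||Ax - b|| for p = 1, 2, inf,
# on made-up data. Requires numpy and cvxpy.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n = 100, 30
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

x = cp.Variable(n)
for p in [1, 2, "inf"]:
    cp.Problem(cp.Minimize(cp.norm(A @ x - b, p))).solve()
    r = A @ x.value - b
    print(f"p = {p}: ||r||_p = {np.linalg.norm(r, np.inf if p == 'inf' else p):.3f}")
```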

6.1.2 Penalty function approximation

In $\ell_p$-norm approximation, for $1 \leq p < \infty$, the objective is
$$
(|r_1|^p + \cdots + |r_m|^p)^{\frac{1}{p}}
$$
As in least-squares problems, we can consider the equivalent problem with objective
$$
|r_1|^p + \cdots + |r_m|^p
$$
The penalty function approximation problem has the form
$$
\begin{aligned}
{\rm minimize} \quad & \phi(r_1) + \cdots + \phi(r_m) \\
{\rm subject\ to} \quad & r = Ax - b
\end{aligned}
$$
where $\phi : \mathbf{R} \to \mathbf{R}$ is called the (residual) penalty function. We assume that $\phi$ is convex, so the penalty function approximation problem is a convex optimization problem.

deadzone-linear

$$
\phi(u) = \begin{cases} 0 & |u| \leq a \\ |u| - a & |u| > a \end{cases}
$$

log barrier

$$
\phi(u) = \begin{cases} -a^2 \log\bigl(1 - (u/a)^2\bigr) & |u| < a \\ \infty & |u| \geq a \end{cases}
$$

The log barrier penalty behaves very much like the $\ell_2$-norm penalty for small residuals, but puts very strong weight on residuals approaching the limit $a$ (roughly, larger than $0.8a$), and infinite weight on residuals with magnitude $a$ or more.
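
As a concrete illustration (not from the book), the deadzone-linear penalty above can be written directly in CVXPY as $\phi(u) = \max(|u| - a, 0)$; the data $A$, $b$ and the width $a$ below are made up.

```python
# Sketch: penalty function approximation with the deadzone-linear penalty
# phi(u) = max(|u| - a, 0), summed over the residuals r = Ax - b.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n = 100, 30
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
a = 0.5                                          # dead-zone width (arbitrary)

x = cp.Variable(n)
r = A @ x - b
cp.Problem(cp.Minimize(cp.sum(cp.maximum(cp.abs(r) - a, 0)))).solve()
```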

6.1.3 Approximation with constraints

It is possible to add constraints to the basic norm approximation problem. When these constraints are convex, the resulting problem is convex.

6.2 Least-norm problems

The basic least-norm problem has the form
$$
\begin{aligned}
{\rm minimize} \quad & \|x\| \\
{\rm subject\ to} \quad & Ax = b
\end{aligned}
$$
where $A \in \mathbf{R}^{m\times n}$, $b \in \mathbf{R}^m$, $x \in \mathbf{R}^n$, and $\|\cdot\|$ is a norm on $\mathbf{R}^n$.
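
For the Euclidean norm this problem has the well-known analytical solution $x = A^T(AA^T)^{-1}b$ (the pseudoinverse solution); a minimal numpy sketch, assuming $A$ has full row rank and using made-up data:

```python
# Sketch: least Euclidean-norm solution of the underdetermined system Ax = b,
# assuming A (m x n, m < n) has full row rank.
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 30
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

x_ln = A.T @ np.linalg.solve(A @ A.T, b)          # x = A^T (A A^T)^{-1} b
assert np.allclose(x_ln, np.linalg.pinv(A) @ b)   # same as pseudoinverse solution
print(np.linalg.norm(A @ x_ln - b))               # ~0: the constraint holds
```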

6.3 Regularized approximation

6.3.1 Bi-criterion formulation

In the basic form of regularized approximation, the goal is to find a vector $x$ that is small (if possible), and that also makes the residual $Ax - b$ small. This is naturally described as a (convex) vector optimization problem with two objectives, $\|Ax - b\|$ and $\|x\|$:
$$
{\rm minimize\ (w.r.t.\ }\mathbf{R}^2_+) \quad (\|Ax - b\|, \|x\|)
$$
The two norms can be different: the first, used to measure the size of the residual, is on $\mathbf{R}^m$; the second, used to measure the size of $x$, is on $\mathbf{R}^n$.

6.3.2 Regularization

Regularization is a common scalarization method used to solve the bi-criterion problem. One form of regularization is to minimize the weighted sum of the objectives:
$$
{\rm minimize} \quad \|Ax - b\| + \gamma \|x\|
$$
where $\gamma > 0$ is a problem parameter. As $\gamma$ varies over $(0, \infty)$, the solution of the regularized problem traces out the optimal trade-off curve.

Another common method of regularization, especially when the Euclidean norm is used, is to minimize the weighted sum of squared norms:
$$
{\rm minimize} \quad \|Ax - b\|^2 + \delta \|x\|^2
$$
for a variety of values of $\delta > 0$.
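
For the Euclidean norm, each value of $\delta$ gives a (Tikhonov-)regularized least-squares problem with the closed-form solution $x = (A^TA + \delta I)^{-1}A^Tb$; a short numpy sketch on made-up data:

```python
# Sketch: sweeping delta in ||Ax - b||_2^2 + delta * ||x||_2^2 and reporting
# the two objectives, which gives points on the trade-off curve.
import numpy as np

rng = np.random.default_rng(0)
m, n = 60, 20
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

for delta in [0.01, 0.1, 1.0, 10.0]:
    x = np.linalg.solve(A.T @ A + delta * np.eye(n), A.T @ b)
    print(f"delta = {delta:5.2f}  ||Ax-b|| = {np.linalg.norm(A @ x - b):.3f}"
          f"  ||x|| = {np.linalg.norm(x):.3f}")
```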

6.3.3 Reconstruction, smoothing, and de-noising

Quadratic smoothing

The simplest reconstruction method uses the quadratic smoothing function
$$
\phi_{\rm quad}(x) = \sum_{i=1}^{n-1} (x_{i+1} - x_i)^2
$$
We can obtain the optimal trade-off between $\|\hat{x} - x_{\rm cor}\|_2^2$ and $\|D\hat{x}\|_2^2$ by minimizing
$$
\|\hat{x} - x_{\rm cor}\|_2^2 + \delta \|D\hat{x}\|_2^2
$$
where $D \in \mathbf{R}^{(n-1)\times n}$ is the bidiagonal difference matrix with $(D\hat{x})_i = \hat{x}_{i+1} - \hat{x}_i$.
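
Since both terms are quadratic, the minimizer satisfies $(I + \delta D^TD)\hat{x} = x_{\rm cor}$; a numpy sketch with a made-up noisy signal:

```python
# Sketch: quadratic smoothing of a noisy signal. D is the (n-1) x n
# forward-difference matrix, and xhat = (I + delta * D^T D)^{-1} x_cor.
import numpy as np

rng = np.random.default_rng(0)
n = 200
t = np.linspace(0, 4 * np.pi, n)
x_cor = np.sin(t) + 0.2 * rng.standard_normal(n)   # corrupted signal

D = np.diff(np.eye(n), axis=0)                     # rows are e_{i+1} - e_i
delta = 10.0
x_hat = np.linalg.solve(np.eye(n) + delta * D.T @ D, x_cor)
```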

But any rapid variations in the original signal will, obviously, be attenuated or removed by quadratic smoothing.

Total variation reconstruction

The method is based on the smoothing function
$$
\phi_{\rm tv}(x) = \sum_{i=1}^{n-1} |x_{i+1} - x_i|
$$
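Because $\phi_{\rm tv}$ is an $\ell_1$-type penalty on the differences, it preserves sharp jumps much better than quadratic smoothing. A CVXPY sketch on a made-up piecewise-constant signal:

```python
# Sketch: total variation reconstruction, trading off fit against ||D xhat||_1.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n = 200
x_true = np.concatenate([np.zeros(n // 2), np.ones(n - n // 2)])  # step signal
x_cor = x_true + 0.1 * rng.standard_normal(n)

D = np.diff(np.eye(n), axis=0)           # forward differences
x_hat = cp.Variable(n)
delta = 1.0
cp.Problem(cp.Minimize(cp.sum_squares(x_hat - x_cor)
                       + delta * cp.norm(D @ x_hat, 1))).solve()
```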

6.4 Robust approximation

6.4.1 Stochastic robust approximation

We assume that $A$ is a random variable taking values in $\mathbf{R}^{m\times n}$ with mean $\bar{A}$, so we can write
$$
A = \bar{A} + U
$$
where $U$ is a random matrix with zero mean. The stochastic robust approximation problem (with the Euclidean norm) is
$$
{\rm minimize} \quad \mathbb{E}\,\|Ax - b\|_2^2
$$
The objective can be expressed as
$$
\|\bar{A}x - b\|_2^2 + \|P^{\frac{1}{2}}x\|_2^2
$$
where $P = \mathbb{E}[U^T U]$. The solution is
$$
x = (\bar{A}^T\bar{A} + P)^{-1}\bar{A}^T b
$$
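A minimal numpy sketch of this closed-form solution, assuming for concreteness that the entries of $U$ are i.i.d. zero-mean with variance $\sigma^2$ (so $P = m\sigma^2 I$); the data are made up:

```python
# Sketch: stochastic robust least squares, A = Abar + U with E U = 0.
# For i.i.d. zero-mean entries with variance sigma^2, P = E[U^T U] = m*sigma^2*I.
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 10
Abar = rng.standard_normal((m, n))
b = rng.standard_normal(m)
sigma = 0.5
P = m * sigma**2 * np.eye(n)

x_rob = np.linalg.solve(Abar.T @ Abar + P, Abar.T @ b)   # robust solution
x_ls = np.linalg.lstsq(Abar, b, rcond=None)[0]           # ignores uncertainty
print(np.linalg.norm(x_rob), np.linalg.norm(x_ls))       # x_rob is shrunk
```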

6.4.2 Worst-case robust approximation

The (worst-case) robust approximation problem is to minimize the worst-case error:
$$
{\rm minimize} \quad e_{\rm wc}(x) = \sup\{ \|Ax - b\| \mid A \in \mathcal{A} \}
$$
where $\mathcal{A}$ is the set of possible values of $A$. The robust approximation problem is always a convex optimization problem, but its tractability depends on the norm used and the description of the uncertainty set $\mathcal{A}$.
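
One tractable case is a finite uncertainty set $\mathcal{A} = \{A_1, \ldots, A_k\}$, where the worst-case error is a pointwise maximum of norms; a CVXPY sketch on made-up data:

```python
# Sketch: worst-case robust approximation over a finite set of matrices.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n, k = 20, 10, 5
As = [rng.standard_normal((m, n)) for _ in range(k)]
b = rng.standard_normal(m)

x = cp.Variable(n)
e_wc = cp.maximum(*[cp.norm(Ai @ x - b, 2) for Ai in As])   # worst-case error
cp.Problem(cp.Minimize(e_wc)).solve()
```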

6.5 Function fitting and interpolation

6.5.1 Function families

We consider a family of functions $f_1, \cdots, f_n : \mathbf{R}^k \to \mathbf{R}$, with common domain $\mathbf{dom}\ f_i = D$. With each $x \in \mathbf{R}^n$ we associate the function $f : \mathbf{R}^k \to \mathbf{R}$ given by
$$
f(u) = x_1 f_1(u) + \cdots + x_n f_n(u)
$$
The family $\{f_1, \cdots, f_n\}$ is sometimes called the set of basis functions (for the fitting problem), even when the functions are not independent. The vector $x \in \mathbf{R}^n$, which parametrizes the subspace of functions, is our optimization variable and is sometimes called the coefficient vector.
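
A concrete instance (not from the book): with the polynomial basis $f_i(u) = u^{i-1}$, fitting data $(u_j, y_j)$ in the least-squares sense is an ordinary linear least-squares problem in the coefficient vector $x$.

```python
# Sketch: least-squares fit over the span of the basis f_i(u) = u^(i-1).
import numpy as np

rng = np.random.default_rng(0)
u = np.linspace(-1, 1, 50)
y = np.sin(2 * u) + 0.05 * rng.standard_normal(50)   # made-up data

n = 6                                    # number of basis functions
F = np.vander(u, n, increasing=True)     # F[j, i] = f_{i+1}(u_j) = u_j**i
x, *_ = np.linalg.lstsq(F, y, rcond=None)
f_hat = F @ x                            # fitted values f(u_j)
```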

6.5.2 Constraints

Interpolation and inequalities

Interpolation conditions
$$
f(v_j) = z_j, \quad j = 1, \ldots, m,
$$
which require the function $f$ to have the values $z_j \in \mathbf{R}$ at specified points $v_j \in D$, form a set of linear equalities in $x$.

Derivative constraints

The gradient
$$
\nabla f(v) = \sum_{i=1}^{n} x_i \nabla f_i(v)
$$
is a linear function of $x$, so interpolation conditions on the gradient of $f$ at $v$ reduce to linear equality constraints on $x$. A bound on the gradient norm at $v$,
$$
\|\nabla f(v)\| = \left\|\sum_{i=1}^{n} x_i \nabla f_i(v)\right\| \leq M,
$$
is a convex constraint on $x$.

Integral constraints

Any linear functional of the form $\mathcal{L}(f) = \int_D \phi(u) f(u)\,du$ can be expressed as a linear function of $x$:
$$
\mathcal{L}(f) = c^T x, \qquad c_i = \int_D \phi(u) f_i(u)\, du,
$$
so constraints of the form $\mathcal{L}(f) = a$ are linear equality constraints on $x$.

6.5.3 Fitting and interpolation problems

6.5.4 Sparse descriptions and basis pursuit

In basis pursuit, there is a very large number of basis functions, and the goal is to find a good fit of the given data as a linear combination of a small number of the basis functions.

Thus we seek a function $f \in \mathcal{F}$ that fits the data well,
$$
f(u_i) \approx y_i, \quad i = 1, \ldots, m,
$$
with a sparse coefficient vector $x$, i.e., $\mathbf{card}(x)$ small, so that
$$
f = \sum_{i \in \mathcal{B}} x_i f_i
$$
where $\mathcal{B} = \{i \mid x_i \neq 0\}$ is the (small) set of basis functions actually used.
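
Finding a coefficient vector with smallest $\mathbf{card}(x)$ is combinatorial; a widely used convex heuristic (sketched below on made-up data) replaces the cardinality by the $\ell_1$ norm and solves an $\ell_1$-regularized least-squares problem:

```python
# Sketch: l1-norm heuristic for basis pursuit; the regularization encourages
# a sparse coefficient vector x.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n = 40, 200                           # far more basis functions than data
F = rng.standard_normal((m, n))          # F[j, i] = f_i(u_j)
y = F[:, :5] @ rng.standard_normal(5)    # data explained by 5 basis functions

x = cp.Variable(n)
gamma = 0.1
cp.Problem(cp.Minimize(cp.sum_squares(F @ x - y) + gamma * cp.norm(x, 1))).solve()
print(int(np.sum(np.abs(x.value) > 1e-4)))   # rough card(x) after thresholding
```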

6.5.5 Interpolation with convex functions

There exists a convex function $f : \mathbf{R}^k \to \mathbf{R}$, with $\mathbf{dom}\ f = \mathbf{R}^k$, that satisfies the interpolation conditions
$$
f(u_i) = y_i, \quad i = 1, \ldots, m,
$$
if and only if there exist $g_1, \ldots, g_m$ such that
$$
y_j \geq y_i + g_i^T(u_j - u_i), \quad i, j = 1, \ldots, m.
$$
For example, one such function is $f(z) = \max_{i=1,\ldots,m}\left(y_i + g_i^T(z - u_i)\right)$.
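
The condition above is a set of linear inequalities in the unknowns $g_1, \ldots, g_m$, so checking whether convex interpolation is possible is a feasibility problem; a CVXPY sketch with made-up data sampled from a convex function:

```python
# Sketch: convex-interpolation feasibility: find g_1, ..., g_m satisfying
# y_j >= y_i + g_i^T (u_j - u_i) for all i, j.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, k = 15, 2
U = rng.standard_normal((m, k))          # points u_i
y = np.sum(U**2, axis=1)                 # y_i = ||u_i||^2, a convex function

G = cp.Variable((m, k))                  # row i is g_i
cons = [float(y[j]) >= float(y[i]) + G[i, :] @ (U[j] - U[i])
        for i in range(m) for j in range(m) if i != j]
prob = cp.Problem(cp.Minimize(0), cons)
prob.solve()
print(prob.status)                       # "optimal" => interpolation is possible
```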
