CVX notes


Preliminaries

1. PSD

\(M\) is a positive semidefinite (PSD) matrix \(\iff\) every principal submatrix \(P\) of \(M\) is PSD

Note: the forward direction follows by restricting the quadratic form \(x^T Mx\) to the components of \(x\) indexed by the principal submatrix (set the remaining components to zero). The converse is trivial, since \(M\) is itself a principal submatrix of \(M\).

\(M\) is PSD \(\iff\) all principal minors of \(M\) are non-negative

Write \(M\) as a quadratic form:
\[ x^T M x = \sum_{i,j}M_{ij}x_ix_j \]
Taking \(x\) to be a standard basis vector \(e_i\) gives \(M_{ii} \ge 0\), hence \(\mathbf{tr}(M) \ge 0\). Next take \(x\) to be zero except at positions \(i\) and \(j\); then the \(2\times 2\) principal submatrix \(\begin{pmatrix} M_{ii} & M_{ij} \\ M_{ij} & M_{jj}\end{pmatrix}\) must be PSD, so its determinant (a principal minor) is non-negative:
\[ \begin{gathered} M_{ii}M_{jj} - M_{ij}^2 \ge 0 ~~(\text{PSD}) \\ \implies |M_{ij}| \le \sqrt{M_{ii}M_{jj}} \le \frac{M_{ii} + M_{jj}}{2} \end{gathered} \]
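
As a sanity check, here is a small NumPy sketch that verifies both characterizations (non-negative eigenvalues and non-negative principal minors) on an illustrative matrix \(M = B^TB\); the matrix is made up for the example.

```python
import numpy as np
from itertools import combinations

# An illustrative symmetric PSD matrix (M = B^T B is always PSD).
B = np.array([[1.0, 2.0], [0.0, 3.0], [1.0, -1.0]])
M = B.T @ B

# Characterization 1: all eigenvalues are non-negative.
print(np.linalg.eigvalsh(M) >= -1e-10)

# Characterization 2: every principal minor is non-negative.
n = M.shape[0]
for k in range(1, n + 1):
    for idx in combinations(range(n), k):
        sub = M[np.ix_(idx, idx)]             # principal submatrix
        assert np.linalg.det(sub) >= -1e-10   # principal minor

# Consequences derived above: M_ii >= 0 and |M_ij| <= sqrt(M_ii M_jj).
i, j = 0, 1
print(M[i, i] >= 0, abs(M[i, j]) <= np.sqrt(M[i, i] * M[j, j]) + 1e-12)
```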


2. Matrix norm

General definition of a norm: a function \(\|\cdot\|\) satisfying (i) \(\|x\| \ge 0\), with equality iff \(x = 0\); (ii) absolute homogeneity \(\|\alpha x\| = |\alpha|\,\|x\|\); (iii) the triangle inequality \(\|x+y\| \le \|x\| + \|y\|\).

Matrix norm: a norm on matrices; matrix norms are often additionally required to be submultiplicative, \(\|AB\| \le \|A\|\,\|B\|\).

  • Frobenius norm: \(\|A\|_F := \sqrt{\langle A,A\rangle_F} = \sqrt{\mathbf{tr}(A^*A)}\)
  • Induced norm: \(\|A\|_p := \sup\limits_{\|x\|_p = 1} \|Ax\|_p\)
  • Nuclear norm: \(\|A\|_{nuclear} := \sum_i \sigma_i(A)\) (sum of the singular values)
  • Spectral norm: \(\|A\|_{spectral} := \sigma_{\max}(A)\) (the largest singular value; it equals the induced 2-norm, and the largest eigenvalue when \(A\) is symmetric PSD)
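
The norms above can be computed directly with NumPy; the matrix `A` below is just an illustrative example.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
s = np.linalg.svd(A, compute_uv=False)                        # singular values

print(np.linalg.norm(A, 'fro'), np.sqrt(np.trace(A.T @ A)))   # Frobenius norm
print(np.linalg.norm(A, 2), s[0])                             # spectral norm = largest singular value
print(np.linalg.norm(A, 'nuc'), s.sum())                      # nuclear norm = sum of singular values
print(np.linalg.norm(A, 1), np.abs(A).sum(axis=0).max())      # induced 1-norm = max column sum
```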

Spectral radius

The spectral radius of \(A\) is \(\rho(A) := \max_i |\lambda_i(A)|\). For any submultiplicative matrix norm, \(\rho(A) \le \|A\|\).


3. Duality

Two ==equivalent== ways to represent a convex set:

  • standard representation: The family of points in the set
  • dual representation: The set of halfspaces containing the set (the set is the intersection of these halfspaces)

A closed convex set \(S\) is the intersection of all closed halfspaces \(H\) containing it.


Polar
Let \(S \subseteq \mathbb{R}^n\) be a convex set containing the origin. The polar of \(S\) is defined as follows
\[ S^{\circ} := \{y ~|~ y^Tx \le 1, ~\forall x \in S\} \]

Note

  • the polar is one way of representing all the halfspaces containing a convex set
  • since \(0 \in S\), every halfspace \(\{x ~|~ a^Tx \le b\}\) containing \(S\) has \(b \ge 0\); when \(b > 0\) it can be written as a "normalized" inequality \(y^T x \le 1\) by dividing through by \(b\)
  • \(S^{\circ}\) can be thought of as the set of normalized representations of halfspaces containing \(S\)

Properties of the polar:

  1. \(S^{\circ\circ} = S\) when \(S\) is closed (in general \(S^{\circ\circ} = \mathbf{cl}(S)\))
  2. \(S^{\circ}\) is a closed convex set containing the origin
  3. When 0 is in the interior of \(S\), then \(S^{\circ}\) is bounded
  4. When \(S\) is non-convex, \(S^{\circ} = (\mathbf{conv}(S))^{\circ}\), and \(S^{\circ\circ} = \mathbf{conv}(S)\)
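
A small numeric sketch of the polar: for the unit \(\ell_1\) ball, \(\sup_{\|x\|_1 \le 1} y^Tx = \|y\|_\infty\), so its polar is the unit \(\ell_\infty\) ball. The test vectors, sampling scheme, and tolerances below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def in_polar_of_l1_ball(y, n_samples=10000):
    # y is in the polar iff sup_{||x||_1 <= 1} y^T x <= 1.
    # The sup is attained at a vertex +/- e_i, so it equals ||y||_inf;
    # we also confirm this by random sampling on the l1 sphere.
    x = rng.standard_normal((n_samples, y.size))
    x /= np.abs(x).sum(axis=1, keepdims=True)      # normalize to the l1 sphere
    sampled_sup = (x @ y).max()
    return np.linalg.norm(y, np.inf) <= 1, sampled_sup

print(in_polar_of_l1_ball(np.array([0.9, -0.7, 0.3])))   # True: inside the l_inf ball
print(in_polar_of_l1_ball(np.array([1.4, 0.0, 0.0])))    # False: outside
```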


Polar duality of convex cones: for a cone \(K \subseteq \mathbb{R}^n\), the polar cone is \(K^{\circ} := \{y ~|~ y^Tx \le 0, ~\forall x \in K\}\) (for cones the normalization \(y^Tx \le 1\) collapses to \(y^Tx \le 0\)).


Notes

  1. \(K^{\circ\circ} = K\) (when \(K\) is a closed convex cone)
  2. \(K^{\circ}\) is closed and convex

Conjugation of convex functions
Let \(f: \mathbb{R}^n \to \mathbb{R}\cup\{\infty\}\) be a convex function. The ==conjugate== of \(f\) is
\[ f^*(y) := \sup\limits_{x}(y^Tx - f(x)) \]

Properties of the conjugate

  1. \(f^{**} = f\) (when \(f\) is closed, i.e., lower semicontinuous; in general \(f^{**}\) is the closed convex envelope of \(f\))
  2. \(f^*\) is convex (supremum of affine functions of \(y\))
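
A quick numeric sketch of the conjugate: approximate the sup on a grid and compare with the closed form for \(f(x) = \tfrac{1}{2}x^2\), which is self-conjugate. The grid range and test points are arbitrary illustrative choices.

```python
import numpy as np

xs = np.linspace(-10, 10, 20001)          # grid approximating the sup over x

def conjugate(f, y):
    """Numerically approximate f*(y) = sup_x (y*x - f(x)) on a grid."""
    return np.max(y * xs - f(xs))

f = lambda x: 0.5 * x**2                  # f(x) = x^2 / 2 is self-conjugate
for y in (-2.0, 0.5, 3.0):
    print(conjugate(f, y), 0.5 * y**2)    # numeric sup vs. closed form f*(y) = y^2 / 2
```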



Convex sets


Convex functions


  1. affine is convex: \(f(x) = a^T x+b\)

    affine functions are both convex and concave

  2. every _norm_ is convex

    Proof: let \(\pi(x)\) be a norm; then for \(\lambda \in [0,1]\), \(\pi(\lambda x + (1-\lambda)y) \le \pi(\lambda x) + \pi((1-\lambda)y) = \lambda\pi(x) + (1-\lambda)\pi(y)\), by the triangle inequality and absolute homogeneity.

  3. \(f\) is convex \(\iff\) epi(\(f\)) is convex


1. Closed convex

A convex function \(f\) is called closed if its epigraph is a closed set.

  1. \(f\) which is convex and continuous on a closed domain is a closed function (e.g., norms).
  2. all differentiable convex functions are closed with dom\(f = \mathbb{R}^n\).
  3. when working with a convex function, it is customary to set its value to \(\infty\) outside dom\(f\)
  4. Jensen's inequality: for \(\alpha_i \ge 0\), \(\sum_i \alpha_i = 1\),
     \[ f\Big(\sum_i \alpha_i x_i\Big) \le \sum_i \alpha_i f(x_i) \]
  5. Corollary: on \(\mathbf{conv}\{x_1,\dots,x_k\}\), a convex \(f\) is bounded above by \(\max_i f(x_i)\) (a numeric check follows this list)

    pf: \(f(x) = f(\sum\alpha_i x_i) \le \sum \alpha_i f(x_i) \le \max\limits_{i} f(x_i)\)
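
A minimal numeric check of Jensen's inequality and the corollary, using \(f(x) = e^x\) and random convex weights (both chosen purely for illustration).

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.exp(x)                      # a convex function

alpha = rng.random(5); alpha /= alpha.sum()  # convex weights (non-negative, sum to 1)
x = rng.standard_normal(5)

lhs = f(alpha @ x)                           # f(sum_i alpha_i x_i)
rhs = alpha @ f(x)                           # sum_i alpha_i f(x_i)
print(lhs <= rhs, rhs <= f(x).max())         # Jensen's inequality, then the corollary bound
```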


2. Level sets

For a convex function \(f\), every sublevel set \(\{x \in \mathbf{dom}f ~|~ f(x) \le \alpha\}\) is convex.

Note: the convexity of level sets does not characterize convex functions, but quasiconvex functions.

  1. convex \(f\) is closed \(\implies\) all its level sets are closed

Some convex sets

  1. norm ball (\(\{x\in \mathbb{R}^n | \|x\| \le 1\}\)) is convex and closed
  2. ellipsoid (\(\{x ~|~ (x-a)^T Q (x-a) \le r^2\}\), \(Q \succeq 0\)) is convex and closed

    pf: \(x^TQy := \langle x, y \rangle\) satisfies the three defining properties of an inner product
    • bilinearity
    • symmetry
    • positivity
      the three properties above hold \(\iff\) \(Q\) is positive definite; for merely PSD \(Q\) the set is still convex and closed, being a sublevel set of the convex quadratic \((x-a)^TQ(x-a)\) (a small numeric check follows this list)
  3. \(\epsilon\)-neighborhood of a convex set is convex and closed
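
A small sketch checking the ellipsoid claim numerically: sample points of an illustrative ellipsoid and verify that convex combinations stay inside; `Q`, `a`, and `r` are made-up values.

```python
import numpy as np

rng = np.random.default_rng(2)

# An illustrative ellipsoid {x | (x-a)^T Q (x-a) <= r^2} with Q positive definite.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
a = np.array([1.0, -1.0])
r = 1.5

# Rejection-sample member points from a bounding box around the center.
pts = a + rng.uniform(-2.0, 2.0, size=(20000, 2))
d = pts - a
vals = np.einsum('ij,jk,ik->i', d, Q, d)      # (x-a)^T Q (x-a) for each sample
pts = pts[vals <= r**2]

# Convex combinations of member points must stay inside the ellipsoid.
for _ in range(1000):
    x, y = pts[rng.integers(len(pts), size=2)]
    lam = rng.random()
    z = lam * x + (1 - lam) * y
    assert (z - a) @ Q @ (z - a) <= r**2 + 1e-9
print("segment checks passed,", len(pts), "sampled members")
```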


3. Operations preserving convexity of functions

  1. stability under taking weighted sums: \(f,g \mapsto \lambda f + \mu g, \; \lambda,\mu \ge 0\)
  2. stability under affine substitutions of the argument: \(x \mapsto Ax+b\) or \(f(x) \mapsto \phi(x) = f(Ax+b)\)
  3. stability under taking pointwise sup: \(\{f_i\}_{i \in \mathcal{I}} \mapsto g(x) := \sup\limits_{i \in \mathcal{I}}f_i(x)\); the pointwise supremum of a family of convex functions \(\{f_i\}_{i \in \mathcal{I}}\) is convex (see the sketch after this list)
  4. stability under partial minimization: \(f(x,y)\) jointly convex in \((x,y)\), then \(g(x) = \inf\limits_{y} f(x,y)\) is convex (suppose \(g\) is proper, i.e., \(> -\infty\) everywhere and finite at at least one point)
  5. stability under perspective: \(f(x) \mapsto g(x,t) = tf(x/t), \mathbf{dom}g = \{(x,t) | x/t \in \mathbf{dom}f, t > 0\}\)
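
Sketch for the pointwise-sup rule: \(g(x) = \max_i (a_ix + b_i)\) is a supremum of affine (hence convex) functions, and random midpoint checks confirm its convexity; the coefficients are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(3)

# g(x) = max_i (a_i * x + b_i): pointwise sup of affine functions.
a = rng.standard_normal(6)
b = rng.standard_normal(6)
g = lambda x: np.max(a * x + b)

# Convexity check on random pairs and random weights.
for _ in range(1000):
    x, y = rng.standard_normal(2)
    lam = rng.random()
    assert g(lam * x + (1 - lam) * y) <= lam * g(x) + (1 - lam) * g(y) + 1e-9
print("pointwise max of affine functions passes convexity checks")
```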

4. Detect convexity

Necessary and Sufficient Convexity Condition for smooth function:

  • a once-differentiable smooth \(f\) is convex \(\iff\) \(f'(x)\) is monotone non-decreasing (in \(\mathbb{R}^n\): \(\nabla f\) is a monotone mapping)
  • a twice-differentiable smooth \(f\) is convex \(\iff\) \(f''(x) \ge 0\) (in \(\mathbb{R}^n\): \(\nabla^2 f(x) \succeq 0\))

The subgradient property is characteristic of convex functions: \(g\) is a subgradient of \(f\) at \(x\) if \(f(y) \ge f(x) + g^T(y-x)\) for all \(y\). A function that has a subgradient at every point of its domain is convex, and a convex function has a subgradient at every point in the relative interior of its domain.
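
A rough numeric convexity detector along the lines above, applied to the softplus function \(f(x) = \log(1+e^x)\) (an assumed example): finite-difference second derivatives should be non-negative, and the derivative should be monotone.

```python
import numpy as np

# Smooth test function f(x) = log(1 + exp(x)) (softplus), known to be convex.
f  = lambda x: np.logaddexp(0.0, x)
d2 = lambda x, h=1e-4: (f(x + h) - 2 * f(x) + f(x - h)) / h**2   # finite-difference f''
d1 = lambda x, h=1e-6: (f(x + h) - f(x - h)) / (2 * h)           # finite-difference f'

xs = np.linspace(-5, 5, 101)
print(np.all(d2(xs) >= -1e-6))        # second-order condition: f'' >= 0 on the sampled grid

rng = np.random.default_rng(4)
x, y = rng.standard_normal(1000), rng.standard_normal(1000)
print(np.all((d1(x) - d1(y)) * (x - y) >= -1e-8))   # first-order condition: f' is monotone
```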


5. Subgradient

Examples

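A standard worked example is the \(\ell_1\) norm: \(g = \mathrm{sign}(x)\), with any value in \([-1,1]\) at zero coordinates, is a subgradient of \(\|x\|_1\). The sketch below (test points and tolerances are illustrative) verifies the subgradient inequality at random points.

```python
import numpy as np

rng = np.random.default_rng(5)

# f(x) = ||x||_1.  A subgradient at x: g_i = sign(x_i) for x_i != 0,
# and any g_i in [-1, 1] where x_i = 0.
def subgrad_l1(x):
    g = np.sign(x)
    g[x == 0] = rng.uniform(-1, 1, size=np.count_nonzero(x == 0))
    return g

# Verify the subgradient inequality f(y) >= f(x) + g^T (y - x) on random pairs.
for _ in range(1000):
    x = rng.standard_normal(4) * (rng.random(4) > 0.3)   # force some exact zeros
    y = rng.standard_normal(4)
    g = subgrad_l1(x)
    assert np.abs(y).sum() >= np.abs(x).sum() + g @ (y - x) - 1e-9
print("subgradient inequality holds for the l1 norm")
```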

6. Optimality conditions

For a convex function, every local minimum is also a global minimum.

First-order optimality condition (convex \(f\))
\(x^* \in \mathbf{dom}f\) is a minimizer \(\iff\) \(0 \in \partial f(x^*)\)
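
A tiny illustration of \(0 \in \partial f(x^*)\) on the made-up function \(f(x) = |x| + \tfrac{1}{2}(x-1)^2\): \(\partial f(0) = [-1,1] + (0-1) = [-2,0] \ni 0\), so \(x^* = 0\), which a grid search confirms.

```python
import numpy as np

# f(x) = |x| + (x - 1)^2 / 2.  Subdifferential: sign(x) + (x - 1) for x != 0,
# and [-1, 1] - 1 = [-2, 0] at x = 0, which contains 0  =>  x* = 0.
f = lambda x: np.abs(x) + 0.5 * (x - 1) ** 2

xs = np.linspace(-2, 2, 400001)
print(xs[np.argmin(f(xs))])     # ~0.0: the grid minimizer matches the subgradient condition
```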


7. Strong convexity

A differentiable function \(f\) is \(\mu\)-strongly convex (\(\mu > 0\)) if
\[ f(y) \ge f(x) + \nabla f(x)^T(y-x) + \frac{\mu}{2} \|y-x\|^2 \]

Note

  1. \(f\) is not necessarily differentiable, (see the equivalent definition)
  2. if \(f\) is non-smooth, replace the gradient with a subgradient
  3. strong convexity \(\implies\) strict convexity

Note: Intuitively speaking, strong convexity means that there exists a quadratic lower bound on the growth of the function.

Equivalent definition
\[ \begin{align} &(i)~f(y)\ge f(x)+\nabla f(x)^T(y-x)+\frac{\mu}{2}\lVert y-x \rVert^2,~\forall x, y.\\ &(ii)~g(x) = f(x)-\frac{\mu}{2}\lVert x \rVert^2~\text{is convex},~\forall x.\\ &(iii)~\langle \nabla f(x) - \nabla f(y),x-y \rangle \ge \mu \lVert x-y\rVert^2,~\forall x, y.\\ &(iv)~f(\alpha x+ (1-\alpha) y) \le \alpha f(x) + (1-\alpha) f(y) - \frac{\alpha (1-\alpha)\mu}{2}\Vert x-y\rVert^2,~\alpha \in [0,1].\\ &(v)~\nabla^2 f(x) \succeq \mu \boldsymbol{I} \end{align} \]
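
A numeric check of equivalent definitions (i) and (iii) on the illustrative quadratic \(f(x) = \tfrac{1}{2}x^TAx\) with \(A \succ 0\), for which \(\mu = \lambda_{\min}(A)\); the matrix and test points are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(6)

# f(x) = 1/2 x^T A x with A symmetric positive definite is mu-strongly convex
# with mu = lambda_min(A); its gradient is A x.
B = rng.standard_normal((4, 4))
A = B @ B.T + 0.5 * np.eye(4)
mu = np.linalg.eigvalsh(A).min()

f    = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

for _ in range(1000):
    x, y = rng.standard_normal(4), rng.standard_normal(4)
    # (i)  quadratic lower bound
    assert f(y) >= f(x) + grad(x) @ (y - x) + 0.5 * mu * np.sum((y - x) ** 2) - 1e-9
    # (iii) strong monotonicity of the gradient map
    assert (grad(x) - grad(y)) @ (x - y) >= mu * np.sum((x - y) ** 2) - 1e-9
print("strong-convexity inequalities hold with mu =", round(mu, 4))
```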


Lagrange Duality

Consider an optimization problem in standard form (not necessarily convex)
\[ \begin{array}{ll} \underset{x}{\text{minimize}} & f_0(x) \\ \text{subject to} & f_i(x) \le 0, ~i=1,\cdots,m \\ ~ & h_i(x) = 0, ~i=1,\cdots,p \end{array} \]

The Lagrangian is
\[ L(x,\boldsymbol{\lambda},\boldsymbol{\mu}) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p \mu_i h_i(x) \]

The Lagrange dual function is defined as
\[ g(\lambda, \mu) = \inf_{x} L(x,\lambda,\mu) \]

The dual function \(g\) is concave (a pointwise infimum of affine functions of \((\lambda,\mu)\)), and for any \(\lambda \succeq 0\) it lower-bounds the primal optimum: \(g(\lambda,\mu) \le p^*\).

Lagrange dual problem
\[ \begin{array}{ll} \underset{\lambda, \mu}{\text{maximize}} & g(\lambda, \mu) \\ \text{subject to} & \boldsymbol{\lambda} \succeq \mathbf{0} \end{array} \]

Weak duality
\[ d^* \le p^* \]

  • \(d^*\): optimal value of the dual problem
  • \(p^*\): optimal value of the primal problem
  • duality gap: \(p^* - d^*\)
  • always holds

Strong duality
\[ d^* = p^* \]

  • constraint qualifications \(\implies\) strong duality
  • Slater’s Constraint Qualification: the convex problem is strictly feasible, i.e., \(\exists~ x \in \mathbf{int}\,\mathcal{D}\) with \(f_i(x) < 0\) for all \(i\) and \(h_i(x) = 0\)
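
A worked toy example of weak/strong duality (the problem is made up): minimize \(x^2\) subject to \(1 - x \le 0\). The dual function is \(g(\lambda) = \lambda - \lambda^2/4\), Slater's condition holds (e.g., \(x = 2\) is strictly feasible), and indeed \(d^* = p^* = 1\).

```python
import numpy as np

# Primal: minimize x^2 subject to 1 - x <= 0.  Optimal p* = 1 at x = 1.
# Lagrangian: L(x, lam) = x^2 + lam * (1 - x); inf over x attained at x = lam / 2.
# Dual function: g(lam) = lam - lam^2 / 4, maximized over lam >= 0 at lam = 2.
lam = np.linspace(0, 6, 601)
g = lam - lam**2 / 4

d_star = g.max()              # dual optimal value
p_star = 1.0                  # primal optimal value
print(d_star, p_star)         # 1.0, 1.0: weak duality d* <= p* holds with equality here
```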

Complementary slackness
When strong duality holds and \((x^*, \lambda^*, \mu^*)\) are primal/dual optimal,
\[ \lambda_i^* f_i(x^*) = 0, \quad i = 1,\dots,m \]
i.e., \(\lambda_i^* > 0 \implies f_i(x^*) = 0\) and \(f_i(x^*) < 0 \implies \lambda_i^* = 0\).

KKT conditions
For differentiable \(f_i, h_i\), a primal/dual optimal pair with zero duality gap satisfies
\[ \begin{gathered} f_i(x^*) \le 0, \quad h_i(x^*) = 0, \quad \lambda_i^* \ge 0, \quad \lambda_i^* f_i(x^*) = 0, \\ \nabla f_0(x^*) + \sum_{i=1}^m \lambda_i^* \nabla f_i(x^*) + \sum_{i=1}^p \mu_i^* \nabla h_i(x^*) = 0 \end{gathered} \]
For convex problems, the KKT conditions are also sufficient for optimality.


Cones

Tangent cone
Let \(M\) be a (nonempty) convex set and \(x^* \in M\); the tangent cone of \(M\) at \(x^*\) is the cone
\[ \begin{split} T_M(x^*) &= \{h \in \mathbb{R}^n ~|~ x^* + th \in M ~\text{for some}~ t > 0 \} \\ &= \{\lambda(y - x^*) ~|~ y \in M, ~\lambda \ge 0\} \end{split} \]

Note:

  • Geometrically, this is the set of all directions leading from \(x^*\) inside \(M\)
  • convex but not necessarily closed
  • fact: if \(x^*\) is a minimizer, then \(h^T \nabla f(x^*) \ge 0\) for every \(h \in T_M(x^*)\) (every direction in the tangent cone is feasible, so none of them may be a descent direction)
  • \(T_M(x^*) = \mathbb{R}^n \iff x^* \in \mathbf{int}M\)

e.g. polyhedron
\[ M = \{x | Ax \le b\} = \{x | a_i^Tx \le b_i, \; i = 1,\dots,m\} \]
the tangent cone at \(x^*\) is
\[ T_M(x^*) = \{h~|~a_i^T h \le 0, ~\forall i, ~a_i^T x^* = b_i\} \]
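
A small sketch of this tangent-cone formula for an illustrative polyhedron (the unit box): collect the constraints active at \(x^*\) and test whether a direction satisfies \(a_i^Th \le 0\) for all of them.

```python
import numpy as np

# Polyhedron M = {x | Ax <= b}: the unit box [0, 1]^2 written with 4 inequalities.
A = np.array([[ 1.0,  0.0], [ 0.0,  1.0], [-1.0,  0.0], [ 0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0, 0.0])

def tangent_cone_contains(x_star, h, tol=1e-9):
    active = np.abs(A @ x_star - b) <= tol           # constraints tight at x*
    return bool(np.all(A[active] @ h <= tol))        # h must not violate active constraints

x_star = np.array([1.0, 0.0])                        # a corner of the box
print(tangent_cone_contains(x_star, np.array([-1.0, 1.0])))   # True: points back into the box
print(tangent_cone_contains(x_star, np.array([ 1.0, 0.0])))   # False: leaves through x1 <= 1
```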


Normal cone: the polar cone of the tangent cone
\[ N_M(x^*) = \{g \in \mathbb{R}^n ~|~ \langle g, y-x^*\rangle \le 0, ~\forall y \in M\} \]

Note:

  • the normal cone is the polar of the tangent cone, i.e.,
    \[ N_M(x^*) = (T_M(x^*))^{\circ} = \{g \in \mathbb{R}^n ~|~ \langle g, h\rangle \le 0, ~\forall h \in T_M(x^*)\} = \{g \in \mathbb{R}^n ~|~ \langle g, y-x^*\rangle \le 0, ~\forall y \in M\} \]
  • fact: if \(x^*\) is a minimizer, then \(-\nabla f(x^*) \in N_M(x^*)\).
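
A quick check of the normal-cone optimality fact on a made-up problem: minimize \(\tfrac{1}{2}\|x-c\|^2\) over the unit box, whose minimizer is the projection of \(c\); then \(-\nabla f(x^*)\) should have non-positive inner product with every feasible direction \(y - x^*\).

```python
import numpy as np

rng = np.random.default_rng(7)

# M = [0, 1]^2, f(x) = 1/2 ||x - c||^2 with c outside the box.
c = np.array([1.5, -0.3])
x_star = np.clip(c, 0.0, 1.0)            # the projection of c onto M is the minimizer
g = x_star - c                           # gradient of f at x*

# Normal-cone test: <-grad f(x*), y - x*> <= 0 for every y in M.
ys = rng.random((5000, 2))               # uniform samples of the box
print(np.all((ys - x_star) @ (-g) <= 1e-12))
```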

Algorithm convergence

| | Stepsize Rule | Convergence Rate | Iteration Complexity |
| --- | --- | --- | --- |
| **Gradient descent** | | | |
| strongly convex & smooth | \(\eta_t = \frac{2}{\mu + L}\) | \(O\left(\left(\frac{\kappa -1}{\kappa +1}\right)^t\right)\) | \(O\left(\frac{\log\frac{1}{\epsilon}}{\log\frac{\kappa+1}{\kappa-1}}\right)\) |
| convex & smooth | \(\eta_t = \frac{1}{L}\) | \(O(\frac{1}{t})\) | \(O(\frac{1}{\epsilon})\) |
| **Frank-Wolfe** | | | |
| (strongly) convex & smooth | \(\eta_t = \frac{1}{t}\) | \(O(\frac{1}{t})\) | \(O(\frac{1}{\epsilon})\) |
| **Projected GD** | | | |
| convex & smooth | \(\eta_t = \frac{1}{L}\) | \(O(\frac{1}{t})\) | \(O(\frac{1}{\epsilon})\) |
| strongly convex & smooth | \(\eta_t = \frac{1}{L}\) | \(O\left((1-\frac{1}{\kappa})^t\right)\) | \(O(\kappa\log\frac{1}{\epsilon})\) |
| **Subgradient method** | | | |
| convex & Lipschitz | \(\eta_t = \frac{1}{\sqrt{t}}\) | \(O(\frac{1}{\sqrt{t}})\) | \(O(\frac{1}{\epsilon^2})\) |
| strongly convex & Lipschitz | \(\eta_t = \frac{1}{t}\) | \(O(\frac{1}{t})\) | \(O(\frac{1}{\epsilon})\) |
| **Proximal GD** | | | |
| convex & smooth (w.r.t. \(f\)) | \(\eta_t = \frac{1}{L}\) | \(O(\frac{1}{t})\) | \(O(\frac{1}{\epsilon})\) |
| strongly convex & smooth (w.r.t. \(f\)) | \(\eta_t = \frac{1}{L}\) | \(O\left((1-\frac{1}{\kappa})^t\right)\) | \(O(\kappa\log\frac{1}{\epsilon})\) |
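
A small sketch of the first row of the table: gradient descent with \(\eta_t = \frac{2}{\mu+L}\) on a made-up strongly convex quadratic contracts at roughly the predicted rate \(\frac{\kappa-1}{\kappa+1}\) per iteration.

```python
import numpy as np

rng = np.random.default_rng(8)

# f(x) = 1/2 x^T A x: strongly convex & smooth with
# mu = lambda_min(A), L = lambda_max(A), kappa = L / mu, minimizer x* = 0.
B = rng.standard_normal((10, 10))
A = B @ B.T + np.eye(10)
eig = np.linalg.eigvalsh(A)
mu, L = eig.min(), eig.max()
eta = 2.0 / (mu + L)                      # stepsize from the table
rate = (L - mu) / (L + mu)                # predicted contraction (kappa-1)/(kappa+1)

x = rng.standard_normal(10)
for t in range(200):
    x = x - eta * (A @ x)                 # gradient step
print(np.linalg.norm(x), rate**200)       # observed error vs. predicted bound (same order)
```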

Reposted from: https://www.cnblogs.com/yychi/p/9398439.html
