[LA] Different convexity

Article link: https://blog.csdn.net/COMEYAN/article/details/50541596

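This article introduces the basic concepts of convex optimization, including the definitions and properties of convex and strictly convex functions. It explains the first and second order conditions for checking convexity in detail, and then discusses strongly convex functions and their properties.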

Convex and strictly convex
Strong convex

1. Convex and strictly convex

Two commonly used notions of convexity are convex and strictly convex. Their definitions are as follows.

Definition 1: [convex]: $f(x)$ is said to be convex if the following holds $\forall x,y$ and $\forall \lambda\in[0,1]$

$f(\lambda x+(1-\lambda)y)\leq \lambda f(x)+(1-\lambda)f(y)$

Definition 2: [strictly convex]: $f(x)$ is said to be strictly convex if the following holds $\forall x\neq y$ and $\forall \lambda\in(0,1)$

$f(\lambda x+(1-\lambda)y)< \lambda f(x)+(1-\lambda)f(y)$
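
The convexity inequality can be probed numerically. The sketch below is only an illustration: the helper name `violates_convexity`, the sampling interval and the tolerance are assumptions made here, not part of the original post. A returned triple disproves convexity on the interval, while `None` is only evidence and not a proof.

```python
import random

def violates_convexity(f, lo, hi, trials=2000, tol=1e-9, seed=0):
    """Search [lo, hi] for a sample that breaks the convexity inequality.

    Returns a triple (x, y, lam) with
        f(lam*x + (1-lam)*y) > lam*f(x) + (1-lam)*f(y) + tol,
    or None if no sampled triple violates it.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(lo, hi)
        y = rng.uniform(lo, hi)
        lam = rng.uniform(0.0, 1.0)
        lhs = f(lam * x + (1 - lam) * y)
        rhs = lam * f(x) + (1 - lam) * f(y)
        if lhs > rhs + tol:
            return (x, y, lam)
    return None

# x^2 is strictly convex, |x| is convex but not strictly, -x^2 is concave.
print(violates_convexity(lambda x: x * x, -5.0, 5.0))
print(violates_convexity(abs, -5.0, 5.0))
print(violates_convexity(lambda x: -x * x, -5.0, 5.0))
```

The small tolerance only absorbs floating point rounding; it is not part of the definition.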

Convexity also has several equivalent characterizations:

Theorem 3. [first order condition (1)]: If $f(x)$ is differentiable, then $f(x)$ is convex iff $\forall x,y$

$f(y) \geq f(x)+\nabla f(x)\cdot (y-x)$
The same equivalence holds for strict convexity with $>$ in place of $\geq$ (for $x\neq y$).

proof:
necessary: If $f(x)$ is convex, then for $\lambda\in(0,1]$ the definition can be rearranged as

$\begin{align} f(x)&\geq \frac{f(\lambda x+(1-\lambda)y ) - (1-\lambda)f(y)}{\lambda}\\ &=f(y) + \frac{f(y+\lambda (x-y)) - f(y)}{\lambda} \end{align}$
Letting $\lambda \rightarrow 0$ , the difference quotient converges to the directional derivative $\nabla f(y)\cdot(x-y)$ , so

$f(x)\geq f(y) + \nabla f(y)\cdot(x-y)$
which is the first order condition with the roles of $x$ and $y$ exchanged.
sufficient: If the first order condition is satisfied, apply it at the point $\lambda x+(1-\lambda)y$ , once towards $x$ and once towards $y$ :

$\begin{align} f(x)&\geq f(\lambda x+(1-\lambda)y)+\nabla f(\lambda x+(1-\lambda)y)\cdot (1-\lambda)(x-y)\\ f(y)&\geq f(\lambda x+(1-\lambda)y)+\nabla f(\lambda x+(1-\lambda)y)\cdot \lambda(y-x)\\ \end{align}$

Multiplying the first inequality by $\lambda$ and the second by $1-\lambda$ and adding them, the gradient terms cancel and we get:

$\lambda f(x)+(1-\lambda)f(y) \geq f(\lambda x+(1-\lambda)y)$
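
As a numerical illustration of Theorem 3, the sketch below evaluates the gap $f(y)-f(x)-\nabla f(x)\cdot(y-x)$ on random pairs. The test function (log-sum-exp, whose gradient is the softmax), the helper names and the sampling box are choices made for this example only.

```python
import math
import random

def log_sum_exp(x):
    """f(x) = log(sum_i exp(x_i)), a standard convex function."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))

def grad_log_sum_exp(x):
    """Gradient of log_sum_exp, i.e. the softmax of x."""
    m = max(x)
    e = [math.exp(xi - m) for xi in x]
    s = sum(e)
    return [ei / s for ei in e]

def first_order_gap(f, grad, x, y):
    """f(y) - f(x) - grad f(x).(y - x); nonnegative for every pair when f is convex."""
    g = grad(x)
    return f(y) - f(x) - sum(gi * (yi - xi) for gi, xi, yi in zip(g, x, y))

def min_gap(f, grad, dim=3, trials=1000, seed=0):
    """Smallest first order gap over random pairs in the box [-3, 3]^dim."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(dim)]
        y = [rng.uniform(-3, 3) for _ in range(dim)]
        worst = min(worst, first_order_gap(f, grad, x, y))
    return worst

print(min_gap(log_sum_exp, grad_log_sum_exp))
```

A clearly negative value of `min_gap` for some function would show that the function is not convex on the sampled box.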

Theorem 4. [first order condition (2): monotonicity of $\nabla f(x)$ ]: If $f(x)$ is differentiable, then $f(x)$ is convex iff $(\nabla f(x)-\nabla f(y))\cdot (x-y)\geq 0$ $\forall x,y$ .
proof: necessary:
If $f(x)$ is convex, then $\forall x,y$ , we have

$\begin{align} &f(x)\geq f(y) +\nabla f(y)\cdot (x-y)\\ &f(y)\geq f(x)+\nabla f(x)\cdot (y-x) \end{align}$
adding these two inequalities:

$f(x)+f(y)\geq f(y)+f(x)+(\nabla f(y) - \nabla f(x))\cdot (x-y)$
i.e.

$(\nabla f(x)-\nabla f(y))\cdot (x-y)\geq 0$
sufficient:
Let $g(t) = f(x +t(y-x))$ for $t\in[0,1]$ . Then

$g'(t) = \nabla f(x+t(y-x))\cdot (y-x)$

and for $t\in(0,1]$ , applying the monotonicity condition to the points $x+t(y-x)$ and $x$ gives

$\begin{align} g'(t) - g'(0) & = \nabla f(x+t(y-x))\cdot (y-x) - \nabla f(x)\cdot (y-x) \\ &= \frac{1}{t}(\nabla f(x+t(y-x)) - \nabla f(x))\cdot t(y-x)\\ &\geq 0 \end{align}$
so $g'(t)\geq g'(0)$ on $[0,1]$ , and

$\begin{align} g(1) &=g(0)+ \int_0^1 g'(t) dt\geq g(0)+g'(0) \\ \Rightarrow\ & f(y) \geq f(x) +\nabla f(x)\cdot (y-x) \end{align}$
which is the first order condition of Theorem 3, so $f(x)$ is convex.
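
The monotonicity condition of Theorem 4 is easy to test on samples. In the sketch below, the helper name `monotonicity_violation` and the two example gradients are assumptions chosen for illustration: $\sum_i x_i^4$ is convex, while $\sum_i x_i^3$ is not.

```python
import random

def monotonicity_violation(grad, dim, trials=1000, tol=1e-9, seed=0):
    """Search for x, y with (grad(x) - grad(y)).(x - y) < 0.

    Returns such a pair, or None if no sampled pair violates monotonicity.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(dim)]
        y = [rng.uniform(-3, 3) for _ in range(dim)]
        gx, gy = grad(x), grad(y)
        inner = sum((a - b) * (u - v) for a, b, u, v in zip(gx, gy, x, y))
        if inner < -tol:
            return (x, y)
    return None

def grad_quartic(x):
    """Gradient of f(x) = sum_i x_i^4 (convex)."""
    return [4 * xi ** 3 for xi in x]

def grad_cubic(x):
    """Gradient of f(x) = sum_i x_i^3 (not convex)."""
    return [3 * xi ** 2 for xi in x]

print(monotonicity_violation(grad_quartic, dim=2))
print(monotonicity_violation(grad_cubic, dim=2) is not None)
```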

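Theorem 5. [second order condition]: If $f(x)$ is twice differentiable, then $f(x)$ is convex iff $\forall x$

$\nabla^2 f(x)\geq 0$
i.e. the Hessian is positive semidefinite. For strict convexity only one direction holds: $\nabla^2 f(x) > 0$ $\forall x$ implies that $f(x)$ is strictly convex, but the converse fails, e.g. $f(x)=x^4$ is strictly convex while $f''(0)=0$ .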
proof:
For simplicity, we first prove the one-variable case: a twice differentiable $h(x)$ , $x\in \mathbb{R}$ , is convex iff $h''(x)\geq 0$ $\forall x$ .
sufficient:
From $h''(x)\geq 0$ and the Taylor expansion $h(y) = h(x)+h'(x)(y-x)+\frac{1}{2}h''(\xi)(y-x)^2$ for some $\xi$ between $x$ and $y$ , we have
$h(y) \geq h(x)+h'(x)(y-x)$
and from Theorem 3, we know $h(x)$ is convex.
necessary:
$\forall x< z< y$ , we have $z=\lambda x+(1-\lambda) y$ with $\lambda = \frac{y-z}{y-x}$
$\begin{align} h(z) &= h(\lambda x+(1-\lambda)y)\\ &\leq \lambda h(x)+(1-\lambda)h(y)\\ &=\frac{y-z}{y-x} h(x)+\frac{z-x}{y-x}h(y) \end{align}$
$\Rightarrow (y-x)h(z)\leq (y-z)h(x)+(z-x)h(y)$
$\Rightarrow(y-z)(h(z)-h(x))\leq (z-x)(h(y)-h(z))$
$\Rightarrow\frac{h(z)-h(x)}{z-x}\leq \frac{h(y) - h(z)}{y-z}$

So for $t_1< x< z< y< t_2$ , applying this inequality to each consecutive triple of points, we have

$\frac{h(x)-h(t_1)}{x-t_1}\leq \frac{h(z)-h(x)}{z-x}\leq \frac{h(y) - h(z)}{y-z}\leq \frac{h(t_2) - h(y)}{t_2 - y}$
letting $t_1\rightarrow x$ and $t_2 \rightarrow y$ , we have
$h'(x)\leq \frac{h(z)-h(x)}{z-x}\leq \frac{h(y) - h(z)}{y-z} \leq h'(y)$

So $h'(x)$ is increasing, which gives $h''(x)\geq 0$ .

Now we prove the multivariable case by restricting $f(x)$ to lines.
necessary:
Let $g(t)=f(x+t\ell)$ be a one-variable function for a fixed point $x$ and direction $\ell$ . From the convexity of $f(x)$ ,

$\begin{align} g(\lambda t_1+(1-\lambda)t_2) &= f(\lambda (x+t_1\ell)+(1-\lambda)(x+t_2\ell))\\ &\leq \lambda f(x+t_1\ell)+(1-\lambda)f(x+t_2\ell)\\ &= \lambda g(t_1)+(1-\lambda)g(t_2) \end{align}$
So $g(t)$ is convex as a one-variable function. Then by the one-variable case
$g''(t) = \ell^T\nabla^2 f(x+t\ell) \ell\geq 0$
Setting $t=0$ gives $\ell^T\nabla^2 f(x)\ell\geq 0$ for every $\ell$ , so $\nabla^2 f(x) \geq 0$
sufficient:
Let $g(t) = f(x+t(y-x))$ , then
$g''(t)=(y-x)^T \nabla^2 f(x+t(y-x)) (y-x)\geq 0$
So $g(t)$ is convex by the one-variable case.

Then

$\begin{align} f(\lambda x+(1-\lambda)y) &= f(x+(1-\lambda)(y-x))\\ &=g(1-\lambda) = g(\lambda \cdot 0 +(1-\lambda)\cdot 1)\\ &\leq \lambda g(0)+(1-\lambda)g(1)\\ &=\lambda f(x)+(1-\lambda)f(y) \end{align}$
So $f(x)$ is convex.

From the proof, we see that the convexity of a function on a convex set is a one-dimensional fact: $f(x)$ is convex iff its restriction to every line is convex.
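
A small sketch of the second order condition for a quadratic $f(x)=\frac{1}{2}x^T H x$ in two variables, where the Hessian is the constant matrix $H$ . The function names and the two example matrices are assumptions made for this illustration. The curvature of the restriction $g(t)=f(x+t\ell)$ is $\ell^T H \ell$ , so one direction with negative curvature is enough to rule out convexity.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean - radius, mean + radius

def curvature_along(a, b, c, l):
    """l^T H l for H = [[a, b], [b, c]].

    For f(x) = 0.5 * x^T H x this equals g''(t) with g(t) = f(x + t*l).
    """
    return a * l[0] ** 2 + 2 * b * l[0] * l[1] + c * l[1] ** 2

# H1 = [[2, 1], [1, 2]] is positive semidefinite, H2 = [[1, 2], [2, 1]] is not.
print(eig_sym_2x2(2, 1, 2), curvature_along(2, 1, 2, (1, -1)))
print(eig_sym_2x2(1, 2, 1), curvature_along(1, 2, 1, (1, -1)))
```

For `H2` the direction `(1, -1)` already gives negative curvature, which matches its negative smallest eigenvalue.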

Intuition:

convex says a function is $\geq$ each of its tangent linear functions $f(x)+\nabla f(x)\cdot (y-x)$
strictly convex says a function is $>$ each of its tangent linear functions away from the point of tangency

2. Strong convex

Definition 3: [strongly convex]: $f(x)$ is said to be m-strongly convex (with $m>0$ ) if $f(x)-\frac{m}{2}\|x\|_2^2$ is convex.

Then applying the results of the last section to $f(x)-\frac{m}{2}\|x\|_2^2$ , we have that:
first order condition (1):

$f(y)\geq f(x)+\nabla f(x)\cdot (y-x)+\frac{m}{2}\|y-x\|_2^2$
first order condition (2) [monotonicity of the gradient]:
$(\nabla f(x) - \nabla f(y)) \cdot (x-y)\geq m\|x-y\|_2^2$
second order condition:
$\nabla^2 f(x)\geq m\cdot I$
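
As a worked step, first order condition (1) can be derived from the definition of strong convexity as follows. The auxiliary name $\phi$ is introduced only for this derivation. Let $\phi(x)=f(x)-\frac{m}{2}\|x\|_2^2$ , so that $\nabla \phi(x)=\nabla f(x)-mx$ . Applying Theorem 3 to the convex function $\phi$ gives

$\begin{align} f(y)-\frac{m}{2}\|y\|_2^2 &\geq f(x)-\frac{m}{2}\|x\|_2^2+(\nabla f(x)-mx)\cdot (y-x)\\ \Rightarrow f(y)&\geq f(x)+\nabla f(x)\cdot (y-x)+\frac{m}{2}\left(\|y\|_2^2-\|x\|_2^2-2x\cdot (y-x)\right)\\ &= f(x)+\nabla f(x)\cdot (y-x)+\frac{m}{2}\|y-x\|_2^2 \end{align}$

The last step uses $\|y\|_2^2-\|x\|_2^2-2x\cdot (y-x)=\|y\|_2^2-2x\cdot y+\|x\|_2^2=\|y-x\|_2^2$ .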

Intuition: strongly convex says a function is $\geq$ its tangent linear function plus a quadratic term, i.e. it is bounded below by a quadratic function.
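
The quadratic lower bound can be checked on a concrete example. The sketch below assumes the quadratic $f(x)=\frac{1}{2}x^T A x$ with $A=\begin{pmatrix}2&1\\1&2\end{pmatrix}$ , whose eigenvalues are 1 and 3, so the largest admissible constant is $m=1$ . The function names are hypothetical and chosen for this example.

```python
def f(x):
    """f(x) = 0.5 * x^T A x with A = [[2, 1], [1, 2]]."""
    return 0.5 * (2 * x[0] ** 2 + 2 * x[0] * x[1] + 2 * x[1] ** 2)

def grad_f(x):
    """Gradient A x."""
    return [2 * x[0] + x[1], x[0] + 2 * x[1]]

def strong_convexity_gap(x, y, m):
    """f(y) - f(x) - grad f(x).(y - x) - (m/2) * ||y - x||^2.

    Nonnegative for every x, y exactly when f is m-strongly convex.
    """
    d = [y[0] - x[0], y[1] - x[1]]
    g = grad_f(x)
    linear = g[0] * d[0] + g[1] * d[1]
    return f(y) - f(x) - linear - 0.5 * m * (d[0] ** 2 + d[1] ** 2)

# Along the eigenvector (1, -1) of the smallest eigenvalue the bound is tight for m = 1
# and fails for any larger m.
print(strong_convexity_gap([0.0, 0.0], [1.0, -1.0], m=1.0))
print(strong_convexity_gap([0.0, 0.0], [1.0, -1.0], m=1.5))
```

For this quadratic the gap equals $\frac{1}{2}d^T(A-mI)d$ with $d=y-x$ , which is why the smallest eigenvalue of $A$ decides the answer.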

Theorem: If a function is strongly convex and twice continuously differentiable, then its first derivative is Lipschitz continuous on the sublevel set $S=\{x: f(x)\leq f(x^{(0)})\}$ .
proof: Firstly, we claim that $S$ is bounded. Let $x^*$ be the minimizer of $f(x)$ , so $\nabla f(x^*)=0$ . Then $\forall y \in S$ , we have

$\begin{align} &f(x^{(0)})\geq f(y) \geq f(x^*) +\nabla f(x^*)\cdot (y-x^*)+\frac{m}{2}\|y-x^*\|_2^2\\ &\Rightarrow \| y - x^*\|_2^2 \leq \frac{2}{m} (f(x^{(0)})-f(x^*)) \end{align}$

$S$ is also closed because $f(x)$ is continuous, so $S$ is compact. The maximum eigenvalue of $\nabla^2 f(x)$ is a continuous function of $x$ , so it has an upper bound $M$ on $S$ , i.e. $\nabla^2 f(x)\leq M\cdot I$ on $S$ . Since $S$ is convex, this gives $\|\nabla f(x)-\nabla f(y)\|_2\leq M\|x-y\|_2$ $\forall x,y\in S$ , which says that $\nabla f(x)$ is Lipschitz continuous on $S$ .
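
The bound $M$ in the proof depends on the set $S$ . A one-variable sketch, assuming the example $f(x)=x^4+x^2$ (chosen here, not taken from the original post): $f''(x)=12x^2+2\geq 2$ , so $m=2$ , and for $x^{(0)}=2$ the set $S=\{x: f(x)\leq 20\}$ is the interval $[-2,2]$ , on which $f''(x)\leq 50$ . The helper names below are hypothetical.

```python
import random

def df(x):
    """First derivative of f(x) = x**4 + x**2."""
    return 4 * x ** 3 + 2 * x

def d2f(x):
    """Second derivative of f; at least 2 everywhere, so m = 2."""
    return 12 * x ** 2 + 2

def slope_ratio_range(radius, trials=2000, seed=0):
    """Min and max of |f'(x) - f'(y)| / |x - y| over samples in [-radius, radius]."""
    rng = random.Random(seed)
    lo, hi = float("inf"), 0.0
    for _ in range(trials):
        x = rng.uniform(-radius, radius)
        y = rng.uniform(-radius, radius)
        if abs(x - y) < 1e-6:
            continue  # skip near-identical points to avoid cancellation noise
        r = abs(df(x) - df(y)) / abs(x - y)
        lo, hi = min(lo, r), max(hi, r)
    return lo, hi

M = d2f(2.0)                     # upper bound of f'' on S = [-2, 2]
print(M, slope_ratio_range(2.0))
print(slope_ratio_range(10.0))   # a larger interval needs a larger constant
```

On $[-2,2]$ the sampled ratios stay between $m$ and $M$ , while on a larger interval they exceed $M$ , which illustrates that the Lipschitz constant obtained this way is tied to $S$ .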